A few weeks ago, Chinese software company Baidu open-sourced key parts of an artificial intelligence algorithm used for speech recognition, following in the footsteps of Facebook and Google, which made similar releases last year.
Baidu is currently one of the largest tech companies in the world, despite being limited to the Chinese internet and receiving little recognition outside of its home country. Founded by Robin Li in 2000, Baidu has provided a variety of products and services, including a Mandarin-compatible search engine, its own version of Wikipedia, and a keyword-searchable discussion forum.
Baidu’s open-sourced machine learning application, called “Warp-CTC”, is based on an artificial intelligence approach called connectionist temporal classification, which the company claims helped to expand its natural language processing capability.
But how will this application improve the deep learning process? MIT Technology Review describes Warp-CTC as “an improved implementation of a deep-learning algorithm developed some time ago that’s been designed to run very quickly on the latest computer chips.” The algorithm is used to train recurrent neural networks, or RNNs for short. RNNs are well suited to noisy data and can pick out patterns in overwhelmingly complex data sets. They often serve as the core of systems that translate speech into text and back, or that perform machine translation and inductive reasoning; they can also generate handwriting, as detailed by Chloe Olewitz in Techly.
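For readers curious what connectionist temporal classification does in practice, here is a minimal Python sketch of its decoding rule: the network emits one label per audio frame, consecutive repeats are merged, and a special "blank" label is dropped. This is an illustration only and not Warp-CTC's actual interface; the function names, the `_` blank symbol, and the toy probabilities are all assumptions made for the example.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol, chosen for this illustration


def ctc_collapse(frame_labels, blank=BLANK):
    """Merge consecutive repeated labels, then drop blanks."""
    merged = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in merged if label != blank)


def greedy_decode(frame_probs, alphabet, blank=BLANK):
    """Pick the most likely label in each frame, then collapse the result."""
    best = [alphabet[max(range(len(p)), key=p.__getitem__)] for p in frame_probs]
    return ctc_collapse(best, blank)


if __name__ == "__main__":
    # A blank between the two "l" runs keeps the double letter in "hello".
    print(ctc_collapse("hh_e_ll_lloo"))
```

The blank label is what lets the output contain genuine double letters: without it, the two runs of "l" would merge into one.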
With respect to open-sourcing machine learning software, Baidu is late to the game, trailing not only Google and Facebook, but also Microsoft and IBM. Why are tech giants peeling off layers of their machine learning software for free? When asked this question, Baidu’s research team stated,
“We want to make end-to-end deep learning easier and faster so researchers can make more rapid progress… We want to start contributing to the machine learning community by sharing an important piece of code that we created.”
There is also a financial incentive for the Chinese search engine: releasing these programs costs the company little and often yields additional knowledge and research in return (in machine learning, training data is often more valuable than software). Democratizing machine learning allows companies like Baidu, Facebook, and Google to build and improve their products with any developments that third parties discover while using the software.
In addition, the more developers who accept Baidu’s tools as the “industry standard,” the higher Baidu rises as a premier data science player, and the more talent it is likely to attract. This matters because the demand for data scientists far exceeds the supply, a gap that will likely widen over the next five years.
The software has already proven useful in Deep Speech 2 (DS2), a speech recognition system that Baidu introduced in December 2015. DS2 is designed to let AI programs understand and interpret English as well as Mandarin. Other companies have also expressed an interest in the code, including Nervana (acquired by Intel), a data analysis and computation startup.
While it’s impossible to know which open source tools will take hold in the domains of industry and research, Baidu has finally thrown its hat into the ring. May the best tech giant win.