Episode Summary: While we’ve featured quite a few companies that use and implement AI systems, we’ve more rarely gone behind the scenes with companies or consultants providing AI-related services to companies. In this week’s episode, we talk with Machine Learning Consultant Charles Martin, a data scientist and machine learning expert who has done freelance consulting on machine learning systems at companies including eBay, GoDaddy, and Aardvark. In this interview, Charles talks about the areas in AI that he believes are ripe for implementation in a business context, and where he sees businesses getting AI ‘wrong’ before getting to the hard work of implementing systems that work for them.
Expertise: Machine learning and data science
Recognition in Brief: Charles Martin is a freelance consultant in machine learning, data science, and software development. He received his PhD in Theoretical Chemistry from the University of Chicago, and was one of two nationwide NSF Fellows. He has worked on machine learning (ML) systems running in production at companies like GoDaddy, Aardvark (Google), eBay, and Blackrock. In 2011, he used machine learning to help Demand Media become the first $1B IPO since Google. He has over 15 years of commercial data science, software engineering, and machine learning experience, and is a full stack developer for web, object-oriented, and numerical programming.
Current Affiliations: Chief data scientist at Calculation Consulting
Machine Learning Trends in Business
In the late 90s, as the commercial internet took off, companies like Google and Yahoo (and several others) were offering something new – broad search engines. Though he’d had previous experience working with early machine learning systems, Charles Martin “cut his teeth” doing machine learning at eBay, where one of the primary uses of search is to find what’s most relevant to the user based on their past search queries and purchases. As a consultant in the machine learning industry today, Martin is starting to see more companies identifying search as a tool that they need as part of their online presence, whether it’s Walmart.com imitating Amazon or a new kind of customized engine, such as one built specifically for auto parts.
Any good search box requires machine learning, and the right algorithm is key to getting people to actually use the platform. “It’s not just about finding insights in data; you’re trying to figure out how to show the right information, but not so much that it gets stale and not so random that it seems like spam,” says Martin. This is even more prevalent on mobile, which can only show a handful of searches at one time. Google’s had a monopoly-like hold on the search engine field for some time.
“When I was at eBay, there was a saying, ‘No Google products’…it’s like a monopoly; you can’t run a business where you have one supplier or one other vendor controlling you…so there’s a real interest in developing external resources for that (better search engines).”
Charles mentions Amazon Elasticsearch, which (though high performance) lacks a sophisticated level of machine learning and in his words is nothing more than a ‘look up table’. “It doesn’t have the reinforcement learning feedback loops to build a search platform that people expect,” he says. A smart search engine is all the more necessary on a mobile device or with voice search, because there’s not as much flexibility at present to enter complex queries.
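To make the contrast between a ‘look up table’ and a feedback loop concrete, here is a minimal sketch of an epsilon-greedy ranker that learns from clicks. It is an illustration only, not Martin’s method or any Elasticsearch API: the class name `EpsilonGreedyRanker`, its methods, and the result labels are all hypothetical, and a production system would be far more involved.

```python
import random


class EpsilonGreedyRanker:
    """Hypothetical sketch: pick which result to show, learning from clicks."""

    def __init__(self, results, epsilon=0.1, seed=None):
        self.epsilon = epsilon                  # share of traffic used to explore
        self.shows = {r: 0 for r in results}    # times each result was shown
        self.clicks = {r: 0 for r in results}   # times each result was clicked
        self.rng = random.Random(seed)

    def click_rate(self, result):
        shown = self.shows[result]
        return self.clicks[result] / shown if shown else 0.0

    def choose(self):
        # Explore occasionally so the results do not go stale...
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.shows))
        # ...otherwise exploit the result with the best observed click rate.
        return max(self.shows, key=self.click_rate)

    def record(self, result, clicked):
        # The feedback loop: every impression updates the statistics.
        self.shows[result] += 1
        if clicked:
            self.clicks[result] += 1


# Usage with made-up result labels
ranker = EpsilonGreedyRanker(["result_a", "result_b"], epsilon=0.1, seed=42)
shown = ranker.choose()
ranker.record(shown, clicked=True)
```

The `epsilon` parameter is the knob for the trade-off Martin describes: set it too low and the same results are shown forever, set it too high and the output starts to look random.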
A well-designed search system has to do several things at once. Users need to be able to enter information, and the system needs to learn from what each user has done and correlate that information with the behavior of similar users. It then needs to provide personalized results with the right amount of variety, a complicated problem that goes well beyond statistics and enters the realm of handling (and refining) biases. Recommender systems, like those found on Amazon and Netflix, are also in high demand among businesses that want to recommend products to customers. Martin notes that the inherent challenge in building these systems is that they need to be customer- and business-specific.
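The idea of correlating one user’s history with the behavior of similar users can be sketched as simple user-based collaborative filtering. This is a toy illustration under stated assumptions, not the approach used at any company named here: the `interactions` data, the user and item names, and the `cosine` and `recommend` helpers are all hypothetical.

```python
from math import sqrt

# Hypothetical interaction counts: user -> {item: strength of interest}
interactions = {
    "alice": {"brake_pads": 3, "oil_filter": 1},
    "bob": {"brake_pads": 2, "oil_filter": 1, "wiper_blades": 4},
    "carol": {"phone_case": 5, "charger": 2},
}


def cosine(a, b):
    """Cosine similarity between two sparse item vectors."""
    shared = set(a) & set(b)
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user, data, top_n=3):
    """Score items the user has not seen, weighted by similar users' interest."""
    seen = data[user]
    scores = {}
    for other, items in data.items():
        if other == user:
            continue
        sim = cosine(seen, items)
        if sim <= 0:
            continue  # ignore users with nothing in common
        for item, strength in items.items():
            if item not in seen:
                scores[item] = scores.get(item, 0.0) + sim * strength
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend("alice", interactions))
```

Even this toy version shows why such systems end up business-specific: what counts as an interaction, how strongly it is weighted, and how much variety to inject are all choices that depend on the catalog and the customers.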
When Charles worked at eHow, his team used machine learning to figure out how to make SEO into a science. “We were able to reverse engineer search and figure out what people were searching for, predicting their search demand,” says Charles. “If you know what people are searching for, then you can front run and create ads.” The other part of the success equation is the product recommender system, which Charles also worked on at eHow.
“People would search fairly random things in Google and get different results, a classic chaotic system…every time you entered into Google you would get a different result, and this would pour into eHow and we had a recommender that could figure out what people really wanted to know,” says Martin. There’s usually one dominant thing or search result that people want, says Charles, but people search for that need or desire using a variety of different terms and phrases, which is why it’s difficult to master search relevance.
Another machine learning trend that Martin is seeing is the rise of image recognition, in large part thanks to the research efforts of Facebook. “I had a customer who suggested that he wanted to make a dating site and wanted (the system) to be able to recognize ethnicity in pictures, so all of a sudden image recognition and classification are being intermingled with recommender systems,” says Charles.
It’s clear that the “big guys” are driving the trends that other businesses are interested in implementing. “The consumer experience has become so good that businesses are trying to replicate this internally, or for customers the great ones we have on Google, Netflix, Amazon, etc.,” says Martin. He notes that these ideas are being pitched by individual product managers and vice presidents, not by those in board rooms.
“People are seeing this occur in their everyday activity, using Siri on their phone or using Facebook, and just recognizing ‘why can’t we do this as a company?’”
How Do We Even the Playing Field?
As a consultant in the machine learning systems sphere, Martin has a unique ‘behind the scenes’ perspective on how these systems get created and implemented for a range of businesses. Once the brainstorming and blueprinting process has taken place, Martin’s job as a machine learning consultant is to design and build a working demo for the end machine learning product. The usual expectation is that if a business comes prepared with the data and ideas, Martin and his team can get an initial demo out in three weeks, get a working demo to management in three months, and launch the final working system in six to eight months, though a place like eBay might take two years because of the scale.
The development of open-source machine learning tools has started to widen access, moving the technology from the hands of only a few to more businesses that are curious about how it could align with or help meet company objectives. Martin described how his job as a consultant in the industry has also changed:
“I had to do (machine learning systems) 10 years ago by using academic code…if I wanted to do machine learning I had to build my own support vector machines; now, you have companies offering fairly stable open-source technologies, like Python and Spark…they’re simple but becoming more accessible to enterprise.”
Charles thinks there’s a huge interest among midsize and larger businesses in using machine learning, but radical shifts in how these systems operate and are built require radical shifts in strategy and mindset. Building a machine intelligence group as an internal division within companies is becoming more important, as is having leadership that understands the potential of machine learning technologies.
One of the first things that Martin believes has to happen for machine learning to really take off in mainstream business is for the technology to settle onto a small number of stable platforms. “People have to figure out what platforms to use. There have always been products like Oracle that were offered internally but no one knew how to use it because it was buried inside a database…the industry has to start stabilizing on the stacks that they use i.e. is it going to be Spark, Python (what TensorFlow is based on), etc., and I don’t think it’s happened yet,” says Charles.
The hope, says Martin, is that one of these platforms will eventually become standard, and that there will be deployment systems and consultants who will come in and train employees on how to use the technology. Of course, with billions in investment pouring into the machine learning industry from the likes of Elon Musk, Google, Facebook, and other big names, it’s almost inevitable that the industry will continue to see big changes in new technologies and a more radical restructuring of industries.
Image credit: Charles Martin