Broadly, artificial intelligence involves a machine doing something that only a human would otherwise be able to do. That said, computer scientists disagree on whether certain computing capabilities from several years ago still constitute AI. Nowadays, many of these capabilities might just be called software.
The modern resurgence in AI interest has been fueled by advancements in a very specific kind of computing: machine learning. We often use artificial intelligence and machine learning interchangeably here at Emerj, but many computer scientists like to separate the two. There is (and likely always will be) a debate in the field over what exactly constitutes artificial intelligence. Some computer scientists do not consider computing capabilities AI unless they involve machine learning.
These scientists may continue to shift their parameters for artificial intelligence until artificial general intelligence, or AGI, is achieved. The development of AGI—the ability for a computer to perform any intellectual task a human could—is the goal for many computer science researchers, but it may take many years to achieve it, and it deserves its own article for another time.
One thing researchers seem to agree on is that machine learning falls under the umbrella field of artificial intelligence in some way, which itself falls under the computer science discipline. Deep learning, a topic for a later article, is a subset of machine learning. This conceptualization is illustrated by NVIDIA below:
Yoshua Bengio, one of the preeminent deep learning researchers of the last two decades, provided us with his own definition of machine learning:
Machine learning research is part of research on artificial intelligence, seeking to provide knowledge to computers through data, observations, and interacting with the world. That acquired knowledge allows computers to correctly generalize to new settings.
Although machine learning dominates the AI mindshare today, AI was once approached in a very different way.
Expert Systems and Earlier Approaches to Artificial Intelligence
Prior to machine learning advancements in the late 2000s and early 2010s, AI interest revolved around an entirely separate computing ability. Expert systems dominated the AI mindshare in the 60s and 70s. Developers sought to mimic human thought and decision-making by conceptualizing it as a series of if-then statements. An expert system is, in essence, a large web of if-then scenarios through which a query is filtered to achieve some preprogrammed end result. The if-then statements behind an expert system are hard-coded into the software. As such, the AI will respond to certain inputs the same way every time.
These if-then scenarios need to be informed by, fittingly, domain experts if the resulting software is to have any practical use in industry. To build an expert system that knows what to do when presented with a certain infectious disease, for example, developers need to base the software’s if-then scenarios on what an expert on infectious diseases would do when presented with that disease.
For example, a developer could interview 40 different healthcare experts on infectious diseases, ask them a series of questions regarding symptoms and treatments, and hard-code their responses into an expert system. This requires a lot of forethought and planning on the part of the software developers. They need to work with domain experts to list out all of the possible questions someone might ask about a specific topic and then coax out all of the possible answers to those questions. If they fail to account for a question or answer, the expert system will not be able to provide an accurate answer to a user’s question.
Another example might involve customer support tickets. An expert system could be built on the following if-then scenario: “If the body of the email contains the word ‘refund,’ then route the ticket to the refund ticket bucket.” This certainly seems like a reasonable rule, and it will likely route the majority of refund tickets to the appropriate bucket. However, the rule does not account for support tickets in which customers talk about refund-related concepts or use refund-related phrases without using the word “refund.”
A customer might say, “I’m calling my bank if you don’t call me back.” A human support agent with context on the business might know that tickets like this usually involve a customer not knowing the charge on their account is for the yearly subscription service they signed up for. The agent might also know that in almost all circumstances, the customer wants a refund on that charge. Expert system-based software with only the “refund” rule is never going to be able to route those tickets into the refund bucket.
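The refund rule above can be sketched in a few lines of Python. This is only an illustration, not how any particular support product works: the RULES table, the route_ticket function, and the "general" fallback bucket are all hypothetical names chosen for the example.

```python
# Hypothetical rule table: each entry is (keyword, bucket).
# The only rule is the one described in the text.
RULES = [
    ("refund", "refund"),
]


def route_ticket(body, rules=RULES, default="general"):
    """Return the bucket of the first rule whose keyword appears in the body."""
    text = body.lower()
    for keyword, bucket in rules:
        if keyword in text:
            return bucket
    # No rule matched, so the ticket falls through to the default bucket.
    return default


print(route_ticket("I would like a refund for last month."))
print(route_ticket("I'm calling my bank if you don't call me back."))
```

The first ticket lands in the refund bucket because it contains the keyword. The second falls through to the default bucket, and it will keep doing so every time, because nothing in the program changes unless a developer edits the rule table.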
The Limitations of Expert Systems
Theoretically, someone with context on the business’ customer support tickets would be able to relay information about this scenario to the developer building the expert system before they build it. The if-then rule might be something like, “If the body of the email contains the word ‘bank,’ then route the ticket to the refund ticket bucket.”
However, if the business only recently started selling its subscription service, its expert system-based customer support software would not be able to adapt to the new kinds of tickets coming into the system that make vague references to the subscription service, such as the example above. Those tickets just wouldn’t be routed into the refund bucket until the business contacted a developer to update the software with another if-then rule.
Working around this limitation is impractical, and this was the largest reason why interest in expert systems (and, as a result, AI in general) waned for a period known as the “AI Winter.”
Machine Learning and Neural Networks
With the advent of the Internet, vast quantities of data ranging from online purchases to insurance claims payments became digitally available. Data is now the norm, and even the smallest firms store their data in digital formats.
Machine learning is a way of getting computers to mimic human thought and decision-making in a way entirely different from expert systems. If a human had the ability to store, access, and make sense of billions of data points in their brain, they might make decisions very differently from the way we make them now. In the vast majority of cases, a decision made on more information and context is better than one made on less.
In a nutshell, machine learning models make decisions on billions of data points. They make sense of that data and turn it into likelihoods that fuel their outputs. This is very different from expert systems that only have one output per if-then rule, one “then” per “if.” More importantly, machine learning models are built to adapt to new, unexpected inputs. An expert system won’t know what to do with a refund ticket that doesn’t fall into its rules for refund tickets, but over time a machine learning model can start routing “I’m calling my bank” into the refund bucket in response to human feedback.
The Adaptability of Machine Learning
If a human indicates to the model when its routing is correct or incorrect, it can use that feedback to inform the likelihoods on which it bases its ticket routing. Although we advise against personifying AI, it essentially asks itself “what is the likelihood that this ticket should be routed into the refund bucket?” whenever it is presented with a support ticket. If it determines a high likelihood, the ticket will be routed to the refund bucket. If it determines a low likelihood, the model might be programmed to flag the ticket for review by a human agent.
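The feedback loop described above can be sketched with a very small word-count model. This is a minimal illustration under stated assumptions, not the method any real support tool uses: the FeedbackRouter class, the 0.8 threshold, the "human_review" bucket, and the sample tickets are all made up for the example, and the scoring is a basic naive Bayes estimate with add-one smoothing.

```python
import math
from collections import Counter


class FeedbackRouter:
    """Tiny naive Bayes style model that learns from human-labeled tickets."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed cutoff for a "high likelihood"
        self.word_counts = {True: Counter(), False: Counter()}
        self.ticket_counts = {True: 0, False: 0}

    def _tokens(self, body):
        return [w.strip(".,!?'\"").lower() for w in body.split()]

    def learn(self, body, is_refund):
        """Record human feedback: was this ticket really a refund request?"""
        self.ticket_counts[is_refund] += 1
        self.word_counts[is_refund].update(self._tokens(body))

    def refund_likelihood(self, body):
        """Estimate the probability that the ticket belongs in the refund bucket."""
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        total_tickets = sum(self.ticket_counts.values())
        log_scores = {}
        for label in (True, False):
            prior = (self.ticket_counts[label] + 1) / (total_tickets + 2)
            total_words = sum(self.word_counts[label].values())
            score = math.log(prior)
            for word in self._tokens(body):
                count = self.word_counts[label][word]
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = score
        # Turn the two log scores into a probability for the refund label.
        top = max(log_scores.values())
        refund = math.exp(log_scores[True] - top)
        other = math.exp(log_scores[False] - top)
        return refund / (refund + other)

    def route(self, body):
        if self.refund_likelihood(body) >= self.threshold:
            return "refund"
        return "human_review"


router = FeedbackRouter()
ticket = "I'm calling my bank about this charge."
print(router.route(ticket))  # no feedback yet, so the model is unsure

# A human agent labels a few tickets (made-up examples).
router.learn("I'm calling my bank if you don't call me back.", True)
router.learn("My bank shows a charge I don't recognize.", True)
router.learn("How do I reset my password?", False)
router.learn("The app crashes when I log in.", False)

print(router.route(ticket))  # the same ticket after feedback
```

Before any feedback, the model has no evidence either way and flags the ticket for human review. After a handful of labeled tickets, the same ticket scores above the threshold and goes to the refund bucket, with no one writing a new rule. A production model would be trained on far more data and would likely use a more capable algorithm.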
This adaptability is the key difference between machine learning and expert systems, and it’s why some computer scientists don’t consider expert systems and other computing abilities AI anymore. It’s also the basis for Stanford’s definition of machine learning: “the science of getting computers to act without being explicitly programmed.”
One example of this adaptability is Netflix’s recommendation engine. When a new user on the platform signs in from a location in Oklahoma for the first time, the recommendation engine has very little data on that user apart from their location, which it might obtain from the user’s IP address. Netflix does, however, have several million data points describing other users from Oklahoma. The recommendation engine might use that data to make general assumptions about what this new user might want to see based on past interactions with similar users.
As the user continues to interact with Netflix, the data on which shows they choose to watch, when they pause those shows or stop watching them entirely, and which shows they watch in succession informs the machine learning model on what shows to recommend to the user. The model responds to the user’s interactions and adapts to their preferences. The user’s data also informs recommendations for other users with similar preferences and similar demographics.
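The shift from location-based guesses to behavior-based recommendations can be sketched as follows. This is a toy illustration and not a description of Netflix’s actual system: the recommend function, the watch_log structure, the region codes, and the show titles are all hypothetical.

```python
from collections import Counter


def recommend(user_history, region, watch_log, top_n=2):
    """Suggest titles for a user.

    watch_log is a list of (region, set_of_titles) pairs for other users.
    With no history, other users from the same region count equally.
    With history, other users count by how many titles they share with this user.
    """
    scores = Counter()
    for other_region, titles in watch_log:
        if user_history:
            weight = len(user_history & titles)  # overlap in viewing
        else:
            weight = 1 if other_region == region else 0  # cold start
        for title in titles - user_history:
            scores[title] += weight
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [title for title, score in ranked[:top_n] if score > 0]


# Made-up viewing data for four other users.
watch_log = [
    ("OK", {"Show A", "Show B"}),
    ("OK", {"Show A", "Show C"}),
    ("TX", {"Show C", "Show D"}),
    ("TX", {"Show D", "Show E"}),
]

print(recommend(set(), "OK", watch_log))        # new user: region only
print(recommend({"Show D"}, "OK", watch_log))   # after watching one show
```

For the brand-new user, the suggestions come from what other users in the same region watched. Once the user has watched a show, the suggestions come from users with overlapping viewing, regardless of region. A real recommendation engine would also weigh signals such as pauses and abandonment, which this sketch leaves out.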
At its core, machine learning involves training a machine on large volumes of data to pick up on patterns within that data that it might use to determine the likelihood of success with a certain output.
The Limitations of Machine Learning
Machine learning has its limitations, and in fact, it is worse than expert systems when it comes to one core concept: interpretability.
One can follow a series of if-then rules backward to figure out how an expert system resulted in a particular output. This allows developers to fix these rules if it turns out their answer, their “then,” is incorrect. Expert systems are highly transparent, which is helpful, even necessary in some domains.
If a patient asks their doctor why they diagnosed them with an illness and that doctor made their diagnosis based on the output of an expert system, the doctor can answer that question. They could theoretically read through the expert system’s if-then rules that led to its output, the patient’s diagnosis.
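A short sketch shows why this kind of explanation is possible. The rule names, conditions, and conclusions below are placeholders invented for the example and do not represent medical knowledge; the infer function is a hypothetical, minimal forward-chaining loop that records every rule it fires.

```python
# Hypothetical rules: (rule name, conditions that must all hold, conclusion).
RULES = [
    ("R1", {"fever", "cough"}, "possible_respiratory_infection"),
    ("R2", {"possible_respiratory_infection", "positive_test_x"}, "diagnosis_x"),
]


def infer(facts, rules=RULES):
    """Apply rules until nothing new can be concluded, keeping a trace."""
    known = set(facts)
    trace = []
    changed = True
    while changed:
        changed = False
        for name, conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                trace.append((name, sorted(conditions), conclusion))
                changed = True
    return known, trace


known, trace = infer({"fever", "cough", "positive_test_x"})
for name, conditions, conclusion in trace:
    print(f"{name}: because {conditions}, concluded {conclusion}")
```

Reading the trace from the last entry to the first gives the chain of reasoning behind the output, and if a conclusion turns out to be wrong, the trace points to the exact rule a developer needs to fix.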
This is not the case for machine learning models, which are much more complex than if-then trees. The neural network behind the machine learning model might look as follows:
If a doctor makes their diagnosis based on the output of a machine learning model, they won’t be able to explain themselves to the patient. A machine learning model provides outputs based on patterns it coaxes out of a dataset by itself. A human provides the machine learning algorithm the data without any context, and the algorithm provides some outcome it determines based on patterns humans, at present, can’t discern.
The machine learning model may have recommended a diagnosis for the patient based on any number of data points. It may have been because of the anomalies in a patient’s CT scans. It may just as well have been because patients of their demographics, with their first name, and with their history of insurance claims are more likely to be diagnosed with the particular illness than other people. There’s no way for the doctor to confirm or deny either.
This problem is what is referred to as the “black box” of AI. Machine learning models can make predictions and recommendations by finding patterns in data at a scale humans are incapable of, but no one can explain how or why the model makes those predictions and recommendations. There is no transparency. This is a major problem in some industries, as we discuss in our report, Applying Artificial Intelligence in B2B and B2C – What’s the Difference? The black box is such a concern for computer science researchers that Geoffrey Hinton, known as the “Godfather of AI,” even suggests “throw[ing] it all away and [start]ing again.”
Key Takeaways for Business Leaders
The vast majority of the AI solutions business leaders might consider, and that we cover here at Emerj, are indeed machine learning solutions. Business leaders can use “AI” in conversation and expect that their data scientists will understand they are referring to machine learning. Expert systems are generally considered AI from a historical perspective, but computing capabilities that were developed prior to the late 80s are generally not what people are referring to when they talk about artificial intelligence.
It’s possible that in a decade or two, machine learning itself will face a similar fate, relegated to computer science history as a computing capability that served its purpose at the time but ultimately gave way to something more sophisticated and perhaps more interpretable. Alternatively, machine learning might not be abandoned, but instead become so ubiquitous that it’s no longer referred to as AI.
Business leaders can think of expert systems and machine learning as existing at opposite ends of the AI spectrum. Right now, developers don’t often build expert systems when they set out to build AI solutions; they build machine learning models. They are two distinct methods of going about achieving the same goal of artificial intelligence: getting computers to do intellectual tasks traditionally reserved for humans. Machine learning and expert systems are subsets of AI, which is a subset of computer science as a whole.