When trying to delve deeper into machine learning, you often come across the metaphor of a “black box”. At the same time, there is a lot of conjecture and many confusing definitions around this concept, so it’s quite hard to figure out what’s really going on.
So, let's break down what a black box in machine learning means, which situations black box models work best in, and what issues are connected with this concept.
What Is a Black Box in Machine Learning?
The “black box” concept was first mentioned during the development of cybernetics and behaviorism and referred to a system that could be observed from the point of view of its inputs and outputs, without knowing anything about its internal mechanisms.
To better understand this concept, let’s picture a setup in the spirit of the Skinner box, a tool meant to study learned behavior.
You get a box with input elements like switches and buttons, as well as output elements in the form of lights being turned on or off. As soon as you feed the set of inputs into the box, you’ll be able to see the corresponding outputs without seeing anything inside the box. Even if you could see how everything inside the box works, you’d have a hard time understanding why each component is placed where it is, and why it does what it does.
To understand this peculiar relationship, we need to focus on the fact that the components follow strict rules that dictate their individual behavior, while the general behavior of the whole system arises from their interactions.
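The box described above can be sketched in a few lines of Python. This is only a toy illustration: the three switches, the two lights, and the gate rules inside `black_box` are made up for the example. The point is that the observer in `observe` never looks inside the function and only records which inputs lead to which outputs.

```python
from itertools import product


def black_box(a: bool, b: bool, c: bool) -> tuple[bool, bool]:
    """Hypothetical box with three switches and two lights.

    Each internal component follows a strict, simple rule; the overall
    behavior comes from how the components are wired together.
    """
    x = a and b            # component 1: AND gate
    y = b != c             # component 2: XOR gate
    light1 = x or y        # light 1 combines both components
    light2 = (not y) and a
    return light1, light2


def observe(box) -> dict:
    """Record the outputs for every switch combination without looking inside."""
    return {inputs: box(*inputs) for inputs in product([False, True], repeat=3)}


table = observe(black_box)
for inputs, outputs in table.items():
    print(inputs, "->", outputs)
```

The printed table is everything an outside observer has to work with. Even with only three switches, guessing the wiring from the table alone takes some effort.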
Getting back to black boxes in artificial intelligence (AI) or machine learning (ML) systems, let’s consider why the black box is relevant here:
- With closed-source software, users have no access to the internal workings and can only monitor inputs and outputs. Because they cannot see why the software acts the way it does, its algorithms and subsystems all look like black boxes to them.
- The “black box” metaphor is also relevant to machine learning when it is compared with human behavior. As long as we can ask people to explain their actions, human behavior may seem quite transparent. However, given that people sometimes cannot explain their behavior even to themselves, it’s fair to say that they don’t always know the real cause of their actions. This is why, from the point of view of machine learning, humans themselves are black boxes too.
In What Cases Do Black Box Machine Learning Models Work Best?
Now that you know a little bit about the concept of a black box, let’s find out in which situations using black box machine learning is justified.
- It works best for complex tasks where feature engineering is very difficult, e.g. image processing.
- It can outperform other models, including white box and even science-based ones, in applications like weather forecasting, genomics, and stock trading. Using a black box model is justified in the short term and when the company owns the model and is therefore ready to take responsibility for incorrect predictions. Hedge funds are one example: they make trading decisions using black box models they own.
- It is worth using when the comparative cost of failure is relatively low while the cost of success is pretty high. Let’s take targeted ads: if people don’t want the ad, it’s not going to cost you an arm and a leg. On the other hand, if you manage to reach the target audience with your ad, it drives traffic to your business site, improving your business results.
- A black box model makes sense when it helps us see a problem from new points of view, especially if its internal workings don’t play a key role. This doesn’t apply to tasks such as forecasting business revenue, where it is particularly important to understand which interconnections lead to a certain result.
What Issues Are Connected With the Black Box?
Despite the many benefits AI and ML bring to everyday life, they are still going through some growing pains. Aside from the problem of bias, artificial intelligence also faces the black box problem, which shows up in the following ways:
- Machine learning is often called a black box because the processes between the input and output are not transparent at all: the only things people can observe are how the data is entered and what the final decisions are. As a neural network grows more complex with an increasing number of nodes, the model itself becomes less and less transparent.
- Because people have no idea how AI makes decisions and can’t view its internal workings, they lose confidence in a model they can’t fully control. This lack of trust, in turn, contributes to many AI failures.
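To make the point about network size a bit more concrete, here is a rough sketch that counts the trainable parameters of a plain fully connected network. The layer sizes are arbitrary examples, and the `count_parameters` helper is written just for this illustration. It assumes every node is connected to every node in the previous layer and has one bias term.

```python
def count_parameters(layer_sizes: list[int]) -> int:
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of nodes per layer, input layer first.
    Each node has one weight per node in the previous layer plus one bias.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += (n_in + 1) * n_out
    return total


small = count_parameters([4, 8, 1])        # one small hidden layer
larger = count_parameters([4, 64, 64, 1])  # two wider hidden layers

print(small)   # 49
print(larger)  # 4545
```

A person could plausibly inspect 49 numbers by hand, but not several thousand, and real networks are far larger than either example.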
How to Resolve These Issues?
Since issues within artificial intelligence or machine learning models may cause harm when the algorithms are applied to critically important tasks, they require prompt solutions. Here’s how one can address the black box problem:
- Carefully design the ML system to make it more transparent and let the users analyze why the system makes certain decisions.
- Implement systematic governance practices: make hidden assumptions explicit, manage ML algorithms more strictly, review how models are built, favor open-source algorithms and open training data, etc.
- Use external tools to monitor how the ML system works. One such tool is ATMSeer presented by MIT, which lets users see how an automated machine learning system works.
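As a small illustration of probing a model from the outside, the sketch below uses permutation importance, a general model-agnostic technique. It is not how ATMSeer works, and everything in it is an assumption made for the example: `opaque_model` is a hypothetical stand-in for a model we cannot inspect, and the data is randomly generated. The idea is to shuffle one input feature at a time and measure how much the prediction error grows.

```python
import random


def opaque_model(row):
    # Hypothetical stand-in for a trained model whose internals are hidden.
    # It relies on features 0 and 1 and ignores feature 2.
    return 3.0 * row[0] - 2.0 * row[1]


def mean_squared_error(predict, rows, targets):
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, n_features, seed=0):
    """Return the error increase caused by shuffling each feature column."""
    rng = random.Random(seed)
    baseline = mean_squared_error(predict, rows, targets)
    importances = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between feature j and the target
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        importances.append(mean_squared_error(predict, shuffled, targets) - baseline)
    return importances


# Synthetic data, generated only for this example.
data_rng = random.Random(42)
rows = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
targets = [opaque_model(r) for r in rows]

scores = permutation_importance(opaque_model, rows, targets, n_features=3)
for j, score in enumerate(scores):
    print(f"feature {j}: error increase {score:.3f}")
```

Features the model depends on show a clear error increase when shuffled, while the ignored feature shows none. This tells us which inputs matter without opening the box, although it still does not explain why the model uses them the way it does.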