Machine learning has become a hot topic in math and computer science over the past few years. Plenty of resources have popped up to help aspiring machine learning researchers get familiar with the field’s basic concepts. New books, textbooks, and video courses have made a complicated subject more accessible than ever, centralizing knowledge about the field and its possibilities.

However, machine learning is still hard, and a lot of work remains before it delivers the promised benefits of super AIs that solve the world’s problems.

Why Machine Learning is So Challenging

Machine learning is difficult for a couple of reasons that we’ll examine in more detail throughout this article. First, however, it’s useful to conceptualize what machine learning entails. The basic problems of machine learning revolve around getting computers to make good decisions, examine situations, and use context to provide good answers.

You’ve probably seen basic machine learning in action when Facebook suggests friends, YouTube suggests videos, or Amazon suggests products you might like. These algorithms are fairly straightforward as they use a few variables about the things you’ve liked in the past to suggest new things you might also like.
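As a rough illustration of the idea, here is a minimal sketch of a suggestion algorithm that uses nothing but past likes. It is not how Facebook, YouTube, or Amazon actually work; the `recommend` function, the sample users, and their interests are all made up for the example.

```python
from collections import Counter

def recommend(likes, user, top_n=2):
    """Suggest items liked by users who share at least one like with `user`."""
    mine = likes[user]
    scores = Counter()
    for other, theirs in likes.items():
        if other == user:
            continue
        overlap = len(mine & theirs)  # number of shared likes acts as a weight
        if overlap == 0:
            continue
        for item in theirs - mine:  # only score items the user hasn't liked yet
            scores[item] += overlap
    # Sort by score (highest first), then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

# Hypothetical data: each user maps to the set of things they have liked.
likes = {
    "ana": {"cooking", "hiking", "jazz"},
    "ben": {"hiking", "jazz", "chess"},
    "cy": {"jazz", "chess", "poetry"},
    "dee": {"opera"},
}

print(recommend(likes, "ana"))  # ['chess', 'poetry']
```

Ana shares two likes with Ben and one with Cy, so the things they like that she hasn’t seen yet float to the top. Dee shares nothing with anyone, so she gets no suggestions at all.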

However, machine learning gets more challenging as the number of variables grows and the computer has to interpret more context. The more we ask computers to actually think, the harder the problem gets.

The Coding is Not the Hard Part

The actual coding of the algorithms is not the hard part. Of course, if you’re interested in working in machine learning, a strong background in computer science will be essential, and an advanced degree is helpful. However, we understand how to write the code that powers machine learning algorithms, and we have open source frameworks and tools that make it possible.

The Math is Tricky, But Not Impossible

Depending on how deep down the machine learning rabbit hole you want to go, you’ll need advanced math in order to understand the algorithms being proposed in today’s academic journals. However, the linear algebra, calculus, and statistics taught in most computer science and engineering programs are usually enough to get started in machine learning.

Debugging is Much Harder

Debugging is a problem for any programmer. But unlike in traditional programming, debugging a machine learning algorithm is not as simple as identifying where the problem is coming from and fixing it. Stanford researcher Zayd Enam gives a good explanation of why this is the case. In traditional programming, there are essentially two ways in which code can go wrong: errors in implementation or errors in your original algorithm. Fixing a bug is the process of figuring out which of those went wrong and either correcting the code or rethinking your algorithm.

With machine learning, the machine is building its own understanding of the scenario, and it requires enough correct data to do that. So, you could have an issue of implementation or algorithm, but you could also have given the machine too little correct data, or your training data could be weakly labeled.

When the machine doesn’t produce the correct result, it’s difficult to isolate what caused the bug. As a result, debugging in machine learning takes significantly longer than in traditional software development.
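To make that concrete, here is a small sketch in which the code is identical in both runs and only the labels differ. The toy nearest-centroid classifier, the numbers, and the particular labeling mistake are all assumptions chosen for illustration.

```python
def train_centroids(xs, labels):
    """Return the mean x value for each class label."""
    sums, counts = {}, {}
    for x, y in zip(xs, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(centroids, x):
    """Pick the class whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))

def accuracy(centroids, xs, true_labels):
    hits = sum(predict(centroids, x) == y for x, y in zip(xs, true_labels))
    return hits / len(xs)

xs = list(range(20))
# Ground truth: values below 10 are class 0, the rest are class 1.
true_labels = [0 if x < 10 else 1 for x in xs]
# Hypothetical labeling mistake: values 15..19 were tagged as class 0.
bad_labels = [0 if (x < 10 or x >= 15) else 1 for x in xs]

clean_acc = accuracy(train_centroids(xs, true_labels), xs, true_labels)
noisy_acc = accuracy(train_centroids(xs, bad_labels), xs, true_labels)

print(f"trained on clean labels: {clean_acc:.2f}")
print(f"trained on bad labels:   {noisy_acc:.2f}")
```

Nothing in the implementation or the algorithm changed between the two runs, yet the second model scores lower against the true labels. Stepping through the code line by line would never reveal why, because the bug lives in the data.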

No Modularity Makes Any Change Exponential

Traditional software is usually built in modules that can be taken apart and tested to make sure that each independent part is running successfully. It’s therefore possible to make a small change to one part of the code without it affecting other parts of an otherwise functioning program. That’s not the case with machine learning. Gil Press calls this problem “non-modularity.” Since machine learning projects aim to teach a computer how to do something, changing just one thing has repercussions throughout the entire system.

Human Bias of Data

Another major problem that makes machine learning so hard is human data. It’s hard to conceptualize, but human-collected data carries human bias. Recent news stories illustrate this: AIs have learned racial and gender bias from human writing, and the Chicago Police Department used an algorithm to create a “Heat List” of people likely to commit violent crime in the city.

Less newsworthy are all the suboptimal ways that humans collect data or measure what they want to measure instead of what needs to be measured. This leads AIs to draw false or skewed conclusions about the world because the data humans supplied was not comprehensive.
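A tiny sketch shows how a convenient but incomplete sample skews a conclusion. The scenario (visit lengths recorded only on weekdays) and every number in it are invented for illustration.

```python
from statistics import mean

# Hypothetical visits: (day type, minutes spent). Weekend visits run longer.
visits = [("weekday", 20)] * 5 + [("weekend", 60)] * 2

# What actually happened across the whole week.
full_mean = mean(minutes for _, minutes in visits)

# What gets measured if data is only collected when staff are in the office.
biased_mean = mean(minutes for day, minutes in visits if day == "weekday")

print(f"all visits:    {full_mean:.1f} minutes")
print(f"weekdays only: {biased_mean:.1f} minutes")
```

A model trained on the weekday-only numbers would consistently underestimate how long people stay, and nothing in the data itself would hint that anything was missing.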

The Benefits of Machine Learning

Despite the challenges, machine learning holds the potential to revolutionize the way we understand and interact with the world. It helps us understand data, make predictions, and solve problems. The future of machine learning is cooperation between humans and machines to increase our capacity to think about, understand, and appreciate the world around us.
