A Machine Learning Primer

By FOSSlife Team, 21 December, 2020

The terms artificial intelligence and machine learning are everywhere in today’s computing landscape. But what do these concepts really mean, and why are they important? In this Resource article, we provide a brief introduction that focuses on machine learning and gives examples of how it affects our daily lives.

What Is Machine Learning?

Machine learning (ML) is a branch or subset of artificial intelligence (AI), so let’s start there. “At the birth of the field of AI in the 1950s, AI was defined as any machine capable of performing a task that would typically require human intelligence,” explains Nick Heath at ZDNet.

“AI systems will generally demonstrate at least some of the following traits: planning, learning, reasoning, problem solving, knowledge representation, perception, motion, and manipulation and, to a lesser extent, social intelligence and creativity,” Heath says.

Building on that understanding, machine learning, according to IBM, is “a form of AI that enables a system to learn from data rather than through explicit programming.” According to Carnegie Mellon University, it’s about “machines improving from data, knowledge, experience, and interaction.”

The aim of machine learning techniques, then, is to improve machines’ ability to recognize patterns and make predictions, and thus to make related recommendations and take logical action.

This predictive process powers many of the most popular services we use today. Google, Facebook, Twitter, Netflix, and YouTube all incorporate machine learning techniques to analyze user data, identify patterns in activity, and tailor recommendations.

Machine learning has applications in many other fields as well, including medicine, banking, autonomous vehicles, and natural language processing. For example, PayPal uses machine learning to analyze millions of transactions and flag anomalies to prevent fraud and detect money laundering, says Leah Davidson. “Machine learning can also predict hypoxaemia (low oxygen levels) during surgery and recognize cardiovascular risk factors like high blood pressure, age, and smoking in retinal images,” she says.
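To give a feel for what “flagging anomalies” in transaction data can mean, here is a minimal sketch in Python. It is not how any payment company actually detects fraud. It is a simple statistical rule (the modified z-score, based on the median) applied to made-up amounts, and the function name, the data, and the threshold are all assumptions chosen for illustration. Real systems learn from far richer data than a single column of amounts.

```python
from statistics import median

# Hypothetical transaction amounts; one of them is clearly out of line.
AMOUNTS = [12.5, 9.99, 14.0, 11.25, 13.4, 10.0, 950.0, 12.0]

def flag_anomalies(amounts, threshold=3.5):
    """Return the amounts whose modified z-score exceeds the threshold.

    The modified z-score measures distance from the median in units of
    the median absolute deviation (MAD). The constant 0.6745 and the
    cutoff of 3.5 are conventional defaults, used here as assumptions.
    """
    center = median(amounts)
    mad = median(abs(a - center) for a in amounts)
    if mad == 0:
        # At least half the values equal the median; no usable spread.
        return []
    return [a for a in amounts if 0.6745 * abs(a - center) / mad > threshold]

print(flag_anomalies(AMOUNTS))
```

With this sample data, only the 950.0 transaction is flagged. Note that the rule is fixed in advance by a person, whereas a machine learning approach would instead derive what counts as unusual from past examples.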

How Does Machine Learning Work?

Put simply, Heath says, “machine learning is the process of teaching a computer system how to make accurate predictions when fed data.”

Such predictions involve, for example, identifying a piece of fruit in a photo, such as a banana or an apple. The key feature of machine learning, he says, “is that a human developer hasn't written code that instructs the system how to tell the difference between the banana and the apple. Instead a machine-learning model has been taught how to reliably discriminate between the fruits by being trained on a large amount of data.”
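The banana-versus-apple idea can be sketched in a few lines of Python. This is a toy illustration, not the kind of image model Heath describes: instead of photos, it assumes each fruit has already been reduced to two made-up measurements (length in centimeters and a “yellowness” score), and it uses a very simple technique called a nearest-centroid classifier. The data, feature choices, and function names are all hypothetical. The point to notice is that no line of code says “bananas are long and yellow”; that knowledge comes from the labeled examples.

```python
from statistics import mean

# Hypothetical training data: (length_cm, yellowness from 0 to 1), label.
TRAINING = [
    ((18.0, 0.90), "banana"),
    ((20.0, 0.85), "banana"),
    ((17.5, 0.95), "banana"),
    ((7.5, 0.20), "apple"),
    ((8.0, 0.10), "apple"),
    ((7.0, 0.30), "apple"),
]

def train(examples):
    """Learn one average feature vector (centroid) per label."""
    grouped = {}
    for features, label in examples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }

def predict(model, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))

model = train(TRAINING)
print(predict(model, (19.0, 0.80)))  # a long, yellow fruit
print(predict(model, (7.8, 0.25)))   # a short, not very yellow fruit
```

With this sample data, the first fruit is classified as a banana and the second as an apple. Feeding the same code different labeled examples would teach it a different distinction without any change to the program itself.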

According to IBM, there are four main approaches to machine learning, which are detailed on its website:

Supervised learning
Unsupervised learning
Reinforcement learning
Deep learning
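To make the difference between the first two approaches more concrete: supervised learning trains on examples that come with the right answer attached (a label), while unsupervised learning receives no labels and has to find structure on its own. The following Python sketch illustrates the unsupervised case with a stripped-down version of k-means clustering, limited to two groups and a single measurement. The fruit lengths and the function name are hypothetical, and real clustering libraries handle many more dimensions and edge cases.

```python
# Hypothetical fruit lengths in centimeters, with no labels attached.
LENGTHS = [18.0, 7.5, 20.0, 8.0, 17.5, 7.0]

def two_means(values, iterations=10):
    """Split 1-D values into two groups by refining two group centers."""
    low, high = min(values), max(values)
    if low == high:
        # All values are identical, so there is nothing to separate.
        return list(values), []
    for _ in range(iterations):
        # Assign each value to whichever center is currently closer.
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        # Move each center to the average of the values assigned to it.
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return near_low, near_high

short_group, long_group = two_means(LENGTHS)
print(short_group)
print(long_group)
```

With this sample data, the algorithm separates the three short fruits from the three long ones. It never learns the words “apple” or “banana”; it only discovers that the measurements fall into two clusters, and naming those clusters is left to a person.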

Regardless of the approach used, the training process involves massive amounts of data along with machine-learning algorithms to find patterns in that data. This data, according to Technology Review, “encompasses a lot of things—numbers, words, images, clicks, what have you. If it can be digitally stored, it can be fed into a machine-learning algorithm.”

And, in many cases, this data is generated by users and collected by various companies. As the Stanford Encyclopedia of Philosophy points out, “The data trail we leave behind is how our ‘free’ services are paid for.”

Open Source Tools

“When you have a large data set on which you’d like to perform predictive analysis or pattern recognition, machine learning is the way to go,” says Serdar Yegulalp. “The proliferation of free open source software has made machine learning easier to implement both on single machines and at scale, and in most popular programming languages. These open source tools include libraries for the likes of Python, R, C++, Java, Scala, Clojure, JavaScript, and Go,” he says.

Algorithmia has compiled a list of open source machine learning tools you should know.

The LF AI & Data Foundation, which supports open source projects within artificial intelligence, machine learning, deep learning, and data, has created an interactive landscape to provide a broader picture of the various tools and projects within the space.

Ethical Concerns

Machine learning offers certain advantages through the use of algorithms and models to predict outcomes. However, its use also raises ethical questions and concerns involving data privacy and inherent bias.

“These questions not only concern the possibility of harm by the misuse of data, but also questions of how to preserve privacy where data is sensitive, how to avoid bias in data selection, how to prevent disruption and ‘hacking’ of data, and issues of transparency in data collection, research and dissemination,” says Debra Satz.

“A decision-making algorithm,” notes Samuele Lo Piano, “will always be based on a formal system, which is a representation of a real system.” And, he says, “it does not matter how complicated the algorithm may be (how many relations may be factored in), it will always represent one specific vision of the system being modelled.” Even the data on which the algorithm is trained does not represent an objective truth, because it depends on the context in which it has been produced, he says.

In any case, as IBM states, “the data scientists doing the work must ensure they are using the right algorithms, ingesting the most appropriate data (that is accurate and clean), and using the best performing models.”

Learn More

Understanding the principles, practices, and ethics of machine learning and artificial intelligence is the work of a lifetime, and, in this article, we have provided only a basic introduction to the topic. The resources included within the article as well as those below will help you learn more.