Chapter 2 — How Machines Learn

Posted Jan 26, 2026

By Saurav Nagpal

10 min read

🚫 What Machine Learning Is Not

In Chapter 1, Tensor Owl explained the basic concepts of Artificial Intelligence, covering what AI, ML, and DL are and introducing their different types.

The main question now is: how does a machine show intelligence?
To show intelligence, a machine must first learn. But how does a machine learn?

Imagine you are planning to watch a movie in the theatre with your wife. You look at the movies released this week and watch their trailers to understand their genre and theme. As you do this, you naturally compare them with movies she liked in the past and try to guess which one she might enjoy.
A machine does not understand ideas like genre or theme. It can only work with numbers.
So, if we assign numeric values, such as how much comedy or how much family drama a movie has, the machine can place movies on a scale and compare them.

Example

Let’s suppose you provide the details of previously watched movies to Tensor Owl. He arranges the information in a structured format, where each movie is represented using numerical values for comedy and family drama, along with the final reaction:

Movie Name	Comedy Scale (0–100)	Family Drama Scale (0–100)	Audience Reaction (Wife)
DDLJ	25	85	❤️ Loved
Hera Pheri	95	15	❤️ Loved
Dhoom	15	20	😢 Not Liked
3 Idiots	90	35	❤️ Loved
Baahubali	10	30	😢 Not Liked
Dangal	30	95	❤️ Loved

You ask Tensor Owl whether the new movie RRR will be a good choice for the weekend based on past preferences.
After watching the trailer, you share your impression that it is high on action, with little comedy and very little family emotion. Based on this, Tensor Owl assigns the values: Comedy 20 and Family Drama 10.

Tensor Owl writes a simple program with a few fixed rules: a movie counts as a good choice only if its values fall within predefined ranges. Using these rules, the program judges that this movie will not be a good choice for the weekend.
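
To make the idea concrete, here is a minimal sketch of what such a rule-based program might look like. The chapter does not state the actual rules, so the two thresholds below are hypothetical; they were chosen only because they happen to agree with the six movies in the table.

```python
# Hypothetical thresholds: the chapter does not give the real rules.
COMEDY_MIN = 80
FAMILY_DRAMA_MIN = 80


def is_good_choice(comedy: int, family_drama: int) -> bool:
    """Return True if the movie passes either hand-written rule."""
    return comedy >= COMEDY_MIN or family_drama >= FAMILY_DRAMA_MIN


# RRR as scored in the text: Comedy 20, Family Drama 10.
rrr_verdict = is_good_choice(comedy=20, family_drama=10)
print("RRR is a good choice:", rrr_verdict)
```

Notice that nothing in this program comes from the data itself. A person picked the numbers, and the program will never change them on its own.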

This kind of rule-based decision-making is not machine learning.

🧠 What Machine Learning Really Means

In the earlier example, RRR sat close to Dhoom and Baahubali. Its values were similar to theirs, so fixed rules worked there. But real situations are not always so clear. What happens when a new movie does not fall neatly into any predefined range?

Imagine a new movie that feels balanced, with Comedy 50 and Family Drama 50. It sits near the centre of the scale. In this case, the rule-based program cannot decide easily.

To illustrate this problem, Tensor Owl plots all the existing movies on one graph and places the new movie alongside them using the same scale. The new movie lands near the centre, where it does not clearly belong to any group.

When the movie sits in the middle, the program can no longer decide by direct comparison. There is no single past movie that clearly matches it.

To move forward, Tensor Owl looks at the movies that are closest to this new point. He draws a small circle around the new movie. If no movies fall inside the circle, the circle slowly grows. When a few nearby movies appear inside the circle, he stops. These nearby movies are called neighbours.
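
The growing circle can be sketched in a few lines of Python. This is only an illustration under some assumptions: the starting radius, the step size, and the stopping count of three movies are hypothetical values, and distance is measured as straight-line (Euclidean) distance on the comedy and family drama scales.

```python
import math

# (comedy, family drama) scores from the table of past movies.
movies = {
    "DDLJ": (25, 85),
    "Hera Pheri": (95, 15),
    "Dhoom": (15, 20),
    "3 Idiots": (90, 35),
    "Baahubali": (10, 30),
    "Dangal": (30, 95),
}


def find_neighbours(point, movies, start_radius=10.0, step=5.0, min_count=3):
    """Grow a circle around `point` until it holds at least `min_count` movies."""
    min_count = min(min_count, len(movies))  # never ask for more than exist
    radius = start_radius
    while True:
        inside = [name for name, scores in movies.items()
                  if math.dist(point, scores) <= radius]
        if len(inside) >= min_count:
            return sorted(inside), radius
        radius += step  # nobody (or too few) inside: widen the circle


neighbours, radius = find_neighbours((50, 50), movies)
print(neighbours, radius)
```

With these settings the circle stays empty for a while, because every past movie is fairly far from the balanced point, and stops growing once three movies fall inside it.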

Once the nearby movies are identified, their past outcomes are observed and compared. If most of the nearby movies were loved, the new movie is expected to be liked as well. If most of them were not liked, the opposite conclusion is reached. The decision does not come from predefined rules or understanding, but from similarity.

This way of reasoning is very close to how humans make judgments. When something new appears, we rarely analyze it in isolation. Instead, we compare it with what we have already experienced and trust the pattern that emerges. Here, memory and intuition are replaced by distance and numbers.

Without realizing it, you have already learned your first machine learning algorithm — K-Nearest Neighbours (KNN). It works by looking at the closest examples and using their past outcomes to make a decision for something new. Here, K simply refers to how many nearby examples are considered before making that decision.
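
Here is a minimal sketch of KNN applied to the movie table. It assumes straight-line (Euclidean) distance and K = 3, neither of which the chapter fixes, and the function name `knn_predict` is made up for this example.

```python
import math
from collections import Counter

# ((comedy, family drama), reaction) for each past movie.
past_movies = [
    ((25, 85), "Loved"),      # DDLJ
    ((95, 15), "Loved"),      # Hera Pheri
    ((15, 20), "Not Liked"),  # Dhoom
    ((90, 35), "Loved"),      # 3 Idiots
    ((10, 30), "Not Liked"),  # Baahubali
    ((30, 95), "Loved"),      # Dangal
]


def knn_predict(point, examples, k=3):
    """Return the most common label among the k examples closest to `point`."""
    nearest = sorted(examples, key=lambda ex: math.dist(point, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


balanced = knn_predict((50, 50), past_movies)  # the balanced new movie
rrr = knn_predict((20, 10), past_movies)       # RRR from the earlier example
print("Balanced movie:", balanced)
print("RRR:", rrr)
```

An odd K is a common choice when there are only two possible outcomes, because the vote can then never end in a tie.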

Instead of following fixed, predefined rules, a machine learning algorithm learns patterns from past data and uses those patterns to make decisions on new data.

🧱 The Building Blocks of Machine Learning

At its simplest level, every machine learning system follows the same basic flow. Input is given to a model, and the model produces an output based on the algorithm it uses.

Let’s now look at each part of this flow in a little more detail.

Input: Data, Features, and Labels

Input is the information given to the system so it can make a decision or prediction. It describes the current situation.
In the movie example, this information includes numbers such as how much comedy or family drama a movie has.
These numbers are called features. Features describe the input in a simple, measurable way. Sometimes a feature is a single number, and sometimes multiple numbers are grouped together. A single number is called a scalar, and a group of numbers used together is often called a vector.

Labels are the outcomes we already know from the past. They tell us what actually happened. In the movie example, a label is whether a movie was liked or not liked. This information comes from past experiences, not from the prediction itself. Labels are not part of the current input. They are used as reference points to understand how good a prediction is.
Simply put, labels are the answers from the past that help guide learning.
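
As a small illustration, the movie table can be written down as features and labels in plain Python. The variable names here are just one possible choice, not a fixed convention from the chapter.

```python
# A scalar is a single number; a vector groups several numbers together.
comedy_score = 25            # scalar: DDLJ's comedy value
ddlj_features = [25, 85]     # vector: [comedy, family drama]

# One feature vector per past movie, in table order.
features = [[25, 85], [95, 15], [15, 20], [90, 35], [10, 30], [30, 95]]

# One label per past movie: the outcome we already know.
labels = ["Loved", "Loved", "Not Liked", "Loved", "Not Liked", "Loved"]

# A new movie has features but no label yet; the label is what we want to predict.
new_movie = [50, 50]

for vector, label in zip(features, labels):
    print(vector, "->", label)
```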

Together, data, features, and labels form the input side of a machine learning system while it is learning. Features define what the system sees, and labels define what it is trying to learn.

Model

The model is the heart of a machine learning system. It is the part that takes the input and produces an output.
It decides how the input features are used to make a prediction. Different models use different ways to do this, but the goal is always the same: to connect input to output in a useful way.
In the movie example, the model is what uses the features of a movie to decide whether it is likely to be liked or not. In our earlier case, this model followed the idea behind K-Nearest Neighbours (KNN), where decisions are guided by nearby examples.

Output

The output is the result produced by the model. It is the decision or prediction made after looking at the input.
In the movie example, the output is whether the movie is expected to be liked or not liked. This output is based on how the model interprets the input features.

Learning and Feedback

After the model produces an output, it can be compared with what actually happened. This comparison tells us whether the decision was correct or not.

The difference between the prediction and the actual outcome is called feedback. It shows how far the model’s decision was from what really happened.
Learning happens when this feedback is used to improve future decisions. Over time, as more examples are seen and more feedback is received, the model becomes better at making predictions.

Let’s suppose you still go for the movie with Comedy 50 and Family Drama 50, and it is actually liked by your wife. This new outcome now becomes part of the experience. The system can use this information the next time a similar movie appears.
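
The sketch below shows one way this could play out with a nearest-neighbour model, where "learning" simply means adding the new outcome to the stored examples. The later movie with Comedy 55 and Family Drama 45 is a hypothetical example, and K = 3 with straight-line distance is an assumption.

```python
import math
from collections import Counter

# Past experience: ((comedy, family drama), reaction).
experience = [
    ((25, 85), "Loved"),      # DDLJ
    ((95, 15), "Loved"),      # Hera Pheri
    ((15, 20), "Not Liked"),  # Dhoom
    ((90, 35), "Loved"),      # 3 Idiots
    ((10, 30), "Not Liked"),  # Baahubali
    ((30, 95), "Loved"),      # Dangal
]


def predict(point, examples, k=3):
    """Majority vote among the k closest stored examples."""
    nearest = sorted(examples, key=lambda ex: math.dist(point, ex[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


similar_movie = (55, 45)  # hypothetical movie that shows up later

before = predict(similar_movie, experience)

# Feedback: the 50/50 movie was watched and turned out to be loved.
experience.append(((50, 50), "Loved"))

after = predict(similar_movie, experience)
print("Before feedback:", before)
print("After feedback:", after)
```

The code that makes the prediction never changes. Only the stored experience grows, and that alone is enough to change the answer for a similar movie.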

In simple terms, a machine learning system improves by noticing its mistakes and reducing them gradually.

Loss Function — Measuring Mistakes

After a prediction is made, we need a way to understand how good or bad it was. This is where the idea of loss comes in.
Loss answers a simple question: how wrong was the prediction? If the predicted value is very far from what actually happened, the loss is high. If the prediction is close to the actual outcome, the loss is low.

For example, suppose the model predicts a liking score of 30 for a movie, but the actual liking score turns out to be 80. The difference between these two values is large, so the loss is high. If the model predicts a liking score of 75 and the actual score is 80, the difference is small, so the loss is low.
Loss acts like feedback. It does not tell the system what the correct answer should be, but it shows how far the prediction was from reality.
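
The two cases above can be checked with a tiny loss function. The chapter does not name a specific formula, so this sketch assumes the simplest one: the absolute difference between the predicted and the actual score.

```python
def absolute_loss(predicted: float, actual: float) -> float:
    """How far the prediction was from the actual outcome (assumed loss)."""
    return abs(predicted - actual)


far_off = absolute_loss(predicted=30, actual=80)
close = absolute_loss(predicted=75, actual=80)

print("Loss when predicting 30:", far_off)  # 50
print("Loss when predicting 75:", close)    # 5
```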

Learning = Minimizing Loss

This leads us to the core idea of machine learning. Learning is the process of reducing loss over time. Each time a prediction is made, loss is measured. The system then adjusts itself so that future predictions are less wrong than before. As this process repeats, the loss becomes smaller and decisions improve. Everything else in machine learning builds on this simple idea. The goal is always the same: make fewer mistakes next time.
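
A toy loop makes this repetition visible. It is only a sketch under stated assumptions: the update rule (move the prediction part of the way toward the actual score) and the `learning_rate` value are hypothetical choices, not something the chapter specifies.

```python
actual = 80.0         # the liking score that really happened
prediction = 30.0     # the model's first guess
learning_rate = 0.5   # hypothetical: how big each correction is

losses = []
for step in range(5):
    loss = abs(prediction - actual)   # measure the mistake
    losses.append(loss)
    # Adjust: move the prediction part of the way toward the actual score.
    prediction += learning_rate * (actual - prediction)

print(losses)  # [50.0, 25.0, 12.5, 6.25, 3.125]
```

Each pass through the loop measures the loss and then makes a small correction, so the loss shrinks with every step.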

🧾 Summary

Machine Learning is not about writing fixed rules; it is about learning patterns from data.
Rule-based programs work only when conditions are clear and predefined, but they fail when situations become uncertain or overlap.
Every machine learning system follows a simple flow: Input → Model → Output → Learning through feedback.
Input is represented using features, labels provide known outcomes from the past, and models connect inputs to predictions.
Learning happens by measuring mistakes using loss and gradually reducing those mistakes over time.

🧩 Conclusion

In this chapter, Tensor Owl moved beyond rule-based thinking and explored what machine learning really means. He showed how fixed rules work only in simple situations, and why they fail when data overlaps or becomes uncertain. Instead, machine learning relies on past examples, similarity, feedback, and gradual improvement.

By walking through a simple movie example, we saw how machines make decisions by comparing new inputs with previous experiences, how models produce predictions, and how mistakes are measured and reduced over time.

With this understanding, we are now ready to move forward — from how machines learn to how learning happens in practice. The next chapter begins our journey into supervised learning, where machines learn from examples with known outcomes.

🔎 Recap & Reflection

An e-commerce system generates offer prices for customers using loyalty duration and past purchase amounts. Is this an example of machine learning?

The same e-commerce system updates its future offer prices after observing whether customers accept or reject previous offers. What best describes this change?

A system predicts house prices using area and number of rooms. Its predictions are often close but sometimes far from actual prices. What helps the system improve over time?

A navigation app initially gives poor route suggestions. After many trips, it starts suggesting faster routes for similar traffic conditions. What best explains this improvement?

A chatbot answers user questions. For one group of users, its responses are correct 50 times out of 100. For another group, its responses are correct only 30 times out of 100. Why is this difference important?

A weather model predicts rain daily but does not record whether rain actually occurred. What key element of learning is missing?

A new movie lies close to several previously liked movies and a few disliked ones. The system predicts it will be liked. What factor most directly influenced this decision?

A model improves because its future predictions are closer to real outcomes. What is decreasing over time?

A model’s predictions remain poor even after many learning cycles. Later it is found that past outcomes were recorded incorrectly. What was most likely wrong?

Artificial Intelligence Storybook, Machine Learning, AI course

This post is licensed under CC BY 4.0 by the author.