What is Artificial Intelligence? A Beginner’s Guide

Part 10: Introduction to Linear Regression

Divyajot Kaur — Thu, 11 Jun 2026 17:03:47 GMT

So far in this series, we've explored datasets, preprocessing, train-test splitting, overfitting, underfitting, bias, and variance. Now it's time to dive into the algorithms that actually make predictions.

We'll begin with one of the simplest and most important supervised learning algorithms: Linear Regression.

Despite its simplicity, Linear Regression is widely used in real-world applications and serves as the foundation for many advanced machine learning techniques.

Let's understand how it works.

🤔What is Linear Regression?

Linear Regression is a supervised learning algorithm used to predict continuous values by finding the relationship between input features and an output variable.

A continuous value is a number that can take any value within a range.

In simple terms, it helps answer questions like:

What will the house price be?
What will next month's sales be?
What score might a student get in an exam?

Unlike classification algorithms that predict categories such as "Spam" or "Not Spam," Linear Regression predicts numerical values.

📈Prediction of Continuous Values

Imagine you're trying to predict a student's exam score based on the number of hours they study.

You might notice a pattern:

As study hours increase, exam scores tend to increase as well.

Linear Regression learns this relationship from the data and uses it to make predictions.

For example:

If a student studies for 7 hours, the model can estimate their expected score.

This ability to predict numerical values is what makes Linear Regression useful for many real-world problems.

🎯Understanding the Line Fitting Intuition

Suppose we plot the study hours and exam scores on a graph.

The points may not form a perfect straight line, but we can still see a trend.

Linear Regression tries to draw a line that best represents this trend.

The goal is not to pass through every data point.

Instead, the goal is to find a line that stays as close as possible to most of the points.

Think of it like this:

Imagine placing a ruler through a scattered set of points on paper. You would position it where it represents the overall pattern rather than trying to touch every point.

Linear Regression does the same thing mathematically.

This line is often called the Best Fit Line.
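The line-fitting idea can be sketched in a few lines of Python. The study data below is made up for illustration, and `fit_line` is a hypothetical helper that applies the standard least-squares formulas for a single input feature. Treat it as a minimal sketch, not a full implementation.

```python
# Made-up study data for illustration: hours studied -> exam score.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 55, 61, 64, 70, 73]

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = how x and y vary together, divided by how much x varies.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(hours, scores)
predicted_7 = slope * 7 + intercept
print(f"score = {slope:.2f} * hours + {intercept:.2f}")
print(f"Estimated score for 7 hours of study: {predicted_7:.1f}")
```

The fitted line does not pass through every point. It only captures the overall trend, which is what lets it estimate a score for 7 hours even though that value is not in the data.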

🧮How Does the Model Know If the Line Is Good?

Once a line is drawn, the model compares its predictions with the actual values.

For every data point:

Error = Actual Value − Predicted Value

A good line produces smaller errors.

A poor line produces larger errors.

The model's goal is to reduce these errors as much as possible.
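As a small sketch, here is the error formula applied to every data point. The data and the candidate line `score = 4 * hours + 50` are made up for illustration; they are not the result of any training.

```python
# Made-up data: hours studied and the actual exam scores.
hours = [1, 2, 3, 4, 5]
actual = [52, 55, 61, 64, 70]

# A candidate line chosen by hand for illustration: score = 4 * hours + 50
def predict(h):
    return 4 * h + 50

# Error = Actual Value - Predicted Value, computed for every data point.
errors = [a - predict(h) for h, a in zip(hours, actual)]
for h, a, e in zip(hours, actual, errors):
    print(f"hours={h}  actual={a}  predicted={predict(h)}  error={e}")
```

A negative error means the line predicted too high for that point, and a positive error means it predicted too low.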

This is where the concept of a Cost Function comes in.

📉Cost Function (High-Level Overview)

A Cost Function is a measure of how well or poorly the model is performing.

It calculates the overall error made by the model across all data points.

In simple words, it tells us how far the model's predictions are from the actual values.

Simple Idea

Smaller cost = Better predictions
Larger cost = Worse predictions

During training, Linear Regression continuously adjusts the line to reduce the cost.

You can think of it as a student improving their answers after checking mistakes on a practice test.

The fewer mistakes they make, the better their performance.

Similarly, the model keeps improving its line until the overall error becomes as small as possible.
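To make "smaller cost = better predictions" concrete, here is a minimal sketch. It assumes Mean Squared Error (MSE) as the cost function, which is one common choice for Linear Regression; the post itself does not name a specific one. The data and the two candidate lines are made up.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared errors across all data points."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Made-up data: hours studied and the actual exam scores.
hours = [1, 2, 3, 4, 5]
actual = [52, 55, 61, 64, 70]

# Two hand-picked candidate lines (hypothetical, for comparison only).
line_a = [4 * h + 50 for h in hours]  # follows the trend closely
line_b = [1 * h + 50 for h in hours]  # too flat for this data

cost_a = mean_squared_error(actual, line_a)
cost_b = mean_squared_error(actual, line_b)
print(f"Cost of line A: {cost_a:.1f}")
print(f"Cost of line B: {cost_b:.1f}")
```

Squaring the errors stops positive and negative errors from cancelling each other out, and it penalizes large mistakes more heavily than small ones.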

🌍Real-World Use Cases

Linear Regression is used in many industries because of its simplicity and effectiveness.

1. House Price Prediction

Predict the price of a house based on features such as size, location, and number of bedrooms.

2. Sales Forecasting

Estimate future sales using historical sales data.

3. Salary Prediction

Predict a person's salary based on factors such as experience and education.

4. Stock Market Analysis

Estimate future trends using historical numerical data.

5. Energy Consumption Prediction

Forecast electricity usage based on previous consumption patterns.

📝Conclusion

Linear Regression is a simple yet powerful algorithm for predicting continuous values. By learning patterns from data and fitting the best possible line, it helps make accurate predictions for a wide range of real-world problems.

Understanding Linear Regression is an important first step because many advanced machine learning algorithms build upon the same fundamental ideas.

🚀Coming Up Next

We've learned how Linear Regression makes predictions using a best-fit line. In the next blog, we'll explore Gradient Descent and understand how the model finds that line by continuously reducing errors.

Part 9: Bias–Variance Tradeoff in Machine Learning

Divyajot Kaur — Thu, 04 Jun 2026 09:33:46 GMT

Imagine two students preparing for an exam.

One student studies only a few topics and performs poorly because they don't understand enough concepts.
Another student memorizes every question from previous exams but struggles when new questions appear.

Both students have problems, but for different reasons.

This is exactly what happens in Machine Learning.

Some models learn too little and fail to capture important patterns, while others learn too much and end up memorizing the training data.

That's why understanding the role of Bias and Variance is essential. These two concepts help us understand why models underfit, why they overfit, and how we can build models that generalize well to unseen data.

To see how these concepts influence a model's performance, let's first understand them individually.

🎯What is Bias?

Bias is the error that occurs when a machine learning model is too simple to understand and capture the actual patterns in the data. Because the model makes too many assumptions, it misses important relationships and fails to learn effectively.

Think of it as a student who only studies chapter summaries instead of understanding the full concepts.

Example

Suppose we want to predict house prices.

The relationship between house size and price may look like a curve.

But if we use a simple straight-line model to represent this relationship, the model may miss important patterns.

As a result:

Training accuracy is low
Testing accuracy is also low
The model performs poorly everywhere

This is called underfitting.

Signs of High Bias

✅Model is too simple

✅Misses important patterns

✅High training error

✅High testing error

📈What is Variance?

Variance is the error that occurs when a model becomes too complex and learns not only the actual patterns but also the noise and random details in the training data.

A high-variance model becomes overly sensitive to the training data. As a result, it performs very well on the training data but struggles when it sees new data.

Think of it as a student who memorizes every practice question instead of understanding the concepts.

Example

Again, consider house price prediction.

If we create a very complex model that tries to fit every single training data point perfectly, it may capture noise instead of actual patterns.

The model performs extremely well on training data but struggles with new data.

This is called overfitting.

Signs of High Variance

✅Model is too complex

✅Memorizes training data

✅Very low training error

✅High testing error

⚖️Understanding Underfitting and Overfitting

The concepts of bias and variance are closely connected to underfitting and overfitting.

Underfitting = High Bias + Low Variance

The model is too simple.

It cannot capture the true relationship in the data.

Example

A model predicting student marks using only the number of study hours while ignoring attendance, assignments, and previous performance.

The model misses important information and makes poor predictions.

Overfitting = Low Bias + High Variance

The model is too complex.

It learns both useful patterns and unnecessary noise.

Example

A model that memorizes every student's exact exam score from past years but fails when new students take the exam.

The model performs brilliantly during training but poorly in real-world situations.

🤔Why Does the Bias–Variance Tradeoff Exist?

Increasing model complexity usually:

Reduces bias
Increases variance

Reducing model complexity usually:

Reduces variance
Increases bias

This creates a tradeoff.

The goal is to find the right balance where both bias and variance are reasonably low.
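The two extremes can be sketched with toy models. Everything here is an assumption made for illustration: the synthetic data (a curve plus random noise), the "constant" model that always predicts the training average (very simple, high bias), and the "memorizer" that copies the label of the nearest training point (very flexible, high variance).

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def make_data(n):
    """Synthetic curved relationship (y = x^2) plus random noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [x * x + rng.gauss(0, 5) for x in xs]
    return xs, ys

def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

train_x, train_y = make_data(30)
test_x, test_y = make_data(30)

# Very simple model: always predict the average training value.
mean_y = sum(train_y) / len(train_y)
def constant_model(x):
    return mean_y

# Very flexible model: copy the label of the nearest training point.
def memorizer(x):
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

for name, model in [("constant", constant_model), ("memorizer", memorizer)]:
    print(name,
          "train MSE:", round(mse(model, train_x, train_y), 1),
          "test MSE:", round(mse(model, test_x, test_y), 1))
```

The memorizer reproduces the training data exactly, so its training error is zero, yet it still makes errors on the new points because it also copied the noise. The constant model ignores the curve entirely, so its error stays high on both sets.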

🛠How to Balance Bias and Variance

1. Choose the Right Model Complexity

Avoid models that are too simple or too complex.

Start with a reasonable model and evaluate its performance.

Example

For a classification task:

A very shallow decision tree may underfit.
A very deep decision tree may overfit.

A moderately deep tree often performs best.

2. Use More Training Data

More data helps the model focus on genuine patterns rather than memorizing noise, and reduces the chance of overfitting.

Example

A facial recognition system trained on only 100 images may overfit.

Training it on thousands of images helps it generalize better.

3. Apply Regularization

Regularization prevents models from becoming excessively complex and helps them focus on important patterns in the data.

Common techniques include:

L1 Regularization (Lasso)
L2 Regularization (Ridge)

These methods help reduce overfitting.
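Here is a minimal sketch of the L2 (Ridge) idea for a line through the origin with a single slope `w`. The data and the penalty strength `lam` are made up, and `ridge_slope` is a hypothetical helper. It minimizes the squared error plus `lam * w**2`, which has the closed-form solution used below.

```python
# Made-up data with a roughly linear trend through the origin.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

def ridge_slope(xs, ys, lam):
    """Slope w minimizing sum((y - w*x)^2) + lam * w^2 (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # Setting the derivative to zero gives w = sxy / (sxx + lam).
    return sxy / (sxx + lam)

for lam in [0.0, 1.0, 10.0, 100.0]:
    print(f"lambda={lam:>5}: slope={ridge_slope(xs, ys, lam):.3f}")
```

With `lam = 0` this is ordinary least squares. As `lam` grows, the slope is pulled toward zero, which is how the penalty keeps the model from reacting too strongly to the training data.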

4. Use Cross-Validation

Instead of evaluating a model on a single train-test split, cross-validation tests it on multiple portions of the dataset.

This provides a more reliable estimate of how the model will perform on unseen data.
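The mechanics of k-fold cross-validation can be sketched as follows. `k_fold_indices` is a hypothetical helper written for this example. It only produces the index splits; in practice you would usually shuffle the data first and train and score a model on each fold.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `remainder` folds get one extra sample each.
        stop = start + fold_size + (1 if fold < remainder else 0)
        validation = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, validation
        start = stop

folds = list(k_fold_indices(n_samples=10, k=5))
for i, (train, validation) in enumerate(folds):
    print(f"Fold {i}: train={train} validation={validation}")
```

Every sample is used for validation exactly once, and the final score is typically the average of the scores from all folds.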

5. Feature Selection

Not every feature is useful.

Removing irrelevant features can reduce variance and improve model performance.

Example

When predicting house prices, features like location, size, and number of bedrooms are usually more useful than wall color.

🌍Real-World Example

Imagine you're building a machine learning model to predict whether an email is spam or not.

High Bias Model

The model only checks whether the email contains the word "free."

Because of this, many spam emails are missed.

Result: Underfitting.

High Variance Model

The model memorizes every word pattern from the training emails.

It performs extremely well on emails it has already seen, but struggles when new types of spam appear.

Result: Overfitting.

Balanced Model

The model learns meaningful patterns that commonly appear in spam emails without memorizing every detail.

As a result, it can accurately classify both familiar and new emails.

Result: Better performance on new emails.

📝Conclusion

Bias and variance represent two common challenges in machine learning.

High Bias leads to underfitting, where the model fails to learn important patterns.
High Variance leads to overfitting, where the model learns the training data too closely and struggles with new data.

The goal is to find the right balance between the two.

The best machine learning models don't memorize the data—they learn meaningful patterns and perform well on unseen data. That's the essence of the Bias–Variance Tradeoff.

🚀Coming Up Next

So far, we've covered datasets, preprocessing, train-test splitting, overfitting, underfitting, bias, and variance—the building blocks of machine learning.

In the next phase of this series, we'll start exploring Supervised Learning Algorithms and see how these concepts come together to build predictive models. We'll kick things off with Linear Regression, one of the simplest yet most powerful algorithms in machine learning.

Stay tuned!

Part 8: Overfitting vs Underfitting in Machine Learning

Divyajot Kaur — Thu, 28 May 2026 05:26:20 GMT

Imagine a student preparing for an exam.

One student memorizes every single question and answer from previous papers without actually understanding the concepts.

Another student barely studies and only understands a few basic ideas.

Now imagine both students facing completely new questions in the final exam.

What happens?

The first student struggles because the questions are slightly different.
The second student struggles because they never learned enough in the first place.

Machine Learning models face the exact same problem.

Some models learn too much, while others learn too little. These situations are called:

Overfitting
Underfitting

Finding the right balance between them is one of the most important goals in Machine Learning.

📉What is Underfitting?

Underfitting happens when a model is too simple to understand the patterns in the data.

The model:

Fails to learn properly
Performs poorly on training data
Also performs poorly on unseen data

This means:

The model has not learned enough.

🎓Real-Life Analogy for Underfitting

Imagine trying to pass a math exam after only reading the chapter titles.

You studied something, but not enough to truly understand the subject.

That is underfitting.

⚠️Signs of Underfitting

A model may be underfitting if:

Training accuracy is low
Testing accuracy is also low
Predictions are overly simple or inaccurate

📈What is Overfitting?

Overfitting happens when a model learns the training data too well—including noise and unnecessary details.

Instead of learning general patterns, the model starts memorizing the dataset.

As a result:

Training accuracy becomes very high
Testing accuracy becomes poor

This means:

The model performs well only on data it has already seen.

🎓Real-Life Analogy for Overfitting

Imagine a student memorizing answers to previous exam papers word for word.

If the final exam contains the exact same questions, the student performs perfectly.

But if the questions are slightly changed, the student struggles because they never truly understood the concepts.

That is overfitting.

⚖️Model Complexity

One major reason behind overfitting and underfitting is model complexity.

📌Simple Models

Very simple models may:

Miss important patterns
Make overly basic predictions

This often leads to:

Underfitting

📌Complex Models

Highly complex models may:

Learn every tiny detail
Capture random noise
Memorize training data

This often leads to:

Overfitting

🎯The Goal: Balanced Learning

The ideal model should:

Learn meaningful patterns
Ignore unnecessary noise
Perform well on unseen data

In simple terms:

We want a model that understands the subject instead of memorizing answers.

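🔁Summary Table

Aspect	Underfitting	Overfitting
Model complexity	Too simple	Too complex
Training accuracy	Low	Very high
Testing accuracy	Low	Poor
What the model learns	Too little	Too much, including noise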

🛠How to Reduce Underfitting

Some common solutions include:

Using a more complex model
Training for longer
Adding more useful features
Reducing excessive regularization

🛠How to Reduce Overfitting

Common solutions include:

Using more training data
Simplifying the model
Applying regularization
Using dropout (in deep learning)
Cross-validation

🎯Conclusion

A good Machine Learning model should learn meaningful patterns without memorizing noise.
In simple terms:

Underfitting means learning too little, while overfitting means learning too much.

The goal is to find the right balance between the two.

🔎Coming Up Next

In the next part, we’ll explore the Bias–Variance Tradeoff and understand how bias and variance are connected to underfitting and overfitting, and how to balance them effectively.

Part 7: Why We Split Data in Machine Learning?

Divyajot Kaur — Wed, 27 May 2026 16:39:55 GMT

Imagine preparing for an important exam.

You don’t just:

Read the textbook
Memorize every question
And walk into the final exam directly

Instead, you usually:

Study concepts
Practice with mock tests
And finally attempt the actual exam

Models learn in a very similar way.

If we train a model using all the available data without proper evaluation, the model may simply memorize patterns instead of actually learning them. To avoid this, we split the dataset into different parts:

Training data
Validation data
Test data

❓Why Do We Split Data?

The goal of Machine Learning is not just to perform well on known data, but to make accurate predictions on new, unseen data.

If we train and test the model on the same dataset, the model may appear accurate because it has already seen the answers before.

This is similar to:

Practicing the exact same questions that appear in the final exam.

The student may score well, but that does not necessarily mean they truly understand the subject.

Data splitting helps us evaluate whether the model can generalize to new situations.

📚Training Data — The Learning Phase

The training dataset is the portion of data used to teach the model.

During this phase, the model:

Learns patterns
Finds relationships between features
Adjusts its internal parameters

Think of it as:

A student studying concepts from textbooks and classroom notes.

This is where the actual learning happens.

Typically, the largest portion of data is used for training.

🧪Validation Data — The Practice Test

The validation dataset is used to evaluate the model while it is still learning.

It helps us:

Tune hyperparameters
Compare different models
Prevent overfitting

This is similar to:

Taking mock tests before the final exam.

The validation set allows us to check:

Is the model improving?
Or is it just memorizing the training data?

Without validation data, we may unknowingly build a model that performs well only on training data.

🎯Test Data — The Final Exam

The test dataset is used only after training is completely finished.

It provides:

The final evaluation of model performance
An unbiased estimate of how the model performs on unseen data

This is similar to:

The actual final exam.

The model has never seen this data before.

If the model performs well here, it means:

✔ It has learned useful patterns

✔ It can generalize effectively

⚠️What is Data Leakage?

Imagine a teacher accidentally gives students the exact final exam questions while they are practicing.

During the exam, the students score very high—not because they truly understood the concepts, but because they had already seen the answers earlier.

This is exactly what happens in Machine Learning when data leakage occurs.

Data leakage happens when a model gets access to information during training that it should not normally have. As a result, the model appears extremely accurate during testing, but performs poorly in real-world situations.

In other words:

The model gets unfair hints
Performance appears artificially high
Real-world predictions become unreliable

🚫Common Causes of Data Leakage

Some common causes include:

1. Applying Preprocessing Before Splitting Data

If we scale or normalize the entire dataset before splitting it, the model indirectly gets information from the test data.

In this case:

The model learns patterns it should not know yet
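One way to avoid this kind of leakage is to learn the scaling parameters from the training data only and then reuse them on the test data. The values below are made up, and the sketch assumes min-max scaling.

```python
# Hypothetical feature values, already split into train and test portions.
train_values = [10.0, 20.0, 30.0, 40.0]
test_values = [25.0, 50.0]

# Learn the scaling parameters from the training data only.
train_min = min(train_values)
train_max = max(train_values)

def min_max_scale(value):
    """Scale a value using the minimum and maximum seen during training."""
    return (value - train_min) / (train_max - train_min)

train_scaled = [min_max_scale(v) for v in train_values]
# Reuse the same training parameters on the test data.
test_scaled = [min_max_scale(v) for v in test_values]

print(train_scaled)
print(test_scaled)
```

A test value can land outside the 0 to 1 range, as 50.0 does here. That is expected: the test data was never allowed to influence the minimum and maximum.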

2. Using Future Information

Sometimes the dataset contains information that would not actually be available at prediction time.

For example:

Predicting whether a customer will cancel a subscription
But using data collected after the customer already left

The model gets future information, making predictions unrealistically accurate.

3. Including Target-Related Features

Sometimes a feature is directly connected to the answer we want to predict.

For example:

Predicting whether a student will pass
Including “final result” as an input feature

The model can easily guess the answer instead of learning real patterns.

⚖️Typical Data Split Ratios

There is no single perfect ratio, but some commonly used splits are:

Training	Validation	Test
70%	15%	15%
80%	10%	10%

The choice depends on:

Dataset size
Problem complexity
Available data
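A 70/15/15 split can be sketched with the standard library alone. `split_dataset` is a hypothetical helper, and the list of 100 integers simply stands in for 100 samples.

```python
import random

def split_dataset(samples, train_frac=0.70, val_frac=0.15, seed=42):
    """Shuffle and split samples into train, validation, and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for repeatability
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains
    return train, validation, test

data = list(range(100))  # stand-in for 100 samples
train, validation, test = split_dataset(data)
print(len(train), len(validation), len(test))
```

Shuffling before splitting helps each part reflect the whole dataset. For time-ordered data, a chronological split is usually more appropriate than a random one.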

⚠️Common Mistakes

Beginners often:

Train and test on the same dataset
Skip validation data completely
Apply preprocessing before splitting data

These mistakes can lead to misleading model performance.

🎯Conclusion

Splitting data helps models learn, improve, and perform well on unseen data.
In simple terms:

Training data teaches the model, validation data improves it, and test data evaluates it fairly.

Without proper data splitting, even a highly accurate model may fail in the real world.

🔮Coming Up Next

In the next part of this series, we’ll explore Overfitting and Underfitting and understand why some models memorize too much while others fail to learn enough.

Part 6: Data Preprocessing: The Foundation of Every ML Model

Divyajot Kaur — Thu, 14 May 2026 07:40:32 GMT

Imagine trying to cook a great meal with spoiled vegetables, missing ingredients, and random measurements.

No matter how skilled the chef is, the result probably won't be good.

Machine Learning works in a similar way.

In real-world scenarios, data is rarely clean. It is incomplete, inconsistent, and in formats that machines cannot understand directly.

That's why before training any model, we need to prepare it properly. This preparation step is called Data Preprocessing.

Data Preprocessing is the process of cleaning, transforming, and organizing raw data so that it can be effectively used for Machine Learning models.

🍳Data Preprocessing = Kitchen Preparation

Think of a Machine Learning project like preparing a dish.

The data is ingredients
The model is your chef
The prediction is the final dish

Now ask yourself:

👉 What happens if your ingredients are spoiled, unmeasured, or missing?

Even the best chef (algorithm) cannot fix bad ingredients.

That’s exactly why preprocessing matters.

❗️Why Data Preprocessing Matters

Raw data from the real world contains:

Missing values
Text data instead of numbers
Different scales for different features
Noise and inconsistencies

Without preprocessing, models may:

Learn incorrect patterns
Give biased predictions
Perform poorly on real data

So preprocessing ensures:

✔️Clean and usable data

✔️Better model performance

✔️More reliable predictions

✔️Improved model accuracy

🧹Handling Missing Values (Fixing Missing Ingredients)

Imagine you are cooking pasta, but you realize:

You don’t know how much salt to use
Half the ingredients list is missing

What do you do?

You either:

Estimate missing quantities
Or remove the missing items entirely

The same thing happens in datasets.

In real-world data, missing values are very common, and most Machine Learning models cannot work with them directly.

In Machine Learning, we solve this problem in a similar way:

Removing incomplete rows or columns (only if necessary)
Filling missing values using:
- Mean (average)
- Median (middle value)
- Mode (most frequent value)

This ensures the dataset is complete and usable.
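Here is a minimal sketch of the three fill strategies using Python's built-in `statistics` module. The `ages` column is made up, with missing entries marked as `None`.

```python
import statistics

# Hypothetical column with missing entries marked as None.
ages = [25, None, 31, 40, None, 31]

known = [a for a in ages if a is not None]
fill_mean = statistics.mean(known)      # average
fill_median = statistics.median(known)  # middle value
fill_mode = statistics.mode(known)      # most frequent value

# Fill the gaps with one of the strategies (median chosen here).
ages_filled = [a if a is not None else fill_median for a in ages]

print(fill_mean, fill_median, fill_mode)
print(ages_filled)
```

The median is often preferred over the mean when a column contains extreme values, because a few outliers can pull the average away from the typical value.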

🔤Encoding Categorical Data (Converting Recipes into Machine Language)

Now imagine your recipe book says:

“Add some salt”
“Use a pinch of sugar”
“Add a bit of spice”

A machine cannot understand “some” or “a pinch”.

It needs exact numbers.

Similarly, datasets often contain categorical data like:

City names
Colors
Product types
Gender

Machines cannot understand text directly, so we need to convert it into numbers using encoding.

Common techniques:

Label Encoding → assigning a unique number to each category
One-Hot Encoding → creating binary columns for each category (0/1 format)

Now the machine can properly “read” the data.
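Both techniques can be sketched in plain Python. The `cities` column is made up for illustration.

```python
cities = ["Delhi", "Mumbai", "Delhi", "Pune"]  # made-up sample column

# Label Encoding: map each category to a unique integer.
categories = sorted(set(cities))
label_map = {name: i for i, name in enumerate(categories)}
label_encoded = [label_map[c] for c in cities]

# One-Hot Encoding: one 0/1 column per category.
one_hot = [[1 if c == name else 0 for name in categories] for c in cities]

print(label_map)
print(label_encoded)
print(one_hot)
```

Label Encoding suggests an order (here Pune > Mumbai > Delhi) that does not really exist between cities. One-Hot Encoding avoids that, at the cost of adding one column per category.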

⚖️Feature Scaling (Balancing Ingredients)

Imagine a recipe where:

One ingredient is measured in grams
Another is measured in kilograms

If you don’t convert them properly, your dish will be ruined because one ingredient will dominate the rest.

In datasets, this happens when features have different scales. Example:

Age → range from 0 to 100 (small values)
Salary → range from thousands to lakhs (very large values)

Without scaling, models may incorrectly assume larger numbers are more important.

Solution: Feature Scaling

We use methods like:

Normalization (scaling values between 0 and 1)
Standardization (centering data around mean with unit variance)

This puts all features on a comparable scale, so no feature dominates simply because of its units.
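The two scaling methods can be sketched as follows. The `ages` values are made up, and the standardization here uses the population standard deviation.

```python
import statistics

ages = [18, 30, 45, 70]  # made-up values

# Normalization (min-max): rescale values to the range 0..1.
lo, hi = min(ages), max(ages)
normalized = [(a - lo) / (hi - lo) for a in ages]

# Standardization: subtract the mean, divide by the standard deviation.
mean = statistics.mean(ages)
std = statistics.pstdev(ages)  # population standard deviation
standardized = [(a - mean) / std for a in ages]

print(normalized)
print(standardized)
```

After normalization the smallest value becomes 0 and the largest becomes 1. After standardization the values have a mean of 0 and a standard deviation of 1.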

🔁Data Preprocessing Pipeline (Step-by-Step Cooking Process)

Just like cooking has stages, preprocessing also follows a flow:

Collect raw data
Handle missing values
Encode categorical data
Scale features
Remove duplicates
Clean and finalize dataset

Only after this is complete do we train a Machine Learning model.

🌍Real-World Example: Bank Loan Approval System

Imagine a bank loan approval system that decides whether a person is eligible for a loan.

The dataset includes features like:

Age of the applicant
Annual income
Loan amount requested
Credit score

Now notice the problem:

Age ranges from 18 to 70
Income ranges from ₹1,00,000 to ₹20,00,000
Credit score ranges from 300 to 900

If we don’t apply feature scaling, many models (especially those based on distances or gradient updates) may behave as if:

👉 Income is the most important feature

👉 Because its values are much larger than others

Even if credit score is actually more important for loan decisions, it can get “ignored” due to scale differences.

After Feature Scaling:

All features are brought to a similar range.

Now the model:

✔️Treats all features fairly

✔️Learns real patterns (like credit score impact)

✔️Makes more accurate loan approval predictions

⚠️Common Mistakes (Burning the Dish)

Even good cooks can make mistakes. Similarly, beginners often make mistakes such as:

Applying scaling before splitting data into train/test sets -> leads to data leakage.
Using the wrong encoding technique -> may cause the model to misinterpret categorical information.
Ignoring missing values -> can result in errors or unreliable predictions.
Over-cleaning data -> may remove important information.

Avoiding these mistakes can significantly improve model performance.

🔚Conclusion

Data Preprocessing is not just a technical step—it is the foundation of Machine Learning.

If Machine Learning is cooking, then preprocessing is washing, cutting, measuring, and preparing everything before the fire is even turned on.

Without it, even the best algorithm cannot produce good results.

With it, even simple models can perform surprisingly well.

🔮What's Next?

In the next blog, we’ll explore how to split data into training, validation, and test sets so we can evaluate models properly and avoid data leakage.

Part 5: Understanding Datasets: The Building Blocks of Machine Learning

Divyajot Kaur — Wed, 06 May 2026 16:56:23 GMT

Imagine you're trying to predict whether a student will pass an exam or not.

You look at things like study hours, attendance, and marks. Based on this, you make a guess: pass or fail.

Now imagine doing this for hundreds of students.

This collection of information is called a dataset.

In Machine Learning, datasets are the foundation. Every model learns patterns from data, and that data comes from datasets.

Let's break this down in the simplest way possible.

The image above shows the basic structure of a dataset.

Now, let’s understand each part step by step.

📊What is a Dataset?

A dataset is simply a collection of data.

In Machine Learning, the model is trained on a dataset to learn patterns and make predictions.

Example:

This entire table is a dataset.

📋How is a Dataset Organized?

In Machine Learning, data is usually organized in a table format, similar to an Excel sheet.

Each row represents a single record (called a sample)
Each column represents a type of information (called a feature)
The final column often represents the output (called the label)

This structured format helps machines easily read and learn from data.

🧩What are Samples?

Each row in a dataset is called a sample.

A sample represents one record or observation.

In our example:

Each row = one student

So:

1 row = 1 sample
3 rows = 3 samples

A sample contains all the information (features + label) for that record.

In simple terms, a sample is a single entry in a dataset.

⚙️What are Features?

Features are the input variables. They contain the information used to train a model and make predictions.

In our dataset, features are:

Study hours
Attendance
Marks

These are the factors that help us decide the result.

Think of features as clues to solve a problem.

🎯What is a Label?

A label is the final result or output that we want to predict.

In our example:

Result = Label (whether a student passed or failed)

In simple terms, the label is what the model is trying to learn.
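To tie samples, features, and labels together, here is a tiny sketch. The three student records and their values are made up for illustration. `X` and `y` are conventional names for the feature rows and the labels.

```python
# A tiny made-up dataset: each dictionary is one sample (one student).
dataset = [
    {"study_hours": 2, "attendance": 60, "marks": 35, "result": "Fail"},
    {"study_hours": 5, "attendance": 85, "marks": 72, "result": "Pass"},
    {"study_hours": 7, "attendance": 90, "marks": 88, "result": "Pass"},
]

feature_names = ["study_hours", "attendance", "marks"]

# Features (inputs): one row of numbers per sample.
X = [[sample[name] for name in feature_names] for sample in dataset]
# Labels (outputs): the value we want to predict for each sample.
y = [sample["result"] for sample in dataset]

print("Samples:", len(dataset))
print("X =", X)
print("y =", y)
```

Each row of `X` lines up with the entry of `y` at the same position, so the model can learn which feature values tend to go with which result.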

🌍Real-Life Analogy

Now that we understand the structure of a dataset, let's look at it through a real-world example.

🩺Doctor Diagnosing a Patient

Imagine a doctor trying to diagnose a patient:

Symptoms (fever, cough, age) -> Features
Diagnosis (disease) -> Label
Each patient -> Sample
All patient records -> Dataset

The doctor uses symptoms (features) to predict the disease (label).

This is exactly how Machine Learning works.

🔚Conclusion

Understanding datasets is the first step in Machine Learning.

Once you know how data is organized into samples, features, and labels, everything else becomes much easier to understand.

🔮What's Next?

In the next blog, we'll explore data preprocessing: how datasets are cleaned and prepared before training a machine learning model, an essential step for building accurate and reliable models.

Part 4: Regression vs Classification: How Machines Predict & Decide

Divyajot Kaur — Thu, 30 Apr 2026 03:26:44 GMT

In the previous blog, we explored what Supervised Learning is and how models learn from data.

Now, let's understand how Supervised Learning is used to solve real-world problems using two key approaches: Regression & Classification.

❓How Machines Actually Learn

As we discussed earlier, Supervised Learning is a technique where the model learns from labeled data.

But at its core, Supervised Learning is all about finding patterns in data.

Think of it as:

When you give your model data about the songs you listen to, it starts noticing patterns. If you prefer rock or pop songs, it will recommend similar tracks.

Over time, it learns your preferences and makes better predictions.

It doesn't "understand" the way humans do. Instead, it:

Observes relationships between input (features) and output (labels)
Learns patterns from past data
Uses those patterns to make predictions on new, unseen data

This is what makes Machine Learning powerful.

Now, based on the type of problem we want to solve, supervised learning can be divided into two main approaches:

📈Regression: Predicting Continuous Values

Regression is a Machine Learning technique used to predict numerical (continuous) values.

In other words, it is used when the output you want is a number rather than a category.

💡What does that mean?

Regression answers questions like:

"What will be the price?"
"What will be the temperature?"
"How many sales will we make?"

🏡Example

Imagine predicting the price of a house based on:

Location
Number of rooms
Condition

The model learns from past data and predicts a numeric value, like 45,00,000.

🧠Intuition

Think of it as finding a relationship between things by drawing a best-fit line through data points to predict values. E.g., more study hours -> higher marks.

📌Where is it used?

Real estate price prediction
Sales forecasting
Weather prediction

🎯Classification: Predicting Categories

Classification is a Machine Learning technique used to categorize data.

In simple words, it is used when the output you want belongs to a specific group or class.

💡What does that mean?

Classification answers questions like:

"Is the answer Yes or No?"
"Is this spam or not?"
"Which category does this belong to?"

📧Example

Imagine predicting whether an email is spam or not.

You provide the model with:

Email content
Sender details
Keywords

The model learns patterns and predicts: Spam or Not Spam.

🧠Intuition

Instead of predicting a number, classification predicts which category something belongs to.
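The difference in output type can be sketched with two tiny functions. Both use hand-written rules with made-up numbers, not trained models. They only illustrate what each kind of prediction looks like.

```python
def predict_marks(study_hours):
    """Regression-style output: a number (hypothetical hand-written rule)."""
    return 10 * study_hours + 20

def predict_result(study_hours):
    """Classification-style output: a category (assumed pass mark of 40)."""
    return "Pass" if predict_marks(study_hours) >= 40 else "Fail"

for h in [1, 3, 6]:
    print(f"{h} hours -> marks: {predict_marks(h)}, result: {predict_result(h)}")
```

The regression-style function can return any number, while the classification-style function can only ever return one of a fixed set of categories.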

📌Types of Classification

Binary Classification: Two classes (e.g., Yes/No, Spam/Not Spam)
Multi-class Classification: More than two classes (e.g., classifying types of fruits or different animal species)

📌Where is it used?

Email filtering
Medical diagnosis
Image recognition

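⚖️Difference Between Both

Aspect	Regression	Classification
Output	A number (continuous value)	A category (class)
Typical question	"What will be the price?"	"Is this spam or not?"
Example	House price prediction	Email spam detection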

⚠️Challenges in Supervised Learning

While Supervised Learning is powerful, it's not perfect.

Overfitting: Instead of learning patterns, the model memorizes the training data, i.e., it performs well during training but poorly on new, unseen data.
Underfitting: The model is too simple and fails to capture patterns, so it performs poorly on both training and testing data.
Data Quality: Poor or insufficient data leads to poor predictions.
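The first two challenges can be seen in a small experiment. The sketch below is only an illustration on made-up, noisy numbers: `too_simple` always predicts the average (underfitting), `memorizer` copies the answer of the closest training example (a stand-in for overfitting), and `line` is a least-squares line. All three names are hypothetical.

```python
def mae(model, data):
    """Mean absolute error of `model` (a function x -> prediction) on (x, y) pairs."""
    return sum(abs(model(x) - y) for x, y in data) / len(data)


# Made-up noisy data that roughly follows y = 2x.
train = [(1, 2.9), (2, 3.2), (3, 7.1), (4, 7.0), (5, 11.2), (6, 11.0)]
test = [(1.4, 3.5), (2.6, 4.5), (3.4, 7.6), (4.6, 8.4), (5.4, 11.5)]

mean_x = sum(x for x, _ in train) / len(train)
mean_y = sum(y for _, y in train) / len(train)


def too_simple(x):
    # Underfitting: ignores the input and always predicts the training average.
    return mean_y


def memorizer(x):
    # Overfitting stand-in: returns the stored answer of the closest training point,
    # noise included.
    _, nearest_y = min(train, key=lambda point: abs(point[0] - x))
    return nearest_y


# Balanced: a straight line fitted with least squares.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in train)
         / sum((x - mean_x) ** 2 for x, _ in train))
intercept = mean_y - slope * mean_x


def line(x):
    return slope * x + intercept


for name, model in [("underfit", too_simple), ("overfit", memorizer), ("balanced", line)]:
    print(f"{name:<9} train error = {mae(model, train):.2f}   "
          f"test error = {mae(model, test):.2f}")
```

On this toy data the memorizer is perfect on the training set but worse than the line on the test set, while the too-simple model does badly on both.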

🧩Where is This Used in Real Life?

Healthcare: Used for detecting and diagnosing diseases.
E-commerce platforms: Recommend products based on user behavior.
Finance: Used to detect fraudulent transactions.

🔚Conclusion

Supervised Learning is one of the most practical and widely used approaches that helps models learn from data and make predictions.

By using Regression and Classification, it can:

Predict outcomes
Identify patterns
Solve real-world problems efficiently

Understanding these concepts is a key step toward building practical Machine Learning systems.

🔮What's Next?

In the next blog, we'll break down datasets, features, and labels: the building blocks of every machine learning system.

Part 3: Types of Machine Learning

Divyajot Kaur — Wed, 22 Apr 2026 18:10:05 GMT

Think about how you learned things growing up: being taught by someone, figuring things out on your own, or learning from mistakes.

Machine Learning follows these same patterns.

In this blog, we'll break down the types of Machine Learning and explore how machines learn from data.

⚙️Types of Machine Learning

Machine Learning can be divided into different types based on how models learn from data.

Let's explore each of these types in detail.

1. Supervised Learning

Supervised Learning is a type of Machine Learning where the model learns from labeled data, i.e., the correct output is already given.

How it works

Imagine a student learning from a teacher. The teacher gives questions along with their answers and points out mistakes. Over time, the student starts recognizing patterns and gradually improves.

In Supervised Learning, the model learns the same way.

Example

Predicting house prices based on features like location, condition, and number of rooms.

Here the model is trained on labeled examples (features paired with known prices) and then tested on new, unseen data.

Types of problems

Regression (predicting continuous values, e.g., house price prediction)
Classification (predicting categories, e.g., spam and not spam emails)

Common Algorithms

Linear Regression
Logistic Regression
KNN
Decision Trees
Support Vector Machines (SVM)

Use Cases

Email Spam Detection
Stock price prediction
Medical diagnosis

2. Unsupervised Learning

Unsupervised Learning is a technique where models are trained on unlabeled data. It is used to find hidden patterns and relationships by grouping similar data together.

How it works

Imagine you walk into a party where you don't know anyone. As you observe, you start noticing groups forming: people with similar interests, professions, or personalities naturally gather together.

No one tells you who belongs where, but patterns still emerge.

Unsupervised Learning works the same way: it identifies hidden patterns and groups in data without any predefined labels.

Example

Grouping customers based on their shopping behavior.

Types of tasks

Clustering (Grouping similar data points)
Dimensionality Reduction (simplifying data while keeping important information)
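Clustering can be sketched in a few lines. The example below is a simplified, one-dimensional version of the k-means idea, run on made-up monthly spending figures. The function name `k_means_1d` and the choice to start the centroids at the smallest and largest values are assumptions for this illustration.

```python
def k_means_1d(values, centroids, iterations=10):
    """Group numbers around the given starting centroids; returns (centroids, clusters)."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the average of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # nothing moved, so the groups are stable
            break
        centroids = new_centroids
    return centroids, clusters


# Made-up monthly spend per customer. No labels are given.
monthly_spend = [20, 25, 30, 200, 220, 250, 22, 240]

centroids, clusters = k_means_1d(
    monthly_spend, centroids=[min(monthly_spend), max(monthly_spend)]
)
print("Cluster centres:", centroids)
print("Groups:", clusters)
```

Nobody told the algorithm which customers are low or high spenders. The two groups emerge from the numbers alone.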

Common Algorithms

K-Means Clustering
Hierarchical Clustering
Principal Component Analysis (PCA)

Use Cases

Customer Segmentation
Anomaly Detection (fraud detection)
Market Research

3. Semi-Supervised Learning

Semi-Supervised Learning is a combination of supervised and unsupervised learning, where a small amount of labeled data and a large amount of unlabeled data are used for model training.

Here the model first learns from the labeled data and then improves using the unlabeled data.

How it works

Imagine you are shown how to solve a particular set of problems, and then are given many similar ones to solve.

You start to recognize and learn patterns by solving the problems and then complete the rest accordingly.

Example

Image classification where only a few images are labeled, but thousands are not.
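One simple way to picture this is self-training (also called pseudo-labeling): learn from the few labeled examples, label the unlabeled ones the model is confident about, and learn again. The sketch below is a toy version under strong assumptions. Each "image" is reduced to a single made-up brightness score, the model is just an average per class, and `margin` is a hypothetical confidence threshold.

```python
def centroids_of(labeled):
    """Average feature value per class: the whole 'model' in this toy example."""
    totals = {}
    for x, label in labeled:
        total, count = totals.get(label, (0.0, 0))
        totals[label] = (total + x, count + 1)
    return {label: total / count for label, (total, count) in totals.items()}


def self_train(labeled, unlabeled, rounds=3, margin=2.0):
    """Grow the labeled set with confident predictions; assumes at least two classes."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    for _ in range(rounds):
        centroids = centroids_of(labeled)
        confident, unsure = [], []
        for x in remaining:
            # Rank classes from nearest to farthest centroid.
            ranked = sorted(centroids, key=lambda label: abs(x - centroids[label]))
            gap = abs(x - centroids[ranked[1]]) - abs(x - centroids[ranked[0]])
            if gap >= margin:  # clearly closer to one class: accept the pseudo-label
                confident.append((x, ranked[0]))
            else:
                unsure.append(x)
        if not confident:
            break
        labeled += confident
        remaining = unsure
    return centroids_of(labeled), remaining


# Made-up brightness scores (0 = dark, 10 = bright). Only two photos are labeled.
labeled_photos = [(1.0, "night"), (9.0, "day")]
unlabeled_photos = [0.5, 1.5, 2.0, 3.0, 4.5, 5.5, 7.0, 8.0, 8.5, 9.5]

model, still_unlabeled = self_train(labeled_photos, unlabeled_photos)
print("Class averages after self-training:", model)
print("Too ambiguous to label:", still_unlabeled)
```

The two labeled photos are enough to label most of the others, while the ambiguous ones in the middle are left alone rather than guessed.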

Use Cases

Medical Imaging
Speech Recognition
Large-scale classification problems

4. Reinforcement Learning

Reinforcement Learning is a technique where an agent learns by interacting with an environment and receiving rewards and penalties.

Here the model learns through trial and error, aiming to maximize rewards over time.

How it works

Think about the first time you rode a bicycle. There were no specific instructions for every movement: you tried, fell, adjusted, and improved with practice.

Example

Training an AI to play chess: it earns points for good moves and loses points for bad ones.

Common Algorithms

Q-Learning
Deep Q-Network (DQN)
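Chess is far too large for a short example, so here is a minimal Q-Learning sketch on a much simpler stand-in: an agent in a five-cell corridor that must learn to walk right to reach the goal. The environment, the reward values (+10 for reaching the goal, -1 for every other step), and the settings `alpha`, `gamma`, and `epsilon` are all assumptions chosen for illustration.

```python
import random

GOAL = 4            # corridor positions are 0..4; position 4 is the goal
MOVES = [-1, +1]    # action 0 = step left, action 1 = step right


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning; returns q[state][action] value estimates."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(GOAL + 1)]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            # Trial and error: sometimes explore, otherwise use the best known action.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] >= q[state][1] else 1
            next_state = min(max(state + MOVES[action], 0), GOAL)
            # Reward for reaching the goal, small penalty for every other step.
            reward = 10.0 if next_state == GOAL else -1.0
            # Q-learning update: nudge the estimate toward reward + discounted future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q = train()
policy = ["left" if q[s][0] >= q[s][1] else "right" for s in range(GOAL)]
print("Learned action for positions 0-3:", policy)
```

The agent is never told that "right" is correct. It discovers this because moving right eventually leads to the larger reward.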

Use Cases

Game AI
Robotics
Self-driving cars

🚀Final Thoughts

Each type of Machine Learning represents a different way of learning, just like humans. Whether it's learning from examples, exploring patterns, or improving through trial and error, these approaches shape how machines make decisions.

Understanding these different learning approaches helps build a strong foundation for seeing how machines learn and make decisions.

🔎Coming Up Next

In the next blog, we'll dive deeper into Supervised Learning and understand how Regression and Classification help in solving real-world problems.

Part 2: Introduction to Machine Learning

Divyajot Kaur — Wed, 15 Apr 2026 17:55:35 GMT

Imagine teaching a child to recognize fruits. Instead of giving strict instructions like "apples are round and red", you show them pictures of apples. This is repeated until they start recognizing the pattern.

Machine Learning works in the same way: it allows systems to learn from data instead of relying on fixed rules.

❓So, What Exactly is Machine Learning?

Machine Learning is a branch of Artificial Intelligence that enables machines to learn from data, just as humans learn from experience. In machine learning, models are trained on data to recognize patterns and make decisions and predictions.

Machine Learning can be defined as:

"Machine Learning is the field of study that gives computers the ability to learn without being explicitly programmed."

-Arthur Samuel, 1959

⚙️How Do Machines Actually Learn?

Now you might be wondering how machines learn from data. Machines don't just read and understand information like humans; instead, they learn through a series of steps.

For example, imagine building a system that can identify whether an email is spam or not.

Let's understand how this works step by step:

Collecting Data: The model is given data, such as a dataset of spam and non-spam emails.
Training the Model: The model is trained on this data to learn the difference between spam and non-spam emails.
Finding Patterns: It identifies patterns, such as words or phrases often found in spam emails (like "free" or "won a prize").
Making Predictions: Once trained, the model predicts whether a new email is spam or not.
Improving the Model: With more data and tuning, the model becomes better at detecting spam emails.
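The five steps above can be sketched as a toy program. This is a deliberately naive word-count vote on made-up emails, not how a real spam filter works, and the function names `train`, `spam_words`, and `predict` are hypothetical.

```python
from collections import Counter

# Step 1 - Collecting data: a tiny made-up dataset of (email text, label) pairs.
emails = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("you won a free vacation", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch with the team tomorrow", "not spam"),
    ("please review the project report", "not spam"),
]


def train(dataset):
    """Step 2 - Training: count how often each word appears in each class."""
    counts = {"spam": Counter(), "not spam": Counter()}
    for text, label in dataset:
        counts[label].update(text.lower().split())
    return counts


def spam_words(counts):
    """Step 3 - Finding patterns: words seen more often in spam than in normal mail."""
    return {w for w, c in counts["spam"].items() if c > counts["not spam"][w]}


def predict(counts, text):
    """Step 4 - Making predictions: each known word votes for the class it was seen in."""
    words = text.lower().split()
    spam_score = sum(counts["spam"][w] for w in words)
    normal_score = sum(counts["not spam"][w] for w in words)
    return "spam" if spam_score > normal_score else "not spam"


model = train(emails)
print(predict(model, "claim your free prize"))      # words the model has seen in spam
print(predict(model, "project meeting on monday"))  # words the model has seen in normal mail

# Step 5 - Improving the model: this email uses words the model has never seen ...
before = predict(model, "urgent offer inside")
# ... so we add one more labeled example and train again.
better_model = train(emails + [("urgent offer just for you", "spam")])
after = predict(better_model, "urgent offer inside")
print("Before more data:", before, "| After more data:", after)
```

With only six emails the model knows very few words, which is why one extra labeled example is enough to change its answer in the last step.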

🧠Key Concepts in Machine Learning

Let's look at some basic concepts that form the foundation of machine learning:

Features: The input data or variables used by the model. These can come from images, text, audio, video, or numerical data.
Labels: The correct output or answer the model is trained on (e.g., spam or not spam).
Model: A system or algorithm that learns patterns from data to make predictions.
Training: The process of teaching the model using data so it can learn patterns.

🧩Types of Machine Learning

Machine Learning is divided into different types based on how models learn from data. Each type uses a different approach to find patterns and make predictions.

Supervised Learning: The model is trained on a set of labeled data. Here the output is already known.
Unsupervised Learning: The model is trained on an unlabeled dataset. It is used to find hidden patterns and structure.
Semi-Supervised Learning: The model is trained on a combination of labeled and unlabeled data.
Reinforcement Learning: The model learns by interacting with the environment and improves through rewards and penalties.

🌎Where is Machine Learning Used in Real Life?

Now that we understand how machines learn, you might be wondering - is machine learning used in real life? The answer is yes!

Let's look at some everyday applications that you use without even realizing it.

Recommendation Systems: Platforms like Netflix and YouTube use ML to suggest movies and videos you may be interested in.
Virtual Assistants: Assistants like Siri and Google Assistant recognize your voice and respond to your queries.
Image Recognition: ML is used in apps like Google Photos to recognize faces, objects, and even scanned documents.

⚠️Limitations and Challenges of Machine Learning

Data Dependency: ML relies heavily on data, and poor quality data can lead to wrong predictions.
Needs lots of Data: ML requires large amounts of data; without enough data, accuracy may be low.
Overfitting & Underfitting: Models can either learn too much (overfitting) or too little (underfitting), which affects the overall model performance.

🚀Final Thoughts

Machine Learning may seem complex at first, but at its core, it's all about learning from data and making smarter decisions. With its growing use in everyday life, understanding ML is becoming more important than ever.

🔍Coming up Next

In my next blog, we will explore different types of Machine Learning and understand how each approach works.

AI, ML & Deep Learning: A Beginner’s Journey (Part 1: Introduction to AI)

Divyajot Kaur — Mon, 06 Apr 2026 16:57:07 GMT

🧠The Beginning of Intelligent Machines

Have you ever wondered how Netflix knows what you like to watch and recommends it to you?

Or how self-driving cars work, or how your phone recognizes your voice?

Interesting, right?

Some think it's magic, but is it?

This is the power of Artificial Intelligence.

This blog marks the beginning of a series where we will explore Artificial Intelligence, Machine Learning, and Deep Learning in depth - from basic concepts to real-world projects.

🤖What is Artificial Intelligence?

Artificial Intelligence (AI) refers to the ability of machines to perform tasks that usually require human intelligence. It allows machines to learn from data, recognize patterns, understand language, and make decisions.

AI can also be defined as:

Artificial Intelligence is the science and engineering of making intelligent machines.

- John McCarthy (Father of AI)

In simple terms, AI is about teaching machines to think, learn and make decisions just like humans.

⚖️AI vs Human Intelligence

Now before proceeding further let's understand the difference between AI and human intelligence:

Learning: AI learns from data and algorithms, whereas humans learn from experience, emotions and reasoning.
Creativity: AI can generate content (like art and music), but humans bring originality, emotion, and genuine creativity.
Speed & Accuracy: AI is extremely fast and accurate in repetitive tasks, while humans are slower and more prone to errors.
Emotional Intelligence: AI does not understand emotions and feelings while humans have deep understanding and empathy.
Decision Making: AI makes decisions fully based on data and logic whereas humans make decisions based on intuition, emotions and ethics.

From the above, it is clear that the real power lies in the collaboration between humans and AI.

While AI handles speed, accuracy, and data, humans provide the meaning, creativity, and ethical judgement that give purpose and direction to that data.

🌎AI in Action: Real-World Applications

AI is used in various industries all around the world:

Healthcare: Helps in early disease detection, diagnosis, and personalized treatment.
Finance: Detects fraud, manages risks and makes smart investment decisions.
E-commerce: Recommends products based on past shopping behavior and user preferences.
Entertainment: Recommends shows and movies based on user interests.
Manufacturing: Automates repetitive tasks, improving their efficiency and accuracy.

⚠️Limitations and Challenges of AI

Although AI improves speed and accuracy and automates tasks, it also has its own limitations:

Bias in decision: AI models can become biased if trained on biased data.
Data Dependency: It relies heavily on data, and poor quality data can lead to wrong predictions.
Privacy concerns: AI systems often store users' personal data, which makes them prone to data leaks and misuse of personal information.
Job displacement: AI is transforming the job market by replacing repetitive work and reshaping career paths.
Lack of emotion: It lacks feelings and does not understand human emotions.

Understanding limitations of AI is important to ensure that it is used responsibly.

🚀Final Thoughts

AI is no longer a concept of the future - it is a part of everyday life, shaping the way we interact with technology and the world.

The true power of AI does not lie in replacing humans, but in enhancing human potential and solving real-world problems.

🔍Coming Up Next

In the next part of this series, we will explore Machine Learning and understand how machines actually learn from data.