
Building Linear Regression from Scratch in Python

Linear regression is one of the most fundamental algorithms in machine learning and statistics. It models the relationship between independent variables (features) and a dependent variable (target) by fitting a straight line.

While libraries like Scikit-learn provide ready-made implementations, building it from scratch helps you truly understand:

  • How weights and bias are optimized
  • How gradient descent works behind the scenes
  • How loss functions measure error
  • How model performance is evaluated

In this article, we will first implement Simple Linear Regression (with one independent variable) and then move to Multiple Linear Regression (with several independent variables). Along the way, we will explore how gradient descent updates weights, how loss is calculated, and how to evaluate our model.

Why Build Linear Regression From Scratch?

Before jumping into code, let’s answer the obvious question: Why reinvent the wheel when libraries already exist?

  • Learning Perspective: By coding it ourselves, we understand how predictions are generated and improved through optimization.
  • Debugging & Research: When working on advanced algorithms, you often need to tweak internals. Understanding the basics gives you the power to experiment.
  • Portfolio & Interviews: Showing recruiters or clients that you understand the math and coding of ML models adds credibility to your skills.

What is Linear Regression?

At its core, linear regression tries to draw a straight line that best fits the data. The equation is:

  • Simple Linear Regression:
    y = wx + b
  • Multiple Linear Regression:
    y = w₁x₁ + w₂x₂ + ⋯ + wₙxₙ + c

Here:

  • w or wᵢ → weights (slopes for each feature)
  • b or c → bias (intercept)
  • y → predicted value

The main idea is to minimize the difference between predicted values and actual values using a loss function, typically Mean Squared Error (MSE).
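To make that concrete, here is a minimal sketch of how MSE could be computed for a handful of points (the toy numbers below are invented purely for illustration):

import numpy as np

# Toy example: actual targets and predictions from some line y = w*x + b
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.8, 5.1, 7.3, 8.7])

# Mean Squared Error: average of the squared differences
mse = np.mean((y_pred - y_true) ** 2)
print(mse)  # 0.0575

The smaller this number, the closer the line is to the data; gradient descent nudges the weights and bias in the direction that reduces it.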

Simple Linear Regression from Scratch

Let’s start with the simple case of just one independent variable:

import numpy as np

def simple_linear_regression(df, learning_rate=0.01, iteration=1000):
    x = df['independent'].values
    y = df['dependent'].values

    # Initialize weight and bias randomly
    w = np.random.uniform(0, 5)
    b = np.random.uniform(0, 5)

    loss_list = []
    w_list = []
    b_list = []

    for i in range(iteration):
        # Make prediction
        y_pred = w * x + b

        # Mean Squared Error loss
        loss = np.mean((y_pred - y) ** 2)
        loss_list.append(loss)
        w_list.append(w)
        b_list.append(b)

        # Compute gradients
        dw = 2 * np.mean((y_pred - y) * x)
        db = 2 * np.mean(y_pred - y)

        # Update weight and bias
        w -= learning_rate * dw
        b -= learning_rate * db

    # Final prediction after training
    df = df.copy()  # Prevent modifying the original DataFrame
    df['y_pred'] = w * x + b

    return df, w, b, loss_list, w_list, b_list

👉 Here, we:

  1. Start with random w and b.
  2. Make predictions.
  3. Calculate error using MSE.
  4. Compute gradients.
  5. Update parameters with gradient descent.
  6. Repeat for many iterations until convergence.
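As a quick usage sketch, the function could be called on a small synthetic dataset like this (the data here is made up for illustration; the column names 'independent' and 'dependent' match what the function expects):

import numpy as np
import pandas as pd

# Synthetic data roughly following y = 3x + 2 with a little noise
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 50)
y = 3 * x + 2 + rng.normal(0, 1, size=x.shape)

df = pd.DataFrame({'independent': x, 'dependent': y})
df_out, w, b, loss_list, w_list, b_list = simple_linear_regression(df)
print(w, b)  # should land close to 3 and 2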

Multiple Linear Regression from Scratch

Now, let’s extend it to handle multiple features.

import numpy as np
from sklearn.metrics import r2_score, root_mean_squared_error  # requires scikit-learn >= 1.4

def multiple_linear_regression(df, learning_rate=0.01, iteration=1000):
    x1, x2, x3, x4 = df['x1'].values, df['x2'].values, df['x3'].values, df['x4'].values
    y = df['y'].values

    # Initialize weights and bias
    w1, w2, w3, w4 = np.random.uniform(1, 100, 4)
    c = np.random.uniform(1, 100)

    loss_list = []

    for i in range(iteration):
        # Predictions
        y_pred = w1*x1 + w2*x2 + w3*x3 + w4*x4 + c

        # Loss
        loss = np.mean((y_pred - y) ** 2)
        loss_list.append(loss)

        # Gradients
        dw1 = 2 * np.mean((y_pred - y) * x1)
        dw2 = 2 * np.mean((y_pred - y) * x2)
        dw3 = 2 * np.mean((y_pred - y) * x3)
        dw4 = 2 * np.mean((y_pred - y) * x4)
        dc = 2 * np.mean(y_pred - y)

        # Update
        w1 -= learning_rate * dw1
        w2 -= learning_rate * dw2
        w3 -= learning_rate * dw3
        w4 -= learning_rate * dw4
        c  -= learning_rate * dc

    # Final prediction
    df = df.copy()
    df['y_pred'] = w1*x1 + w2*x2 + w3*x3 + w4*x4 + c

    # Evaluate performance
    r2 = r2_score(df['y'], df['y_pred'])
    rmse = root_mean_squared_error(df['y'], df['y_pred'])

    return df, w1, w2, w3, w4, c, loss_list, r2, rmse

Here, each feature gets its own weight, and gradient descent updates them all simultaneously. We also add evaluation metrics:

  • R² Score → How much variance the model explains
  • RMSE → Root Mean Squared Error, a measure of prediction accuracy
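Writing one line per weight works, but it gets tedious as the number of features grows. As a hedged sketch (not the code used above, just an equivalent vectorized form), the same gradient-descent updates can be expressed with NumPy matrix operations so that any number of features is handled at once:

import numpy as np

def linear_regression_vectorized(X, y, learning_rate=0.01, iteration=1000):
    # X: (n_samples, n_features) array, y: (n_samples,) array
    n_samples, n_features = X.shape
    w = np.zeros(n_features)   # one weight per feature
    c = 0.0                    # bias / intercept

    for _ in range(iteration):
        y_pred = X @ w + c                  # predictions for all samples
        error = y_pred - y
        dw = 2 * (X.T @ error) / n_samples  # gradient w.r.t. each weight
        dc = 2 * np.mean(error)             # gradient w.r.t. the bias
        w -= learning_rate * dw
        c -= learning_rate * dc

    return w, c

Each entry of dw is exactly the per-feature gradient computed one by one in the function above, so the two versions behave the same.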

Model Evaluation

After training, you can visualize:

  • Loss curve (to see convergence)
  • Predicted vs Actual values
  • R² and RMSE for performance

This not only checks whether the model works but also shows whether your gradient descent is converging properly.
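Here is a minimal plotting sketch, assuming matplotlib is installed and using the loss_list and df returned by the multiple-regression function above:

import matplotlib.pyplot as plt

# Loss curve: should decrease and flatten out if gradient descent converges
plt.figure()
plt.plot(loss_list)
plt.xlabel('Iteration')
plt.ylabel('MSE loss')
plt.title('Loss curve')

# Predicted vs actual: points should hug the diagonal for a good fit
plt.figure()
plt.scatter(df['y'], df['y_pred'])
plt.xlabel('Actual')
plt.ylabel('Predicted')
plt.title('Predicted vs Actual')

plt.show()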

Conclusion

By building Linear Regression from scratch, we learned:

  • How predictions are made using weights and bias
  • How gradient descent minimizes error step by step
  • How to extend the algorithm from simple to multiple variables
  • How to evaluate models using R² and RMSE

While libraries like Scikit-learn handle this efficiently in one line of code, the real value of this exercise lies in understanding the foundations.

Machine learning is built on simple building blocks like this — once you understand them deeply, more complex models like Logistic Regression, Decision Trees, and Neural Networks become much easier to grasp.

Next Step: Try experimenting with different learning rates, iterations, and datasets. You’ll see how these hyperparameters influence model performance.
