Linear Regression In Machine Learning + Real Examples ⚜

Feb 23, 2025

Use cases, real-life examples, and code snippets for better understanding.

Linear Regression is a fundamental supervised learning algorithm used for predictive modeling. It establishes a relationship between independent variables (features) and a dependent variable (target) by fitting a linear equation to the data:

$$Y = mX + b$$

Where:

Y is the dependent variable (target/output).
X is the independent variable (input feature).
m (or β) is the slope (coefficient): the change in Y for each one-unit increase in X.
b is the intercept (bias term).

For multiple features (Multiple Linear Regression), the equation extends to:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n$$
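To make the multiple-feature equation concrete, here is a minimal sketch that recovers the coefficients with ordinary least squares using NumPy's `np.linalg.lstsq`. The toy data and the true coefficients (1, 2, 3) are assumptions made up for illustration, not values from a real dataset.

```python
import numpy as np

# Hypothetical toy data: 5 samples, 2 features,
# generated exactly from Y = 1 + 2*X1 + 3*X2 (no noise)
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]

# Prepend a column of ones so the first coefficient plays the role of beta_0
X_design = np.column_stack([np.ones(len(X)), X])

# Solve the least-squares problem: minimize ||X_design @ beta - y||^2
beta, *_ = np.linalg.lstsq(X_design, y, rcond=None)

print("beta_0 (intercept):", beta[0])
print("beta_1, beta_2 (slopes):", beta[1:])
```

Because the toy target is noise-free, the recovered coefficients should match the ones used to generate it, up to floating-point error.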

Why Use Linear Regression in ML?

It is simple, interpretable, and easy to implement.
It suits problems where the relationship between input features and output is approximately linear.
Its coefficients show how each feature affects the prediction (compare them directly only when features are on the same scale).
Useful for forecasting, trend analysis, and estimating relationships.

When to Use Linear Regression?

Use Linear Regression when:

There is a strong linear relationship between independent and dependent variables.
The dataset is relatively small and not too complex.
There is minimal multicollinearity among features.
The error terms (residuals) are normally distributed and homoscedastic (equal variance).
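The conditions above can be checked roughly in code. The following is a sketch under stated assumptions: the data is randomly generated for illustration, and the low/high split of the residuals is just one informal way to eyeball equal variance, not a formal statistical test.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Hypothetical data: two features and a linear target with additive noise
X = rng.normal(size=(200, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

model = LinearRegression().fit(X, y)
pred = model.predict(X)
residuals = y - pred

# Multicollinearity: pairwise feature correlations near +1 or -1 are a warning sign
feature_corr = np.corrcoef(X, rowvar=False)
print("Feature correlation matrix:\n", feature_corr)

# With an intercept, least-squares residuals average to zero on the training data
print("Mean residual:", residuals.mean())

# Informal homoscedasticity check: residual spread for low vs. high predictions
low = residuals[pred < np.median(pred)]
high = residuals[pred >= np.median(pred)]
print("Residual std (low predictions):", low.std())
print("Residual std (high predictions):", high.std())
```

If the two standard deviations differ a lot, or a residual histogram looks strongly skewed, the assumptions may not hold for that dataset.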

Real-World Examples & Code Implementations

Example 1: Predicting House Prices

Using linear regression to predict house prices based on square footage.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Sample Data (Square Feet vs. House Price)
X = np.array([800, 1000, 1200, 1500, 1800]).reshape(-1, 1)
y = np.array([150000, 180000, 210000, 250000, 290000])

# Train Model
model = LinearRegression()
model.fit(X, y)

# Predict for new house
new_house = np.array([[1600]])
predicted_price = model.predict(new_house)

print("Predicted price for 1600 sqft house:", predicted_price[0])

# Plot
plt.scatter(X, y, color="blue", label="Actual Data")
plt.plot(X, model.predict(X), color="red", label="Regression Line")
plt.xlabel("Square Feet")
plt.ylabel("House Price")
plt.legend()
plt.show()

Output:
Predicted price for 1600 sqft house: 263449.37 (approximately)
(A plot also shows the data points and the fitted regression line.)
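To connect the fitted model back to Y = mX + b, this short sketch reads the slope and intercept from scikit-learn's `coef_` and `intercept_` attributes and recomputes the 1600 sqft prediction by hand. It reuses the sample data from Example 1; the variable names `m`, `b`, and `manual` are only illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same sample data as Example 1 (square feet vs. house price)
X = np.array([800, 1000, 1200, 1500, 1800]).reshape(-1, 1)
y = np.array([150000, 180000, 210000, 250000, 290000])

model = LinearRegression().fit(X, y)

m = model.coef_[0]    # slope: price change per extra square foot
b = model.intercept_  # intercept: predicted price at 0 sqft

print(f"Slope m: {m:.2f}")
print(f"Intercept b: {b:.2f}")

# Recompute the prediction manually with Y = mX + b
manual = m * 1600 + b
print(f"Manual prediction for 1600 sqft: {manual:,.2f}")
```

The manual value should agree with `model.predict`, which confirms the model is nothing more than the fitted line.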

Example 2: Salary Prediction Based on Experience

Predicting salary based on years of experience.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error
from sklearn.linear_model import LinearRegression

# Sample Data (Years of Experience vs Salary)
X = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]).reshape(-1, 1)
y = np.array([40000, 45000, 50000, 55000, 60000, 65000, 70000, 75000, 80000, 85000])

# Split Data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict
y_pred = model.predict(X_test)

# Evaluate
mae = mean_absolute_error(y_test, y_pred)
print("Mean Absolute Error:", mae)

Output:
Mean Absolute Error: ~0 (since data is perfectly linear)
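MAE is only one way to score a regression model. As a sketch, the same salary data can also be evaluated with root mean squared error (RMSE) and R², using scikit-learn's `mean_squared_error` and `r2_score`. The data is rebuilt here so the block runs on its own.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Same salary data as Example 2: salary = 35000 + 5000 * years
X = np.arange(1, 11).reshape(-1, 1)
y = 35000 + 5000 * X.ravel()

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

mae = mean_absolute_error(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))  # penalizes large errors more
r2 = r2_score(y_test, y_pred)                       # 1.0 means a perfect fit

print("MAE:", mae)
print("RMSE:", rmse)
print("R^2:", r2)
```

On this perfectly linear data the errors are essentially zero and R² is essentially 1; on real, noisy data these numbers are what you would compare across models.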

Example 3: Predicting Sales Based on Advertising Budget

Using multiple linear regression to predict sales from three features. The snippet uses synthetic data from make_regression as a stand-in for real advertising spend on TV, Radio, and Newspaper.

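import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# Generate synthetic data: 3 features standing in for TV, Radio, and Newspaper spend
X, y = make_regression(n_samples=100, n_features=3, noise=10, random_state=42)

# Train Model
model = LinearRegression()
model.fit(X, y)

# Predict
# Note: make_regression draws features from a standard normal distribution,
# so this input lies far outside the training range (extrapolation).
test_data = np.array([[50, 30, 20]])  # Sample ad spend
predicted_sales = model.predict(test_data)
print("Predicted Sales:", predicted_sales[0])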

Output:
Predicted Sales: (Varies based on dataset)

Advantages & Disadvantages of Linear Regression

Advantages:

✔️ Simple, easy to interpret, and computationally efficient.
✔️ Works well for small datasets with linear relationships.
✔️ Provides coefficients that explain feature impact.
✔️ Can be used for forecasting and trend analysis.

Disadvantages:

❌ Assumes a linear relationship, which may not always be true.
❌ Sensitive to outliers, which can significantly affect predictions.
❌ Struggles with multicollinearity (when features are highly correlated).
❌ Can’t handle complex, non-linear relationships.
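The sensitivity to outliers is easy to see in a small experiment. This sketch fits the same line twice, once on clean data and once after corrupting a single target value; the salary-style numbers and the 200000 outlier are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Clean, perfectly linear data: y = 35000 + 5000 * x
X = np.arange(1, 11, dtype=float).reshape(-1, 1)
y_clean = 35000 + 5000 * X.ravel()

# Copy the targets and corrupt one record (hypothetical data-entry error)
y_outlier = y_clean.copy()
y_outlier[-1] = 200000

slope_clean = LinearRegression().fit(X, y_clean).coef_[0]
slope_outlier = LinearRegression().fit(X, y_outlier).coef_[0]

print(f"Slope on clean data: {slope_clean:.1f}")
print(f"Slope with one outlier: {slope_outlier:.1f}")
```

A single bad point out of ten more than doubles the fitted slope here, which is why inspecting and cleaning the data before fitting matters.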

Use Cases of Linear Regression in ML

Finance: Predicting stock prices, risk assessment, and forecasting trends.
Marketing: Estimating sales based on advertising spend.
Healthcare: Predicting disease progression based on symptoms.
Real Estate: House price prediction based on location and size.
HR & Recruitment: Salary estimation based on experience and qualifications.

More Example Code Snippets

Predicting Car Price Based on Mileage

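import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([10000, 20000, 30000, 40000, 50000]).reshape(-1, 1)  # Mileage
y = np.array([20000, 18000, 16000, 14000, 12000])  # Price

model = LinearRegression()  # fresh model for this snippet
model.fit(X, y)
print("Predicted price for 25,000 miles:", model.predict([[25000]])[0])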

Estimating House Rent Based on Size & Bedrooms

X = np.array([[1200, 2], [1500, 3], [1700, 3], [2000, 4]])  # [Size, Bedrooms]
y = np.array([1500, 1800, 2000, 2500])  # Monthly rent

model = LinearRegression()  # fresh model for this snippet
model.fit(X, y)
print("Predicted rent for 1600 sqft, 3BHK:", model.predict([[1600, 3]])[0])

Final Thoughts

Linear Regression is a powerful tool when used correctly. It works best when data exhibits a linear relationship and can be applied in numerous real-world scenarios. However, for non-linear data, more advanced models like polynomial regression, decision trees, or neural networks should be considered.
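As a rough illustration of that last point, the sketch below compares a plain linear fit with a degree-2 polynomial regression on curved data, using scikit-learn's `PolynomialFeatures` inside a pipeline. The quadratic toy data is an assumption chosen to make the contrast obvious.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical curved relationship: y = 2 + 0.5 * x^2
X = np.linspace(-3, 3, 30).reshape(-1, 1)
y = 2.0 + 0.5 * X.ravel() ** 2

# Plain linear regression: a straight line cannot follow the curve
linear_model = LinearRegression().fit(X, y)

# Polynomial regression: add x^2 as a feature, then fit a linear model on top
poly_model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)

linear_r2 = linear_model.score(X, y)  # R^2 on the training data
poly_r2 = poly_model.score(X, y)

print(f"Linear R^2: {linear_r2:.3f}")
print(f"Polynomial R^2: {poly_r2:.3f}")
```

Note that polynomial regression is still linear in its coefficients; it only changes the features, which is why `LinearRegression` can be reused unchanged.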