Linear Regression In Machine Learning + Real Examples ⚜

Muhammad Taha
4 min read · Feb 23, 2025


Use cases, real-life examples, and code snippets for better understanding.

Linear Regression is a fundamental supervised learning algorithm used for predictive modeling. It establishes a relationship between independent variables (features) and a dependent variable (target) by fitting a linear equation to the data:

$$Y = mX + b$$

Where:

  • Y is the dependent variable (target/output).
  • X is the independent variable (input feature).
  • m (or β) is the slope (coefficient) that determines the relationship.
  • b is the intercept (bias term).
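For a single feature, the slope and intercept that minimize the squared error have a closed form. The sketch below is a minimal NumPy version of that calculation; the helper name `fit_line` and the toy data are made up for illustration and are not part of any library.

```python
import numpy as np

def fit_line(x, y):
    """Return (m, b) minimizing the sum of squared errors for y ≈ m*x + b."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: co-variation of x and y divided by the variation of x
    m = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means
    b = y_mean - m * x_mean
    return m, b

# Toy data lying exactly on y = 2x + 1 (hypothetical values)
x_demo = [0, 1, 2, 3]
y_demo = [1, 3, 5, 7]
m, b = fit_line(x_demo, y_demo)
print(f"slope={m:.2f}, intercept={b:.2f}")  # slope=2.00, intercept=1.00
```

Because the toy points sit exactly on a line, the fit recovers m = 2 and b = 1; with noisy data the same formulas give the best-fitting line rather than an exact one.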

For multiple features (Multiple Linear Regression), the equation extends to:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n$$
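The multi-feature equation can be fitted the same way by solving a least-squares problem. The following is a rough sketch using `np.linalg.lstsq`; the helper `fit_multiple` and the demo data are assumptions made for illustration.

```python
import numpy as np

def fit_multiple(X, y):
    """Return (beta_0, betas) for y ≈ beta_0 + X @ betas via least squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the intercept beta_0 is estimated
    # together with the other coefficients
    X_design = np.column_stack([np.ones(len(X)), X])
    coeffs, *_ = np.linalg.lstsq(X_design, y, rcond=None)
    return coeffs[0], coeffs[1:]

# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2
X_demo = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]])
y_demo = 1 + 2 * X_demo[:, 0] + 3 * X_demo[:, 1]
beta_0, betas = fit_multiple(X_demo, y_demo)
print("intercept:", beta_0)
print("coefficients:", betas)
```

On this noise-free data the recovered values should match the generating coefficients (1, 2, 3) up to floating-point error.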

Why Use Linear Regression in ML?

  • It is simple, interpretable, and easy to implement.
  • Used when there is a linear relationship between input features and output.
  • Helps in understanding each feature's influence via its coefficient (coefficients are directly comparable only when features share a scale).
  • Useful for forecasting, trend analysis, and estimating relationships.
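Reading feature influence from coefficients depends on feature units. The sketch below, on made-up synthetic data, shows how two features with a similar real effect get very different raw coefficients, and how standardizing the inputs first makes them comparable. The data and variable names are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(0, 1, n)      # feature on a small scale
x2 = rng.normal(0, 1000, n)   # feature on a scale about 1000x larger
# Both features contribute a similar amount of variation to y
y = 3 * x1 + 0.003 * x2 + rng.normal(0, 0.1, n)
X = np.column_stack([x1, x2])

# Raw coefficients reflect the units of each feature
raw_coefs = LinearRegression().fit(X, y).coef_

# After standardizing, each coefficient is the effect of a
# one-standard-deviation change in that feature
X_scaled = StandardScaler().fit_transform(X)
scaled_coefs = LinearRegression().fit(X_scaled, y).coef_

print("raw coefficients:         ", raw_coefs)
print("standardized coefficients:", scaled_coefs)
```

The raw coefficients differ by roughly three orders of magnitude, while the standardized ones come out close to each other, which is the fairer basis for comparing features.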

When to Use Linear Regression?

Use Linear Regression when:

  • There is a strong linear relationship between independent and dependent variables.
  • The dataset is relatively small and not too complex.
  • There is minimal multicollinearity among features.
  • The error terms (residuals) are normally distributed and homoscedastic (equal variance).
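One common way to check the multicollinearity condition is the variance inflation factor (VIF): regress each feature on the others and compute 1 / (1 − R²). The sketch below is a NumPy-only version under that definition; the helper name `variance_inflation_factors` and the synthetic features are assumptions for illustration.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X (values near 1 mean little collinearity)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = []
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress feature j on all remaining features plus an intercept
        design = np.column_stack([np.ones(n), others])
        coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
        residuals = target - design @ coeffs
        ss_res = np.sum(residuals ** 2)
        ss_tot = np.sum((target - target.mean()) ** 2)
        r_squared = 1.0 - ss_res / ss_tot
        vifs.append(1.0 / (1.0 - r_squared))
    return np.array(vifs)

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)                # independent of x1
x3 = x1 + 0.05 * rng.normal(size=200)    # nearly a copy of x1
vifs = variance_inflation_factors(np.column_stack([x1, x2, x3]))
print("VIFs:", vifs)
```

Here `x1` and `x3` should show very large VIFs while `x2` stays close to 1. A frequently quoted rule of thumb treats values above roughly 5 to 10 as a warning sign, though the threshold is a convention rather than a hard rule.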

Real-World Examples & Code Implementations

Example 1: Predicting House Prices

Using linear regression to predict house prices based on square footage.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Sample Data (Square Feet vs. House Price)
X = np.array([800, 1000, 1200, 1500, 1800]).reshape(-1, 1)
y = np.array([150000, 180000, 210000, 250000, 290000])

# Train Model
model = LinearRegression()
model.fit(X, y)

# Predict for new house
new_house = np.array([[1600]])
predicted_price = model.predict(new_house)

print("Predicted price for 1600 sqft house:", predicted_price[0])

# Plot
plt.scatter(X, y, color="blue", label="Actual Data")
plt.plot(X, model.predict(X), color="red", label="Regression Line")
plt.xlabel("Square Feet")
plt.ylabel("House Price")
plt.legend()
plt.show()

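Output:
Predicted price for 1600 sqft house: ≈ 263449.37
(The sample points do not lie exactly on one line, so the least-squares fit gives a slope of about 139.56 per square foot and an intercept of about 40158.23. A plot window also opens showing the data points and the regression line.)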
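The fitted line behind this prediction can be checked independently of scikit-learn. The following NumPy-only sketch refits the same five points with `np.polyfit` and evaluates the line at 1600 square feet; the variable names are chosen here for illustration.

```python
import numpy as np

# Same sample data as in the example (square feet vs. house price)
sqft = np.array([800, 1000, 1200, 1500, 1800], dtype=float)
price = np.array([150000, 180000, 210000, 250000, 290000], dtype=float)

# Degree-1 polynomial fit returns [slope, intercept]
slope, intercept = np.polyfit(sqft, price, 1)
predicted = slope * 1600 + intercept

print(f"slope     = {slope:.2f}")       # about 139.56
print(f"intercept = {intercept:.2f}")   # about 40158.23
print(f"prediction for 1600 sqft = {predicted:.2f}")  # about 263449.37
```

The slope reads as roughly 140 currency units per additional square foot within the range of the sample data.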

Example 2: Salary Prediction Based on Experience

Predicting salary based on years of experience.

from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error
from sklearn.linear_model import LinearRegression

# Sample Data (Years of Experience vs Salary)
X = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]).reshape(-1, 1)
y = np.array([40000, 45000, 50000, 55000, 60000, 65000, 70000, 75000, 80000, 85000])

# Split Data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict
y_pred = model.predict(X_test)

# Evaluate
mae = mean_absolute_error(y_test, y_pred)
print("Mean Absolute Error:", mae)

Output:
Mean Absolute Error: ~0 (since data is perfectly linear)
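Real data is rarely perfectly linear, so it helps to look at more than one metric. The sketch below repeats the experiment on synthetic, noisy salary data (the generating formula and noise level are made-up assumptions) and reports MAE, RMSE, and R² on the held-out split.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Hypothetical noisy data: salary = 35000 + 5000 * years + noise
rng = np.random.default_rng(42)
years = rng.uniform(0, 10, size=50)
salary = 35000 + 5000 * years + rng.normal(0, 3000, size=50)
X = years.reshape(-1, 1)

X_train, X_test, y_train, y_test = train_test_split(
    X, salary, test_size=0.3, random_state=42
)
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

mae = mean_absolute_error(y_test, y_pred)            # average absolute miss
rmse = np.sqrt(mean_squared_error(y_test, y_pred))   # penalizes large misses more
r2 = r2_score(y_test, y_pred)                        # share of variance explained

print(f"MAE:  {mae:.0f}")
print(f"RMSE: {rmse:.0f}")
print(f"R^2:  {r2:.3f}")
```

MAE and RMSE are in the same units as the target (salary), while R² is unitless; RMSE is never smaller than MAE, and a large gap between the two hints at a few big errors.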

Example 3: Predicting Sales Based on Advertising Budget

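Using multiple linear regression with three input features. The snippet uses synthetic data from make_regression as a stand-in for advertising spend on TV, Radio, and Newspaper, so the numbers carry no real-world units.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# Generate synthetic data: 3 features drawn from a standard normal distribution
X, y = make_regression(n_samples=100, n_features=3, noise=10, random_state=42)

# Train Model
model = LinearRegression()
model.fit(X, y)

# Predict for one sample on the same scale as the synthetic features
# (values such as [50, 30, 20] would lie far outside the training range)
test_data = np.array([[0.5, 0.3, 0.2]])
predicted_sales = model.predict(test_data)
print("Predicted Sales:", predicted_sales[0])

Output:
Predicted Sales: a single number that depends on the generated data (reproducible across runs because random_state is fixed)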
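To make the advertising scenario easier to interpret, here is a sketch on synthetic data with named channels and known effects. All numbers (budget ranges, the true coefficients 0.05, 0.2, and 0.0, and the noise level) are invented for illustration and are not real marketing data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 200
tv = rng.uniform(0, 300, n)         # hypothetical TV budget
radio = rng.uniform(0, 50, n)       # hypothetical radio budget
newspaper = rng.uniform(0, 100, n)  # hypothetical newspaper budget

# Assumed ground truth: newspaper spend has no effect on sales
sales = 5 + 0.05 * tv + 0.2 * radio + 0.0 * newspaper + rng.normal(0, 1, n)

X = np.column_stack([tv, radio, newspaper])
model = LinearRegression().fit(X, sales)

for name, coef in zip(["TV", "Radio", "Newspaper"], model.coef_):
    print(f"{name:<10} coefficient: {coef:.3f}")
print(f"Intercept: {model.intercept_:.2f}")

# Prediction for a budget that lies inside the training ranges
predicted_sales = model.predict(np.array([[150, 30, 20]]))[0]
print(f"Predicted sales: {predicted_sales:.2f}")
```

The fitted coefficients should land close to the assumed values, with the newspaper coefficient near zero, which illustrates how a linear model can separate useful channels from ineffective ones when the relationship really is linear.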

Advantages & Disadvantages of Linear Regression

Advantages:

✔️ Simple, easy to interpret, and computationally efficient.
✔️ Works well for small datasets with linear relationships.
✔️ Provides coefficients that explain feature impact.
✔️ Can be used for forecasting and trend analysis.

Disadvantages:

❌ Assumes a linear relationship, which may not always be true.
❌ Sensitive to outliers, which can significantly affect predictions.
❌ Struggles with multicollinearity (when features are highly correlated).
❌ Can’t capture complex, non-linear relationships unless the features are transformed first.
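The sensitivity to outliers is easy to see on a tiny example. In the sketch below (toy data, made up for illustration), corrupting a single point out of ten more than triples the fitted slope.

```python
import numpy as np

x = np.arange(10, dtype=float)
y_clean = 2 * x + 1            # points exactly on y = 2x + 1

y_outlier = y_clean.copy()
y_outlier[-1] = 100.0          # one corrupted point (true value is 19)

# Degree-1 least-squares fits: polyfit returns [slope, intercept]
slope_clean, intercept_clean = np.polyfit(x, y_clean, 1)
slope_outlier, intercept_outlier = np.polyfit(x, y_outlier, 1)

print(f"clean fit:        slope={slope_clean:.2f}, intercept={intercept_clean:.2f}")
print(f"with one outlier: slope={slope_outlier:.2f}, intercept={intercept_outlier:.2f}")
```

Squared error weights large residuals heavily, so one extreme point at the edge of the range pulls the whole line. Inspecting residuals before trusting a fit, or trying a robust estimator such as scikit-learn's `HuberRegressor`, are common ways to limit this effect.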

Use Cases of Linear Regression in ML

  • Finance: Predicting stock prices, risk assessment, and forecasting trends.
  • Marketing: Estimating sales based on advertising spend.
  • Healthcare: Predicting disease progression based on symptoms.
  • Real Estate: House price prediction based on location and size.
  • HR & Recruitment: Salary estimation based on experience and qualifications.

More Example Code Snippets

Predicting Car Price Based on Mileage

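import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([10000, 20000, 30000, 40000, 50000]).reshape(-1, 1)  # Mileage
y = np.array([20000, 18000, 16000, 14000, 12000])  # Price

model = LinearRegression()  # fresh model, independent of earlier examples
model.fit(X, y)
print("Predicted price for 25,000 miles:", model.predict([[25000]])[0])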

Estimating House Rent Based on Size & Bedrooms

X = np.array([[1200, 2], [1500, 3], [1700, 3], [2000, 4]])  # [Size (sqft), Bedrooms]
y = np.array([1500, 1800, 2000, 2500])  # Monthly rent

model = LinearRegression()  # fresh model, independent of earlier examples
model.fit(X, y)
print("Predicted rent for 1600 sqft, 3 bedrooms:", model.predict([[1600, 3]])[0])

Final Thoughts

Linear Regression is a powerful tool when used correctly. It works best when data exhibits a linear relationship and can be applied in numerous real-world scenarios. For non-linear data, consider polynomial regression (a linear model fitted on expanded features) or more flexible models such as decision trees or neural networks.
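As a small illustration of that last point, the sketch below fits a plain line and a degree-2 polynomial model to data generated from y = x². The data is synthetic and the setup is only one possible way to wire this up in scikit-learn.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic curved data: y = x^2 on a symmetric range
x = np.linspace(-3, 3, 50).reshape(-1, 1)
y = x.ravel() ** 2

# A straight line cannot follow the curve
linear_r2 = LinearRegression().fit(x, y).score(x, y)

# Expanding x into [1, x, x^2] lets the same linear model fit it
poly_model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
poly_r2 = poly_model.fit(x, y).score(x, y)

print(f"R^2 with a straight line:      {linear_r2:.3f}")
print(f"R^2 with degree-2 polynomial:  {poly_r2:.3f}")
```

The straight line explains essentially none of the variance here, while the polynomial pipeline fits the curve almost exactly. The model is still linear in its coefficients; only the features changed.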
