Decision Trees in Machine Learning + Real Solved Examples 🔥

Muhammad Taha
4 min readFeb 23, 2025


Decision tree examples with real code snippets and explanations for better understanding.

A Decision Tree is a supervised machine learning algorithm used for classification and regression tasks. It works by splitting data into smaller subsets based on feature conditions, forming a tree-like structure.

Each node represents a feature, each branch represents a decision, and each leaf node represents the final output (class label or value).

Example: If you want to decide whether to play outside based on the weather, a Decision Tree might look like this:

              Is it Rainy?
               /        \
             Yes         No
             /             \
         Windy?        Play Outside
         /    \
       Yes     No
       /         \
    Stay        Play
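The toy weather rule above can be reproduced with scikit-learn. This is a minimal sketch, assuming we encode Rainy and Windy as 0/1 features and "play outside" as label 1 (the encoding is our own choice, not part of any standard):

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical encoding: features are [rainy, windy] (1 = yes, 0 = no);
# label 1 = play outside, 0 = stay in.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [1, 1, 1, 0]  # only "rainy AND windy" keeps us inside

model = DecisionTreeClassifier()
model.fit(X, y)

# Print the learned rules as text - they mirror the diagram above.
print(export_text(model, feature_names=["rainy", "windy"]))
```

Running this prints the same nested rule as the diagram: the only path to "stay" is rainy = 1 and windy = 1.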

Why Use Decision Trees in ML?

✔ Interpretable: Easy to understand and visualize.
✔ Handles both numerical & categorical data.
✔ No need for feature scaling (like standardization or normalization).
✔ Works well with small datasets.
✔ Can capture non-linear relationships.

When to Use Decision Trees?

Use Decision Trees when:
✔ You need interpretability.
✔ Your data has non-linear relationships.
✔ You want a model that automatically selects important features.
✔ You need a model that can tolerate missing values (supported natively by some implementations, e.g., recent scikit-learn versions).

Real-World Examples & Code Implementations

Example 1: Predicting if a Customer Will Buy a Product

Using a Decision Tree to classify customers as Buyers (1) or Non-Buyers (0) based on age and salary.

from sklearn.tree import DecisionTreeClassifier

# Sample Data (Age, Salary)
X = [[25, 50000], [30, 60000], [35, 70000], [40, 80000], [45, 90000]]
y = [0, 0, 1, 1, 1] # 0 = Won't Buy, 1 = Will Buy

# Train Decision Tree
model = DecisionTreeClassifier()
model.fit(X, y)

# Predict for a new customer
new_customer = [[36, 72000]]  # values chosen clearly above the learned split thresholds
prediction = model.predict(new_customer)
print("Prediction:", "Will Buy" if prediction[0] == 1 else "Won't Buy")

Output:
Prediction: Will Buy

Example 2: Predicting Loan Approval

Predict whether a loan will be Approved (1) or Denied (0) based on credit score and income.

from sklearn.tree import DecisionTreeClassifier

# Training Data (Credit Score, Income)
X = [[600, 3000], [650, 4000], [700, 5000], [750, 6000], [800, 7000]]
y = [0, 0, 1, 1, 1] # 0 = Denied, 1 = Approved

# Train Decision Tree
model = DecisionTreeClassifier()
model.fit(X, y)

# Predict for a new applicant
new_applicant = [[680, 4800]]  # values chosen clearly above the learned split thresholds
prediction = model.predict(new_applicant)
print("Loan Prediction:", "Approved" if prediction[0] == 1 else "Denied")

Output:
Loan Prediction: Approved

Example 3: Classifying Iris Flowers

Using the famous Iris dataset to classify flower species.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load Dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split Data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Decision Tree
model = DecisionTreeClassifier()
model.fit(X_train, y_train)

# Predict
y_pred = model.predict(X_test)

# Accuracy
print("Accuracy:", accuracy_score(y_test, y_pred))

Output:
Accuracy: 97–100% (varies per run)
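One way to see the "automatic feature selection" mentioned earlier is the fitted tree's feature_importances_ attribute. A quick sketch on the same Iris data (the dominance of the petal measurements is a well-known property of this dataset):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Fit a tree on the full Iris dataset just to inspect importances.
iris = load_iris()
model = DecisionTreeClassifier(random_state=42)
model.fit(iris.data, iris.target)

# Importance scores sum to 1; higher means the feature drove more splits.
for name, score in zip(iris.feature_names, model.feature_importances_):
    print(f"{name}: {score:.3f}")
```

On Iris, the petal length and petal width features carry almost all of the importance, so the tree has effectively "selected" them without any manual feature engineering.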

Advantages & Disadvantages of Decision Trees

Advantages:

✔ Simple & Interpretable: Easy to understand and visualize.
✔ Feature Selection: Automatically selects important features.
✔ Handles Both Types of Data: Works with categorical & numerical data.
✔ No Need for Scaling: Unlike SVM or KNN, Decision Trees don’t require normalization.
✔ Can Model Non-Linear Data

Disadvantages:

❌ Overfitting: A deep tree can memorize data instead of generalizing.
❌ Sensitive to Small Changes: A slight variation in data can change the tree structure.
❌ Biased Toward Dominant Classes: On imbalanced data, the tree may favor the majority class.
❌ Computational Cost: Large trees become slow and complex.

Solution to Overfitting: Use pruning (cutting unnecessary branches) or Random Forests.
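As a sketch of both remedies within a single tree, scikit-learn exposes pre-pruning via max_depth and cost-complexity (post-)pruning via ccp_alpha. The specific values below are illustrative, not tuned:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Pre-pruning: cap the depth so the tree cannot memorize the training set.
shallow = DecisionTreeClassifier(max_depth=3, random_state=42).fit(X_train, y_train)

# Post-pruning: cost-complexity pruning removes branches whose impurity
# improvement is smaller than ccp_alpha.
pruned = DecisionTreeClassifier(ccp_alpha=0.02, random_state=42).fit(X_train, y_train)

print("max_depth=3 test accuracy:", shallow.score(X_test, y_test))
print("ccp_alpha=0.02 test accuracy:", pruned.score(X_test, y_test))
```

Both trees stay small and still generalize well on this dataset; in practice, max_depth and ccp_alpha are usually chosen by cross-validation.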

Where is Decision Tree Used in ML?

✔ Medical Diagnosis: Predict diseases based on symptoms.
✔ Finance: Loan approval, fraud detection.
✔ Customer Segmentation: Classifying customers for marketing.
✔ Retail: Predicting product demand.
✔ HR & Hiring: Employee attrition prediction.
✔ E-commerce: Recommender systems.

More Example Code Snippets

Predicting Employee Attrition

from sklearn.tree import DecisionTreeClassifier

X = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]  # [Years of Experience, Satisfaction Score]
y = [0, 0, 1, 1, 1]  # 0 = Stays, 1 = Leaves

model = DecisionTreeClassifier()
model.fit(X, y)
print("Attrition Prediction for employee (4, 5):", model.predict([[4, 5]])[0])

Predicting Fraudulent Transactions

from sklearn.tree import DecisionTreeClassifier

X = [[500, 1], [1000, 0], [1500, 1], [2000, 1], [2500, 0]]  # [Transaction Amount, Previous Fraud]
y = [0, 0, 1, 1, 1]  # 0 = Legit, 1 = Fraud

model = DecisionTreeClassifier()
model.fit(X, y)
print("Fraud Prediction for $1800 transaction:", model.predict([[1800, 1]])[0])

Final Thoughts

Decision Trees are powerful for classification and regression but are prone to overfitting. To improve performance, use pruning, limit tree depth, or switch to Random Forests for better generalization. 🚀
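To make the Random Forest suggestion concrete, here is a minimal sketch comparing a single tree against an ensemble on the Iris data (default hyperparameters, not tuned):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# A single, fully grown tree vs. an ensemble of 100 randomized trees.
tree = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)

print("Single tree accuracy:", tree.score(X_test, y_test))
print("Random Forest accuracy:", forest.score(X_test, y_test))
```

On an easy dataset like Iris the two scores are close; the forest's advantage (lower variance, less sensitivity to small data changes) shows up more on noisier, larger problems.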
