Decision Tree In Machine Learning + Real Solved Examples 🔥
Decision tree examples with real, runnable code snippets and explanations for better understanding.
A Decision Tree is a supervised machine learning algorithm used for classification and regression tasks. It works by splitting data into smaller subsets based on feature conditions, forming a tree-like structure.
Each node represents a feature, each branch represents a decision, and each leaf node represents the final output (class label or value).
Example: If you want to decide whether to play outside based on the weather, a Decision Tree might look like this:
         Is it Rainy?
          /        \
        Yes         No
         |           |
      Windy?    Play Outside
       /    \
     Yes     No
      |       |
    Stay    Play
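To make the structure concrete, here is a minimal sketch of the same tree written as plain Python conditionals (the function name and boolean inputs are just illustrative):
def play_decision(rainy, windy):
    # Root node: split on "Is it Rainy?"
    if rainy:
        # Internal node: split on "Windy?"
        if windy:
            return "Stay"  # leaf
        return "Play"  # leaf
    return "Play Outside"  # leaf
print(play_decision(rainy=True, windy=False))  # Play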
Why Use Decision Trees in ML?
✔ Interpretable: Easy to understand and visualize.
✔ Handles both numerical & categorical data.
✔ No need for feature scaling (like standardization or normalization); see the sketch after this list.
✔ Works well with small datasets.
✔ Can capture non-linear relationships.
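To make the no-scaling point concrete, here is a small sketch on made-up data showing that rescaling a feature leaves a tree's predictions unchanged (the tree simply learns a rescaled threshold):
from sklearn.tree import DecisionTreeClassifier
X_raw = [[1, 100], [2, 200], [3, 300], [4, 400]]
X_scaled = [[1, 1.0], [2, 2.0], [3, 3.0], [4, 4.0]]  # second feature divided by 100
y = [0, 0, 1, 1]
# Fixing random_state makes tie-breaking between features reproducible
raw_model = DecisionTreeClassifier(random_state=0).fit(X_raw, y)
scaled_model = DecisionTreeClassifier(random_state=0).fit(X_scaled, y)
print(raw_model.predict([[2, 250]]))  # [0]
print(scaled_model.predict([[2, 2.5]]))  # [0] as well: same class, rescaled threshold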
When to Use Decision Trees?
Use Decision Trees when:
✔ You need interpretability.
✔ Your data has non-linear relationships.
✔ You want a model that automatically surfaces the important features (see the snippet after this list).
✔ You need a model that can tolerate missing values (handled natively by some implementations, such as C4.5, and by scikit-learn's trees since version 1.3).
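On the feature-selection point, scikit-learn trees expose a feature_importances_ attribute after fitting. A quick sketch on made-up data where only the first feature matters:
from sklearn.tree import DecisionTreeClassifier
# Toy data: the label depends only on the first feature
X = [[0, 7], [1, 3], [2, 9], [3, 1], [4, 5], [5, 2]]
y = [0, 0, 0, 1, 1, 1]
model = DecisionTreeClassifier(random_state=0).fit(X, y)
print(model.feature_importances_)  # ~[1.0, 0.0]: every split uses feature 0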
Real-World Examples & Code Implementations
Example 1: Predicting if a Customer Will Buy a Product
Using a Decision Tree to classify customers as Buyers (1) or Non-Buyers (0) based on age and salary.
from sklearn.tree import DecisionTreeClassifier
# Sample Data (Age, Salary)
X = [[25, 50000], [30, 60000], [35, 70000], [40, 80000], [45, 90000]]
y = [0, 0, 1, 1, 1] # 0 = Won't Buy, 1 = Will Buy
# Train Decision Tree
model = DecisionTreeClassifier()
model.fit(X, y)
# Predict for a new customer
new_customer = [[32, 65000]]
prediction = model.predict(new_customer)
print("Prediction:", "Will Buy" if prediction[0] == 1 else "Won't Buy")
Output:
Prediction: Won't Buy
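To see why the tree predicts this, you can print the learned rules with scikit-learn's export_text (the feature names are just my labels for the two columns; which of the two tied features the tree picks can vary per run):
from sklearn.tree import export_text
print(export_text(model, feature_names=["age", "salary"]))
# Output looks something like:
# |--- age <= 32.50
# |   |--- class: 0
# |--- age >  32.50
# |   |--- class: 1
# The 32-year-old falls on the class-0 ("Won't Buy") side of the split.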
Example 2: Predicting Loan Approval
Predict whether a loan will be Approved (1) or Denied (0) based on credit score and income.
from sklearn.tree import DecisionTreeClassifier
# Training Data (Credit Score, Income)
X = [[600, 3000], [650, 4000], [700, 5000], [750, 6000], [800, 7000]]
y = [0, 0, 1, 1, 1] # 0 = Denied, 1 = Approved
# Train Decision Tree
model = DecisionTreeClassifier()
model.fit(X, y)
# Predict for a new applicant (credit score 720 and income 5500 sit clearly on the approved side)
new_applicant = [[720, 5500]]
prediction = model.predict(new_applicant)
print("Loan Prediction:", "Approved" if prediction[0] == 1 else "Denied")
Output:
Loan Prediction: Approved
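If you also want a confidence estimate, scikit-learn classifiers expose predict_proba. For an unpruned tree the matching leaf is usually pure, so the probabilities tend to be hard 0/1:
proba = model.predict_proba(new_applicant)
print("P(Denied), P(Approved):", proba[0])  # e.g. [0. 1.] for a pure leaf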
Example 3: Classifying Iris Flowers
Using the famous Iris dataset to classify flower species.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load Dataset
iris = load_iris()
X, y = iris.data, iris.target
# Split Data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train Decision Tree
model = DecisionTreeClassifier()
model.fit(X_train, y_train)
# Predict
y_pred = model.predict(X_test)
# Accuracy
print("Accuracy:", accuracy_score(y_test, y_pred))
Output:
Accuracy: 0.97–1.0 (varies per run unless random_state is also set on the classifier)
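Because a single train/test split is noisy, 5-fold cross-validation gives a steadier estimate. A quick sketch:
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
X, y = load_iris(return_X_y=True)
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)
print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean())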
Advantages & Disadvantages of Decision Trees
Advantages:
✔ Simple & Interpretable: Easy to understand and visualize.
✔ Feature Selection: Automatically selects important features.
✔ Handles Both Types of Data: Works with categorical & numerical data (though scikit-learn's trees need categorical features encoded as numbers first).
✔ No Need for Scaling: Unlike SVM or KNN, Decision Trees don’t require normalization.
✔ Can Model Non-Linear Data
Disadvantages:
❌ Overfitting: A deep tree can memorize data instead of generalizing.
❌ Sensitive to Small Changes: A slight variation in data can change the tree structure.
❌ Biased by Imbalanced Data: If one class dominates, the tree may favor it.
❌ Computational Cost: Large trees become slow and complex.
Solution to Overfitting: Use pruning (cutting unnecessary branches) or Random Forests.
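A minimal sketch of pruning on the iris data from Example 3 (the hyperparameter values here are illustrative, not tuned):
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Pre-pruning: cap depth and leaf size so the tree can't memorize noise
shallow = DecisionTreeClassifier(max_depth=3, min_samples_leaf=5, random_state=0)
# Post-pruning: ccp_alpha trims branches whose impurity improvement is too small
pruned = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0)
for name, clf in [("depth-limited", shallow), ("ccp-pruned", pruned)]:
    clf.fit(X_train, y_train)
    print(name, "test accuracy:", clf.score(X_test, y_test))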
Where is Decision Tree Used in ML?
✔ Medical Diagnosis: Predict diseases based on symptoms.
✔ Finance: Loan approval, fraud detection.
✔ Customer Segmentation: Classifying customers for marketing.
✔ Retail: Predicting product demand.
✔ HR & Hiring: Employee attrition prediction.
✔ E-commerce: Recommender systems.
More Example Code Snippets
Predicting Employee Attrition
from sklearn.tree import DecisionTreeClassifier
X = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]  # [Experience, Satisfaction]
y = [0, 0, 1, 1, 1]  # 0 = Stays, 1 = Leaves
model = DecisionTreeClassifier()
model.fit(X, y)
print("Attrition Prediction for employee (4, 5):", model.predict([[4, 5]])[0])
Predicting Fraudulent Transactions
from sklearn.tree import DecisionTreeClassifier
X = [[500, 1], [1000, 0], [1500, 1], [2000, 1], [2500, 0]]  # [Transaction Amount, Previous Fraud]
y = [0, 0, 1, 1, 1]  # 0 = Legit, 1 = Fraud
model = DecisionTreeClassifier()
model.fit(X, y)
print("Fraud Prediction for $1800 transaction:", model.predict([[1800, 1]])[0])
Final Thoughts
Decision Trees are powerful for classification and regression but are prone to overfitting. To improve performance, use pruning, limit tree depth, or switch to Random Forests for better generalization. 🚀
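As a closing sketch, swapping the single tree for a Random Forest is nearly a one-line change (n_estimators=100 is just a common default, not a tuned value):
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Averaging many randomized trees smooths out any single tree's variance
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
print("Random Forest accuracy:", forest.score(X_test, y_test))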