Machine Learning 101: PyTorch, TensorFlow & Decision Trees
If you can put toys into big and small groups,
or ask “Is it an animal?” when you play Guess Who?,
then you already think like a machine learning model!
This post answers three gentle questions, one baby step at a time:
- What are PyTorch and TensorFlow?
- Why do we need them?
- How does a Decision Tree work?
No scary symbols—just tiny doses of math, colourful pictures, and short code you can run in a notebook.
1 Why Bother With ML Libraries?
Imagine you have to count all the marbles in a swimming pool—by hand. Ouch! A calculator (or a friendly big sibling) would help. In ML the “marbles” are millions of numbers. PyTorch and TensorFlow are the helpful siblings that:
| Hard thing | The library does it for you |
|---|---|
| Keep track of every number | Wrap them in tensors |
| Do giant chains of maths | Use your computer’s GPU (super fast) |
| Work out derivatives | Provide autograd (automatic calculus) |
| Save & reload models | One‑line functions like `torch.save()` |
Result: you spend time on ideas, not on counting marbles.
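Here is what that looks like in practice: a minimal PyTorch sketch (assuming PyTorch is installed; the file name `my_tensor.pt` is just an example) that touches three of the rows above: tensors, autograd, and saving.

```python
import torch

# Wrap numbers in a tensor and ask PyTorch to track gradients for them
x = torch.tensor([2.0, 3.0], requires_grad=True)

# A (tiny) chain of maths: y = x1^2 + x2^2
y = (x ** 2).sum()

# autograd works out the derivative dy/dx = 2x for us
y.backward()
print(x.grad)  # tensor([4., 6.])

# Saving is one line (the file name is just an example)
torch.save(x, "my_tensor.pt")
```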
2 Meet the Friendly Giants: PyTorch vs TensorFlow
| | PyTorch | TensorFlow / Keras |
|---|---|---|
| Feels like | Regular Python + NumPy | Building blocks that snap into a graph |
| Main fans | Researchers, hobby projects | Large companies, production apps |
| Debugging | `print()` works instantly | Eager execution (on by default since TF 2) |
| Fancy extras | TorchAudio, TorchVision, TorchText | TensorBoard, TF‑Lite (phones), TPUs |
Tiny rule‑of‑thumb: Prototyping fast? → PyTorch.
Shipping to millions of phones? → TensorFlow.
2·1 One‑Screen Demo – Linear Regression in Each Library
PyTorch
```python
import torch, torch.nn as nn

# Make 100 noisy points that follow y = 3x + 0.5
a = torch.randn(100, 1)
b = 3 * a + 0.5 + 0.1 * torch.randn_like(a)

model = nn.Linear(1, 1)
optim = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

for _ in range(300):
    optim.zero_grad()
    loss = loss_fn(model(a), b)
    loss.backward()
    optim.step()

print(model.weight.item(), model.bias.item())  # ~3.0 and ~0.5
```
TensorFlow / Keras
```python
import tensorflow as tf
from tensorflow.keras import layers

# Generate data
x = tf.random.normal((100, 1))
y = 3 * x + 0.5 + 0.1 * tf.random.normal((100, 1))

# Define the model using an Input layer
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    layers.Dense(1)
])
model.compile(optimizer="sgd", loss="mse")
model.fit(x, y, epochs=300, verbose=0)

# Print learned weight and bias
print(model.weights[0].numpy(), model.weights[1].numpy())  # ~3.0, ~0.5
```
Same maths, same answer—the libraries just do the heavy lifting.
3 Decision Trees: The “20 Questions” Algorithm
3·1 A Story First
Picture a basket of fruit: apples and oranges. You want a robot kid to tell them apart.
- Ask a yes/no question like “Is the fruit orange‑coloured?”
- Put every fruit that says yes on the left, every no on the right.
- Keep asking new questions on each pile until every pile holds only one kind of fruit.
- To classify a new fruit, start at the top question and follow the answers down to a leaf node.
That’s a Decision Tree—nothing fancier than a flow‑chart of yes/no gates.
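In code, that flow‑chart is nothing more than nested yes/no gates. Here is a hand‑written sketch (the question and labels are made up for illustration, not learned from data):

```python
def classify_fruit(is_orange_coloured: bool) -> str:
    """A hand-written, one-question 'decision tree'."""
    if is_orange_coloured:      # "Is the fruit orange-coloured?"
        return "🍊 orange"      # left pile: every fruit that said yes
    return "🍎 apple"           # right pile: every fruit that said no

print(classify_fruit(True))    # 🍊 orange
print(classify_fruit(False))   # 🍎 apple
```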
3·2 How Does the Tree Pick a “Good” Question?
It chooses the question that makes the piles cleaner.
We measure ‘clean-ness’ (or messiness) with a tiny formula called entropy. Think of entropy as the level of confusion in a pile:
- A pile that’s half apples, half oranges = very messy → entropy ≈ 1
- A pile that’s all apples = super clean → entropy = 0
🍎🍊 Real Talk: What’s This Formula?
The formula is:
\[H(S) = -\sum_{c} p_c \log_2 p_c\]
Where:
- \(S\) = a group of fruits (a pile)
- \(c\) = each class (like 🍎 or 🍊)
- \(p_c\) = the fraction of the pile that is class \(c\)
🧮 Let’s Do It With Fruit
Say your pile has 2 apples and 2 oranges. That means:
- \(p_{\text{apple}} = 2/4 = 0.5\)
- \(p_{\text{orange}} = 2/4 = 0.5\)
Plug it in:
\[H(S) = -[0.5 \log_2 0.5 + 0.5 \log_2 0.5] = -[0.5 \times (-1) + 0.5 \times (-1)] = 1.0\]
So this pile is very messy—totally mixed up.
Now, suppose we ask “Is the fruit orange-coloured?” and split the fruits like this:
- Left pile → both oranges → \(p = 1.0\) → entropy = 0
- Right pile → both apples → \(p = 1.0\) → entropy = 0
Boom! Now both piles are perfectly clean.
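If you want to check those numbers in code, here is a tiny sketch (the helper name `entropy` is just for illustration):

```python
from math import log2

def entropy(pile):
    """H(S) = -sum(p_c * log2(p_c)) over the classes in the pile."""
    total = len(pile)
    probs = [pile.count(label) / total for label in set(pile)]
    return -sum(p * log2(p) for p in probs)

print(entropy(["🍎", "🍎", "🍊", "🍊"]))  # 1.0 -> very messy
print(entropy(["🍎", "🍎"]))              # 0.0 -> super clean (Python may show -0.0)
```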
🔍 What’s Information Gain?
Information Gain is just the drop in entropy when you split a pile:
\[\text{Gain} = H(\text{parent}) - \left(\frac{|L|}{|S|}H(L) + \frac{|R|}{|S|}H(R)\right)\]
- \(H(\text{parent})\) = entropy of the big pile before splitting
- \(H(L)\), \(H(R)\) = entropy of the left/right piles
- \(|L|/|S|\), \(|R|/|S|\) = how big each new pile is, as a fraction
In our example:
- Before split: entropy = 1.0
- After split: both piles = 0.0
- So Gain = 1.0 – 0 = 1.0 ← perfect!
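Here is the same calculation as a small sketch, reusing the hypothetical `entropy` helper from the previous snippet:

```python
def information_gain(parent, left, right):
    """Entropy drop after splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

parent = ["🍊", "🍊", "🍎", "🍎"]
left, right = ["🍊", "🍊"], ["🍎", "🍎"]
print(information_gain(parent, left, right))  # 1.0 -> a perfect split
```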
Summary:
The decision tree is basically playing “20 Questions,” trying to find the question that makes the biggest mess shrink. That’s what “Information Gain” measures.
(Don’t panic: Scikit‑Learn figures this all out for you automatically.)
3·3 Hands‑On Example With Only Four Fruits
| # | Colour score (0 = light, 1 = dark) | Diameter (cm) | Label |
|---|---|---|---|
| 1 | 0 | 3.0 | 🍊 |
| 2 | 0 | 3.2 | 🍊 |
| 3 | 1 | 3.4 | 🍎 |
| 4 | 1 | 3.6 | 🍎 |
Try the question “Colour ≤ 0.5?”
- Left pile → rows 1 & 2 → all oranges → entropy = 0
- Right pile → rows 3 & 4 → all apples → entropy = 0
- Parent entropy = 1.0 → Gain = 1.0 – 0 = 1.0 (perfect!)
One question and the job is done.
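You can ask scikit‑learn to find that split itself. A minimal sketch using only the colour column from the table (assuming scikit‑learn is installed; the feature name "colour" is just a label):

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Colour scores from the table (0 = light, 1 = dark), one row per fruit
X = [[0], [0], [1], [1]]
y = ["orange", "orange", "apple", "apple"]

tree = DecisionTreeClassifier(criterion="entropy").fit(X, y)
print(export_text(tree, feature_names=["colour"]))
# One gate, roughly "colour <= 0.5": yes -> orange, no -> apple
```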
3·4 Code: Grow & Draw a Tree
```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, plot_tree
import matplotlib.pyplot as plt

X, y = load_iris(return_X_y=True)
model = DecisionTreeClassifier(max_depth=3, criterion="entropy").fit(X, y)

plt.figure(figsize=(10, 6))
plot_tree(model,
          feature_names=["sepal len", "sepal wid", "petal len", "petal wid"],
          class_names=load_iris().target_names,
          filled=True, rounded=True)
plt.show()
```
Run it in a notebook—the picture spells out every yes/no gate.
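Once the tree is grown, you can also ask it about a new flower. The measurements below are made up for illustration:

```python
# [sepal len, sepal wid, petal len, petal wid] for one made-up flower
sample = [[5.0, 3.4, 1.5, 0.2]]
prediction = model.predict(sample)
print(load_iris().target_names[prediction][0])  # most likely 'setosa'
```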
4 Trees vs Neural Nets at a Glance
| | Decision Tree | Neural Net (PyTorch / TF) |
|---|---|---|
| How it learns | Greedy splits, no calculus | Gradient descent + autograd |
| Needs scaling? | No | Often yes |
| Explains itself easily? | Yes (draw the tree) | Hard (needs extra tools) |
| Loves small tabular data | ✔️ | ❌ (needs lots of data) |
5 Try It Yourself 🧪
- Change the Fruit Basket – Make up your own table with shape, colour, and weight, then run `DecisionTreeClassifier` again.
- PyTorch vs TensorFlow – Rewrite the tiny linear‑regression demo above in the other library.
- Entropy on Paper – Calculate the entropy of the `Colour ≤ 0.5` split by hand and check that it really is 0.
6 Cheat‑Sheet 🧾
- PyTorch = Pythonic, research‑friendly tensor toolkit
- TensorFlow/Keras = End‑to‑end, production‑friendly ML factory
- Decision Tree = A flow‑chart that asks yes/no questions to classify data
- Entropy = Messiness in a pile (0 = pure, 1 ≈ mixed)
- Info Gain = Entropy drop after a split
- Best Split = Question with the biggest Info Gain
🚀 Final Words
If you can sort fruit or play “20 Questions,” you already grasp the soul of Decision Trees.
Frameworks like PyTorch and TensorFlow are simply powerful calculators that help your computer juggle the numbers so you don’t have to.
Master these gentle ideas and the door to the ML playground swings wide open.
Happy learning—and may your questions always split the pile just right!