🕸️ Neural Networks

Brain cells learning together

The Brain Cells Analogy

Your brain has billions of neurons connected in networks. Each neuron receives signals, processes them, and if the combined signal is strong enough, it fires a signal to the next neurons.

Artificial neural networks mimic this structure digitally.

They have layers of artificial "neurons" that receive inputs, apply weights, and pass results forward. Through training, these weights adjust until the network can recognize patterns - like learning to identify cats in photos or predict tomorrow's weather.


How Neural Networks Work

Basic Structure

Input Layer      Hidden Layer(s)     Output Layer

   [x1] ─┬─────────► [h1] ─┐
         │                 │
   [x2] ─┼─────────► [h2] ─┼──────────► [y]
         │                 │
   [x3] ─┴─────────► [h3] ─┘

Each connection has a weight that determines its importance.

A Single Neuron

function neuron(inputs, weights, bias) {
  // Weighted sum
  let sum = bias;
  for (let i = 0; i < inputs.length; i++) {
    sum += inputs[i] * weights[i];
  }

  // Activation function (ReLU)
  return Math.max(0, sum);
}

// Example with made-up values
const inputs = [0.5, 0.3, 0.2];
const weights = [0.4, 0.7, 0.2];
const bias = 0.1;

neuron(inputs, weights, bias); // ≈ 0.55  (0.1 + 0.5*0.4 + 0.3*0.7 + 0.2*0.2)

Forward Pass

Data flows through the network:

Input: [1, 0, 1]
           ↓
Layer 1: Apply weights, add bias, activate
           ↓
Layer 2: Apply weights, add bias, activate
           ↓
Output: [p]    (probability-like score for classification)
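The flow above can be sketched in a few lines of NumPy. The layer sizes and random weights here are made up purely for illustration; a trained network would have learned weights instead:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Hypothetical weights for a 3-input, 3-hidden, 1-output network.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)

x = np.array([1.0, 0.0, 1.0])    # input
h = relu(x @ W1 + b1)            # layer 1: weights, bias, activation
p = sigmoid(h @ W2 + b2)         # layer 2: weights, bias, activation
```

The final `p` is a single value squashed into (0, 1) by the sigmoid, which is what makes it usable as a probability-like score.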

Learning: Training a Network

The Training Loop

1. Forward pass: Input → Prediction
2. Calculate loss: How wrong was the prediction?
3. Backpropagation: Which weights caused the error?
4. Update weights: Adjust to reduce error
5. Repeat thousands of times
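The whole loop fits in a few lines for a single linear neuron. The data below is made up (targets follow y = 2x), so the learned weight should end up near 2:

```python
import numpy as np

# Toy data: learn y = 2x (values are hypothetical).
X = np.array([1.0, 2.0, 3.0, 4.0])
y = 2 * X

w, lr = 0.0, 0.05
for step in range(200):
    pred = w * X                        # 1. forward pass
    loss = np.mean((pred - y) ** 2)     # 2. loss (MSE)
    grad = np.mean(2 * (pred - y) * X)  # 3. gradient of loss w.r.t. w
    w -= lr * grad                      # 4. update weight
# 5. repeated 200 times; w converges toward 2
```

Step 3 is backpropagation in miniature: with one weight, the chain rule collapses to a single derivative.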

Loss Function

Measures how wrong the prediction is:

# Mean Squared Error (for regression)
loss = np.mean((predicted - actual) ** 2)

# Cross-Entropy (for classification)
loss = -np.sum(actual * np.log(predicted))

Gradient Descent

Adjust weights in the direction that reduces loss:

# Simplified weight update
weight = weight - learning_rate * gradient

The learning rate controls step size. Too large: overshoots. Too small: takes forever.


Activation Functions

Activation functions add non-linearity, allowing networks to learn complex patterns:

Function   Formula                        Use Case
ReLU       max(0, x)                      Hidden layers (most common)
Sigmoid    1 / (1 + e^-x)                 Binary output (0 to 1)
Tanh       (e^x - e^-x) / (e^x + e^-x)    Output -1 to 1
Softmax    e^xi / sum_j(e^xj)             Multi-class probabilities
import numpy as np

def relu(x):
    return np.maximum(0, x)  # elementwise, works on NumPy arrays

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

Types of Neural Networks

Type               Structure                  Use Case
Feedforward (MLP)  Fully connected layers     Tabular data, simple tasks
CNN                Convolutional layers       Images, spatial data
RNN/LSTM           Recurrent connections      Sequences, time series
Transformer        Attention mechanisms       Text, language models
GAN                Generator + Discriminator  Image generation

Real-World Example: Image Classification

import tensorflow as tf

# Build model
model = tf.keras.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),   # e.g. 28x28 grayscale images
  tf.keras.layers.Dense(128, activation='relu'),   # hidden size is a tunable choice
  tf.keras.layers.Dense(10, activation='softmax')  # e.g. 10 output classes
])

# Compile
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Train
model.fit(train_images, train_labels, epochs=5)

# Predict
predictions = model.predict(test_images)

Common Mistakes and Gotchas

Overfitting

Network memorizes training data but fails on new data:

# Signs of overfitting:
# - Training accuracy: 99%
# - Validation accuracy: 60%

# Solutions:
model.add(tf.keras.layers.Dropout(0.5))  # randomly drop 50% of activations during training
# Also: more data, simpler model, data augmentation

Vanishing Gradients

In deep networks, gradients become tiny and learning stops:

# Use ReLU instead of sigmoid
activation='relu'

# Use batch normalization
tf.keras.layers.BatchNormalization()

Not Normalizing Inputs

Unnormalized data causes training instability:

# Normalize to 0-1 range (e.g. 8-bit pixel values)
train_images = train_images / 255.0

# Or standardize (mean=0, std=1)
train_images = (train_images - train_images.mean()) / train_images.std()
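Both options can be checked end to end with NumPy; the "image" values below are made up to stand in for 8-bit pixels:

```python
import numpy as np

# Fake 8-bit "image" data (hypothetical values in the 0-255 range).
images = np.array([[0.0, 64.0, 128.0],
                   [192.0, 255.0, 32.0]])

scaled = images / 255.0                                 # now in [0, 1]
standardized = (images - images.mean()) / images.std()  # mean 0, std 1
```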

Wrong Learning Rate

Too high: Loss jumps around, may not converge
Too low: Training takes forever
Just right: Steady decrease in loss

Use learning rate schedulers or adaptive optimizers like Adam.
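The three regimes are easy to see on the toy loss f(w) = w², whose gradient is 2w. The learning rates here are chosen only to illustrate the contrast:

```python
def descend(lr, steps=20):
    """Run gradient descent on f(w) = w**2 from w = 1."""
    w = 1.0
    for _ in range(steps):
        w -= lr * 2 * w  # gradient of w**2 is 2w
    return w

good = descend(0.1)  # small enough: w shrinks toward the minimum at 0
bad = descend(1.1)   # too large: every step overshoots and |w| grows
```

With lr = 0.1 each step multiplies w by 0.8, so it decays; with lr = 1.1 each step multiplies w by -1.2, so it diverges.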


FAQ

Q: What is the difference between AI, ML, and deep learning?

AI is the broadest term (any intelligent system). Machine Learning is AI that learns from data. Deep Learning is ML using neural networks with many layers.

Q: How many layers do I need?

Start simple and add complexity if needed. More layers can be harder to train and more prone to overfitting.

Q: What is backpropagation?

The algorithm that calculates how much each weight contributed to the error, working backwards from output to input. It enables the network to learn.
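Backpropagation is just the chain rule applied backwards. Here it is by hand for one neuron with one weight (all the numbers are hypothetical):

```python
# y = w * x, loss = (y - target)^2
x, w, target = 2.0, 0.5, 3.0

y = w * x                    # forward pass: y = 1.0
loss = (y - target) ** 2     # loss = 4.0

dloss_dy = 2 * (y - target)  # how loss changes with y: -4.0
dy_dw = x                    # how y changes with w: 2.0
dloss_dw = dloss_dy * dy_dw  # chain rule: -8.0

w -= 0.1 * dloss_dw          # gradient step: w moves from 0.5 to 1.3
```

A deep network repeats this chaining through every layer, which is where the "back" in backpropagation comes from.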

Q: Do I need a GPU?

For small networks and datasets: CPU is fine. For deep learning with images or text at scale: GPU dramatically speeds up training.

Q: What is the difference between epoch and batch?

Epoch: one complete pass through all training data. Batch: a subset of data processed together before updating weights.
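The relationship between the two is easiest to see by counting weight updates; the sample counts below are made up:

```python
# Hypothetical run: 1000 samples, batch size 100, 3 epochs.
num_samples, batch_size, epochs = 1000, 100, 3

updates = 0
for epoch in range(epochs):                          # one epoch = full pass over data
    for start in range(0, num_samples, batch_size):  # one batch per iteration
        updates += 1                                 # weights update once per batch
# 10 batches per epoch x 3 epochs = 30 updates
```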

Q: Which activation function should I use?

ReLU is simple (max(0, x)), fast to compute, and helps avoid vanishing gradients. It works well for most hidden layers.


Summary

Neural networks are the foundation of modern AI. By adjusting weights through training, they learn to recognize patterns and make predictions.

Key Points:

  • Neurons receive inputs, apply weights, and pass through activation
  • Training adjusts weights to minimize prediction error
  • Loss functions measure how wrong predictions are
  • Backpropagation calculates which weights to adjust
  • Different architectures (CNN, RNN, Transformer) suit different tasks
  • Overfitting is the main challenge - use regularization
  • Normalize your input data before training

Neural networks power image recognition, language models, recommendation systems, and much more. Understanding the fundamentals opens the door to modern AI development.
