Building a Deep Neural Network from Scratch

This post documents the journey of implementing a deep neural network (DNN) from first principles using only NumPy. The goal was to move beyond high-level libraries and understand how neural networks operate under the hood — mathematically and computationally. This work lays the foundation for more complex AI tasks in image classification, safety analysis, and intelligent fire systems.

1. What Is a Neural Network?

A neural network is a mathematical system inspired by the brain, designed to identify patterns. It takes in data, performs computations across several layers, and outputs predictions. Deep neural networks are especially powerful because they stack many layers to model highly non-linear relationships.

Think of it like this: each layer of the network transforms the input data slightly, passing it to the next layer until the final layer makes a decision — like classifying an image as "cat" or "not cat."

2. Network Structure

The architecture is flexible and supports any number of layers. Each layer performs two operations:

  • A linear transformation using weights and biases
  • A non-linear activation so the network can capture patterns a purely linear model cannot
Equation: Forward Propagation
Z[l] = W[l] · A[l−1] + b[l]
A[l] = g(Z[l])

ReLU: A = max(0, Z) for hidden layers

Sigmoid: A = 1 / (1 + exp(−Z)) for the output layer

Layman's View: Each layer is like a filter that highlights certain aspects of the input. ReLU decides what information should pass through, and Sigmoid squashes outputs into probabilities.
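
To make this concrete, here is a minimal NumPy sketch of a single forward step. The function names and argument layout are illustrative assumptions, not a fixed API.

Code Sketch: Forward Step
import numpy as np

def relu(Z):
    # Element-wise max(0, Z), used in the hidden layers
    return np.maximum(0, Z)

def sigmoid(Z):
    # Squashes Z into (0, 1), used in the output layer
    return 1 / (1 + np.exp(-Z))

def linear_activation_forward(A_prev, W, b, activation):
    # Linear transformation: Z[l] = W[l] · A[l-1] + b[l]
    Z = W @ A_prev + b
    # Non-linear activation: A[l] = g(Z[l])
    A = relu(Z) if activation == "relu" else sigmoid(Z)
    return A, Z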

3. Measuring Performance

To know how well the model is doing, we use a "loss function" to measure the error between predicted and actual values. Here, we use binary cross-entropy because the task is classification.

Equation: Cross-Entropy Cost
Cost = −(1/m) ∑ [Y log(AL) + (1−Y) log(1−AL)]

Layman's View: This formula penalizes confident wrong predictions more harshly. The better your model gets, the lower the cost becomes.
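
Here is the same cost written as a short NumPy function — a minimal sketch assuming AL and Y are row vectors of shape (1, m). Guarding the logs against predictions of exactly 0 or 1 is omitted for clarity.

Code Sketch: Cross-Entropy Cost
import numpy as np

def compute_cost(AL, Y):
    # AL: predicted probabilities, shape (1, m); Y: true labels, shape (1, m)
    m = Y.shape[1]
    # Cost = -(1/m) * sum(Y log(AL) + (1 - Y) log(1 - AL))
    cost = -(1 / m) * np.sum(Y * np.log(AL) + (1 - Y) * np.log(1 - AL))
    return float(np.squeeze(cost))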

4. Learning from Mistakes: Backpropagation

Backpropagation is the process of computing how much each weight and bias contributed to the error. Using the chain rule of calculus, we work backward through the network to figure out how to slightly tweak each parameter so the model does better next time.

Equations: Backpropagation Steps
dAL = −(Y / AL − (1 − Y) / (1 − AL))
dZ[l] = dA[l] ⊙ g′(Z[l])   (element-wise product)
dW[l] = (1/m) · dZ[l] · A[l−1]ᵀ
db[l] = (1/m) · ∑ dZ[l]   (summed over the m examples)
dA[l−1] = W[l]ᵀ · dZ[l]

Layman's View: These formulas tell us the "blame" for the error at each layer. We use that blame to gently nudge the parameters in the right direction.
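
The sketch below shows these steps for one layer with a ReLU activation. The helper names (relu_backward, linear_backward) are assumptions for illustration; they mirror the equations above directly.

Code Sketch: Backward Step
import numpy as np

def relu_backward(dA, Z):
    # dZ[l] = dA[l] ⊙ g'(Z[l]); for ReLU, g'(Z) is 1 where Z > 0, else 0
    return dA * (Z > 0)

def linear_backward(dZ, A_prev, W):
    m = A_prev.shape[1]
    # dW[l] = (1/m) · dZ[l] · A[l-1]ᵀ
    dW = (1 / m) * (dZ @ A_prev.T)
    # db[l] = (1/m) · sum of dZ[l] over the m examples
    db = (1 / m) * np.sum(dZ, axis=1, keepdims=True)
    # dA[l-1] = W[l]ᵀ · dZ[l] — the "blame" passed back to the previous layer
    dA_prev = W.T @ dZ
    return dA_prev, dW, db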

5. Updating the Parameters

After calculating the gradients, we update the weights and biases using gradient descent. This is an optimization method that tries to reduce the error at every step.

Equation: Gradient Descent
W[l] = W[l] − α · dW[l]
b[l] = b[l] − α · db[l]

where α is the learning rate, a small step size that controls how far each update moves.

Layman's View: Imagine you're walking downhill toward a valley — each step (or parameter update) gets you closer to the lowest point, where your model performs best.
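
As a sketch, the update can be written as the loop below, assuming parameters and gradients are stored in dictionaries keyed "W1", "b1", ..., "dW1", "db1", ... — a layout chosen for illustration, not the only option. In the full training loop, forward propagation, cost computation, backpropagation, and this update repeat for many iterations.

Code Sketch: Parameter Update
def update_parameters(params, grads, alpha):
    # One gradient-descent step per layer: W[l] -= alpha * dW[l], b[l] -= alpha * db[l]
    L = len(params) // 2  # each layer contributes one W and one b
    for l in range(1, L + 1):
        params[f"W{l}"] -= alpha * grads[f"dW{l}"]
        params[f"b{l}"] -= alpha * grads[f"db{l}"]
    return params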

6. Why This Matters

Understanding the internals of neural networks is essential when applying them to fields like fire protection engineering. A solid mathematical foundation ensures transparency, trust, and the ability to adapt models to safety-critical domains, such as:

  • Wildland-urban interface (WUI) hazard detection from satellite images
  • Failure prediction in fire suppression systems
  • AI-assisted fire code compliance engines

7. What’s Next

With this foundation in place, I will build two practical models:

  • A two-layer neural network for simpler tasks
  • A deep L-layer neural network for complex classification

The next post will apply this network to an image classification problem — distinguishing cats from non-cats — and extend the logic to fire-related datasets.