How LSTMs Work: A Deep Dive into Gates and Information Flow
Long Short-Term Memory (LSTM) networks overcome the vanishing-gradient limitations of traditional RNNs through a powerful gating mechanism. This article explains how the Forget, Input, and Output gates work internally, breaking down the math, vector dimensions, and intuition behind cell states and hidden states. A deep, implementation-level guide for serious deep learning practitioners.

Aryan
Feb 4
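
To give a flavor of the mechanics the article covers, here is a minimal single-timestep LSTM sketch in NumPy. It is a sketch under assumptions, not the article's own code: the stacked-gate parameter layout and the names W, U, b, and lstm_step are illustrative.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(x_t, h_prev, c_prev, W, U, b):
        """One LSTM timestep, with the forget (f), input (i), candidate (g),
        and output (o) transforms stacked row-wise in W, U, b (illustrative layout)."""
        H = h_prev.shape[0]
        z = W @ x_t + U @ h_prev + b        # pre-activations, shape (4*H,)
        f = sigmoid(z[0:H])                 # forget gate: what to erase from c_prev
        i = sigmoid(z[H:2*H])               # input gate: how much new info to admit
        g = np.tanh(z[2*H:3*H])             # candidate cell update
        o = sigmoid(z[3*H:4*H])             # output gate: what to expose as h_t
        c_t = f * c_prev + i * g            # new cell state
        h_t = o * np.tanh(c_t)              # new hidden state
        return h_t, c_t

    # Example shapes for hidden size H=3 and input size D=2.
    H, D = 3, 2
    rng = np.random.default_rng(0)
    W, U, b = rng.normal(size=(4*H, D)), rng.normal(size=(4*H, H)), np.zeros(4*H)
    h, c = lstm_step(rng.normal(size=D), np.zeros(H), np.zeros(H), W, U, b)

Stacking all four gate transforms into a single matrix multiply is a common efficiency trick; the slicing above makes the vector dimensions explicit.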


Backpropagation Through Time (BPTT) Explained Step-by-Step with a Simple RNN Example
Backpropagation in RNNs is often confusing because every weight is reused at each timestep, so a single weight affects the loss through multiple time-dependent paths. In this post, I break down Backpropagation Through Time step by step using a small toy dataset, clearly showing how gradients flow across timesteps and why unfolding the network is necessary.

Aryan
Jan 28
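
As a taste of the walkthrough, here is a minimal scalar-RNN BPTT sketch in NumPy; the toy sequence, target, and weight values below are made up for illustration and are not the post's actual dataset.

    import numpy as np

    # Scalar RNN: h_t = tanh(w * h_{t-1} + u * x_t), with a loss on the final state.
    w, u = 0.5, 0.3                          # shared weights, reused at every timestep
    xs = np.array([1.0, -1.0, 0.5])          # toy input sequence (illustrative)
    target = 0.2

    # Forward pass: unfold the network and cache each hidden state.
    hs = [0.0]
    for x in xs:
        hs.append(np.tanh(w * hs[-1] + u * x))
    loss = 0.5 * (hs[-1] - target) ** 2

    # Backward pass (BPTT): w receives a gradient contribution from every
    # timestep it was used, accumulated while stepping backward in time.
    dL_dw = 0.0
    dL_dh = hs[-1] - target                  # dL/dh_T
    for t in reversed(range(len(xs))):
        dtanh = 1.0 - hs[t + 1] ** 2         # tanh'(pre_t), since hs[t+1] = tanh(pre_t)
        dL_dpre = dL_dh * dtanh
        dL_dw += dL_dpre * hs[t]             # path through w at timestep t
        dL_dh = dL_dpre * w                  # gradient flowing back to h_{t-1}

    print(loss, dL_dw)

The accumulation line is the crux: without unfolding the network across timesteps, the per-timestep contributions to dL/dw would be invisible.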