Exploring Opportunities in AI & Machine Learning


Layer Normalization Explained: Why Transformers Prefer It Over Batch Norm
Layer Normalization is a core component of modern Transformer architectures. This article explains normalization fundamentals, internal covariate shift, why batch normalization fails in self-attention, and how layer normalization works mathematically inside Transformers, step by step with clear examples.
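The article covers the mathematics in full; as a taste of the idea, here is a minimal NumPy sketch (not from the article itself) of the core operation, normalizing each token's feature vector over its own dimensions rather than over the batch:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each token's feature vector to zero mean and unit variance.
    # x: (seq_len, d_model); statistics are computed per row (per token),
    # independently of the batch -- this is why it suits Transformers,
    # where batch statistics are unreliable for variable-length sequences.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

# Two "tokens" with very different scales end up on the same footing.
tokens = np.array([[1.0, 2.0, 3.0],
                   [10.0, 20.0, 30.0]])
normed = layer_norm(tokens)
print(normed.round(3))
```

In a real Transformer layer this is followed by learnable gain and bias parameters; the sketch omits them for brevity.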

Aryan
Mar 6


The Vanishing Gradient Problem & How to Optimize Neural Network Performance
This blog post explains the Vanishing Gradient Problem in deep neural networks: why gradients shrink, how shrinking gradients stall learning, and proven fixes such as ReLU, BatchNorm, and Residual Networks. It also covers essential strategies for improving neural network performance, including hyperparameter tuning, architecture optimization, and troubleshooting common training issues.
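To illustrate the core effect the post describes (this numeric sketch is mine, not the article's): a sigmoid's derivative never exceeds 0.25, so chaining it through many layers multiplies the gradient toward zero, while ReLU's derivative of 1 on positive inputs leaves the chain intact:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

depth = 10  # number of stacked layers in this toy chain
z = 0.0     # sigmoid'(0) = 0.25 is its maximum value

# Backprop multiplies per-layer derivatives; with sigmoids the product
# shrinks geometrically: 0.25**10 is roughly 1e-6.
sig_grad_chain = (sigmoid(z) * (1.0 - sigmoid(z))) ** depth

# ReLU's derivative is 1 for positive activations, so the same chain
# passes the gradient through unchanged.
relu_grad_chain = 1.0 ** depth

print(sig_grad_chain, relu_grad_chain)
```

This is why swapping saturating activations for ReLU (alongside BatchNorm and residual connections) is the standard remedy the post walks through.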

Aryan
Nov 28, 2025