XGBoost


Handling Missing Data in XGBoost
Struggling with missing data? XGBoost handles it internally through its sparsity-aware split-finding algorithm. Learn how it picks a "default direction" for missing values at every tree split by sending them down each branch in turn and keeping the branch that yields the higher gain. This lets you train robust models directly on incomplete datasets without manual imputation.

Aryan
Sep 17
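
To make the "default direction" idea concrete, here is a minimal sketch in plain Python. It is not XGBoost's actual implementation: the function names, the `None` encoding for missing values, and the toy data are assumptions made for illustration, and the gain uses the usual gradient/Hessian form with an L2 penalty `lam`.

```python
def split_gain(g_left, h_left, g_right, h_right, lam=1.0):
    """Gain of a split from summed gradients (g) and Hessians (h) on each side."""
    def score(g, h):
        return g * g / (h + lam)
    parent = score(g_left + g_right, h_left + h_right)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right) - parent)


def best_default_direction(values, grads, hess, threshold, lam=1.0):
    """Try sending all missing rows left, then right; keep the better option.

    `values` uses None for a missing entry (an assumption of this sketch).
    """
    gl = hl = gr = hr = gm = hm = 0.0
    for x, g, h in zip(values, grads, hess):
        if x is None:            # missing: set aside for now
            gm += g
            hm += h
        elif x < threshold:      # present and below the threshold
            gl += g
            hl += h
        else:                    # present and at or above the threshold
            gr += g
            hr += h
    gain_left = split_gain(gl + gm, hl + hm, gr, hr, lam)
    gain_right = split_gain(gl, hl, gr + gm, hr + hm, lam)
    if gain_left >= gain_right:
        return "left", gain_left
    return "right", gain_right


# Toy data: squared-error gradients (prediction 0, so g = -y) and unit Hessians.
values = [1.0, 2.0, None, 8.0, 9.0, None]
grads = [-1.0, -1.0, -9.0, -9.0, -9.0, -9.0]
hess = [1.0] * 6
direction, gain = best_default_direction(values, grads, hess, threshold=5.0)
print(direction, round(gain, 2))
```

In this toy data the rows with missing values have targets that resemble the high-value group, so sending them to that side gives the larger gain.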


XGBoost Optimizations
XGBoost is one of the fastest gradient boosting implementations, designed for high-dimensional and large-scale datasets. This guide explains its core optimizations, including approximate split finding, quantile sketches, and weighted quantile sketches, which reduce computation time while maintaining high accuracy.

Aryan
Sep 12


XGBoost Regularization
XGBoost is a powerful boosting algorithm, but it can overfit if left unchecked. Regularization helps by simplifying trees, pruning unnecessary splits, and balancing the bias-variance trade-off. This guide explains overfitting, how XGBoost improves on Gradient Boosting, and key controls such as gamma, lambda, max_depth, min_child_weight, learning rate, subsample, and early stopping for building robust models.

Aryan
Sep 5


The Core Math Behind XGBoost
XGBoost isn’t just another boosting algorithm: its strength lies in the mathematics that powers its objective function, optimization, and tree-building strategy. In this post, we break down the core math behind XGBoost, from gradients and Hessians to the Taylor series approximation, leaf weight derivation, and similarity scores. By the end, you’ll understand how XGBoost balances accuracy with regularization to build powerful predictive models.

Aryan
Aug 26
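
As a compact preview of that derivation, the block below sketches the standard second-order form of the objective. The notation is an assumption, not taken from the post: $g_i$ and $h_i$ are the per-row gradient and Hessian of the loss, $I_j$ is the set of rows in leaf $j$, $w_j$ is the leaf weight, $T$ is the number of leaves, and $\lambda$, $\gamma$ are the regularization terms. Constant terms are dropped.

```latex
\begin{aligned}
\mathcal{L}^{(t)} &\approx \sum_{i=1}^{n}\Big[g_i f_t(x_i) + \tfrac{1}{2} h_i f_t(x_i)^2\Big]
  + \gamma T + \tfrac{1}{2}\lambda \sum_{j=1}^{T} w_j^2 \\
G_j &= \sum_{i \in I_j} g_i, \qquad H_j = \sum_{i \in I_j} h_i \\
w_j^{*} &= -\frac{G_j}{H_j + \lambda} \\
\mathcal{L}^{(t)*} &= -\frac{1}{2}\sum_{j=1}^{T} \frac{G_j^2}{H_j + \lambda} + \gamma T
\end{aligned}
```

The per-leaf quantity $G_j^2 / (H_j + \lambda)$ is what tutorials often call the similarity score; a split's gain compares this score for the two children against the parent.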


XGBoost for Classification
Master classification with XGBoost using a practical, beginner-friendly example. Understand how the algorithm builds decision trees, calculates log loss, optimizes splits, and uses probabilities to make accurate class predictions. A must-read for aspiring machine learning engineers.

Aryan
Aug 16


XGBoost for Regression
Dive into a step-by-step explanation of how XGBoost handles regression problems using a CGPA vs. salary dataset. Understand residual learning, tree construction, similarity scores, gain calculations, and how each stage progressively refines model accuracy. Ideal for beginners and intermediates mastering XGBoost.

Aryan
Aug 11


Introduction to XGBoost
XGBoost is one of the most powerful tools for structured/tabular data — known for its speed, scalability, and high performance. In this post, I’ve shared a detailed explanation of what makes XGBoost so effective, along with its history, features, and real-world use. A great resource for anyone learning ML!

Aryan
Jul 26