
Singular Value Decomposition (SVD)
- Aryan

- Apr 21
Non-Square Matrix
Let’s first understand what a non-square matrix is, with an example, and how it relates to linear transformations.
A non-square matrix is a matrix in which the number of rows is not equal to the number of columns. Such matrices are important because they transform vectors from a space of one dimension into a space of a different dimension.
Example 1: Transforming 2D Vectors to 3D Space
Consider the matrix :

This is a 3 × 2 matrix. That means it takes 2D vectors as input and transforms them into 3D vectors.
Now, consider unit vectors in 2D space:
i = (1,0)
j = (0,1)

So, this matrix takes the 2D input space and maps it to 3D space. The original 2D coordinate system is thus embedded into a 3D coordinate system.
Input Space: Dimension where the original vectors lie (here, 2D).
Output Space: Dimension where the transformed vectors lie (here, 3D).
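As a quick sanity check, here is a small NumPy sketch of this idea. The 3 × 2 matrix below is only an illustrative stand-in (the original example matrix is not reproduced here):

```python
import numpy as np

# An illustrative 3 x 2 matrix: it accepts 2D vectors and outputs 3D vectors.
A = np.array([[1, 0],
              [0, 1],
              [2, 3]])

i = np.array([1, 0])   # unit vector i in 2D
j = np.array([0, 1])   # unit vector j in 2D

print(A @ i)  # image of i, a 3D vector: [1 0 2]
print(A @ j)  # image of j, a 3D vector: [0 1 3]
```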
Example 2: Transforming 3D Vectors to 2D Space
Now consider another matrix :

This is a 2 × 3 matrix. It takes 3D vectors as input and transforms them into 2D vectors.
Let’s multiply it with a 3D vector :

The result is a 2 × 1 vector, showing that the 3D input has been projected into a 2D space .
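Again, a minimal NumPy sketch with an illustrative 2 × 3 matrix and 3D vector (not necessarily the ones shown above):

```python
import numpy as np

# An illustrative 2 x 3 matrix: it accepts 3D vectors and outputs 2D vectors.
B = np.array([[1, 0, 2],
              [0, 1, 3]])

v = np.array([4, 5, 6])   # a 3D input vector

print(B @ v)          # a 2D output vector: [16 23]
print((B @ v).shape)  # (2,)
```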
General Concept
If we have a matrix of size m × n:
n is the dimension of the input vector (input space).
m is the dimension of the output vector (output space).

Rectangular Diagonal Matrix
Let’s understand what a rectangular diagonal matrix is, along with an example.
A rectangular diagonal matrix is a non-square matrix where the non-zero elements lie along a diagonal, and it typically combines two operations during a linear transformation:
Dimension reduction or projection, and
Scaling of the components.
Example 1: Composition of Two Transformations
Consider the matrix :
[ a  0  0 ]
[ 0  b  0 ]
This is a 2 × 3 matrix. We can break this transformation into two steps :
[ a  0  0 ]   =   [ a  0 ]  ×  [ 1  0  0 ]
[ 0  b  0 ]       [ 0  b ]     [ 0  1  0 ]
The 2 × 3 matrix on the right drops the third coordinate (projection from 3D to 2D), and the 2 × 2 matrix on the left scales the two remaining components.
So, the full transformation first reduces the dimension from 3D to 2D, then scales the vector components by a and b respectively.
Intuition
The opposite shape works the same way. A 3 × 2 rectangular diagonal matrix (say, with diagonal entries a and b) factors as :
[ a  0 ]   [ 1  0 ]
[ 0  b ] = [ 0  1 ]  ×  [ a  0 ]
[ 0  0 ]   [ 0  0 ]     [ 0  b ]
The first matrix is 3 × 2 and embeds the scaled vector into 3D space.
The second matrix is 2 × 2 and performs the scaling.
In this case:
The scaling happens first,
Then, the embedding into 3D happens by appending a zero in the third dimension.
When dealing with a rectangular diagonal matrix:
It represents a composition of two transformations.
You can often interpret the transformation as :
Scaling → Projection or Embedding (depending on matrix shape).
Whenever you see a rectangular diagonal matrix of size m × n, remember:
If m < n , it’s a projection (dimension reduction).
If m > n , it’s an embedding (adding extra dimensions with zero padding).
The diagonal values scale the relevant components.
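The sketch below checks this composition numerically, using the 2 × 3 rectangular diagonal matrix from the example with a = 3 and b = 2 (illustrative values):

```python
import numpy as np

a, b = 3.0, 2.0

# 2 x 3 rectangular diagonal matrix: project 3D -> 2D, then scale by a and b.
D = np.array([[a, 0, 0],
              [0, b, 0]])

P = np.array([[1, 0, 0],      # projection: drop the third coordinate (3D -> 2D)
              [0, 1, 0]])
S = np.array([[a, 0],         # scaling of the two remaining components
              [0, b]])

print(np.allclose(D, S @ P))  # True: D is the composition "project, then scale"

# The opposite shape (3 x 2) embeds after scaling: scale in 2D, then append a zero.
E = np.array([[1, 0],
              [0, 1],
              [0, 0]])
print(E @ S)                  # a 3 x 2 rectangular diagonal matrix
```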
What is SVD ?
Singular Value Decomposition (SVD) is a matrix factorization technique that decomposes any real or complex matrix into three other matrices:
A = UΣVᵀ
Where:
A is the original m × n matrix.
U is an m × m orthogonal matrix whose columns are the left singular vectors.
Σ is an m × n rectangular diagonal matrix with the singular values (non-negative real numbers) on its diagonal.
Vᵀ is the transpose of an n × n orthogonal matrix V, whose columns are the right singular vectors.
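In NumPy this decomposition is available directly. A minimal sketch with an arbitrary illustrative matrix:

```python
import numpy as np

A = np.array([[3.0, 1.0, 2.0],
              [2.0, 4.0, 1.0]])          # an arbitrary 2 x 3 matrix

U, s, Vt = np.linalg.svd(A)              # s holds the singular values as a 1D array

Sigma = np.zeros(A.shape)                # build the 2 x 3 rectangular diagonal Σ
Sigma[:len(s), :len(s)] = np.diag(s)

print(U.shape, Sigma.shape, Vt.shape)    # (2, 2) (2, 3) (3, 3)
print(np.allclose(A, U @ Sigma @ Vt))    # True: A = UΣVᵀ
```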
Applications of SVD
Machine Learning and Data Science
SVD is used in Principal Component Analysis (PCA) for dimensionality reduction, especially when dealing with high-dimensional data.
It’s widely applied in recommendation systems, like the Netflix recommendation algorithm.
Natural Language Processing (NLP)
In Latent Semantic Analysis (LSA), SVD helps extract semantic meaning from large text corpora by reducing the dimensions of term-document matrices.
Computer Vision
SVD is used for image compression. By keeping only the top singular values, an image can be reconstructed with less storage but high visual fidelity.
Signal Processing
Helps in noise reduction and signal separation, which is useful in communication systems and audio signal processing.
Numerical Linear Algebra
Used for matrix inversion, especially for matrices that are ill-conditioned or not invertible, making it a numerically stable method.
Psychometrics
In psychology and education, SVD is used to extract latent traits from psychological or educational test data.
Bioinformatics
SVD helps analyze gene expression data, identifying underlying patterns of gene activity.
Quantum Computing
Applied in quantum state tomography to analyze and reconstruct quantum states.
SVD – The Equation and Its Relationship with Eigen Decomposition
Now the question is: what are U, Σ, and V in SVD?
The answer lies in eigen decomposition.
Eigen Decomposition :
If we have a square, diagonalizable matrix A, we can break it down as :
A = VΛV⁻¹
Where :
A is an n × n matrix
V is the matrix of eigenvectors
Λ is the diagonal matrix of eigenvalues
Special Case : Symmetric Matrix
If A is not only square but also symmetric, then :
V becomes orthogonal
𝑉⁻¹ = Vᵀ
Λ remains a diagonal matrix
So the decomposition becomes :
A = VΛVᵀ
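A small NumPy sketch of both cases, using arbitrary illustrative matrices:

```python
import numpy as np

# General (diagonalizable) square matrix: A = V Λ V⁻¹
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
eigvals, V = np.linalg.eig(A)
Lam = np.diag(eigvals)
print(np.allclose(A, V @ Lam @ np.linalg.inv(V)))   # True

# Symmetric matrix: V is orthogonal, so V⁻¹ = Vᵀ and A = V Λ Vᵀ
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])
w, Q = np.linalg.eigh(S)                            # eigh is meant for symmetric matrices
print(np.allclose(S, Q @ np.diag(w) @ Q.T))         # True
print(np.allclose(Q.T @ Q, np.eye(2)))              # True: Q is orthogonal
```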
Connecting Eigen Decomposition with SVD
SVD Equation :
A = UΣVᵀ
Here, A can be non-square. That’s the power of SVD — it works for any real m × n matrix.
Now we explore the relationship between this SVD equation and eigen decomposition.
To do this, we convert A into a square symmetric matrix using the trick of multiplying it with its transpose :
AAᵀ = (UΣVᵀ)(UΣVᵀ)ᵀ = UΣVᵀVΣᵀUᵀ = U(ΣΣᵀ)Uᵀ
AᵀA = (UΣVᵀ)ᵀ(UΣVᵀ) = VΣᵀUᵀUΣVᵀ = V(ΣᵀΣ)Vᵀ
(Here VᵀV = I and UᵀU = I because U and V are orthogonal.)
Interpretation of U and V
Now, going back to our original SVD equation :
A = UΣVᵀ
We ask: What are U and V ?
U : Columns are eigenvectors of AAᵀ
V : Columns are eigenvectors of AᵀA
So we now understand:
U is derived from AAᵀ
V is derived from AᵀA
These matrices are not directly related to the original A , but rather indirectly through the symmetric matrices AAᵀ and AᵀA .
Singular Vectors and Values
With respect to A :
u and v are called singular vectors
Both U and V are orthogonal matrices
Σ contains the singular values (square roots of eigenvalues of AAᵀ or AᵀA)
U : Left singular vectors
V : Right singular vectors
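The claim above can be checked numerically: each column of U satisfies AAᵀu = σ²u, and each column of V satisfies AᵀAv = σ²v. A minimal sketch with an illustrative matrix (note that eigenvectors are only determined up to a sign):

```python
import numpy as np

A = np.array([[3.0, 1.0, 2.0],
              [2.0, 4.0, 1.0]])

U, s, Vt = np.linalg.svd(A)
V = Vt.T

for k, sigma in enumerate(s):
    u, v = U[:, k], V[:, k]
    # Columns of U are eigenvectors of AAᵀ with eigenvalue σ²
    print(np.allclose(A @ A.T @ u, sigma**2 * u))   # True
    # Columns of V are eigenvectors of AᵀA with eigenvalue σ²
    print(np.allclose(A.T @ A @ v, sigma**2 * v))   # True
```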
Now the question arises : What is Σ in SVD ?
From the derivation above, we know that :
AAᵀ = U(ΣΣᵀ)Uᵀ and AᵀA = V(ΣᵀΣ)Vᵀ
Now, let's recall eigen decomposition :
A = VΛV⁻¹
Here, Λ is a diagonal matrix whose entries are the eigenvalues of matrix A.
Comparing the two forms, ΣΣᵀ plays the role of Λ for X = AAᵀ, and ΣᵀΣ plays the role of Λ for Y = AᵀA.
X and Y share the same non-zero eigenvalues.
Let’s take an example with a 2 × 3 matrix A.
So, the SVD of A is :
A = UΣVᵀ
Where:
U is a 2 × 2 matrix (left singular vectors),
Σ is a 2 × 3 rectangular diagonal matrix (singular values),
Vᵀ is a 3 × 3 matrix (right singular vectors).
Let's define Σ :
Σ = [ a  0  0 ]
    [ 0  b  0 ]
Then :
ΣΣᵀ = [ a²  0  ]
      [ 0   b² ]
ΣᵀΣ = [ a²  0   0 ]
      [ 0   b²  0 ]
      [ 0   0   0 ]
What do these products represent ?
ΣΣᵀ is a diagonal matrix whose entries are the eigenvalues of X = AAᵀ.
ΣᵀΣ is a diagonal matrix whose entries are the eigenvalues of Y = AᵀA.
So:
X and Y have eigenvalues a² , b² (and 0 in the case of Y, which we can ignore for simplicity).
Taking square roots of these eigenvalues gives us a and b, which are the singular values.
Thus, in SVD:
The diagonal entries of Σ , i.e., a and b, are the square roots of the eigenvalues of either AAᵀ or AᵀA .
These are called the singular values.
Final SVD Summary :
A = UΣVᵀ
U : Columns are the left singular vectors, which are the eigenvectors of AAᵀ
V : Columns of V (before transposing) are the right singular vectors, which are eigenvectors of AᵀA
Σ : Contains the singular values a , b , which are the square roots of eigenvalues of AAᵀ or AᵀA
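To make the summary concrete, here is a short check on an arbitrary 2 × 3 matrix (illustrative values only): the singular values returned by SVD match the square roots of the non-zero eigenvalues of AAᵀ and AᵀA.

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])

s = np.linalg.svd(A, compute_uv=False)         # singular values, largest first

eig_AAt = np.linalg.eigvalsh(A @ A.T)          # eigenvalues of AAᵀ (2 values)
eig_AtA = np.linalg.eigvalsh(A.T @ A)          # eigenvalues of AᵀA (3 values, one ≈ 0)

print(np.allclose(np.sort(s**2), np.sort(eig_AAt)))        # True
print(np.allclose(np.sort(s**2), np.sort(eig_AtA)[-2:]))   # True (ignoring the zero)
```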
Geometric Intuition of SVD: Understanding A = UΣVᵀ
Let’s build geometric intuition behind the decomposition:
A = UΣVᵀ
Where A is a matrix that applies a linear transformation to vector space. This transformation can be broken into three parts:
Vᵀ– Rotation (or reflection) of the coordinate system
Σ – Stretching or scaling along the new axes
U – Another rotation (or reflection) applied to the stretched result
Together, these create the full transformation represented by matrix A .
Case: Symmetric Matrix
If A is symmetric, we can write it as :
A = VΛ𝑉⁻¹
Λ is a diagonal matrix of eigenvalues
V is an orthogonal matrix (its columns are the eigenvectors)
For symmetric matrices, SVD and eigen decomposition align, and you can clearly visualize the transformation.
Transformation Steps
Assume a 2 × 2 symmetric matrix A acting on the standard 2D Cartesian plane.
Step 1: Apply 𝑉⁻¹
This rotates the vector space counterclockwise—since V is orthogonal, the rotation maintains the 90° angle between basis vectors.
Step 2: Apply Λ
Now, the vectors are stretched along the new axes (eigenvectors) by the corresponding eigenvalues. Each axis is scaled independently—this is the heart of PCA and spectral analysis.
Step 3: Apply V
This rotates the stretched space back (clockwise), since applying V undoes the initial rotation V⁻¹. The space ends up aligned with the original basis again, but stretched along the eigenvector directions.
Interpretation
The original x-axis is rotated to a new position, and the y-axis similarly rotates to its new orientation.
After the scaling and the final rotation, we get the transformed version of space that matrix A represents.
This layered understanding of matrix transformation is super helpful for applications like PCA, image compression, and more .
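A NumPy sketch of the three steps for an illustrative 2 × 2 symmetric matrix: applying Vᵀ (rotate), then Λ (stretch), then V (rotate back) to a vector gives the same result as applying A directly.

```python
import numpy as np

A = np.array([[2.0, 1.0],     # an illustrative symmetric matrix
              [1.0, 2.0]])

w, V = np.linalg.eigh(A)      # eigenvalues w and orthogonal eigenvector matrix V
Lam = np.diag(w)

x = np.array([1.0, 0.0])      # start with the x-axis unit vector

step1 = V.T @ x               # rotate into the eigenvector coordinate system
step2 = Lam @ step1           # stretch along the eigen-directions
step3 = V @ step2             # rotate back to the original coordinate system

print(np.allclose(step3, A @ x))   # True: the three steps compose to A
```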
Understanding SVD Geometrically : Case of a 2 × 3 Matrix
For a non-square matrix the same three-step picture applies, except that Σ now also changes the dimension. For a 2 × 3 matrix A = UΣVᵀ :
Vᵀ (3 × 3) rotates (or reflects) the 3D input space,
Σ (2 × 3) projects the rotated space onto 2D and scales the surviving components,
U (2 × 2) rotates (or reflects) the result within the 2D output space.
Summary : Transformation Flow
Rotate (Vᵀ) → Project / Embed and Scale (Σ) → Rotate (U)
For a 3 × 2 matrix the flow is the same, but Σ embeds the 2D input into 3D (zero padding) instead of projecting. The sketch below traces both cases.
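Since the original interactive demos are not reproduced here, the following NumPy sketch traces the shapes through each factor for illustrative 2 × 3 and 3 × 2 matrices (the matrices and vectors are made up for demonstration):

```python
import numpy as np

def trace_svd(A, x):
    """Apply A = UΣVᵀ to x one factor at a time and print the shapes."""
    U, s, Vt = np.linalg.svd(A)
    Sigma = np.zeros(A.shape)
    Sigma[:len(s), :len(s)] = np.diag(s)

    step1 = Vt @ x            # rotate (or reflect) in the input space
    step2 = Sigma @ step1     # project or embed, and scale
    step3 = U @ step2         # rotate (or reflect) in the output space
    print(x.shape, "->", step1.shape, "->", step2.shape, "->", step3.shape)
    print(np.allclose(step3, A @ x))   # True

# Demo 1: a 2 x 3 matrix maps 3D input to 2D output (projection inside Σ)
trace_svd(np.array([[1.0, 0.0, 2.0], [0.0, 3.0, 1.0]]), np.array([1.0, 2.0, 3.0]))

# Demo 2: a 3 x 2 matrix maps 2D input to 3D output (embedding inside Σ)
trace_svd(np.array([[1.0, 0.0], [0.0, 3.0], [2.0, 1.0]]), np.array([1.0, 2.0]))
```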

How to Calculate SVD
The Singular Value Decomposition (SVD) of a matrix A is given by :
A = UΣVᵀ
Where :
U and V are orthogonal matrices,
Σ is a diagonal matrix with non-negative real numbers (the singular values of A) .
Using Eigen Decomposition to Derive SVD
If we are given a matrix A , and need to compute U , Σ , and Vᵀ, we can use the following method based on eigen decomposition :
Compute AAᵀ :
This matrix is symmetric and positive semi-definite.
Perform eigen decomposition on AAᵀ :
AAᵀ = UΛUᵀ
The columns of U are the eigenvectors of AAᵀ , and Λ contains the eigenvalues .
Compute AᵀA :
Similarly, perform eigen decomposition :
AᵀA = VΛVᵀ
The columns of V are the eigenvectors of AᵀA .
Relating Eigenvalues to Singular Values :
The singular values σᵢ are the square roots of the non-zero eigenvalues λᵢ :
σᵢ = √λᵢ
Numerical Stability Note
This approach using eigen decomposition works conceptually, but it's numerically unstable and not used in practice.
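For illustration only (this mirrors the conceptual route above, not what production routines do), one consistent way to build the factors is to take V and the eigenvalues from AᵀA and then recover each column of U as uᵢ = Avᵢ / σᵢ, which avoids sign mismatches between separately computed U and V. The matrix below is an arbitrary illustrative example:

```python
import numpy as np

A = np.array([[3.0, 1.0, 2.0],
              [2.0, 4.0, 1.0]])          # illustrative 2 x 3 matrix
m, n = A.shape

# Eigen decomposition of AᵀA (symmetric, positive semi-definite)
eigvals, V = np.linalg.eigh(A.T @ A)
order = np.argsort(eigvals)[::-1]        # sort eigenvalues in decreasing order
eigvals, V = eigvals[order], V[:, order]

s = np.sqrt(np.clip(eigvals, 0, None))[:min(m, n)]   # singular values
U = np.column_stack([A @ V[:, i] / s[i] for i in range(len(s)) if s[i] > 1e-12])

Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)

print(np.allclose(A, U @ Sigma @ V.T))   # True (for a full-rank A like this one)
```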
SVD in PCA
In Principal Component Analysis (PCA), we aim to find the principal components of the data. There are two main approaches to achieve this:
Using Eigen Decomposition of the Covariance Matrix
Using Singular Value Decomposition (SVD) Directly
Eigen Decomposition Approach
First, we center the data (subtract the mean from each feature).
Then, we compute the covariance matrix of the data.
For a dataset with shape n × d , the covariance matrix has shape d × d.
We perform eigen decomposition on the covariance matrix :
Cov(X) = QΛQᵀ
The eigenvectors represent the directions of principal components.
The eigenvalues represent the variance captured along each principal component.
SVD Approach (More Efficient)
Instead of computing the covariance matrix, we can directly apply SVD to the centered data matrix X .
Perform SVD :
X = UΣVᵀ
The columns of V (or rows of Vᵀ) give the principal directions.
The singular values in Σ relate to the variance captured along each component: for centered data with n samples, the variance along component i is σᵢ² / (n − 1).
Advantages of SVD in PCA :
No need to compute the covariance matrix.
More numerically stable and efficient, especially for high-dimensional data .
Commonly used in practical implementations like scikit-learn's PCA .
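A short sketch comparing scikit-learn's PCA with a manual SVD of the centered data. The data here is random and purely illustrative, and principal directions are only defined up to a sign flip, so the comparison uses absolute values:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))            # illustrative data: 100 samples, 4 features
Xc = X - X.mean(axis=0)                  # center the data

# PCA the library way
pca = PCA(n_components=2).fit(X)

# PCA via SVD of the centered data matrix
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Principal directions agree up to sign
print(np.allclose(np.abs(pca.components_), np.abs(Vt[:2])))           # True
# Explained variance along each component is σ² / (n − 1)
print(np.allclose(pca.explained_variance_, s[:2]**2 / (len(X) - 1)))  # True
```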
Understanding the Relationship Between SVD and Covariance in PCA
Let’s consider the Iris dataset. It contains 150 rows (samples) and 4 columns (features):
Sepal length
Sepal width
Petal length
Petal width
Hence, the data matrix X has dimensions 150 × 4.
If we calculate the covariance matrix of this data, it will be of shape 4 × 4 , since covariance is calculated between features .
Step-by-step: How Covariance is Computed
First, center X by subtracting each feature's mean. For centered data, the covariance matrix is Cov(X) = (1 / (n − 1)) XᵀX, which for the Iris data (n = 150) is a 4 × 4 matrix.
How SVD Connects to Covariance
If the centered data matrix has the SVD X = UΣVᵀ, then XᵀX = V(ΣᵀΣ)Vᵀ, so Cov(X) = V (ΣᵀΣ / (n − 1)) Vᵀ. The columns of V are therefore exactly the eigenvectors of the covariance matrix (the principal directions), and the eigenvalues of the covariance matrix are σᵢ² / (n − 1).
Why SVD is Powerful in PCA
We never have to form the d × d covariance matrix explicitly; SVD works directly on the centered data.
Working on X directly is more numerically stable, especially for high-dimensional data.
A single decomposition gives both the principal directions (columns of V) and the explained variances (σᵢ² / (n − 1)), which is why practical implementations such as scikit-learn's PCA use SVD under the hood.
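A minimal sketch on the Iris data itself, tying the pieces together (this assumes scikit-learn is available for loading the dataset):

```python
import numpy as np
from sklearn.datasets import load_iris

X = load_iris().data                      # shape (150, 4)
n = X.shape[0]
Xc = X - X.mean(axis=0)                   # center the features

# Covariance route: 4 x 4 matrix, then eigen decomposition
C = (Xc.T @ Xc) / (n - 1)
eigvals, Q = np.linalg.eigh(C)            # ascending eigenvalues

# SVD route: work directly on the centered 150 x 4 matrix
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Eigenvalues of the covariance matrix are σ² / (n − 1)
print(np.allclose(np.sort(s**2 / (n - 1)), eigvals))                  # True
# Principal directions match up to sign: rows of Vᵀ vs. eigenvectors of C
print(np.allclose(np.abs(Vt), np.abs(Q[:, ::-1].T)))                  # True
```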