
Singular Value Decomposition (SVD)
- Aryan

- Apr 21
Non-Square Matrix
Let’s first understand what a non-square matrix is, with an example, and how it relates to linear transformations.
A non-square matrix is a matrix in which the number of rows is not equal to the number of columns. Such matrices are important because they transform vectors from a space of one dimension into a space of a different dimension.
Example 1: Transforming 2D Vectors to 3D Space
Consider the matrix :

This is a 3 × 2 matrix. That means it takes 2D vectors as input and transforms them into 3D vectors.
Now, consider unit vectors in 2D space:
i = (1,0)
j = (0,1)

So, this matrix takes the 2D input space and maps it to 3D space. The original 2D coordinate system is thus embedded into a 3D coordinate system.
Input Space: Dimension where the original vectors lie (here, 2D).
Output Space: Dimension where the transformed vectors lie (here, 3D).
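As a quick sanity check, here is a small NumPy sketch of this idea. The 3 × 2 matrix below is only an illustrative stand-in (the original example matrix is not reproduced here):

```python
import numpy as np

# An illustrative 3 x 2 matrix: it accepts 2D vectors and outputs 3D vectors.
A = np.array([[1, 0],
              [0, 1],
              [2, 3]])

i = np.array([1, 0])   # unit vector i in 2D
j = np.array([0, 1])   # unit vector j in 2D

print(A @ i)  # image of i, a 3D vector: [1 0 2]
print(A @ j)  # image of j, a 3D vector: [0 1 3]
```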
Example 2: Transforming 3D Vectors to 2D Space
Now consider another matrix :

This is a 2 × 3 matrix. It takes 3D vectors as input and transforms them into 2D vectors.
Let’s multiply it with a 3D vector :

The result is a 2 × 1 vector, showing that the 3D input has been projected into a 2D space .
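Again, a minimal NumPy sketch with an illustrative 2 × 3 matrix and 3D vector (not necessarily the ones shown above):

```python
import numpy as np

# An illustrative 2 x 3 matrix: it accepts 3D vectors and outputs 2D vectors.
B = np.array([[1, 0, 2],
              [0, 1, 3]])

v = np.array([4, 5, 6])   # a 3D input vector

print(B @ v)          # a 2D output vector: [16 23]
print((B @ v).shape)  # (2,)
```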
General Concept
If we have a matrix of size m × n:
n is the dimension of the input vector (input space).
m is the dimension of the output vector (output space).

Rectangular Diagonal Matrix
Let’s understand what a rectangular diagonal matrix is, along with an example.
A rectangular diagonal matrix is a non-square matrix where the non-zero elements lie along a diagonal, and it typically combines two operations during a linear transformation:
Dimension reduction or projection, and
Scaling of the components.
Example 1: Composition of Two Transformations
Consider the matrix :
[ a  0  0 ]
[ 0  b  0 ]
This is a 2 × 3 matrix. We can break this transformation into two steps :
[ a  0  0 ]   =   [ a  0 ]  ×  [ 1  0  0 ]
[ 0  b  0 ]       [ 0  b ]     [ 0  1  0 ]
The 2 × 3 matrix on the right drops the third coordinate (projection from 3D to 2D), and the 2 × 2 matrix on the left scales the two remaining components.
So, the full transformation first reduces the dimension from 3D to 2D, then scales the vector components by a and b respectively.
Intuition
The opposite shape works the same way. A 3 × 2 rectangular diagonal matrix (say, with diagonal entries a and b) factors as :
[ a  0 ]   [ 1  0 ]
[ 0  b ] = [ 0  1 ]  ×  [ a  0 ]
[ 0  0 ]   [ 0  0 ]     [ 0  b ]
The first matrix is 3 × 2 and embeds the scaled vector into 3D space.
The second matrix is 2 × 2 and performs the scaling.
In this case:
The scaling happens first,
Then, the embedding into 3D happens by appending a zero in the third dimension.
When dealing with a rectangular diagonal matrix:
It represents a composition of two transformations.
You can often interpret the transformation as :
Scaling → Projection or Embedding (depending on matrix shape).
Whenever you see a rectangular diagonal matrix of size m × n, remember:
If m < n , it’s a projection (dimension reduction).
If m > n , it’s an embedding (adding extra dimensions with zero padding).
The diagonal values scale the relevant components.
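The sketch below checks this composition numerically, using the 2 × 3 rectangular diagonal matrix from the example with a = 3 and b = 2 (illustrative values):

```python
import numpy as np

a, b = 3.0, 2.0

# 2 x 3 rectangular diagonal matrix: project 3D -> 2D, then scale by a and b.
D = np.array([[a, 0, 0],
              [0, b, 0]])

P = np.array([[1, 0, 0],      # projection: drop the third coordinate (3D -> 2D)
              [0, 1, 0]])
S = np.array([[a, 0],         # scaling of the two remaining components
              [0, b]])

print(np.allclose(D, S @ P))  # True: D is the composition "project, then scale"

# The opposite shape (3 x 2) embeds after scaling: scale in 2D, then append a zero.
E = np.array([[1, 0],
              [0, 1],
              [0, 0]])
print(E @ S)                  # a 3 x 2 rectangular diagonal matrix
```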
What is SVD ?
Singular Value Decomposition (SVD) is a matrix factorization technique that decomposes any real or complex matrix into three other matrices:
A = UΣVᵀ
Where:
A is the original m × n matrix.
U is an m × m orthogonal matrix whose columns are the left singular vectors.
Σ is an m × n rectangular diagonal matrix with the singular values (non-negative real numbers) on its diagonal.
Vᵀ is the transpose of an n × n orthogonal matrix V, whose columns are the right singular vectors.
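In NumPy this decomposition is available directly. A minimal sketch with an arbitrary illustrative matrix:

```python
import numpy as np

A = np.array([[3.0, 1.0, 2.0],
              [2.0, 4.0, 1.0]])          # an arbitrary 2 x 3 matrix

U, s, Vt = np.linalg.svd(A)              # s holds the singular values as a 1D array

Sigma = np.zeros(A.shape)                # build the 2 x 3 rectangular diagonal Σ
Sigma[:len(s), :len(s)] = np.diag(s)

print(U.shape, Sigma.shape, Vt.shape)    # (2, 2) (2, 3) (3, 3)
print(np.allclose(A, U @ Sigma @ Vt))    # True: A = UΣVᵀ
```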
Applications of SVD
Machine Learning and Data Science
SVD is used in Principal Component Analysis (PCA) for dimensionality reduction, especially when dealing with high-dimensional data.
It’s widely applied in recommendation systems, like the Netflix recommendation algorithm.
Natural Language Processing (NLP)
In Latent Semantic Analysis (LSA), SVD helps extract semantic meaning from large text corpora by reducing the dimensions of term-document matrices.
Computer Vision
SVD is used for image compression. By keeping only the top singular values, an image can be reconstructed with less storage but high visual fidelity.
Signal Processing
Helps in noise reduction and signal separation, which is useful in communication systems and audio signal processing.
Numerical Linear Algebra
Used for matrix inversion, especially for matrices that are ill-conditioned or not invertible, making it a numerically stable method.
Psychometrics
In psychology and education, SVD is used to extract latent traits from psychological or educational test data.
Bioinformatics
SVD helps analyze gene expression data, identifying underlying patterns of gene activity.
Quantum Computing
Applied in quantum state tomography to analyze and reconstruct quantum states.
SVD – The Equation and Its Relationship with Eigen Decomposition
Now the question is: what are U, Σ, and V in SVD?
The answer lies in eigen decomposition.
Eigen Decomposition :
If we have a square, diagonalizable matrix A, we can break it down as :
A = VΛV⁻¹
Where :
A is an n × n matrix
V is the matrix of eigenvectors
Λ is the diagonal matrix of eigenvalues
Special Case : Symmetric Matrix
If A is not only square but also symmetric, then :
V becomes orthogonal
𝑉⁻¹ = Vᵀ
Λ remains a diagonal matrix
So the decomposition becomes :
A = VΛVᵀ
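A small NumPy sketch of both cases, using arbitrary illustrative matrices:

```python
import numpy as np

# General (diagonalizable) square matrix: A = V Λ V⁻¹
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
eigvals, V = np.linalg.eig(A)
Lam = np.diag(eigvals)
print(np.allclose(A, V @ Lam @ np.linalg.inv(V)))   # True

# Symmetric matrix: V is orthogonal, so V⁻¹ = Vᵀ and A = V Λ Vᵀ
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])
w, Q = np.linalg.eigh(S)                            # eigh is meant for symmetric matrices
print(np.allclose(S, Q @ np.diag(w) @ Q.T))         # True
print(np.allclose(Q.T @ Q, np.eye(2)))              # True: Q is orthogonal
```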
Connecting Eigen Decomposition with SVD
SVD Equation :
A = UΣVᵀ
Here, A can be non-square. That’s the power of SVD — it works for any real m × n matrix.
Now we explore the relationship between this SVD equation and eigen decomposition.
To do this, we convert A into a square symmetric matrix using the trick of multiplying it with its transpose :
AAᵀ = (UΣVᵀ)(UΣVᵀ)ᵀ = UΣVᵀVΣᵀUᵀ = U(ΣΣᵀ)Uᵀ
AᵀA = (UΣVᵀ)ᵀ(UΣVᵀ) = VΣᵀUᵀUΣVᵀ = V(ΣᵀΣ)Vᵀ
(Here VᵀV = I and UᵀU = I because U and V are orthogonal.)
Interpretation of U and V
Now, going back to our original SVD equation :
A = UΣVᵀ
We ask: What are U and V ?
U : Columns are eigenvectors of AAᵀ
V : Columns are eigenvectors of AᵀA
So we now understand:
U is derived from AAᵀ
V is derived from AᵀA
These matrices are not directly related to the original A , but rather indirectly through the symmetric matrices AAᵀ and AᵀA .
Singular Vectors and Values
With respect to A :
u and v are called singular vectors
Both U and V are orthogonal matrices
Σ contains the singular values (square roots of eigenvalues of AAᵀ or AᵀA)
U : Left singular vectors
V : Right singular vectors
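The claim above can be checked numerically: each column of U satisfies AAᵀu = σ²u, and each column of V satisfies AᵀAv = σ²v. A minimal sketch with an illustrative matrix (note that eigenvectors are only determined up to a sign):

```python
import numpy as np

A = np.array([[3.0, 1.0, 2.0],
              [2.0, 4.0, 1.0]])

U, s, Vt = np.linalg.svd(A)
V = Vt.T

for k, sigma in enumerate(s):
    u, v = U[:, k], V[:, k]
    # Columns of U are eigenvectors of AAᵀ with eigenvalue σ²
    print(np.allclose(A @ A.T @ u, sigma**2 * u))   # True
    # Columns of V are eigenvectors of AᵀA with eigenvalue σ²
    print(np.allclose(A.T @ A @ v, sigma**2 * v))   # True
```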
Now the question arises : What is Σ in SVD ?
From the derivation above, we know that :
AAᵀ = U(ΣΣᵀ)Uᵀ and AᵀA = V(ΣᵀΣ)Vᵀ
Now, let's recall eigen decomposition :
A = VΛV⁻¹
Here, Λ is a diagonal matrix whose entries are the eigenvalues of matrix A.
Comparing the two forms, ΣΣᵀ plays the role of Λ for X = AAᵀ, and ΣᵀΣ plays the role of Λ for Y = AᵀA.
X and Y share the same non-zero eigenvalues.
Let’s take an example with a 2 × 3 matrix A.
So, the SVD of A is :
A = UΣVᵀ
Where:
U is a 2 × 2 matrix (left singular vectors),
Σ is a 2 × 3 rectangular diagonal matrix (singular values),
Vᵀ is a 3 × 3 matrix (right singular vectors).
Let's define Σ :
Σ = [ a  0  0 ]
    [ 0  b  0 ]
Then :
ΣΣᵀ = [ a²  0  ]
      [ 0   b² ]
ΣᵀΣ = [ a²  0   0 ]
      [ 0   b²  0 ]
      [ 0   0   0 ]
What do these products represent ?
ΣΣᵀ is a diagonal matrix whose entries are the eigenvalues of X = AAᵀ.
ΣᵀΣ is a diagonal matrix whose entries are the eigenvalues of Y = AᵀA.
So:
X and Y have eigenvalues a² , b² (and 0 in the case of Y, which we can ignore for simplicity).
Taking square roots of these eigenvalues gives us a and b, which are the singular values.
Thus, in SVD:
The diagonal entries of Σ , i.e., a and b, are the square roots of the eigenvalues of either AAᵀ or AᵀA .
These are called the singular values.
Final SVD Summary :
A = UΣVᵀ
U : Columns are the left singular vectors, which are the eigenvectors of AAᵀ
V : Columns of V (before transposing) are the right singular vectors, which are eigenvectors of AᵀA
Σ : Contains the singular values a , b , which are the square roots of eigenvalues of AAᵀ or AᵀA
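To make the summary concrete, here is a short check on an arbitrary 2 × 3 matrix (illustrative values only): the singular values returned by SVD match the square roots of the non-zero eigenvalues of AAᵀ and AᵀA.

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])

s = np.linalg.svd(A, compute_uv=False)         # singular values, largest first

eig_AAt = np.linalg.eigvalsh(A @ A.T)          # eigenvalues of AAᵀ (2 values)
eig_AtA = np.linalg.eigvalsh(A.T @ A)          # eigenvalues of AᵀA (3 values, one ≈ 0)

print(np.allclose(np.sort(s**2), np.sort(eig_AAt)))        # True
print(np.allclose(np.sort(s**2), np.sort(eig_AtA)[-2:]))   # True (ignoring the zero)
```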
Geometric Intuition of SVD: Understanding A = UΣVᵀ
Let’s build geometric intuition behind the decomposition:
A = UΣVᵀ
Where A is a matrix that applies a linear transformation to vector space. This transformation can be broken into three parts:
Vᵀ– Rotation (or reflection) of the coordinate system
Σ – Stretching or scaling along the new axes
U – Another rotation (or reflection) applied to the stretched result
Together, these create the full transformation represented by matrix A .
Case: Symmetric Matrix
If A is symmetric, we can write it as :
A = VΛ𝑉⁻¹
Λ is a diagonal matrix of eigenvalues
V is an orthogonal matrix (its columns are the eigenvectors)
For symmetric matrices, SVD and eigen decomposition align, and you can clearly visualize the transformation.
Transformation Steps
Assume a 2 × 2 symmetric matrix A acting on the standard 2D Cartesian plane.
Step 1: Apply 𝑉⁻¹
This rotates the vector space counterclockwise—since V is orthogonal, the rotation maintains the 90° angle between basis vectors.
Step 2: Apply Λ
Now, the vectors are stretched along the new axes (eigenvectors) by the corresponding eigenvalues. Each axis is scaled independently—this is the heart of PCA and spectral analysis.
Step 3: Apply V
This rotates the stretched space back (clockwise), since applying V undoes the initial rotation V⁻¹. The space ends up aligned with the original basis again, but stretched along the eigenvector directions.
Interpretation
The original x-axis is rotated to a new position, and the y-axis similarly rotates to its new orientation.
After the scaling and the final rotation, we get the transformed version of space that matrix A represents.
This layered understanding of matrix transformation is super helpful for applications like PCA, image compression, and more .
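A NumPy sketch of the three steps for an illustrative 2 × 2 symmetric matrix: applying Vᵀ (rotate), then Λ (stretch), then V (rotate back) to a vector gives the same result as applying A directly.

```python
import numpy as np

A = np.array([[2.0, 1.0],     # an illustrative symmetric matrix
              [1.0, 2.0]])

w, V = np.linalg.eigh(A)      # eigenvalues w and orthogonal eigenvector matrix V
Lam = np.diag(w)

x = np.array([1.0, 0.0])      # start with the x-axis unit vector

step1 = V.T @ x               # rotate into the eigenvector coordinate system
step2 = Lam @ step1           # stretch along the eigen-directions
step3 = V @ step2             # rotate back to the original coordinate system

print(np.allclose(step3, A @ x))   # True: the three steps compose to A
```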
Understanding SVD Geometrically : Case of a 2 × 3 Matrix
For a non-square matrix the same three-step picture applies, except that Σ now also changes the dimension. For a 2 × 3 matrix A = UΣVᵀ :
Vᵀ (3 × 3) rotates (or reflects) the 3D input space,
Σ (2 × 3) projects the rotated space onto 2D and scales the surviving components,
U (2 × 2) rotates (or reflects) the result within the 2D output space.
Summary : Transformation Flow
Rotate (Vᵀ) → Project / Embed and Scale (Σ) → Rotate (U)
For a 3 × 2 matrix the flow is the same, but Σ embeds the 2D input into 3D (zero padding) instead of projecting. The sketch below traces both cases.
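Since the original interactive demos are not reproduced here, the following NumPy sketch traces the shapes through each factor for illustrative 2 × 3 and 3 × 2 matrices (the matrices and vectors are made up for demonstration):

```python
import numpy as np

def trace_svd(A, x):
    """Apply A = UΣVᵀ to x one factor at a time and print the shapes."""
    U, s, Vt = np.linalg.svd(A)
    Sigma = np.zeros(A.shape)
    Sigma[:len(s), :len(s)] = np.diag(s)

    step1 = Vt @ x            # rotate (or reflect) in the input space
    step2 = Sigma @ step1     # project or embed, and scale
    step3 = U @ step2         # rotate (or reflect) in the output space
    print(x.shape, "->", step1.shape, "->", step2.shape, "->", step3.shape)
    print(np.allclose(step3, A @ x))   # True

# Demo 1: a 2 x 3 matrix maps 3D input to 2D output (projection inside Σ)
trace_svd(np.array([[1.0, 0.0, 2.0], [0.0, 3.0, 1.0]]), np.array([1.0, 2.0, 3.0]))

# Demo 2: a 3 x 2 matrix maps 2D input to 3D output (embedding inside Σ)
trace_svd(np.array([[1.0, 0.0], [0.0, 3.0], [2.0, 1.0]]), np.array([1.0, 2.0]))
```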

How to Calculate SVD
The Singular Value Decomposition (SVD) of a matrix A is given by :
A = UΣVᵀ
Where :
U and V are orthogonal matrices,
Σ is a diagonal matrix with non-negative real numbers (the singular values of A) .
Using Eigen Decomposition to Derive SVD
If we are given a matrix A , and need to compute U , Σ , and Vᵀ, we can use the following method based on eigen decomposition :
Compute AAᵀ :
This matrix is symmetric and positive semi-definite.
Perform eigen decomposition on AAᵀ :
AAᵀ = UΛUᵀ
The columns of U are the eigenvectors of AAᵀ , and Λ contains the eigenvalues .
Compute AᵀA :
Similarly, perform eigen decomposition :
AᵀA = VΛVᵀ
The columns of V are the eigenvectors of AᵀA .
Relating Eigenvalues to Singular Values :
The singular values σᵢ are the square roots of the non-zero eigenvalues λᵢ :
σᵢ = √λᵢ
Numerical Stability Note
This approach using eigen decomposition works conceptually, but it's numerically unstable and not used in practice.
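For illustration only (this mirrors the conceptual route above, not what production routines do), one consistent way to build the factors is to take V and the eigenvalues from AᵀA and then recover each column of U as uᵢ = Avᵢ / σᵢ, which avoids sign mismatches between separately computed U and V. The matrix below is an arbitrary illustrative example:

```python
import numpy as np

A = np.array([[3.0, 1.0, 2.0],
              [2.0, 4.0, 1.0]])          # illustrative 2 x 3 matrix
m, n = A.shape

# Eigen decomposition of AᵀA (symmetric, positive semi-definite)
eigvals, V = np.linalg.eigh(A.T @ A)
order = np.argsort(eigvals)[::-1]        # sort eigenvalues in decreasing order
eigvals, V = eigvals[order], V[:, order]

s = np.sqrt(np.clip(eigvals, 0, None))[:min(m, n)]   # singular values
U = np.column_stack([A @ V[:, i] / s[i] for i in range(len(s)) if s[i] > 1e-12])

Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)

print(np.allclose(A, U @ Sigma @ V.T))   # True (for a full-rank A like this one)
```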
SVD in PCA
In Principal Component Analysis (PCA), we aim to find the principal components of the data. There are two main approaches to achieve this:
Using Eigen Decomposition of the Covariance Matrix
Using Singular Value Decomposition (SVD) Directly
Eigen Decomposition Approach
First, we center the data (subtract the mean from each feature).
Then, we compute the covariance matrix of the data.
For a dataset with shape n × d , the covariance matrix has shape d × d.
We perform eigen decomposition on the covariance matrix :
Cov(X) = QΛQᵀ
The eigenvectors represent the directions of principal components.
The eigenvalues represent the variance captured along each principal component.
SVD Approach (More Efficient)
Instead of computing the covariance matrix, we can directly apply SVD to the centered data matrix X .
Perform SVD :
X = UΣVᵀ
The columns of V (or rows of Vᵀ) give the principal directions.
The singular values in Σ relate to the variance captured along each component: for centered data with n samples, the variance along component i is σᵢ² / (n − 1).
Advantages of SVD in PCA :
No need to compute the covariance matrix.
More numerically stable and efficient, especially for high-dimensional data .
Commonly used in practical implementations like scikit-learn's PCA .
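A short sketch comparing scikit-learn's PCA with a manual SVD of the centered data. The data here is random and purely illustrative, and principal directions are only defined up to a sign flip, so the comparison uses absolute values:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))            # illustrative data: 100 samples, 4 features
Xc = X - X.mean(axis=0)                  # center the data

# PCA the library way
pca = PCA(n_components=2).fit(X)

# PCA via SVD of the centered data matrix
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Principal directions agree up to sign
print(np.allclose(np.abs(pca.components_), np.abs(Vt[:2])))           # True
# Explained variance along each component is σ² / (n − 1)
print(np.allclose(pca.explained_variance_, s[:2]**2 / (len(X) - 1)))  # True
```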
Understanding the Relationship Between SVD and Covariance in PCA
Let’s consider the Iris dataset. It contains 150 rows (samples) and 4 columns (features):
Sepal length
Sepal width
Petal length
Petal width
Hence, the data matrix X has dimensions 150 × 4.
If we calculate the covariance matrix of this data, it will be of shape 4 × 4 , since covariance is calculated between features .
Step-by-step: How Covariance is Computed
First, center X by subtracting each feature's mean. For centered data, the covariance matrix is Cov(X) = (1 / (n − 1)) XᵀX, which for the Iris data (n = 150) is a 4 × 4 matrix.
How SVD Connects to Covariance
If the centered data matrix has the SVD X = UΣVᵀ, then XᵀX = V(ΣᵀΣ)Vᵀ, so Cov(X) = V (ΣᵀΣ / (n − 1)) Vᵀ. The columns of V are therefore exactly the eigenvectors of the covariance matrix (the principal directions), and the eigenvalues of the covariance matrix are σᵢ² / (n − 1).
Why SVD is Powerful in PCA
We never have to form the d × d covariance matrix explicitly; SVD works directly on the centered data.
Working on X directly is more numerically stable, especially for high-dimensional data.
A single decomposition gives both the principal directions (columns of V) and the explained variances (σᵢ² / (n − 1)), which is why practical implementations such as scikit-learn's PCA use SVD under the hood.
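A minimal sketch on the Iris data itself, tying the pieces together (this assumes scikit-learn is available for loading the dataset):

```python
import numpy as np
from sklearn.datasets import load_iris

X = load_iris().data                      # shape (150, 4)
n = X.shape[0]
Xc = X - X.mean(axis=0)                   # center the features

# Covariance route: 4 x 4 matrix, then eigen decomposition
C = (Xc.T @ Xc) / (n - 1)
eigvals, Q = np.linalg.eigh(C)            # ascending eigenvalues

# SVD route: work directly on the centered 150 x 4 matrix
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Eigenvalues of the covariance matrix are σ² / (n − 1)
print(np.allclose(np.sort(s**2 / (n - 1)), eigvals))                  # True
# Principal directions match up to sign: rows of Vᵀ vs. eigenvectors of C
print(np.allclose(np.abs(Vt), np.abs(Q[:, ::-1].T)))                  # True
```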