Loading graph…

Sparse Matrices and Truncated SVD

Sparse Matrix

A sparse matrix is a matrix where most entries are zero.

Basket-by-item data are usually sparse because each basket contains only a small fraction of all products.

Dense Storage

A dense matrix stores every cell:

\[n_{rows} \times n_{columns}\]

This is inefficient when most cells are zero.

Sparse Storage

A sparse matrix stores mainly the nonzero entries and their positions.

Storage is closer to:

\[O(\text{number of nonzero entries})\]

instead of:

\[O(n_{rows}n_{columns})\]

Basket-Item Matrix

For basket $i$ and item $j$:

\[x_{ij} = \text{quantity of item } j \text{ in basket } i\]

or for item presence:

\[x_{ij} = 1\{\text{item } j \text{ appears in basket } i\}\]

Most $x_{ij}$ values are zero.

SVD

Singular value decomposition factors a matrix $X$ as:

\[X = UDV^T\]

where:

  • $U$ contains left singular vectors
  • $D$ contains singular values
  • $V$ contains right singular vectors

PCA can be computed using SVD on centered/scaled data.

Truncated SVD

Truncated SVD computes only the first $k$ singular vectors instead of the full decomposition.

This is useful when:

  • the matrix is large
  • only the leading dimensions are needed
  • the matrix is sparse

Why It Matters

For high-dimensional retail data, full dense PCA can be memory-expensive.

Sparse matrices plus truncated SVD allow the analysis to focus on leading structure without materializing all zeros as explicit values.

See

25

25
Ready to start
Sparse Matrices and Truncated SVD
Session: 1 | Break: Short
Today: 0 sessions
Total: 0 sessions