- Machine Learning with Spark(Second Edition)
- Rajdeep Dua Manpreet Singh Ghotra Nick Pentreath
- 257字
- 2025-04-04 19:20:52
Singular value decomposition
Singular value decomposition of a matrix M: m x n (real or complex) is a factorization with the form UΣV*, where U is an m x R matrix. Σ is an R x R rectangular diagonal matrix with non-negative real numbers on the diagonal, and V is an n x r unitary matrix. r is equal to the rank of the matrix M.
The diagonal entries Σii of Sigma are known as the singular values of M. The columns of U and the columns of V are called the left-singular vectors and right-singular vectors of M respectively.
The following is an example of an SVD in Apache Spark:
package linalg.svd
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.mllib.linalg.{Matrix,
SingularValueDecomposition, Vector, Vectors}
object SparkSVDExampleOne {
def main(args: Array[String]) {
val denseData = Seq(
Vectors.dense(0.0, 1.0, 2.0, 1.0, 5.0, 3.3, 2.1),
Vectors.dense(3.0, 4.0, 5.0, 3.1, 4.5, 5.1, 3.3),
Vectors.dense(6.0, 7.0, 8.0, 2.1, 6.0, 6.7, 6.8),
Vectors.dense(9.0, 0.0, 1.0, 3.4, 4.3, 1.0, 1.0)
)
val spConfig = (new
SparkConf).setMaster("local").setAppName("SparkSVDDemo")
val sc = new SparkContext(spConfig)
val mat: RowMatrix = new RowMatrix(sc.parallelize(denseData, 2))
// Compute the top 20 singular values and corresponding
singular vectors.
val svd: SingularValueDecomposition[RowMatrix, Matrix] =
mat.computeSVD(7, computeU = true)
val U: RowMatrix = svd.U // The U factor is a RowMatrix.
val s: Vector = svd.s // The singular values are stored in a
local dense vector.
val V: Matrix = svd.V // The V factor is a local dense matrix.
println("U:" + U)
println("s:" + s)
println("V:" + V)
sc.stop()
}
}