Run SVD from Apache Spark

Create a unigram/5-gram matrix array:
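The post does not show how array is built. A minimal sketch, assuming rows are distinct unigrams, columns are distinct 5-grams, and each cell counts how often the unigram occurs inside that 5-gram. The class NgramMatrix and the method build are hypothetical names, not part of Spark, and the real matrix may be defined differently:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NgramMatrix {
    /** Builds a dense unigram-by-5-gram count matrix from a token sequence. */
    public static double[][] build(String[] tokens) {
        final int n = 5;

        // Give each distinct token a row index in first-seen order.
        Map<String, Integer> rowIndex = new LinkedHashMap<>();
        for (String t : tokens) {
            rowIndex.putIfAbsent(t, rowIndex.size());
        }

        // Give each distinct 5-gram a column index in first-seen order.
        Map<String, Integer> colIndex = new LinkedHashMap<>();
        List<String[]> grams = new ArrayList<>();
        for (int i = 0; i + n <= tokens.length; i++) {
            String[] gram = Arrays.copyOfRange(tokens, i, i + n);
            String key = String.join(" ", gram);
            if (!colIndex.containsKey(key)) {
                colIndex.put(key, colIndex.size());
                grams.add(gram);
            }
        }

        // Cell (r, c) = number of times unigram r appears in 5-gram c.
        double[][] array = new double[rowIndex.size()][colIndex.size()];
        for (int c = 0; c < grams.size(); c++) {
            for (String t : grams.get(c)) {
                array[rowIndex.get(t)][c] += 1.0;
            }
        }
        return array;
    }
}
```

Whatever the real construction is, the result only needs to be a double[][] with one row per matrix row, because Vectors.dense takes a double[].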

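import java.util.LinkedList;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.linalg.DenseMatrix;
import org.apache.spark.mllib.linalg.Matrix;
import org.apache.spark.mllib.linalg.SingularValueDecomposition;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.mllib.linalg.distributed.RowMatrix;

// Convert to Spark data: one dense Vector per row of array (a double[][]).
LinkedList<Vector> rowsList = new LinkedList<Vector>();
for (int i = 0; i < array.length; i++) {
    Vector currentRow = Vectors.dense(array[i]);
    rowsList.add(currentRow);
}
// sc is an existing SparkContext.
JavaRDD<Vector> rows = JavaSparkContext.fromSparkContext(sc).parallelize(rowsList);

// Create a RowMatrix from JavaRDD<Vector>.
RowMatrix mat = new RowMatrix(rows.rdd());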

Compute the SVD, keeping at most the top 60 singular values:

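// Arguments: k = 60, computeU = true, rCond = 1.0E-9.
// Fewer than 60 singular values may be returned if the rest are treated as zero.
SingularValueDecomposition<RowMatrix, Matrix> svd = mat.computeSVD(60, true, 1.0E-9d);
RowMatrix U = svd.U();

// Convert the RowMatrix U to a local DenseMatrix.
// toBreeze() is internal to MLlib and collects every row to the driver.
// Breeze's toArray returns values in column-major order, so isTransposed is false.
DenseMatrix U_matrix = new DenseMatrix((int) U.numRows(), (int) U.numCols(), U.toBreeze().toArray$mcD$sp(), false);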
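// Invert the singular values. Clone first: for a dense vector, toArray()
// returns the backing array, so editing it in place would also change svd.s().
Vector s = svd.s();
double[] s_arr = s.toArray().clone();

for (int id = 0; id < s.size(); id++) {
    s_arr[id] = 1 / s_arr[id];
}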

Vector s_inverse = Vectors.dense(s_arr);

Matrix sm_inverse = DenseMatrix.diag(s_inverse);
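
The inversion of the singular values can also be sketched on plain arrays, without Spark. This is a minimal sketch and not MLlib's API: the class SingularValues and the methods reciprocal and diag are hypothetical names. The tolerance guard is an added assumption. It maps values at or below tol to 0, which is the usual pseudo-inverse convention, rather than dividing by zero:

```java
public class SingularValues {
    /** Returns a new array holding 1/s[i], or 0 where s[i] <= tol. */
    public static double[] reciprocal(double[] s, double tol) {
        double[] inv = new double[s.length];
        for (int i = 0; i < s.length; i++) {
            inv[i] = s[i] > tol ? 1.0 / s[i] : 0.0;
        }
        return inv;
    }

    /** Builds a square matrix with d on the diagonal and zeros elsewhere. */
    public static double[][] diag(double[] d) {
        double[][] m = new double[d.length][d.length];
        for (int i = 0; i < d.length; i++) {
            m[i][i] = d[i];
        }
        return m;
    }
}
```

Because reciprocal writes into a fresh array, the input singular values are left untouched.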

Matrix V = svd.V();
Matrix V_T = V.transpose();
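
For reference, here is how the pieces computed above relate. The post does not say what sm_inverse and V_T are used for afterwards, so this is only a summary of the standard SVD identities. A is the input matrix (mat), U is U_matrix, Σ⁻¹ is sm_inverse, Vᵀ is V_T, and k is the number of singular values actually returned:

```latex
\[
A \approx U \Sigma V^{T}, \qquad
\Sigma = \operatorname{diag}(s_1, \dots, s_k), \qquad
\Sigma^{-1} = \operatorname{diag}(1/s_1, \dots, 1/s_k), \qquad k \le 60
\]
\[
U^{T} U = I
\;\Longrightarrow\;
U^{T} A = \Sigma V^{T}
\;\Longrightarrow\;
V^{T} = \Sigma^{-1} U^{T} A
\]
```

The last identity gives a quick sanity check on the result: multiplying sm_inverse by the transpose of U_matrix and by the original matrix should reproduce V_T up to floating-point error.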


anh
