How to convert a Spark RDD to a Mahout DRM?
I am fetching data from Alluxio in Mahout using sc.textFile(), which returns a Spark RDD. The rest of my program needs that data as a Mahout DRM rather than a Spark RDD, so I need to convert the RDD to a DRM in a way that leaves the existing code unchanged.
An Apache Mahout DRM can be created from an Apache Spark RDD in the following steps:
- Convert each row of the RDD to a Mahout Vector.
- Zip the RDD with an index, and swap the elements so that each tuple has the form (Long, Vector).
- Wrap the resulting RDD as a DRM.
Consider the following example code:
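import org.apache.mahout.math.{DenseVector, Vector}
import org.apache.mahout.sparkbindings._

// Each row is an Array[Double], which is what DenseVector accepts
val rddA = sc.parallelize(Array(Array(1.0, 2.0, 3.0), Array(2.0, 3.0, 4.0), Array(4.0, 5.0, 6.0)))

// Rows to vectors, attach a row index, then swap to (Long, Vector)
val drmRddA: DrmRdd[Long] = rddA.map(a => (new DenseVector(a): Vector))
  .zipWithIndex()
  .map(t => (t._2, t._1))

// Wrap the keyed RDD as a DRM
val drmA = drmWrap(rdd = drmRddA)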
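Since the question starts from sc.textFile() rather than sc.parallelize(), here is a rough sketch of the same three steps applied to lines of text. It assumes each line holds comma-separated numbers, which the question does not actually state, and that the SparkContext is already configured for Mahout. The helper names parseLine and textFileToDrm are made up for illustration; only DenseVector, DrmRdd and drmWrap come from Mahout.

```scala
import org.apache.mahout.math.{DenseVector, Vector}
import org.apache.mahout.math.drm.CheckpointedDrm
import org.apache.mahout.sparkbindings._
import org.apache.spark.SparkContext

// Hypothetical helper: parse one line of comma-separated numbers
// (an assumed file format) into a Mahout Vector.
def parseLine(line: String): Vector =
  new DenseVector(line.split(",").map(_.trim.toDouble))

// Hypothetical helper: `path` is whatever URI is already being passed
// to sc.textFile(), for example the Alluxio location from the question.
def textFileToDrm(sc: SparkContext, path: String): CheckpointedDrm[Long] = {
  // Step 1: turn each non-empty line into a Vector
  val rows = sc.textFile(path)
    .filter(_.trim.nonEmpty)
    .map(parseLine)

  // Step 2: attach a row index and swap to (Long, Vector)
  val drmRdd: DrmRdd[Long] = rows
    .zipWithIndex()
    .map { case (vec, idx) => (idx, vec) }

  // Step 3: wrap the keyed RDD as a DRM
  drmWrap(rdd = drmRdd)
}
```

If the file uses a different delimiter or has a header row, only parseLine and the filter need to change; the zip, swap and wrap steps stay the same.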
Source / more info / shameless self-promotion (toward the bottom): my blog