Can someone give a detailed explanation of how the aggregate action below produces the result (9,4) in Spark?

```scala
val rdd = sc.parallelize(List(1, 2, 3, 3))
rdd.aggregate((0, 0))(
  (x, y) => (x._1 + y, x._2 + 1),
  (x, y) => (x._1 + y._1, x._2 + y._2)
)
// res: (9,4)
```

This is Spark 2.1.0 here (which should not matter much, but...). Go to the official documentation of aggregate (a.k.a. the Scaladoc) and read:

> Aggregate the elements of each partition, and then the results for all the partitions, using given combine functions and a neutral "zero value". This function can return a different result type, U, than the type of this RDD, T. Thus, we need one operation for merging a T into a U and one operation for merging two U's, as in scala.TraversableOnce. Both of these functions are allowed to modify and return their first argument instead of creating a new U to avoid memory allocation.

The signature is as follows (I removed the implicit parameter, which is not particularly interesting):

```scala
aggregate[U](zeroValue: U)(seqOp: (U, T) ⇒ U, combOp: (U, U) ⇒ U): U
```

The Scaladoc also says:

> zeroValue: the initial value a...
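To see where (9,4) comes from, here is a minimal plain-Scala sketch (no Spark required) that simulates what aggregate does: seqOp folds each element of a partition into a running (sum, count) pair, and combOp merges the per-partition pairs. The two-partition split `List(1, 2)` / `List(3, 3)` and the object name `AggregateDemo` are assumptions for illustration; the final result does not depend on the split.

```scala
// Hypothetical simulation of RDD.aggregate on plain Scala collections.
object AggregateDemo {
  // Assumed partitioning of List(1, 2, 3, 3) into two partitions.
  val partitions: List[List[Int]] = List(List(1, 2), List(3, 3))

  val zero: (Int, Int) = (0, 0)

  // seqOp: merge one element (T = Int) into the accumulator (U = (Int, Int)),
  // adding the value to the sum and 1 to the count.
  val seqOp: ((Int, Int), Int) => (Int, Int) =
    (acc, v) => (acc._1 + v, acc._2 + 1)

  // combOp: merge two accumulators by summing both components.
  val combOp: ((Int, Int), (Int, Int)) => (Int, Int) =
    (a, b) => (a._1 + b._1, a._2 + b._2)

  def run(): (Int, Int) = {
    // Within each partition: foldLeft with seqOp starting from zero.
    // Partition 1: (0,0) -> (1,1) -> (3,2); partition 2: (0,0) -> (3,1) -> (6,2).
    val perPartition = partitions.map(p => p.foldLeft(zero)(seqOp))
    // Across partitions: foldLeft with combOp, again starting from zero:
    // (0,0) + (3,2) = (3,2); (3,2) + (6,2) = (9,4).
    perPartition.foldLeft(zero)(combOp)
  }

  def main(args: Array[String]): Unit =
    println(run()) // (9,4): sum 1+2+3+3 = 9, count = 4
}
```

So the first component of the result is the sum of the elements (9) and the second is the element count (4), accumulated per partition and then combined.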