Spark Core Programming: RDD Actions

1. Actions on collections and scalars

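  • first: returns the first element of the RDD, without sorting
  • count: returns the number of elements in the RDD
  • reduce: combines the elements of the RDD pairwise with the given binary function (which should be commutative and associative) and returns the single result
  • collect: returns all elements of the RDD to the driver as an array
  • take(num): returns the first num elements of the RDD (indices 0 to num-1), without sorting
  • top(num): returns the first num elements in descending order by default, or according to a supplied ordering
  • takeOrdered(num): like top, but sorts in ascending order before returning the first num elements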
val rdd = sc.makeRDD(List(("E",5),("B",2),("A",1),("D",4),("C",3),("H",7)), 2)

scala> rdd.first
res0: (String, Int) = (E,5)

scala> rdd.take(3)
res1: Array[(String, Int)] = Array((E,5), (B,2), (A,1))

scala> rdd.top(3)
res2: Array[(String, Int)] = Array((H,7), (E,5), (D,4))

scala> rdd.takeOrdered(3)
res3: Array[(String, Int)] = Array((A,1), (B,2), (C,3))

scala> rdd.count
res4: Long = 6

scala> rdd.collect
res5: Array[(String, Int)] = Array((E,5), (B,2), (A,1), (D,4), (C,3), (H,7))

// String concatenation is not commutative, so the order of the letters depends on
// the order in which the partition results are merged and can differ between runs;
// the sum is always 22.
scala> rdd.reduce((x,y) => (x._1 + y._1, x._2 + y._2))
res6: (String, Int) = (DCHEBA,22)
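
The "supplied ordering" mentioned for top and takeOrdered can be sketched as follows. This is a minimal illustration, not part of the original post: the name byValue is hypothetical, and the snippet assumes it is pasted into spark-shell (where SparkContext.getOrCreate returns the existing context) or run in local mode.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Reuses the running context (e.g. in spark-shell); otherwise starts a local one.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[2]").setAppName("top-takeOrdered-sketch"))

val rdd = sc.makeRDD(List(("E",5),("B",2),("A",1),("D",4),("C",3),("H",7)), 2)

// Hypothetical ordering: compare pairs by their Int value only.
val byValue: Ordering[(String, Int)] = Ordering.by[(String, Int), Int](_._2)

// top returns the largest elements under the ordering, largest first.
val largest = rdd.top(2)(byValue)

// takeOrdered returns the smallest elements under the ordering, smallest first.
val smallest = rdd.takeOrdered(2)(byValue)

// Reversing the ordering makes takeOrdered behave like top.
val largestViaReverse = rdd.takeOrdered(2)(byValue.reverse)
```

Because the values in this sample are distinct, the three results do not depend on how the data is partitioned.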
  • lookup(key: K): Seq[V] returns all values V in the RDD for the given key K
val rdd = sc.makeRDD(Array(("A",0),("A",2),("B",3)))
rdd.lookup("A")

res0: Seq[Int] = WrappedArray(0, 2)
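  • countByKey(): Map[K, Long] counts, for an RDD[K, V], the number of elements for each key K
  • countByValue()(implicit ord: Ordering[T] = null): Map[T, Long] counts the occurrences of each distinct element of the RDD; for an RDD[(K, V)] the element is the whole (K, V) pair, not only V
  • foreach(f: T => Unit): Unit applies f to every element
  • foreachPartition(f: Iterator[T] => Unit): Unit applies f once to each partition, passing an iterator over its elements
  • sortBy[K](f: (T) => K, ascending: Boolean = true, numPartitions: Int = this.partitions.length)(implicit ord: Ordering[K], ctag: ClassTag[K]): RDD[T] sorts the elements of the RDD by the key computed by f. Note that it returns a new RDD, so the sorted data is only brought back by a following action such as collect
  • sortByKey(ascending: Boolean = true, numPartitions: Int = self.partitions.length): RDD[(K, V)] sorts an RDD[K, V] by key; like sortBy, it returns a new RDD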
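
The counting and iteration operations in the list above can be sketched as follows. This is an illustrative example under stated assumptions, not code from the original post: the sample data and the names perKey, perElement, sum and partitionsSeen are made up here. Because foreach and foreachPartition run on the executors, the sketch uses accumulators rather than println to observe their effect on the driver.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Reuses the running context (e.g. in spark-shell); otherwise starts a local one.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[2]").setAppName("count-foreach-sketch"))

val pairs = sc.makeRDD(Array(("A",0),("A",2),("B",3)), 2)

// countByKey: number of pairs per key, returned to the driver as a Map.
val perKey = pairs.countByKey()

// countByValue: occurrences of each distinct element; works on any RDD[T].
val words = sc.makeRDD(List("a", "b", "a", "c", "a"), 2)
val perElement = words.countByValue()

// foreach: the function is called once per element, on the executors.
val sum = sc.longAccumulator("sum")
pairs.foreach { case (_, v) => sum.add(v.toLong) }

// foreachPartition: the function is called once per partition.
val partitionsSeen = sc.longAccumulator("partitionsSeen")
pairs.foreachPartition { _ => partitionsSeen.add(1L) }
```

foreachPartition is usually the better fit when per-call setup is expensive (for example opening a connection), since the setup then happens once per partition instead of once per element.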
val rdd1 = sc.makeRDD(List(5,1,6,9,2))
val rdd2 = sc.makeRDD(List("hadoop","spark","hive","endeca","storm"))
val rdd3 = rdd1.zip(rdd2)

rdd3.sortByKey().collect
Array((1,spark), (2,storm), (5,hadoop), (6,hive), (9,endeca))

rdd3.sortByKey(false).collect
Array((9,endeca), (6,hive), (5,hadoop), (2,storm), (1,spark))
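
For comparison with sortByKey, sortBy can sort on any derived key. The sketch below is a minimal example, not taken from the original post; it builds the same pairs directly instead of using zip, and the names byName and byKeyDesc are made up for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Reuses the running context (e.g. in spark-shell); otherwise starts a local one.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[2]").setAppName("sortBy-sketch"))

val rdd3 = sc.makeRDD(
  List((5,"hadoop"), (1,"spark"), (6,"hive"), (9,"endeca"), (2,"storm")), 2)

// Sort by the String value instead of the Int key.
// sortBy returns a new RDD; collect is the action that brings the data back.
val byName = rdd3.sortBy(_._2).collect()

// Sort by the Int key in descending order, equivalent to sortByKey(false).
val byKeyDesc = rdd3.sortBy(_._1, ascending = false).collect()
```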

2. Storage actions

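  • saveAsTextFile(path: String): Unit saves the RDD as text files
  • saveAsTextFile(path: String, codec: Class[_ <: CompressionCodec]): Unit saves the RDD as text files, compressed with the given codec
  • saveAsObjectFile(path: String): Unit serializes the elements of the RDD as objects and writes them to files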
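
The three storage actions above can be sketched as a write followed by a read-back. This is an illustrative example under stated assumptions: it targets local mode, writes under a temporary directory created just for the sketch, and picks GzipCodec as one possible codec. On a cluster the paths would normally point at a shared file system such as HDFS.

```scala
import java.nio.file.Files
import org.apache.hadoop.io.compress.GzipCodec
import org.apache.spark.{SparkConf, SparkContext}

// Reuses the running context (e.g. in spark-shell); otherwise starts a local one.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[2]").setAppName("save-sketch"))

// Temporary base directory as a file: URI. The output directories below must not exist yet.
val base = Files.createTempDirectory("rdd-save-sketch").toUri.toString.stripSuffix("/")

val rdd = sc.makeRDD(List(("A",1), ("B",2), ("C",3)), 2)

// Each call writes a directory with one part file per partition.
rdd.saveAsTextFile(s"$base/text")                        // one line per element, using toString
rdd.saveAsTextFile(s"$base/text-gz", classOf[GzipCodec]) // same content, gzip-compressed
rdd.saveAsObjectFile(s"$base/objects")                   // serialized objects

// Read the data back; sorting makes the comparison independent of partition order.
val textBack = sc.textFile(s"$base/text").collect().sorted
val gzBack = sc.textFile(s"$base/text-gz").collect().sorted
val objectsBack = sc.objectFile[(String, Int)](s"$base/objects").collect().sorted
```

Text output keeps only the string form of each element, so reading it back yields lines such as (A,1); the object file restores the original (String, Int) pairs.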

Reposted from blog.csdn.net/Anbang713/article/details/81587963