Spark operators: 1. Basic RDD transformations 2. Key-value RDD transformations 3. Action operations

Summary:

RDD (Resilient Distributed Dataset) is a special kind of collection: it can be created from multiple data sources, has a fault-tolerance mechanism, can be cached, and supports parallel operations. An RDD represents a dataset that is divided into partitions.
RDDs support two kinds of operators:

        Transformation: transformations are lazily evaluated. When one RDD is transformed into another RDD, the computation is not performed immediately; Spark only records the logical operation on the dataset.
        Action: triggers the running of a Spark job, which is what actually causes the recorded transformations to be computed.
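
The difference between the two kinds of operators can be sketched with a toy model in plain Python. This is not the Spark API: the `LazyDataset` class and its `map`, `filter`, and `collect` methods are hypothetical stand-ins, written only to show that transformations record steps while an action replays them.

```python
# Toy model of lazy evaluation in plain Python; this is NOT the Spark API.
class LazyDataset:
    """Hypothetical stand-in for an RDD: records steps, runs them on an action."""

    def __init__(self, source, steps=()):
        self._source = source   # the underlying data (any iterable)
        self._steps = steps     # tuple of recorded transformations

    # Transformations: return a new dataset and only record the step.
    def map(self, fn):
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        return LazyDataset(self._source, self._steps + (("filter", pred),))

    # Action: replays every recorded step and returns a concrete result.
    def collect(self):
        data = iter(self._source)
        for kind, fn in self._steps:
            data = map(fn, data) if kind == "map" else filter(fn, data)
        return list(data)


calls = []  # records when the mapping function actually runs


def double(x):
    calls.append(x)
    return x * 2


ds = LazyDataset([1, 2, 3, 4]).map(double).filter(lambda x: x > 4)
calls_before_action = len(calls)  # still 0: nothing has been computed yet
result = ds.collect()             # the action triggers the computation
```

In this sketch `double` is not called while the chain of transformations is being built; it runs only once `collect()` is invoked, which mirrors the way an action triggers the deferred transformation work.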
 
This series mainly explains the operators commonly used in Spark:
         1. Basic RDD transformations
         2. Key-value RDD transformations
         3. Action operations


Link: https://www.cnblogs.com/MOBIN/p/5384543.html#9
