ÀÖÓãµç¾º

½ÌÓýÐÐÒµA¹ÉIPOµÚÒ»¹É£¨¹ÉƱ´úÂë 003032£©

È«¹ú×Éѯ/ͶËßÈÈÏߣº400-618-4000

RDDÐж¯Ëã×ÓAPIÏêϸ½²½â

¸üÐÂʱ¼ä:2021Äê04ÔÂ28ÈÕ14ʱ08·Ö À´Ô´:ÀÖÓãµç¾º ä¯ÀÀ´ÎÊý:

ÀÖÓãµç¾º-Ò»ÑùµÄ½ÌÓý£¬²»Ò»ÑùµÄÆ·ÖÊ


Ðж¯Ëã×ÓÖ÷ÒªÊǽ«ÔÚÊý¾Ý¼¯ÉÏÔËÐмÆËãºóµÄÊýÖµ·µ»Øµ½Çý¶¯³ÌÐò£¬´Ó¶ø´¥·¢ÕæÕýµÄ¼ÆËã¡£ÏÂÃæ£¬ÁоÙһЩ³£ÓõÄÐж¯Ëã×ÓAPI£¬Èç±í1Ëùʾ¡£

±í1 ³£ÓõÄÐж¯Ëã×ÓAPI

Ðж¯Ëã×Ó Ïà¹ØËµÃ÷
count() ·µ»ØÊý¾Ý¼¯ÖеÄÔªËØ¸öÊý                                                                                    
first() ·µ»ØÊý×éµÄµÚÒ»¸öÔªËØ
take(n) ÒÔÊý×éµÄÐÎʽ·µ»ØÊý×鼯ÖеÄǰn¸öÔªËØ
reduce(func) ͨ¹ýº¯Êýfunc£¨ÊäÈëÁ½¸ö²ÎÊý²¢·µ»ØÒ»¸öÖµ£©¾ÛºÏÊý¾Ý¼¯ÖеÄÔªËØ
collect() ÒÔÊý×éµÄÐÎʽ·µ»ØÊý¾Ý¼¯ÖеÄËùÓÐÔªËØ
foreach(func) ½«Êý¾Ý¼¯ÖеÄÿ¸öÔªËØ´«µÝµ½º¯ÊýfuncÖÐÔËÐÐ

ÏÂÃæ£¬½áºÏ¾ßÌåµÄʾÀý¶ÔÕâЩÐж¯Ëã×ÓAPI½øÐÐÏêϸ½²½â¡£

  • count()

count()Ö÷ÒªÓÃÓÚ·µ»ØÊý¾Ý¼¯ÖеÄÔªËØ¸öÊý¡£¼ÙÉ裬ÏÖÓÐÒ»¸öarrRdd£¬Èç¹ûҪͳ¼ÆarrRddÔªËØµÄ¸öÊý£¬Ê¾Àý´úÂëÈçÏ£º

  scala> val arrRdd=sc.parallelize(Array(1,2,3,4,5))
  arrRdd: org.apache.spark.rdd.RDD[Int]=
ParallelcollectionRDD[0] at parallelize at <console>:24
  scala> arrRdd.count()
  res0: Long = 5

ÉÏÊö´úÂëÖУ¬µÚ1ÐдúÂë´´½¨ÁËÒ»¸öRDD¶ÔÏ󣬵±arrRddµ÷ÓÃcount()²Ù×÷ºó£¬·µ»ØµÄ½á¹ûÊÇ5£¬ËµÃ÷³É¹¦»ñÈ¡µ½ÁËRDDÊý¾Ý¼¯µÄÔªËØ¡£ÖµµÃÒ»ÌáµÄÊÇ£¬¿ÉÒÔ½«µÚÒ»ÐдúÂë·Ö½â³ÉÏÂÃæÁ½ÐдúÂ룬¾ßÌåÈçÏ£º

val arr = Array(1£¬2£¬3£¬4£¬5)
val arrRdd = sc.parallelize(arr)

ÉÏÊö´úÂëÖУ¬µÚ1ÐдúÂë´´½¨ÁËÒ»¸öRDD¶ÔÏ󣬵±arrRddµ÷ÓÃcount()²Ù×÷ºó£¬·µ»ØµÄ½á¹ûÊÇ5£¬ËµÃ÷³É¹¦»ñÈ¡µ½ÁËRDDÊý¾Ý¼¯µÄÔªËØ¡£ÖµµÃÒ»ÌáµÄÊÇ£¬¿ÉÒÔ½«µÚÒ»ÐдúÂë·Ö½â³ÉÏÂÃæÁ½ÐдúÂ룬¾ßÌåÈçÏ£º

val arr = Array(1£¬2£¬3£¬4£¬5)
val arrRdd = sc.parallelize(arr)
  • first()

first()Ö÷ÒªÓÃÓÚ·µ»ØÊý×éµÄµÚÒ»¸öÔªËØ¡£ÏÖÓÐÒ»¸öarrRdd£¬Èç¹ûÒª»ñÈ¡arrRddÖеÚÒ»¸öÔªËØ£¬Ê¾Àý´úÂëÈçÏ£º

  scala> val arrRdd=sc.parallelize(Array(1,2,3,4,5))
  arrRdd: org.apache.spark.rdd.RDD[Int]=
ParallelcollectionRDD[0] at parallelize at <console>:24
  scala> arrRdd.first()
  res1: Int = 1

´ÓÉÏÊö½á¹û¿ÉÒÔ¿´³ö£¬µ±Ö´ÐÐarrRdd.first()²Ù×÷ºó·µ»ØµÄ½á¹ûÊÇ1£¬ËµÃ÷³É¹¦»ñÈ¡µ½Á˵Ú1¸öÔªËØ¡£

  • take(n)

take()Ö÷ÒªÓÃÓÚÒÔÊý×éµÄÐÎʽ·µ»ØÊý×鼯ÖеÄǰn¸öÔªËØ¡£ÏÖÓÐÒ»¸öarrRdd£¬Èç¹ûÒª»ñÈ¡arrRddÖеÄǰÈý¸öÔªËØ£¬Ê¾Àý´úÂëÈçÏ£º

 scala> val arrRdd =sc.parallelize(Array(1,2,3,4,5))
  arrRdd: org.apache.spark.rdd.RDD[Int]=
ParallelcollectionRDD[0] at parallelize  at <console>:24
  scala> arrRdd.take(3)
  res2: Array[Int]=Array(1,2,3)

´ÓÉÏÊö´úÂë¿ÉÒÔ¿´³ö£¬Ö´ÐÐarrRdd.take(3)²Ù×÷ºó·µ»ØµÄ½á¹ûÊÇArray(1£¬2£¬3)£¬ËµÃ÷³É¹¦»ñÈ¡µ½ÁËRDDÊý¾Ý¼¯µÄǰ3¸öÔªËØ¡£

  • reduce(func)

reduce()Ö÷ÒªÓÃÓÚͨ¹ýº¯Êýfunc£¨ÊäÈëÁ½¸ö²ÎÊý²¢·µ»ØÒ»¸öÖµ£©¾ÛºÏÊý¾Ý¼¯ÖеÄÔªËØ¡£ÏÖÓÐÒ»¸öarrRdd£¬Èç¹ûÒª¶ÔarrRddÖеÄÔªËØ½øÐоۺÏ£¬Ê¾Àý´úÂëÈçÏ£º

 scala> val arrRdd =sc.parallelize(Array(1,2,3,4,5))
  arrRdd: org.apache.spark.rdd.RDD[Int]=
ParallelcollectionRDD[0] at parallelize at <console>:24
  scala> arrRdd.reduce((a,b)=>a+b)
  res3: Int = 15

ÔÚÉÏÊö´úÂëÖУ¬Ö´ÐÐarrRdd.reduce((a£¬b)=>a+b)²Ù×÷ºó·µ»ØµÄ½á¹ûÊÇ15£¬ËµÃ÷³É¹¦µÄ½«RDDÊý¾Ý¼¯ÖеÄËùÓÐÔªËØ½øÐÐÇóºÍ£¬½á¹ûΪ15¡£

  • collect()

collect()Ö÷ÒªÓÃÓÚÒÔÊý×éµÄÐÎʽ·µ»ØÊý¾Ý¼¯ÖеÄËùÓÐÔªËØ¡£ÏÖÓÐÒ»¸ördd£¬Èç¹ûÏ£ÍûrddÖеÄÔªËØÒÔÊý×éµÄÐÎʽÊä³ö£¬Ê¾Àý´úÂëÈçÏ£º

  scala> val arrRdd =sc.parallelize(Array(1,2,3,4,5))
  arrRdd: org.apache.spark.rdd.RDD[Int]=
ParallelcollectionRDD[0] at parallelize at <console>:24
  scala> arrRdd.collect()
  res4: Array[Int] = Array(1,2,3,4,5)

ÔÚÉÏÊö´úÂëÖУ¬Ö´ÐÐarrRdd.collect()²Ù×÷ºó·µ»ØµÄ½á¹ûÊÇArray(1£¬2£¬3£¬4£¬5)£¬ËµÃ÷³É¹¦µÄ½«RDDÊý¾Ý¼¯ÖеÄÔªËØÒÔÊý×éµÄÐÎʽÊä³ö¡£

  • foreach(func)

foreach()Ö÷ÒªÓÃÓÚ½«Êý¾Ý¼¯ÖеÄÿ¸öÔªËØ´«µÝµ½º¯ÊýfuncÖÐÔËÐС£ÏÖÓÐÒ»¸öarrRdd£¬Èç¹ûÏ£Íû±éÀúÊä³öarrRddÖеÄÔªËØ£¬Ê¾Àý´úÂëÈçÏ£º

 scala> val arrRdd =sc.parallelize(Array(1,2,3,4,5))
  arrRdd: org.apache.spark.rdd.RDD[Int]=
ParallelcollectionRDD[0] at parallelize at <console>:24
  scala> arrRdd.foreach(x => println(x))  1
  2
  3
  4
  5

ÔÚÉÏÊö´úÂëÖУ¬foreach(x => println(x))µÄº¬ÒåÊÇÒÀ´Î±éÀúarrRddÖеÄÿһ¸öÔªËØ£¬°Ñµ±Ç°±éÀúµÄÔªËØ¸³Öµ¸ø±äÁ¿x£¬²¢ÇÒͨ¹ýprintln(x)´òÓ¡³öxµÄÖµ¡£Ö´ÐÐarrRdd.foreach()²Ù×÷ºó£¬arrRddÖеÄÔªËØ±»ÒÀ´ÎÊä³öÁË£¨¼´RDDÊý¾Ý¼¯ÖÐËùÓеÄÔªËØ±»±éÀúÊä³ö£©¡£ÕâÀïµÄarrRdd.foreach(x => println(x))¿ÉÒÔ¼òдΪarrRdd.foreach(println)¡£



²ÂÄãϲ»¶£º

ÔõÑù¸øRDD·ÖÇø£¿¸÷ÖÖģʽϵķÖÇøÊýÄ¿ÉèÖÃ

RDDÊÇÈçºÎ²Ù×÷Êý¾Ýת»»µÄ£¿RDDת»»Ëã×ÓAPIʾÀý

Á½ÖÖRDDµÄÒÀÀµ¹ØÏµ½éÉÜ    

ÀÖÓãµç¾ºPython+´óÊý¾Ý¿ª·¢

0 ·ÖÏíµ½£º
ºÍÎÒÃÇÔÚÏß½»Ì¸£¡
¡¾ÍøÕ¾µØÍ¼¡¿¡¾sitemap¡¿