Distribution of Spark Roles in StandAlone Mode

Updated: 2022-03-11 15:59  Source: 乐鱼电竞

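First, in StandAlone mode the Driver Program plays a role comparable to the AppMaster: it manages the whole application and schedules the execution of every Job in it. It runs as a JVM process, executes the program's main function, and must create the SparkContext object. Each Spark Application has exactly one Driver.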

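Second, Executors: each Executor is comparable to a thread pool. It runs as a JVM process containing many threads, and each thread runs one Task. A Task needs 1 CPU core by default, so the number of threads in an Executor can be treated as equal to its number of CPU cores. A Spark Application can have multiple Executors, and their number and resources can be configured.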
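The thread-pool analogy for an Executor can be sketched in plain Python. This is only an illustration of the idea, not Spark code: EXECUTOR_CORES, run_task, and run_on_executor are hypothetical names, and the "work" done on each partition is a simple sum.

```python
from concurrent.futures import ThreadPoolExecutor

EXECUTOR_CORES = 4  # hypothetical core count for one Executor


def run_task(partition):
    """One Task: process a single partition (here, just sum it)."""
    return sum(partition)


def run_on_executor(partitions, cores=EXECUTOR_CORES):
    """Run one Task per partition on a pool with one thread per core."""
    with ThreadPoolExecutor(max_workers=cores) as pool:
        # map() preserves input order, so results line up with partitions
        return list(pool.map(run_task, partitions))


partitions = [[1, 2, 3], [4, 5], [6], [7, 8, 9, 10]]
print(run_on_executor(partitions))  # [6, 9, 6, 34]
```

With 4 threads and 4 partitions every Task can run at once; with more partitions than threads, the remaining Tasks wait for a free thread, which mirrors how Tasks queue for cores on an Executor.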


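From initial submission to final execution, a user program goes through the following phases:
1) When the user program creates a SparkContext, the new SparkContext instance connects to the Cluster Manager. The Cluster Manager allocates compute resources for this submission according to the CPU, memory, and other settings given at submit time, and starts the Executors.

2) The Driver divides the user program into execution stages (Stages). Each Stage consists of a set of Tasks that run the same code, each applied to a different partition of the data to be processed. Once the Stages have been divided and the Tasks created, the Driver sends the Tasks to the Executors.

3) After receiving a Task, an Executor downloads the Task's runtime dependencies, prepares its execution environment, starts executing it, and reports the Task's running status back to the Driver.

4) The Driver handles status updates according to the Task states it receives. Tasks come in two kinds: a Shuffle Map Task reshuffles data and saves the shuffle output to the file system of the node hosting the Executor; a Result Task produces the result data.

5) The Driver keeps scheduling Tasks and sending them to Executors, and stops when every Task has completed successfully or when a Task has still not succeeded after exceeding the limit on execution attempts.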
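The stopping rule in step 5 can be illustrated with a toy loop in plain Python. It is a sketch under stated assumptions, not Spark's scheduler: run_stage and the execute callback are hypothetical, tasks run one after another rather than in parallel, and the limit of 4 attempts is taken from the default of Spark's spark.task.maxFailures setting.

```python
MAX_TASK_ATTEMPTS = 4  # assumed limit, matching the spark.task.maxFailures default


def run_stage(tasks, execute, max_attempts=MAX_TASK_ATTEMPTS):
    """Toy driver loop: resend each task until it succeeds or hits the limit.

    tasks:   list of task ids (one per partition)
    execute: hypothetical callable standing in for an Executor; it receives
             (task_id, attempt) and returns True on success.
    Returns (succeeded, attempts_per_task).
    """
    attempts = {}
    for task_id in tasks:
        for attempt in range(1, max_attempts + 1):
            attempts[task_id] = attempt
            if execute(task_id, attempt):  # Executor reports success
                break
        else:
            # The task failed max_attempts times: stop the whole stage.
            return False, attempts
    return True, attempts


def flaky(task_id, attempt):
    """Task 1 fails on its first two attempts; everything else succeeds."""
    return not (task_id == 1 and attempt < 3)


print(run_stage([0, 1, 2], flaky))  # (True, {0: 1, 1: 3, 2: 1})
```

A task that never succeeds makes run_stage return False after the fourth attempt, which corresponds to the second stopping condition in step 5.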

Hierarchy of a Running Spark Program

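There are monitoring pages on ports 4040, 8080, and 18080. How do they differ?

4040: a port bound temporarily by a running Application for the duration of its run, used to view the status of the current application. If 4040 is already taken, the following ports are tried in turn: 4041, 4042, and so on.

8080: by default the Web UI port of the Master role (process) in StandAlone mode, used to view the status of the current Master (cluster).

18080: by default the port of the History Server. Because port 4040 is released as soon as a program finishes, the History Server is the way to look back later at how a given program ran. It runs continuously as a long-lived service, so the recorded execution of a program can be viewed at any time.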
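The fallback from 4040 to 4041, 4042, and so on can be sketched as a small pure function. The name pick_ui_port and the ports_in_use set are hypothetical; the bound of 16 retries is an assumption based on the default of Spark's spark.port.maxRetries setting, and a real process would test ports by trying to bind them rather than by consulting a set.

```python
def pick_ui_port(start, ports_in_use, max_retries=16):
    """Return the first free port at or after `start`.

    Tries `start` plus at most `max_retries` further ports and
    returns None if every candidate is already taken.
    """
    for offset in range(max_retries + 1):
        candidate = start + offset
        if candidate not in ports_in_use:
            return candidate
    return None


print(pick_ui_port(4040, set()))         # 4040
print(pick_ui_port(4040, {4040}))        # 4041
print(pick_ui_port(4040, {4040, 4041}))  # 4042
```

This is why two shells started on the same node end up with Web UIs on 4040 and 4041 respectively.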

Start a Spark Application, then open its port 4040 to inspect it:
/export/server/spark/bin/spark-shell --master spark://node1.itcast.cn:7077

When the shell is running on node1 (the same holds for the pyspark shell), the Web UI monitoring page is at http://node1:4040

The page shows that one Spark Application contains multiple Jobs, that each Job is made up of multiple Stages, and that each Job executes according to its DAG.

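Three core concepts apply while a Spark Application runs: Job, Stage, and Task.

Job: a parallel computation made up of multiple Tasks. In general, each action operation in Spark (such as save or collect, explained further later) generates one Job.

Stage: the unit a Job is composed of. A Job is split into multiple Stages, which depend on one another and execute in order; each Stage is a set of Tasks, similar to the map and reduce stages.

Task: the unit of work assigned to an Executor and the smallest unit of execution in Spark. In general, a Stage has as many Tasks as there are Partitions (a physical-level concept: a partition can be understood as one of the parts the data is divided into for parallel processing), and each Task processes the data of a single partition only.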

A Spark program is divided into multiple Jobs; each Job is divided into multiple Stages; and each Stage spawns multiple Tasks (threads) that carry out the actual work.
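The Job, Stage, and Task hierarchy can be made concrete with a toy counting model in plain Python. It is a deliberate simplification, not Spark's DAG scheduler: Op and plan are hypothetical names, every action is assumed to trigger one Job, every wide (shuffle) operation is assumed to start a new Stage, the partition count is assumed to stay constant, and each action is treated as ending its own pipeline of operations.

```python
from dataclasses import dataclass


@dataclass
class Op:
    name: str
    kind: str  # "narrow", "wide" (needs a shuffle), or "action"


def plan(ops, num_partitions):
    """Return one dict per Job: {"action", "stages", "tasks"}."""
    jobs = []
    stages = 1  # every Job has at least one Stage
    for op in ops:
        if op.kind == "wide":
            stages += 1  # a shuffle starts a new Stage
        elif op.kind == "action":
            jobs.append({
                "action": op.name,
                "stages": stages,
                "tasks": stages * num_partitions,  # one Task per partition per Stage
            })
            stages = 1  # the next Job starts counting afresh
    return jobs


ops = [
    Op("map", "narrow"),
    Op("reduceByKey", "wide"),
    Op("collect", "action"),
]
print(plan(ops, num_partitions=3))
# [{'action': 'collect', 'stages': 2, 'tasks': 6}]
```

In this model the single collect produces one Job with two Stages (before and after the shuffle) and three Tasks in each Stage, which is the kind of breakdown the 4040 page displays for a real application.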


