Two Ways to Connect Spark Streaming to Kafka

Updated: December 16, 2021, 18:18. Source: 乐鱼电竞

Spark Streaming can ingest data from many kinds of sources, Kafka among them. To read from a data source, a connection between the two must be established first. This section introduces two ways of connecting to Kafka.

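1. Receiver-based Approach:

(1) KafkaUtils.createStream is the receiver-based way of consuming Kafka data. It is obsolete and no longer used in production.

(2) A Receiver runs as a long-lived Task on an Executor, waiting for data. A single Receiver has low throughput, so you have to start several, merge their streams manually (union), and only then process the data, which is cumbersome.

(3) If the machine hosting a Receiver goes down, data may be lost, so you need to enable the WAL (write-ahead log) to keep the data safe, which lowers throughput further.

(4) The Receiver approach connects to Kafka through ZooKeeper and uses Kafka's high-level consumer API. Offsets are stored in ZooKeeper and maintained by the Receiver.

(5) To avoid losing data, Spark also saves a copy of the offsets in its Checkpoint while consuming, so the two copies can become inconsistent.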
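
The inconsistency described in (4) and (5) can be illustrated with a toy model. This is not Spark or Kafka code: `log`, `wal`, `zk_offset`, and `receive` are hypothetical stand-ins for a Kafka partition, the Receiver's write-ahead log, the offset tracked in ZooKeeper, and the Receiver itself. Under those assumptions, the sketch shows how a crash between writing to the WAL and updating the offset can lead to records being processed twice.

```python
# Toy model only: plain Python objects stand in for Kafka, the WAL and ZooKeeper.
log = [f"msg-{i}" for i in range(6)]   # one Kafka partition
wal = []                               # records the receiver has stored reliably
zk_offset = 0                          # next offset to read, as tracked in ZooKeeper


def receive(count, crash_before_commit=False):
    """Receive `count` records, write them to the WAL, then commit the offset."""
    global zk_offset
    batch = log[zk_offset:zk_offset + count]
    wal.extend(batch)                  # step 1: data is safe in the WAL
    if crash_before_commit:
        return                         # receiver dies before step 2
    zk_offset += len(batch)            # step 2: offset updated in ZooKeeper


receive(2)                             # normal batch: msg-0, msg-1
receive(2, crash_before_commit=True)   # msg-2, msg-3 logged, offset not advanced
receive(4)                             # restarted receiver resumes from offset 2

processed = list(wal)                  # everything recovered from the WAL is processed
duplicates = sorted({m for m in processed if processed.count(m) > 1})
print(duplicates)                      # ['msg-2', 'msg-3']
```

In this model no record is lost, but `msg-2` and `msg-3` appear twice, which is the at-least-once behavior the Receiver approach is known for.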

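2. Direct Approach (No Receivers):

(1) KafkaUtils.createDirectStream is the direct approach: each job in each Streaming batch calls the Simple Consumer API directly to fetch the data for the corresponding topic. It is the most widely used approach and the one most often asked about in interviews.

(2) The Direct approach connects straight to the Kafka partitions and reads from each partition directly, which greatly improves parallelism.

(3) The Direct approach uses Kafka's low-level API. Offsets are stored and maintained on the Spark side, by default in the checkpoint, which removes the inconsistency with ZooKeeper.

(4) You can also maintain the offsets manually, for example by storing them in MySQL or Redis.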
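
Points (3) and (4) can be sketched with a toy model of manual offset management. This is not the Spark API: `partitions`, `run_batch`, and the table names are hypothetical, and the standard-library `sqlite3` module stands in for MySQL. The sketch assumes each batch reads one offset range per partition and saves its results together with the new offsets in a single transaction, so a failed batch leaves both untouched.

```python
import sqlite3

# Toy stand-in for a topic with two Kafka partitions (not real Kafka).
partitions = {0: ["a", "b", "a", "c"], 1: ["b", "b", "a"]}

db = sqlite3.connect(":memory:")  # stands in for MySQL
db.execute("CREATE TABLE offsets (part INTEGER PRIMARY KEY, next_offset INTEGER)")
db.execute("CREATE TABLE counts (word TEXT PRIMARY KEY, n INTEGER)")
db.executemany("INSERT INTO offsets VALUES (?, 0)", [(p,) for p in partitions])
db.commit()


def run_batch(max_per_partition, fail=False):
    """Process one batch: one offset range per partition, committed atomically."""
    stored = dict(db.execute("SELECT part, next_offset FROM offsets"))
    try:
        with db:  # one transaction: commits on success, rolls back on error
            for part, start in stored.items():
                end = min(start + max_per_partition, len(partitions[part]))
                for word in partitions[part][start:end]:  # read range [start, end)
                    db.execute("INSERT OR IGNORE INTO counts VALUES (?, 0)", (word,))
                    db.execute("UPDATE counts SET n = n + 1 WHERE word = ?", (word,))
                db.execute(
                    "UPDATE offsets SET next_offset = ? WHERE part = ?", (end, part)
                )
            if fail:
                raise RuntimeError("simulated crash before commit")
    except RuntimeError:
        pass  # the rollback has already restored both counts and offsets


run_batch(2)             # offsets advance to 2 on both partitions
run_batch(2, fail=True)  # rolled back: neither counts nor offsets change
run_batch(2)             # the same ranges are read again and committed once

counts = dict(db.execute("SELECT word, n FROM counts"))
offsets = dict(db.execute("SELECT part, next_offset FROM offsets"))
print(counts, offsets)
```

Because the results and the offsets live in the same store and change in the same transaction, every word ends up counted exactly once even though the second batch failed. Storing offsets in Redis would need a different mechanism to get the same guarantee.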

Two APIs

Spark Streaming has two sets of APIs for Kafka integration, because Kafka itself has two Consumer APIs. Documentation:

http://spark.apache.org/docs/2.4.5/streaming-kafka-integration.html

http://spark.apache.org/docs/latest/streaming-kafka-integration.html

Kafka 0.8.x: long obsolete

Built on the old Kafka API: Old Kafka Consumer API

Supports both Receiver (obsolete) and Direct modes

Kafka 0.10.x: used in current development

Built on the new Kafka API: New Kafka Consumer API

Supports Direct mode only

