How Is the InputFormat Interface Defined in Code?

Updated: November 3, 2020, 17:38 Source: 乐鱼播客

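Hadoop MapReduce has six programming components: InputFormat, Mapper, Reducer, Partitioner, OutputFormat, and Combiner. The Combiner performs a preliminary merge of duplicate data in the Map-phase output, so it is optional; the other five are required. This lesson gives a brief introduction to the code behind these five required MapReduce components, starting with InputFormat.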

InputFormat describes the format of the input data. It provides two functions:

· Data splitting: following some strategy, it cuts the input data into a number of splits, which determines the number of MapTasks and the split assigned to each one.

· Supplying input to the Mapper: given a split, it parses the split into individual key/value pairs.
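The two functions above can be sketched in plain Java. This is only a toy model under stated assumptions: the class ToyInputFormat, its nested Split type, and the readRecords helper are hypothetical names invented for illustration and are not part of the Hadoop API. The input is an in-memory list of lines rather than files on HDFS.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical toy model: none of these names belong to the Hadoop API.
final class ToyInputFormat {

    /** A logical split: the half-open range [start, end) of line indexes. */
    static final class Split {
        final int start;
        final int end;

        Split(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Function 1: cut the input into splits of at most linesPerSplit lines. */
    static List<Split> getSplits(List<String> lines, int linesPerSplit) {
        List<Split> splits = new ArrayList<>();
        for (int start = 0; start < lines.size(); start += linesPerSplit) {
            int end = Math.min(start + linesPerSplit, lines.size());
            splits.add(new Split(start, end));
        }
        return splits;
    }

    /** Function 2: parse one split into key/value pairs (line index, line text). */
    static List<Map.Entry<Integer, String>> readRecords(List<String> lines, Split split) {
        List<Map.Entry<Integer, String>> records = new ArrayList<>();
        for (int i = split.start; i < split.end; i++) {
            records.add(new AbstractMap.SimpleEntry<>(i, lines.get(i)));
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a b", "c d", "e f", "g h", "i j");
        List<Split> splits = getSplits(input, 2);

        // One map task would be started per split.
        System.out.println("map tasks: " + splits.size()); // prints "map tasks: 3"
        for (Split split : splits) {
            for (Map.Entry<Integer, String> record : readRecords(input, split)) {
                System.out.println(record.getKey() + " -> " + record.getValue());
            }
        }
    }
}
```

Five input lines with at most two lines per split give three splits, so this toy job would run three map tasks, each receiving its own stream of key/value pairs.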

Hadoop provides its own InputFormat interface (declared as an abstract class in the code), defined as follows:

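package org.apache.hadoop.mapreduce;

import java.io.IOException;
import java.util.List;

public abstract class InputFormat<K, V> {

    // Logically split the job's input; each InputSplit goes to one map task.
    public abstract List<InputSplit> getSplits(JobContext context)
            throws IOException, InterruptedException;

    // Create the reader that turns one split into key/value records.
    public abstract RecordReader<K, V> createRecordReader(InputSplit split,
            TaskAttemptContext context)
            throws IOException, InterruptedException;
}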

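As the code shows, InputFormat defines two methods, getSplits() and createRecordReader(). getSplits() cuts the input files into multiple splits, and createRecordReader() creates the RecordReader object that reads data from a split. The rest of this section focuses on getSplits().

getSplits() implements the logical splitting mechanism. The split size, splitSize, is determined by three values: minSize, maxSize, and blockSize. In FileInputFormat, the file-based implementation that ships with Hadoop, they are combined as splitSize = max(minSize, min(maxSize, blockSize)).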
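     minSize: the lower bound on splitSize, set by mapred.min.split.size (named mapreduce.input.fileinputformat.split.minsize in Hadoop 2.x and later) and configurable in mapred-site.xml. The shipped default is 0, and FileInputFormat enforces an effective minimum of 1 byte.

     maxSize: the upper bound on splitSize, set by mapreduce.input.fileinputformat.split.maxsize (mapred.max.split.size in older releases) and configurable in mapred-site.xml. When unset it defaults to Long.MAX_VALUE, so it imposes no limit. The similarly named mapreduce.jobtracker.split.metainfo.maxsize, which defaults to 10000000 bytes (about 10 MB), is a different setting: it caps the size of the split meta-information file.

     blockSize: the size of an HDFS storage block, set by dfs.block.size (named dfs.blocksize in Hadoop 2.x and later) and configurable in hdfs-site.xml. The default is 128 MB.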
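
One way to see how the three values interact is to clamp blockSize between minSize and maxSize and then count the resulting splits. The sketch below is modeled on the logic of FileInputFormat but is not its source: the class name SplitSizeSketch and the countSplits helper are hypothetical, the 1.1 tolerance is assumed to mirror FileInputFormat's SPLIT_SLOP constant, and zero-length files are simply ignored.

```java
// Hypothetical sketch of the split-size rule; not Hadoop's actual source code.
final class SplitSizeSketch {

    static final long MB = 1024L * 1024L;

    // Assumed to mirror FileInputFormat's SPLIT_SLOP: the final split may be
    // up to 10% larger than splitSize instead of spawning a tiny extra split.
    static final double SPLIT_SLOP = 1.1;

    /** Clamp blockSize between minSize and maxSize. */
    static long computeSplitSize(long minSize, long maxSize, long blockSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    /** Count the logical splits for a file of fileLength bytes. */
    static int countSplits(long fileLength, long splitSize) {
        int splits = 0;
        long remaining = fileLength;
        // Carve off full-size splits while clearly more than one split remains.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        // Whatever is left (at most 1.1 * splitSize) becomes the last split.
        if (remaining > 0) {
            splits++;
        }
        return splits;
    }

    public static void main(String[] args) {
        // Default bounds: the split size equals the 128 MB block size.
        long splitSize = computeSplitSize(1L, Long.MAX_VALUE, 128 * MB);
        System.out.println(splitSize / MB);                    // prints 128

        System.out.println(countSplits(300 * MB, splitSize));  // prints 3
        System.out.println(countSplits(130 * MB, splitSize));  // prints 1
    }
}
```

With the default bounds the split size equals the block size. Setting maxSize below blockSize produces more, smaller splits, while raising minSize above blockSize produces fewer, larger ones.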
