Three Versions of WordCount

Java version of WordCount

1. Read the file's data, line by line.

String line = reader.readLine();   // reader is a BufferedReader over the file

2. Split each line on a delimiter (e.g. a space):

String[] words = line.split(" ");

3. Count the words in the array:

Map<String, Integer> count = new HashMap<String, Integer>();

for (String word : words) {

if (count.containsKey(word)) {

Integer val = count.get(word);

count.put(word, val + 1);

} else {

count.put(word, 1);

}  }
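The three steps above can be combined into one runnable sketch. The class name and the sample input are illustrative; the text is processed line by line to mirror the read/split/count steps:

```java
import java.util.HashMap;
import java.util.Map;

public class WordCount {
    // Count word occurrences in a block of text, processing it line by line.
    public static Map<String, Integer> countText(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : text.split("\n")) {           // step 1: one line at a time
            for (String word : line.split(" ")) {        // step 2: split on spaces
                if (word.isEmpty()) continue;
                if (counts.containsKey(word)) {          // step 3: tally each word
                    counts.put(word, counts.get(word) + 1);
                } else {
                    counts.put(word, 1);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countText("a b a\nb a");
        System.out.println(counts.get("a") + "," + counts.get("b"));  // prints "3,2"
    }
}
```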

-------------------------------------------------------------------------------------------------------------------

Hadoop version of WordCount: Map

1. TextInputFormat ---> readLine

2. map(key, value) {

String line = value.toString();

String[] words = line.split(" ");

for (String word : words) {

context.write(new Text(word), new IntWritable(1));

}}

Hadoop shuffle

reduce(key, values) {

int num = 0;
for (IntWritable val : values) { num += val.get(); }   // sum the 1s emitted for this word

context.write(key, new IntWritable(num));

}
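The map → shuffle → reduce flow above can be simulated in plain Java without the actual Hadoop API. The class and method names below are illustrative, not Hadoop's; map emits (word, 1) pairs, shuffle groups them by key, and reduce sums each group:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.AbstractMap.SimpleEntry;
import java.util.Map.Entry;

public class MapReduceSim {
    // Map phase: emit a (word, 1) pair for every word in every line.
    static List<Entry<String, Integer>> map(List<String> lines) {
        List<Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines)
            for (String word : line.split(" "))
                pairs.add(new SimpleEntry<>(word, 1));
        return pairs;
    }

    // Shuffle phase: group all emitted values by key.
    static Map<String, List<Integer>> shuffle(List<Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new HashMap<>();
        for (Entry<String, Integer> p : pairs)
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        return grouped;
    }

    // Reduce phase: sum each key's list of 1s.
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> result = new HashMap<>();
        for (Entry<String, List<Integer>> e : grouped.entrySet()) {
            int num = 0;
            for (int v : e.getValue()) num += v;
            result.put(e.getKey(), num);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("a b a", "b a");
        Map<String, Integer> counts = reduce(shuffle(map(lines)));
        System.out.println(counts.get("a"));  // prints 3
    }
}
```

In real Hadoop the shuffle step is performed by the framework between the Mapper and Reducer tasks; only map and reduce are user code.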

-------------------------------------------------------------------------------------------------------------------

Storm version of WordCount

Spout

String line = reader.readLine();

Output: a Tuple carrying the line

SplitBolt

Input: line

String[] words = line.split(" ");

for (String word : words) {

Output:

collector.emit(new Values(word));

}

WordCountBolt

Input: word

Map<String, Integer> counts = new HashMap<String, Integer>();   // a field of the bolt, so counts persist across tuples

if (counts.containsKey(word)) {

Integer val = counts.get(word);

counts.put(word, val + 1);

} else {

counts.put(word, 1);

}
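The SplitBolt → WordCountBolt pipeline above can be sketched in plain Java, with the Storm OutputCollector replaced by a simple callback. The class names mirror the notes but are illustrative, not the real Storm API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

public class StormSim {
    // SplitBolt: receives a line, emits one word per output tuple.
    static class SplitBolt {
        void execute(String line, Consumer<String> collector) {
            for (String word : line.split(" "))
                collector.accept(word);   // stands in for collector.emit(new Values(word))
        }
    }

    // WordCountBolt: keeps running totals, so the map must be a field,
    // not recreated on every tuple.
    static class WordCountBolt {
        private final Map<String, Integer> counts = new HashMap<>();

        void execute(String word) {
            if (counts.containsKey(word)) {
                counts.put(word, counts.get(word) + 1);
            } else {
                counts.put(word, 1);
            }
        }

        int count(String word) {
            return counts.getOrDefault(word, 0);
        }
    }

    public static void main(String[] args) {
        SplitBolt split = new SplitBolt();
        WordCountBolt counter = new WordCountBolt();
        // The spout would emit lines one tuple at a time; here we feed two directly.
        split.execute("a b a", counter::execute);
        split.execute("b a", counter::execute);
        System.out.println(counter.count("a"));  // prints 3
    }
}
```

Unlike the batch Java and Hadoop versions, the Storm bolt updates its counts continuously as tuples stream in, which is why the map lives in the bolt's state rather than inside the processing loop.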

 

 


Reposted from blog.csdn.net/abcdefghwelcome/article/details/85054325