Java MapReduce for top N Twitter Hashtags
Back in 2015 I had to implement a MapReduce job to extract the top 15 hashtags from Twitter’s raw data in Hadoop. This was part of a Business Intelligence lecture exercise at Vienna University of Technology. The regex used for hashtag extraction (not the best one, in my opinion) has the following format: String regex = "text\":\\\"(.*)\\\",\"source"; The ‘source’ field comes ...
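To make the idea concrete, here is a minimal sketch in plain Java, without the Hadoop classes, of how the regex above can feed a top-N count. Only the text regex is taken from the post. The class name TopHashtags, the methods extractHashtags and topN, the hashtag pattern #(\w+), the lowercasing of tags, and the sample tweet lines are all assumptions made for illustration, not the code of the original job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TopHashtags {
    // Regex from the post: captures everything between "text":" and ","source".
    private static final Pattern TEXT = Pattern.compile("text\":\\\"(.*)\\\",\"source");
    // Assumed hashtag pattern: '#' followed by one or more word characters.
    private static final Pattern HASHTAG = Pattern.compile("#(\\w+)");

    // "Map" side: return every hashtag found in the text field of one raw tweet line.
    static List<String> extractHashtags(String rawTweet) {
        List<String> tags = new ArrayList<>();
        Matcher text = TEXT.matcher(rawTweet);
        if (text.find()) {
            Matcher tag = HASHTAG.matcher(text.group(1));
            while (tag.find()) {
                // Assumption: hashtags are compared case-insensitively.
                tags.add(tag.group(1).toLowerCase());
            }
        }
        return tags;
    }

    // "Reduce" side: count each hashtag and keep the n most frequent ones.
    static List<Map.Entry<String, Integer>> topN(List<String> rawTweets, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (String tweet : rawTweets) {
            for (String tag : extractHashtags(tweet)) {
                counts.merge(tag, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // Highest count first; ties are broken alphabetically by hashtag.
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries.subList(0, Math.min(n, entries.size()));
    }

    public static void main(String[] args) {
        // Hypothetical raw tweet lines, shaped like Twitter's JSON.
        List<String> tweets = Arrays.asList(
            "{\"id\":1,\"text\":\"Learning #Hadoop and #Java\",\"source\":\"web\"}",
            "{\"id\":2,\"text\":\"#hadoop rocks\",\"source\":\"web\"}",
            "{\"id\":3,\"text\":\"no tags here\",\"source\":\"web\"}");
        for (Map.Entry<String, Integer> e : topN(tweets, 15)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

In a real Hadoop job the first method would roughly correspond to the mapper, emitting a (hashtag, 1) pair per match, and the counting and sorting would be done on the reducer side. Note that the greedy (.*) can capture too much if a line contains ","source more than once, which is one reason the regex is not ideal.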
By dzhamzic on September 21, 2016