While some may find it concerning that sites such as Facebook, Google, and Twitter monitor user activity, that monitoring is what lets them surface the topics you are most likely to find interesting. The algorithm Twitter uses considers factors such as the number of posts on a specific topic and the rate at which that count increases. Now researchers at MIT have developed a new algorithm that can predict trending topics with near-perfect accuracy, and do so hours earlier.
To craft the algorithm, the researchers used machine learning to analyze data on a set of 400 topics: 200 that trended and 200 that did not. Machine learning has been used to study datasets before, but this time was different in one significant way: no initial model was created prior to the data analysis. The information was simply fed to the machine learning system, which crafted its own hypotheses and then tested them. When it finished processing, it had an algorithm that predicted trending topics correctly 95% of the time, with only a 4% false-positive rate. On average, it also flagged those topics an hour and a half before Twitter's own algorithm did, and in extreme cases four or five hours before.
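To make the idea of predicting from labeled examples rather than from a hand-built model more concrete, here is a minimal sketch in Python. It is an illustration only, not the researchers' actual algorithm, whose internals are not described here. The function names (`predict_trending`, `rates`), the `gamma` parameter, the distance measure, and the toy activity series are all assumptions made up for this example.

```python
import math


def distance(a, b):
    """Squared Euclidean distance between two equal-length activity series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_trending(series, trended, not_trended, gamma=1.0):
    """Let labeled examples vote; closer examples get exponentially more weight.

    Assumption: `gamma` is a hypothetical scaling parameter for this sketch.
    """
    vote_yes = sum(math.exp(-gamma * distance(series, ex)) for ex in trended)
    vote_no = sum(math.exp(-gamma * distance(series, ex)) for ex in not_trended)
    return vote_yes > vote_no


def rates(predictions, labels):
    """Return (true_positive_rate, false_positive_rate)."""
    tp = sum(1 for p, y in zip(predictions, labels) if p and y)
    fp = sum(1 for p, y in zip(predictions, labels) if p and not y)
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


# Toy post counts per time step (invented data, not real Twitter activity).
trended = [[1, 2, 4, 8], [1, 3, 6, 12]]
not_trended = [[2, 2, 2, 2], [3, 2, 3, 2]]

# Two unseen topics with known outcomes, used to score the predictions.
tests = [[1, 2, 5, 9], [2, 2, 3, 2]]
labels = [True, False]

predictions = [predict_trending(s, trended, not_trended, gamma=0.1) for s in tests]
tpr, fpr = rates(predictions, labels)
print(predictions, tpr, fpr)
```

Nothing in this sketch hard-codes what a trending topic looks like; the labeled examples carry all of that information. As a point of reference, and assuming the 95% figure is read as the share of trending topics caught, the reported numbers on 200 trended and 200 non-trended topics would correspond to roughly 190 topics correctly flagged and 8 flagged in error.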
If unleashed onto the whole of Twitter, the accuracy should increase even further, since there would be more information on which to base a hypothesis. If unleashed on the world of statistical analysis, though, the impact could reach well beyond finding what's trending: the novel idea of letting machine learning build its own models could help uncover other, previously unseen patterns.