Giving Computers the Ability to Identify Actions in Videos
One technology that has been hitting a number of headlines over the past several years is natural language processing, as it allows us to interact with computers without the formal logic they typically require. Researchers at MIT are looking to take that interaction beyond language by giving computers the ability to analyze and identify the actions they see.
Activity recognition techniques have been developed before, but the MIT researchers have made significant progress by borrowing lessons learned from natural language processing. When parsing a sentence, a computer can identify its different parts and then work out the connections between them to understand what it is reading. Similarly, an action can be divided into sub-activities, which a computer can watch for and connect together. It may start with a large number of hypotheses, but as it continues to observe the action, it can narrow them down and make an accurate guess about what it is seeing.
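To make the idea concrete, here is a minimal sketch of that hypothesis-pruning approach. It is not the MIT algorithm itself; the action names, sub-activity sequences, and the `recognize` function are all hypothetical, invented for illustration. Each candidate action is defined as an ordered list of sub-activities, and each new observation eliminates the hypotheses it no longer fits:

```python
# Illustrative sketch only, not the researchers' actual algorithm.
# Each hypothetical action is an ordered sequence of sub-activities.
ACTIONS = {
    "make_tea":    ["boil_water", "add_teabag", "pour_water"],
    "make_coffee": ["boil_water", "grind_beans", "pour_water"],
    "wash_cup":    ["turn_on_tap", "scrub_cup", "rinse_cup"],
}

def recognize(observations):
    """Yield the surviving action hypotheses after each observed sub-activity."""
    # Map each candidate action to the index of its next expected sub-activity.
    candidates = {name: 0 for name in ACTIONS}
    for obs in observations:
        # Keep only hypotheses whose next expected step matches the observation.
        candidates = {
            name: idx + 1
            for name, idx in candidates.items()
            if idx < len(ACTIONS[name]) and ACTIONS[name][idx] == obs
        }
        yield sorted(candidates)

for step, alive in zip(["boil_water", "grind_beans"],
                       recognize(["boil_water", "grind_beans"])):
    print(step, "->", alive)
# boil_water -> ['make_coffee', 'make_tea']
# grind_beans -> ['make_coffee']
```

Note how the sketch also hints at two properties reported for the real system: the action is identified after only two of its three sub-activities (before it completes), and each observation is processed once against a fixed-size candidate set, so memory stays constant and running time grows linearly with the length of the video.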
By applying this approach, the researchers gave their algorithm the ability to identify actions before they are completed, along with a fixed amount of memory usage and execution time that scales linearly, so that files ten times larger take only ten times longer to process. While the algorithm has any number of applications, the researchers foresee medical uses, such as monitoring physical therapy exercises and reporting when they are not being done properly.