Indian researchers from Xerox and the International Institute of Information Technology at Hyderabad have used the sport of cricket to develop a novel approach to interpreting human movement recorded on video.

As the trio of authors from the Institute's Centre for Visual Information Technology write in an arXiv paper titled Fine-Grain Annotation of Cricket Videos, “The labelling of human actions in videos is a challenging problem for computer vision systems”. The challenge arises because a system must identify the frames that contain movement, work out who is moving, and then attach a “semantic label” to each movement. The authors say each of the three tasks is doable on its own, but doing them all at once is hard.

Cricket, it turns out, offers a way to tackle that problem because it is documented by two public data sets: videos of matches, and text commentaries on the same games published by cricket news websites. Those commentaries mention players by name and describe their actions, providing useful clues for interpreting what's going on in a video. Armed with commentaries from CricInfo and videos from the Indian Premier League's YouTube channel, the researchers had plenty to work with.
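
To give a feel for why commentary is such a handy source of labels, here is a minimal Python sketch that pulls player names and an action label out of a single commentary line. It is not the authors' method: the "bowler to batsman, description" line format, the `ACTION_KEYWORDS` vocabulary, and the names `parse_commentary` and `label_actions` are all assumptions made for illustration, and the player names are placeholders.

```python
import re
from typing import List, NamedTuple, Optional


class BallEvent(NamedTuple):
    bowler: str
    batsman: str
    description: str


# Assumed line format: "<bowler> to <batsman>, <free-text description>".
_LINE = re.compile(r"^(?P<bowler>[^,]+?) to (?P<batsman>[^,]+), (?P<description>.+)$")

# Hypothetical action vocabulary; a real system would need a far richer one.
ACTION_KEYWORDS = {
    "drive": {"drive", "drives", "driven"},
    "pull": {"pull", "pulls", "pulled"},
    "defend": {"defend", "defends", "defended"},
}


def parse_commentary(line: str) -> Optional[BallEvent]:
    """Split one commentary line into bowler, batsman and description."""
    match = _LINE.match(line.strip())
    if match is None:
        return None  # line does not follow the assumed format
    return BallEvent(
        match["bowler"].strip(),
        match["batsman"].strip(),
        match["description"].strip(),
    )


def label_actions(description: str) -> List[str]:
    """Return the action labels whose keywords appear in the description."""
    words = set(re.findall(r"[a-z]+", description.lower()))
    return sorted(label for label, keys in ACTION_KEYWORDS.items() if words & keys)


event = parse_commentary("Bowler A to Batsman B, driven through the covers for four")
if event is not None:
    print(event.batsman, label_actions(event.description))
```

Names and labels extracted this way would still have to be tied to the right stretch of video, which is the hard part the paper addresses.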

Results are promising: the team was often able to match a piece of text commentary to a specific frame of match video showing the named player performing the action the commentator described.

Of course, there's plenty more work to be done on this idea, but the authors are hopeful their experiment will find wider applications in the video recognition community. ®