Someone grabbed my wrist and pointed sharply below us: 'Quick, look down!' I swam to face the seabed and gasped through my snorkel. A shark stretching some 8m long, as long as a bus, had moved silently underneath us -- and I hadn't noticed it at all.

The whale shark, a beautiful filter-feeder whose giant mouth gives it a benign look, is good at escaping detection. Although it is the largest fish species alive today, it's still not known how large the global population of whale sharks is, how they migrate across the oceans, or where they give birth.

For a few minutes off the coast of Ningaloo in Western Australia, I was able to swim with one such shark, before it disappeared off into the Indian Ocean, too deep and too fast for humans to follow. All that remained of our encounter was a short video, taken by another snorkeller who'd been with me that day.

The tourist who made the video might have thought he was just creating a holiday souvenir, but he was unknowingly making a scientific record for marine researchers, and tagging the shark we'd both seen in a way that would allow us to reconnect with the animal many years later.

Stills from the video of our encounter were sent to a project that's helping to find out more about these enigmatic animals, using a technological toolbox that draws on computer vision, social media, and neural networks to track whale shark movements across the globe.

Whaleshark.org has been curating whale shark photos from all over the world for around 15 years, and uses the distinctive spot patterns surrounding the animals' left pectoral fins to identify particular individuals from their pictures. Because the spot patterns are as unique as a fingerprint, animals' movements can be tracked across the world using photos and locations uploaded to the site by everyone from professional whale shark researchers to holidaymakers who have snapped the creatures during a scuba dive.

Whaleshark.org is run by Wild Me, a not-for-profit that aims to help wildlife research and conservation using technology. The idea for Whaleshark.org came to Jason Holmberg, Wild Me's information architect, in the early 2000s after he took part in an expedition in La Paz, Mexico, that tracked animals by attaching plastic tags with spearguns.

Holmberg asked a member of the expedition how often the tags were subsequently resighted. "He said 'less than one percent of the time'. I said, 'Oh, there's some room for improvement then!'" Holmberg told ZDNet. "So I sat down and started programming and said, 'OK, what if we were to use these natural spots [on a whale shark] like a human fingerprint and just allow people to photograph them?'"

While the spot patterns are distinctive enough for human researchers to identify one shark from any number of animals in the species, Holmberg's efforts to code an algorithm that could do the same were proving fruitless.

Holmberg's friend, and later co-founder of Wild Me, NASA pulsar astronomer Zavan Arzoumanian, persuaded Holmberg to put aside coding for one night and join him and a Dutch astronomer for a drink.

The chance meeting proved to be the answer Holmberg had needed: after he dejectedly explained the pattern-matching problem the whale shark project was facing, the Dutchman told him that NASA was already doing exactly the same thing, using an algorithm that had come out of the software development for the Hubble space telescope.

"When the Hubble telescope takes pictures of the night sky, it tries to turn those pictures into a larger mosaic. What happens is it needs to match star patterns so it can position the photos correctly [within the mosaic]. That process of matching the stars correctly is exactly the process we need to match whale sharks' spots," Holmberg said.
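The idea Holmberg describes, matching one set of points to another regardless of how the photo was framed, can be illustrated with a small sketch. The Python below is not the Whaleshark.org code or the NASA algorithm. It assumes a simplified triangle-comparison approach: every triple of spots forms a triangle, and the ratios of its side lengths stay the same when the pattern is shifted, rotated, or resized. The function names, the tolerance value, and the sample coordinates are all hypothetical.

```python
from itertools import combinations
from math import cos, dist, sin


def triangle_descriptors(points):
    """Describe every triangle of points by two side-length ratios.

    The ratios do not change when the whole pattern is shifted,
    rotated, or uniformly scaled.
    """
    descriptors = []
    for p, q, r in combinations(points, 3):
        a, b, c = sorted((dist(p, q), dist(q, r), dist(r, p)))
        if a == 0:
            continue  # skip degenerate triangles built from duplicate points
        descriptors.append((a / c, b / c))
    return descriptors


def match_score(pattern_a, pattern_b, tol=0.01):
    """Fraction of triangles in pattern_a with a similar triangle in pattern_b.

    tol is an illustrative tolerance, not a tuned value.
    """
    desc_a = triangle_descriptors(pattern_a)
    desc_b = triangle_descriptors(pattern_b)
    if not desc_a or not desc_b:
        return 0.0
    matched = sum(
        1
        for x1, y1 in desc_a
        if any(abs(x1 - x2) <= tol and abs(y1 - y2) <= tol for x2, y2 in desc_b)
    )
    return matched / len(desc_a)


def similarity_transform(points, angle, scale, shift):
    """Rotate, scale, and shift points, imitating a differently framed photo."""
    dx, dy = shift
    return [
        (scale * (x * cos(angle) - y * sin(angle)) + dx,
         scale * (x * sin(angle) + y * cos(angle)) + dy)
        for x, y in points
    ]


# Made-up spot coordinates for one animal, plus the same spots as they
# might appear in a second photo taken from a different distance and angle.
spots = [(0, 0), (4, 1), (1, 5), (6, 6), (3, 3)]
same_animal = similarity_transform(spots, angle=0.7, scale=2.5, shift=(10, -3))
equilateral = [(0, 0), (1, 0), (0.5, 0.866)]

print(match_score(spots, same_animal))
print(match_score(equilateral, spots))
```

Here the first comparison pits the pattern against a rotated, enlarged copy of itself, while the second uses an unrelated triangle of spots. A production system would also have to cope with missing or extra spots and with the distortion of a curved, moving body, which this sketch ignores.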

After Holmberg uncovered the original paper behind the algorithm -- written by a Princeton physics professor for NASA's Hubble program -- and spent some time refining it, the algorithm was rolled out on Whaleshark.org and has been used to identify whale sharks ever since.

"The algorithm was developed around 1984. It was really well ahead of its time in terms of its elegance. There are many computer vision algorithms that have been developed since then, even for whale sharks, that don't work anywhere near as well. It's the only one that scales to a global dataset... and [can] reliably identify the right whale shark across 50,000 photos," Holmberg said.

The system is underpinned by Amazon's EC2 in the Portland, Oregon region, where Wild Me is based. Using a public cloud service allows the organisation to scale the number of servers at its disposal up and down, according to how much computing grunt it needs at any one time.

Surprisingly, the hardest part of building a system that can identify a single whale shark from a cast of thousands is the data management layer involved: "getting data into a manageable format so [researchers] can identify individual animals and use computer vision systems, which require good, well-managed datasets," Holmberg said.

To deal with the problem, Wild Me built Wildbook, an open source data management framework for use in wildlife and ecology studies.

"In 2003, I didn't understand how bad the data management challenge for wildlife was, and it still is, 13 years later. Most people who began using Wildbook were migrating off of 1990s desktop applications -- off of Access, off of Excel -- which don't allow them to share data, pool data, collect data from citizen scientists."

Wildbook is one part of the IBEIS (image-based ecological information system) computer vision platform, storing and managing the pictorial data that IBEIS queries.

IBEIS was developed by Wild Me in concert with researchers at the University of Illinois at Chicago and Princeton University, and has been used for other animal identification projects, including monitoring zebra migrations. "Everything we've learned from Whaleshark.org we've been successful at making generic and reusable so other wildlife communities that are facing the same challenges can reuse the system," Holmberg said.

Wild Me is also working with researchers at Rensselaer Polytechnic Institute on convolutional neural networks to identify injuries to whale sharks caused by ship strikes or fishing net entanglements. By working out what percentage of whale sharks experience such injuries, conservationists can talk to the companies behind the fishing vessels or tourist boats to get them to change their behaviour and not harm the sharks.

The organisation is also hoping to put computer vision to work identifying and tracking dolphins by the shape of their fins. "We know it can be done. Are those notches and the shape of the fin too subtle for computer vision systems that exist today? We don't think so, but they have been generally too subtle for past efforts."

Wild Me is also looking to make better use of the wealth of material on social media. Over the last couple of years, Holmberg has been data mining YouTube for holiday videos of whale sharks, and the valuable information they contain. The organisation hopes eventually to be able to automatically import the relevant stills from social media sites and leave information for YouTubers on the whale shark they've spotted, and how their video is contributing to research.

When I uploaded pictures of my own whale shark encounter from 2007 to whaleshark.org, the website showed me a whole history of sightings of the same animal dating back to 2002. From then until last year, the male nicknamed Mandu has been seen repeatedly off the coast of Ningaloo in Western Australia. With whale shark lifespans thought to be 70 years or more, I'm looking forward to following Mandu's progress for decades to come.

Read more by Jo Best