Transport for London (TfL) oversees a network of buses, overground and underground trains, taxis, roads, cycle paths, footpaths and even ferries, which are used by millions of people every day.

Running these vast networks, which are integral to so many people’s lives in one of the world’s busiest cities, gives TfL access to huge amounts of data. This is collected through ticketing systems, sensors attached to vehicles and traffic signals, surveys and focus groups and, of course, social media.

Lauren Sager-Weinstein, head of analytics at TfL, spoke to me about the two key priorities for collecting and analyzing this data: planning services and providing information to customers.

Sager-Weinstein told me, “London is growing at a phenomenal rate. The population is currently 8.6 million and is expected to grow to 10 million very quickly. We have to understand how they behave and how to manage their transport needs.”

“Passengers want good services and value for money from us, and they want to see us being innovative and progressive in order to meet those needs.”

Oyster prepaid travel cards were first issued in 2003 and have since been expanded across the network. Passengers effectively “charge” them by converting real money from their bank accounts into “Transport for London money”, and the cards are then swiped to gain access to buses and trains. This enables a huge amount of data to be collected about the precise journeys being taken.

Journey mapping

This data is anonymized and used to produce maps showing when and where people are traveling. This gives a far more accurate overall picture than was possible before, and it also allows more granular analysis at the level of individual journeys. As a large proportion of London journeys involve more than one method of transport, this level of analysis was not possible in the days when tickets were purchased separately from different services, in cash, for each leg of the journey.

That isn’t to say that integrating state-of-the-art data collection strategies with legacy systems has been easy in a city where public transport has operated since 1829. For example, on London Underground (Tube) journeys, passengers are used to “checking in and checking out”: tickets are validated by automatic barriers at the start and end of a journey. On buses, however, passengers simply check in. Traditionally, tickets were purchased from the driver or inspector for a set fee per journey. There is no mechanism for recording where a passenger leaves the bus and ends their journey, and implementing one would have been impossible without inconveniencing the customer.

“Data collection has to be tied to business operations. This was a challenge to us, in terms of tracking customer journeys,” says Sager-Weinstein. TfL worked with MIT, just one of the academic institutions with which it has research partnerships, to devise a Big Data solution to the problem.

“We asked, ‘Can we use Big Data to infer where someone exited?’ We know where the bus is, because we have location data, and we have Oyster data for entry.

“What we do next is look at where the next tap is. If we see the next tap follows shortly after and is at the entry to a Tube station, we know we are dealing with one long journey using bus and Tube.”

“This allows us to understand load profiles, meaning how crowded a particular bus or range of buses is at a certain time, and to plan interchanges, to minimize walk times and plan other services such as retail.”
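To make the idea concrete, here is a minimal sketch of the "next tap" inference Sager-Weinstein describes. It is not TfL's or MIT's actual method: the Tap record, the 45-minute MAX_GAP threshold, the local metre grid and the rule of picking the stop closest to the next tap are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import hypot
from typing import Optional


@dataclass
class Tap:
    """One card tap (hypothetical record layout, not TfL's schema)."""
    card_id: str
    time: datetime
    mode: str                 # "bus" or "tube"
    x: float                  # tap position in metres on a local grid
    y: float
    # For bus taps only: (stop_name, x, y) for the stops still ahead on the route.
    route_stops: tuple = ()


# Assumed cut-off for "the next tap follows shortly after".
MAX_GAP = timedelta(minutes=45)


def infer_bus_exit(bus_tap: Tap, next_tap: Optional[Tap]) -> Optional[str]:
    """Guess where a bus passenger got off, using the same card's next tap.

    Returns the name of the stop ahead that lies closest to the next tap,
    or None when there is not enough evidence to link the two taps.
    """
    if next_tap is None or bus_tap.mode != "bus" or not bus_tap.route_stops:
        return None
    if next_tap.card_id != bus_tap.card_id:
        return None
    gap = next_tap.time - bus_tap.time
    if gap < timedelta(0) or gap > MAX_GAP:
        return None
    name, _, _ = min(
        bus_tap.route_stops,
        key=lambda stop: hypot(stop[1] - next_tap.x, stop[2] - next_tap.y),
    )
    return name


if __name__ == "__main__":
    bus = Tap(
        "card-1", datetime(2015, 3, 2, 8, 0), "bus", 0.0, 0.0,
        (("High Street", 500.0, 0.0),
         ("Station Road", 1500.0, 0.0),
         ("Park Gate", 3000.0, 0.0)),
    )
    tube = Tap("card-1", datetime(2015, 3, 2, 8, 20), "tube", 1550.0, 40.0)
    print(infer_bus_exit(bus, tube))  # Station Road
```

Once an exit stop has been inferred for each bus tap, counting boardings and inferred exits stop by stop along a route would give the kind of load profile mentioned in the quote.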

Unexpected events

Big Data analysis also helps TfL respond in an agile way when disruption occurs. Sager-Weinstein cites an occasion where Wandsworth Council was forced to close Putney Bridge, which is crossed by 870,000 people every day, for emergency repairs.

“We were able to work out that half of the journeys started or ended very close to Putney Bridge. The bridge was still open to pedestrians and cyclists, so we knew those people would be able to cross and either reach their destination or continue their journey on the other side. They either live locally, or their destination is local.”

“The other half were crossing the bridge at the half-way point of their journey. In order to serve their needs we were able to set up a transport interchange and increase bus service on alternate routes. We also sent them personalized messages about how their journey was likely to be affected. It was very helpful that we were able to use Big Data to quantify them.”
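The split Sager-Weinstein describes, between journeys that start or end near the bridge and journeys that merely pass over it, can be sketched in a few lines. This is an illustration only: the BRIDGE coordinates, the 800-metre LOCAL_RADIUS_M and the representation of a journey as an (origin, destination) pair of points are assumptions, not details from TfL.

```python
from math import hypot

# Hypothetical bridge position on a local metre grid, and an assumed
# radius for what counts as "very close" to the bridge.
BRIDGE = (0.0, 0.0)
LOCAL_RADIUS_M = 800.0


def is_near_bridge(point):
    """True when a point lies within LOCAL_RADIUS_M of the bridge."""
    return hypot(point[0] - BRIDGE[0], point[1] - BRIDGE[1]) <= LOCAL_RADIUS_M


def classify(journey):
    """Label a bridge-crossing journey as 'local' or 'through'.

    A journey is an (origin, destination) pair of (x, y) points.
    """
    origin, destination = journey
    if is_near_bridge(origin) or is_near_bridge(destination):
        return "local"
    return "through"


def split_shares(journeys):
    """Return the share of local and through journeys as fractions."""
    counts = {"local": 0, "through": 0}
    for journey in journeys:
        counts[classify(journey)] += 1
    total = len(journeys)
    if total == 0:
        return {"local": 0.0, "through": 0.0}
    return {label: n / total for label, n in counts.items()}


if __name__ == "__main__":
    sample = [
        ((100.0, 0.0), (5000.0, 0.0)),      # starts near the bridge
        ((-4000.0, 0.0), (300.0, 200.0)),   # ends near the bridge
        ((-3000.0, 0.0), (4000.0, 0.0)),    # crosses mid-journey
        ((-6000.0, 100.0), (2500.0, 0.0)),  # crosses mid-journey
    ]
    print(split_shares(sample))
```

In this sketch, the "through" group corresponds to the passengers who, in the Putney Bridge case, needed the interchange and the extra bus service on alternate routes.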

This personalized approach to providing travel information is the other key priority for TfL’s data initiatives.

“We have been working really hard to really understand what our customers want from us in terms of information. We push information from 23 Twitter accounts and provide online customer services 24 hours a day.”

Personalized travel news

Travel data is also used to identify customers who regularly use specific routes and send tailored travel updates to them. “If we know a customer frequently uses a particular station, we can include information about service changes at that station in their updates. We understand that people are hit by a lot of data these days and too much can be overwhelming so there is a strong focus on sending data which is relevant,” says Sager-Weinstein.
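A rough sketch of that relevance filter might look like the following. The MIN_VISITS threshold, the function names and the shape of the data (a list of station names per customer, and service changes as (station, message) pairs) are hypothetical choices for illustration, not TfL's implementation.

```python
from collections import Counter

# Assumed threshold for "frequently uses a particular station".
MIN_VISITS = 3


def frequent_stations(taps, min_visits=MIN_VISITS):
    """Return the stations a customer tapped in at least min_visits times.

    taps is a list of station names, one entry per tap.
    """
    counts = Counter(taps)
    return {station for station, n in counts.items() if n >= min_visits}


def relevant_updates(taps, service_changes):
    """Keep only the service-change messages for the customer's frequent stations.

    service_changes is a list of (station, message) pairs.
    """
    stations = frequent_stations(taps)
    return [message for station, message in service_changes if station in stations]


if __name__ == "__main__":
    taps = ["Putney Bridge"] * 4 + ["Victoria"] + ["Bank"] * 3
    changes = [
        ("Victoria", "Victoria: escalator works this weekend"),
        ("Bank", "Bank: no interchange on Sunday"),
        ("Putney Bridge", "Putney Bridge: station closes early on Friday"),
    ]
    for message in relevant_updates(taps, changes):
        print(message)
```

Filtering this way reflects the stated focus on relevance: a customer who passed through a station once is not sent every notice about it.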

“We use information from the back-office systems for processing contactless payments, as well as Oyster, train location and traffic signal data, cycle hire and the congestion charge. We also take into account special events such as the Tour de France and identify people likely to be in those areas. 83% of our passengers rate this service as ‘useful’ or ‘very useful’.” Not bad when you consider that complaining about the state of public transport is considered a hobby by many British people.

TfL also provides its data through open APIs for use by third-party app developers, meaning that tailored solutions can be developed for niche user groups.

Its systems currently run on a number of Microsoft- and Oracle-based platforms, but the organization is looking into adopting Hadoop and other open source solutions to cope with growing data demands.

Plans for the future include increasing the capacity for real-time analytics and working on integrating an even wider range of data sources, to better plan services and inform customers.

Big Data has clearly played a big part in re-energizing London’s transport network. But importantly, it is clear that it has been implemented in a smart way, with eyes firmly on the prize. “One of the most important questions is always ‘why are we asking these questions?’” explains Sager-Weinstein.

“Big Data is always very interesting but sometimes it is only interesting. You need to find a business case.”

“We always try to come back to the bigger questions – growth in London and how we can meet that demand, by managing the network and infrastructure as efficiently as possible.”

About: Bernard Marr is a globally recognized expert in big data, analytics and enterprise performance. He helps companies improve decision-making and performance using data. His new book is Data: Using Smart Big Data, Analytics and Metrics To Make Better Decisions and Improve Performance.
