Photo credit: Jason Hiner | CBS Interactive

While Big Data is arguably the hottest buzz phrase in tech in 2012, there is shockingly little information about how real companies are using Big Data to do big things. We recently sat down with Ford, one of the world’s most data-driven and data-rich companies, to talk about how the revived U.S. automaker is using Big Data analytics for real-world stuff and what kinds of possibilities it sees for the future of this red-hot segment of IT.

Ford’s Big Data analytics leader John Ginder, who technically runs the Systems Analytics and Environmental Sciences team in Ford Research, said that the combination of Ford’s near-death experience in the mid-2000s and the arrival of CEO Alan Mulally in 2006 has changed the company into a data hound that is sitting on a wealth of data stores that could be used to benefit consumers, the general public, and Ford itself.

Crisis and opportunity

"We went through a really difficult period in the last decade where we lost about half of our people and were near death at one point,” said Ginder. “It really encouraged people to think outside the box and think about solutions coming from folks like us that they may not have considered in the past. There is a lot more willingness to consider analytical solutions, simulations, novel approaches that maybe are different from the traditional business or intuitional approach. That's benefited us greatly."

Ford started getting serious about analytics in the 1990s as servers and storage got cheaper and many Wall Street companies showed the world what was possible with serious data modeling. Various analytics groups popped up within Ford, including what would become Ginder’s group in Research, as well as separate groups in Marketing, in the Ford Credit department, and in other groups.

Still, all of these analytics groups were focused on a few very specific tasks -- like risk analysis in Ford Credit -- or were doing more abstract scientific stuff like the Research group and weren’t being called upon to be a core business driver. But then, Ford’s near-death experience "helped open people's minds [and] created a sense of panic,” recalled Ginder. He said Ford leaders started looking at each other and asking, “What do we do? Well, let's ask these guys.” That gave analytics the chance to step in and play a big role in Ford’s turnaround.

At the same time, another factor came into play -- the arrival of a new CEO.

Ginder said, “Alan Mulally came in in 2006 and he has meetings every week with his direct reports that are filled with tables and charts saying, 'How are we doing against our objectives? Quantitatively, are we hitting whatever the metrics are, and if we're missing them, then why?' That trickles down and encourages a data-driven approach in the company. I hate to admit it, but some parts of the company would have been less [data-driven] if they were left to their own devices.”

Big Data at Ford

With analytics now embedded into the culture of Ford, the rise of Big Data analytics has created a whole host of new possibilities for the automaker.

"We recognize that the volumes of data we generate internally -- from our business operations and also from our vehicle research activities as well as the universe of data that our customers live in and that exists on the Internet -- all of those things are huge opportunities for us that will likely require some new specialized techniques or platforms to manage,” said Ginder. “Our research organization is experimenting with Hadoop and we're trying to combine all of these various data sources that we have access to. We think the sky is the limit. We recognize that we're just kind of scraping the tip of the iceberg here."

The other major asset that Ford has going for it when it comes to Big Data is that the company is tracking enormous amounts of useful data in both the product development process and the products themselves.

Ginder noted, "Our manufacturing sites are all very well instrumented. Our vehicles are very well instrumented. They're closed loop control systems. There are many many sensors in each vehicle… Until now, most of that information was [just] in the vehicle, but we think there's an opportunity to grab that data and understand better how the car operates and how consumers use the vehicles and feed that information back into our design process and help optimize the user's experience in the future as well."

Of course, Big Data is about a lot more than just harnessing all of the runaway data sources that most companies are trying to grapple with. It’s about structured data plus unstructured data. Structured data is all the traditional stuff most companies have in their databases (as well as the stuff like Ford is talking about with sensors in its vehicles and assembly lines). Unstructured data is the stuff that’s now freely available across the Internet, from public data now being exposed by governments on sites such as data.gov in the U.S. to treasure troves of consumer intelligence such as Twitter. Mixing the two and coming up with new analysis is what Big Data is all about.

"The fundamental assumption of Big Data is the amount of that data is only going to grow and there's an opportunity for us to combine that external data with our own internal data in new ways,” said Ginder. “For better forecasting or for better insights into product design, there are many, many opportunities."

Ford is also digging into the consumer intelligence aspect of unstructured data. Ginder said, "We recognize that the data on the Internet is potentially insightful for understanding what our customers or our potential customers are looking for [and] what their attitudes are, so we do some sentiment analysis around blog posts, comments, and other types of content on the Internet."

That kind of thing is pretty common, and a lot of Fortune 500 companies are doing something similar. However, there’s another way that Ford is using unstructured data from the Web that is more unusual, and it has changed the way the company predicts future sales of its vehicles.

"We use Google Trends, which measures the popularity of search terms, to help inform our own internal sales forecasts,” Ginder explained. “Along with other internal data we have, we use that to build a better forecast. It's one of the inputs for our sales forecast. In the past, it would just be what we sold last week. Now it's what we sold last week plus the popularity of the search terms... Again, I think we're just scratching the surface. There's a lot more I think we'll be doing in the future."
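To make the idea concrete, here is a minimal sketch of a forecast that uses both inputs Ginder mentions: last week's sales and a search-popularity index. It is not Ford's model. The numbers are synthetic, the names (`history`, `fit_least_squares`, `predict`) are hypothetical, and an ordinary least-squares fit is simply one plausible way to combine the two signals.

```python
# Illustrative only: synthetic data and hypothetical names, not Ford's method.

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_least_squares(rows, targets):
    """Fit y ~ b0 + b1*x1 + b2*x2 by solving the normal equations."""
    X = [[1.0, *row] for row in rows]  # prepend an intercept column
    n = len(X[0])
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)] for i in range(n)]
    b = [sum(x[i] * y for x, y in zip(X, targets)) for i in range(n)]
    return solve(A, b)


def predict(coeffs, last_week_sales, search_index):
    """Forecast next week's sales from the two inputs."""
    return coeffs[0] + coeffs[1] * last_week_sales + coeffs[2] * search_index


# Synthetic history: (last week's unit sales, search-popularity index on a 0-100 scale).
history = [(100.0, 40.0), (120.0, 55.0), (90.0, 60.0), (130.0, 45.0), (105.0, 70.0)]

# Synthetic targets generated from a known rule so the fit can be checked.
next_week = [10.0 + 0.8 * sales + 2.0 * index for sales, index in history]

coeffs = fit_least_squares(history, next_week)
forecast = predict(coeffs, last_week_sales=110.0, search_index=50.0)
print(round(forecast, 2))
```

Because the synthetic targets follow an exact linear rule, the fit recovers that rule. Real sales data would be noisy, and a production forecast would presumably draw on the "other internal data" Ginder refers to.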

Big Data still needs better tools

The reason why Ford is only scratching the surface on a lot of this Big Data stuff is that the tools for it are still in their infancy. In spite of the fact that there’s so much buzz around Big Data in 2012, there are still relatively few turn-key commercial tools to help big companies do this stuff. Ginder and his group mostly rely on open source tools like Hadoop for managing large sets of data and the R Project for statistical analysis and other open source apps for data mining and text mining.

While these types of tools are extremely powerful and scalable, they also require highly-skilled, database-trained IT professionals and programmers to operate them. Another one of the promises of Big Data is that non-technical people will eventually be able to use natural language tools to access these giant mashed-up data sets. These “data scientists” of the future won’t have to know how to string together SQL queries, but will be more like business analysts who know how to ask the right kinds of questions in order to discover data gems that can change the ways a company thinks about a problem.

However, Ginder sees that as a future state that’s still several steps away. "That's a great endpoint I'd love us to move toward,” said Ginder, “but there aren't enough of us and there aren't enough of those tools out there to enable us to do that yet. We have our own specialists who are working with the tools and developing some of our own in some cases and applying them to specific problems. But, there is this future state where we'd like to be where all that data would be exposed. [And] where data specialists -- but not computer scientists -- could go in and interrogate it and look for correlations that [they] might not have been able to look at before. That's a beautiful future state, but we're not there yet."

The good news is that once the tools develop and Ford gets to a future state with Big Data, Ginder would like to see Ford share a lot of its data openly with the larger community.

"We need to give ourselves and everyone in the community access to this data and these tools,” Ginder said. “Some of it is proprietary, of course, but once it's in our hands I think then we might discover applications or uses that we hadn't really imagined that might be more helpful or more important than the ones that were envisioned at the beginning. Get it in people's hands, let them experiment with it, and I'm sure it will open up huge new opportunities for us."

In terms of the amazing possibilities, Ginder speculated about some of the things Ford could do with Big Data once the tools catch up.

"Increasingly we're incorporating cameras on vehicles… What else could we use [camera] data for, and can we combine that high bit-rate data with other kinds of sensor signals to help inform context-awareness for various types of applications, just as another environmental sensor, if you will?” Ginder said. “We've got sensors on the car now. We've got temperature, pressure, humidity, local concentrations of pollutants (the stuff coming out of tailpipes), so what else can we do with these new sensors? That's a huge unexplored opportunity for us. Can you build better weather forecasts? Can you make better traffic predictions? Can you help asthmatics avoid certain areas? Can you control the airflow in the car?"

At this point, it’s easy to tell why Big Data wonks like Ginder are fired up about where Big Data analytics is going to take us in the next few years, even if we’re still only taking baby steps in 2012.

Ginder noted, "Never before did we have all of this data available to us nor did we have the computing power to handle it all. The killer app may be one that we haven't really anticipated yet."

Read the rest of the series

This is the first piece in my four-part series on Ford Motor Company and its transformation into an important player in the technology world.

Part 1: Ford's Big Data chief sees massive possibilities, but the tools need work (ZDNet)

Part 2: How Ford reimagined IT from the inside-out to power its turnaround (TechRepublic)

Part 3: Ford's 'open platform' car: How open is 'open'? (ZDNet)

Part 4: Ford is now a 'personal mobility' company: How the comeback kids are riding tech to a new destiny (CNET)