“You can’t have everything,” notes comedian Steven Wright. “Where would you put it?”

Similarly, we can’t know everything, because how would we find it? As governments and other organizations support open data and make more of their data available, this becomes more of a problem. The premise of big data (or even Big Content) is that if we collect enough data, we can gain new insights. But how do we know what data we have in the first place?

That’s where the semantic web comes in: a set of technologies that lets us not only hold a lot of data, but also describe it, learn what it means, see how the types of data relate to one another, and trace where the data comes from. It’s like metadata on steroids. This allows the development of new kinds of applications, including ones that support the Internet of Things and search engines that can answer more complex questions than “Which Doctor Who actor also played Rasputin?”
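To make that a little more concrete, here is a minimal sketch in plain Python of the idea behind it: facts stored as subject, predicate, object statements, each with a note about where it came from. Real semantic web tools use standards such as RDF rather than Python tuples, and every name below (`Statement`, `match`, the `example:` source labels) is an assumption made up for illustration, not part of any actual library or dataset.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One fact, plus a provenance label saying where it came from."""
    subject: str
    predicate: str
    obj: str
    source: str  # hypothetical source label, for illustration only


# A tiny, hand-written set of facts. The source labels are placeholders.
STATEMENTS = [
    Statement("Tom Baker", "played", "The Doctor", "example:cast-list-a"),
    Statement("Tom Baker", "played", "Rasputin", "example:cast-list-b"),
    Statement("David Tennant", "played", "The Doctor", "example:cast-list-a"),
    Statement("The Doctor", "character_in", "Doctor Who", "example:cast-list-a"),
]


def match(statements, subject=None, predicate=None, obj=None):
    """Return the statements that agree with every field that is not None."""
    return [
        s for s in statements
        if (subject is None or s.subject == subject)
        and (predicate is None or s.predicate == predicate)
        and (obj is None or s.obj == obj)
    ]


def actors_who_played(statements, role):
    """Set of actors with a 'played' statement for the given role."""
    return {s.subject for s in match(statements, predicate="played", obj=role)}


def show_actors_who_also_played(statements, show, other_role):
    """Actors who played any character in `show` and also played `other_role`."""
    characters = {
        s.subject for s in match(statements, predicate="character_in", obj=show)
    }
    show_actors = set()
    for character in characters:
        show_actors |= actors_who_played(statements, character)
    return show_actors & actors_who_played(statements, other_role)


answer = show_actors_who_also_played(STATEMENTS, "Doctor Who", "Rasputin")
print(answer)  # {'Tom Baker'}
```

The point is that the question is answered by following relationships between statements rather than by matching keywords in a document, and each statement can still be traced back to its source.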

“Some of what we’re going to be doing is taking government data that is now available around the world and making it a little more machine-readable with this Semantic Web stuff,” James Hendler, head of the computer science department at Rensselaer Polytechnic Institute, tells GeekExchange. (Rensselaer recently got an implementation of IBM’s Watson hardware and software for this research.) “You’ll be able to ask Watson ‘Where can I find data about obesity in Europe?’ Tools for gathering a lot of that information — tying that information to things you find on the Web — is hard to do.”

First described in a computer context in 2001 by Hendler and Tim Berners-Lee, the inventor of the World Wide Web, the semantic web is essentially a next-generation World Wide Web, based on data rather than documents, and is sometimes known as Web 3.0. To a certain extent, the concept has had to wait until now for hardware and data to catch up with it. (Indeed, a number of aspects of the description of the future in the 2001 Scientific American article sound pretty familiar, or at least plausible, now.)

In fact, earlier this year, Gartner named the semantic web as one of its top technology trends. “Semantic technologies extract meaning from data, ranging from quantitative data and text, to video, voice and images,” the company writes. “Many of these techniques have existed for years and are based on advanced statistics, data mining, machine learning and knowledge management. One reason they are garnering more interest is the renewed business requirement for monetizing information as a strategic asset. Even more pressing is the technical need. Increasing volumes, variety and velocity — big data — in IM and business operations, requires semantic technology that makes sense out of data for humans, or automates decisions.”

So, what does this mean for you? It means you need to start getting prepared for this to become a Thing. You need to think about the taxonomy of your data, so that you will be able to add the appropriate metadata when the time comes. You also need to start thinking about how to obtain and use your data contextually, either to work with other semantic applications or to develop your own.
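As a rough illustration of what thinking about taxonomy buys you, the sketch below tags a few datasets with a topic and a region, then uses a small hierarchy of broader terms so that a search for "obesity" in "europe" also finds data tagged only with a specific country. The taxonomy, the dataset names, and the function names are all hypothetical; a real project would more likely build on an established vocabulary than on a hand-written dictionary.

```python
# Hypothetical taxonomy: each term maps to its broader parent term.
BROADER = {
    "obesity": "public health",
    "public health": "health",
    "france": "europe",
    "germany": "europe",
}


def with_ancestors(term, broader=BROADER):
    """Return the term followed by every broader term above it."""
    chain = [term]
    while chain[-1] in broader:
        chain.append(broader[chain[-1]])
    return chain


def describe(name, topic, region):
    """Build a metadata record that includes the broader terms as well."""
    return {
        "name": name,
        "topics": with_ancestors(topic),
        "regions": with_ancestors(region),
    }


def find(catalog, topic, region):
    """Names of datasets whose metadata covers both the topic and the region."""
    return [
        d["name"] for d in catalog
        if topic in d["topics"] and region in d["regions"]
    ]


# Placeholder dataset names, not real data sources.
CATALOG = [
    describe("dataset-a", "obesity", "france"),
    describe("dataset-b", "obesity", "germany"),
    describe("dataset-c", "public health", "germany"),
]

results = find(CATALOG, "obesity", "europe")
print(results)  # ['dataset-a', 'dataset-b']
```

Nobody had to tag dataset-a with "europe" by hand. The taxonomy supplied that context, which is the kind of payoff that makes the up-front classification work worthwhile.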

SemanticWeb.com, which, naturally, is intended to encourage the development of semantic web applications, offers other suggestions. Preparing is particularly important if you work in an industry that generates a lot of data, such as financial services, health care, oil and gas, and pharmaceuticals, says Gartner vice president and distinguished analyst Debra Logan.

“The semantic Web requires an entirely different type of thinking about online marketing content,” writes David Amerland in Forbes. “Information placed online needs to be capable of generating some kind of interaction with the online population, which means ‘marketing deliverables’ have to contain real value, not just keywords.” To do that, we need to be able to forge connections in all this data, to see how each piece of knowledge relates to every other, such as through social media, he writes.

It remains to be seen how much of a reality the semantic web will become, and bringing it about will require a lot of work. Right now, it’s at a fairly abstract and technical stage, as the Web itself was in the early 1990s. On the other hand, that seems to have turned out okay.