My startup promotes Internet family safety, and as a result I can lay claim to having indexed more pornography than perhaps any other company in the world -- more than OpenDNS, Norton, and McAfee combined.

We maintain two data buckets specifically for porn. One contains 7.2 million unique domains, and a second holds millions of URLs across Tumblr, Facebook, Twitter and other sites fueled by user-generated content. I believe this massive data set gives me unique insight into how much porn exists on sites like Tumblr. It’s a big problem for Tumblr, and now Yahoo, since most brands will shy away from advertising unless something is done about porn blogs. And parents, I’d venture, should be concerned about their kids hanging out on Tumblr blogs that publish some truly shocking stuff.

Yahoo!’s challenges lie not only with the number of Tumblr blogs that contain sexually explicit material, or what it calls “NSFW” (not safe for work). The problem runs deeper, because:

Tumblr doesn’t appear to know the difference between tagging and filtering with respect to freedom of speech and the open web. They are not mutually exclusive.

Porn bloggers are not tagging content as NSFW, despite what Tumblr claims.

Tumblr is unable to label blogs properly when self-tagging doesn’t occur.

It needs to break down “NSFW” into sub-categories so that innocent sites aren’t caught in porn nets. That’s because “porn” and “gay” don’t mean the same thing.

There is extreme porn on many Tumblr blogs -- some of it illegal.

Tumblr relies on bloggers to tag content as NSFW, and when users enable Safe Mode, it automatically filters out “adult themed” blogs. That’s good in theory, but in practice it doesn’t go nearly far enough. Fewer than 2 percent of the Tumblr porn blogs that my company classified were self-tagged as NSFW. Yet over 50 percent used the keyword NSFW in their metadata so that their blogs would be indexed by Google and other search engines. Meanwhile, 100 percent of them had porn-related keywords in either the domain or metadata. Either those behind Tumblr porn blogs purposely skip the tagging system so they won’t be filtered by people who would rather not see their content, or they don’t know it exists. Either way, Tumblr’s self-tagging system doesn’t work.

Yahoo recently made news by claiming to pull porn blogs from search. As reported by the Daily Dot last week, “under Tumblr’s new content restrictions, posts from Adult-rated blogs will no longer show up on tags. Any tags. Also, Tumblr is able to flag your account as Adult without you labeling it as such.” But keyword blocking doesn’t work, and hasn’t worked since parental control companies started using it in the mid-1990s. While “gay” is automatically blocked by Tumblr safe settings, keywords such as “cfnm” and other related porn-search terms are not, leaving that content accessible with safeguards enabled. The Daily Dot went on to say, “If your blog has been flagged as Adult, nothing you post will ever appear on Tumblr’s public tag searches. Your posts will only be visible to your own followers and the followers of people who reblog your content.”
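To see why keyword blocking fails in both directions, here is a minimal sketch of such a filter. The blocklist, the function name and the sample tags are all hypothetical, chosen only to mirror the two keywords mentioned above. This is not Tumblr’s actual implementation, which I have no visibility into.

```python
# Hypothetical blocklist for illustration only; not Tumblr's real list.
BLOCKED_KEYWORDS = {"porn", "nsfw", "gay"}


def is_blocked(tags):
    """Return True if any tag exactly matches a blocked keyword (case-insensitive)."""
    return any(tag.lower() in BLOCKED_KEYWORDS for tag in tags)


# False positive: a non-explicit blog is hidden because of one listed word.
news_blog_tags = ["gay", "rights", "news"]

# False negative: an explicit blog slips through because its niche term
# was never added to the list.
niche_porn_tags = ["cfnm"]

print(is_blocked(news_blog_tags))   # True
print(is_blocked(niche_porn_tags))  # False
```

A list like this is always one search term behind, and every word added to catch more porn risks hiding more innocent blogs.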

Even if Tumblr’s self-tagging worked, or it were able to correctly label adult content on behalf of bloggers, it would provide little benefit, as every porn blog is indexed by Google and all other search engines anyway. Not only is Tumblr failing to protect brands and minors inside its own site, it’s making every inch of its content accessible to every search engine without providing any tagging to make it easy for parental controls to spot the porn. It also means that brands whose ads end up on porn blogs will be exposed in search engines too.

For example, Google “porn” and narrow your search to the Tumblr site (here’s how: site:tumblr.com porn) and you’re confronted with 85 million results. And that’s just one search term.

What kind of porn am I talking about? Not just run-of-the-mill nude photos, that’s for sure.

There’s some extreme material. On a scale of 1 to 10, where youporn.com is a 5, some of the content on Tumblr would reach 9 and possibly 10. We’ve actually reported some Tumblr domains to the National Center for Missing & Exploited Children. By illegal, I’m referring to the blacklist of keywords that’s shared among industry stakeholders that partner with organizations such as the National Center for Missing & Exploited Children in the US and the IWF in the UK to help identify and block websites that contain images or videos of children being raped and abused.

Some Tumblr domains we classified also contain keywords the British Government may soon make illegal, if Prime Minister David Cameron’s plan to prohibit the possession or distribution of pornography depicting rape becomes law. As I write this, my company has amassed a list of hundreds of Tumblr domains that are classified as pornography and contain the keyword “rape.”

Let me emphasize that I am not advocating censorship. Tumblr wouldn’t prohibit anyone’s freedom of speech by better labeling content. It would simply make it easier for parental controls to identify adult websites to help people avoid the content they would rather not see. It’s about enabling adults to make better-informed choices for themselves and their families. It would enable schools to provide a safer Internet for students. It would help Yahoo! be more precise with its ad targeting while allowing it to avoid placing ads next to hardcore, or possibly illegal, content.

Yahoo! needs to build dedicated crawlers for Tumblr blogs. These crawlers should be intelligent enough to know the difference between a site that talks about adult content and one that actually contains adult content. Each blog would then be labeled according to the type of content on it: porn, sex, gambling, dating, etc.

Lumping everything into “NSFW” is far from ideal. Many adults wish to avoid pornography without excluding other adult-themed categories. I’m confident that lingerie site owners don’t want to be filtered out under the “all or nothing” methodology. And some consumers may wish to buy lingerie without stumbling upon content they deem inappropriate.
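As a rough sketch of what per-category labels would make possible, assume each blog carries a set of labels assigned by a crawler. The `Blog` class, the blog names and the `visible_blogs` function below are hypothetical illustrations, not anything Tumblr or Yahoo! exposes.

```python
from dataclasses import dataclass, field


@dataclass
class Blog:
    """A blog plus the content labels a crawler assigned to it (hypothetical)."""
    name: str
    labels: set = field(default_factory=set)


def visible_blogs(blogs, blocked_labels):
    """Return names of blogs that carry none of the labels the user chose to block."""
    return [blog.name for blog in blogs if not (blog.labels & blocked_labels)]


blogs = [
    Blog("lingerie-shop", {"lingerie"}),
    Blog("explicit-blog", {"porn"}),
    Blog("poker-tips", {"gambling"}),
]

# A user who blocks only porn still sees the other adult-themed categories.
print(visible_blogs(blogs, {"porn"}))  # ['lingerie-shop', 'poker-tips']

# A single all-or-nothing flag behaves like blocking every label at once.
print(visible_blogs(blogs, {"porn", "lingerie", "gambling"}))  # []
```

The filtering itself is trivial. The hard part is the crawler that assigns accurate labels in the first place.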

Some people want full, unrestricted access to pornography. They should have it. That’s freedom of speech. But some would prefer to avoid pornography, so they, too, should have the freedom to block it.