The definitive list of the most interesting and innovative people in data

We love data – a lot. And when we realized there was no good list of the awesome people in data we were horrified. There are so many cool people out there doing amazing things with data – how has no one made a list yet?!?!

We get to meet lots of exciting data innovators at Extract, so we decided to use our knowledge to create the first Data Mavericks list. A lot of thought went into choosing people who are truly making a difference in the data scene.

It was no easy task narrowing our list down, but 40 under 40 seemed like a good place to start. Here they are. In alphabetical order, please enjoy our list of the Top 40 Data Mavericks under 40.

Aaron Koblin (33), CTO at Vrse

If you haven’t heard of Vrse, you need to check them out. Right now. They’re a virtual reality and spherical filmmaking production company working on creating immersive experiences. These guys have become industry leaders in short-form commercials, music videos, feature films, theatre, design photography and fine arts. As CTO, Aaron uses real-world and community-generated data to reflect on cultural trends and on the relationship between humans and the systems they create. His projects are seriously cool and his TED talk has more than 1 million views.

Twitter: @aaronkoblin

Aaron Levie (29), Founder of Box

It’s hard to argue with the Maverickness of someone who’s worth over a billion at the young age of 29. Aaron is the creator of Box, a startup focused on enabling secure access to data and applications on the web, mobile, and in desktop environments. In just a few short years he’s gathered more than 20 million users spread out among 180,000 businesses. They’re using the Box platform to upload files, collaborate, and share content online. Oh and by the way, Box has customers at 97% of the companies on the Fortune 500. Yeah, we’re impressed too.

Twitter: @levie @boxHQ

Website: box.com

Alex White (26), Founder of Next Big Sound

Think Moneyball, but for music. Alex is the creator of Next Big Sound, which is using data to revolutionize the music industry. He takes all the data spewing into the ether – Pandora spins, Facebook likes, digital downloads – and packages it into one central dashboard. NBS can predict which artists will become popular, which chat shows contribute to an artist’s career and which artists brands should use for their marketing efforts. In short, he’s bringing the power of data analytics to a traditionally “gut-based” industry. He was so good, in fact, that NBS (and Alex) was acquired by Pandora this year.

Twitter: @mralexwhite @nextbigsound

Website: nextbigsound.com

Alice Zheng (under 40), Director of Data Science at Dato

Dato is one seriously powerful machine learning and graph analytics tool. As their Director of Data Science, Alice is helping to develop a fast, scalable machine learning platform for Big Data analytics to help ease the dependence on expertise. She’s actually trying to automate herself! Dato is bringing machine learning to the masses by making learning algorithms more automated, their outputs more interpretable, and the labeling tasks simpler. We just love it.

Twitter: @rainydata @datoinc

Website: dato.com

Andrew Ng (39), Chief Scientist at Baidu

Chinese Internet-search giant Baidu is attempting to do the seemingly impossible – take on Google. And they’re doing it with the help of Andrew Ng, a Stanford researcher who founded Google’s artificial-intelligence unit. Now, Andrew heads up an artificial-intelligence center in Silicon Valley where he is the leading voice in “deep learning,” a branch of artificial intelligence in which scientists try to get computers to “learn” for themselves by processing massive amounts of data. He also co-founded Coursera, and his ML course has been the inspiration and introduction to data science for more than 100,000 people. Fun fact: Andrew was part of a team that in 2012 famously taught a network of computers to recognize cats after being shown millions of photos.

Twitter: @andrewyng

Website: baidu.com

Ann Miura-Ko (38), Partner at Floodgate

Ann Miura-Ko has been called “the most powerful woman in startups” by Forbes and is a lecturer in entrepreneurship at Stanford. She specializes in investing in e-commerce as well as big data analytics companies. Some of Ann’s investments include Lyft, Ayasdi, Xamarin, Refinery29, Chloe and Isabel, Maker Media, Wanelo, TaskRabbit, and Modcloth. She makes our Mavericks list for taking chances and investing in companies with strong data skills. Her LinkedIn says she’s an “investing ninja assassin” – and we agree!

Twitter: @annimaniac

Website: floodgate.com

Anthony Goldbloom (32), Founder of Kaggle

Kaggle started as a simple idea: why not leverage the brains and free time of data science nerds to help companies solve their data issues? So Anthony created Kaggle, which hosts competitions for data scientists based on real issues and funded by corporates who need help. Anthony seeded the first competition with his own money; Kaggle now uses predictive modeling competitions to solve problems for the likes of NASA, Wikipedia, Ford and Deloitte. Their user base exceeds 100,000 data scientists and enthusiasts. Kaggle is considered the biggest and only “data science university”. It’s a true place to train new data scientists.

Twitter: @antgoldbloom @kaggle

Website: kaggle.com

Azmat Yusuf (33), Founder of Citymapper

Azmat is reinventing the transport app for the world’s most complicated cities. Citymapper can apply its unique algorithm to any city that has open source (free to everyone) transport data. According to Azmat, they can basically plug any data set into their app and it will “just work”. How many people can say that?!? We also love the app’s humorous side. For example, Citymapper will tell you how long it will take to get to your destination via jetpack – a figure based on real jetpack calculations!

Twitter: @azmingo @CityMapper

Website: citymapper.com

Brian Chesky (33), Joe Gebbia (33) & Nathan Blecharczyk (31), Founders of Airbnb

Airbnb makes our Mavericks list because it’s a platform that threatens the status quo of a very old industry. Not to mention they’re doing some pretty cool stuff with data. To operate at scale, Airbnb needs to ensure that its ability to match hosts with guests keeps improving over time. It achieves this by gathering better data on its users and improving the algorithms that match the two sides. Leveraging a strong curation system and improving on its ability to scale using data, Airbnb has successfully booked more than 10 million nights and eaten a noticeable chunk of global travel.

Twitter: @bchesky @jgebbia @nathanblec @airbnb

Website: airbnb.com

Christian Rudder (under 40), Co-founder of OKCupid

Christian has personal possession of one of the richest data sets in the world. As a co-founder of one of the largest dating sites ever created, OKCupid, he has accumulated data not only from his dating site, but also Twitter, Tumblr, Facebook, and the like. He also authors their fantastic blog, OKTrends, and the popular book Dataclysm, both on the data behind dating. What makes him a true Maverick is that he’s not afraid to use data to tackle tough topics like race, sex and age.

Twitter: @christianrudder @okcupid

Website: okcupid.com

Curtis Lee (37), Founder of Luxe Valet

If you’ve ever struggled to find a parking spot in one of America’s most crowded cities, Curtis may be the answer to your prayers. He’s created a free app that offers on-demand valet parking in 9 US cities. Definitely useful, but he makes our Mavericks list for his app’s ability to analyze city demand patterns. Their algorithm can accurately predict where they need the most valet drivers at what times of day – making the service highly efficient.

Twitter: @curtislee @luxevalet

Website: luxe.com

Danielle Morrill (30), Founder of Mattermark

Mattermark is a premier database that rates startups and private companies. Founder Danielle’s job is to constantly analyze the value and growth of startups to help cutting edge venture funds make smarter investments through data science. With an unprecedented amount of data, Mattermark’s blog has become a go-to source for insights into the startup and investor scene.

Twitter: @DanielleMorrill @Mattermark

Website: mattermark.com

Drew Conway (32), Data Science Advisor

A leading expert in the application of computational methods to social and behavioral problems at large scale, Drew has been writing and speaking about the role of data for years. In addition to being an advisor, he has also been the Scientist-in-Residence at IA Ventures. Drew’s cutting-edge research focuses on gaining a better understanding of the dynamics of groups engaged in illicit activity, particularly terrorist and criminal organizations.

Twitter: @drewconway

Website: drewconway.com

Edward Snowden (32), NSA Whistleblower

Few have done more to bring data to the forefront of the public eye than Edward Snowden. A former CIA employee and NSA contractor, Edward is responsible for leaking details to the media of extensive internet and phone surveillance by American intelligence. The effects of the leak were profound – the N.S.A.’s invasive call-tracking program was declared unlawful by the courts and disowned by Congress. Despite his continued exile from the US, Snowden has remained outspoken on government surveillance.

Website: nsafilesdecoded

Elizabeth Holmes (31), Founder of Theranos

Elizabeth is the world’s youngest female billionaire. Her company, Theranos, has developed a faster, cheaper and less painful way of doing blood tests. Their methods allow them to quickly test a drop of blood at a fraction of the price of commercial labs, and they can run up to 70 different tests. What makes her a Maverick is the way she makes this data available to the end user, taking data on the self to a whole new level.

Hilary Mason (35), Founder of Fast Forward Labs

Fast Forward Labs is a machine intelligence research company helping organizations accelerate their data science and machine intelligence capabilities. But this is not your average data consulting firm. The team at FFL are taking the most cutting edge ideas – straight from academic research – and applying them to problems in the real world. Which means FFL is actually helping their clients become data Mavericks!

Twitter: @hmason @fastforwardlabs

Website: fastforwardlabs.com

James Park (37), CEO & Co-Founder of Fitbit

Thanks to CEO James Park, Fitbit has consistently outrun its competition in the wearable tech market – they sold 67% of all full-body activity trackers in 2013. Their recent IPO opened up 52%, putting it among the 10 biggest opening days for an IPO so far this year. Leading up to the NYSE debut, James uploaded a trove of personal data that documented his heartbeat, sleeping schedule and biometrics.

Twitter: @parkjames @fitbit

Website: fitbit.com

Jeff Hammerbacher (30), Founder and Chief Scientist at Cloudera

As the founder of Facebook’s data science operation, Jeff has always had a knack for data innovation. The lack of commercially available data storage and analysis inspired him to start Cloudera, an enterprise software company that provides analytical data management using Apache Hadoop. The company’s enterprise data hub (EDH) software platform empowers organizations to store, process and analyze all enterprise data, of whatever type, in any volume — creating remarkable cost-efficiencies as well as enabling business transformation.

Twitter: @hackingdata @cloudera

Website: cloudera.com

John Foreman (under 40), Chief Data Scientist at Mailchimp

John does all sorts of awesome data science stuff for MailChimp. At the moment, he’s leading their data product development effort on a project called the Email Genome Project. EGP is constantly analyzing millions of email lists and billions of email addresses in an attempt to uncover stories and hidden trends. In addition, John makes the technical side of data science fun through his blog Analytics Made Skeezy and book Data Smart.

Twitter: @john4man @mailchimp

Website: mailchimp.com

John Zimmer (31), Co-founder of Lyft

Like its rival Uber, Lyft has been capitalizing on the data it collects through its ride sharing app. Specifically, their data science department is trying to cut down on traffic (and our carbon footprint) in crowded cities like San Francisco. They recently introduced HotSpots, where riders get picked up at designated intersections for a fixed price.

Twitter: @johnzimmer @lyft

Website: lyft.com

Kathryn Parsons (32), CEO of Decoded

More than 20,000 people worldwide have taken Decoded’s hit courses – which include Data in a Day, Code in a Day and Hacker in a Day. Attendees have included staff from Facebook, Disney, the BBC and the IMF. Kathryn is a Maverick, not only for creating one of the most popular executive education companies in the world with offices in London, NYC and Sydney, but also for being a champion of employee wellbeing and the UK startup market. What’s more, coding is becoming a more level playing field in terms of gender, as women make up over 50% of the Decoded team and half of the course attendees. A good sign for the next generation of female techies, developers and coders.

Twitter: @kathrynparsons @DecodedCo

Website: decoded.com

Mark Zuckerberg (31), Founder of Facebook

Zuckerberg is obviously best known for founding Facebook, but he makes our data Mavericks list for his continued innovation in using the data Facebook collects to push the envelope. Their latest venture is giving YouTube a run for its money by peppering our newsfeeds with a curated list of video content. Facebook is using their vast array of personal and demographic data as well as behavioural data to show you the perfect videos you didn’t even know you wanted to see!

Twitter: @finkd @facebook

Website: facebook.com

Matthew Hurst (under 40), Author of Data Mining

Matthew is an avid data blogger on topics such as social media and data mining. As one of the scientists at Microsoft’s Live Labs, he’s interested in all things data and artificial intelligence. He brings a unique and refreshing voice to the world of data science, blogging on everything from book reviews to current events and the complexity of the web.

Twitter: @matthewhurst

Website: datamining.typepad.com

Matt Sundquist (28), Founder of Plot.ly

The Plotly API enables users to analyze and visualize data in one place, and forms an important step in building the infrastructure for data science to be further democratized. Matt and his team make our Mavericks list for creating an incredible platform where users share their content and learn from the content of others. Not to mention they’ve made major strides in lowering the barrier to entry for complex data visualization.

Twitter: @plotlygraphs

Website: plot.ly

Michael Schmidt (32), Founder and CEO of Nutonian

Demand for statisticians and data experts currently outstrips supply. According to the boffins at McKinsey, the shortfall in the U.S. alone could reach 190,000 workers by 2018. So Michael Schmidt decided to create an automated data scientist (called Eureqa) that can take in observations about the world and spit out theories to explain them. Armed with automated, transparent answers about what’s causing business outcomes, Eureqa users are empowered to make mission-critical changes to strategies and processes.

Twitter: @nutonian @eureqa

Website: nutonian.com

Monica Rogati (under 40), Former VP of Data at Jawbone

After creating one of the most intelligent job-networking systems in the world – the code that sifted through LinkedIn profiles and magically recommended “People You May Know” – Monica joined Jawbone as their VP of Data. While she was there, she was responsible for analyzing people’s health behavior patterns and trying to figure out how the data can help them meet their goals. Now, she’s the Data Science advisor at Insight Data Science and Equity Partner at Data Collective, helping people to turn data into products and stories.

Twitter: @mrogati @DCVC @InsightDataSci

Nate Silver (37), Editor-in-Chief at FiveThirtyEight

Nate Silver is a number-crunching prodigy who went from correctly forecasting baseball games to correctly forecasting presidential primaries. He’s become a leading statistician through his innovative analyses of political polling. In addition, Nate is pioneering the new field of data journalism with his award-winning website FiveThirtyEight – applying data to everything from sports to current events. He makes our Mavericks list for making data analysis fun and accessible to anyone and everyone.

Twitter: @natesilver538 @FiveThirtyEight

Website: fivethirtyeight.com

Nathan Yau (33), Statistician at FlowingData

In a nutshell, Nathan is trying to make data understandable to those who aren’t necessarily data experts – a mission we can all support. He does this through the visualizations he posts on his popular blog FlowingData, which features tutorials on how to make useful graphics and how to use the tools to do so. What’s most interesting about FlowingData is Yau’s unique commentary as a professional working in the field of data.

Twitter: @flowingdata

Website: flowingdata.com

Neil Patel (30) & Hiten Shah (under 40), Founders of Kissmetrics

Kissmetrics delivers key insights and timely interactions to turn visitors into customers. It’s used by sites to track the number of visitors, what those visitors do on the site, and where they come to the site from. Kissmetrics goes far beyond what most marketing software has done before, giving users a nontechnical way to do something with all that data.

Twitter: @neilpatel @hnshah @kissmetrics

Website: kissmetrics.com

Nick Sinai (36), former U.S. Deputy Chief Technology Officer, Office of Science and Technology Policy, White House

Now a venture partner at Insight Venture Partners, Nick once led President Obama’s Open Data Initiative to liberate data for innovation and economic growth. He secures his place on our Mavericks list for his work advancing data innovation in health, energy, education, and finance. Nick also led the Open Government Initiative to ensure the Federal Government is more transparent, participatory, and collaborative. And he helped start and grow the Presidential Innovation Fellows program, which brings entrepreneurs and technologists into the Federal Government.

Twitter: @nicksinai @WhiteHouseOSTP @InsightPartners

Website: insightpartners.com

Rand Fishkin (36), Founder of Moz

Rand would probably describe himself more as a Wizard than a Maverick (seriously, that’s his job title) – but we thought he deserved a place on our list anyway. He created the popular inbound marketing software Moz, and we’d say his efforts have had a huge impact on the search marketing industry – specifically when it comes to how they use data for insights. Rand has helped push the envelope with regard to knowledge, collaboration, and the development of toolsets that help us all succeed.

Twitter: @randfish @moz

Website: moz.com

Riley Newman (under 40), Head of Data Science at Airbnb

OK, we’ve already covered Airbnb, we know. But as their first data scientist, Riley is doing some super cool stuff and we think he deserves his own place on our Mavericks list. His team (which he built from the ground up) works cross-functionally with all sides of the company, and has contributed to initiatives ranging from the implementation of Hadoop and other data technologies to architecting Airbnb’s data warehouse and constructing frameworks for setting operational and product strategy. As he once famously said, “data is the lifeblood of the business” – we couldn’t agree more, Riley!

Twitter: @rileynewman @airbnb

Website: airbnb.com

Rob Barry (under 40), Investigative Reporter & Co-Chief of Data Journalism at Wall Street Journal

Rob is the most recent winner of Best Individual Portfolio at the annual Data Journalism awards, and we think he totally deserved the win. As a reporter for the WSJ he’s contributed to three ground-breaking public-service investigations, sparking changes in Wall Street regulatory practices, an investigation into Medicare practices and potential shifts in the way the FBI gathers data. Hard to get more Maverick than that!

Twitter: @rob_barry @WSJ

Website: wsj.com

Scott Crouch (23), Matt Polega (24) & Flo Mayr (24), Co-founders of Mark43

Mark43 are the makers of Cobalt, a revolutionary police software platform that combines records management with analysis tools in a seamlessly integrated suite. Scott and his fellow Co-founders help law enforcement do sophisticated data analysis, such as mapping the hierarchy of gang members to make more targeted arrests and giving officers on the street critical information about dangerous suspects.

Twitter: @scottcrouch @mattpolega @mark43

Website: mark43.com

Sean Ellis (under 40), Founder of GrowthHackers.com

As the first marketer at Dropbox, Lookout, Xobni, LogMeIn (IPO), and Uproar (IPO) (as well as interim marketing exec roles at Eventbrite, Socialcast, and Webs), we think it’s fair to say that Sean knows a little something when it comes to marketing. He even coined the term “Growth Hacker” – i.e. someone who uses data to market. Inspired by the popular sharing site Hacker News, Sean created growthhackers.com – a community where hacker-minded marketers come to share and learn from the best in the industry.

Twitter: @seanellis @GrowthHackers

Website: growthhackers.com

Sean Kandel (under 40), CTO and Co-founder of Trifacta

Before data scientists can uncover correlations, anomalies and other signals hidden in large volumes of data, they must perform tedious and time-consuming work: getting the data ready for analysis. But Sean Kandel has created software (Trifacta) that automates the process of transforming data from database sources like Hadoop into something that can be easily digested by software visualization and business intelligence tools. Making clean up fun? Yeah, that’s pretty Maverick.

Twitter: @seankandel @trifacta

Website: trifacta.com

Steve Huffman (31), Co-founder and CEO of Reddit

Reddit is one of the Internet’s most visited websites with over 170 million regular monthly users. It has become a repository for all sorts of content and discussion as well as a catalyst for free speech. And who do we have to thank for this revolutionary website? Steve Huffman of course, who co-founded it when he was just 21!

Twitter: @reddit

Website: reddit.com

Suhail Doshi (28) & Tim Trefren (28), Founders of Mixpanel

In just three years, Suhail Doshi and Tim Trefren built Mixpanel into a real-time mobile and web analytics platform which analyzes around 6.2 billion actions a month for more than 550 customers. The former Y Combinator startup has customers including Airbnb, Quora and Jawbone. Their claim to fame is codeless mobile analytics, meaning customers can use a point-and-click interface to identify the interactions that they want to track in their Android or iOS app.

Twitter: @suhail @ttrefren @mixpanel

Website: mixpanel.com

Tammy Camp (36), Distribution Hacker at 500 Startups

With an extensive background in user acquisition and analytics infrastructure implementation, Tammy is a growth and distribution expert. She uses her powers of growth as an advisor to numerous startups in the security, consumer Internet and executive research fields.

Twitter: @tammycamp @500Startups

Website: 500.co

Travis Kalanick (38) & Garrett Camp (36), Founders of Uber

This duo, best known as the founders of revolutionary car service Uber, are on our Mavericks list for their devotion to using data in every way possible. Specifically, we’re impressed by their algorithm, which sets prices based on a balance of supply and demand. Dynamic pricing is nothing new, but Uber’s algorithm is the company’s greatest asset and most significant innovation, allowing it to find the price that will attract drivers without alienating customers. Uber’s use of data might not always be seen as PC, but that’s how you know you are truly innovating.

Twitter: @travisk @gmc @uber

Website: uber.com

Well there you have it. This is in no way a definitive list (that would be impossible!), but we think it’s a pretty good start. If we’ve left off a Maverick you know, tell us about them in the comments below. We always love hearing about new people doing cool stuff in data!

Extract

Early bird tickets for Extract SF (October 30th) are on sale now! The conference is all about quality content and high energy stories. Our talks will showcase the innovative ways people have used data to stave off the competition, grow billion dollar companies and build killer products. You can check out our first batch of innovative speakers (some of whom are on this list) on the Extract website.