I wonder if the PostgreSQL development team had any idea.

After all, when you go about the incredibly difficult task of building (that is, coding) a relational database, you don't think: man, this thing will one day achieve mass popularity as the storage repository of choice for tens of thousands of real-money online poker players.

Do you?

You don't wonder how long it will be until a group of people in Austin, Texas, in a dilapidated basement not far from 6th Street...

...subvert your open-source database as a cog in the gearworks of an intricate online poker botting rig; nor how many other people will do the same thing in Dallas, and New York, and L.A., and London, and Paris, and Leipzig, and Hyderabad, and Rio, those elusive poker botting hatcheries springing up out of nowhere and from nothing, like underground massage parlors or Northern Cali marijuana grow-houses or seedy all-night poker clubs conducting business in smoke-filled back-offices in nondescript greater metropolitan area office parks. You wouldn't consider or even be aware of the thousands upon thousands of PokerTracker and Hold'em Manager users who'll flock to your database, learning its ins and outs, getting their hands dirty, occasionally even parlaying an end-user poker player's knowledge of a third-party online poker tool into a job as a full-time database administrator. And you'd never (in your hypothetical shoes as a PostgreSQL contributor) be able to predict that certain dedicated poker players would take it upon themselves to learn SQL and relational database theory/practice, delving into the arcane (to a normal person) minutiae of keys and joins and tables and DDL and DML, all in order to better use a third-party online poker tool originally written by a single developer with an MS Access database; now thankfully running on a PostgreSQL back-end.

These things happen, but they can't be predicted.

And I guess that the application of open-source technology to a game as viscerally competitive as poker is not without irony. Software-assisted poker isn't the noblest use of open-source technology that can be imagined...but it is one of the most interesting. The evolution of online poker from its naive roots into the tooled-up search-and-destroy seven-headed acid-spitting monster of today is fascinating, at times profound, colorful, even crazy. Degenerate. But always interesting...

And it all started with those pesky goddamn text files.

Since the dawn of online poker (more or less), players have had the ability to request hand histories—textual, step-by-step descriptions of each and every hand of poker played by the player while playing. After all, it's poker over the Internet for real money; some sort of formal, human-readable game record is essential. And early on, on some sites at least, the way you got these hand histories was over email, usually via sort of automated "Request Hand Histories" window. You'd go in and say "hey, give me my last 100 hands," and you'd get an email containing the textual history of those hands.

Bada bing.

And that was fine, for a while. It meant you could do post-mortem analysis, to a certain extent. If you played an interesting hand or won or lost big in a killer session, you could request the hand history and sanity-check your play or maybe email it to a friend; whatever; you had the record. But the thing which had earth-shaking ramifications for poker as a whole was that textual hand histories enabled the crowdsourcing of poker strategy. (Also, the Internet may have had something to do with it.)

You know what I mean by crowdsourcing. I'm talking about the Online Poker Tactical RFC. Here's how it works:

Take a textual hand history. Cut-and-paste it into a post in a poker forum. Ask for advice.

For the first time in history, players could get tactical and strategic feedback from dedicated players around the world, and not just generic advice, but specific advice pertaining to the exact play of a specific hand or poker scenario, taking into account all the factors: stack sizes, position, exact bet amounts, and so forth. And as you know if you've spent more than two seconds in a poker forum like CardsChat, this trend is alive and well to this day. (Sadly, the original home of the Tactical RFC, rec.gambling.poker, is now an empty lot littered with beer cans and cigarette butts. Spam City.)

Anyway, the idea of the hand history as a formal textual record of a single hand of poker gained consensus early on.

And not too much time passed before certain enthusiasts got the idea of subjecting these hand histories to computerized analysis. After all, it doesn't take a genius to figure out that rigorous number-crunching of a player's complete set of hand histories might yield useful and/or interesting data. And although hand history formats differed (and still differ) from site to site, within a site they were fairly uniform, which in turn meant that it wouldn't be too difficult to build a hand-history analyzer: a tool which a) reads in a bunch of hand histories and b) extracts meaningful data from them.

The idea was kicked around on rec.gambling.poker and other forums for a spell, and the early poker/programming pioneers descended into their basements and concocted little one-off pieces of software which could parse this or that kind of hand history from such-and-such an online poker site. I myself wrote (or started writing) no less than three of these. But no single tool achieved real mindshare until PokerStat was released in 2001. The original rec.gambling.poker thread announcing its release is like taking a glimpse into another world:

Hi all, I've been spending the past several months writing a Paradise Poker hand analysis software for Windows. It will have: 1. you and other's starting cards play

2. you and other's statistics like win rate in terms of money and big bets

3. ability to leave player comments and notes

4. you and other's play in the blinds (improve your EV when stealing blinds!)

5. how you and your opponents play when raising preflop And to be added... 6. tourney support (results plus hand histories)

7. filter to only view blinds play, exclude blinds play, view by position, etc...

8. list of players, see who the top players are and the worst players are in your database

9. other things to be added before the final release (any ideas?)

Ah, the innocence.

Now it's important to understand that as of 2001, hand history acquisition was still treated as something a player might occasionally want to do. The idea was that if you played an interesting hand or session, or if you had some question about how the money had been awarded in a certain pot, you could request the hand history. It was a manual, email-centric process and that was okay, because what would you do with a mountain of 250,000 hand histories anyway? Write a program to sort through them all? Ridiculous. And so the online poker sites provided no mechanism for generating or storing hand histories locally, on the user's machine, and they provided no mechanism for automatically requesting hand histories from the poker site.

PokerStat didn't change any of this. What PokerStat did do was:

Popularize the notion of the hand-history analyzer as a valuable standalone tool.

Invent the concept of automated hand history acquisition (see below)

Inspire other developers to build bigger-and-better hand history analyzers.

In other words, PokerStat paved the way for PokerTracker. The PokerTracker.com domain name was registered on 19 March 2002, roughly one year after the release of PokerStat. Shortly after that, the community took up the PokerTracker vs. PokerStat debate, as in the following rec.gambling.poker missive from 2003:

Hello, I am a registered user of PokerStat. I haven't used it for about three months, and have some questions about it and the new software called PokerTracker. PokerStat Questions: 1. Is there an update coming that will support Party and Stars?

2. If so, will it automatically capture and process histories from these sites like it does with Paradise (I love this feature)?

3. If #2 is coming, will this new version allow playing on tables from different sites simultaneously?

4. If #1 is true, will it allow different names on different sites (my Paradise login name was already in use when I signed up for Party and Stars)?

5 Am I a "lifetime" member now that I have version 2? I paid for the software initially, then I had to pay again to upgrade if I'm not mistaken. PokerTracker Questions: 1. Does it have an automated feature to request hand histories?

2. Does it have an automated feature to log in to a POP mail server (like PokerStat)?

3. Does it track multi-table tournament results on Stars?

4. If the answer to any of the above is "no", are these features planned for future versions? Questions about both: 1. Do these packages track session statistics?

2. If so, how well do they handle multiple tables at a time?

3. How do they handle hourly win/loss with multiple tables? Finally, what are your overall impressions of these packages? How useful are they compared to, say, tracking sessions on a spreadsheet? Thanks for any info. Phil

Again, at this point in time you still had to request hand histories manually from the poker site. But notice two of Phil's questions:

1. Does it have an automated feature to request hand histories?

2. Does it have an automated feature to log in to a POP mail server (like PokerStat)?

PokerStat was the first online poker tool to take a stab at automated hand history acquisition. You gave it your incoming mail server address along with a username and a password, and PokerStat would go out every so often and fetch/analyze your hand history emails. This functionality was duplicated in PokerTracker v1, and still exists today, six years and two major version releases later:

The appearance of this sort of feature in both of the major hand history analyzers of the day is indicative. What had happened is that players had started to realize that hand histories are a potential goldmine of information. It suddenly became important to request ALL of your hand histories, boring hands and interesting hands alike. And as you can imagine, doing this manually (by requesting hand histories, fiddling with email, etc.) over the course of an intensive 8- or 12-hour poker session was both tedious and error-prone. So in response to that, first PokerStat and then PokerTracker incorporated Import from Email and Auto-Request Histories functionality, and for a while this was a big selling point.

Now this idea of automated hand history acquisition and analysis is powerful; even revolutionary, at least within the world of poker.

It meant that hand histories were no longer strictly a subject for post-mortems. If you could automate the process of delivering and analyzing those hand histories, you could run your hand history analyzer on hands as you played them, in punctuated real time. If you could then complete the feedback loop by providing the player with real-time information on the opponents he's facing at the table right now, you'd have a full-fledged online poker assistant, an intelligent agent whose job is to:

Parse incoming hand histories Analyze the resulting data Communicate it to the player

This would ultimately lead to the development of the next piece of the online poker equation, the HUD (heads up display), first in primitive versions such as PokerStat's "manual HUD", and later with full-fledged HUDs such as PokerAce HUD (now integrated with PokerTracker) capable of delivering customizable visual information in all the colors of the rainbow. But that's another story.

The thing to take away from all this is that the hand history analyzers created a demand for instantaneous hand histories.

More and more players started requesting more and more hand histories, and more and more programmers built tools to help those players automate the process of requesting those histories (giving rise to an entirely new species of software, the standalone hand history grabber), and in general the number of histories being requested started trending sharply upward, and I think at some point the online poker companies realized that email is a clunky mechanism for hand history delivery. High-volume email consumes server resources, gobbles network bandwidth, and suffers from high latency. It's difficult to secure. It's messy. And there's absolutely no excuse for it, since by definition:

Every last piece of information necessary to construct a complete poker hand history is present locally, in the poker client running on the user's machine.

In other words, people in a position to do something about it realized that hand histories could and should be generated locally rather than transmitted remotely. It's better from the user's perspective, because he gets instant access to his hand histories; and it's better for the site, because all that expensive hand history processing logic is offloaded onto the client. Sort of a classic client/server scenario, really, and one wonders why the poker sites didn't implement it this way from the beginning. Whatever the reason, they eventually came around, and one by one, the major poker venues started generating hand histories locally. (There are still many sites today that do not, however.)

And it was good.

PokerTracker went on to win the First War of the Hand Analyzers, of course, and within a couple years it would corner the hand history analysis market (at least until the arrival of Hold'em Manager). PokerStat continued to enjoy (and still enjoys) a small following, and holds the honor of being the first widely-available hand history analyzer; but PokerTracker was the first hand history analyzer to achieve mass popularity, or what counts for mass popularity in the world of poker. I haven't been able to find reliable figures for how many copies PokerTracker has sold, but based on its sheer ubiquity, I believe it to be the most profitable piece of third-party poker software built to date, and one of the top three or four most interesting examples of indie development success I've ever come across. So Jeff Vogel, eat your heart out.

But to get back to PostgreSQL.

Pretty much every hand history analyzer that's ever been built can be described as a graphical user interface sitting on top of a database.

That database might be a proprietary or embedded data store (as in PokerStat); it might be a simple Microsoft Access database (as in early versions of PokerTracker); it might be a full-fledged PostgreSQL instance (as in Hold'em Manager and v2/v3 of PokerTracker). But there will always be a database of some kind. The mechanism works like this:

Input. The user imports his textual hand histories. Processing. The analyzer parses each history and stores a compacted representation in the database along with various other computed statistics. Output. Later on, the analyzer executes SQL against the database and displays the results to the user.

Once the hand history files have been imported and converted to database format (a process which usually involves taking the data for each hand and splintering it out across multiple tables, a process the programmers and DBAs out there will know as normalizing) the hand history analyzer never touches them again. It does all its work (statistical analysis and reporting and the displaying of charts and tables and graphs) from data stored directly in the database.

You might wonder why this sort of two-stage processing is necessary. Why even use a database at all? Why not just analyze the hand history text files directly anytime the user wants to see some data? (After all, the hand history contains all the information. Like a can of Prego spaghetti sauce, "it's in there".) And the answer of course is that:

Text files are opaque (they can't be easily queried/manipulated)

Parsing text is time-consuming

Text files live on a physical hard drive

Disk I/O is time-consuming

In other words, as various hobbyists discovered early on, parsing raw text files takes a lot of time. An active player can easily chalk up hundreds of thousands or even millions of hands over the course of his career, which means that somewhere on his machine there will be a folder containing multiple gigabytes of hand histories: endless streams and rivers and oceans of text (a lot like this article, come to think of it). And even though a robust parser can burn through hundreds of hand histories a second, it can still take hours to import a large collection. When it comes to online poker hand history analysis, you could say that everything is slow for large N.

And it means you have to have a storage bin to persist extracted history data in a durable format that allows it to be queried. There are several ways to handle this:

You could treat all hand history files as a single flat-file database.

You could build your own proprietary data store.

You could use an embedded relational database, invisible to the user.

You could use a full-fledged relational database, visible to the user.

Well no, I'm sorry. Pet peeve. There's exactly one way to handle this. Hand history data cries out for an open relational database with a documented schema running a standard SQL dialect and is extremely pissed off when it doesn't get it. You have to store everything in an well-documented standard format so that:

Players can query and manipulate the database directly

Players can (if necessary) administer/maintain very large databases

Third-party toolmakers can build add-ons that work with your database

You have a mechanism for the creation of custom statistics (custom SQL or SQL-like expressions)

Otherwise your data will kill you. No, I'm not saying give the user the option of using an RDBMS. I'm saying hog-tie the user and force him to use an RDBMS. It's what the user wants, even if he doesn't know it. Consider the following random forum post published two days before Christmas, 2003:

Also, does anyone know what PokerTracker uses as a back-end to store the data. I'm a database guy and was disappointed that I couldn't write my own queries against the PokerStat database because it is not just a standard Access (or other db type) database. Lastly, I guess I'm wondering if anyone has a stat tracker that is just in a regular db (I'd take Oracle or even DB2 if that's all I had access to...). Thanks

dave

And it's fascinating from a software development standpoint, because as software developers we've been instilled with a healthy fear of exposing databases to the end-user. I can hear it now.

Product Manager: So wait, let me get this straight. You want us to force our users, who are some of the laziest people on the planet, to install a full-fledged relational database management system??? On their home computer??? Like what, they're going to become DBAs? And you're calling that a feature? Well, why stop there? Why not just ship them our source code directly and force them to compile it on the COMMAND LINE? Every user is a programmer, right? Well? ARE YOU OUT OF YOUR F--KING MIND??

But online poker isn't a typical industry, and online poker players are anything but typical users. And apparently the PokerTracker people (well, at that time I think it was more of a single-man operation) saw the writing on the wall, because when PokerTracker was released, it did, in fact, ship with a relational database.

The only problem? That database was Microsoft Access.

Now the programmers out there will be rolling their eyes. I don't want to rant about how terrible a database Microsoft Access is, because it's not fair to make those kinds of value judgments when it was never intended to be a robust production database. Another mitigating factor is this: when PokerTracker was released, large hand history databases were a rarity. The size limitation of Microsoft Access and its choppy performance under heavy load didn't seem quite the deal-breakers then that they would today, now that multiple-hundreds-of-thousands-of-hands databases are commonplace.

Nevertheless, as PokerTracker started to gain market share, that innocuous Microsoft Access database started causing problems:

Its size limitation meant that large hand history collections had to be broken across multiple databases.

Performance degraded as the database grew in size

There were data durability/corruption issues

It got to the point to where you could almost hear the Microsoft Access back-end JET engine groaning under the strain. And before too long, the thing which had set PokerTracker apart from PokerStat—its open database—became it's single biggest flaw (the unresponsive single-threaded PokerTracker v1/v2 UI being a close second). Armchair programmers around the world started to make comments like: I could build a better PokerTracker. I'm a database professional and if I were to design PokerTracker I'd do X, Y, and Z. I can't BELIEVE they used Microsoft Access. And so on.

And then one day PokerTracker version 2.0 was released.

Poker Tracker comes ready to use with a Microsoft Access database that stores the statistics and text of your hand histories. For most users, the Access database is sufficient and works just fine. However, for more advanced users and users that want to store a huge amount of hands, a more robust database option has been a requested for a long time. Enter PostgreSQL...

Finally, at long last, PostgreSQL joined the fray.

Now my belief is that the nature of the hand-history-text-file-analysis problem meant that PostgreSQL was destined to enter the picture sooner or later. Consider:

You can't use a commercial database like Oracle or SQL Server because commercial database licenses cost money. You can't use SQL Server Express, MSDE, or other "light" versions of commercial databases. First, they're hobbled in terms of performance and capacity. Second, your power users would still have capacity problems. Third, "light" versions of commercial databases often have tricky licensing issues. Fourth, light versions of commercial database are almost always structured to upsell you into a commercial database. Fifth, the company behind the database can discontinue it at any time. You can't use any database which doesn't have a large developer and user community. You can't use a database which hasn't been field-tested across multiple versions. (I.e., no betas or avant-garde databases.)

There are only two databases that I know of which meet the above requirements. They are:

PostgreSQL

MySQL

Both of these databases are robust enough out of the box to handle copious hand history data. MySQL has the larger user base and development community. But when you consider MySQL's web-centric history, its lack of transactional and foreign key support, and its relaxed attitude with regard to data integrity, PostgreSQL emerges as the clear winner—at least when it comes to the task of storing and retrieving hand history data in a Windows desktop environment. In fact, let's just go ahead and say it:

PostgreSQL is the ideal database for hand history storage.

And as I was saying, PokerTracker 2 was a hybrid. It had support for Microsoft Access, and it had support for PostgreSQL. And for a long time PokerTracker was the only game in town, the only product that was functional enough and had enough of a user base to allow for the creation of lots of supporting "how to use PokerTracker" content in the form of forum threads, linkbait, blog posts, e-books, and so forth. And more and more players started jumping on the PostgreSQL bandwagon as by this point the problems with MS Access were pretty well known throughout the community. It was a sort of you mean you're still using an Access database? Come on man, you gotta go with Postgre. Get with the program! type thing.

Time passed...

And then in 2007, a guy by the name of Roy Goncalves released the beta for Hold'em Manager, the first serious challenger to the PokerStars throne.

It's funny how history repeats itself. In 2002-2003, PokerTracker was introduced as a "better" PokerStat. Five years later, Hold'em Manager was introduced as a "better" PokerTracker. This sort of technological leap-frogging is great for innovation, and most of the innovations Hold'em Manager brought to the table were intended to address some of the unpopular quirks of PokerTracker v2:

The single-threaded user interface which froze whenever hand histories were being imported

The lack of an integrated HUD (PokerTracker v2 HUDs were add-ons)

Various layout and display issues

Various database/schema issues

Now I don't want to get into the subject of whether PokerTracker was/is better than Hold'em Manager or vice-versa. I just want to point out that Hold'em Manager followed PokerTracker's lead and used PostgreSQL natively from day one. Around the same time (2007), the PokerTracker team started gathering feedback for the long-awaited (some would say overdue) PokerTracker v3.0; which is the current version of the product as of this writing. And whereas PokerTracker 2 had given the user a choice, PokerTracker 3 drew a line in the sand:

Attention all users: PostgreSQL is mandatory!

Which meant that the two most popular tools in online poker—a market famous for shoddy software and grumpy, stubborn users—have both standardized on an open-source database for the prosecution of real-money, real-time online poker intelligence, or whatever you call the blend of services that PokerTracker and Hold'em Manager provide. Which means that all that robust open-source code running in the PostgreSQL transactional engine? It's running on behalf of a competitive poker player named Joe Bob living in Utah. And twenty thousand more like him. Those JOINS are helping poker players improve their game.

As if that wasn't random enough: Joe Bob the poker player is actually learning SQL.

With the advent of custom statistics in the PokerTracker 3 (and to a degree, custom reports in Hold'em Manager), with the publishing of the PokerTracker 3 database schema and statistical reference, and with the increasing sophistication of PokerTracker/Hold'em Manager setups and custom HUDs and statistical arrangements in general, the utility of SQL has gone way up for the serious player. You don't need to know a lick of SQL in order to use PokerTracker 3 and/or Hold'em Manager; you can point and click your way to some pretty advanced setups. But if you want to get the most out of PokerTracker... You Will Need To Know:

What a table is

What a column is

What a primary/foreign key is

What a JOIN is

Etc.

And if you plan on doing any serious custom work with PokerTracker then SQL is an absolute necessity.

All of this, all of the above, if you're still reading, all of it; it makes me realize what I've suspected for years without ever really vocalizing: the online poker industry is without a doubt the strangest application of computer science and software development technique to a particular domain that I've ever witnessed or come across...with the possible exception of EVE Online. Poker has always been a darling of the research crowd, from the days of Johnny von Neumann; but when you add the Internet and the prospect of real money to the equation, the game of poker tumbles down the rabbit hole of weirdness, never to emerge. I haven't even touched on:

How the widespread availability of hand histories coupled with a demand for large hand history databases produced a black market for illicit hand histories shared/pooled across many users

How that same demand gave rise to a new species of software called hand history data miners: tools (such as the cleverly-named IdleMiner) whose only purpose is to "watch" one or more tables of poker and squirt out a textual hand history.

How those hand history data miners enabled the creation of independent and/or unauthorized "player tracking" sites like Sharkscope, OPR, PTR, and the PokerDB.

How the poker botting crowd learned to use hand history databases to bootstrap basic strategy and opponent modelling.

But I hope I've made it clear that the online poker user experience has, to a large degree, been shaped by the need to generate, transmit, store, parse, convert, analyze, anonymize, visualize, and broadcast the humble text file; and that this has created a sort of gravity which slowly and over time has pulled users into the orbit of PostgreSQL, a segue which is random, bizarre, slightly perverse, and completely awesome. Hand histories are the pieces of the online poker jigsaw puzzle. We were just waiting for the right database to put that puzzle together...