Applying Artificial Intelligence to Handicapping NES Tetris

[ About | Sanity Check | Hacking the Game | Training the AI | Calculating Handicap | Videos | Download | Run ]

About

What would happen if you sat down for a head-to-head match against one of the NES Tetris masters in the Classic Tetris World Championship? Unless you possess lightning-fast fingers and savant-level stacking skills, expect nothing less than total and complete defeat. But what if there were a way to even the playing field, to enable players of different proficiency to compete on equal terms? In an email discussion with Trey Harrison—the distinguished CTWC pro and the techie behind the graphics streamed to the world during the competition—he proposed a way:

To make the match more even, the really good player could have the floor of their well raised up N rows. So, instead of 20 rows of space to work within, they only have (20 - N). I think it's obvious that this will lower the overall total score you can expect from that player, but by how much? Would 1 row make much of a difference? How about 4? Seems like 4 rows would make a massive difference at level 19.

With the help of my NES Tetris AI, I explored these questions. And in this article, I present my findings.

Sanity Check

Before I got started, I contemplated if eliminating rows would make the game unplayable. And, perhaps more importantly, if the skills that a player attained while practicing on the standard 20 rows are translatable to shorter playfields. Game Boy Tetris provided some insights. Due to the handheld system's limited screen resolution, its playfield is only 18 rows tall. And pieces spawn on the second row from the top, effectively shrinking the playfield to 17 rows. Nonetheless, it’s considered one of the best versions of the game, even the one favored by Tetris creator Alexey Pajitnov. Most players are oblivious to the smaller size, suggesting that difficulty is not significantly impacted by the removal of a few rows.

What about more than a few? Tetris DX, the Game Boy Color enhanced version of the original Game Boy Tetris, supports a mode that challenges the player to clear 40 lines at a chosen drop speed and playfield height. The height can be reduced all the way down to 8 rows. The result is a lot more difficult, but it’s playable and surprisingly fun.

Another thing that I considered is the rise and fall of the stack as standard games plays out. The surface of the stack is a barrier, a sort of artificial floor, that intermittently emulates the effects of shorter playfields. This is particularly notable when skilled players build deep wells while waiting for long bars; gameplay gets squeezed into a very tight space. These transient yet frequent experiences might equate to training on all playfield sizes.

With these encouraging observations in mind I wanted to experience NES Tetris with different heights...

Hacking the Game

Since the game does not provide an option to adjust the playfield height, it was time to dive into the source. A single byte change is all that is necessary to prevent the pieces from falling all the way to the bottom. The snippet of code below shows the vertical range check applied to the 4 squares of each Tetrimino. The value 20 ($16 in hex) at $94B3 defines the deepest row. And, if you’re wondering, the −2 check applies to the 2 hidden rows above the playfield.

94A6: LDA #$04 94A8: STA $00AA 94AA: LDA $8A9C,X 94AD: CLC 94AE: ADC $0041 94B0: ADC #$02 94B2: CMP #$16 94B4: BCS $94E9

The image below shows the result of halving the value.

I experimented with different ways of making the lower portion of the playfield appear solid. Since I ultimately intended to connect this to the AI project that I developed earlier and that project interfaces with the game through the Nintaco NES emulator, I ended up simply drawing a gray rectangle using the graphics functions provided by that emulator's API.

The first thing I noticed while playing is how often the music tempo changes since the stack frequently approaches the ceiling. More annoyingly, when the switch happens, the song starts over from the beginning. Tetris DX’s 40-Lines mode suffers a similar effect when the playfield is very small. And that game contains 2 intensities of panic music resulting in an audible mess.

I decided to disable the tempo changes. When a piece locks into place, the routine listed below executes. It scans across row 5 (the sixth row from the top) and if any solid blocks are detected, then the music speeds up. If that row is empty, the music resumes normal pace.

9D17: LDX #$05 9D19: LDA $96D6,X 9D1C: TAY 9D1D: LDX #$0A 9D1F: LDA ($B8),Y 9D21: CMP #$EF 9D23: BNE $9D3C 9D25: INY 9D26: DEX 9D27: BNE $9D1F 9D29: LDA $00BA 9D2B: BEQ $9D50 9D2D: LDA #$00 9D2F: STA $00BA 9D31: LDX $00C2 9D33: LDA $85D2,X 9D36: JSR $9E07 9D39: JMP $9D50 foundBlockInRow: 9D3C: LDA $00BA 9D3E: BNE $9D50 9D40: LDA #$FF 9D42: STA $00BA 9D44: LDA $00C2 9D46: CLC 9D47: ADC #$04 9D49: TAX 9D4A: LDA $85D2,X 9D4D: JSR $9E07 9D50: RTS

The scanned row index is at $9D18. I set it to 0. Meaning, game over will occur before the tempo gets an opportunity to change.

I also doctored the score counter to display values beyond max out. The hack is only relevant to the AI and human Tetris masters.

The 6-digit score is stored using packed BCD. And while doing BCD addition, digits may temporarily contain values larger than 9. Normally, the overflow gets carried into higher digits. However, as seen in the code fragment below, if the uppermost digit is 10 or greater, then all digits become 9.

9C84: LDA $0055 9C86: AND #$F0 9C88: CMP #$A0 9C8A: BCC $9C94 9C8C: LDA #$99 9C8E: STA $0053 9C90: STA $0054 9C92: STA $0055

Part of the comparison involves isolating the uppermost digit from the packed byte. This is done by masking the byte with $F0. By changing the mask to $00, the check will always fail. And due the arrangement of tiles in the pattern tables, that modification causes the uppermost digit to display as a hexadecimal value.

The digits A, B, C, D, E and F correspond to 1,000,000, 1,100,000, 1,200,000, 1,300,000, 1,400,000, and 1,500,000, respectively. Beyond that, the score will roll over.

On various websites, I’ve encountered Game Genie code ENEOOGEZ that does something similar. But, instead of clearing the mask, it changes the value compared against the uppermost digit from 10 to 15. That means if the score ever reaches 1,500,000, the check will succeed and the score will erroneously max out at 999,999.

My version is equivalent to Game Genie code AEEPNGEY. The other changes above could also be expressed as Game Genie codes; however, I implemented all these hacks using the AccessPointListener provided by the Nintaco API.

With the modifications in place, I got back to playing. As expected, removing a few rows from the playfield is hardly noticeable. But the experience also revealed that there’s a tipping point where difficulty ramps up.

Training the AI

When a piece spawns, the AI simulates every possible placement of the current piece and the next piece. The resultant piles are ranked using a weighted sum of various metrics such as height, density, surface roughness, well depths, and so on. Ultimately, the AI directs the current piece to the position accordant with the highest ranked pile.

“Training the AI” amounts to tuning the weights used in the sum. This was accomplished using Particle Swarm Optimization (PSO). It evaluated candidate weights by simulating 100 games, each of which started on level 19 and either deliberately ended at the transition to level 29 or ended prematurely with a top out.

I initially setup the PSO to rate each candidate by its average score. However, that method caused the candidates to migrate toward cautious play. To rake in points, an AI needs to diligently pursue simultaneous line clears. And that demands repeatedly building deep wells, a technique that can bring the game to an abrupt end, dragging the average down with it. Getting a decent average meant being risk-averse.

To get around that, I configured the PSO to use the average of the best 33 out of the 100. That method focused on the subset of candidates that were exposed to favorable piece sequences, the ones that didn’t need to act conservatively. The result was an aggressive, risk-taking AI capable of stratospheric scores. However, when I plotted its probability density function—a normalized histogram, showing the odds of attaining any particular score—I ended up with this:

The large peak confirms its high-scoring potential. But the broad region on the left reveals that the AI often fails early. The plot shows a bimodal distribution indicative of 2 separate groups: the upper-third that hung on long enough to score high and the lower-two-thirds that took too many risks.

To suppress this effect, I instructed the PSO to use a weighted average across all 100 scores, biasing it toward the high end. That produced something much closer to a normal distribution:

It’s mean, median and standard deviation are 1,036,706, 1,061,820, and 149,458, respectively. The low standard deviation and high mean implies that it plays well consistently. Since the mean rests a bit below the median, the curve is somewhat negatively skewed. Fitting a Gaussian to it yields:

The Gaussian’s mean is 1,066,136 and its standard deviation is 67,656. On the left side, the red line rides slightly above the blue, exposing that AI can still fail early, though not very often.

The Gaussian fit also provides answers to some interesting questions. For example, reaching level 29 via exclusively achieving Tetrises happens roughly only once every trillion games! Consistent play, unfortunately, means that absolutely extraordinary games are just as rare as the bad ones.

Integrating the original probability density function from a given score to infinity produces this cumulative distribution:

It depicts the total probability of the AI getting or exceeding a particular score. For instance, the chances of a max out (999,999 points) is 80%.

I extended this training technique to obtained separate optimized weights for each playfield height.

Calculating Handicap

Below, I plotted the average score of the AI at all playfield heights.

Since it’s S-shaped (sigmoidal), I fitted a logistic curve to the values:

Finally, I normalized the logistic curve’s scale:

A player’s expected average score at a particular height is the handicap percent of the player’s average score at the standard height. For example, at Tetris DX’s playfield height (18 rows), you should be able to get about 96% of what you normally achieve at 20 rows. But at Game Boy Tetris’s height (effectively 17 rows), expect about 91%. For the smallest playfield setting in Tetris DX’s 40-Line mode (8 rows), anticipate only around 4% of your typical score. The curve reveals everything in between.

To calculate a meaningful average score, all the runs must begin on a common starting level. This is due to the fact the rate at which points accumulate depend on that level. For the curve above, the AI always initiated play on level 19. However, after every run, it reperformed the same sequence of moves under the scoring systems of all 20 starting levels. In other words, it computed what its final scored would have been had it commenced on any of the lower levels.

Each starting level produces a different set of average scores. But when they are fitted to logistic curves and normalized to the same scale, they end up nearly identical. In the following chart, the curves for all 20 starting levels are shown overlapping.

A logistic curve is defined as the function

where k is the growth rate or steepness, x 0 is the midpoint and L is the maximum value. If you want to compute the handicap percent for a specific starting level and height, refer to the table below for the associated constants.

level k x 0 L 0 0.635429136476100 13.623164176577156 101.738709037280190 1 0.633972899770532 13.620287025234648 101.751732068134570 2 0.631114654355038 13.612829587953332 101.775591024550740 3 0.627195897107233 13.601248616665740 101.807418477625220 4 0.622532395890976 13.586087913561940 101.844673187735510 5 0.617428140821707 13.567896570838954 101.884775139035870 6 0.612141984054329 13.547265835826323 101.925490406957280 7 0.606895937567774 13.524792810517173 101.964805479462770 8 0.601859525106324 13.501078168682877 102.001170338946780 9 0.597150840829724 13.476731302125021 102.033573922583870 10 0.595130004334894 13.464342732164772 102.045422660007920 11 0.593402147811258 13.453406179006349 102.055270258108080 12 0.591908492905426 13.443681556719616 102.063553748699860 13 0.590604878944027 13.434978730229760 102.070596217722500 14 0.589457522734079 13.427145215718749 102.076640591006680 15 0.588440151383866 13.420057323011818 102.081872483177430 16 0.585861067199916 13.403352684198659 102.096881727694470 17 0.583489124535944 13.386846672485404 102.109534148677570 18 0.581345726299131 13.370845964185436 102.119838312928080 19 0.579845427283322 13.348273093799587 102.113185988647810

Alternatively, when each of those columns are averaged together you end up a single curve that approximates the handicap percent across all starting levels (k = 0.603260297943194, x 0 = 13.490715607520380 and L = 101.978036758426840):

Does this chart really enable me to play against a CTWC pro? Well, if my average is 150,000 and the pro’s average is 800,000, then my typical score is 18.75% of the pro’s. From the chart, the pro would have to play on height 11 to even out our skill levels. That’s not impossible. But most players are going to find the game painfully difficult below height 14, to the left of the midpoint of the curve. Put another way, this system is only practical when the ratio of the players' averages scores is not smaller than 60%.

Videos

Watch the AI take on playfields of various heights in the videos below.

Download

HandicappedTetris_2019-08-17.zip

The .zip contains:

src - The source tree.

- The source tree. HandicappedTetris.jar - The compiled binary.

- The compiled binary. lgpl-2.1.txt - The free software license.

Run

Prerequisites

Nintaco - the NES / Famicom emulator.

Tetris (U) [!].nes - the Nintendo Tetris ROM.

Load the JAR

Extract HandicappedTetris.jar from the downloaded .zip . Start Nintaco and open Tetris (U) [!].nes . Open the Run Program window by selecting Tools | Run Program... Enter the path to the file in the JAR name field or browse to it using the Find JAR... button. Press Load JAR to load it.

Either Start Human Mode

Set Main class to handicappedtetris.bot.HandicappedTetrisPlugin Optionally, in the Arguments field, specify the playfield height (4–20). The default is the standard 20 rows. Press Run. Return to the main window and play Tetris as you normally would.

Or Start AI Mode

Set Main class to handicappedtetris.bot.HandicappedTetrisBot Optionally, in the Arguments field, specify a playfield height (4–20). By default, the height randomly changes between games. Press Run. Return to the main window to watch the AI play.

Tips

The program window can be hidden by clicking its Hide Window button or its standard eXit button. The plugin will continue to run while the window is hidden.

To unhide the window, select Tools | Program Controls...

To kill the plugin, press Stop in the window. Reset the game (Machine | Reset) after stopping the plugin to fully restore Tetris.

To watch the AI play faster, select Machine | Speed | Max.

Upon reaching the end of a game, the AI immediately starts playing a new one. If you intend to leave it running perpetually, disable history tracking (uncheck Tools | Track History ) to avoid running out of memory.