A few weeks ago, Samuel Arbesman wrote an article in Wired touching on the mathematical properties inherent in LEGO structures. In it, he discussed the results of a 10-year-old study of natural and human-made networks that described how the number of distinct component types in a network increased with the overall size of the network.

The study showed that LEGO systems did indeed follow this rule. However, Arbesman noted that the number of piece types increased sublinearly with set size, suggesting that LEGO systems were under some form of selection pressure (like the economics of production) that made it more expensive to grow the system and create new types of pieces. He was curious to see whether these findings would hold true with a more complete list of the LEGO sets available today (n=389 in the 2002 study).

After using a web crawler to pull the data for the available sets and their component pieces, I was presented with a list of over 6,800 individual toys or kits. Not all of these kits fit the criteria of the original study, which investigated sets that were designed to build something specific as opposed to generic collections of pieces.

Paring down this list turned out to be the most difficult part of this exercise. I ended up eliminating any set that had words like “accessories,” “supplemental,” or “universal building set” in the name. I also removed entire toy lines such as DUPLO, Clikits, and Primo/Baby, which didn’t seem to fit in the standard LEGO system. Basically, I tried to include anything with a brick, plate, or tile that had a picture of a single object on the box. I ended up with about 3,750 sets … or about ten times the number in the original study.
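For anyone curious what that paring-down might look like in code, here is a minimal Python sketch. The field names (`name`, `theme`), the exact phrase list, and the sample records are all assumptions made up for illustration; this is not the script I actually ran, and it ignores the harder judgment call about whether the box shows a single object.

```python
# Hypothetical exclusion rules, loosely modeled on the description above.
EXCLUDED_PHRASES = ("accessories", "supplemental", "universal building set")
EXCLUDED_THEMES = {"duplo", "clikits", "primo", "baby"}


def is_specific_model(name, theme):
    """Return True if a set looks like a kit for building one specific object."""
    lowered_name = name.lower()
    if any(phrase in lowered_name for phrase in EXCLUDED_PHRASES):
        return False
    return theme.lower() not in EXCLUDED_THEMES


def filter_sets(sets):
    """Keep only the records that pass the exclusion rules."""
    return [s for s in sets if is_specific_model(s["name"], s["theme"])]


# Made-up sample records, not rows from the real crawl.
sample = [
    {"name": "Taj Mahal", "theme": "Sculptures"},
    {"name": "Castle Accessories", "theme": "Castle"},
    {"name": "Farm Animals", "theme": "DUPLO"},
    {"name": "Universal Building Set with Motor", "theme": "Universal"},
]
kept = [s["name"] for s in filter_sets(sample)]
print(kept)
```

Matching on lowercase substrings is a blunt instrument, so a real pass over 6,800 records would still need some manual review.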

So, do the results hold up with the new data? At first glance, it appears they do. Both the log-log and semi-log plots described in the study are reproduced here with the larger counts. Note that a power-law relationship still appears to fit the data better than a logarithmic relationship.
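As a rough illustration of how the two candidate relationships can be compared, the sketch below fits both with ordinary least squares in plain Python. A power law is a straight line on a log-log plot, and a logarithmic relationship is a straight line on a semi-log plot. The function names and the synthetic data are assumptions for the example, and comparing squared error on the raw scale is just one reasonable way to judge the fit, not necessarily the method used in the study.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sum_squared_error(actual, predicted):
    """Sum of squared differences between observed and fitted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def compare_fits(sizes, types):
    """Fit a power law and a logarithmic curve; report each error on the raw scale."""
    log_sizes = [math.log(n) for n in sizes]
    log_types = [math.log(c) for c in types]

    # Power law: log(types) = log(k) + alpha * log(size)
    log_k, alpha = linear_fit(log_sizes, log_types)
    power_pred = [math.exp(log_k) * n ** alpha for n in sizes]

    # Logarithmic: types = a + b * log(size)
    a, b = linear_fit(log_sizes, types)
    log_pred = [a + b * x for x in log_sizes]

    return {
        "alpha": alpha,
        "power_sse": sum_squared_error(types, power_pred),
        "log_sse": sum_squared_error(types, log_pred),
    }


# Synthetic data generated from types = 2 * size^0.7, standing in for real sets.
sizes = [10, 50, 100, 500, 1000, 5000]
types = [2 * n ** 0.7 for n in sizes]
result = compare_fits(sizes, types)
print(round(result["alpha"], 3), result["power_sse"] < result["log_sse"])
```

An exponent `alpha` below 1 is what "sublinear" means here: each doubling of a set's size adds fewer than double the number of piece types.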

Once I had access to all of that cool LEGO data, of course, I couldn’t resist a few more visuals. The first thing I developed was an interactive chart that lets you navigate the size and complexity data to see specific kits. Check out the links for pictures and parts lists.

This display was interesting because the LEGO kits with the most pieces tended to be elaborate secret bases or fortresses while the LEGO kits with the most variety of pieces were cultural artifacts like the Taj Mahal or the Statue of Liberty. Ironically, the Death Star (which might be considered both a cultural icon and a fortress) fits neatly in the upper right corner.

The following charts look at the trend of unique pieces over time as well as the distribution of color over the distinct LEGO sets available (this includes all LEGO products, not just the specific “objects” used in the logarithmic plots above). Note both the increasing variety of the LEGO pieces and the move away from the traditional color palette. The mottled gray represents the “other” category.

It is interesting to note that the shift toward more complexity in both pieces and colors corresponds with the deal LEGO inked with Lucasfilm in 1999 that allowed the company to sell toys based on the “Star Wars” universe. These changes came at a time of turmoil for LEGO as it struggled to remain true to its roots while competing with a flood of specialty toys and video games. Licensing products from Lucasfilm was a big step for LEGO but one that seems to have paid some creative dividends … five of the top ten largest LEGO structures ever released commercially are spaceships from the “Star Wars” series.

This trend toward replicating such specific visions (LEGO has also licensed themes from Harry Potter, Toy Story, Pirates of the Caribbean, and others) explains some of the incredible variety of pieces now in circulation. Items from these new kits introduced many pieces used only once.

On the opposite end of the spectrum, the most commonly shared LEGO piece in the database is a black 1 x 2 plate (part number 3004). The other pieces in the top 10 are also very simple and very monochromatic. I found it interesting that all the colors in the top ten follow the sequence of Berlin and Kay’s basic color terms, in which Stage I cultures have only black (dark–cool) and white (light–warm), and Stage II adds red.
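Both ends of that spectrum, the pieces used only once and the pieces shared most widely, come from the same kind of tally: count how many distinct sets each part-and-color combination appears in. Here is a small Python sketch of the idea. The inventory structure and the sample kits are invented for the example and do not reflect the real database.

```python
from collections import Counter

# Made-up inventories: set name -> list of (part, color) pairs it contains.
inventories = {
    "Set A": [("plate 1 x 2", "black"), ("brick 2 x 4", "red"), ("cockpit", "gray")],
    "Set B": [("plate 1 x 2", "black"), ("brick 2 x 4", "red")],
    "Set C": [("plate 1 x 2", "black"), ("wing", "white")],
}


def sets_per_piece(inventories):
    """Count how many distinct sets each (part, color) pair appears in."""
    counts = Counter()
    for pieces in inventories.values():
        counts.update(set(pieces))  # set() so duplicates within a kit count once
    return counts


counts = sets_per_piece(inventories)
most_shared = counts.most_common(1)[0]
single_use = sorted(piece for piece, n in counts.items() if n == 1)
print(most_shared, single_use)
```

Counting sets rather than total pieces matters here: a kit that contains two hundred copies of one plate should still only count once toward how widely that plate is shared.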

One thing this database does not cover is the huge market for non-standard kits and free-form LEGO bricks. According to Chris Anderson’s Long Tail blog:

“… 90% of Lego’s products are not available in traditional retail. They’re only available in the catalogs and online … [o]verall, those non-retail parts of the business represent 10-15% of Lego’s annual $1.1 billion in sales.”

User-created structures represent an amazingly creative use of the standard set of parts available. Check out this football stadium or this minifig-scaled Saturn V rocket. Some of these models were created using the old LEGO Factory/Design by Me software, but some are done on the fly. It would be interesting to see if some of the above findings apply to these custom structures.

For more stats and a company timeline, check out this site.

Updates: