Spoilers Warning

Intro

The Marvel Cinematic Universe (MCU) has grown to the point where it is the most successful movie franchise ever. Stories, legends, myths, and fictions form a key part of the human experience, and stories with this scale and reach are almost unprecedented. So digging into the mathematical structure of the collaborative universe is something I plan to do for a while. In this post we examine the structure of the MCU timeline (see above).

I talked some more about the Marvel Cinematic Universe (MCU) and why it’s interesting in a previous post. Have a read if you want to get started on this. Today I want to talk about a way of visualising the relationships between the characters and the movies in a timeline (shown above). It’s just a starting point – I plan a few follow-up posts. At the least, it’s a way you can track your favourite character, and plan a movie marathon to watch all of their movies in order.

I’ve been working on this for a while, so it’s possible you saw a previous version. My aim was to get this out on the release of Captain Marvel, and I thought I had it down. Then I saw the movie, which is brilliant, but required a few last minute alterations to the timeline, so slight delay here to get the final version up.

Visualising the MCU

How do we visualise something as complex as the MCU? Well, there are a few other folks who’ve had a go:

These are classic network analyses. A network (maths peeps sometimes call it a graph) shows the relationships between a set of entities. These analyses are aiming to establish relationships between characters, or between movies. From those we can work out central characters, and so on. But I think you already know the central characters, more or less. Frankly, you can just count the number of movies each character is in.

Network analyses are great, and these ones have classier graphics than I do. But I want to show a little more than just who’s who. Time is a very important element of any narrative, but is sometimes overlooked in network views of the data. So, inspired by the map of Napoleon’s Russian campaign by Minard (Tufte called Minard’s figure “one of the best statistical graphics ever” in Beautiful Evidence)

Minard's graphic of Napoleon's losses in the Russian campaign.

and the “Movie Narrative Charts” of XKCD,

I had a go at a timeline/flow diagram/narrative chart of the MCU (see above for the picture, and below for some details).

My Timeline of the MCU

There’s a lot I want to say about this timeline. First some footnotes 1 2 3 4 5 6 7. Go and read them before you get upset about mistakes. I’m not saying there aren’t any mistakes, just that some things that look like mistakes are deliberate choices.

There are plenty of people doing timelines for the MCU, ranging from Marvel’s own coarse-grained timeline, to amazingly detailed fan sites. Some of these are almost at the level of telling us what Iron Man (Tony Stark) had for breakfast today. Here are links to quite a few:

However, despite the title, I am not just doing another timeline. In XKCD terms it’s a narrative chart and I think of it as a flow diagram, and there are other names as well (see links at the end of this), but story timeline makes sense here. If we define a story as a sequence of events, and a narrative as how it is told, then I am trying to get at the story, not the narrative. The pieces here are drawn in the order they happen (to the best of my ability), not the order they are shown (for instance, the second Guardians movie appeared out of order compared to when it is set). So a (story) timeline seems the right term.

The goal of my timeline is to track each of the heroes through the movies they participate in. So we can see each character’s personal timeline. Each coloured line shows the path of one character (or sometimes a small group).

Beyond this the chart has a few other objectives:

Horizontal location is indicative of the time at which the movie was set 2 , though I had to take liberties in a few places to make it all fit. I tried to make my timeline consistent with as many of the timelines above as I could, but they disagree in places. Where possible I give precedence to Marvel’s official timeline, and for detailed placement take advice from Collider, which has lots of detail, and is at least internally consistent. I suspect that I should move Infinity War into 2018, to be consistent with Collider, but for the moment let’s stay with the official Marvel version where possible.

Sequences (direct sequels) for a particular hero are horizontally aligned, and non-sequences avoid alignment (as much as possible).

The field is vertically separated into human tech on the bottom, and magic, gods and space aliens on the top, though this is a coarse discriminator (the Tesseract in the 1st Captain America might seem to place it with the latter, but this doesn’t fit with the other placement criteria).

Node colours indicate the phase of the MCU. The phases are disjoint from the point of view of release dates of movies, but not so in terms of the timeline.

Ideally, charts like this are laid out by a clever computer program, but the programmer in this case was only just clever enough to do it manually. There are tools to draw these (see links below), but graph visualisation with as many constraints as I added (I’m also trying to reduce the number of edge crossings) isn’t easy at all. Set it up with so many goals and you almost certainly end up with an NP-hard problem, which is unlikely to be solvable exactly in any reasonable amount of time.
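Even if finding the best layout is hard, scoring a given layout is cheap. As a minimal sketch (in Python, not the code behind the chart, and using a simplified two-column picture that is assumed purely for illustration), the number of crossings between two adjacent columns is the number of pairs of character lines whose vertical order flips. The function name `count_crossings` and the example orderings are hypothetical.

```python
from itertools import combinations


def count_crossings(left_order, right_order):
    """Count pairs of lines that swap vertical order between two columns.

    Both arguments list the same characters from top to bottom, once for
    the left-hand column and once for the right-hand column.
    """
    right_pos = {name: i for i, name in enumerate(right_order)}
    crossings = 0
    # combinations() yields pairs (a, b) with a above b in the left column.
    for a, b in combinations(left_order, 2):
        if right_pos[a] > right_pos[b]:  # a has moved below b: the lines cross
            crossings += 1
    return crossings


# Made-up orderings, just to exercise the function.
left = ["Iron Man", "Thor", "Hulk"]
right = ["Thor", "Hulk", "Iron Man"]
print(count_crossings(left, right))  # 2
```

Checking one layout takes a number of steps proportional to the square of the number of lines. The hard part is the search over all the layouts that also respect the other constraints.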

I do have some code/data that goes into this. This CSV file gives the information I compiled (by hand) from the existing timelines, and the table from my last post. The data is also shown below and you can get the CSV file here.

| Code | Phase | Release year | Setting year | Setting month | Title |
|------|-------|--------------|--------------|---------------|-------|
| a | 1 | 2008 | 2010 | October | Iron Man |
| b | 1 | 2010 | 2011 | May | Iron Man 2 |
| c | 1 | 2008 | 2011 | May | The Incredible Hulk |
| d | 1 | 2011 | 2011 | May | Thor |
| e | 1 | 2011 | 2012 | April | Captain America: The First Avenger |
| f | 1 | 2012 | 2012 | May | The Avengers |
| g | 2 | 2013 | 2012 | December | Iron Man 3 |
| h | 2 | 2013 | 2013 | November | Thor: The Dark World |
| i | 2 | 2014 | 2014 | April | Captain America: The Winter Soldier |
| j | 2 | 2014 | 2014 | August | Guardians of the Galaxy |
| k | 3 | 2017 | 2014 | October | Guardians of the Galaxy Vol. 2 |
| l | 2 | 2015 | 2015 | May | Avengers: Age of Ultron |
| m | 2 | 2015 | 2015 | July | Ant-Man |
| n | 3 | 2016 | 2016 | June | Captain America: Civil War |
| o | 3 | 2018 | 2016 | June | Black Panther |
| p | 3 | 2017 | 2016 | September | Spider-Man: Homecoming |
| q | 3 | 2016 | 2016 | May | Doctor Strange |
| r | 3 | 2017 | 2017 | June | Thor: Ragnarok |
| s | 3 | 2018 | 2017 | July | Avengers: Infinity War |
| t | 3 | 2018 | 2017 | July | Ant-Man and the Wasp |
| u | 4 | 2019 | 2018 | | Captain Marvel |
| v | 4 | 2019 | 2018 | | Avengers: Endgame |
| w | 4 | 2019 | 2018 | | Spider-Man: Far From Home |
| A | 4 | 2019 | | | Guardians of the Galaxy Vol. 3 |
| B | 4 | 2019 | | | Black Panther 2 |
| C | 4 | 2019 | | | Eternals |
| D | 4 | 2020 | | | Black Widow |
| E | 4 | 2020 | | | Captain Marvel 2 |

The “setting” year and month refer to my best estimate of where the movie should be placed in the timeline, given all of the caveats mentioned above, and the footnotes below.
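To make the story/narrative distinction concrete, here is a rough sketch of reading this kind of data and sorting it two ways. It is written in Python rather than the Julia used for this post, the rows are a small sample copied from the listing, and the column names (`code`, `setting_year`, and so on) are assumptions, not the headers of the actual CSV file.

```python
import csv
import io

# A few rows from the listing; the header names are assumed.
SAMPLE_CSV = """code,phase,release_year,setting_year,setting_month,title
a,1,2008,2010,October,Iron Man
j,2,2014,2014,August,Guardians of the Galaxy
k,3,2017,2014,October,Guardians of the Galaxy Vol. 2
l,2,2015,2015,May,Avengers: Age of Ultron
o,3,2018,2016,June,Black Panther
r,3,2017,2017,June,Thor: Ragnarok
u,4,2019,2018,,Captain Marvel
"""

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def load_movies(text):
    """Parse CSV text into a list of dicts, one per movie."""
    return list(csv.DictReader(io.StringIO(text)))


def setting_key(movie):
    """Sort key (year, month number); a missing month sorts first in its year."""
    month = movie["setting_month"]
    month_number = MONTHS.index(month) + 1 if month else 0
    return (int(movie["setting_year"]), month_number)


movies = load_movies(SAMPLE_CSV)

# Story order: when each movie is set.
story_order = "".join(m["code"] for m in sorted(movies, key=setting_key))
# Narrative order: when each movie was released (ties keep their input order).
release_order = "".join(
    m["code"] for m in sorted(movies, key=lambda m: int(m["release_year"]))
)

print(story_order)    # ajkloru
print(release_order)  # ajlkrou
```

In this sample the two orders differ for the second Guardians movie (k) and for Black Panther (o), both of which were released later than they are set. The codes in the full listing reflect placement by hand, so a mechanical sort like this one need not reproduce them exactly.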

Building the Picture

The timeline (the picture above) was drawn manually using Inkscape, which is a brilliant tool for vector graphics. In theory I could automate drawing of these and I definitely plan to have a go at it, but let’s do that later.

In the last post I explained how to download data via Web APIs. I used the same approach, but in addition used IMDbPY to download cast data from IMDb (which doesn’t have such a nice API). The raw cast data has some problems: apart from a couple of spelling mistakes, there is no consistency in how any of the heroes are named. At the simplest level this is because heroes have alter egos, but even then, there are multiple variants used. To keep track of these I have a little file full of aliases. Here are the first few entries:

| Character | Aliases |
|-----------|---------|
| Aaron Davis | Aaron Davis, Prowler |
| Abomination | Abomination, The Abomination, Emil Blonsky, Blonsky |
| Abu Bakaar | Abu Bakaar |
| Agent 13 | Agent 13, Kate / Agent 13, Sharon Carter |
| Agent Garrett | Agent Garrett, Jonathan 'John' Garrett |
| Agent Scott Kelly | Agent Scott Kelly |
| Agent Stoltz | Agent Stoltz, Stoltz |

The list doesn’t include all named characters (yet), but has most that appear in more than one movie.

I use these aliases (via a Julia dictionary – see my last post) to map the cast names in IMDb to a consistent set of names. I mostly use the standard “superhero” name. This has limits (in the future it is possible that more than one person will take the role of, for instance, Captain America), but it is acceptable for the current MCU.
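The idea of the alias lookup can be sketched as follows. This is in Python, not the Julia actually used, and the function names `build_lookup` and `canonical_name`, as well as the case-insensitive matching, are assumptions made for illustration. The alias entries are taken from the sample shown earlier.

```python
# Canonical name -> every variant seen in the cast data (sample entries only).
ALIASES = {
    "Abomination": ["Abomination", "The Abomination", "Emil Blonsky", "Blonsky"],
    "Agent 13": ["Agent 13", "Kate / Agent 13", "Sharon Carter"],
    "Agent Stoltz": ["Agent Stoltz", "Stoltz"],
}


def build_lookup(aliases):
    """Invert the alias table into a dict from variant to canonical name."""
    lookup = {}
    for canonical, variants in aliases.items():
        for variant in variants:
            key = variant.strip().casefold()  # ignore case and stray spaces
            if lookup.get(key, canonical) != canonical:
                raise ValueError(f"alias {variant!r} maps to two characters")
            lookup[key] = canonical
    return lookup


def canonical_name(raw_name, lookup):
    """Return the consistent name for a raw cast entry, or None if unknown."""
    return lookup.get(raw_name.strip().casefold())


LOOKUP = build_lookup(ALIASES)
print(canonical_name("Emil Blonsky", LOOKUP))     # Abomination
print(canonical_name("Kate / Agent 13", LOOKUP))  # Agent 13
print(canonical_name("Howard the Duck", LOOKUP))  # None
```

Returning `None` for an unknown name, rather than guessing, makes gaps in the alias list easy to spot.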

There is one problem with this. You always need to spend time cleaning your data. In this case I went through it a few times, checking details. Most of the problems occurred because my alias list wasn’t complete, and these were easy to fix. But there are a few weirdnesses:

Black Widow appears in Ragnarok but is not listed in IMDb. Presumably that is because she appears only in a recorded video, not live; however, the similar appearance of Captain America in Homecoming is listed.

They ret-conned Iron Man 2, saying that the kid in the Iron-Man mask is a younger Peter Parker, and again this does not appear in IMDb.

Phil Coulson is implicitly in The Incredible Hulk through the Marvel one-shot movie The Consultant, which is essentially an extension of the end credits of The Incredible Hulk.

There is, in my list, an alias connecting JARVIS (Stark’s AI) to Vision because JARVIS becomes Vision. But in the final timeline, I chose to place Vision’s origin in Avengers: Age of Ultron, where it fits in the story.

The current code fixes these pieces semi-manually, and unfortunately, when we are dealing with real data, often there are exceptions that need a semi-manual fix like this.

After all this, we have a list of movies that each character participates in. From these, we use the sequence listed above, or here, to put the movies in order for each character, which gives us a path for each. The first few are listed below, and the file is here.

| Character | Titles | Codes |
|-----------|--------|-------|
| Abomination | The Incredible Hulk | c |
| Agent 13 | Captain America: The Winter Soldier, Captain America: Civil War | in |
| Aldrich Killian | Iron Man 3 | g |
| Alexander Pierce | Captain America: The Winter Soldier | i |
| Ant-Man | Ant-Man, Captain America: Civil War, Ant-Man and the Wasp, Avengers: Endgame | mntv |
| Betty Ross | The Incredible Hulk | c |
| Black Panther | Captain America: Civil War, Black Panther, Avengers: Infinity War, Avengers: Endgame | nosv |

The “codes” column is a mapping of the movies to the codes I gave in the sequence listing of the movies. It’s a key to allow sorting of the movies in sequential order, but the strings formed by listing the codes of the movies in the path will be useful to us when we start to learn more about these paths in future posts.
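As a hedged sketch of how such paths could be assembled (again in Python, not the original Julia, with hypothetical names such as `character_paths` and a hand-typed list of appearances standing in for the cleaned cast data):

```python
from collections import defaultdict

# Story sequence of the movie codes; position in this string gives the order.
# Upper-case codes are listed explicitly because they would otherwise sort
# before the lower-case ones.
SEQUENCE = "abcdefghijklmnopqrstuvwABCDE"

# Title -> code, for the handful of movies used in this example.
TITLE_TO_CODE = {
    "Ant-Man": "m",
    "Captain America: Civil War": "n",
    "Black Panther": "o",
    "Avengers: Infinity War": "s",
    "Ant-Man and the Wasp": "t",
    "Avengers: Endgame": "v",
}

# (character, title) pairs in no particular order, as cast data might arrive.
APPEARANCES = [
    ("Ant-Man", "Avengers: Endgame"),
    ("Black Panther", "Avengers: Infinity War"),
    ("Ant-Man", "Ant-Man"),
    ("Black Panther", "Captain America: Civil War"),
    ("Ant-Man", "Captain America: Civil War"),
    ("Black Panther", "Black Panther"),
    ("Ant-Man", "Ant-Man and the Wasp"),
    ("Black Panther", "Avengers: Endgame"),
]


def character_paths(appearances):
    """Return a dict from character to code string, in story order."""
    codes_by_character = defaultdict(set)
    for character, title in appearances:
        codes_by_character[character].add(TITLE_TO_CODE[title])
    return {
        character: "".join(sorted(codes, key=SEQUENCE.index))
        for character, codes in codes_by_character.items()
    }


paths = character_paths(APPEARANCES)
print(paths["Ant-Man"])        # mntv
print(paths["Black Panther"])  # nosv
```

Because each movie is stored once per character, this sketch cannot represent a path that visits the same movie twice.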

Note that the file has paths for many of the characters in the films, but I only included a selection of the most important heroes 7 in the timeline.

There is one hiccup still. The appearance of Captain America and The Winter Soldier in Ant-Man is in an end-credit scene. That is OK, and happens often in other movies. One of the cutenesses of the MCU is their clever use of these to lure fans forward. But that end-credit scene is (uniquely) actually a cut from the next Captain America film Civil War. Hence it is the only place in the timeline where we get a non-simple path, i.e., a path with a loop. Once we add that loop, the timeline is finished (except for endless fiddling to get it to look OK).

I know I’ve glossed over details here, in order to make this post a reasonable length, but I will go through more technical detail in a subsequent post. Stay tuned.

Summary

I want to keep these posts to a reasonable length, and this one was already over my target, so this is really just a description of the timeline. I have some ideas of what to do with it though, and you will see them in future posts.

In the meantime, enjoy the timeline. If it gets popular, I aim to post a link to the raw SVG file, so people can mash with it. So do the usual social media stuff, if you care.

Other narrative charts

Generating charts

Other similar concepts

Sankey diagrams

Activity timelines

Event charts

Transaction charts

Footnotes: