Games

Civ VI Epic Series, Part 1: Model Thinking and the Interaction of Complex Systems, Part A

If you wanted to build a video game to model the evolution and complexity of human civilisations, how would you do it?

That is a question so broad, it is virtually impossible to answer without some kind of framework. So let’s establish one: we want a way to represent territory, something that reflects the diversity of terrain on Earth. Some terrain is better than others for farming; some terrain is better than others for mining. Hills are harder to traverse than flat land, while mountains are impassable for regular travel. We need a way to represent fresh water in the environment, because that affects human settlement patterns. We need some way to represent terrain features such as forests and jungles and swamps, because these affect the possible uses for the terrain, and they can also be removed if you wish: forests and jungles can be cleared, and swamps can be drained. As far as the sea is concerned, we need to at least make a distinction between the shallow coast, where fishermen work and where littoral brown-water navies can operate, and the deeper, treacherous ocean that is out of reach without more advanced technology. Oh — and we need some way to reflect that parts of the world are more abundant in some resources than others.

There are many ways to skin this cat, but here’s how Civ VI does it:

Click/tap to enlarge: the details are important

(Note: this image is from a True Start Location Europe map, a Civ-generated map of Europe where civilisations spawn at the location of their historical capitals. Most Civ games take place on maps that are Earth-like, but not recognisably Earth.)

The game uses hexagonal tiles, and each tile has a base terrain type and features that sit on top of that base terrain. Some tiles have resources, such as stone, fish and olives. The northern part of Italy contains a mountain range (mimicking the real-life Alps), and this mountain range includes a natural wonder, the Matterhorn.

To make this clearer, I’ve annotated some of the terrain features (white) and resources (red) on the map. The Matterhorn is marked in blue:

Click/tap to enlarge: the details are important

This is the first step in trying to model the evolution of civilisation: modelling the terrain. The next question is, how do we quantify the value of the terrain?

Yields Part I: Food, Production, Gold

In Civ, there is a game concept known as tile yield. The yield refers to what each tile produces for the city it belongs to. The three basic yields are food, production, and gold. I’ve turned on the yield icons in the screenshot below, so you can see what each tile yields:

Click/tap to enlarge: the details are important

The basic flat grassland tile yields two food (2F, represented by ears of corn). Hills on any tile add one production (1P, represented by hammers), so a grassland hill tile yields two food and one production (2F + 1P). Woods on any tile add one production, so a grassland woods tile yields two food and one production (2F + 1P), and a grassland woods hill tile yields two food and two production (2F + 2P).

The basic coast tile yields one food and one gold (1F + 1G, with gold represented by coins). An ocean tile yields just one food. Mountains yield nothing.

The presence of resources and natural wonders affects yields as well. Stone adds 1P to the tile yield, while fish add 1F. Wine adds 1F + 1G, olives add 1P + 1G, and horses add 1F + 1P. Marble adds one culture (1C), which is a yield we’ll discuss later; the Matterhorn also adds 1C to the yield of adjacent tiles.
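Because these yield rules are purely additive, they are easy to sketch in code. The Python snippet below is only an illustration, not the game's actual implementation: the `Yield` class, the three lookup tables and the `tile_yield` function are names I've made up, and the tables contain only the values mentioned in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Yield:
    """Per-turn output of a tile (illustrative structure, not the game's)."""
    food: int = 0
    production: int = 0
    gold: int = 0
    culture: int = 0

    def __add__(self, other):
        return Yield(
            self.food + other.food,
            self.production + other.production,
            self.gold + other.gold,
            self.culture + other.culture,
        )


# Only the values quoted in the article are listed here.
BASE = {
    "grassland": Yield(food=2),
    "coast": Yield(food=1, gold=1),
    "ocean": Yield(food=1),
    "mountain": Yield(),
}
MODIFIERS = {"hills": Yield(production=1), "woods": Yield(production=1)}
RESOURCES = {
    "stone": Yield(production=1),
    "fish": Yield(food=1),
    "wine": Yield(food=1, gold=1),
    "olives": Yield(production=1, gold=1),
    "horses": Yield(food=1, production=1),
    "marble": Yield(culture=1),
}


def tile_yield(terrain, modifiers=(), resource=None):
    """Add up base terrain, hills/woods and resource bonuses for one tile."""
    total = BASE[terrain]
    for name in modifiers:
        total = total + MODIFIERS[name]
    if resource is not None:
        total = total + RESOURCES[resource]
    return total


woods_hill = tile_yield("grassland", modifiers=("hills", "woods"))
olives = tile_yield("grassland", resource="olives")
```

Here `woods_hill` comes out as 2F + 2P and `olives` as 2F + 1P + 1G, matching the figures above.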

This is a way for the game to quantify what each tile produces. Obviously this is an abstraction — in real life, not all farming is equal, and not all raw production is equal either — but in a game that operates on such a macro scale, this is an acceptable simplification.

You may notice that the popup on the stone says, in angry red, “Requires Mining”. This has to do with the technological tree, which I’ll talk about in a separate article.

Tile Improvements: Farms, Mines, Quarries, Plantations

The tiles we’ve seen so far are the Civ equivalent of greenfield or undeveloped land, but they do not remain so over the course of the game. Players can build tile improvements on tiles owned by their cities. These include (but are not restricted to):

  • farms
  • mines
  • quarries
  • plantations
  • pastures
  • camps
  • fishing boats
  • lumber mills
  • oil wells

Farms, mines and lumber mills can be built on any suitable tiles (flat grasslands/plains, hills, and woods respectively), while the others can only be built on a resource. Improving a resource tile gives your empire access to that resource.

In this picture here, I’ve built a quarry on each of the two stone tiles. This adds 1P to each of the two tiles:

The stone on the flat grassland was previously producing 2F + 1P; it is now producing 2F + 2P. The stone on the grassland hill is even more productive: its 2F + 2P have now increased to 2F + 3P.

Different tile improvements provide different bonuses. For example, the unimproved Olives tile provides 2F + 1P + 1G (and the angry red words tell us the tile “requires Irrigation”):

After building a plantation on the Olives tile, the tile now yields 2F + 1P + 3G:

(The eagle-eyed may notice that the tile’s appeal has fallen, from 5 to 4.)

In terms of modelling, it’s pretty clear what tile improvements model. Some areas are naturally richer in resources than others, which is where the base bonus to food or production or gold comes from. Imagine a region rich in, say, wild rice. That area provides you more food than an average grassland region without that wild rice. However, if you could domesticate that rice… that would provide you even more rice, even more food, per square metre.

Of course, tile improvements require technology. Before I discuss technology modelling in Civ VI in a future article, there’s something else that’s important to explain about the Civ VI model.

Modelling Time: The Turn

Full Civ VI games start in the year 4000 B.C., and end in the year 2050 A.D. These dates are arbitrary, of course: civilisation didn't begin in 4000 B.C., and the empires featured in the game didn't (and don't) span that time frame. Some Civs in the game have histories that extend before 4000 B.C. (in particular: Sumer, China, India). Nonetheless, the game has to start somewhere, and it might as well be 4000 B.C.

To me, the more interesting question is: how do you approximate the passage of time in a game like Civilization?

Many simulation and grand strategy games have a panel with pause, play and fast forward buttons:

Left: Project Highrise (top: paused; bottom: running).
Right: Europa Universalis 4 (top: paused; middle: running at speed 4; bottom: running at speed 1)

When the game starts, the player starts positioning their assets (whether roads, troops or anything else). The player can pause the game and spend some time crafting their next moves, then hit “play” and set the simulation in motion. During particularly slow stretches, the player can fast-forward and run the simulation at 2x to 5x the speed of regular play.

This is the most obvious solution in any game that is explicitly a model of some real-world analogue. Civ could have done the same, starting the game in the year 4000 B.C. and advancing the clock at a base rate of, say, ten years per minute, with the option to pause or fast-forward gameplay as it suits the player.

The problem with this game mechanic is that the speed of human technological, cultural and economic development has accelerated over time. Scientific and social developments have a multiplicative effect on human civilisation’s technological, cultural and economic output. The technological progress that humans made in the year 4000 B.C. is a fraction of the progress that we made in the year 2017. A pause-play-fast-forward game mechanic for modelling time would create a lethargic early game, with an overly dense late game that would test players’ reflexes and attention much more than their strategic abilities.

A Civ VI game on Standard speed lasts 500 turns and ends on Jan 1, 2050 A.D. At the end of the game, each turn represents six months.

Instead, the Civilization game series uses a turn-based rather than a real-time game mechanic. Players take turns to make their moves, and each turn represents a predefined timespan. In Civ, earlier turns represent a longer timespan, while later turns represent an increasingly shorter timespan. For example, Turn 1 lasts from 4000 B.C. to 3960 B.C., while Turn 499 represents January to June 2049 A.D. This way, the game remains engaging from start to finish, and the game model more accurately reflects the evolution of human civilisation with respect to time.
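To make the idea of shrinking turn lengths concrete, here is a small Python sketch that maps a turn number to a calendar year. Only the 40-year first turn, the six-month final turns and the 500-turn total come from the text above. The `SCHEDULE` table in between is an assumption of mine, chosen so that the numbers add up to 2050 A.D., and the game's real table may differ.

```python
# Hypothetical schedule of (number_of_turns, years_per_turn) pairs.
# The 40-year opening turns and six-month closing turns match the article;
# the entries in between are assumptions chosen so that 500 turns span
# 4000 B.C. to 2050 A.D.
SCHEDULE = [
    (75, 40),
    (60, 25),
    (25, 20),
    (50, 10),
    (60, 5),
    (50, 2),
    (120, 1),
    (60, 0.5),
]

START_YEAR = -4000  # 4000 B.C. as a signed year; the missing year zero is ignored


def turn_start_year(turn):
    """Return the signed year in which the given 1-based turn begins."""
    year = START_YEAR
    remaining = turn - 1  # turns that have already fully elapsed
    for count, years_per_turn in SCHEDULE:
        elapsed = min(remaining, count)
        year += elapsed * years_per_turn
        remaining -= elapsed
        if remaining == 0:
            break
    return year
```

Under these assumptions, `turn_start_year(2)` is -3960 (3960 B.C.) and `turn_start_year(499)` is 2049, in line with the examples above.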

Population, working citizens and yield per turn

The combination of yield and turn-based time mechanics gives us the concept of yield per turn. At the beginning of every turn, the game calculates how much of each yield each tile produces, and adds up the totals per city.

Not all cities are the same size, though. A city with a low population cannot adequately work all the resources around it. As the city’s population increases, its citizens start working more and more tiles around the city. A city with a population of 1 receives the yield from the city tile and one other tile within its borders. A city with a population of 5 receives the yield from the city tile and five other tiles within its borders — and so on.

Logically, this would mean that you’d want your cities to be as big as possible, right? That is usually true, but bigger cities also require more food to feed, and that’s where the food yield per turn matters. Each citizen in a city consumes 2F per turn, so a size 2 city requires at least 4F per turn to maintain its population, and a size 5 city requires at least 10F per turn to maintain its population.

It’s hard to explain all of this abstractly, so let’s take a look at an example of this mechanic:

Click/tap to enlarge: the details are important

This is the city screen for the city of Ravenna. The number “2” next to the city’s name on the map tells us the city has a population of 2. (This doesn’t mean there are only two people in the city, obviously — the city population number is an abstraction, just like everything else in this model.)

The tile that the city sits on is always worked — that is, the city always receives the yield from the city tile (highlighted in green). In this case, that tile yields 2F + 1P. Additionally, each of Ravenna’s two citizens can work one tile within Ravenna’s city boundaries. In this case, the citizens have been assigned to work the two tiles highlighted in red. The northern red tile yields 2F + 1P + 1S (science, a yield we’ll discuss later in the series), while the southern red tile yields 2F + 1P + 1G.

This gives Ravenna a base tile yield of 6F + 3P + 1S + 1G. Tile yield isn’t the only way for cities to generate yield, but it is one of the most important yield mechanics in the game. (Buildings, city population and specialists are the other main ways to increase city yield, but I’ll talk about those mechanics another time.) After all the yield from tiles, buildings, city population and specialists is added up, various modifiers are then applied to the base yield to generate the city's actual yield per turn.
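The base tile yield is just a sum over the city tile and the worked tiles. As a rough Python sketch (the function name `base_tile_yield` and the dictionary layout are my own invention, and buildings, specialists and modifiers are deliberately left out):

```python
from collections import Counter


def base_tile_yield(city_tile, worked_tiles, population):
    """Sum the city tile and the worked tiles (at most one tile per citizen)."""
    if len(worked_tiles) > population:
        raise ValueError("each citizen can work only one tile")
    total = Counter(city_tile)
    for tile in worked_tiles:
        total.update(tile)  # Counter.update adds counts rather than replacing them
    return total


# Ravenna as described above: the city tile plus two worked tiles.
ravenna = base_tile_yield(
    city_tile={"food": 2, "production": 1},
    worked_tiles=[
        {"food": 2, "production": 1, "science": 1},
        {"food": 2, "production": 1, "gold": 1},
    ],
    population=2,
)
```

With Ravenna's three tiles, `ravenna` holds 6 food, 3 production, 1 science and 1 gold.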

If you look at the bar above the “Ravenna” interface, you’ll see that each turn, Ravenna generates:

  • 2.7 culture (don’t worry about this for now)
  • 1.1 surplus food
  • 3.1 production
  • 2.1 science (don’t worry about this either)
  • 0 faith (don’t worry about this either)
  • 1 gold

The numbers aren’t all round numbers because there are various multipliers at play here (another topic for another post), but you can see that Ravenna’s gold yield per turn comes entirely from its olives tile (1G). The production base of 3.1P per turn comes from its three worked tiles (1P per turn each), plus a 5% multiplier because the citizens are happy (yet another game mechanic for another post). The city is building a granary, so the 3.1P per turn goes towards finishing that building, which requires a total of 65P to complete. Ravenna's already invested 36P in the granary, so at 3.1P per turn, it will take 9 more turns for Ravenna to finish the granary.

1.1 food per turn might seem surprisingly low, but the city screen is kind enough to break down how the food surplus is calculated, so we’ll take a look at that.

Food surplus per turn: modelling population growth and decline

Breaking up a map of terrain into tiles and assigning each tile a food value — that’s a pretty simple idea. Figuring out how food affects a city’s population is slightly more complicated. All things considered, Civ VI’s model is a very simplified one.

Take a look at the “Citizen Growth” panel on the left of the city screen. The city, with its three worked tiles of 2F each, generates 6 food per turn. The city has a population of 2, so the city consumes 2 x 2 = 4 food per turn. That leaves a surplus of 2 food per turn that contributes towards city growth.

Because the citizens are happy, there’s a 10% bonus applied to city growth, giving us 2.2 surplus food per turn. However, because there is barely enough housing in Ravenna (yet another mechanism to be discussed in a future post), the population growth rate is halved. That’s how we arrive at the final number of 1.1 surplus food per turn.
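The arithmetic in the last two paragraphs can be written out as a short Python function. This is a sketch of the calculation as I've described it, not the game's code: the function and parameter names are hypothetical, and how the modifiers behave when there is a deficit is an assumption.

```python
FOOD_PER_CITIZEN = 2  # each citizen eats 2F per turn


def food_surplus(food_yield, population, happiness_bonus=0.0, housing_multiplier=1.0):
    """Return the per-turn food surplus after consumption and growth modifiers."""
    raw = food_yield - FOOD_PER_CITIZEN * population
    if raw <= 0:
        # Assumption: growth modifiers only scale a positive surplus.
        return raw
    return raw * (1 + happiness_bonus) * housing_multiplier


# Ravenna: 6F from tiles, 2 citizens, +10% for happiness, growth halved by housing.
ravenna_surplus = food_surplus(6, 2, happiness_bonus=0.10, housing_multiplier=0.5)
```

For Ravenna this works out as (6 - 4) x 1.1 x 0.5, or roughly 1.1 food per turn.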

Under “Total Food Surplus”, there’s a “Growth” bar. Think of the growth bar as a food basket or a granary: the excess food gets added to the food basket every turn, and when the food basket is full, a new citizen is born and the city’s population increases by 1.

A city’s population can also drop if it generates less food than its citizens consume. In that situation, the food surplus becomes a food deficit. A food deficit reduces the amount of food in the city’s food basket, and when the food basket is empty, the city loses a citizen and its population drops by 1.
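Here is the food basket as a minimal Python sketch. The basket capacity is a made-up parameter (I haven't said how big the basket is), and two details are assumptions on my part: the basket empties when the population changes, and a city never shrinks below one citizen.

```python
def advance_turn(population, basket, surplus, capacity):
    """Apply one turn of food surplus (or deficit); return (population, basket)."""
    basket += surplus
    if basket >= capacity:
        # Assumption: the basket empties when a new citizen is born.
        return population + 1, 0
    if basket <= 0 and surplus < 0 and population > 1:
        # Assumption: the city loses a citizen and restarts with an empty basket.
        return population - 1, 0
    return population, max(basket, 0)


def turns_until_growth(population, basket, surplus, capacity):
    """Count the turns needed for the city to gain one citizen."""
    if surplus <= 0:
        raise ValueError("a city without a food surplus never grows")
    start = population
    turns = 0
    while population == start:
        population, basket = advance_turn(population, basket, surplus, capacity)
        turns += 1
    return turns
```

For example, a city with a surplus of 2 food per turn and a hypothetical basket capacity of 10 grows after `turns_until_growth(2, 0, 2, 10)` turns, which is 5.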

This model has the effect of scaling city yield with city population, but also produces a natural upper limit for each city based on how much food a city generates. If a city cannot feed itself, it cannot grow. Mature Civ cities reach an equilibrium where the total food surplus is at or close to 0, and stay at that size until the end of the game.

What Comes Next?

Civ VI is a complex game that is pretty challenging to learn. In this post, I’ve laid out some of the key game mechanics that we’ll need to understand in order to do a truly deep dive into Civ VI as a model of civilisation, but there’s a lot more to it. In the next post, I’ll explore two of the most fundamental game mechanics in Civ VI: the technological and civics trees. We’ll talk about how they work, and what kinds of assumptions they make about how progress happens.

Modelling Reality Through Games

In the (underrated) musical Chess by Björn Ulvaeus, Benny Andersson and Tim Rice, the cast explains how the game of Chess came to be:

Not much is known of early days of chess beyond a fairly vague report

That fifteen hundred years ago two princes fought

Tough brothers for a Hindu throne

The mother cried, for no one really likes their offspring fighting to the death

She begged to stop the slaughter with her every breath

But sure enough one brother died

Sad beyond belief, she told her winning son

“You have caused such grief, I can’t forgive this evil thing you’ve done!”

He tried to explain how things had really been

But he tried in vain

No words of his could mollify the queen

And so he asked the wisest men he knew

The way to lessen her distress

They told him he’d be pretty certain to impress

By using model soldiers on a checkered board

To show it was his brother’s fault —

They thus invented chess

This is a rather elaborate fiction, at least as far as I’m aware. There were no such duelling princes, and the idea that a mother would be consoled by a bunch of ivory pieces on a grid is laughable. What is true, and what is relevant to our discussion here, is that the game of chess did most likely originate in India, and it was likely used to some degree or another in the study of military strategy.

Chaturanga, the game that is considered to have evolved into chess, used six distinct types of pieces, as does modern chess: the Rajah (king), the Mantri (counsellor), Ratha (chariot), Gaja (elephant), Ashva (horse) and Padati (footsoldier). Their movement on the game board approximates their movement on the battlefield:

  • The Ratha moves like a rook in chess, in straight horizontal or vertical lines, mimicking a charging chariot.

Rook icon created by LAFS from Noun Project (CC-BY).

  • The Ashva moves like the modern chess knight, a slower heavy cavalry unit that is more adept at close quarters combat than the lighter chariot.

Knight icon created by LAFS from Noun Project (CC-BY).

  • The Gaja’s moves are less well-defined. The great fount of knowledge that is Wikipedia describes three variants:

Elephant icon created by Icon Fair from Noun Project (CC-BY).

Conceptually, any of these movements would fit the role of a war elephant perfectly: slow movement that is adept at wreaking havoc up and down enemy lines. (The modern chess bishop, with diagonal movement limited only by obstructions in its path, is a Renaissance European invention.)

  • The Padati, the lowly footsoldier, can only move forward one step at a time, just like on the battlefield. They cannot retreat. I’m not sure why they capture diagonally — if there was a specific reason for it, it has long since been lost to the sands of time. To make up for the Padati’s handicapped movement, there are lots of them — eight, to be exact — and they start on the board in a horizontal line.

Pawn icon created by LAFS from Noun Project (CC-BY).

What’s striking about this configuration is that this line of pawns mimics infantry movement in unexpected ways. Pawns are individually weak but in aggregate are used to form a strong defensive line. The middle pawns often advance to form a salient, claiming control of territory in the middle of the board, but can sometimes become isolated from the rest of the line. Two lines of pawns often face off in what is effectively a deadlock, unless a player can make a diagonal capture (mimicking flanking) or bring in one of the “heavy” units to bear on the line of pawns.

Of course, this comparison has its limits. Infantry does often form the frontline, but rarely does infantry enter the battlefield first. That role goes to reconnaissance units and light cavalry. Since chess is a perfect information game, there is no need for reconnaissance, but this limits its utility as a tool for exploring real-world military strategy. Additionally, in combat, long-range units engage before close-range units, but chess’s capture system models only close-range combat.

All models are inaccurate, some models are useful

Of course, none of this should reflect negatively on chess. Whatever its origins, it has long ceased to be a model for military strategy. What interests me is the abstraction of chess. In trying to create a high-level model of military strategy, many aspects of warfare had to be abstracted away. First of all, the terrain: a chess board does not model the advantages and disadvantages provided by different types of terrain. Secondly, the game does not account for asymmetry of manpower, matériel, or other force multipliers such as training or technology: both sides start with the same number and type of pieces. Thirdly, the game does not account for any asymmetry of information: you always know exactly where your opponent’s pieces are at any given time. This isolates the importance of battlefield tactics.

For a game, this is ideal, because this means that the differentiator between two players is the quality of their tactical and strategic play, and nothing else. As a model of battle, however, this limits the utility of chess.

Games as models of reality

With the rise and dominance of video games, the potential of games as models has gotten a lot more interesting. In particular, simulation, base-building and strategy games are often designed to be semi-realistic models of something that exists — or could exist — in the real world: SimTower is a model of elevator traffic, Cities: Skylines is a model of urban traffic, and SimPark (a sorely underrated game) is a model of the ecology of a park habitat. Some strategy games are models of hypothetical or counterfactual situations: Frostpunk is about surviving a volcanic winter in the year 1886, Surviving Mars is a model of a future colonisation of Mars, and Jurassic World Evolution is a model of a series of dinosaur zoos.

What’s interesting about these types of model-building is that they tend to be based on complex systems.

You may have heard of the Cynefin framework, which divides problem-solving situations into four types: simple, complicated, complex and chaotic. The chaotic condition doesn’t interest us here, as it’s typically not modellable. The best breakdown of the remaining three conditions I’ve seen comes from a paper on healthcare reform by Sholom Glouberman and Brenda Zimmerman:

From Glouberman and Zimmerman (2002)

The Cynefin framework is designed to explain problem-solving circumstances or conditions, but that’s just the flip side of working on a system. A rocket is a complicated system and building one is a complicated problem. A child is a complex system and raising one is a complex problem. A habitat’s ecology is a complex system and managing one is a complex problem. Urban traffic is a complex system and managing it is a complex problem. International diplomacy is a complex system and navigating it is a complex problem.

Games with the highest replay value are often built on complex systems, with many interrelated variables that are not strictly solvable through maths alone. This gives the player complex problems to solve that are never quite the same on each playthrough.

(An aside: if a computer is a logic machine built on maths and maths alone, it’s worth asking if any computer games are truly complex, as opposed to merely complicated: given an infinite amount of time and computing resources, couldn’t an optimal solution to any game be found? Well, yes, and that is what speedrunners try to do. The most complex games have an element of randomness to them, but computers can’t really generate “true” randomness, which is also why random number generator (RNG) manipulation is even a thing. So, if you want to be a total pedant: such computer games are, strictly speaking, systems that are so complicated they are, for all practical purposes, complex systems.)
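The point about pseudo-randomness is easy to demonstrate. In the Python sketch below, a generator seeded with the same value produces the same sequence of “random” dice rolls every time. The function name `roll_sequence` and the dice framing are invented for illustration, and real games use their own generators rather than Python's, but the principle that RNG manipulation relies on is the same.

```python
import random


def roll_sequence(seed, n=5):
    """Return n pseudo-random dice rolls from a generator seeded with `seed`."""
    rng = random.Random(seed)  # independent generator with a known starting state
    return [rng.randint(1, 6) for _ in range(n)]


first = roll_sequence(seed=1234)
second = roll_sequence(seed=1234)
same_every_time = first == second  # identical seed, identical "random" rolls
```

A player who can control or predict the seed can therefore predict every roll that follows.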

Emergent patterns in complex systems: Cities: Skylines

A distinguishing feature of complex systems of any kind is that they will produce emergent patterns. Consider the pawn on a chessboard. The rules of pawn movement are simple: One square forward at a time. Two squares forward, if you wish, on any given pawn’s first move. Capture one square diagonally. And yet, within the game of chess, this simple movement gives rise to a whole range of theories and strategies revolving around pawn structure.
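To show just how small this rule set is, here is the whole of it as a Python function. It is a simplified sketch: the coordinate scheme and the name `pawn_moves` are my own, the pawn is assumed to move up the board, and en passant and promotion are left out.

```python
def pawn_moves(pos, own, enemy, has_moved=False):
    """List the squares a pawn can move to.

    Squares are (file, rank) pairs with both values in 0..7, and the pawn
    moves towards higher ranks. `own` and `enemy` are sets of occupied squares.
    """
    file, rank = pos
    blocked = own | enemy
    moves = []

    # One square forward, or two on the pawn's first move, if the way is clear.
    one = (file, rank + 1)
    if rank + 1 <= 7 and one not in blocked:
        moves.append(one)
        two = (file, rank + 2)
        if not has_moved and rank + 2 <= 7 and two not in blocked:
            moves.append(two)

    # Capture one square diagonally forward, only if an enemy piece is there.
    for step in (-1, 1):
        target = (file + step, rank + 1)
        if target in enemy:
            moves.append(target)

    return moves
```

A dozen lines of rules, and yet pawn structure fills entire books.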

What some of the best games do is to create a model with similarly simple rules that create emergent patterns. Look at a city-simulation game like Cities: Skylines. You lay out roads, zone residential, commercial and industrial districts, place amenities such as schools and hospitals, and implement a public transport network. The game simulates up to one million individual citizens’ movements: people wake up, go to work, go to the shops, go home. At work, they take deliveries or make them. Students go to school and, in the mid-afternoon, hang out in town or go to their friends’ homes. Tourists go from airport to hotel to tourist attraction.

(click on any of the following images to enlarge)

A city I built: the main city in the north-east, with two separate “centres”, a large industrial zone in the west, and the tourist district in the south-east.

City rhythms emerge out of up to one million individual modelled movements. Here, citizen Ashleigh Dixon is going to Club de la Crème, and she’s on foot for this part of her journey.

At Ashleigh’s workplace, Aero Designs, a company van (with a donut on top) is leaving the industrial zone to deliver goods to Busy Corner Shop.

Outside the Busy Corner Shop, a public bus is running its route. This one is almost full.

This tourist has gotten hold of a bicycle, and he’s cycling to the Expo.

A tourist is walking to the Stadium.

Another tourist walking to the Stadium. There must be an event happening there.

There’s no way to see if there’s an event at the stadium, but you can see that it draws both residents and tourists.

A truck brings forestry products to the cargo train station near the industrial park for export…

… while another truck moves forestry imports from the cargo train terminal to The Lumber Mill.

Imports can also travel the entire distance by truck. (If your city has coal or oil power plants but no coal or oil, they will have to be imported — and if the truck can’t get there in time, the plant stops running.)

Of course, exports can travel out of the city entirely by truck, too.

This commuter blimp is empty in the middle of the day…

… while this one, plying the route from the city to the tourist district, is full.

This tourist is leaving the city. Once he reaches the airport, he will despawn (I think).

Municipal facilities also generate their own traffic. Here, a fully-loaded garbage truck makes its way to the incineration plant.

Municipal services also employ people, adding to the realism of the simulation and the patterns that emerge from it.

By modelling individual movements, the game creates an emergent picture of traffic patterns in the city. People move en masse from residential to commercial and industrial districts, and then they go home en masse. Industrial districts generate traffic to commercial districts, as well as to and from other cities in the form of exports and imports. The sum of individual movements produces consistent patterns, and the joy of the game is to figure out how to make the adjustments that modify or constrain that emergent pattern.
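The step from individual movements to citywide patterns can be sketched in a few lines of Python. Everything here is hypothetical and far cruder than what Cities: Skylines actually does: the four citizens, their schedules and the `hourly_flows` function are invented purely to show how per-person trips add up to rush-hour flows between districts.

```python
from collections import Counter

# Hypothetical citizens: (home zone, workplace zone, hour leaving home, hour leaving work).
CITIZENS = [
    ("residential", "industrial", 8, 17),
    ("residential", "commercial", 8, 17),
    ("residential", "commercial", 9, 18),
    ("residential", "industrial", 8, 16),
]


def hourly_flows(citizens):
    """Count trips per (hour, origin, destination) from individual schedules."""
    flows = Counter()
    for home, work, leave_home, leave_work in citizens:
        flows[(leave_home, home, work)] += 1   # morning commute
        flows[(leave_work, work, home)] += 1   # evening commute
    return flows


flows = hourly_flows(CITIZENS)
morning_peak = flows[(8, "residential", "industrial")]
```

Nobody told the model that 8 o'clock is rush hour on the road to the industrial zone; that fact falls out of the individual schedules.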

In the case of Cities: Skylines, the key to managing your city’s traffic turns out to be isolating your industrial districts. Residential and commercial districts can mix, but industrial districts must be separate and have their own routes into and out of the city. (The fact that this is exactly how actual cities tend to be laid out is not a coincidence.)

Another Example: SimTower

An older and clunkier example of simple rules producing emergent patterns, perhaps, is SimTower. In SimTower, you’re tasked with building a tower that turns a profit and keeps all its tenants happy. Low rent makes people happy. Long elevator wait times make them unhappy. Having to travel far to get to a food or retail outlet makes them unhappy. People will use escalators where available, but never more than seven; people will use stairs where available, but never more than four. People will only change modes of transport once.
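The travel rules can be expressed as a simple route check. This Python sketch encodes my reading of the rules rather than SimTower's actual logic: a route is a list of (mode, count) segments, the limits apply to each run of stairs or escalators, and elevators are assumed to have no limit. The names `MAX_RUN` and `route_is_acceptable` are made up.

```python
# Maximum consecutive stairs or escalators a person will take in one stretch.
MAX_RUN = {"stairs": 4, "escalator": 7}


def route_is_acceptable(segments):
    """Check a route given as [(mode, count), ...] in the order travelled.

    Assumes consecutive segments use different modes, so a route with more
    than two segments involves more than one change of transport.
    """
    if len(segments) > 2:
        return False  # people only change modes of transport once
    for mode, count in segments:
        limit = MAX_RUN.get(mode)  # elevators have no limit in this sketch
        if limit is not None and count > limit:
            return False
    return True
```

Under this reading, `route_is_acceptable([("elevator", 30), ("escalator", 7)])` is allowed, while a route needing five flights of stairs or two changes of transport is not.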

What these rules create is an emergent pattern of play that encourages players to keep their office tenants separate from the residential and hospitality tenants in order to manage traffic. Here’s an excerpt of a post from r/SimTower titled Optimal Tower Design:

Office workers should never be using regular elevators to get anywhere. Office worker population density is too high. They must be forced to take the stairs (disable elevator access to their floors!)

David Wolever reached a similar conclusion. Here’s a passing comment from his post detailing how he beat the game:

… You will need to make sure that the lazy sims working in the offices don’t use these elevators.

These two players have very different styles of play, but they converged on a principle of play that is not immediately suggested by the rules. Furthermore, this happens to be the solution that real towers employ to manage traffic: separating the modes of transport used by office tenants from other tenants.

Optimal tower design: elevators do not run to offices.

You might argue that no real-life tower would prevent office workers from using the elevators. This is obviously true, and points to limitations in even well-designed complex models: any model is an abstraction, and will abstract away important information to some degree or another. In this case, SimTower’s model assumes that all units of a floor are accessible from any other unit of the same floor. There is no way to create segregated office blocks or areas with a shared retail area. (Project Highrise, a more recent tower simulation game, allows this.) This leads to players artificially separating the two groups by forcing office workers to use the stairs while residents get to use the elevators.

The educational value of games

If game models are flawed, what value do they provide? What can they really tell us about the thing they are modelling?

That’s what I want to look at next, with a thorough examination of Civilization VI and the gameplay that emerges from it.

Speedrunning and the Scientific Method

In a previous post, I wrote about speedrunning and I asked myself why I found it so enthralling.

Well, I should clarify that. I’ve watched a couple of Pokémon speedrunners grinding runs, which was supremely uninteresting. I watched pokeguy84 swear and reset repeatedly, and dropped in on an Exarion stream right after his main Pokémon died to a critical hit and the very promising run died with it. The only sound on the stream was the Pokémon battle music playing over and over. A viewer typed in Twitch chat, “He’s not coming back, is he?” Exarion did come back, and he did attempt a few more runs, but none of them were as promising as the dead run had been.

Huh, I thought. So this is speedrunning — long stretches of grinding boredom, punctuated by a few flashes of elusive euphoria.

When I wrote my last speedrunning post, there was no real insight to it. Obviously there was a link to be made between speedrunning and competitive sport, but to my mind, they differ in one key respect: there is a very clear reason to play sport besides the simple desire to be elite at something, or even to simply improve at something. Being able to run farther or lift heavier loads or respond faster to external stimuli has clear advantages outside of the arena of sport. I can’t think of any specific skills in speedrunning, whether mental or motor, that are in any way comparable. (Of course, I don’t speedrun, so perhaps there are real mental advantages that speedrunners develop that I can’t see from here.)

I sound like I think speedrunning is a waste of time, which is not the case. It’s worth repeating that my question is not: “Why do people speedrun?” My question is, “Why do I find good speedruns compelling?”

I was still pondering this question when I decided to watch a video by Summoning Salt, about the most infamous level in Super Mario Bros:

Summoning Salt is a real historian of speedrunning, and a really good storyteller, able to carve out the broad arcs of each game’s story while clearly explaining the little details that motivate each breakthrough. I quickly became a fan. I watched his Super Mario Bros video, then his Pokémon Red/Blue video, then his Sonic 2 video, then his Metroid and Super Metroid videos, then his Portal and Half-Life 2 videos, Donkey Kong (which is something special), Mike Tyson, Legend of Zelda...

And I started to realise, the real comparison here wasn’t speedrunning and sport, it was speedrunning and science.

The Incremental Nature of Science

As a kid, the stories of scientists that I read about tended to focus on the big breakthrough discoveries. I was especially interested in physics, and the history of early 20th century physics feels like it turns on a few major personalities: Max Born, Max Planck, Marie Curie — wait, how about this:

This was my impression of what physicists did: they made earth-shattering, groundbreaking discoveries that altered the landscape of human knowledge.

Only in college, when I was taking classes in linguistics and found myself loving the discipline, did I start to understand that this vision of science wasn’t the full picture. The paradigm shifters in any field get all the glory, but science is primarily incremental in nature: for every scientist whose name goes in the history books, there are a thousand others working on the little advances that make the big ones possible.

Speedrunning as a scientific endeavour

The process of speedrunning is, in many ways, similar to the process of science. It isn’t simply about playing a video game as fast as possible. Most of us, when playing video games, are content to live in the virtual world that’s placed before us and take it at face value, but speedrunners — they want to push their understanding of that virtual world to the absolute limit.

In an interview on the speedrunning podcast My Insane Pace, Pokémon speedrunner Shenanagans divided the Pokémon speedrunning community into three groups: the glitch hunters, the routers and the runners. I’m writing here as a total outsider, but I doubt these are strict categories: I imagine most Pokémon speedrunners will straddle two of these categories to some degree. The glitch hunters find ways to break the game that are useful to speedrunners. The routers are the theorycrafters who use their knowledge of the game (including any glitches that are allowed) to plan the fastest route through the game, and the runners are the ones who actually execute the routes.

Shenanagans was talking about Pokémon, but it’s clear from Summoning Salt’s videos that this is true of all speedrunning communities. The first runs in any game are usually superbly executed playthroughs by highly skilled players, similar to what a casual player might do except highly refined and nearly error-free. At this stage, the runners are the dominant force. They bring the target time down, until someone does a run that is so flawless, so optimal, that the possibilities of the route are exhausted.

Routers: The Engineers

At this point, the routers take over. They examine their assumptions about what the fastest path through the game is, and work to cut out any parts of the game that are not essential to finishing the game. Depending on the nature of the game, this work can take many forms. A game that is relatively linear, like Super Mario Bros or Metroid, might present relatively few routing possibilities. A game that is relatively open, like the main series Pokémon games, presents a multitude of possible routes, but very few of them will allow for a competitive speedrun.

Of the games that are commonly speedrun, there are only a few that I have completed myself and can talk with any degree of confidence about: Pokémon Generations 1-3, and Jet Set Radio Future (JSRF). JSRF is a relatively linear game, where the goals needed to unlock the next game levels are clear, and the fastest path through the game presents itself relatively quickly, even to a novice player. From a routing point of view, there are a few things you can do to save time: not all the characters need to be unlocked, some fights and races can be avoided, and an entire level (the sewers) can be skipped. However, because the game itself gives you a fairly limited set of permutations to progress through the game, the fastest route through JSRF is straightforward to learn.

The main series Pokémon games, on the other hand, are much more complex and present much more elaborate routing possibilities. A regular Pokémon player wants to maximise the total amount of experience points that the six Pokémon in their party have, and to catch as many Pokémon as possible along the way. This entails fighting as many battles and walking through as much tall grass as possible. This takes a great deal of time and makes up the bulk of a Pokémon playthrough.

A Pokémon speedrunner, however, is not trying to maximise total experience points. They are trying to do the opposite, to minimise the total needed to complete the game.

For a certain kind of gamer, this kind of question is an intellectual playground. What is the least amount of battling needed to get through the game? Six strong Pokémon take too long to train. One strong Pokémon is better than six weak Pokémon. What should that one Pokémon be? The biggest advantages you can have in battle are stats, moveset and type, so this one strong Pokémon should have high base stats, a powerful moveset, and preferably have few type weaknesses.

That narrows the 150 (or more) possibilities down to a much smaller pool of candidates, but the work of really optimising the route has only just begun at this point. Which battles are unavoidable? What Pokémon are encountered in these battles? What stats do these Pokémon have? What moves? Working from this information, what is the minimum amount of damage needed at each stage of the game to get through these battles, and the minimum amount of defense, special defense and health needed to survive? In a way, the game becomes an engineering problem, and all that routers are trying to do is find the most time-efficient solution.
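To make the engineering framing a little more concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the battle names, time costs, experience values and thresholds are invented for illustration and are not taken from any real Pokémon game or route, and real routers work with far more variables than this. It simply tries every combination of optional battles to skip, and keeps the fastest one that still leaves the lone Pokémon with enough experience before each unavoidable battle.

```python
from itertools import combinations

# Hypothetical route data; every number here is invented for illustration.
# Each step is (name, seconds, xp_gained, xp_required, optional).
ROUTE = [
    ("optional trainer A", 40, 300, 0, True),
    ("optional trainer B", 25, 150, 0, True),
    ("gym leader 1", 60, 500, 300, False),
    ("optional trainer C", 30, 400, 0, True),
    ("gym leader 2", 75, 900, 1000, False),
]


def route_time(skipped):
    """Return the total seconds for the route when the optional battles
    whose indexes are in `skipped` are avoided, or None if a battle is
    reached without the experience it requires."""
    xp = 0
    seconds = 0
    for index, (name, cost, gain, required, optional) in enumerate(ROUTE):
        if optional and index in skipped:
            continue  # walk past this trainer
        if xp < required:
            return None  # too weak to win this battle
        xp += gain
        seconds += cost
    return seconds


def best_route():
    """Brute-force every set of optional battles to skip and return
    (fastest_seconds, names_of_skipped_battles)."""
    optional = [i for i, step in enumerate(ROUTE) if step[4]]
    best = None
    for size in range(len(optional) + 1):
        for skipped in combinations(optional, size):
            seconds = route_time(set(skipped))
            if seconds is not None and (best is None or seconds < best[0]):
                best = (seconds, [ROUTE[i][0] for i in skipped])
    return best


if __name__ == "__main__":
    print(best_route())
```

With these made-up numbers, fighting every battle takes 230 seconds, while the best route skips only the second optional trainer and finishes in 205. Brute force is fine for a toy like this; it is only meant to show the shape of the question routers are asking, not how any real route was found.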

Glitch Hunters: The Field Researchers and Lab Scientists

It is impossible to discuss routing in many games without discussing the glitch hunters. Glitch hunters look for ways to break the game, often as a way to try to understand how the game is built. Think about a child playing with wooden blocks who builds a structure and then removes blocks, one by one, until it collapses: this is not too different from what glitch hunters do. They look at a game as a carefully constructed structure, and try to find which cornerstones and support pillars hold up the game, and how the structure holds up or falls apart when some of these elements are modified or removed.

(There is some discussion about what constitutes a glitch and what doesn’t, but I’m not going to get into that here. For our purposes, I’m going to group glitches, exploits and level skips together, because the differences are not important in this post.)

I could try to give you an example of how glitches help speedrunners to understand the game better, but it’s easier to let the experts do it. If you’ve ever played Pokémon Red/Blue/Yellow, I urge you to watch Shenanagans’ masterclass in catching all 151 Pokémon in Pokémon Blue from Summer Games Done Quick 2015:

Many glitches have effects that are undesirable in normal gameplay, sometimes irreversibly breaking the game. However, glitches of all stripes are of interest to speedrunners because they suggest ways that the game can be manipulated, and therefore open up new routing possibilities that non-speedrunners would not even dream of considering. Some speedrunning communities maintain separate glitched and glitchless categories, but glitches are still useful discoveries for glitchless runners, because what they reveal about the structure of the game may alter the calculus of routing possibilities there too.

Replicability

Here is where, to my mind at least, the parallels to the scientific method really start to take shape. A glitch is useless unless it is replicable.

In this respect, a glitch is no different from any other software bug. If you encounter a bug in a piece of software, the developer needs to know what you were doing right before the bug happened, because that’s the key to figuring out why it happens and how to debug the code. The critical difference between a software bug and a gaming glitch is how it is viewed. Developers understand how their software works, and bugs reveal oversights in the logic of the software. Glitch hunters do not have that privileged view of the software, and glitches reveal insights into the game logic.

Developers are watchmakers who craft a complex piece of machinery, while glitch hunters are more akin to field researchers and lab scientists, rigorously documenting new discoveries and tweaking experimental variables.

If a glitch cannot be replicated, it might as well be an accident caused by the sneezing of the universe. If it can be replicated, however, routers can add it to their arsenal of game knowledge and look at how this glitch or exploit can help gain time on existing routes.

Tool-Assisted Speedrunning: The Theoreticians

In some speedrunning communities, there is an additional group of speedrunners often critical to helping lower records: the tool-assisted speedrunners. A tool-assisted speedrun (TAS) is a run done in an emulator by a player who has meticulously planned out the game inputs (i.e. button presses) frame by frame, so that routes requiring multiple frame-perfect inputs that might be impossible for human reflexes can be run. In other words, the route is run by a computer.
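As a rough illustration of what "planned frame by frame" means, here is a small Python sketch. The "game" is a toy I have made up for the purpose (a runner on a one-dimensional track with two pits), not a real game or emulator, and the names `PITS`, `GOAL` and `replay` are all hypothetical. The point is only that when a game has no randomness, a fixed list of inputs always produces the same result, so a route can be written down ahead of time and replayed exactly.

```python
# Toy, fully deterministic "game": a runner on a 1-D track with pits.
# The rules are invented for this sketch; this is not a real game or emulator.
PITS = {3, 7}  # tiles that end the run if the runner lands on them
GOAL = 10      # reaching this tile finishes the run


def replay(inputs):
    """Feed one input per frame ('R' = run, 'J' = jump, anything else = wait).

    Returns the number of frames used to reach GOAL, or None if the runner
    falls into a pit or the inputs run out before the goal.
    """
    position = 0
    for frame, button in enumerate(inputs, start=1):
        if button == "J":
            # A jump only helps on the exact frame the runner stands
            # right before a pit; anywhere else it is a wasted frame.
            if position + 1 in PITS:
                position += 2
        elif button == "R":
            position += 1
        if position in PITS:
            return None  # fell in
        if position >= GOAL:
            return frame
    return None  # ran out of inputs


if __name__ == "__main__":
    planned = "RRJRRJRR"       # both jumps land on the one frame that works
    mistimed = "RJRRJRRR"      # first jump is one frame early
    print(replay(planned))     # 8
    print(replay(mistimed))    # None
```

In this toy, the planned input string finishes in 8 frames every time it is replayed, while pressing jump a single frame early wastes the jump and sends the runner into the first pit. A real TAS works with actual controller inputs in an emulator, but the idea of a precise, repeatable script of inputs is the same.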

TASes do not qualify for records, but I noticed that they often feature prominently in Summoning Salt’s world record progression videos. If a route is theoretically possible but very difficult to execute, a TAS is often presented as a proof of concept. Sometimes TAS routes are run, then shelved as being too difficult for a human to run, then revisited when the potential of easier routes is exhausted, as in the case of Super Mario Bros Level 4-2 (this link goes to the relevant TAS section, but you may want to start from the beginning for context). Other times, a difficult route is run by TAS to set a mark for human players to work towards, as in the case of the Weathertenko on Mario Kart’s Choco Mountain (this link also goes to the relevant TAS section, but you may want to start from the beginning for context).

The obvious scientific analogue here is the theoretician, the scientist who works with models and predictions, and whose work helps to advance understanding of what is possible but is not immediately applicable.

TASes are not useful for all speedrunning games. They are useful for testing extremely challenging routes in games with almost no randomness in the gameplay, like Super Mario Bros. In a game like Pokémon with a considerable amount of randomness, however, there are too many moving parts for TASes to be useful. In such speedrunning communities, the routers are the ones who take on the job of presenting these proofs of concept.

The Structure of World Record Progressions

Thomas Kuhn is credited with introducing the idea of the paradigm shift in his 1962 book, The Structure of Scientific Revolutions. He argues that “normal science” is punctuated by periods of “revolutionary science”, when existing models of science prove to be inadequate, and a period of tumult follows as new models are constructed to accommodate paradigm-breaking discoveries. These new paradigms demand the re-evaluation of old data, and eventually one will become the basis for the next period of normal science.

Kuhn’s presentation of the idea was controversial, but the concept is nonetheless useful in relation to speedrunning. If we think of a route as a paradigm, then the rest of the analogy falls into place easily.

In the early days of a game’s speedrunning history, skilled players converge on one optimal strategy to speedrun the game, and successive runs refine this route/paradigm until its potential is exhausted. Then, routers start examining assumptions about the best way to run the game, theorycrafting new routes/paradigms and incorporating lesser-known tricks and more difficult manoeuvres. In doing so, they work closely with glitch hunters and tool-assisted speedrunners to figure out how the game is built and what the theoretical limits are. They look at known glitches and explore ways to incorporate them into new and better routes, re-evaluating all their assumptions about established routes along the way.

Then, a breakthrough: either a new glitch is found, or an old glitch is “rediscovered” and successfully incorporated into an improved route/paradigm that might lower the theoretical speed record once more, so that the runners can have another go. The cycle begins again.

Obviously, this is a very abstract and overly simplified picture. The real process of lowering a speedrun record is much more nuanced, and all of the processes I describe in this post — glitch-hunting, tool-assisted speedrunning, routing, and the actual speedrunning — occur simultaneously, not sequentially. If you consider the scientific analogues — engineers, field and experimental researchers, theoreticians — it’s the same thing: progress in each domain happens simultaneously, not sequentially. Theoreticians don’t stop working while they wait for their colleagues in the lab to publish. Nonetheless, I think this division of labour is a useful lens through which to look at speedrunning.

The speedrunners: what about them?

So far, this analogy hasn’t accounted for the runners themselves. The runners are the ones who actually perform the speedruns, so what would their scientific analogues be? Truth be told, I don’t know. The analogy breaks down here, because so much of science is predicated on progressing knowledge through the accumulation of controlled, repeatable experiments, and speedrunning records are, by definition, exceptional. The methods to perform the speedruns may be rigorously documented, but few people on the planet will ever be able to repeat the methods to the required precision, and even then they may not match or beat the speedrun records of any given category. This is antithetical to the idea of replicability.

This is where speedrunning diverges from science and converges with sport. In fact, speedrunning uses the language of speed-based sports: pace, split, personal best, world record.

This is the face of speedrunning, the only dimension that most of us ever interact with, twice a year when Games Done Quick rolls around and millions of gamers watch the world’s best speedrunners performing jaw-dropping feats of gaming. It’s probably why the comparison of speedrunning with sport seemed so natural to me at first.

Every GDQ run includes a commentator’s couch, where other speedrunners explain what the runner is doing and, very, very briefly, the underlying mechanics that allow each section of the game to be optimised to such a degree. In a way, the couch commentators represent the real experience of speedrunning: the experimentation, the refinement of each route, and the collective knowledge of the community.

It would be easy to regard speedrunners as the leading edge of the speedrunning community, the ones who push past the limits and establish new frontiers, like the star scientists at the Solvay Conference. Perhaps it is more fitting, though, to think of the star speedrunners at GDQ as the keystone of an arch. Each block, each voussoir, is an incremental step towards the keystone. The keystone is merely the final piece of an architectural puzzle that rests upon all the blocks below it and gives the arch its shape. The speedruns are the raison d’être of the speedrunning community, but the joy of speedrunning lies in the process of discovery and inquiry that makes the speedruns possible.

Additional thoughts

  • Obviously there isn’t a true one-to-one analogy with science here, especially since many speedrunners play multiple roles. This is just a useful way to frame the process of lowering speedrunning records.
  • Kuhn’s paradigm shift model is often contrasted with incremental science. In this post, I have blended the two, just to avoid a long and unnecessary philosophy of science discussion.
  • Another area that’s worth a comparison with speedrunning is speed climbing. I know very little about climbing, having never climbed myself (two half-days of climbing at Outward Bound don’t count). What prompted this idea was an article by Kelly Cordes in Outside magazine about the risks of speed climbing. Like other speed-based sports that occur on established routes, speed climbing involves route-based records, a continual search for time saves, and the need for near-perfection on record climbs. What’s unusual about speed climbing, and what it shares with speedrunning, is the use of techniques never used by traditional climbers. Recreational runners, swimmers, cyclists, rowers, skiers, etc. all use the same techniques as the fastest professionals, just not as fast. Speed climbers and speedrunners, on the other hand, engage in their hobbies in ways that traditional climbers and gamers would never consider doing, or even actively object to.