Pathfinder: Kingmaker (PF:K for short) is a role-playing video game created by Owlcat Games, released in Fall 2018 on Steam and GOG. Inspired by classic BioWare games, it uses the ruleset of a popular tabletop role-playing game, features Real-Time with Pause combat and an isometric camera, and tells a non-linear story with multiple unique endings.

In this article, I will share a little about how we worked on designing the audio throughout the game’s development including task management, the search for inspiration, and troubleshooting. An experienced specialist may not find anything particularly groundbreaking in this recap, but beginners and enthusiasts will definitely discover some points of interest.

Initial Steps

Audio design begins with the search for key ideas that then become the foundation for balanced resource development. This includes music, sound effects, and voice acting. The rule of thumb here is to look at other projects which are similar to the one you’re working on.

Our goal was for PF:K to stand out from other new-generation RPGs and prove itself the spiritual successor to the Baldur’s Gate series. We started with a role-playing system that has withstood the test of time (tabletop Pathfinder is based on a modified version of D&D 3.5e), and the game’s first chapter was essentially a Neverwinter Nights 2 mod built by our designers. Meanwhile, Victor Surkov, our art director, created a beautiful concept showing how the upcoming game would look and feel. During the early development stage, the whole team played Pathfinder every week, both to master the ruleset and to get a grasp of how the game felt to play: the physical process of throwing dice and moving game pieces created a real feeling of adventure.



The concept art for our upcoming game.

With this in mind, we spent several weeks discussing the games, movies, and books we could use for inspiration and reference. We ended up deciding upon the following guidelines:

1. General Feel

Unlike Pillars of Eternity and Divinity: Original Sin, which had already been released, our project was set to respect the high fantasy genre. That means, more or less, a 50/50 balance of serious story and fun without any dramatic shifts to one or the other throughout the gameplay. We wanted to recreate the feeling of discovering an old, all but forgotten fairytale someone had dug up in the attic and decided to read on a winter evening.

2. Musical Score

We decided to base the musical atmosphere on the works of Inon Zur, who had already amassed a considerable repertoire with games in this genre (the Throne of Bhaal expansion for Baldur’s Gate 2, the Dragon Age series). Our starting point for the first compositions was Basil Poledouris’s score for Conan the Barbarian, which combined a traditional orchestra with a variety of ethnic musical instruments.

Inon studied the visual and story materials we had provided and wrote several key musical themes, which then served as a foundational reference for sound and atmosphere when we worked with other composers (unfortunately, we couldn’t afford to have Inon Zur compose it all).

Inon Zur talks about working on music for Pathfinder: Kingmaker.

We did, however, allow ourselves to pull away from the musical reference when it made sense to do so. For example, the game features something called the First World – a plane of existence the Gods use as a draft for creation. As such, everything within it is in a constant state of transformation. This kind of location demanded a unique approach to music, and it was far from the only location that needed special care. We decided that our own rules should not limit creative work whenever departing from them was a reasonable judgement call.

3. Principles of Sound Design

For sound design, our reference points were Diablo 3 and Dragon Age. The former had very distinct combat effects (the clanking of weapons, magic effects, enemy voice-over) coupled with a great balance of audio throughout the game. Meanwhile, Dragon Age helped us understand how we wanted cities, most notably our capital, to sound. We established the following criteria:

Combat accounts for the largest part of the gameplay and must be entertaining. This is why the sounds of battle should be distinct, juicy, and maybe even a little rough, but not too real. We wanted the clear clangs of metal, bones cracking and blood spurting as you finish off an enemy.

At the same time, magic effects should still be clear and comprehensible during an intense battle. To maintain this focus, magic sounds have a narrow stereo field with a distinct impulse. We determined that it would be acceptable to add synthetic sounds and textures to natural sounds – this helped us create more recognizable sounds that were immediately memorable, distinct and specific.

Locations set in nature should be non-intrusive, well-detailed and free of repetitive patterns. The audio should still hint at what could be taking place just out of sight (wolves howling in a nearby forest at night, rustles and moans in a forgotten dungeon, growls and rumbles in a troll’s fortress, etc.).

4. Voice Acting

Our budget did not allow for cinematic cutscenes with detailed facial animation for dialog (like in Dragon Age). To compensate, we decided to emphasize the characters’ voices: every hero had a distinctive trait that would always be highlighted by the actors’ delivery. This kind of approach creates consistency that makes it easy to track the feel of each character.

For example, Harrim the Dwarf is always melancholy, and in some scenes his gloom intensifies to the point of becoming a kind of ecstatic faith. Jaethal’s comments reek of disdain 24/7. Halfling Linzi’s banter shows her enthusiasm and frivolity.

You don’t often find this approach in modern games. The ‘theatrical’ style is more typical of older games and of cartoons from the 70s and 80s. For us, it helped set the tone and atmosphere we were looking for.

Let’s examine how these ideas evolved throughout development.

Nature Sounds

Our game has more than 200 different locations, all of which require ambient sound. We started by grouping all the locations a player can visit into settings. Each setting is a typical combination of nature and plot variables. If several locations exist within one setting, we use the same combination of sounds with some variations.

The settings in our game are as follows: forest, plains, hills, dungeons, ancient ruins, swamps, and The First World. Artists and narrative designers chose photos and art that best depicted the appropriate mood and then drafted up a list of locations and quests. There was no shortage of material since our game is based on the Pathfinder tabletop game.



Art reference for the Plains setting.

The sound team used these documents as a reference point and started thinking about how to expand and deepen the feel of the setting. It was important to make sure that the ambient sounds would not become annoying after 50-80 hours of play. This meant making them as ‘alive’ as possible, which, in turn, meant packing in as many small, randomized details as we could.

We ended up establishing the following components to determine the sound layout for each location:

Basic setting sound: Several (three or more) sound layers are played in equal balance across the stereo field. These layers typically include the main sounds in the setting that are not affected by the player’s positioning: various gusts of wind, a light breeze, insects, birds and animals somewhere in the distance. Some locations require additional specific elements that can be heard from time to time: falling rocks, rustling leaves, etc.

Invisible local sound sources: Birds and insects, reptiles, creaking trees, howling wind. Sounds like these create more interest in exploring the location. The player can separate distinct sounds from the ambience and follow them to their source, making the ambient sounds an integral interactive part of gameplay. Think of this as landscaping with sound: with a bird’s song or a cricket’s chirp we illuminate an otherwise not-too-special part of the map and attract the player’s attention to potential secrets hidden within. Or, for example, picture two identical locations, except that in one you have a choir of jays, the drumming of woodpeckers and the occasional cuckoo, while in the other there are just crows endlessly cawing. These locations have a very different feel, despite similar visual references.

Specific location sounds: These usually come from the location’s inhabitants and hint at certain events at that location. Trolls growling, kobolds hissing and arguing, someone hammering in a dungeon, eerie sounds in ancient ruins, etc. Most often, these sounds are also located somewhere not too far away and provide hints at what may be encountered without overly specific audio cues.

Visible sources of sound: Characters, animated birds and animals, rivers, bobbing boats, campfires, swinging signs.

A mix of all of these components gives a rich, diverse sound palette. The game’s environment does not begin to feel old even after the player has stuck around in one location for several hours.

Example of game location audio: the fishermen’s village.

Bringing a City to Life

We had to take a different approach with the capital and other settlements. Unlike natural locations, cities need to sound full of life and action depending upon the time of day as well as the protagonist’s alignment.

Alignment is a core gameplay element in any D&D-based game. A character’s moral/ethical stance can be mapped on two axes: Evil vs Good and Chaotic vs Lawful. Both axes have a neutral state in which the character doesn’t lean strongly in either direction and their individual decisions are dictated by the situation.

Upon completion of the first chapter, our protagonist forms his own little state and builds his capital on the site of a partially-destroyed fort. Actions performed during the game are reflected in the protagonist’s alignment and go hand in hand with the establishment of laws in line with this alignment. Law and order shape the behavior of citizens in the capital, so the character of the city ultimately mirrors the protagonist’s personality. We wanted sound to help demonstrate all of this.

Naturally, we weren’t able to find any finished product that could help us in this task (we were breaking new ground here). So we hired ten actors (five men and five women) to record various sorts of crowd clamor, also known as ‘walla’ in the movie-making industry. Each actor also recorded an individual set of reactions to game events. All of these recordings served as building blocks that we used to assemble the sound atmosphere of the capital.

Before we began, we needed to settle on the language we wanted the crowd to speak. Russian didn’t work for us (as dialog was recorded solely in English), but working with foreign actors would have been too costly for this task. We ended up asking our actors to pick several syllables they would use to improvise words and phrases to go with certain emotions. The goal was to convey a feeling without drawing attention to the specific ‘words’ being spoken. This worked in the end, as international players have said they do not recognize the words but are able to decipher their meaning.



A list of topics for the actors’ improvisation.

Here’s how the sounds of the capital are composed:

The crowd: Ambient voices engaged in banter, spread equally across the stereo field. There are three variations: calm (neutral), lively-positive (dominant laughter, drunken exclamations, friendly tones), and lively-negative (dominant frightened shrieks, whispering, and occasional weeping).

Resident conversations: If you approach certain houses, you can hear the faint voices of their inhabitants (usually married couples). When they are neutral, you can hear them talking without any abrupt emotional outbursts. When they are positive, you hear them laugh and cheer, and when they are negative you hear them argue, cry, or sulk.

Cries from the distance: A set of random phrases that can be heard in various parts of the capital, depending on the alignment. Most importantly, these phrases can never be traced to a speaker: the player only hears them from a distance. This allowed us to save money on animations and performances while dramatically livening up the sound palette.

Urban activities: A blacksmith hammering away, pets making noises, a cart passing by, etc. These do not have a specific locale, but they add to the sense that the area is populated.

Nature sounds: Our capital begins as a small village and eventually grows into a city built of stone. Natural elements aren’t completely erased as the population grows: we can still hear birds chirping on rooftops, crickets chirping in the evening, and the wind blowing amid the increasing clamor of the city.

Sounds of the capital.

Weather and Seasons

Around a year and a half into the development cycle, we began working on weather in the game, including rain and snowfall of various degrees of intensity. More than half of the locations in the game had already been prepared by then, with neither a dynamic weather system nor weather audio in place. So, how could we add weather effects to these locations without creating problems with the content we had already created and implemented?

Allow me to go on a tangent for a moment: all ambient sound in the game exists in its own Unity Scene which is loaded along with the location. Before starting to think about adding weather effects, we had two scenes for every location – for day (morning and afternoon) and night (evening and night). Naturally, we made these for clear weather.



Example of an audio scene.

It became clear that we wouldn’t have time to craft two more versions for every audio scene (adding rain and snow), and that it would be better to have these effects on top of what we already had. We also needed to adapt separate elements of ongoing sounds so that nothing would wash out the weather effects.

To solve this, we added the universal states Rain and Snow to the game, along with a weather effects intensity controller. This let the game layer weather effects on top of whatever was launched when the location loaded and adjust the volume of the other sounds in the location dynamically.



The audio buses set up for ambient sounds.

The picture above illustrates how we used our own audio bus called ‘LocalLive’ to play insect and bird sounds. This channel was split into two separate buses for non-migratory birds (AllSeasons) and migratory birds (Summer).

During the winter, all insects, toads and migratory birds are muted, but sedentary birds that stay for the winter can still be heard (in our game, these are mostly crows and owls). During other seasons, when it starts to rain, all insects and birds gradually fade out and then fade back in when the weather clears. All of this is regulated by the weather effects intensity controller, which adjusts the volume of the aforementioned audio buses.
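As a rough illustration of the muting and fading rules above, here is a minimal Python sketch that maps a season and a rain intensity to gains for the two sub-buses. It is not the actual Wwise setup: the function name, the 0 to 1 intensity scale, the linear fade, and the idea that insects and toads share the Summer bus are all assumptions made for the example. Only the bus names AllSeasons and Summer come from the project.

```python
def bus_gains(season: str, rain_intensity: float) -> dict:
    """Return linear gains (0.0 to 1.0) for the two LocalLive sub-buses."""
    if season == "winter":
        # Migratory birds, insects and toads are muted all winter;
        # sedentary birds (crows, owls) keep playing.
        return {"AllSeasons": 1.0, "Summer": 0.0}
    # Clamp the hypothetical intensity parameter to the range 0..1.
    rain = min(max(rain_intensity, 0.0), 1.0)
    # Assumed linear fade: full volume in clear weather, silent in heavy rain.
    rain_fade = 1.0 - rain
    return {"AllSeasons": rain_fade, "Summer": rain_fade}


print(bus_gains("summer", 0.25))
print(bus_gains("winter", 0.0))
```

In a real project this mapping would more likely live in the audio engine as curves driven by a single game parameter, so that the audio team can tune it without touching code.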

The controller also helps adjust rolling thunder during a storm. At times of light rain, thunder comes in 2-3 seconds after the lightning and as the rain intensifies the thunder arrives faster and becomes more powerful. At full intensity, there is no lag between lightning and thunder.
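The thunder behavior can be sketched in the same spirit. The Python below assumes a 0 to 1 intensity value, a linear shortening of the delay, and a hypothetical minimum gain of 0.5 for distant thunder. The text only states the end points (a 2-3 second lag in light rain, no lag at full intensity), so everything in between is an illustrative guess.

```python
import random


def thunder_delay(intensity: float, rng: random.Random) -> float:
    """Seconds between a lightning flash and its thunder."""
    i = min(max(intensity, 0.0), 1.0)
    # Light rain: thunder lags the flash by 2-3 seconds.
    base = rng.uniform(2.0, 3.0)
    # Assumed linear shortening; at full intensity the lag is zero.
    return base * (1.0 - i)


def thunder_gain(intensity: float, min_gain: float = 0.5) -> float:
    """Linear gain of the thunder clap; min_gain is a made-up floor."""
    i = min(max(intensity, 0.0), 1.0)
    return min_gain + (1.0 - min_gain) * i


rng = random.Random(42)
for level in (0.0, 0.5, 1.0):
    print(level, round(thunder_delay(level, rng), 2), thunder_gain(level))
```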

Demonstration of the weather effects.

Dialog and Character Lines

Voice acting is one of the most expensive aspects of developing games: a good actor’s hourly rate starts at around $200 and the sky’s the limit. On top of that come the costs of studio time and sound engineering work (editing and mastering).

More than a million lines were written for PF:K with dialog accounting for more than half of the content. We couldn’t afford to have all of these lines voiced, and after lengthy debate, we agreed to have actors voice only 10% of the dialog in the game, as well as all standard character reactions at the camp since the player will need to hear these sounds very often.

Standard Reactions

Standard reactions are heard from the protagonist as well as their companions whenever the player does pretty much anything in the game. First and foremost, they tell us whether the latest order has been accepted or rejected, or indicate that a character is in danger. These sounds serve an important gameplay function, but it is equally important that every line is written to capture the character’s true nature. Ideally, you can understand a lot about a character after listening to 2-3 of their lines. We’ve also included a few gags you can hear if you keep clicking on the character’s portrait, an age-old audio design trick going back at least to Warcraft II.

One interesting fact is that we had all regular (non-combat) lines recorded in whispers, too. This was required for the game’s stealth mechanics that allow characters to move and act undetected.

Character reactions.

Campfire Talks

Campfire talks are a set of mini-dialogs in which we can learn more about our characters and their pasts, habits, or what they think about recent events. If there’s nobody to talk to (the protagonist doesn’t engage in campfire talk), a lonesome companion can express some opinion on a pressing issue with a few lines of text.

Character conversations during Rest/Camp.

Magic Effects System

PF:K uses around a thousand magical effects. Once we calculated how much time and how many resources would have to be allocated to work on them, we decided to optimize the process. Since many of the effects are similar in terms of mechanics and visualization, it was reasonable to reuse sounds and combine them on the fly: take the sound of blazing fire from one effect, a small explosion from another and a side-tone from a third. This approach helped us cover many effects that were new but not totally unique.

To do this, we created a Unity component called the Magic Constructor, which allows us to line up any number of ready-made sounds and set the volume, pitch, and playback delay of each.



An example of the Magic Constructor Unity component.

After sorting out the mechanics, we started working on the basic elements that would form the final sound. Magical effects are usually split into three main layers:

Impulse: this sets the power and range speed.

Domain: the force of nature or the school of magic (fire, water, air, transmutation, enchantment, etc.).

Decorative elements: low-pitch blows, crystals tinkling, transitional gusts of wind, etc.

Not all three layers have to be used, but this kind of partition grants flexibility and simplifies the task when basic magical schools get additional variants in which only one element changes (e.g. acid instead of fire).
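To make the layering idea concrete, here is a minimal Python sketch of the data such a component might hold. It is not the Magic Constructor itself, which is a Unity component. The class name, the helper functions, and the asset names are hypothetical; only the three layer roles and the per-sound volume, pitch, and delay settings come from the description above.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Layer:
    role: str            # "impulse", "domain" or "decor"
    sound: str           # hypothetical asset name
    volume: float = 1.0  # linear gain
    pitch: float = 1.0   # playback-rate multiplier
    delay: float = 0.0   # seconds after the effect starts


def swap_domain(layers: List[Layer], new_sound: str) -> List[Layer]:
    """Build a variant in which only the domain layer changes."""
    return [replace(l, sound=new_sound) if l.role == "domain" else l
            for l in layers]


def playback_order(layers: List[Layer]) -> List[str]:
    """Sound names in the order they would start playing."""
    return [l.sound for l in sorted(layers, key=lambda l: l.delay)]


fire_bolt = [
    Layer("impulse", "whoosh_fast", volume=0.9),
    Layer("domain", "fire_blaze", delay=0.05),
    Layer("decor", "low_thump", volume=0.6, pitch=0.8, delay=0.20),
]
# The acid variant reuses the impulse and decorative layers unchanged.
acid_bolt = swap_domain(fire_bolt, "acid_sizzle")
print(playback_order(acid_bolt))
```

The point of the sketch is the last two lines: a new variant costs one new sound rather than a whole new effect.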

The Magic Constructor in action.

Amplifying What Is Important in Battle

Combat in Real-Time with Pause is a mode in which you can have 12-18 characters fighting simultaneously, but the death of just one hero can lead to the opponent’s victory. To help players find their bearings, we tried to make combat as informative as possible through sound cues, i.e. amplifying what is important and leaving out what is not.

Using High Dynamic Range Audio

Audio in PF:K runs on the Audiokinetic Wwise engine. One of its greatest advantages is High Dynamic Range Audio (HDR audio) which expands the range and pinpoints which sounds are louder or prioritized compared to others.

Wwise analyzes all sounds triggered in the game at all times and adapts the audio mix by having the most important events turn down the volume on less important sounds whenever the sound director deems it necessary. For instance, if a grenade goes off right next to you or when you’ve just slain a dragon, you definitely do not need to hear birds chirping in the forest or the sound of your own footsteps at that moment. The mix clears out to make way for important events and then restores to its regular state.

We used this system cautiously so that the player would notice it working only in extreme situations. All sounds were split into groups with various virtual volumes and it took us a while to adjust various settings to find the most comfortable solution. This is the hierarchy of battle sounds that we came up with:

1. Magical effects of great power and voices of gigantic creatures

2. Critically important messages from characters (low health, unconsciousness)

3. Regular magical effects (flares, explosions, transformations, etc.)

4. Spoken incantations of magical spells

5. Regular lines from characters and enemies: reactions to engaging in battle, orders, battle cries, etc.

6. Weapon sounds: swishing, shots, inflicting damage

7. Other character-related sounds: footsteps, armor clinks, sheathing and unsheathing of weapons

The system allows the player to always have a chance at noticing magical effects or cries for help in the midst of battle, providing an opportunity to take time to pause the game and give orders as needed.
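The effect of such a hierarchy can be illustrated with a toy ducking model in Python. This is not how Wwise computes HDR internally, and the 2 dB step per rank is an invented number. The sketch only shows the principle that the highest-ranked group currently playing pushes lower-ranked groups down. The group identifiers are shorthand for the seven categories listed above.

```python
# Highest priority first, following the hierarchy above.
PRIORITY = [
    "powerful_magic_and_giant_voices",
    "critical_messages",
    "regular_magic",
    "incantations",
    "regular_lines",
    "weapons",
    "other_character_sounds",
]
RANK = {name: i for i, name in enumerate(PRIORITY)}


def duck_db(active, step_db=2.0):
    """Attenuation in dB per active group, relative to the top-ranked one.

    step_db is a hypothetical tuning value, not a figure from the game.
    """
    if not active:
        return {}
    top = min(RANK[g] for g in active)
    # Groups further down the list are attenuated more.
    return {g: step_db * (top - RANK[g]) for g in active}


print(duck_db(["weapons", "critical_messages", "other_character_sounds"]))
```

When nothing important is playing, the top-ranked active group is simply whatever is left, so footsteps and weapon sounds return to their normal level on their own.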

Prioritizing Volume of Protagonist Weapon Sounds Over Enemy Weapon Sounds

Here’s a small detail that proved to be useful: when battle breaks out primarily amongst melee fighters (swords, axes, daggers, maces, etc.), swishes and blows make up as much as 90% of the combat audio.

To help players distinguish their own characters in all of this mayhem, we decided to lower the volume of opponent weapons by 3 decibels: players hear their own attacks more clearly and can focus on their own successes. If they end up receiving massive damage, that will be conveyed by character reactions and the game’s interface.
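For readers less used to decibels, the standard conversion shows how subtle this change is. The short Python snippet below (the function name is just for illustration) turns a dB offset into a linear amplitude multiplier: a 3 dB cut scales the amplitude to roughly 0.71, which corresponds to about half the acoustic power.

```python
def db_to_gain(db: float) -> float:
    """Convert a level change in decibels to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


enemy_weapon_gain = db_to_gain(-3.0)
print(round(enemy_weapon_gain, 3))       # amplitude multiplier
print(round(enemy_weapon_gain ** 2, 3))  # power ratio
```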

Finishing Off Enemies

Another small detail: every finishing move in the game triggers a special audio cue of a low-pitch kick along with splattering blood (or whatever runs through the enemy’s veins). The player receives an audio reward for being victorious in battle.

Sounds of combat.

Working with Music

How much music does a game need for 60-80 hours of gameplay? The obvious answer is as much as possible! For us, the realistic answer was around 90 minutes of musical material. This meant, first of all, we needed to avoid repetitiveness in which the same melodies are looped for hours on end, leading the player to switch off the in-game music and turn on their own playlist in the background.

Our solution:

We split all of our music into three categories: story, exploration, battle themes. Story music can be heard in dialog and important cutscenes. This provides the right emotional tone. Exploration music is heard during gameplay which takes place outside of combat. Battle themes are, obviously, reserved for battles.

Exploration music accounts for the majority of game music, and that’s why it plays an important supporting role. These compositions have been written to remain subtle and blend in with nature sounds. Story and battle music can be both expressive and memorable: we hear these compositions far less, usually at emotion-inducing moments.

In most locations, exploration music is played with pauses of 40-50 seconds between tracks. This allows the player to get a better feel of the location through two equal streams of information – music and a carefully designed audio atmosphere. This means that the music doesn’t get annoying and old even after a long time.
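The spacing rule is simple enough to sketch. The Python below lays out a timeline of track start times with a random 40-50 second gap after each track. The function name, the track names and durations, and the use of a uniform random gap are assumptions for the example; the text only gives the 40-50 second range.

```python
import random


def exploration_timeline(tracks, rng, min_gap=40.0, max_gap=50.0):
    """Return [(start_time_seconds, name)] for a list of (name, duration)."""
    t = 0.0
    timeline = []
    for name, duration in tracks:
        timeline.append((t, name))
        # After the track ends, leave 40-50 s of ambience before the next one.
        t += duration + rng.uniform(min_gap, max_gap)
    return timeline


playlist = [("plains_theme", 120.0), ("forest_theme", 95.0), ("hills_theme", 140.0)]
for start, name in exploration_timeline(playlist, random.Random(7)):
    print(round(start, 1), name)
```

With gaps like these, three tracks that total about six minutes of music cover roughly eight minutes of play, and the ambience carries the rest.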

Battle and story music are played in a cycle all the way until the end of the event, maintaining a constant feeling of suspense.

Otherwise, music in PF:K takes a rather traditional approach. We sacrificed additional interactivity for a better structure and a more memorable experience (relying also on our robust environmental audio ambiences). One thing that does change in the tracks from time to time is the opening sequence for battle themes. There are usually 2-3 versions.

Numbers and Charts

Finally, here are some general stats:

Music:

Six composers worked on the game: two in-house studio employees and four professionals contracted for the project.

A total of 60 compositions were written, amounting to 105 minutes of music.

A large part of the music was written with the help of virtual instruments with some elements using “live” vocals and brass instruments.

Tavern compositions were mostly recorded live.

Sound:

Eight sound designers took part in the game’s audio design, with six of them under contract for the project. Their help was needed to create sound effects for a very long list of monsters and magical effects.

The project uses more than 23,000 sound files in the following categories:

○ Locations (ambience)

○ VFX, magical effects

○ Monsters, NPCs

○ Dialog events

○ Weapons

○ Interface

○ Interactive actions

○ Character reactions

○ Dialogs

Voice acting:

More than 15 hours of dialog was recorded. It took about 150 studio hours to do this.

The voice acting was done with the help of 45 actors. Most of the actors worked with us through two studios in New York while others we chose to contact directly.

Conclusion

A lot of work went into the audio for PF:K over the span of two years. So, what have we achieved? A lot, I believe, as most reviews cite sound, music and voice acting in the game as some of its greatest features. Many players share small details of their audio discoveries in online discussions. I believe we achieved all of this mostly thanks to the following:

We started planning the game’s audio during pre-production when we could still discuss and approve all key ideas with other members of the team. This is when we established what the game’s sound would be like and what role it would have, as well as its main aesthetic principles. We knew what we wanted from the outset, so we built the whole process from day one and carried it through all the way to the game’s release. None of this would have been possible as an afterthought midway or near the end of production.

We licensed a third-party high-functionality audio engine for the game (Audiokinetic Wwise). This allowed us to integrate and fully support sound in the project without the constant need to bother programmers. For instance, integrating the weather system into the game’s audio only needed a few lines of code to toggle the needed processes – the rest was done within the audio engine, under the control of the audio team.

The sound designers had the opportunity to work with the game first-hand and study it ‘from within,’ which let them suggest their own ideas for improving the game’s sound and fuse it with various game mechanics. Using Wwise helped, as the sound team could test their work on their own without needing much help from programmers and without long delays between creating audio, implementing it, and tweaking it. It is also hard to outsource this kind of task, as it requires a profound knowledge of the project and quick access to the team.

Well, that’s it. I hope that you managed to learn something new and useful from this article, or found it entertaining at the very least. I am eager to hear your thoughts and feedback as well as any advice!

About the author

Sergey Eybog has been working on video games as a composer and sound designer since 2004. He has taken part in dozens of projects (casual and mid-core PC games, mobile games, MMORPG). In his free time, he records and synthesizes audio for his personal commercial audio libraries and writes music under his Silent Owl alias. He is currently the lead audio designer at Owlcat Games of the Mail.Ru Group.