Diegetic and Non-Diegetic Sound - Threat Level Midnight

In film we make a distinction between sounds that originate within the story and sounds that come from outside it. The sounds from inside the world of the film - radio music, a character humming, a gunshot - are called diegetic sounds. The musical score, which exists outside the world of the film and plays behind the characters' action, is non-diegetic sound. The choice between a diegetic and a non-diegetic source for audible information can affect how viewers interpret it and what significance they give it. In video game development we come up against the same choice. There’s a little more at stake here, however, given the interactivity of the medium, the practical gameplay function of sound, and the greater impact that different audio sources can have on immersion.

Music in games plays several roles. While it can be (and often is) used purely cinematically, it can also encourage players to act, prompt them to make particular decisions, or signal shifts in gameplay. As with film (and theater before that), music often accompanies danger, signaling to the player that there is a threat nearby while setting the mood and raising the player’s adrenaline. In a more functional sense, the aim of this non-diegetic music seems to be to help players transition into fight-or-flight mode as required by the gameplay. We call this 'threat music.'

The problem that many games run into is that the systems designed to handle the cueing of threat music are often underdeveloped, and in these instances the developers run the risk of doing more harm than good to the players' experiences. The music may begin or end too abruptly, mesh poorly with the ambient music, or improperly represent the scale of the threat.
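To make the "too abrupt" problem concrete, here is a minimal Python sketch of one common remedy: easing the music volume toward a target each frame instead of snapping to it. The function name `step_volume` and the two-second fade are my own assumptions for illustration, not taken from any particular game or audio engine.

```python
def step_volume(current: float, target: float, dt: float,
                fade_seconds: float = 2.0) -> float:
    """Move `current` toward `target`, limited to a full 0-to-1 sweep
    every `fade_seconds` (hypothetical default of two seconds)."""
    max_change = dt / fade_seconds  # largest allowed change this frame
    delta = target - current
    if abs(delta) <= max_change:
        return target  # close enough: land exactly on the target
    return current + max_change if delta > 0 else current - max_change


# Example: combat starts, so the threat track's target volume becomes 1.0.
volume = 0.0
for _ in range(4):  # four half-second updates
    volume = step_volume(volume, 1.0, dt=0.5)
```

Running the same function with a target of 0.0 fades the track out again, so the end of a fight is no more jarring than the start.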

Ark: Survival Evolved (Studio Wildcard) suffers from all of these issues to some extent, but the last is the most noticeable. The game plays threat music when you enter combat with an enemy, followed by a little victory fanfare if you win. The threat music plays regardless of what you’re fighting against or how strong you are. There’s no differentiation between severe threats, such as raptor packs or a T. rex, and minor threats like giant ants and flies. The issue is compounded in Ark because the player’s relative survivability can vary drastically depending on whether they’re running around alone or accompanied by tamed dinosaurs. The music cueing system doesn’t take this into account at all. While it may feel appropriate to have intense battle music playing when a couple of massive meat-eaters are chasing you down, it feels like a bad joke when the music accompanies a turtle biting your ankles while you’re riding a 12-foot-tall brontosaurus.
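One way a cueing system could account for scale is to compare the combined strength of the hostile creatures with the combined strength of the player and their tamed companions. The sketch below is purely hypothetical: the `Combatant` type, the `power` numbers, and the tier thresholds are assumptions made up for illustration, and none of it reflects how Ark actually works.

```python
from dataclasses import dataclass


@dataclass
class Combatant:
    name: str
    power: float  # hypothetical rough measure of health and damage


def music_tier(enemies: list[Combatant], player: Combatant,
               allies: list[Combatant]) -> str:
    """Pick a music tier from the ratio of hostile to friendly power."""
    friendly = player.power + sum(a.power for a in allies)
    hostile = sum(e.power for e in enemies)
    ratio = hostile / friendly
    if ratio < 0.25:   # assumed threshold: not worth a music change
        return "none"
    if ratio < 1.0:    # assumed threshold: a real but manageable fight
        return "tense"
    return "battle"    # the player's side is outmatched


player = Combatant("survivor", 20)
bronto = Combatant("brontosaurus", 400)
turtle = Combatant("turtle", 5)
rexes = [Combatant("T. rex", 300), Combatant("T. rex", 300)]

turtle_vs_mounted = music_tier([turtle], player, [bronto])
rexes_vs_alone = music_tier(rexes, player, [])
```

With these made-up numbers, the turtle nipping at a mounted player triggers no music at all, while two large predators chasing a lone survivor get the full battle track.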

Another issue that games run into is sticking to convention without good reason. Matching your design decisions to your genre is important, and this means questioning the features of other similar games to see if they're appropriate for your own. Threat music isn't always necessary, and in some cases it can take away from the experience. Survival games such as Ark are driven by the player's ability to work hard and be smart. The very name of the genre suggests a certain difficulty level and a focus on self-reliance. Players should be expected to pay attention to their environment and use diegetic information to ensure their survival. But players are crafty, and they'll always exploit things for their own gain.

Threat music gives players an opportunity to take the easy road. Why keep an eye on the world around you when you can just listen for the music to tell you when there’s danger? This is a generalization, but it holds true most of the time. In Ark, as soon as a creature becomes hostile towards the player the music starts, only ceasing if the enemy dies or stops chasing the player. The player only has to run away and wait for the music to stop before they can consider themselves safe. Instead of pushing the player to think for themselves, to check to see if they’ve lost the monster, or to set traps and take precautions in case the creature is still stalking them, the game informs them directly when it’s safe via the music.

A potentially wonderful opportunity for player-environment interaction is lost. A non-diegetic barrier has been raised that dampens the immediacy of the experience, reducing the immersive value of the world. A quick survey of forum posts suggests that a number of players are already switching music off when given the option. In a survival game context, developing experiences without threat music seems to be the ideal way forward.

While threat music may not be appropriate for all games, we can still learn a lot from it. There are ways we can explore and cash in on its merits without damaging immersion. Its unintended use as an early warning system for approaching danger is interesting. It has the effect of letting players know that an enemy is aware of them, throwing players into fight-or-flight and giving them an adrenaline boost. What if we did this in the game world, using diegetic sound? This could be appropriate if the enemy is one that players expect to be less stealthy. The wolves in Skyrim are a great example of this, giving off a big howl before engaging the player. We might not see the wolves at all before they let out this noise, but when they do we immediately know that we only have a second or two to prepare. There's nothing here that doesn't exist within the game world. What if all enemies had some sort of call or battle-cry? Some sort of subtle but distinctive noise that would alert players to the enemy's presence if they were paying attention? We do want to reward players for making an effort.
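As a rough sketch of the battle-cry idea, the Python below produces a positioned sound when an enemy notices the player, with the volume falling off over distance so that an attentive player gets a hint of where the danger is. Everything here (`SoundEvent`, `warning_call`, the linear falloff, the range value) is an assumption for illustration and not the API of any real engine.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class SoundEvent:
    clip: str                     # name of the call to play, e.g. "wolf_howl"
    position: tuple[float, float]  # where in the world the sound comes from
    volume: float                 # 0.0 (silent) to 1.0 (full volume)


def warning_call(clip: str,
                 enemy_pos: tuple[float, float],
                 player_pos: tuple[float, float],
                 max_range: float = 100.0) -> Optional[SoundEvent]:
    """Return the call the player would hear when the enemy notices them,
    or None if the enemy is too far away to be heard."""
    distance = math.dist(enemy_pos, player_pos)
    if distance >= max_range:
        return None
    # Simple linear falloff (an assumption): quieter the farther away.
    volume = 1.0 - distance / max_range
    return SoundEvent(clip=clip, position=enemy_pos, volume=volume)


howl = warning_call("wolf_howl", enemy_pos=(30.0, 40.0), player_pos=(0.0, 0.0))
```

Because the sound is placed at the enemy's position rather than layered over the mix like a score, the player has to listen for direction and loudness to work out what is coming and from where.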

Games aren’t films, and it’s important to remind ourselves that we’re not bound to film conventions. Explore what you can do with sound - diegetically and non-diegetically. Find out how players respond to the noises coming from their speakers, and see if you can improve on that response. We’ve taken a look at survival games this week - are there any other genres you’ve played where music had an important role? What about a lack of music, or a heavier focus on diegetic sound? Let me know in the comments!