Wwise SDK 2024.1.0
A listener is a game object that represents a microphone position in the game. Designating a game object as a listener allows 3D sounds to be assigned to the speakers to mimic a real 3D environment. Similarly, an emitter game object represents a virtual speaker, and when assigned to a listener, the emitter’s positional information is mapped into the listener's coordinate system to render a 3D sound. Game objects in Wwise, whether acting as emitters or listeners (or both), are assigned a transform: a position vector, as well as front and top orientation vectors. Game object transforms must be updated on each frame to ensure that sounds are rendered through the correct speakers.
In order to hear sound, at least one game object must be registered and assigned as a listener. You may use AK::SoundEngine::SetDefaultListeners to assign a listener to all other game objects, or AK::SoundEngine::SetListeners to assign a listener to a specific game object, and override what has been set using AK::SoundEngine::SetDefaultListeners. Here is how we register a game object, and assign it as a default listener:
You may inspect the emitter-listener associations that have been assigned in code by looking at the Emitter-Listener tab of the Advanced Profiler in the Wwise authoring tool. A simple game will elect a single game object as the default listener for all game objects; however, it is possible to use multiple listeners to output to a single output device. See Multiple Listeners In A Single Output Device below. It is also possible to use listeners for 3D positioning of submixes. To do so, it is necessary to assign listeners to game objects that are also listeners, creating a directed graph of game objects connected by emitter-listener associations.
The AK::SoundEngine::SetPosition() function is used, as for all game objects, to set the listener's position. This should be done every time any of the listener's position or orientation vectors change.
The AkTransform class holds the information that defines the listener's location and orientation in the game's 3D space. The listener's location (Position), OrientationFront, and OrientationTop vectors may be accessed and set using the getters and setters of the AkTransform class.
Note: The OrientationFront vector defines the direction that the listener's head is facing. It should be orthogonal to the OrientationTop vector, which defines the incline of the listener's head. For a human listener, one could think of the OrientationFront vector as the listener's nose (going away from the face), while the OrientationTop vector would be orthogonal to it, going up from the nose, between the listener's eyes, past the forehead and beyond.
Refer to X-Y-Z Coordinate System for information regarding how the X, Y, and Z axes are defined in the Wwise sound engine.
The orientation vectors must be defined for the audio to be rendered properly. They cannot be zero vectors: both must be unit vectors, and they must be at right angles to each other.
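The constraints above can be checked, and small numerical drift corrected, before the vectors are handed to the sound engine. The following is a minimal sketch that does not use the Wwise SDK; `Vec3` and `MakeOrthonormal` are hypothetical names introduced only for illustration. It normalizes the front vector, removes from the top vector its component along front (a Gram-Schmidt step), and rejects inputs that cannot produce a valid orientation.

```cpp
#include <cmath>

// Hypothetical helper type for this sketch; not a Wwise SDK type.
struct Vec3 { float x, y, z; };

float Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
float Length(const Vec3& v) { return std::sqrt(Dot(v, v)); }

// Rewrites 'front' and 'top' in place so that both are unit length and
// orthogonal to each other. Returns false (inputs untouched in a usable
// sense) if either vector is near zero or the two are near parallel.
bool MakeOrthonormal(Vec3& front, Vec3& top)
{
    const float kEpsilon = 1e-6f;

    const float frontLen = Length(front);
    if (frontLen < kEpsilon)
        return false; // zero front vector: no facing direction
    const Vec3 f = { front.x / frontLen, front.y / frontLen, front.z / frontLen };

    // Gram-Schmidt step: subtract from 'top' its projection onto 'front'.
    const float d = Dot(top, f);
    const Vec3 t = { top.x - d * f.x, top.y - d * f.y, top.z - d * f.z };
    const float topLen = Length(t);
    if (topLen < kEpsilon)
        return false; // top is zero or parallel to front

    front = f;
    top = { t.x / topLen, t.y / topLen, t.z / topLen };
    return true;
}
```

When the function returns true, the two vectors satisfy all three constraints and can be written into the transform; when it returns false, keeping the previous valid orientation is one reasonable fallback.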
Note: The listener's position is updated at most once per frame. Even if multiple calls to the AK::SoundEngine::SetPosition() function are made, only the last value is considered when AK::SoundEngine::RenderAudio() is called.
Tip: If you are experiencing unexpected sound rendering, for example what was expected on the left speakers is actually heard on the right speakers, check the listener's positional information that is provided to the sound engine through the AK::SoundEngine::SetPosition() function. To rule out any mix-up in the X, Y, and Z axes, try setting a known, constant listener position and check that the rendering is correct in that case. For more information about this, refer to X-Y-Z Coordinate System.
When implementing audio in a game or simulation that uses a third-person perspective (TPP), it’s not always obvious where to place the Listener Game Object; some would suggest the position of the camera, while others would suggest the position of the character controlled by the player. Despite having different positions, both the camera and the character controlled by the player are in some ways “you”, the player. Associating a Distance Probe with the main camera Listener allows both of these positions to contribute to sound computations, each in their own way. To understand this approach, it’s necessary to analyze the various aspects of the simulation.
In almost all scenarios, panning and spatialization, including spread and focus in Wwise, should be based on the camera’s position and orientation. Any disconnect between the camera and the relative orientation of the sounds in the simulation, with respect to the final speaker array (whether physical or virtual binaural), results in a loss of immersion. For example, if the camera is looking directly at a sound, then that sound should come from the center speaker channel. A sound to the left of the camera should come from the left speaker(s), and so on.
To achieve this goal, the Listener Game Object must be placed on the active camera, and the orientation of the Listener must be updated to match.
A 3D sound is typically attenuated according to the distance between the Emitter and the Listener Game Objects. This distance is applied to the sound’s attenuation curve to obtain volume, high-pass filter, and low-pass filter values. The result is that closer sounds, which are more important to the experience, are louder.
In a TPP game, however, the focus of attention is not the camera itself, but is instead the character that the player controls. For this reason, a greater sense of immersion is experienced when sounds attenuate according to the distance between the Emitter and the player character, instead of between the Emitter and the camera.
To understand why this is so, it’s helpful to consider a scenario where we get undesirable volume fluctuations when distance attenuation is based on the position of the camera. Picture a TPP game where a camera is following the player character down a hallway lit with torches. Each torch emits a low-intensity sound with a sharp falloff. To turn the camera around and face the other direction, it’s necessary to orbit the camera around the character. In doing so, the camera passes close to one or more of the torches, which get louder and then quieter again. The player character hasn’t moved, the relative importance of the torches in the scene hasn’t changed, and yet the volume fluctuations suggest otherwise.
To achieve the goal of having sounds attenuate based on the distance to the player character, a Game Object must be placed at the position of the player character and designated as the Distance Probe for the main Listener.
A Distance Probe is a Game Object that is an optional, designated counterpart to a Listener Game Object. When a Distance Probe is assigned to a Listener, the attenuation distance applied to all sounds routed to the Listener is based on the distance between the Distance Probe and the Emitter Game Object.
Panning, spatialization, spread, and focus are always based on the position and orientation of the Listener Game Object, regardless of whether or not a Distance Probe is assigned.
Additionally:
A Distance Probe is assigned to a Listener Game Object using the AK::SoundEngine::SetDistanceProbe API.
All assigned Distance Probes are visible in the Listeners tab of the Advanced Profiler. The following image shows that a game object named “Distance Probe” has been assigned to “Listener L” using the AK::SoundEngine::SetDistanceProbe API.
The Distance Probe shows up as an icon in the Game Object 3D Viewer. Note that for convenience, the visibility of the Distance Probe in the Game Object 3D Viewer is bound to the visibility of the Listener; all filters that apply to the Listener also apply to the Distance Probe.
You are not required to place the Distance Probe at the exact location of the player character in TPP experiences. Feel free to experiment with positioning to achieve the desired results. Some suggestions include:
Experiment with positioning the Distance Probe at various ratios between the camera and the character. This ratio could be exposed to designers as an adjustable value to interpolate between the character position and the camera position.
During cutscenes and cinematic moments, it may be necessary to move, switch off, or transfer the Distance Probe to a different Game Object. The Distance Probe need not be static.
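The ratio-based placement suggested above amounts to a linear interpolation between two positions. The following is a minimal sketch that does not use the Wwise SDK; `Vec3`, `InterpolateProbePosition`, and `Distance` are hypothetical names, and the convention that a ratio of 0 means "at the character" and 1 means "at the camera" is an assumption made for this example.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper type for this sketch; not a Wwise SDK type.
struct Vec3 { float x, y, z; };

// Returns a point between the character (ratio = 0) and the camera (ratio = 1).
// The ratio is clamped so a designer-facing slider cannot push the probe
// beyond either end point.
Vec3 InterpolateProbePosition(const Vec3& character, const Vec3& camera, float ratio)
{
    const float t = std::min(1.0f, std::max(0.0f, ratio));
    return { character.x + t * (camera.x - character.x),
             character.y + t * (camera.y - character.y),
             character.z + t * (camera.z - character.z) };
}

// Straight-line distance between two points, such as the probe and an emitter.
float Distance(const Vec3& a, const Vec3& b)
{
    const float dx = a.x - b.x;
    const float dy = a.y - b.y;
    const float dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}
```

In a game loop, the interpolated point would be recomputed each frame and supplied as the position of the Distance Probe game object, in the same way as for any other game object. `Distance` is included only to make it easy to inspect how far the probe ends up from a given emitter while tuning the ratio.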
For detail on how the various Spatial Audio features operate when a Distance Probe is assigned to the Spatial Audio Listener, refer to Third-Person Perspective and Spatial Audio.
In a single-player game where you always see only one point of view in the game, one listener is enough. However, if multiple players can play on the same system, or if multiple views are displayed at the same time, each view requires its own listener so audio is appropriately rendered for all of these views.
The main difficulty involved with implementing multiple listeners comes from the fact that the positioning of the sound sources doesn't always make sense in relation to what players are seeing. This is mostly caused by a game using only a single set of speakers to reproduce a 3D environment for several players.
A simple representation of this problem is shown in the following figure. It is very hard to tell in which speakers the source should be played, because Listener 0 expects to hear the source in the left speaker while Listener 1 expects to hear it in the right one.
Wwise can have any number of listeners, and by default all listeners will mix in the main output device, unless this is changed using AK::SoundEngine::SetListeners, or AK::SoundEngine::AddOutput.
The following sections cover the cases where all listeners merge into the same output device, and describe how the Wwise sound engine lets the programmer manipulate these listeners to achieve the expected behavior.
Note: Everything related to multiple listeners is only available through game programmer implementation via the SDK. There are no special options in the Wwise authoring application to manage the in-game positioning of sources for multiple listeners.
Each listener spawns a mixing graph. For each source, distance and cone attenuation are computed individually relative to each listener on which they are active.
When multiple listeners capture a source, the source is mixed successively in each bus instance corresponding to its respective listener. As it is mixed, the attenuation volume is applied independently for each listener.
As opposed to attenuation volume, attenuation LPF and HPF are applied directly on sources; therefore, Wwise has to choose a single value based on all emitter-listener associations for a given source. Here is how the sound engine computes the final low-pass filter to apply to each source:
In the example detailed in the following table, the value for listener 0 is max( 10, 40 ) = 40, and the value for listener 1 is max( 50, 10 ) = 50. The lowest of the two is 40, which is then added to the object's value of 5 to produce the final value, 45:
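Listener 0, first value | Listener 0, second value | Listener 1, first value | Listener 1, second value | Object value | Final value for the source |
10 | 40 | 50 | 10 | 5 | 45 |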
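The arithmetic of this example can be written out as a short function. The following is a minimal sketch that does not use the Wwise SDK and only reproduces the calculation described above: the higher of the two values for each listener, then the lowest result across listeners, then the object's own value added on top. `ListenerLpf` and `ComputeFinalLpf` are hypothetical names, and any clamping the sound engine may apply to the result is not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// The two attenuation LPF values computed for one emitter-listener
// association. The field names are placeholders for this sketch.
struct ListenerLpf { float first; float second; };

// Combines per-listener LPF values with the object's own LPF value,
// following the worked example: max per listener, min across listeners,
// then add the object's value.
float ComputeFinalLpf(const std::vector<ListenerLpf>& listeners, float objectLpf)
{
    assert(!listeners.empty()); // at least one listener must capture the source

    float lowest = std::max(listeners[0].first, listeners[0].second);
    for (const ListenerLpf& l : listeners)
        lowest = std::min(lowest, std::max(l.first, l.second));

    return lowest + objectLpf;
}
```

Called with the values from the table, `ComputeFinalLpf({ {10.0f, 40.0f}, {50.0f, 10.0f} }, 5.0f)` returns 45.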
3D Spatialization pans sounds across the various speakers based on the positions of those sounds relative to the listeners.
However, if the game is played by two players on a split screen, you might want to hear listener 1 (the first player) in the left speakers and listener 2 (the second player) in the right speakers, completely bypassing regular positioning of sounds across speakers based on their positions relative to each listener.
To give more control and flexibility, Wwise allows the game programmer to disable spatialization for a given listener and, optionally, set custom volume offsets for each channel, thus specifying how the sounds captured by this listener will be heard in each speaker.
These settings can be modified for each listener by calling AK::SoundEngine::SetListenerSpatialization():
The first parameter is the listener ID. The second parameter must be set to True to enable spatialization for this listener and False to disable it. Finally, the last two parameters represent a vector that contains the attenuation, in dB, for each channel on that listener. If in_bSpatialized is False, the vector sets the volume for each channel, which is 0 dB by default. If in_bSpatialized is True, the vector offsets the volume computed by the default 3D spatialization by a given amount for each channel.
The volume vector is tied to the channel configuration in_channelConfig. If in_channelConfig describes a 5.1 configuration, then the volume vector should have 6 values. Use functions defined in the AK::SpeakerVolumes::Vector namespace to manipulate it. The channel ordering corresponds to the channel mask bits defined in AkSpeakerConfig.h, except for the LFE which is always at the end.
For the example where two players use a split screen, the programmer could use the following code:
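As an SDK-independent illustration of the values involved, the sketch below builds one per-channel vector for each player in a 7.1 configuration and shows how such a vector would combine with the spatialization flag as described above. It makes no sound engine calls. The channel index names, the channel order (channel-mask bit order with the LFE moved to the end), the -96 dB "muted" value, and the choice to mute the center and LFE channels are all assumptions made for this example, and `MakeOneSidedVolumes` and `ResolveChannelVolumes` are hypothetical helpers.

```cpp
#include <array>
#include <cstddef>

// Assumed 7.1 channel order for this sketch, with the LFE last.
// These enumerators are illustrative and are not SDK symbols.
enum Channel71 {
    kFrontLeft, kFrontRight, kCenter, kRearLeft, kRearRight,
    kSideLeft, kSideRight, kLfe
};
constexpr std::size_t kNumChannels71 = 8;

using Volumes71 = std::array<float, kNumChannels71>;

// Assumption: an attenuation this large is treated as silence.
const float kMutedDb = -96.0f;

// Builds a per-channel vector, in dB, that keeps only the left or only the
// right speakers at 0 dB and mutes everything else (center and LFE included).
Volumes71 MakeOneSidedVolumes(bool leftSide)
{
    Volumes71 v;
    v.fill(kMutedDb);
    if (leftSide) {
        v[kFrontLeft] = v[kRearLeft] = v[kSideLeft] = 0.0f;
    } else {
        v[kFrontRight] = v[kRearRight] = v[kSideRight] = 0.0f;
    }
    return v;
}

// Mirrors the described meaning of the spatialization flag: when it is
// false the vector is the channel volume; when it is true the vector is
// an offset added to the volume computed by 3D spatialization.
Volumes71 ResolveChannelVolumes(bool spatialized,
                                const Volumes71& computedDb,
                                const Volumes71& vectorDb)
{
    Volumes71 out = vectorDb;
    if (spatialized) {
        for (std::size_t i = 0; i < out.size(); ++i)
            out[i] += computedDb[i];
    }
    return out;
}
```

With spatialization disabled, the first player's listener would use the left-sided vector and the second player's listener the right-sided one, so each player's sounds are confined to one half of the speaker array.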
If the bus in which sounds are routed has a channel configuration other than 7.1, as per its user-defined channel configuration, the vector will be downmixed internally, using standard downmix recipes, before being applied to sounds.
To go back to regular spatialization, you would call:
The following figure shows, in order, the different operations performed on every source for each listener to compute the final volume in each speaker: