Open an interactive online version by clicking the badge Binder badge or download the notebook.

Assignment 3: Ambisonics#

Sebastian J. Schlecht (1), Nils Meyer-Kahlen (2)

Notebook: Cristóbal Andrade (1)

(1) Friedrich-Alexander-Universität Erlangen-Nürnberg (2) Aalto University

Contact: sebastian.schlecht@fau.de, cristobal.andrade@fau.de

In this assignment, you will learn to encode and decode Ambisonic signals and binauralize them. Additionaly you will explore how different Ambisonic orders affect spatial audio quality.

Duration: 12 Hours

Requirements: Basics of digital signal processing, spatial hearing and spherical harmonics

Reference
Zotter, F., & Frank, M. (2019). Ambisonics: A Practical 3D Audio Theory for Recording, Studio Production, Sound Reinforcement, and Virtual Reality

Dependencies pip install numpy matplotlib soundfile scipy pyfar sofar nbgrader watermark ipython

[ ]:
import pyfar as pf
import pooch
import os
import soundfile as sfile
import numpy as np
import numpy.matlib as npm
import matplotlib.pyplot as plt
from scipy.io import loadmat

from scipy.special import lpmv, factorial, legendre
from IPython.display import Audio, display

%load_ext autoreload
%autoreload 2

%matplotlib inline

Let’s start by downloading all the necessary files for running this assignment.

[ ]:
# Leave this as it is: This is the URL from which the data will be downloaded

url_py = 'https://github.com/pyfar/open-educational-resources/raw/main/courses/Virtual_Acoustics_Lab_FAU/Assignment3'
url_large_files = 'https://github.com/pyfar/files/raw/main/education/VAL_FAU'
url_hrtfs = 'https://github.com/pyfar/files/raw/main/education/VAR_TUB'

# Get current working directory (where the notebook was started)
notebook_dir = os.getcwd()

# Create a Pooch object using that directory
my_pooch_py = pooch.create(
    path=notebook_dir,
    base_url=url_py,
    registry={
        "plotSHSpheres.py" : None,
        "Binaural_DSP.py" : None,
        "AmbiDecoder_DSP.py" : None,
        "AmbiEncoder_DSP.py" : None,
        "RotatorScene.py" : None,
        "VBAP_DSP.py" : None,
        "cart2sph_vec.py" : None,
        "Config.py" : None,
    }
)

my_pooch_large_files = pooch.create(
    path=notebook_dir,
    base_url=url_large_files,
    registry={
        "guitar.wav" : None,
        "assignment_AmbiEncoder_DSP_Reference.wav" : None,
    }
)

my_pooch_hrtfs = pooch.create(
    path=notebook_dir,
    base_url=url_hrtfs,
    registry={
        "FABIAN_HRIR_measured_HATO_0.sofa" : None
    }
)


# Download all files
for fname in my_pooch_py.registry:
    fpath = my_pooch_py.fetch(fname)
    print(f"Downloaded: {fpath}")

for fname in my_pooch_large_files.registry:
    fpath = my_pooch_large_files.fetch(fname)
    print(f"Downloaded: {fpath}")

for fname in my_pooch_hrtfs.registry:
    fpath = my_pooch_hrtfs.fetch(fname)
    print(f"Downloaded: {fpath}")

Now let’s import all the necessary files.

[ ]:
from plotSHSpheres import plotSHSpheres
from cart2sph_vec import cart2sph_vec
from Config import Config
from VBAP_DSP import VBAP_DSP
from RotatorScene import rotator_scene
from AmbiEncoder_DSP import AmbiEncoder_DSP
from AmbiDecoder_DSP import AmbiDecoder_DSP
from Binaural_DSP import Binaural_DSP

1 TASK: Evaluate Spherical Harmonics#

Spherical Harmonics are the base used for Ambisonics processing. Implement a function evalSH(maxOrder, azi, ele) that evaluates real Spherical Harmonics up to an arbitrary order.

Recall the Spherical Harmonics formula:

\[\begin{split}Y_{n,m}(\theta,\phi) = \sqrt{\frac{(2n+1)}{4\pi}\frac{(n-|m|)!}{(n+|m|)!}}\, P_{n,|m|}(\cos\theta)\, \begin{cases} \sqrt{2}\,\sin(|m|\phi), & m < 0,\\[6pt] 1, & m = 0,\\[6pt] \sqrt{2}\,\cos(|m|\phi), & m > 0 . \end{cases}\end{split}\]

with \(P_{n,|m|}\) the associated Legendre polynomial. This can be obtained usinc scipy’s lpmv.

[ ]:
def evalSH(maxOrder, azi, ele):
    """
    Evaluate real-valued Spherical Harmonics.

    Parameters:
    - maxOrder : int
        Maximal Ambisonics order
    - azi : ndarray (Q x 1)
        Azimuth angles in radians
    - ele : ndarray (Q x 1)
        Elevation angles in radians

    Returns:
    - Ysh : ndarray (Q x (maxOrder+1)^2)
        Matrix of spherical harmonics
    """
    azi = np.array(azi).flatten()
    ele = np.array(ele).flatten()

    if azi.ndim != ele.ndim:
        raise ValueError('azi and ele must be of the same dimension')

    numDirs = len(azi)
    phi = azi
    theta = ele

    Ysh = np.zeros((numDirs, (maxOrder + 1)**2))
    Ysh[:, 0] = 1  # Dummy solution for n = 0

    for n in range(0, maxOrder + 1):
        m_vals = np.arange(-n, n + 1)

        # Azimuthal part
        # YOUR CODE HERE
        raise NotImplementedError()

        # Zenithal part (associated Legendre polynomials)
        # YOUR CODE HERE
        raise NotImplementedError()

        # Condon-Shortley phase and Normalization
        # YOUR CODE HERE
        raise NotImplementedError()

        # Final SH for this order
        # YOUR CODE HERE
        raise NotImplementedError()

    assert Ysh.shape == (numDirs, (maxOrder + 1)**2), "Check dimensions"
    return Ysh

# Parameters
ambi_order = 3
azi = np.linspace(0, 2 * np.pi, 100)
ele = np.linspace(-np.pi / 2, np.pi / 2, 100)

AZI, ELE = np.meshgrid(azi, ele, indexing='ij')

# Evaluate SH matrix
Y = evalSH(ambi_order, AZI.flatten(), ELE.flatten())

# Plot
plotSHSpheres(AZI, ELE, Y)

Define Config#

With a regularly spaced array of loudspeakers

[ ]:
config = Config('regularLsArray')

Define Scene#

The scene consists of a sound source rotating counter-clokwise around the listener, starting in the front (x, y, z) = (1, 0, 0). This scene will be used to test the encoder/decoder you are about to implement.

[ ]:
scene, numBlocks = rotator_scene()

Create Source Signal#

Some noise for testing

[ ]:
signal_length = config.blockSize * numBlocks
signal = np.random.randn(signal_length, 1)
signal = signal / np.max(np.abs(signal))
display(Audio(signal.T, rate=config.fs))
[ ]:
# signal, fs = sfile.read('guitar.wav')
# signal = signal[:signal_length].reshape(-1,1)
# display(Audio(signal.T, rate=config.fs))

2 TASK: Implement AmbiEncoder_DSP.process()#

With your evalSH function at hand, this should be very straight forward. The encoder takes the direction of arrival at each block and encodes the signal as a virtual source from that direction. Hence, the input is a mono signal, and the output is the Ambisonic signal with (ambiOrder + 1) ^2 channels

[ ]:
def process(self, inSig):
    """
    Process input signal through the ambisonics encoder

    Args:
        inSig (np.ndarray): Input signal matrix [S x numSignals]

    Returns:
        np.ndarray: Output SH signal matrix [S x (N+1)^2]
    """
    # Get spherical harmonics coefficients for current direction and encode the signal with them
    # YOUR CODE HERE
    raise NotImplementedError()
    return outSig

AmbiEncoder_DSP.process = process

ambi_encoder = AmbiEncoder_DSP(ambiOrder=config.ambiOrder)

block_size = config.blockSize
num_ambi_ch = (config.ambiOrder + 1) ** 2

ambi_out = np.zeros((numBlocks * block_size, num_ambi_ch))

for it_block in range(numBlocks):
    block_index = slice(it_block * block_size, (it_block + 1) * block_size)
    frame = signal[block_index] # Extract block

    # Convert Cartesian to spherical coordinates
    doa = cart2sph_vec(scene[it_block]['sourcePosition'])  # shape (N, 3)
    azimuth = doa[:, 0]
    elevation = doa[:, 1]

    # Set direction of arrival
    ambi_encoder.set_doa(azimuth, elevation)

    # Apply SH encoder
    ambi_out[block_index, :] = ambi_encoder.process(frame.reshape(-1, 1) )

# Sabe ambi_out as a wav file
sfile.write('ambi_out.wav', ambi_out, config.fs)

This will be the (ambiOrder+1)^2 channel Ambisonics file. You can compare it to the asignment_AmbiEncoder_DSP_Reference.wav by examining it in reaper/audacity/matlab. Listening to it does not make sense before decoding

TASK 3: Describe Ambisonics encoding and the output signal#

  1. Briefly describe how encoding of a source is done

  2. Why are some channels empty?

  3. Why is the envelope of the first channel constant?

[ ]:
# Write your answers here
# YOUR CODE HERE
raise NotImplementedError()
[ ]:
num_ambi_channels = (config.ambiOrder + 1) ** 2

# Offset each channel for better visibility in plot
offset = np.arange(num_ambi_ch).reshape(1, -1)  # shape (1, num_ambi_channels)
signal_to_plot = ambi_out + offset  # broadcasting applies row-wise offset

plt.figure()
plt.plot(signal_to_plot)
plt.ylabel('Ambisonics Channel')
plt.xlabel('Time (samples)')
plt.title('Ambisonics Output')

TASK 4: Implement Loudspeaker Decoder#

If we want to listen to an Ambisonics signal, we need to decode it to loudspeakers or headphones

The decoding matrix \(D\) maps Ambisonics channels to loudspeaker channels. Use the following matrix structure for decoding

\[D = \sqrt{\frac {4 \pi}{L}} Y_{n,m}^T(\theta,\phi) \mathcal{diag}\{a_n\}\]

takign the max-rE panning function for the weighting of spherical harmonics contribution

\[a_n = P_n\!\left[\cos\!\left(\frac{137.9^\circ}{N + 1.51}\right)\right]\]

With the sampling decoder, the mapping of the omnidirectional component (0th order channel) is straightforward and can be used as a sanity check.

[ ]:
def calculateSamplingDecoder(self):
    """
    Calculate the sampling decoder matrix
    Returns:
        np.ndarray: Decoder matrix [numSH x numLS]
    """
    # YOUR CODE HERE
    raise NotImplementedError()
    return decoderMatrix

def getMaxReWeights(self):
    """
    Compute max-re weights
    Returns:
        np.ndarray: Weights array [numSH x 1]
    """
    # YOUR CODE HERE
    raise NotImplementedError()
    return weights


AmbiDecoder_DSP.getMaxReWeights = getMaxReWeights
AmbiDecoder_DSP.calculateSamplingDecoder = calculateSamplingDecoder
# Create an AmbiDecoder Object
config.maxre = True
ambiDecoder = AmbiDecoder_DSP(config)

# Plot decoder matrix
plt.figure(figsize=(8, 6))
plt.imshow(ambiDecoder.decoderMatrix, aspect='auto', cmap='viridis', interpolation='nearest')
plt.colorbar(label='Decoder Matrix Value')
plt.xlabel('Loudspeaker Index')
plt.ylabel('SH Channel Index')
plt.title('Decoder Matrix')
plt.tight_layout()

# Plot maxre weights if enabled
if config.maxre:
    weights = ambiDecoder.getMaxReWeights().flatten()
    plt.figure(figsize=(8, 4))
    plt.stem(weights)
    plt.xlabel('SH Channel Index')
    plt.ylabel('Weight')
    plt.title('MaxRe Weights')
    plt.tight_layout()

Use the decoder to obtain loudspeaker signals

[ ]:
ls_out = np.zeros((numBlocks * block_size, config.numLoudspeakers))
for it_block in range(numBlocks):
    block_index = slice(it_block * block_size, (it_block + 1) * block_size)
    ls_out[block_index, :] = ambiDecoder.process(ambi_out[block_index, :])
ambiDecoder.plotLoudspeakerSignals(ls_out)

sfile.write('assignment_AmbiDecoderLs_DSP.wav', ls_out, config.fs)

TASK 5: Describe the loudspeaker output#

Also try to deactivate max-re weighting and describe the difference seen in the loudspeaker signals here.

[ ]:
# Write your answer here

# YOUR CODE HERE
raise NotImplementedError()

Binauralization#

Since you probably do not have a large loudspeaker array at hand, we will now convolve the loudspeaker signals with HRIRs in order to create a binaural signal. Have a look at the code, it uses our Binaural_DSP object.

[ ]:
binauralizer = Binaural_DSP(config, config.numLoudspeakers)
lsPositions_sph = cart2sph_vec(config.lsPositions)
azi = lsPositions_sph[:, 0].reshape(-1, 1)
ele = lsPositions_sph[:, 1].reshape(-1, 1)
binauralizer.set_doa(azi.flatten(), ele.flatten())
binaural_out = np.zeros((numBlocks * block_size, 2))
for it_block in range(numBlocks):
    block_index = slice(it_block * block_size, (it_block + 1) * block_size)
    frame = signal[block_index]
    doa = cart2sph_vec(scene[it_block]['sourcePosition'])
    azimuth = doa[:, 0]
    elevation = doa[:, 1]
    ambi_encoder.set_doa(azimuth, elevation)
    ambi_frame = ambi_encoder.process(frame)
    ls_frame = ambiDecoder.process(ambi_frame)
    binaural_out[block_index, :] = binauralizer.process(ls_frame)

sfile.write('binauralOutPython.wav', binaural_out, config.fs)
plt.figure(figsize=(12, 4))
plt.plot(binaural_out)
plt.title('Binaural Output')
plt.xlabel('Sample')
plt.ylabel('Amplitude')
plt.legend(['Left', 'Right'])
plt.grid(True)

display(Audio(binaural_out.T, rate=config.fs))

TASK 6: Describe the binaural ouput for different orders#

Try at least 1st, 3rd and 7th order

  1. Pay attention to coloration in comparison to the mono signal, as well as continuity (best heard with the noise). Which differences do you hear?

  2. For which order would you use this loudspeaker layout?

[ ]:
# Write your answers here

# YOUR CODE HERE
raise NotImplementedError()

TASK BONUS: Allrad Decoder#

In the example above, we used a quasi-uniform loudspeaker array and finally used it to create a binaural signal. In practise, one may also encounter non-uniform arrays for which the simple sample decoding we implemented is oftentimes not ideal. Another decoding approach more suitable for this is AllRAD

Implement the AllradDecoder in calculateAllradDecoder you may use your vbap_DSP.The main difference from the sampling decoder is that the decoder matrix coefficients are derived from VBAP gains computed for t-design points

Hint: you may insert one imaginary loudspeaker below the array to create a convex hull and later discard its signal

[ ]:
def calculateAllradDecoder(self):
    """
    Calculate the AllRAD decoder matrix using t-design points and VBAP gains.
    Use the t-design file specified in self.config.tdesign_filename.
    """
    # YOUR CODE HERE
    raise NotImplementedError()

AmbiDecoder_DSP.calculateAllradDecoder = calculateAllradDecoder

bonus_config = Config('fivePointZeroPointFourLsArray')
bonus_config.maxre = False
ambiDecoder5point0point4 = AmbiDecoder_DSP(bonus_config)
ambiDecoder5point0point4.calculateAllradDecoder()

lsOut_bonus = np.zeros((numBlocks * bonus_config.blockSize, bonus_config.numLoudspeakers))
for it_block in range(numBlocks):
    block_index = slice(it_block * bonus_config.blockSize, (it_block + 1) * bonus_config.blockSize)
    lsOut_bonus[block_index, :] = ambiDecoder5point0point4.process(ambi_out[block_index, :])

ambiDecoder5point0point4.plotLoudspeakerSignals(lsOut_bonus)

TASK BONUS:#

Compare the Allrad Decoder for the 5.0.4 layout to the sampling decoder. Which differences do you see?

[ ]:
# Write your answer here

# YOUR CODE HERE
raise NotImplementedError()

License notice#

This notebook is licensed under CC BY 4.0

Watermark#

The following watermark might help others to install specific package versions that might be required to run the notebook.

[1]:
%load_ext watermark
%watermark -v -m -p numpy,scipy,pyfar,sofar,nbgrader,watermark,matplotlib,soundfile,nbgrader
Python implementation: CPython
Python version       : 3.12.9
IPython version      : 8.12.3

numpy     : 1.26.4
scipy     : 1.15.2
pyfar     : 0.7.1
sofar     : 1.2.1
nbgrader  : 0.9.5
watermark : 2.5.0
matplotlib: 3.10.1
soundfile : 0.13.1

Compiler    : Clang 13.0.0 (clang-1300.0.29.30)
OS          : Darwin
Release     : 24.5.0
Machine     : arm64
Processor   : arm
CPU cores   : 12
Architecture: 64bit