Blog

Deterministic Audio Fixtures for End-to-End Testing

January 16, 2026

Designing Robust Spectral Validation for Audio Pipelines

Testing audio systems is deceptively hard.

Unlike text or structured data, audio pipelines are often lossy, time-sensitive, and highly stateful. Codecs introduce quantization noise, transports introduce jitter, buffers may reorder or drop frames, and decoders may subtly alter timing or amplitude. Traditional byte-level comparisons or waveform diffs are therefore brittle and misleading.

In this article, I present audio-test-fixtures, a deterministic, spectral-based approach to testing audio pipelines end-to-end. The result is a small but robust toolkit that generates known audio fixtures and validates decoded output using FFT-based frequency analysis, designed to work reliably even with lossy codecs and imperfect transports.

The Core Problem

Let’s define the problem precisely:

How can we mechanically and reliably verify that an audio signal survives encoding, transmission, and decoding without unacceptable distortion?

Key constraints:

  • Bitwise equality is impossible with lossy codecs
  • Waveform comparison is extremely sensitive to phase, gain, and timing
  • Perceptual metrics (PESQ, POLQA) are heavyweight and opaque
  • Manual listening does not scale and is not CI-friendly

What we need instead is:

  • Deterministic input
  • Known ground truth
  • A validation method tolerant to amplitude and phase drift
  • Machine-verifiable results
  • Clear pass/fail semantics

Design Overview

The solution is split into two clearly separated components:

  1. Audio Fixture Generator: generates a deterministic WAV file containing a known sequence of pure tones.

  2. Audio Transmission Validator: compares a reference WAV with a decoded WAV using spectral analysis.

This separation of responsibilities is critical:

  • Fixtures are generated once
  • Validation can be run repeatedly in CI, on-device, or in regression tests

Why Pure Tones?

The fundamental frequency of the human voice, from a low bass to a high soprano, spans roughly 80 Hz to 1.1 kHz. Instead of attempting to simulate speech, we use pure sinusoidal tones because:

  • Their frequency is mathematically unambiguous
  • FFT peak detection is reliable
  • Harmonics and distortion are easy to observe
  • They are codec-agnostic

Each tone becomes a spectral marker that we can later detect.

Audio Fixture Design

Format

The generated file has strict, predictable properties:

  • PCM WAV
  • 16-bit
  • Mono
  • 16 kHz
  • Exactly 10 seconds
  • 160,000 samples

This makes it compatible with:

  • Embedded systems
  • Mobile platforms
  • Voice codecs
  • Low-latency transports

Frequency Content

The file contains 27 ascending notes, from E2 (≈82.4 Hz) to C6 (≈1046.5 Hz), covering the full vocal range.

Each note consists of:

  • ~350 ms pure sine wave
  • 20 ms silence between notes
  • Short fade-in/out to avoid clicks
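The note list itself is not shown here. Counting the natural (white-key) notes from E2 to C6 gives exactly 27, so one plausible reading is an ascending C major scale in equal temperament with A4 = 440 Hz. The sketch below makes that assumption explicit; `midi_to_hz`, `note_frequencies` and the tuning reference are illustrative names and values, not taken from the project.

```python
# Assumption: the 27 notes are the natural (white-key) notes from E2 to C6,
# tuned in 12-tone equal temperament with A4 = 440 Hz.
NATURAL_PITCH_CLASSES = {0, 2, 4, 5, 7, 9, 11}  # C D E F G A B as MIDI pitch classes


def midi_to_hz(midi_note: int, a4_hz: float = 440.0) -> float:
    """Equal-temperament frequency of a MIDI note number (A4 is note 69)."""
    return a4_hz * 2.0 ** ((midi_note - 69) / 12.0)


def note_frequencies(low_midi: int = 40, high_midi: int = 84) -> list:
    """Frequencies of the natural notes between two MIDI notes, inclusive.

    MIDI note 40 is E2 and MIDI note 84 is C6.
    """
    return [
        midi_to_hz(n)
        for n in range(low_midi, high_midi + 1)
        if n % 12 in NATURAL_PITCH_CLASSES
    ]


frequencies = note_frequencies()
print(len(frequencies), round(frequencies[0], 2), round(frequencies[-1], 2))
```

If the real fixture uses a different scale or tuning, only the pitch-class set and the reference frequency need to change.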

Generator Implementation

Below is a simplified excerpt of the tone generation logic:

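import numpy as np

def generate_tone(frequency, duration, sample_rate, amplitude=0.3):
    # Time axis in seconds; endpoint=False keeps the sample spacing at exactly 1 / sample_rate
    t = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
    return amplitude * np.sin(2 * np.pi * frequency * t)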

Each tone is placed at a deterministic position in the final buffer, allowing us to later compute exact analysis windows.

The resulting WAV file is fully deterministic: generating it twice produces the same signal (modulo floating-point rounding).
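To make the "deterministic position" idea concrete, here is a minimal sketch of how the full buffer could be assembled and written as 16-bit mono PCM. The slot layout (350 ms tone plus 20 ms gap, trailing silence up to 160,000 samples), the 10 ms linear fade, and the names `tone_start`, `build_fixture` and `write_wav` are assumptions for illustration; the actual generator may differ.

```python
import wave

import numpy as np

SAMPLE_RATE = 16_000
TOTAL_SAMPLES = 160_000  # exactly 10 s at 16 kHz
TONE_SAMPLES = 5_600     # 350 ms
GAP_SAMPLES = 320        # 20 ms of silence
FADE_SAMPLES = 160       # 10 ms linear fade (assumed value)


def tone_start(index):
    """Sample offset at which tone number `index` begins."""
    return index * (TONE_SAMPLES + GAP_SAMPLES)


def build_fixture(frequencies, amplitude=0.3):
    """Place one faded sine tone per frequency into a fixed-length buffer."""
    if tone_start(len(frequencies) - 1) + TONE_SAMPLES > TOTAL_SAMPLES:
        raise ValueError("too many tones for the fixture length")
    buffer = np.zeros(TOTAL_SAMPLES, dtype=np.float64)
    t = np.arange(TONE_SAMPLES) / SAMPLE_RATE
    # Linear fade-in and fade-out envelope to avoid clicks at tone edges.
    envelope = np.ones(TONE_SAMPLES)
    ramp = np.linspace(0.0, 1.0, FADE_SAMPLES, endpoint=False)
    envelope[:FADE_SAMPLES] = ramp
    envelope[-FADE_SAMPLES:] = ramp[::-1]
    for i, freq in enumerate(frequencies):
        start = tone_start(i)
        tone = amplitude * envelope * np.sin(2 * np.pi * freq * t)
        buffer[start:start + TONE_SAMPLES] = tone
    return buffer


def write_wav(path, samples):
    """Write float samples in [-1, 1] as 16-bit little-endian mono PCM."""
    pcm = np.round(samples * 32767).astype("<i2")
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm.tobytes())
```

Because every tone starts at `tone_start(i)`, a validator that knows the same constants can derive each analysis window without searching the signal for onsets.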

Why Determinism Matters

Determinism enables:

  • Stable CI tests
  • Meaningful regression comparisons
  • Long-term maintainability
  • Debuggable failures

If your input changes every run, your test results become meaningless.

Validation Strategy

What We Validate

The validator checks multiple orthogonal dimensions:

  1. WAV Metadata

    • Sample rate
    • Bit depth
    • Channel count
    • Duration (with tolerance)
  2. Spectral Integrity

    • Dominant frequency per segment
    • Frequency deviation (Hz and %)
    • Accuracy ratio (% within tolerance)
  3. Signal Quality

    • Signal-to-Noise Ratio (SNR)

Each metric answers a different question:

  • Is the format correct?
  • Are frequencies preserved?
  • Is noise within acceptable bounds?
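The first of these questions can be answered with the standard library alone. The following is a rough sketch, not the validator's actual code: `check_wav_metadata`, its parameter names and the 0.1 s duration tolerance are assumptions chosen to match the fixture format described earlier.

```python
import wave


def check_wav_metadata(path, sample_rate=16_000, sample_width_bytes=2,
                       channels=1, duration_s=10.0, duration_tolerance_s=0.1):
    """Return a list of metadata problems; an empty list means the file passes."""
    problems = []
    with wave.open(path, "rb") as wav:
        actual_rate = wav.getframerate()
        if actual_rate != sample_rate:
            problems.append(f"sample rate {actual_rate} Hz, expected {sample_rate} Hz")
        if wav.getsampwidth() != sample_width_bytes:
            problems.append(
                f"sample width {wav.getsampwidth()} bytes, expected {sample_width_bytes}"
            )
        if wav.getnchannels() != channels:
            problems.append(f"{wav.getnchannels()} channels, expected {channels}")
        # Duration is derived from the frame count, so it tolerates small
        # differences such as codec padding within the given tolerance.
        actual_duration = wav.getnframes() / actual_rate
        if abs(actual_duration - duration_s) > duration_tolerance_s:
            problems.append(
                f"duration {actual_duration:.3f} s, expected {duration_s:.3f} s"
            )
    return problems
```

Returning a list of problems instead of a single boolean keeps failures debuggable: a CI log can show every mismatch at once.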

FFT-Based Frequency Detection

Instead of comparing waveforms, we extract the dominant frequency of each segment using FFT:

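# `segment` holds the samples of one tone; `sample_rate` is in Hz
windowed_segment = segment * np.hanning(len(segment))  # Hann window reduces spectral leakage
fft_result = np.fft.rfft(windowed_segment)
fft_freqs = np.fft.rfftfreq(len(segment), 1.0 / sample_rate)
dominant_freq = fft_freqs[np.argmax(np.abs(fft_result))]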

Important implementation details:

  • Hann windowing to reduce spectral leakage
  • Frequency band filtering (50 Hz – 1200 Hz)
  • Analysis window centered on tone (avoids silence)

This approach is:

  • Phase-invariant
  • Gain-invariant
  • Robust to small timing drift
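Putting the three implementation details together, a self-contained version might look like the sketch below. The function name `dominant_frequency` and its `fmin`/`fmax` parameters are hypothetical; only the Hann window, the 50 Hz to 1200 Hz band and the argmax peak pick come from the description above.

```python
import numpy as np


def dominant_frequency(segment, sample_rate, fmin=50.0, fmax=1200.0):
    """Return the strongest frequency (Hz) of `segment` inside [fmin, fmax]."""
    segment = np.asarray(segment, dtype=np.float64)
    windowed = segment * np.hanning(len(segment))  # reduce spectral leakage
    magnitudes = np.abs(np.fft.rfft(windowed))     # magnitude discards phase
    freqs = np.fft.rfftfreq(len(segment), 1.0 / sample_rate)
    in_band = (freqs >= fmin) & (freqs <= fmax)
    if not in_band.any():
        raise ValueError("segment too short for the requested band")
    band_freqs = freqs[in_band]
    return float(band_freqs[np.argmax(magnitudes[in_band])])


# Same 440 Hz tone with different gain and phase: the detected peak is identical.
sample_rate = 16_000
t = np.arange(4_800) / sample_rate  # 300 ms analysis window
loud = 0.3 * np.sin(2 * np.pi * 440.0 * t)
quiet_shifted = 0.05 * np.sin(2 * np.pi * 440.0 * t + 1.0)
print(dominant_frequency(loud, sample_rate),
      dominant_frequency(quiet_shifted, sample_rate))
```

Scaling the input scales every bin by the same factor, and a phase offset only rotates the complex bins, so neither changes which bin has the largest magnitude.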

Frequency Tolerance

Lossy codecs will introduce frequency smearing. Therefore, validation uses a configurable tolerance:

--tolerance 5.0   # Hz

Typical values:

Scenario            Tolerance
Lossless            ±2 Hz
Light compression   ±5 Hz
Heavy compression   ±10 Hz

A note is considered valid if:

|detected_freq - expected_freq| ≤ tolerance
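One practical consequence is worth checking before picking a tolerance: a plain argmax over FFT bins can only return multiples of sample_rate / N, so the detected frequency carries a quantization error of up to half a bin. The sketch below assumes the analysis window covers the full 350 ms tone (the real window length is not stated here), and the helper names are illustrative.

```python
def bin_spacing_hz(sample_rate, window_samples):
    """Distance between adjacent FFT bins for a window of the given length."""
    return sample_rate / window_samples


def is_note_valid(detected_hz, expected_hz, tolerance_hz):
    """Pass/fail rule from the text: |detected - expected| <= tolerance."""
    return abs(detected_hz - expected_hz) <= tolerance_hz


spacing = bin_spacing_hz(16_000, 5_600)  # 350 ms window at 16 kHz
worst_case_error = spacing / 2           # argmax picks the nearest bin
print(f"bin spacing {spacing:.2f} Hz, worst-case error {worst_case_error:.2f} Hz")
```

Under that assumption the spacing is about 2.86 Hz, so a ±2 Hz tolerance is still reachable, but a shorter window would widen the bins and could make tight tolerances fail for reasons unrelated to the codec.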

Aggregated Metrics

After analyzing all segments, we compute:

  • Frequency accuracy: percentage of notes within tolerance

  • Mean frequency error

  • SNR (dB): based on the power ratio between reference and decoded signals

Example output:

Frequencies correct: 27/27 (100.0%)
Mean frequency error: 0.82 Hz
SNR: 38.7 dB
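These aggregates are simple to compute once the per-note detections exist. The sketch below is one possible reading, not the tool's implementation. In particular, the SNR formula is an assumption: it treats the sample-wise difference between decoded and reference as noise, which only makes sense if the two signals are time-aligned and at the same gain. The function names are hypothetical.

```python
import numpy as np


def frequency_summary(expected_hz, detected_hz, tolerance_hz):
    """Count notes within tolerance and compute the mean absolute error."""
    expected = np.asarray(expected_hz, dtype=np.float64)
    detected = np.asarray(detected_hz, dtype=np.float64)
    errors = np.abs(detected - expected)
    correct = int(np.count_nonzero(errors <= tolerance_hz))
    return {
        "correct": correct,
        "total": int(errors.size),
        "accuracy_percent": 100.0 * correct / errors.size,
        "mean_error_hz": float(errors.mean()),
    }


def snr_db(reference, decoded):
    """SNR in dB, assuming noise = decoded - reference (aligned, same gain)."""
    n = min(len(reference), len(decoded))
    ref = np.asarray(reference[:n], dtype=np.float64)
    noise = np.asarray(decoded[:n], dtype=np.float64) - ref
    noise_power = np.mean(noise ** 2)
    if noise_power == 0:
        return float("inf")
    return float(10.0 * np.log10(np.mean(ref ** 2) / noise_power))
```

Note that this SNR definition, unlike the frequency check, is sensitive to delay and gain changes, so it is best read alongside the frequency accuracy rather than on its own.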

CI-Friendly Results

The validator is explicitly designed for automation:

  • Exit code 0: validation passed
  • Exit code 1: validation failed
  • No human interpretation required

Example:

validate-audio reference.wav decoded.wav --tolerance 10.0 \
  && echo "PASS" || echo "FAIL"

This allows seamless integration into:

  • GitHub Actions
  • GitLab CI
  • Jenkins
  • Embedded test harnesses
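The exit-code contract is what makes this work in any of those systems. As a rough illustration of how a command-line entry point could map results to exit codes: the `analyze` callback, the `required_percent` threshold and the default tolerance below are hypothetical, and only the two positional arguments and `--tolerance` appear in the example above.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="validate-audio")
    parser.add_argument("reference", help="path to the reference WAV")
    parser.add_argument("decoded", help="path to the decoded WAV")
    parser.add_argument("--tolerance", type=float, default=5.0,
                        help="frequency tolerance in Hz (default is an assumption)")
    return parser


def main(argv, analyze, required_percent=100.0):
    """Return 0 when enough notes are within tolerance, 1 otherwise.

    `analyze` is a hypothetical callable taking (reference, decoded, tolerance)
    and returning (correct_notes, total_notes).
    """
    args = build_parser().parse_args(argv)
    correct, total = analyze(args.reference, args.decoded, args.tolerance)
    percent = 100.0 * correct / total if total else 0.0
    print(f"Frequencies correct: {correct}/{total} ({percent:.1f}%)")
    return 0 if percent >= required_percent else 1
```

A real script would pass the return value of `main` to `sys.exit`, so that the shell `&&` / `||` chain shown above sees the result.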

Why Not Waveform Comparison?

Waveform diffs fail because:

  • Phase shifts invalidate comparisons
  • Gain normalization breaks equality
  • Minor resampling introduces drift
  • Lossy codecs re-synthesize the waveform, so individual samples no longer match

Spectral comparison answers the right question:

Is the information content preserved within acceptable limits?
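A small experiment illustrates the difference. In this sketch (all values are made up for illustration), the "decoded" signal is the same 440 Hz tone with slightly lower gain and a 90° phase shift: the sample-wise difference is larger than the reference itself, while the spectral peak does not move.

```python
import numpy as np

sample_rate = 16_000
t = np.arange(4_800) / sample_rate
reference = 0.3 * np.sin(2 * np.pi * 440.0 * t)
decoded = 0.25 * np.sin(2 * np.pi * 440.0 * t + np.pi / 2)  # gain and phase changed

# Waveform view: RMS of the difference versus RMS of the reference.
rms_difference = np.sqrt(np.mean((decoded - reference) ** 2))
rms_reference = np.sqrt(np.mean(reference ** 2))


def peak_hz(x):
    """Frequency of the largest Hann-windowed FFT magnitude."""
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    return np.fft.rfftfreq(len(x), 1.0 / sample_rate)[np.argmax(spectrum)]


print(f"waveform RMS difference: {rms_difference:.3f} (reference RMS {rms_reference:.3f})")
print(f"spectral peaks: {peak_hz(reference):.1f} Hz vs {peak_hz(decoded):.1f} Hz")
```

A waveform diff would flag this signal as badly corrupted, even though the tone is intact.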

Why Not Perceptual Metrics?

Perceptual metrics (PESQ, POLQA):

  • Are complex and opaque
  • Often require licenses
  • Are hard to debug
  • Are slow and heavyweight

This approach is:

  • Transparent
  • Deterministic
  • Explainable
  • Fast

Typical Use Cases

This methodology works well for:

  • Audio codec validation
  • Transport integrity tests (UDP, BLE, RTP)
  • Embedded and mobile pipelines
  • Regression testing
  • Hardware-in-the-loop testing
  • DSP algorithm validation

Final Thoughts

This project demonstrates that audio testing does not need to be fuzzy or subjective.

By:

  • Using deterministic fixtures
  • Focusing on spectral correctness
  • Accepting controlled loss
  • Producing machine-verifiable results

we can build robust, maintainable, and scalable audio tests that survive real-world conditions.

If you are testing audio pipelines and still relying on manual listening or fragile waveform diffs, it may be time to rethink your approach.

Note

Code was developed by Claude Sonnet 4.5, an AI language model by Anthropic, from an original idea of mine. The post was written by ChatGPT GPT-5.2, an AI language model by OpenAI. Final formatting and text editing were done by hand. You can download a detailed discussion of the process.

#human-ai-collaboration

Read More

Adding Backpressure to Python’s ProcessPoolExecutor

October 1, 2025

Recently I’ve hit a practical limitation with Python’s ProcessPoolExecutor: when feeding it tens of thousands of tasks from hundreds of producer threads, the executor happily accepted them all. The result? Memory usage ballooned, latency increased, and eventually the whole system became unstable.

Read More

How to build WebRTC for Android in Ubuntu 25.04

September 16, 2025

Google used to provide prebuilt Android images of the libWebRTC library, and in fact, using them is (still) the recommended way according to its own documentation. But starting with the WebRTC M80 release (January 2020), they decided to deprecate the binary mobile libraries. Their reasons were that the builds were intended only for development purposes, and that users were already building the library themselves with their own customizations or using third-party libraries that embedded it (where does that leave developers who just want to build a WebRTC-enabled mobile app?). They provided just one more build, in August 2020 (1.0.32006), to fix some important security holes in case someone (everybody?) was still using the binary mobile libraries.

Read More

Minimal and secure Python distroless Docker images with Poetry

September 7, 2025

For a recent project, I needed to create a Docker image for a Python application that is managed with Poetry. I had already done this a year ago using distroless images, which provide minimal Docker images based on Debian without package managers, shells, or any other tools commonly found in traditional images, and which are optimized for security and size. But after the release of Debian 12 and Poetry 2.0, and so many improvements in the ecosystem during this year, I wanted to take the opportunity to create a more secure and minimal image, and to learn what the best practices for doing so would be.

Read More

Optimizing Git Branch Naming & Syncing with Upstream Repositories

April 30, 2025

When working with multiple remote repositories, especially when syncing changes from upstream (such as in a forked repository), it’s important to have a well-structured system for organizing and tracking branches. This ensures clarity, ease of maintenance, and the ability to manage branches effectively. In this post, we’ll walk through the decision-making process for setting up a clear naming convention and syncing branches between your repository and an upstream one.

Read More

How to install npm packages stored at GitHub Packages Registry as dependencies in a GitHub Actions workflow

May 10, 2023

When working on npm projects with multiple subprojects as dependencies, there’s a problem when you need to do frequent updates. Ideally, those dependencies should have their own tests and versioning, but that’s not always possible (for example, with private packages), and sometimes we need to publish multiple development versions while trying to debug some obscure issue. This is tedious and nasty, which is why so many people like monorepos.

Read More

How to migrate from Jest to node:test

April 24, 2023

Jest is one of the most popular testing frameworks for JavaScript and Node.js. Originally developed by Facebook, it’s a one-stop shop with testing, assertions, code coverage… but this comes with some criticisms, like requiring more than 50 MB of dependencies. Also, somewhat recently it was shown to be maintained mostly by a single person, which was the reason updates and maintenance were so slow, so they decided to transfer it to the OpenJS Foundation. There have also been several long-standing criticisms about it not providing a pure environment, or the fact that Jest parses the code, leading to some complexity when transpilation needs to be configured. That has led several people to look for alternatives, and now that Node.js has a built-in test runner, I decided to see for myself how to migrate to it.

Read More

Profiling `npm install` times

February 5, 2023

When installing Mafalda packages, a problem I’ve suffered several times is install times, especially since I’m using git dependencies. I tried to reduce them by publishing some of the most common packages to npm, removing the need to install and compile development dependencies like TypeScript, but install times were still huge for no apparent reason, so I needed some way to measure the install time of each dependency. This ruled out options like the UNIX time command or tools like slow-deps, and just by chance, I found a reference to gnomon on StackOverflow.

Read More

How to use private repositories as npm git dependencies on Github Actions

December 25, 2022

I’m an advocate of automation, and that includes not only CI/CD pipelines: I also wanted to apply it to documentation publishing. Mafalda is split into a lot of packages (currently more than 30!), so I wanted to have a single place to publish the documentation for all of them. GitHub Pages lets you host a website for your organization or username for free (this blog and personal site already make use of it), and it can also automatically host a website for each repository as a sub-path of your username/organization main website. The problem is that it only works for open source repositories or for paid plans, and most of the Mafalda SFU repositories are private. So since the Mafalda SFU project website is already hosted on GitHub Pages as a public repository, I decided to store and serve all the other repositories’ documentation from it as well… doing it in an automated way :-)

Read More

WebRTC Bugs and Where to Find Them

November 30, 2022

Even for the most basic use cases, WebRTC is a complex technology, with lots of moving parts and many elements and parties working together at the same time. So when a WebRTC connection is not working properly, or cannot be created at all, there is a series of common but not-so-obvious reasons that can make it fail. We are going to analyze some of the most common ones and, when possible, see how we can fix them or find alternative solutions to minimize their impact.

Read More

How to (properly) deploy Node.js applications

October 1, 2022

Recently I’ve been involved in a new TypeScript project all of my own, which I would end up deploying to production on a raw AWS machine, so with no help from the developer-friendly PaaS environments I usually prefer to work with.

Read More

Linting @ Dyte

August 4, 2022

At Dyte we are now 44 people, most of them developers, and each one has their own personal code style. This has sometimes led to huge code conflicts when merging, which create annoyances and delays, so we decided to create a unified linting code style for all Dyte projects (including a Jira ticket too!), only we had been procrastinating on it due to other priorities. So, after the latest merge conflict in a new project created just a few days before, we decided to fix the issue once and for all. Come and follow us to see how at Dyte we take code quality seriously, and how at Dyte we don’t simply apply a linter to our source code.

Read More

Mediasoup prebuilds

February 27, 2022

At https://github.com/versatica/mediasoup/pull/777 (the lucky number :-D) I’ve published a PR that allows creating and using prebuilt images of mediasoup, with no need to compile it at install time on the target platform. This is done by compiling the Worker executables in advance for multiple platforms, and bundling them in the distributed package.

Read More

How to do proper exceptions handling

February 2, 2022

An exception happens when something we expected to happen didn’t. We can log them, but printing logs everywhere adds a lot of noise, so it’s better to throw errors and handle them at upper levels. Also, a thrown error is easy to check and test, but a log message is not, so errors are better for unit testing.

Read More

Presenting Mediasoup Horizontal

January 2, 2022

Although Mafalda SFU is mainly focused on vertical scaling of Mediasoup and the WebRTC stack, the main problem I’ve found companies are facing is how to easily implement Mediasoup horizontal scaling. I’ve been working on a solution for this problem for a while, and since Mafalda is built on top of Mediasoup, it also needs to provide transparent vertical and horizontal scaling, so let’s see how it works.

Read More

Manifest of a perfectionist

December 7, 2021

I’m a bit obsessive about code and architecture quality, and about having them done as if they could be put in a textbook, or at least used by others as a reference for how things can be done right. I’ve always felt a bit frustrated that newcomers pick up and perpetuate bad habits, just because they learned them that way in the first place, thinking that was the way to do things… Later, if things are working, people don’t give a sh*t about whether there’s a better way to do it, either to improve the quality of their work or processes, or to learn and improve themselves; they just move on… So it’s better to do things right from the beginning, since later they are more difficult to fix, or you simply forget to do it. And in the end, just by doing things right on the first approach, you get used to it and do them that way by default :-)

Read More

WebRTC horizontal scaling

September 26, 2021

When approaching the horizontal scaling of WebRTC servers, we have two main approaches: decentralized P2P, and using a central server. Each one has its own drawbacks and advantages, and I had difficulty identifying which approach was best, since I usually have a personal preference for pure P2P architectures, but they are neither the simplest nor always the most efficient ones. So when deciding how to approach Mafalda horizontal scaling, I needed to consider the pros and cons of each use case I would need, and here are my conclusions.

Read More

Abstract classes in Javascript

July 8, 2021

JavaScript doesn’t have the concept of abstract classes, but it’s fairly easy to implement: don’t allow instantiating them :-) Just check whether the constructor of the instance we are creating is the class itself instead of one of its children, and throw an error if it is:

Read More

Types of WebRTC networks

December 30, 2020

When it comes to WebRTC architectures, there is no silver bullet. Depending on each use case, the optimal architecture may vary from one project to another. For this reason, I am going to explain the main network architectures that are usually applied in projects based on WebRTC (mainly for video streaming), and the pros and cons of each one of them.

Read More

Tipos de redes WebRTC

December 30, 2020

Regarding WebRTC architectures, there is no silver bullet. Depending on the use case, the optimal architecture can vary from one project to another. For this reason, I am going to explain the main network architectures that are usually applied in WebRTC-based projects (mainly for video streaming), and the pros and cons of each of them.

Read More

How to paint over a video with HTML

November 1, 2020

I recently got to work on a project where I needed to capture a camera, send back some drawn feedback, exchange commands and chat messages (and voice comments), and record everything. I’ve always been interested in using non-mainstream features of the Web Platform, and after taking a look at the current state of the art, I’ve found a way to implement this particular use case using ONLY open and readily available web standards.

Read More

How to simulate Chrome is running in a TTY

April 25, 2020

I’ve always loved terminals and retro-computing. I find they were a technology that never reached its full potential because of graphical interfaces (it’s strange for me to say this, since my first computer was a Macintosh LC II at a time when everybody else had at most a PC with Windows 3.11…). That’s the main reason I added support for the Unicode BMP (Basic Multilingual Plane) to the Linux kernel for NodeOS, especially to have available the Braille patterns used by blessed-contrib to draw graphical diagrams in the terminal. It’s also why, when I discovered the BOOTSTRA.386 project, a Bootstrap theme that mimics a text-mode interface in a website similar to old BBSs (fathers of web forums, and grandfathers of current online walls), I got enthusiastic about the idea of making it compatible with real terminal web browsers like Links, w3m or Lynx.

Read More

What's `re-start`?

April 15, 2020

React Native is a framework derived from React that allows programming native mobile apps using JavaScript. It’s focused only on Android and iOS, but its popularity has led to other implementations of its API for other platforms like Windows, macOS, or the web. The thing is, although they share the same APIs and their source code is (almost) compatible, they are not integrated, so adding a new platform to an existing codebase is difficult, or forces you to keep several different projects, which can lead to duplicated effort or diverging features.

Read More

Freelancer calculator

March 30, 2020

While I work as an employee, I sometimes get offers for freelance projects. It’s difficult to find a balanced rate between both schemes, so I’ve made a calculator in Google Spreadsheet to adjust them. The calculator is focused on Spanish taxes (one of the reasons the texts are in Spanish), but it should be easy to adapt to other regulations.

Read More

OS lifecycle

March 19, 2020

projectlint is a project-wide linter and style checker I’ve been working on during the last few weeks. One of its rules ensures that the version of the operating system where the code is running is still maintained and updated. But is there an npm package with info about operating system lifecycles? Nope… enter OS lifecycle.

Read More

Confirm deletion in RESTful APIs

March 1, 2020

When designing web services, it’s normal to include an option to delete a user’s account. Since this is an important action (the user and their data will disappear from the platform), it is usually done by asking them to confirm the operation, with several endpoints, one for each step. Navigating between different pages is so 2010-style, and there’s no direct mapping here between REST APIs and CRUD operations, so I’ve been thinking about a REST-compatible alternative: use a token.

Read More

How to have a blog on Github

February 5, 2020

Since I was a child, I never liked to write. I was more of a thinker, a tinkerer and a doer, and found it really tedious to write down ideas that I could already do, explain or show. In fact, I hated the idea of receiving a diary as a present for my first Communion (somewhat typical here in Spain, and luckily it didn’t happen to me), because I found it boring to write about things that had already happened when I could be creating new ones. That’s the same reason I’m not much into blogs (both writing and reading): I pay too much attention to what I say and how I say it, and I’m really slow at getting my final text fully polished (I mostly did my bachelor thesis code in 6 months… and later spent another 14 months just writing the project report). It’s ironic that the times I did write something, people were surprised that I have a somewhat good style… and more ironic that, having written so much (open) source code, in number of lines I could probably make both Dan Brown and J.K. Rowling fall to their knees :-P Unluckily, they have earned more from their work than me, good for them :-)

Read More