Blog
Mafalda SFU receives “Best Scalable Real-Time Media Platform 2026”
I’m happy to share that Mafalda SFU has been recognised as “Best Scalable Real-Time Media Platform 2026” at the Spanish Business Awards organised by EU Business News.
You can find the official listing at https://www.eubusinessnews.com/winners/mafalda-sfu/, and the announcement published on the project website.
It’s always nice when a side project receives some external recognition, especially one that started mostly as an experiment.
How Mafalda SFU started
Mafalda SFU originally started in early 2021 as a way to learn about scaling Mediasoup infrastructures. At that time I was contacted twice within two weeks to help companies solve scalability issues around Mediasoup deployments. That made me curious enough to start experimenting with architectures and tooling around that problem, in case a third one came along.
At some point, I casually mentioned on Twitter that I was working on this kind of Mediasoup scalability technology… and thanks to that tweet, in less than a month nine different companies contacted me asking for help on the same topic.
That was the moment when Mafalda SFU stopped being just an experiment and started to look like something that could become a proper product.
Real-time media is heating up again
After the pandemic many real-time media projects slowed down. The industry had gone through an enormous acceleration during those years, and things naturally cooled down afterwards.
Interestingly, over the last two years interest in WebRTC and real-time streaming seems to be picking up again, driven during 2025 mostly by new use cases around generative AI audio and video assistants. During that time I’ve worked as Fractional CTO for some startups in that fast-paced area, although my current projects are more related to Deep Tech and infrastructure, like satellite VoIP communication, Bluetooth-based audio streaming, and optimisation of large-scale video surveillance systems, so it’s nice to see the real-time media space heating up again.
A small milestone
Mafalda SFU has always been a relatively small project, but it managed to make a name for itself in the WebRTC ecosystem, and I was recognised as a WebRTC expert thanks to it, which is something I’m really proud of. Mafalda SFU has also opened doors to work with some really interesting companies and projects. The first was Dyte, the first company to contact me after I published that tweet; after joining them for two years, it became one of the best companies I’ve ever worked for (and I’m really grateful for that). Thanks to that WebRTC expertise, I was also invited to join Avrioc in Abu Dhabi as Comera WebRTC Architect, which has been one of the greatest professional achievements of my career… and made me want to go back to the UAE as soon as I have another opportunity like that.
Receiving this award is a nice milestone for the project, and another small brick in a longer journey working on real-time communication systems and distributed architectures. Let’s continue and go for the next one.
Deterministic Audio Fixtures for End-to-End Testing
Designing Robust Spectral Validation for Audio Pipelines
Testing audio systems is deceptively hard.
Unlike text or structured data, audio pipelines are often lossy, time-sensitive, and highly stateful. Codecs introduce quantization noise, transports introduce jitter, buffers may reorder or drop frames, and decoders may subtly alter timing or amplitude. Traditional byte-level comparisons or waveform diffs are therefore brittle and misleading.
In this article, I present audio-test-fixtures, a deterministic, spectral-based approach to testing audio pipelines end-to-end. The result is a small but robust toolkit that generates known audio fixtures and validates decoded output using FFT-based frequency analysis, designed to work reliably even with lossy codecs and imperfect transports.
The Core Problem
Let’s define the problem precisely:
How can we mechanically and reliably verify that an audio signal survives encoding, transmission, and decoding without unacceptable distortion?
Key constraints:
- Bitwise equality is impossible with lossy codecs
- Waveform comparison is extremely sensitive to phase, gain, and timing
- Perceptual metrics (PESQ, POLQA) are heavyweight and opaque
- Manual listening does not scale and is not CI-friendly
What we need instead is:
- Deterministic input
- Known ground truth
- A validation method tolerant to amplitude and phase drift
- Machine-verifiable results
- Clear pass/fail semantics
Design Overview
The solution is split into two clearly separated components:
- Audio Fixture Generator: generates a deterministic WAV file containing a known sequence of pure tones.
- Audio Transmission Validator: compares a reference WAV with a decoded WAV using spectral analysis.
This separation of responsibilities is critical:
- Fixtures are generated once
- Validation can be run repeatedly in CI, on-device, or in regression tests
Why Pure Tones?
The fundamental frequency of the human voice spans roughly 80 Hz to 1.1 kHz. Instead of attempting to simulate speech, we use pure sinusoidal tones because:
- Their frequency is mathematically unambiguous
- FFT peak detection is reliable
- Harmonics and distortion are easy to observe
- They are codec-agnostic
Each tone becomes a spectral marker that we can later detect.
Audio Fixture Design
Format
The generated file has strict, predictable properties:
- PCM WAV
- 16-bit
- Mono
- 16 kHz
- Exactly 10 seconds
- 160,000 samples
This makes it compatible with:
- Embedded systems
- Mobile platforms
- Voice codecs
- Low-latency transports
Frequency Content
The file contains 27 ascending notes, from E2 (82 Hz) to C6 (1046 Hz), covering the full vocal range.
Each note consists of:
- ~350 ms pure sine wave
- 20 ms silence between notes
- Short fade-in/out to avoid clicks
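The post does not list the exact frequencies, but 27 notes between E2 (82 Hz) and C6 (1046 Hz) inclusive matches the natural notes of the C-major scale, so the expected values can be reconstructed with 12-tone equal temperament (A4 = 440 Hz). This is my reconstruction, not necessarily the project’s actual table:

```python
import math

# Semitone offsets of the natural (white-key) notes within an octave
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def note_to_freq(name: str, octave: int) -> float:
    # MIDI-style semitone index relative to A4 (MIDI note 69)
    midi = 12 * (octave + 1) + NOTE_OFFSETS[name]
    return 440.0 * 2 ** ((midi - 69) / 12)

# All natural notes from octave 2 to 6, then slice E2 .. C6 inclusive
notes = [(n, o) for o in range(2, 7) for n in "CDEFGAB"]
scale = notes[notes.index(("E", 2)) : notes.index(("C", 6)) + 1]
freqs = [note_to_freq(n, o) for n, o in scale]

assert len(scale) == 27          # matches the 27 notes in the fixture
# freqs[0] ≈ 82.41 Hz (E2), freqs[-1] ≈ 1046.50 Hz (C6)
```

These derived frequencies double as the ground truth the validator compares against.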
Generator Implementation
Below is a simplified excerpt of the tone generation logic:
import numpy as np

def generate_tone(frequency, duration, sample_rate, amplitude=0.3):
    # Evenly spaced sample times over the duration, excluding the endpoint
    t = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
    return amplitude * np.sin(2 * np.pi * frequency * t)
Each tone is placed at a deterministic position in the final buffer, allowing us to later compute exact analysis windows.
The resulting WAV file is fully deterministic: generating it twice produces the same signal (modulo floating-point rounding).
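As an illustration, a complete fixture builder along these lines might look like the sketch below (helper names, the 5 ms fade length, and the padding strategy are my assumptions, not necessarily the project’s actual code; the tone helper is repeated so the sketch is self-contained):

```python
import wave

import numpy as np

SAMPLE_RATE = 16_000
TONE_S, GAP_S, FADE_S = 0.350, 0.020, 0.005  # fade length is illustrative

def generate_tone(frequency, duration, sample_rate, amplitude=0.3):
    t = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
    return amplitude * np.sin(2 * np.pi * frequency * t)

def apply_fades(tone, sample_rate, fade_s=FADE_S):
    # Linear fade-in/out to avoid audible clicks at note boundaries
    n = int(sample_rate * fade_s)
    env = np.ones_like(tone)
    env[:n] = np.linspace(0.0, 1.0, n)
    env[-n:] = np.linspace(1.0, 0.0, n)
    return tone * env

def build_fixture(freqs, path="fixture.wav"):
    gap = np.zeros(int(SAMPLE_RATE * GAP_S))
    parts = []
    for f in freqs:
        parts.append(apply_fades(generate_tone(f, TONE_S, SAMPLE_RATE), SAMPLE_RATE))
        parts.append(gap)
    signal = np.concatenate(parts)
    # Pad/truncate to exactly 10 s so the sample count is deterministic
    target = SAMPLE_RATE * 10
    signal = np.pad(signal, (0, max(0, target - len(signal))))[:target]
    pcm = (signal * 32767).astype(np.int16)  # 16-bit PCM, mono
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(pcm.tobytes())
```

With 27 notes of ~350 ms plus 20 ms gaps this comes to just under 10 s, so the final padding yields exactly 160,000 samples every run.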
Why Determinism Matters
Determinism enables:
- Stable CI tests
- Meaningful regression comparisons
- Long-term maintainability
- Debuggable failures
If your input changes every run, your test results become meaningless.
Validation Strategy
What We Validate
The validator checks multiple orthogonal dimensions:
- WAV Metadata
  - Sample rate
  - Bit depth
  - Channel count
  - Duration (with tolerance)
- Spectral Integrity
  - Dominant frequency per segment
  - Frequency deviation (Hz and %)
  - Accuracy ratio (% within tolerance)
- Signal Quality
  - Signal-to-Noise Ratio (SNR)
Each metric answers a different question:
- Is the format correct?
- Are frequencies preserved?
- Is noise within acceptable bounds?
FFT-Based Frequency Detection
Instead of comparing waveforms, we extract the dominant frequency of each segment using FFT:
windowed_segment = segment * np.hanning(len(segment))  # Hann window
fft_result = np.fft.rfft(windowed_segment)
fft_freqs = np.fft.rfftfreq(len(windowed_segment), 1.0 / sample_rate)
dominant_freq = fft_freqs[np.argmax(np.abs(fft_result))]
Important implementation details:
- Hann windowing to reduce spectral leakage
- Frequency band filtering (50 Hz – 1200 Hz)
- Analysis window centered on tone (avoids silence)
This approach is:
- Phase-invariant
- Gain-invariant
- Robust to small timing drift
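Putting those implementation details together, a self-contained version of the detector might look like this sketch (the function name and band defaults are illustrative, not the tool’s actual API):

```python
import numpy as np

def dominant_frequency(segment, sample_rate, band=(50.0, 1200.0)):
    # Hann window reduces spectral leakage from the finite analysis window
    windowed = segment * np.hanning(len(segment))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(windowed), 1.0 / sample_rate)
    # Restrict the peak search to the expected vocal band
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return freqs[mask][np.argmax(spectrum[mask])]
```

With ~350 ms segments at 16 kHz the FFT bin width is under 3 Hz, comfortably inside even the tightest tolerance discussed below.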
Frequency Tolerance
Lossy codecs will introduce frequency smearing. Therefore, validation uses a configurable tolerance:
--tolerance 5.0 # Hz
Typical values:
| Scenario | Tolerance |
|---|---|
| Lossless | ±2 Hz |
| Light compression | ±5 Hz |
| Heavy compression | ±10 Hz |
A note is considered valid if:
|detected_freq - expected_freq| ≤ tolerance
Aggregated Metrics
After analyzing all segments, we compute:
- Frequency accuracy: percentage of notes within tolerance
- Mean frequency error
- SNR (dB): based on the power ratio between reference and decoded signals
Example output:
Frequencies correct: 27/27 (100.0%)
Mean frequency error: 0.82 Hz
SNR: 38.7 dB
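As a sketch of how these aggregates might be computed (the function and the residual-based SNR definition are my assumptions; the actual tool may define SNR differently):

```python
import numpy as np

def aggregate(expected, detected, reference, decoded, tolerance=5.0):
    expected = np.asarray(expected, dtype=float)
    detected = np.asarray(detected, dtype=float)
    reference = np.asarray(reference, dtype=float)
    decoded = np.asarray(decoded, dtype=float)

    # Per-note frequency error and pass/fail against the tolerance
    errors = np.abs(detected - expected)
    accuracy = 100.0 * (errors <= tolerance).mean()
    mean_error = float(errors.mean())

    # One common SNR definition: reference power over residual (noise) power
    noise = decoded - reference
    snr_db = float(10 * np.log10(np.sum(reference**2) / np.sum(noise**2)))
    return accuracy, mean_error, snr_db
```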
CI-Friendly Results
The validator is explicitly designed for automation:
- Exit code 0: validation passed
- Exit code 1: validation failed
- No human interpretation required
Example:
validate-audio reference.wav decoded.wav --tolerance 10.0 \
&& echo "PASS" || echo "FAIL"
This allows seamless integration into:
- GitHub Actions
- GitLab CI
- Jenkins
- Embedded test harnesses
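For reference, the exit-code contract boils down to a tiny decision function; the thresholds below are illustrative assumptions, not the tool’s actual defaults:

```python
def main(accuracy: float, snr_db: float,
         min_accuracy: float = 95.0, min_snr_db: float = 20.0) -> int:
    # Thresholds here are illustrative; tune them per codec and transport.
    ok = accuracy >= min_accuracy and snr_db >= min_snr_db
    print("PASS" if ok else "FAIL")
    return 0 if ok else 1  # exit code 0 = validation passed, 1 = failed
```

Wiring this through `sys.exit(main(...))` gives CI systems an unambiguous signal with no log parsing required.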
Why Not Waveform Comparison?
Waveform diffs fail because:
- Phase shifts invalidate comparisons
- Gain normalization breaks equality
- Minor resampling introduces drift
- Codecs reorder samples internally
Spectral comparison answers the right question:
Is the information content preserved within acceptable limits?
Why Not Perceptual Metrics?
Perceptual metrics (PESQ, POLQA):
- Are complex and opaque
- Often require licenses
- Are hard to debug
- Are slow and heavyweight
This approach is:
- Transparent
- Deterministic
- Explainable
- Fast
Typical Use Cases
This methodology works well for:
- Audio codec validation
- Transport integrity tests (UDP, BLE, RTP)
- Embedded and mobile pipelines
- Regression testing
- Hardware-in-the-loop testing
- DSP algorithm validation
Final Thoughts
This project demonstrates that audio testing does not need to be fuzzy or subjective.
By:
- Using deterministic fixtures
- Focusing on spectral correctness
- Accepting controlled loss
- Producing machine-verifiable results
we can build robust, maintainable, and scalable audio tests that survive real-world conditions.
If you are testing audio pipelines and still relying on manual listening or fragile waveform diffs, it may be time to rethink your approach.
Note
Code was developed by Claude Sonnet 4.5, an AI language model by Anthropic, from an original idea of mine. The post was written by ChatGPT GPT-5.2, an AI language model by OpenAI. Final formatting and text editing were done by hand. You can download a detailed discussion of the process.
#human-ai-collaboration
Routing Android Device Through a Laptop Using Bluetooth PAN and Tailscale
A Practical Walkthrough of a Surprisingly Hard Problem
Adding Backpressure to Python’s ProcessPoolExecutor
Recently I’ve hit a practical limitation with Python’s ProcessPoolExecutor:
when feeding it tens of thousands of tasks from hundreds of producer threads,
the executor happily accepted them all. The result? Memory usage ballooned,
latency increased, and eventually the whole system became unstable.
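The teaser above describes the problem; one common fix (not necessarily the one the full post settles on) is to wrap the executor with a semaphore so that `submit()` blocks once a bounded number of tasks are in flight:

```python
from concurrent.futures import Executor, ThreadPoolExecutor
from threading import BoundedSemaphore


class BoundedExecutor:
    # Wraps any Executor (e.g. ProcessPoolExecutor) and blocks submit()
    # once max_pending tasks are in flight, applying backpressure to
    # producer threads instead of letting the work queue grow unbounded.
    def __init__(self, executor: Executor, max_pending: int):
        self._executor = executor
        self._sem = BoundedSemaphore(max_pending)

    def submit(self, fn, *args, **kwargs):
        self._sem.acquire()  # blocks here while the pipeline is saturated
        try:
            future = self._executor.submit(fn, *args, **kwargs)
        except Exception:
            self._sem.release()
            raise
        # Release one slot as soon as the task finishes (or fails)
        future.add_done_callback(lambda _: self._sem.release())
        return future

    def shutdown(self, wait: bool = True):
        self._executor.shutdown(wait=wait)
```

Producers calling `submit()` now pause automatically once `max_pending` tasks are queued, keeping memory usage bounded regardless of how many threads are feeding the pool.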
Bringing Class-Based Views to Fastify (Inspired by Django)
Why doesn’t Node.js have something like Django’s Class-Based Views (CBVs)?
How to build WebRTC for Android in Ubuntu 25.04
Google used to provide prebuilt Android images of the libWebRTC library, and in fact it’s (still) the recommended way to use them according to their own documentation.
But starting with the WebRTC M80 release (January 2020), they decided to deprecate the binary mobile libraries. The stated reasons were that the builds were intended for development purposes only, and that users were already building the library themselves with their own customizations, or using third-party libraries that embedded it (which left behind developers who just want to build a WebRTC-enabled mobile app). They provided only one more build in August 2020 (1.0.32006) to fix some important security holes, in case someone (everybody?) was still using the binary mobile libraries.
Designing “Almost-Autonomous” Reminders in ChatGPT (No Third-Party Bots)
How we went from a one-off ping to a nightly, varied, almost-autonomous reminder flow inside ChatGPT, and the three agent patterns you can use to build it — complete with runnable code.
Minimal and secure Python distroless Docker images with Poetry
For a recent project, I needed to create a Docker image for a Python application managed with Poetry. I had already done it a year ago using distroless images, which provide minimal Docker images based on Debian without package managers, shells, or any other tools commonly found in traditional images, optimized for security and size. But after the release of Debian 12 and Poetry 2.0, and so many improvements in the ecosystem during this year, this time I wanted to take the opportunity to create a more secure and minimal image, and to find out the best practices for doing so.
Optimizing Git Branch Naming & Syncing with Upstream Repositories
When working with multiple remote repositories, especially when syncing changes from upstream (such as in a forked repository), it’s important to have a well-structured system for organizing and tracking branches. This ensures clarity, ease of maintenance, and the ability to manage branches effectively. In this post, we’ll walk through the decision-making process for setting up a clear naming convention and syncing branches between your repository and an upstream one.
How to use a different SSH credential for a specific git repository
If you have multiple SSH keys and want to use a specific one for a particular Git repository, you can do so by configuring it on the SSH config file:
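A minimal sketch of what that configuration can look like (the alias, key path, and repository below are illustrative):

```
# ~/.ssh/config
Host github-work
    HostName github.com
    User git
    IdentityFile ~/.ssh/id_ed25519_work
    IdentitiesOnly yes
```

Then point the repository’s remote at the alias, e.g. `git remote set-url origin git@github-work:your-org/your-repo.git`; SSH resolves `github-work` through the config entry and offers only the specified key.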