WebRTC bitrate is not what you think
Most WebRTC discussions about bitrate are wrong.
Not because people don’t know the APIs, but because they don’t control the experiment.
So instead of asking:
“what bitrate should I use?”
I built something slightly different:
a reproducible WebRTC benchmark to measure how codecs behave under controlled conditions
This is what came out of it.
The key idea: stop tweaking, start measuring
Typical WebRTC demos let you:
- pick a bitrate
- start a call
- watch a graph
But they don’t let you answer:
- how codecs compare under identical conditions
- how bitrate scales with resolution
- what happens at extreme low resolutions
- how packetization differs between codecs
So the goal here was not another demo.
It was:
a minimal, controlled measurement pipeline
You can get the source code at Content-PeerConnection-bandwidth: A WebRTC benchmarking sandbox for codec bitrate analysis.
Architecture: eliminate variables until only the codec remains
Loopback PeerConnection
Instead of testing over a network:
- `pc1` sends
- `pc2` receives
- both live in the same page
👉 No network noise. No congestion variability. Just encoder + RTP behavior.
Synthetic video (this is more important than it looks)
Real cameras are chaotic:
- lighting changes
- motion varies
- compression becomes content-dependent
So the system includes a synthetic video generator:
- deterministic color patterns
- controlled motion
- stable entropy
canvas.captureStream(frameRate);
👉 This is what makes results reproducible, not just observable.
Hard enforcement of codec selection
This is one of the most critical (and often ignored) parts.
The system:
- queries `RTCRtpSender.getCapabilities('video')`
- filters out non-codecs (`rtx`, `red`, etc.)
- applies `transceiver.setCodecPreferences([selectedCodec]);`
- verifies the negotiated codec after SDP exchange
- fails if multiple codecs are active
👉 If you don’t do this, your benchmark is invalid.
Measuring reality, not configuration
Every second, the system samples:
sender.getStats();
And computes:
Bitrate
$$ bitrate = 8 \cdot \frac{\Delta bytesSent}{\Delta t} \cdot 1000 $$
Where:
- `bytesSent` is the total bytes sent by the RTP stream
- `t` is the timestamp of the stats report, in milliseconds (hence the ×1000 factor)
Header bitrate
$$ header\ bitrate = 8 \cdot \frac{\Delta headerBytesSent}{\Delta t} \cdot 1000 $$
Packets per second
$$ pps = \frac{\Delta packetsSent}{\Delta t} \cdot 1000 $$
Then accumulates:
- average bitrate
- peak bitrate
- average pps
- peak pps
And exports:
codec,width,height,framerate,max_bps,avg_bps,max_packets,avg_packets
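The per-sample math above can be sketched as follows. This is an illustrative Python sketch, not the actual tool (which runs in the browser in JavaScript over `getStats()` reports); the field names mirror the `outbound-rtp` stats used in the formulas.

```python
# Illustrative sketch: compute bitrate, header bitrate and pps metrics
# from a series of stats samples (one dict per poll, timestamps in ms).

def compute_metrics(samples):
    bitrates, header_bitrates, pps_values = [], [], []
    for prev, curr in zip(samples, samples[1:]):
        dt = curr["timestamp"] - prev["timestamp"]  # milliseconds
        # bitrate = 8 * delta(bytesSent) / delta(t) * 1000
        bitrates.append(8 * (curr["bytesSent"] - prev["bytesSent"]) / dt * 1000)
        header_bitrates.append(
            8 * (curr["headerBytesSent"] - prev["headerBytesSent"]) / dt * 1000)
        # packets per second, normalized the same way
        pps_values.append((curr["packetsSent"] - prev["packetsSent"]) / dt * 1000)
    return {
        "avg_bps": sum(bitrates) / len(bitrates),
        "max_bps": max(bitrates),
        "avg_header_bps": sum(header_bitrates) / len(header_bitrates),
        "avg_packets": sum(pps_values) / len(pps_values),
        "max_packets": max(pps_values),
    }
```

The averages and peaks returned here map directly onto the `avg_bps`, `max_bps`, `avg_packets` and `max_packets` columns of the exported CSV.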
👉 This is not a demo anymore. This is a dataset.
The pipeline (this is where it becomes interesting)
- Run controlled experiment in browser
- Copy CSV row
- Append to dataset
- Generate chart with Node script
The plotting script:
- groups by codec
- sorts resolutions by pixel count
- generates:
- solid line → average bitrate
- dashed line → peak bitrate
👉 This separation (measurement vs visualization) is what makes the system clean.
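For illustration, the grouping and sorting step might look like this (sketched in Python for readability; the actual plotting script in the repo is a Node script, and the function name here is hypothetical):

```python
from collections import defaultdict

def group_rows(rows):
    """rows: dicts parsed from the CSV (codec, width, height, ...).
    Returns {codec: rows sorted by pixel count}, ready for plotting."""
    by_codec = defaultdict(list)
    for row in rows:
        by_codec[row["codec"]].append(row)
    for codec_rows in by_codec.values():
        # Sort resolutions by total pixel count so the x-axis is monotonic
        codec_rows.sort(key=lambda r: int(r["width"]) * int(r["height"]))
    return dict(by_codec)
```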
The results (this is where things get real)
Here’s the generated chart:

There are three things that immediately stand out from this chart:
- Bitrate does not scale linearly with resolution.
- AV1 is consistently more efficient than other codecs.
- Some codecs (notably VP9) show significantly higher peak bitrate.
Let’s break that down.
What the data actually shows
1. Bitrate scales non-linearly with resolution
From the dataset:
- going from 320×240 → 640×480:
- pixels ×4
- bitrate ~×2–3 (not ×4)
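The arithmetic behind that observation: quadrupling the pixel count yields well under a 4× bitrate increase, so bits per pixel drops. A quick check (the QVGA bitrate below is hypothetical, picked to match the ~×2–3 scaling observed):

```python
qvga_pixels = 320 * 240   # 76,800 pixels
vga_pixels = 640 * 480    # 307,200 pixels (x4)

# Hypothetical average bitrates following the ~x2-3 scaling above (bps)
qvga_bps = 100_000
vga_bps = 215_000

# Bits per pixel at 30 fps: drops as resolution grows,
# i.e. compression efficiency improves
bpp_qvga = qvga_bps / (qvga_pixels * 30)
bpp_vga = vga_bps / (vga_pixels * 30)
```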
👉 Compression efficiency improves with resolution.
2. AV1 is consistently more efficient
At 640×480:
- H264 ≈ 200 kbps avg
- VP8 ≈ 215 kbps
- VP9 ≈ 330 kbps
- AV1 ≈ 136 kbps
👉 AV1 shows noticeably higher efficiency in this setup: ~30–40% less bitrate than H264 or VP8 at VGA, and roughly 60% less than VP9.
But…
3. Peak bitrate tells a different story
Example:
- VP9 peak ≈ 938 kbps
- AV1 peak ≈ 467 kbps
👉 Some codecs are more “bursty” than others.
This matters for:
- network buffers
- real-time latency
- congestion control behavior
4. Low resolutions behave weirdly
At tiny resolutions:
- 2×2
- 4×3
- 8×6
Bitrates collapse to almost zero.
But not uniformly:
- VP8 shows strange spikes at very low resolutions
- others stay smoother
👉 This is codec-dependent overhead + quantization effects.
Unexpected result: how little video you actually need
One of the most surprising findings came from extremely low resolutions.
At:
- 2×2 → you can still detect motion
- 4×3 → you start seeing shapes
- 8×6 → objects become identifiable
- ~15×10 → you can recognize faces
Not because the encoder is good, but because the browser is doing heavy interpolation when rendering the video.
This suggests that “minimum usable video” might be far lower than what we usually assume.
5. Packet rate matters (a lot)
We tracked pps, which most people ignore.
This reveals:
- RTP overhead differences
- fragmentation strategies
- encoder packetization decisions
👉 Two codecs with the same bitrate can have completely different packet rates.
And that difference directly impacts:
- CPU usage
- network overhead
- congestion control behavior
The most important insight
After building this, the biggest takeaway is:
WebRTC bitrate is an emergent property, not a parameter
It depends on:
- codec
- resolution
- fps
- content entropy
- packetization
- browser implementation
And most importantly:
you can’t understand it without measuring it
What this changes in practice
Most WebRTC advice online is cargo cult.
People copy bitrate values, codec preferences, and SDP tweaks without ever measuring the result.
This is how myths become best practices.
If you work with WebRTC:
Stop doing this
- “set bitrate = X”
- “VP9 is better than VP8”
- “we need 500 kbps for VGA”
Start doing this
- build controlled experiments
- collect real stats
- compare under identical conditions
Why this matters (real world)
This directly impacts:
- SFU scaling
- mobile data usage
- satellite / constrained links
- latency vs quality tradeoffs
Final thought
Most WebRTC knowledge is anecdotal.
People copy values, tweak parameters, and assume results. But bitrate is not something you configure. It’s something that emerges from the system.
If you don’t measure it, you don’t understand it. And if you don’t understand it, you’re not controlling it.
You’re just hoping.
Citing
If you use this project or its data in your work, please cite it as:
Leganés-Combarro, Jesús (2026).
"WebRTC bitrate is not what you think"
https://piranna.github.io/2026/04/28/WebRTC-bitrate-is-not-what-you-think/
Mafalda SFU receives “Best Scalable Real-Time Media Platform 2026”
I’m happy to share that Mafalda SFU has been recognised as “Best Scalable Real-Time Media Platform 2026” at the Spanish Business Awards organised by EU Business News.
You can find the official listing at https://www.eubusinessnews.com/winners/mafalda-sfu/, and the announcement published on the project website.
It’s always nice when a side project receives some external recognition, especially one that started mostly as an experiment.
How Mafalda SFU started
Mafalda SFU originally started in early 2021 as a way to learn about scaling Mediasoup infrastructures. At that time I was contacted twice within two weeks to help companies solve scalability issues around Mediasoup deployments. That made me curious enough to start experimenting with architectures and tooling around that problem, in case a third one came along.
At some point, I casually mentioned on Twitter that I was working on this kind of Mediasoup scalability technology… and thanks to that tweet, in less than a month nine different companies contacted me asking for help with the same topic.
That was the moment when Mafalda SFU stopped being just an experiment and started to look like something that could become a proper product.
Real-time media is heating up again
After the pandemic many real-time media projects slowed down. The industry had gone through an enormous acceleration during those years, and things naturally cooled down afterwards.
Interestingly, over the last two years interest in WebRTC and real-time streaming seems to be picking up again, driven during 2025 mostly by new use cases around generative AI audio and video assistants. During that time I've worked as Fractional CTO for some startups in that fast-paced area, although my current projects are more related to Deep Tech and infrastructure, like satellite VoIP communication, Bluetooth-based audio streaming, or optimisation of large-scale video surveillance systems, so it's nice to see the real-time media space heating up again.
A small milestone
Mafalda SFU has always been a relatively small project, but it managed to make a name for itself in the WebRTC ecosystem, and I was recognised as a WebRTC expert thanks to it, which is something I'm really proud of. It has also opened doors to work with some really interesting companies and projects. The first was Dyte, which contacted me right after I published that tweet; I joined them for two years, and it became one of the best companies I've ever worked for (something I'm really grateful for). Later, thanks to that same recognition as a WebRTC expert, I was invited to join Avrioc in Abu Dhabi as Comera WebRTC Architect, which has been one of the greatest professional achievements of my career… and made me want to go back to the UAE as soon as I have another opportunity like that.
Receiving this award is a nice milestone for the project, and another small brick in a longer journey working on real-time communication systems and distributed architectures. Let’s continue and go for the next one.
Deterministic Audio Fixtures for End-to-End Testing
Designing Robust Spectral Validation for Audio Pipelines
Testing audio systems is deceptively hard.
Unlike text or structured data, audio pipelines are often lossy, time-sensitive, and highly stateful. Codecs introduce quantization noise, transports introduce jitter, buffers may reorder or drop frames, and decoders may subtly alter timing or amplitude. Traditional byte-level comparisons or waveform diffs are therefore brittle and misleading.
In this article, I present audio-test-fixtures, a deterministic, spectral-based approach to testing audio pipelines end-to-end. The result is a small but robust toolkit that generates known audio fixtures and validates decoded output using FFT-based frequency analysis, designed to work reliably even with lossy codecs and imperfect transports.
The Core Problem
Let’s define the problem precisely:
How can we mechanically and reliably verify that an audio signal survives encoding, transmission, and decoding without unacceptable distortion?
Key constraints:
- Bitwise equality is impossible with lossy codecs
- Waveform comparison is extremely sensitive to phase, gain, and timing
- Perceptual metrics (PESQ, POLQA) are heavyweight and opaque
- Manual listening does not scale and is not CI-friendly
What we need instead is:
- Deterministic input
- Known ground truth
- A validation method tolerant to amplitude and phase drift
- Machine-verifiable results
- Clear pass/fail semantics
Design Overview
The solution is split into two clearly separated components:
- Audio Fixture Generator: generates a deterministic WAV file containing a known sequence of pure tones.
- Audio Transmission Validator: compares a reference WAV with a decoded WAV using spectral analysis.
This separation of responsibilities is critical:
- Fixtures are generated once
- Validation can be run repeatedly in CI, on-device, or in regression tests
Why Pure Tones?
The fundamental frequency of the human voice spans roughly 80 Hz to 1.1 kHz. Instead of attempting to simulate speech, we use pure sinusoidal tones because:
- Their frequency is mathematically unambiguous
- FFT peak detection is reliable
- Harmonics and distortion are easy to observe
- They are codec-agnostic
Each tone becomes a spectral marker that we can later detect.
Audio Fixture Design
Format
The generated file has strict, predictable properties:
- PCM WAV
- 16-bit
- Mono
- 16 kHz
- Exactly 10 seconds
- 160,000 samples
This makes it compatible with:
- Embedded systems
- Mobile platforms
- Voice codecs
- Low-latency transports
Frequency Content
The file contains 27 ascending notes, from E2 (82 Hz) to C6 (1046 Hz), covering the full vocal range.
Each note consists of:
- ~350 ms pure sine wave
- 20 ms silence between notes
- Short fade-in/out to avoid clicks
Generator Implementation
Below is a simplified excerpt of the tone generation logic:
import numpy as np

def generate_tone(frequency, duration, sample_rate, amplitude=0.3):
    # Sample instants for `duration` seconds at `sample_rate` Hz
    t = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
    # Pure sine at the requested frequency, scaled down to avoid clipping
    return amplitude * np.sin(2 * np.pi * frequency * t)
Each tone is placed at a deterministic position in the final buffer, allowing us to later compute exact analysis windows.
The resulting WAV file is fully deterministic: generating it twice produces the same signal (modulo floating-point rounding).
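A minimal sketch of how the full fixture could be assembled from such tones. This is simplified: the note list, fade length and gap duration here are illustrative defaults, not the exact values from the repo.

```python
import numpy as np

SAMPLE_RATE = 16_000  # Hz, matching the fixture format

def generate_tone(frequency, duration, sample_rate=SAMPLE_RATE, amplitude=0.3):
    t = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
    return amplitude * np.sin(2 * np.pi * frequency * t)

def apply_fades(tone, fade_samples=80):
    # Short linear fade-in/out (~5 ms at 16 kHz) to avoid clicks
    tone = tone.copy()
    ramp = np.linspace(0.0, 1.0, fade_samples)
    tone[:fade_samples] *= ramp
    tone[-fade_samples:] *= ramp[::-1]
    return tone

def build_fixture(frequencies, tone_s=0.35, gap_s=0.02):
    """Concatenate faded tones separated by silence, deterministically."""
    parts = []
    for freq in frequencies:
        parts.append(apply_fades(generate_tone(freq, tone_s)))
        parts.append(np.zeros(int(SAMPLE_RATE * gap_s)))  # inter-note gap
    return np.concatenate(parts)
```

Because every tone lands at a position computable from the note index, the validator can later derive exact analysis windows without any alignment search.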
Why Determinism Matters
Determinism enables:
- Stable CI tests
- Meaningful regression comparisons
- Long-term maintainability
- Debuggable failures
If your input changes every run, your test results become meaningless.
Validation Strategy
What We Validate
The validator checks multiple orthogonal dimensions:
- WAV Metadata
  - Sample rate
  - Bit depth
  - Channel count
  - Duration (with tolerance)
- Spectral Integrity
  - Dominant frequency per segment
  - Frequency deviation (Hz and %)
  - Accuracy ratio (% within tolerance)
- Signal Quality
  - Signal-to-Noise Ratio (SNR)
Each metric answers a different question:
- Is the format correct?
- Are frequencies preserved?
- Is noise within acceptable bounds?
FFT-Based Frequency Detection
Instead of comparing waveforms, we extract the dominant frequency of each segment using FFT:
# Hann window to reduce spectral leakage
windowed_segment = segment * np.hanning(len(segment))
fft_result = np.fft.rfft(windowed_segment)
fft_freqs = np.fft.rfftfreq(len(segment), 1.0 / sample_rate)
dominant_freq = fft_freqs[np.argmax(np.abs(fft_result))]
Important implementation details:
- Hann windowing to reduce spectral leakage
- Frequency band filtering (50 Hz – 1200 Hz)
- Analysis window centered on tone (avoids silence)
This approach is:
- Phase-invariant
- Gain-invariant
- Robust to small timing drift
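Putting those details together, a detection routine might look like this. It is a sketch following the description above (band limits and windowing as stated), not the repo's exact implementation:

```python
import numpy as np

def dominant_frequency(segment, sample_rate, band=(50.0, 1200.0)):
    """Return the dominant frequency (Hz) of `segment` within `band`."""
    # Hann window to reduce spectral leakage
    windowed = segment * np.hanning(len(segment))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(segment), 1.0 / sample_rate)
    # Zero out bins outside the vocal band (DC, rumble, HF noise)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    spectrum = np.where(in_band, spectrum, 0.0)
    return freqs[np.argmax(spectrum)]
```

With ~350 ms segments at 16 kHz the FFT bin width is under 3 Hz, comfortably inside even the tightest ±2 Hz tolerance discussed below.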
Frequency Tolerance
Lossy codecs will introduce frequency smearing. Therefore, validation uses a configurable tolerance:
--tolerance 5.0 # Hz
Typical values:
| Scenario | Tolerance |
|---|---|
| Lossless | ±2 Hz |
| Light compression | ±5 Hz |
| Heavy compression | ±10 Hz |
A note is considered valid if:
|detected_freq - expected_freq| ≤ tolerance
Aggregated Metrics
After analyzing all segments, we compute:
- Frequency accuracy: percentage of notes within tolerance
- Mean frequency error
- SNR (dB): based on the power ratio between reference and decoded signals
Example output:
Frequencies correct: 27/27 (100.0%)
Mean frequency error: 0.82 Hz
SNR: 38.7 dB
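The SNR can be sketched as a simple power ratio, treating the difference between the decoded and reference signals as noise. This assumes both signals are already time-aligned and equal length; alignment itself is out of scope here.

```python
import math

def snr_db(reference, decoded):
    """SNR in dB, treating (decoded - reference) as noise.
    Assumes both signals are time-aligned and of equal length."""
    signal_power = sum(s * s for s in reference)
    noise_power = sum((d - s) ** 2 for s, d in zip(reference, decoded))
    if noise_power == 0:
        return float("inf")  # bit-exact copy
    return 10.0 * math.log10(signal_power / noise_power)
```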
CI-Friendly Results
The validator is explicitly designed for automation:
- Exit code `0`: validation passed
- Exit code `1`: validation failed
- No human interpretation required
Example:
validate-audio reference.wav decoded.wav --tolerance 10.0 \
&& echo "PASS" || echo "FAIL"
This allows seamless integration into:
- GitHub Actions
- GitLab CI
- Jenkins
- Embedded test harnesses
Why Not Waveform Comparison?
Waveform diffs fail because:
- Phase shifts invalidate comparisons
- Gain normalization breaks equality
- Minor resampling introduces drift
- Codecs reorder samples internally
Spectral comparison answers the right question:
Is the information content preserved within acceptable limits?
Why Not Perceptual Metrics?
Perceptual metrics (PESQ, POLQA):
- Are complex and opaque
- Often require licenses
- Are hard to debug
- Are slow and heavyweight
This approach is:
- Transparent
- Deterministic
- Explainable
- Fast
Typical Use Cases
This methodology works well for:
- Audio codec validation
- Transport integrity tests (UDP, BLE, RTP)
- Embedded and mobile pipelines
- Regression testing
- Hardware-in-the-loop testing
- DSP algorithm validation
Final Thoughts
This project demonstrates that audio testing does not need to be fuzzy or subjective.
By:
- Using deterministic fixtures
- Focusing on spectral correctness
- Accepting controlled loss
- Producing machine-verifiable results
we can build robust, maintainable, and scalable audio tests that survive real-world conditions.
If you are testing audio pipelines and still relying on manual listening or fragile waveform diffs, it may be time to rethink your approach.
Note
Code was developed by Claude Sonnet 4.5, an AI language model by Anthropic, from an original idea of mine. The post was written by ChatGPT GPT-5.2, an AI language model by OpenAI. Final formatting and text editing were done by hand. You can download a detailed discussion of the process.
#human-ai-collaboration
Routing Android Device Through a Laptop Using Bluetooth PAN and Tailscale
A Practical Walkthrough of a Surprisingly Hard Problem
Adding Backpressure to Python’s ProcessPoolExecutor
Recently I’ve hit a practical limitation with Python’s ProcessPoolExecutor:
when feeding it tens of thousands of tasks from hundreds of producer threads,
the executor happily accepted them all. The result? Memory usage ballooned,
latency increased, and eventually the whole system became unstable.
Bringing Class-Based Views to Fastify (Inspired by Django)
Why doesn’t Node.js have something like Django’s Class-Based Views (CBVs)?
How to build WebRTC for Android in Ubuntu 25.04
Google used to provide prebuilt Android images of the libWebRTC library, and in fact it's (still) the recommended way to use them in its own documentation.
But starting with the WebRTC M80 release (January 2020), they decided to deprecate the binary mobile libraries. The stated reasons were that the builds were intended for development purposes only, and that users were already building the library themselves with their own customizations, or using third-party libraries that embedded it (where does that leave developers who just want to build a WebRTC-enabled mobile app?). They only provided one more build in August 2020 (1.0.32006) to fix some important security holes, in case someone (everybody?) was still using the binary mobile libraries.
Designing “Almost-Autonomous” Reminders in ChatGPT (No Third-Party Bots)
How we went from a one-off ping to a nightly, varied, almost-autonomous reminder flow inside ChatGPT, and the three agent patterns you can use to build it — complete with runnable code.
Minimal and secure Python distroless Docker images with Poetry
For a recent project, I needed to create a Docker image for a Python application that is managed with Poetry. I had already done it one year ago using distroless images, which provide minimal Docker images based on Debian without package managers, shells or any other tools commonly found in traditional images, optimized for security and size. But after the release of Debian 12 and Poetry 2.0, and so many improvements in the ecosystem during this year, this time I wanted to take the opportunity to create a more secure and minimal image, and to learn the best practices for doing so.
Optimizing Git Branch Naming & Syncing with Upstream Repositories
When working with multiple remote repositories, especially when syncing changes from upstream (such as in a forked repository), it’s important to have a well-structured system for organizing and tracking branches. This ensures clarity, ease of maintenance, and the ability to manage branches effectively. In this post, we’ll walk through the decision-making process for setting up a clear naming convention and syncing branches between your repository and an upstream one.