How WebRTC Actually Works: A Deep Dive Through Live Experiments

I didn't just read about WebRTC — I broke it, fixed it, and broke it again. Here's everything I learned.

The Problem: The Web Was Never Built for Real-Time

The web was built on a simple idea: you ask, the server answers. That's HTTP — a request-response model that works beautifully for loading pages, fetching data, and submitting forms.

But what happens when you need something more live?

The old workarounds:

Polling — Your browser repeatedly asks the server "anything new?" every few seconds. Simple, but wasteful. You're making hundreds of requests just to get one update.

Long Polling — A smarter version: your browser asks the server, and the server holds the connection open until something new happens, then responds. Better, but still built on a model where the server has all the power.

WebSockets — A genuine improvement. After an HTTP handshake, the connection stays open and both sides can push data freely. Great for chat apps and live feeds. But it still routes everything through a server.

The problem with all of these? Every byte of data travels through your server. For video calls or large file transfers, that means:

  • Your server pays for all that bandwidth

  • Latency is added at every hop

  • You become a bottleneck at scale

What if two browsers could just talk directly to each other?

That's exactly what WebRTC was built for.


Enter WebRTC

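WebRTC (Web Real-Time Communication) is a browser-native technology that enables direct, peer-to-peer communication between clients: no plugins, no installs, and in the common case no server in the data path. A server is still needed for signaling (setting the connection up), and a relay may be needed when a direct path is impossible.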

It exposes three core APIs:

  • RTCPeerConnection — Manages the actual peer-to-peer connection, ICE negotiation, and security

  • RTCDataChannel — Sends arbitrary data (text, files, game state) directly between peers

  • MediaStream — Handles audio and video capture and transmission

But establishing a direct connection between two browsers across the internet is a genuinely hard problem. WebRTC solves it by breaking the challenge into well-defined subproblems:

| Component | What it does |
| --- | --- |
| SDP (Session Description Protocol) | Describes what each peer can do: codecs, formats, network info |
| STUN (Session Traversal Utilities for NAT) | Helps a peer discover its own public IP address |
| ICE (Interactive Connectivity Establishment) | Discovers all possible network paths and picks the best one |
| TURN (Traversal Using Relays around NAT) | Acts as a relay server when direct connection is impossible |
| DTLS (Datagram Transport Layer Security) | Encrypts the connection; WebRTC is secure by default |
| SRTP (Secure Real-time Transport Protocol) | Encrypts audio and video streams specifically |
| RTCP (Real-time Transport Control Protocol) | Monitors connection quality and provides feedback |
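To make the SDP row a little more concrete, here is a minimal sketch that pulls a few fields out of a session description. The sample SDP is a trimmed, made-up illustration (not taken from my logs), and `summarizeSdp` is a hypothetical helper name, not a browser API.

```javascript
// Illustrative, trimmed SDP for a data-channel-only session (all values are made up).
const sampleSdp = [
  "v=0",
  "o=- 4611731400430051336 2 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=application 9 UDP/DTLS/SCTP webrtc-datachannel",
  "c=IN IP4 0.0.0.0",
  "a=ice-ufrag:EsAw",
  "a=ice-pwd:P2uYro0UCOQ4zxjKXaWCBui1",
  "a=fingerprint:sha-256 AB:CD:EF:01:23:45:67:89:AB:CD:EF:01:23:45:67:89:AB:CD:EF:01:23:45:67:89:AB:CD:EF:01:23:45:67:89",
  "a=setup:actpass",
  "a=mid:0",
  "a=sctp-port:5000",
].join("\r\n");

// Hypothetical helper: extract the media sections plus the ICE and DTLS attributes.
function summarizeSdp(sdp) {
  const lines = sdp.split(/\r?\n/).filter(Boolean);
  // Return the value of the first "a=<name>:" attribute, or null if absent.
  const attr = (name) => {
    const prefix = `a=${name}:`;
    const line = lines.find((l) => l.startsWith(prefix));
    return line ? line.slice(prefix.length) : null;
  };
  return {
    media: lines.filter((l) => l.startsWith("m=")).map((l) => l.slice(2).split(" ")[0]),
    iceUfrag: attr("ice-ufrag"),
    dtlsFingerprint: attr("fingerprint"),
    dtlsSetup: attr("setup"),
  };
}

console.log(summarizeSdp(sampleSdp).media); // [ 'application' ]
```

The fingerprint line is what ties the DTLS handshake back to the signaling exchange: each peer checks that the certificate it sees during the handshake matches the fingerprint it received in the SDP.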

Now let's talk about the elephant in the room: why is any of this complicated?


The Core Challenge: You Are Hidden From the Internet

Most devices don't have their own public IP address. Instead, your router gets a single public IP, and all the devices on your home network share it through a system called NAT (Network Address Translation).

NAT is great for security — it hides your devices from the public internet. But it creates a fundamental problem for peer-to-peer communication:

If I want to send you a packet directly, what address do I send it to? You don't know your own public IP. I don't know yours either.

WebRTC's answer: before connecting, both peers gather every possible address they could be reached at. These addresses are called ICE candidates.

What are local and remote candidates?

A candidate is simply an address and port through which a peer can be reached.

Local candidates are all the ways your computer can be reached — your local IP on your home network, your public IP discovered via STUN, addresses from multiple interfaces (WiFi, Ethernet, VPN), and so on.

Remote candidates are all the ways the other peer can be reached. When you connect, both sides exchange their local candidates, which become remote candidates from the other's perspective.
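The exchange described above can be sketched roughly like this. `pc` is an RTCPeerConnection; `signaling` is a hypothetical wrapper (with `send()` and an `onmessage` callback) around whatever transport carries your signaling messages, so its shape is an assumption rather than a real library API.

```javascript
// A sketch of trickling ICE candidates between peers.
// `signaling` is a hypothetical object: send(message) and an onmessage callback.
function wireCandidateExchange(pc, signaling) {
  // Every local candidate the browser gathers is forwarded to the other peer.
  pc.onicecandidate = (event) => {
    if (event.candidate) {
      signaling.send({ type: "candidate", candidate: event.candidate });
    }
    // A null candidate marks the end of gathering; there is nothing to send.
  };

  // Candidates arriving from the other peer become our remote candidates.
  signaling.onmessage = async (message) => {
    if (message.type === "candidate") {
      await pc.addIceCandidate(message.candidate);
    }
  };
}
```

One ordering detail to keep in mind: `addIceCandidate` expects the remote description to be set already, so candidates that arrive early may need to be queued until `setRemoteDescription` has completed.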

WebRTC then systematically tries every combination of local and remote candidates to find the fastest, most reliable path. There are three types of candidates you'll commonly see:

  • host — Your machine's actual local IP (e.g., 192.168.1.5:5000). Fast, but only works when both peers are on the same network.

  • srflx (server-reflexive) — Your public IP as seen by a STUN server (e.g., 49.12.13.24:6000). Works across different networks.

  • relay — An address on a TURN server that forwards your traffic. The fallback when everything else fails.
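If you want to see these types for yourself, each candidate is just a line of text. Below is a minimal sketch of a parser for that line; `parseCandidate` is a hypothetical helper, and the example string is hand-written to match the sample addresses above rather than copied from a real session.

```javascript
// Hypothetical helper: split an ICE candidate line into its main fields.
function parseCandidate(line) {
  const parts = line.replace(/^a=/, "").replace(/^candidate:/, "").trim().split(/\s+/);
  const [foundation, component, protocol, priority, address, port] = parts;
  const parsed = {
    foundation,
    component: Number(component),
    protocol: protocol.toLowerCase(),
    priority: Number(priority),
    address,
    port: Number(port),
    type: null,
    relatedAddress: null,
    relatedPort: null,
  };
  // The remainder is key/value pairs such as "typ srflx raddr 192.168.1.5 rport 5000".
  for (let i = 6; i + 1 < parts.length; i += 2) {
    const key = parts[i];
    const value = parts[i + 1];
    if (key === "typ") parsed.type = value;
    else if (key === "raddr") parsed.relatedAddress = value;
    else if (key === "rport") parsed.relatedPort = Number(value);
  }
  return parsed;
}

const example = parseCandidate(
  "candidate:842163049 1 udp 1694498815 49.12.13.24 6000 typ srflx raddr 192.168.1.5 rport 5000"
);
console.log(example.type, example.address, example.port); // prints: srflx 49.12.13.24 6000
```

For a srflx candidate, `raddr` and `rport` carry the local address the public mapping was derived from, which is a handy way to see the NAT mapping in one line.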

With that foundation set, let me walk you through the experiments.


Experiment 1: A Basic Peer-to-Peer Connection

Objective: Verify that two browsers can communicate directly after signaling.

Setup: Two browser tabs on the same desktop machine, with manual SDP exchange.
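For reference, a manual SDP exchange boils down to three steps. This is a minimal sketch with hypothetical function names, assuming each tab already has its own RTCPeerConnection (`pc`) and that you copy the returned text between tabs by hand.

```javascript
// Caller side: produce an offer to hand to the other peer.
async function createOfferSdp(pc) {
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  return JSON.stringify(pc.localDescription); // copy this text to the other tab
}

// Callee side: accept the pasted offer and produce an answer to hand back.
async function acceptOfferSdp(pc, offerText) {
  await pc.setRemoteDescription(JSON.parse(offerText));
  const answer = await pc.createAnswer();
  await pc.setLocalDescription(answer);
  return JSON.stringify(pc.localDescription);
}

// Caller side again: apply the pasted answer to finish negotiation.
async function applyAnswerSdp(pc, answerText) {
  await pc.setRemoteDescription(JSON.parse(answerText));
}
```

With copy-paste signaling there is no channel for trickling candidates afterwards, so it usually helps to wait until `pc.iceGatheringState` is `"complete"` before copying the description; that way the candidates are embedded in the SDP itself.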

What I observed

[Screenshots: Peer A and Peer B]

The ICE state transitioned cleanly:

new → checking → connected

The DataChannel opened successfully, and I could see in the logs:

dtlsState = connected
data-channel state = open
candidateType = host

The two sockets were communicating on ports 58855 ↔ 58857 — both on the same machine.
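The values above came from the internals page, but similar transitions can be watched from page code. Here is a small sketch, assuming `pc` is your RTCPeerConnection and `channel` is the RTCDataChannel; `watchConnection` is a hypothetical helper name.

```javascript
// Hypothetical helper: log ICE, connection, and data-channel state changes.
function watchConnection(pc, channel, log = console.log) {
  pc.oniceconnectionstatechange = () => log(`iceConnectionState = ${pc.iceConnectionState}`);
  pc.onconnectionstatechange = () => log(`connectionState = ${pc.connectionState}`);
  channel.onopen = () => log(`data-channel state = ${channel.readyState}`);
  channel.onclose = () => log(`data-channel state = ${channel.readyState}`);
}
```

Logging both `iceConnectionState` and `connectionState` is deliberate: the first reflects ICE alone, while the second also folds in the DTLS transport, so they can disagree for a moment during setup or teardown.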

Key insight

Only host candidates appeared. No srflx, no relay. Even though a STUN server was configured, WebRTC never needed to use it — it found a direct local path immediately.

This makes sense: both peers were on the same machine, sharing the same network. There was no NAT to traverse.

One subtle but important observation: dtlsState = connected tells us the connection was encrypted. WebRTC enforces encryption by default using DTLS, and there's no option to turn it off. This is often overlooked, but it means your peer-to-peer data is always encrypted in transit.

Limitation: This experiment doesn't reflect real-world conditions. When both peers are on the same machine, everything is fast, predictable, and forgiving. Bugs hide here. We'll see why that matters in Experiment 3.


Experiment 2: What Happens Without STUN

Objective: Prove that WebRTC fails in real network conditions without a STUN server.

Setup: One desktop (on WiFi) and one mobile device (on 4G), no STUN server configured.

What I observed


ICE state: checking → connected → disconnected → failed
Connection state: failed
Candidate types: host only

No data was exchanged. The DataChannel opened locally on each side but no messages arrived.

Why it failed

Without STUN, each peer only knows its private IP:

  • Desktop: 192.168.x.x (your local WiFi subnet)

  • Mobile: 10.x.x.x (carrier NAT)

These addresses are not routable over the internet. Neither peer is visible to the other. When ICE tried host ↔ host, it failed — there was simply no path.
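A quick way to sanity-check a host candidate is to test whether its address falls in a private range. This is a minimal sketch with a hypothetical helper name; it covers the RFC 1918 ranges plus the 100.64.0.0/10 block reserved for carrier-grade NAT, and it ignores IPv6.

```javascript
// Hypothetical helper: true if the IPv4 address is not routable on the public internet.
function isPrivateIPv4(address) {
  const octets = address.split(".").map(Number);
  if (octets.length !== 4 || octets.some((o) => !Number.isInteger(o) || o < 0 || o > 255)) {
    throw new Error(`not an IPv4 address: ${address}`);
  }
  const [a, b] = octets;
  return (
    a === 10 ||                          // 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) ||          // 192.168.0.0/16
    (a === 100 && b >= 64 && b <= 127)   // 100.64.0.0/10 (carrier-grade NAT)
  );
}

console.log(isPrivateIPv4("192.168.1.5")); // true
console.log(isPrivateIPv4("10.20.30.40")); // true
console.log(isPrivateIPv4("49.12.13.24")); // false
```

If every candidate on both sides passes this check and the peers are on different networks, there is no pair that can possibly connect.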

An interesting edge case I hit: I expected it to work when both devices were on the same WiFi network. It didn't. The likely reason is AP isolation — many routers block direct communication between devices on the same WiFi for security reasons, so even devices on the same network couldn't reach each other directly.

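One subtle detail worth noting: dtlsState briefly showed connected even though ICE ultimately failed. DTLS runs on top of the candidate pair that ICE selects, so a connected DTLS state means ICE briefly had a working pair (matching the short-lived connected ICE state above) and the handshake completed over it. Once that pair stopped passing connectivity checks, there was nothing left to carry the data and the connection collapsed.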

Key takeaway: Private IP addresses are insufficient for cross-network communication. STUN is not optional in real-world deployments.


Experiment 3: Adding STUN — And a Hidden Bug Surfaces

Objective: Understand the role of STUN servers and enable cross-network connectivity.

Setup: Added a public STUN server to the configuration. Tested between desktop and mobile.
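The configuration change itself is small. Here is a sketch of what it looks like; the STUN URL is a placeholder I made up for illustration, so substitute the server you actually use.

```javascript
// Hypothetical STUN URL; replace with a real server before use.
const STUN_URL = "stun:stun.example.org:3478";

// Build the configuration object passed to the RTCPeerConnection constructor.
function buildConfig(stunUrl = STUN_URL) {
  return { iceServers: [{ urls: stunUrl }] };
}

// In a browser: const pc = new RTCPeerConnection(buildConfig());
console.log(JSON.stringify(buildConfig()));
```

With this in place, the browser sends a STUN request during gathering and adds whatever public address the server reports back as a srflx candidate.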

What I observed


New candidate types appeared:

candidateType = srflx

ICE reached connected. But something was wrong with the DataChannel — messages only flowed one way, and after a refresh, the direction flipped.

The real bug: asymmetric DataChannel creation

This experiment exposed a bug that was completely invisible in Experiment 1.

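The issue: both peers were calling pc.createDataChannel("chat"). With the default in-band negotiation, only one peer should create a given DataChannel; the other peer receives it via the ondatachannel event. (The exception is a pre-negotiated channel, created on both sides with negotiated: true and the same id.)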

When both peers create their own channel:

  • Peer A sends on its own channel

  • Peer B is listening on a different channel

  • Messages go nowhere silently

In Experiment 1, this bug was hidden. Both tabs ran on the same machine with near-zero latency and predictable timing. One tab happened to execute slightly earlier, became the effective initiator, and the other tab's channel was silently overridden. It appeared to work.

In Experiment 3, with real network latency and STUN negotiation adding async steps, the race condition was exposed.

This is one of the most important lessons from these experiments:

WebRTC behavior can appear correct in simple environments even when the implementation is wrong. The same machine, the same network, the controlled lab — these mask bugs that only appear under real-world async conditions.

The fix: Only one peer creates the DataChannel. The other listens:

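// `pc` is each peer's own RTCPeerConnection.

// Peer A (initiator): creates the channel before creating the offer
const dataChannel = pc.createDataChannel("chat");
dataChannel.onopen = () => dataChannel.send("hello from A");

// Peer B (receiver): never calls createDataChannel; it waits for A's channel
pc.ondatachannel = (event) => {
  const dataChannel = event.channel;
  dataChannel.onmessage = (msg) => console.log("received:", msg.data);
};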
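If you would rather run identical code on both peers, the API also supports pre-negotiated channels. This is a sketch of that alternative, not what I used in the experiments; `openSymmetricChannel` is a hypothetical name, and the `id` of 0 is an arbitrary value both sides must agree on.

```javascript
// Both peers run this same function. With `negotiated: true` and a shared `id`,
// neither side announces the channel, so `ondatachannel` does not fire.
function openSymmetricChannel(pc, onMessage) {
  const channel = pc.createDataChannel("chat", { negotiated: true, id: 0 });
  channel.onmessage = (event) => onMessage(event.data);
  return channel;
}
```

The trade-off is that the agreement on `id` now lives in your code instead of being handled by the browser, so a mismatch between the two sides leaves each peer with a channel the other never hears.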

Experiment 4: ICE Candidate Analysis and Path Selection

Objective: Understand how WebRTC evaluates multiple paths and picks the best one.

Setup: Desktop and mobile, STUN enabled, analyzing the full candidate list.

What I observed


Peer A (desktop):

  • Local candidates: 2 host, 2 srflx

  • Remote candidates: 1 host, 2 srflx

Peer B (mobile):

  • Local candidates: 4 host, 2 srflx

  • Remote candidates: 1 host, 1 srflx

The higher number of host candidates on the mobile device likely reflects multiple network interfaces and address families (cellular, WiFi, IPv4 and IPv6), each generating its own candidate.

The selected candidate pair was:

local host ↔ remote host
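The selected pair can also be read programmatically from the stats report instead of the internals page. Here is a sketch, assuming `report` is the Map-like object returned by `await pc.getStats()`; `findSelectedPair` is a hypothetical helper, and the fallback branch is there because not every browser fills in `selectedCandidatePairId`.

```javascript
// Hypothetical helper: summarize the candidate pair currently in use.
function findSelectedPair(report) {
  let pair = null;
  // Preferred route: the transport entry points at the selected pair.
  for (const stat of report.values()) {
    if (stat.type === "transport" && stat.selectedCandidatePairId) {
      pair = report.get(stat.selectedCandidatePairId);
    }
  }
  // Fallback: take a nominated pair whose connectivity check succeeded.
  if (!pair) {
    for (const stat of report.values()) {
      if (stat.type === "candidate-pair" && stat.nominated && stat.state === "succeeded") {
        pair = stat;
      }
    }
  }
  if (!pair) return null;
  const local = report.get(pair.localCandidateId);
  const remote = report.get(pair.remoteCandidateId);
  return {
    localType: local ? local.candidateType : undefined,
    remoteType: remote ? remote.candidateType : undefined,
    rttSeconds: pair.currentRoundTripTime,
  };
}
```

In a browser you would call it as `findSelectedPair(await pc.getStats())` once the connection is up.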

Why this path was chosen — and what latency tells us

Even though srflx candidates were available (discovered via STUN), WebRTC selected a direct host ↔ host path. ICE weighs candidate pairs on three factors:

  1. Connectivity: does the path actually work? Only pairs that pass connectivity checks are eligible.

  2. Priority: host candidates are scored higher than srflx, which are scored higher than relay.

  3. Round Trip Time (RTT): among working pairs, lower-latency paths tend to be preferred. The ICE specification orders checks by priority and leaves the final choice to the controlling agent, so how much RTT matters depends on the browser.

The host ↔ host path had the lowest RTT. It required no NAT traversal and no extra hops. Relay-based paths add a hop through the TURN server, so they generally show higher latency than direct ones.
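The priority ordering is not arbitrary; it falls out of a formula. Below is a sketch of the calculation recommended in RFC 8445, using the type preference values the RFC suggests. Browsers are free to use different preferences, so treat the exact numbers as illustrative.

```javascript
// Recommended type preferences from RFC 8445 (implementations may differ).
const TYPE_PREFERENCE = { host: 126, prflx: 110, srflx: 100, relay: 0 };

// priority = 2^24 * typePreference + 2^8 * localPreference + (256 - componentId)
function candidatePriority(type, localPreference = 65535, componentId = 1) {
  return TYPE_PREFERENCE[type] * 2 ** 24 + localPreference * 2 ** 8 + (256 - componentId);
}

// pair priority = 2^32 * min(G, D) + 2 * max(G, D) + (G > D ? 1 : 0)
// G is the controlling agent's candidate priority, D the controlled agent's.
// BigInt is used because the result does not fit in a safe JavaScript integer.
function pairPriority(controlling, controlled) {
  const g = BigInt(controlling);
  const d = BigInt(controlled);
  const min = g < d ? g : d;
  const max = g > d ? g : d;
  return (1n << 32n) * min + 2n * max + (g > d ? 1n : 0n);
}

console.log(candidatePriority("host"));  // 2130706431
console.log(candidatePriority("srflx")); // 1694498815
console.log(candidatePriority("relay")); // 16777215
```

Because the pair formula is dominated by the lower of the two candidate priorities, a host ↔ host pair outranks any pair that involves a srflx or relay candidate on either side.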

Core insight: WebRTC does not blindly use STUN-discovered public paths. It evaluates all available paths and selects the most efficient working route. When a direct local path exists, it wins.


Experiment 5: TURN Relay Behavior — And an Unexpected Result

Objective: Force WebRTC to use a TURN relay server and observe the behavior.

Setup: Configured a public TURN server and forced relay-only mode:

iceTransportPolicy: "relay"
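For clarity, here is a sketch of where that setting lives in the configuration object. The TURN URL and credentials are placeholders I made up for illustration, and `relayOnlyConfig` is a hypothetical helper name.

```javascript
// Hypothetical TURN details; replace with your own server and credentials.
function relayOnlyConfig({ turnUrl, username, credential }) {
  return {
    iceServers: [{ urls: turnUrl, username, credential }],
    // Must sit at the top level of the configuration, next to iceServers.
    iceTransportPolicy: "relay",
  };
}

const config = relayOnlyConfig({
  turnUrl: "turn:turn.example.org:3478",
  username: "demo-user",
  credential: "demo-pass",
});
// In a browser: const pc = new RTCPeerConnection(config);
console.log(config.iceTransportPolicy); // relay
```

If the policy ends up anywhere else (inside an `iceServers` entry, for example), the browser ignores it and gathers host and srflx candidates as usual.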

What I observed

This one surprised me.

  • No relay candidates were generated

  • onicecandidateerror fired with error code 701

  • The TURN server was unreachable from my local network

  • Despite setting iceTransportPolicy: "relay", the connection succeeded anyway — using host and srflx candidates

Why this happened

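The public TURN server I configured was inaccessible: either my local network's firewall blocked it or the server itself was down. With no relay candidates available, the connection still went through on host and srflx candidates. That part deserves a caveat. With iceTransportPolicy: "relay" actually in effect, the browser is supposed to use relay candidates only, and the connection should have failed. The most likely explanation is that the policy was not applied to the RTCPeerConnection under test (for example, it was set outside the configuration object passed to the constructor).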

This revealed something important about TURN that's often glossed over:

TURN does not guarantee connectivity by itself. It introduces an external dependency that can fail due to firewall rules, UDP/TCP blocking, or server reliability issues.

In a production system, you would run your own TURN server (or use a reliable hosted service) and ensure it's accessible from the networks your users are on. A TURN server that your users can't reach is no TURN server at all.
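Since an unreachable TURN server fails quietly unless you listen for it, it is worth surfacing the error explicitly. Here is a minimal sketch; the helper names are hypothetical, while `errorCode`, `errorText`, and `url` are fields of the event the browser passes to `onicecandidateerror`.

```javascript
// Hypothetical helper: turn an ICE candidate error event into a readable message.
function describeIceError(event) {
  const where = event.url || "unknown server";
  if (event.errorCode === 701) {
    // 701 means the browser could not reach the STUN/TURN server at all.
    return `${where} unreachable (701): ${event.errorText || "no response"}`;
  }
  // Other codes are STUN error codes, e.g. 401 for rejected TURN credentials.
  return `${where} returned ${event.errorCode}: ${event.errorText || ""}`.trim();
}

// Attach the reporter to a peer connection.
function reportIceErrors(pc, log = console.warn) {
  pc.onicecandidateerror = (event) => log(describeIceError(event));
}
```

Pairing this with a check that at least one relay candidate was actually gathered gives you an early warning before users on restrictive networks hit a failed connection.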


Real-World Applications

WebRTC is the technology behind more applications than most people realize:

Video conferencing: Google Meet, Jitsi, and many others build on WebRTC for their audio and video streams. One-to-one calls can run peer to peer, which means lower latency and less load on their servers; larger calls are typically routed through a media server that still speaks WebRTC.

Live streaming and broadcasting: Platforms that need ultra-low latency (under 500 ms) use WebRTC where traditional HLS/DASH protocols introduce seconds of delay.

Peer-to-peer file sharing: Services like Snapdrop use WebRTC DataChannels to transfer files directly between browsers on the same network, with no file ever touching a server.

Online multiplayer gaming: DataChannels run over SCTP and DTLS on top of UDP and can be configured as unordered and unreliable, which gives game developers a real-time transport layer built into the browser.

Collaborative tools: Screen sharing, whiteboarding, and document collaboration apps use WebRTC for their real-time sync.


What Was Used

All experiments were conducted using:

  • Node.js with Express as the signaling server

  • Socket.IO for real-time signaling message exchange

  • Browser WebRTC APIs (RTCPeerConnection, RTCDataChannel)

  • brave://webrtc-internals (the Brave equivalent of Chrome's chrome://webrtc-internals) for deep inspection of ICE candidates, DTLS state, and connection logs

The full source code for all experiments is available on GitHub.


Conclusion

WebRTC is deceptively approachable on the surface — a few API calls and you have a working connection. But underneath, it's a carefully orchestrated system handling NAT traversal, encryption, candidate negotiation, and fallback strategies simultaneously.

The experiments surfaced three things I wouldn't have understood from documentation alone:

First, the same-machine environment masks bugs. Code that looks correct in controlled conditions can break silently the moment real network latency is introduced. Always test across actual devices and networks.

Second, STUN and TURN are not optional extras — they are the difference between a demo and a product. Without STUN, cross-network connections fail. Without reliable TURN, some users will never connect.

Third, WebRTC makes smart decisions. It doesn't use your STUN-discovered public IP out of habit — it actively evaluates all paths and picks the fastest one that works. That's ICE doing exactly what it was designed for.

Real-time communication on the web is hard. WebRTC makes it possible — but understanding how it works makes it reliable.