Early media is the audio that flows before a SIP call is answered — ringback, announcements, IVR prompts, custom hold tones. It is enabled by 183 Session Progress with SDP, but interoperability is famously inconsistent. Reliable provisional responses (100rel) were meant to fix this, and partly did.
Early media is RTP audio that flows in a SIP dialog before the call is answered — before any 200 OK response. The dialog exists in early state, the SDP has been negotiated in a provisional response, and audio packets begin flowing.
The most familiar example is network-generated ringback. The caller hears a ring tone, but it is not generated locally by their phone; it is sent as actual audio from a switch, gateway, or carrier. Early media is also how IVR prompts, busy announcements, and PSTN-derived call progress tones reach the caller before the called party answers.
Without early media, the only audio a caller hears before answer is whatever their own phone generates locally. With early media, the network can convey state through audio rather than through SIP signaling alone.
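183 Session Progress is the response code that indicates session establishment is in progress and there may be early media to convey. When sent with SDP, it tells the caller's UA to set up the RTP session described by that SDP and render the incoming audio instead of generating ringback locally.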
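The SDP in 183 follows the same offer/answer rules as SDP in 200 OK. Within a single dialog, RFC 3261 expects the SDP in 200 OK to match the SDP already sent in 183, so the codec normally stays the same. When the INVITE forks or a B2BUA sits in the path, the 200 OK can arrive with different SDP, often on a different dialog; in that case the call switches codecs at answer.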
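To make the role of the SDP concrete, the following is a minimal sketch of pulling the RTP address, port, and codecs out of a 183 response. The function name `parse_early_media_sdp` and the sample message are illustrative assumptions, not part of any particular SIP stack, and the parser handles only the simple single-audio-stream case.

```python
def parse_early_media_sdp(sip_message: str) -> dict:
    """Extract the RTP address, port, and codecs from a SIP message body.

    Hypothetical helper: handles one audio stream and a session-level
    or media-level c= line, nothing more.
    """
    # Headers and body are separated by the first blank line.
    _headers, _, body = sip_message.replace("\r\n", "\n").partition("\n\n")
    info = {"address": None, "port": None, "payload_types": [], "codecs": {}}
    for line in body.splitlines():
        line = line.strip()
        if line.startswith("c=") and info["address"] is None:
            # Example: c=IN IP4 192.0.2.10
            info["address"] = line.split()[-1]
        elif line.startswith("m=audio") and info["port"] is None:
            # Example: m=audio 49170 RTP/AVP 0 101
            fields = line[2:].split()
            info["port"] = int(fields[1])
            info["payload_types"] = [int(pt) for pt in fields[3:]]
        elif line.startswith("a=rtpmap:"):
            # Example: a=rtpmap:0 PCMU/8000
            pt, encoding = line[len("a=rtpmap:"):].split(None, 1)
            info["codecs"][int(pt)] = encoding
    return info


# Made-up 183 response used only for illustration.
SAMPLE_183 = (
    "SIP/2.0 183 Session Progress\r\n"
    "CSeq: 1 INVITE\r\n"
    "Content-Type: application/sdp\r\n"
    "\r\n"
    "v=0\r\n"
    "o=- 1 1 IN IP4 192.0.2.10\r\n"
    "s=-\r\n"
    "c=IN IP4 192.0.2.10\r\n"
    "t=0 0\r\n"
    "m=audio 49170 RTP/AVP 0 101\r\n"
    "a=rtpmap:0 PCMU/8000\r\n"
    "a=rtpmap:101 telephone-event/8000\r\n"
)

early_media = parse_early_media_sdp(SAMPLE_183)
```

Running the same function over the eventual 200 OK and comparing the two results is one simple way to spot a codec change at answer.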
180 Ringing can also carry SDP for early media, though this is less common. The semantic distinction is that 180 specifically means “the called endpoint is alerting the user” while 183 is more generic.
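The difference between 180 and 183 can be sketched as a small decision function on the caller's side. This is a simplified assumption about UA behavior, with a hypothetical function name and return values; real endpoints vary, and some keep playing local ringback until RTP actually arrives.

```python
def ringback_action(status_code: int, has_sdp: bool) -> str:
    """Decide what the caller's UA should play for a provisional response.

    Hypothetical policy: SDP in 180/183 means in-band audio, a bare 180
    means local ringback, anything else leaves the current state alone.
    """
    if status_code in (180, 183) and has_sdp:
        # The network is offering early media: render the incoming RTP.
        return "play_early_media"
    if status_code == 180:
        # The far end is alerting but sends no audio: ring locally.
        return "local_ringback"
    # 100 Trying, 183 without SDP, and so on: nothing to change.
    return "no_change"
```

For example, `ringback_action(180, False)` selects local ringback, while `ringback_action(183, True)` selects the in-band audio.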
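Provisional responses (1xx) are normally sent unreliably. Over UDP, the originator of an INVITE retransmits the request only until it receives some response; after that it simply waits for a final response (2xx-6xx). Provisional responses arrive along the way, but the responder does not retransmit them and never learns whether they were received.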
This causes problems for early media. If the 183 with SDP is lost in transit, the caller's UA does not know the SDP existed, does not open the RTP port, and the early media RTP that follows lands on a closed socket and is dropped. The caller hears silence.
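The lost-183 symptom can be checked mechanically once a trace has been reduced to a timeline. The sketch below assumes a hypothetical event format, `(timestamp, kind)` tuples seen at the caller where `kind` is `"183_sdp"`, `"200_ok"`, or `"rtp"`, and counts the RTP packets that arrived before the caller had received any SDP.

```python
from typing import Iterable, Tuple


def rtp_dropped_before_sdp(events: Iterable[Tuple[float, str]]) -> int:
    """Count RTP packets that reached the caller before any SDP did.

    The event kinds are illustrative labels, not a real capture format.
    """
    sdp_seen = False
    dropped = 0
    for _timestamp, kind in sorted(events):
        if kind in ("183_sdp", "200_ok"):
            # From here on the caller knows where media is expected.
            sdp_seen = True
        elif kind == "rtp" and not sdp_seen:
            # Audio with no negotiated session: the caller hears silence.
            dropped += 1
    return dropped


# The 183 was lost, so the first two RTP packets arrive unannounced.
lost_183 = [(0.40, "rtp"), (0.42, "rtp"), (5.00, "200_ok"), (5.10, "rtp")]
# The 183 arrived, so every RTP packet has a session to land in.
good_183 = [(0.30, "183_sdp"), (0.40, "rtp"), (5.00, "200_ok")]
```

A non-zero count on a call where the caller reported silence points toward a lost or missing 183 rather than a media-path problem.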
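RFC 3262 introduced reliable provisional responses to fix this. The mechanism: the caller advertises the 100rel option tag in a Supported or Require header; the callee sends the 1xx with Require: 100rel and an RSeq header and retransmits it until the caller acknowledges it with a PRACK request carrying a matching RAck header; the callee then answers the PRACK with 200 OK.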
This makes 183 with SDP reliable, eliminating the “lost 183” problem. But 100rel adds complexity and is not universally supported. Many older endpoints do not implement it and reject an INVITE that carries Require: 100rel with 420 Bad Extension, breaking the call.
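In RFC 3262, a reliable provisional response carries an RSeq header, and the PRACK that acknowledges it carries an RAck header built from that RSeq value plus the CSeq number and method of the response. The sketch below shows that derivation; `build_rack` is a hypothetical helper, and it assumes headers are already parsed into a dict with canonical header-name casing.

```python
from typing import Mapping, Optional


def build_rack(headers: Mapping[str, str]) -> Optional[str]:
    """Return the RAck value for a PRACK, or None if no PRACK is needed.

    Assumes a pre-parsed header dict; real SIP header names are
    case-insensitive and may use compact forms, which this ignores.
    """
    required = [tag.strip().lower() for tag in headers.get("Require", "").split(",")]
    if "100rel" not in required or "RSeq" not in headers:
        # Ordinary unreliable 1xx: it must not be acknowledged with PRACK.
        return None
    cseq_number, method = headers["CSeq"].split()
    # RAck = <RSeq number> <CSeq number> <CSeq method>
    return f"{int(headers['RSeq'])} {int(cseq_number)} {method}"


# Made-up header values for illustration.
reliable_183 = {"Require": "100rel", "RSeq": "813520", "CSeq": "1 INVITE"}
plain_183 = {"CSeq": "1 INVITE"}
```

Here `build_rack(reliable_183)` yields `813520 1 INVITE`, while `build_rack(plain_183)` yields `None` because the response was not sent reliably.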
The carrier or PBX generates the ringback tone instead of the caller's phone. This allows for regional variations, custom carrier branding, and consistent quality regardless of the caller's phone capabilities.
SS7-to-SIP interworking carries call progress tones (busy, fast busy, reorder, special information tones) into the SIP world. These are audio events that callers hear before the call connects or fails.
Some applications (banking auth challenges, toll-charge announcements, DTMF prompts before connecting) play audio to the caller without the call being billed as answered. The PSTN does not consider the call answered until 200 OK, so a call that has carried only early media via 183 is still treated as unanswered.
“This call may be recorded”, “Please hold while we connect you”, and similar announcements often arrive as early media before the call is bridged to a live agent.
The voicemail system picks up and plays the greeting, but technically the dialog is still in the early state and the greeting is early media. The 200 OK comes later, when the system is ready to record.
The diagnostic flow:
Early media is RTP audio that flows in a SIP dialog before the call is answered with 200 OK. It enables network-generated ringback tones, PSTN call progress tones, IVR prompts, and pre-call announcements. Early media is offered via 183 Session Progress with SDP — the caller's UA opens an RTP socket on the offered port and renders the incoming audio.
180 Ringing specifically means the called endpoint is alerting the user — the phone is physically ringing. 183 Session Progress is more generic; it means call setup is in progress and there may be early media (ringback, announcement, busy tone) to convey via SDP. 180 without SDP triggers local ringback on the caller's phone; 183 with SDP triggers in-band ringback delivered over RTP.
100rel is the SIP option tag for reliable provisional responses, defined in RFC 3262. It makes 1xx responses reliable through PRACK acknowledgment. This matters for early media because 183 Session Progress is normally sent unreliably: if it is lost in transit, the caller never opens the RTP socket and hears silence. 100rel ensures the 183 arrives. The downside is that some endpoints do not support 100rel and reject calls that require it with 420 Bad Extension.
Paste your SIP trace into SIPSymposium. The analyzer correlates 183 Session Progress, 100rel negotiation, and RTP timing to identify exactly why early media is failing.