Echo on VoIP Calls

6 min read  ·  Updated April 2026

Echo on a VoIP call means the talker hears their own voice repeated back after a delay. The cause is usually one of three things: acoustic feedback, hybrid coupling at a TDM-to-IP boundary, or network jitter that defeats echo cancellation. Each has a different fix.

1. The three types of echo

Echo in voice calls comes in three distinct forms, each with different mechanics:

Acoustic echo

The listener's microphone picks up the far-end audio playing from their own loudspeaker and transmits it back to the far end. The far-end talker hears their own voice with a round-trip delay. Caused by speakerphone use, poor handset acoustics, or feedback paths in the room.

Hybrid echo

At a 4-wire to 2-wire conversion point (any TDM-to-IP boundary touching the PSTN), some of the outbound signal leaks into the inbound path due to imperfect impedance matching. The far end hears their own voice reflected from the hybrid. Common in PRI gateways, FXO ports, and SIP-to-PSTN bridges.

Network echo

Echo cancellation depends on consistent timing. If RTP packets arrive with high jitter, late, or out of order, the canceller's reference signal does not align with the actual echo, and cancellation fails. The echo is not generated by the network — it is acoustic or hybrid in origin — but the network causes the canceller to fail.

2. How echo cancellation works

Echo cancellation works by maintaining a model of the echo path. The canceller knows what audio it just sent (the reference signal). It compares incoming audio against a delayed version of the reference. Anything that matches is identified as echo and subtracted from the incoming stream.

For this to work, three conditions must hold:

  1. The canceller has access to the outbound reference signal — usually true at endpoints, sometimes lost across network boundaries
  2. The delay between sending and the echo arriving is stable enough to model — jitter degrades this
  3. The echo path is mostly linear — nonlinearities (clipping, compression artifacts) reduce cancellation effectiveness
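
The compare-and-subtract loop described above can be sketched with a normalized least-mean-squares (NLMS) adaptive filter, one common way to build an echo canceller. This is a minimal illustration, not how any particular handset or SBC implements it: the function name, tap count, and step size are assumptions, the echo path is a synthetic 5-sample delay at 40% level, and there is no double-talk handling. It assumes all three conditions hold (reference available, stable delay, linear path).

```python
import math
import random


def nlms_cancel(reference, received, taps=16, mu=0.5, eps=1e-6):
    """Subtract the estimated echo of `reference` from `received`.

    Returns the residual signal (received minus the echo estimate).
    """
    weights = [0.0] * taps   # current model of the echo path
    history = [0.0] * taps   # most recent reference samples, newest first
    residual = []
    for ref_sample, rx_sample in zip(reference, received):
        history = [ref_sample] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = rx_sample - echo_estimate          # what is left after cancellation
        power = sum(x * x for x in history) + eps  # normalizes the step size
        step = mu * error / power
        weights = [w + step * x for w, x in zip(weights, history)]
        residual.append(error)
    return residual


def power_db(samples):
    """Average power of a block of samples, in dB."""
    return 10 * math.log10(sum(s * s for s in samples) / len(samples) + 1e-12)


# Synthetic case: the "echo" is the reference delayed by 5 samples at 40% level.
random.seed(0)
reference = [random.uniform(-1.0, 1.0) for _ in range(4000)]
received = [0.4 * reference[n - 5] if n >= 5 else 0.0 for n in range(len(reference))]
residual = nlms_cancel(reference, received)

# Echo reduction measured over the last 1000 samples, after the filter has adapted.
erle_db = power_db(received[-1000:]) - power_db(residual[-1000:])
print(f"echo reduced by about {erle_db:.0f} dB")
```

If the delay in the synthetic echo path is made longer than the filter (more than 15 samples here), the filter cannot represent it and the reduction collapses, which is the same limit a real canceller hits when the echo arrives later than its tail length.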

Modern echo cancellers in handsets and SBCs handle echo-path delays (the time between sending the reference and the echo arriving back at the canceller) up to about 128 ms with good cancellation. Above that, residual echo becomes audible. Above 250 ms, cancellation is essentially absent and the user hears full echo.
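
To make the 128 ms and 250 ms figures above concrete, the sketch below converts a delay window into the number of filter taps a canceller would need at a given sample rate, and maps a measured echo delay onto the three bands this section describes. The function names are hypothetical, and the 8 kHz default is an assumption (narrowband telephony); the band boundaries come straight from the paragraph above.

```python
def taps_for_window(window_ms, sample_rate_hz=8000):
    """Number of filter taps needed to span a delay window of window_ms."""
    return window_ms * sample_rate_hz // 1000


def expected_cancellation(echo_delay_ms):
    """Map an echo delay onto the bands described in this guide."""
    if echo_delay_ms <= 128:
        return "good cancellation"
    if echo_delay_ms <= 250:
        return "residual echo audible"
    return "essentially uncancelled"


print(taps_for_window(128))          # 1024 taps at 8 kHz
print(taps_for_window(128, 16000))   # 2048 taps at 16 kHz wideband
print(expected_cancellation(180))    # residual echo audible
```

The tap count doubles with the sample rate, which is one reason long delay windows are more expensive on wideband calls.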

3. Acoustic echo

Acoustic echo originates in the user's environment. The loudspeaker plays the far-end audio, the microphone picks it up, and the audio travels back to the far end, where the talker hears themselves.

Cause 01: Speakerphone with no AEC

Cheap speakerphones or speakerphone modes on phones with weak acoustic echo cancellation (AEC) feed the speaker output directly into the microphone. Common on mobile devices in low-quality calls, and on conference room speakerphones not designed for VoIP.

Cause 02: Headset microphone too close to ear cup

Some headsets have boom mics positioned to pick up significant audio from the ear cup speaker. Common with consumer-grade headsets used for business VoIP.

Cause 03: High playback volume

Cranking the speaker or handset volume increases acoustic feedback to the microphone. The AEC can handle a range, but at high volumes it cannot cancel everything.

Cause 04: Hands-free in a small reverberant room

Hard surfaces and small spaces produce strong acoustic reflections. AEC models direct paths well; reflections create longer-tail echo that overwhelms the canceller.

Fixes are physical: switch to a headset, lower volume, move the microphone, treat the room, or use a phone with stronger AEC.

4. Hybrid echo at TDM-IP boundaries

Hybrid echo happens at impedance-mismatched boundaries between IP and analog or TDM voice. The classic case is an FXO port connecting a SIP PBX to an analog phone line. The 4-wire IP side has separate transmit and receive paths; the 2-wire analog side has them combined. The transformer that bridges them (the “hybrid”) leaks some transmit signal into the receive direction.

How much leaks depends on impedance match. Perfect match would leak nothing; real-world line impedance varies and the match is always imperfect. The leaked signal travels back to the IP side as audible echo to whoever is on the IP end of the call.
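
The relationship between impedance match and leakage can be put in numbers with the standard reflection-coefficient formula: with line impedance Z_L and hybrid balance impedance Z_B, the reflected fraction is (Z_L - Z_B) / (Z_L + Z_B), and the return loss in dB is -20 log10 of its magnitude. The sketch below is illustrative only; the function names are hypothetical and the 600 and 900 ohm values are assumed examples, not measurements from any particular line.

```python
import math


def reflection_coefficient(line_impedance, balance_impedance):
    """Fraction of the transmit signal reflected back by the mismatch."""
    return (line_impedance - balance_impedance) / (line_impedance + balance_impedance)


def return_loss_db(line_impedance, balance_impedance):
    """Return loss in dB; higher means less leakage. Works for complex impedances."""
    magnitude = abs(reflection_coefficient(line_impedance, balance_impedance))
    if magnitude == 0:
        return math.inf   # perfect match: nothing leaks
    return -20 * math.log10(magnitude)


# Assumed example: hybrid balanced for 600 ohms, line actually presents 900 ohms.
print(round(return_loss_db(900, 600), 1))   # about 14.0 dB
print(return_loss_db(600, 600))             # inf (perfect match)
```

In this assumed case a fifth of the signal amplitude comes back, which is easily audible once the round-trip delay is long enough for the listener to notice it.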

Hybrid echo can be cancelled effectively when:

Most modern PRI/FXO gateways and SBCs include hardware or DSP-based echo cancellation specifically for hybrid echo. If echo persists at a known TDM boundary, check the gateway's EC settings, especially the tail length parameter.

5. Network-induced echo

Network echo is misnamed — the echo is acoustic or hybrid, but the network defeats the canceller. The mechanism:

  1. The echo canceller models the round-trip delay based on observed timing
  2. Network jitter increases the variance of arrival times
  3. The canceller cannot lock onto a stable delay
  4. Echo passes through partially cancelled or completely uncancelled

Less directly, packet loss and reordering can cause the canceller to drift. The reference signal is lost or arrives out of order, the model becomes incorrect, and echo leaks until the model recovers.

Echo that worsens during network congestion, varies over a single call, or correlates with high jitter is network-induced. The fix is at the network level — reducing jitter, applying QoS, prioritizing RTP, or upgrading congested links.
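
To check whether echo correlates with jitter, it helps to compute jitter the same way RTP endpoints report it. The sketch below implements the interarrival jitter estimator from RFC 3550 (a running average of transit-time differences with a gain of 1/16). The function name and the sample packet streams are assumptions for illustration, the 8000 Hz clock rate assumes a narrowband audio payload, and RTP timestamp wraparound is ignored for brevity.

```python
def interarrival_jitter_ms(rtp_timestamps, arrival_times_s, clock_rate_hz=8000):
    """RFC 3550 interarrival jitter, returned in milliseconds."""
    jitter = 0.0              # running estimate, in RTP timestamp units
    previous_transit = None
    for timestamp, arrival in zip(rtp_timestamps, arrival_times_s):
        # Relative transit time: arrival expressed in timestamp units minus the RTP timestamp.
        transit = arrival * clock_rate_hz - timestamp
        if previous_transit is not None:
            deviation = abs(transit - previous_transit)
            jitter += (deviation - jitter) / 16.0
        previous_transit = transit
    return jitter * 1000.0 / clock_rate_hz


# 20 ms packets (160 timestamp units each at 8 kHz).
timestamps = [160 * i for i in range(200)]
steady_arrivals = [0.020 * i for i in range(200)]
# Every second packet arrives 10 ms late.
jittery_arrivals = [0.020 * i + (0.010 if i % 2 else 0.0) for i in range(200)]

steady_ms = interarrival_jitter_ms(timestamps, steady_arrivals)
jittery_ms = interarrival_jitter_ms(timestamps, jittery_arrivals)
print(f"steady: {steady_ms:.2f} ms, jittery: {jittery_ms:.2f} ms")
```

Plotting this value over the length of a call and marking the moments when echo was heard is a simple way to test the correlation described above.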

Codec choice also matters. Codecs that handle packet loss gracefully (Opus with FEC, G.722 with PLC) preserve the quality of the echo canceller's reference better than older codecs. G.729 is particularly weak here: its heavy compression distorts the signal nonlinearly, which makes it harder for the canceller to match the returning echo against its reference.

6. Diagnosing echo from a call

Echo diagnosis depends on identifying which side hears it and what triggers it:

For persistent echo, the classic isolation method is to swap one variable at a time: different phone, different headset, different network path, different codec. Whatever change eliminates echo localizes the source.
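
The one-variable-at-a-time method can be written down as a small test plan so that no trial accidentally changes two things at once. This is only a sketch: the function names are hypothetical, and the device, network, and codec values are made-up placeholders for whatever is actually in use.

```python
def single_swap_plan(baseline, alternatives):
    """Build one trial per variable, each differing from the baseline in exactly one setting."""
    plan = []
    for variable, replacement in alternatives.items():
        trial = dict(baseline)
        trial[variable] = replacement
        plan.append((variable, trial))
    return plan


def localize(results):
    """results maps each swapped variable to True if echo was still heard in that trial."""
    return [variable for variable, echo_heard in results.items() if not echo_heard]


# Placeholder values for illustration only.
baseline = {
    "phone": "desk phone A",
    "headset": "none (speakerphone)",
    "network path": "office Wi-Fi",
    "codec": "G.729",
}
alternatives = {
    "phone": "desk phone B",
    "headset": "wired headset",
    "network path": "wired LAN",
    "codec": "G.711",
}

plan = single_swap_plan(baseline, alternatives)
for variable, trial in plan:
    print(f"swap {variable}: {trial}")

# Suppose echo disappeared only in the headset trial.
suspects = localize({"phone": True, "headset": False, "network path": True, "codec": True})
print(suspects)   # ['headset']
```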

Frequently asked questions

What causes echo on VoIP calls?

Echo on VoIP calls comes from three sources: acoustic feedback (speaker audio picked up by microphone), hybrid coupling at TDM-to-IP boundaries (analog/PRI gateways with imperfect impedance matching), and echo cancellation failures caused by network jitter or packet loss. Each type has a different fix — acoustic is usually a hardware or volume issue, hybrid is a gateway echo canceller setting, and network-induced echo is a QoS or congestion issue.

Why do I only hear echo on calls to landlines?

Echo that appears only on calls to landlines is usually hybrid echo. At the IP-to-PSTN boundary, the FXO or PRI gateway converts between 4-wire (separate transmit and receive) and 2-wire (combined) audio paths. Imperfect impedance matching causes some of the outbound signal to leak back into the inbound path. The fix is on the gateway: enable hardware echo cancellation with an appropriate tail length (32 to 128 ms depending on line characteristics).

Can network problems cause echo on VoIP?

Yes, indirectly. Network jitter and packet loss do not generate echo themselves, but they prevent the echo canceller from working correctly. The canceller needs stable timing to model the echo path; high jitter makes the model inaccurate and echo passes through uncancelled. Echo that varies during a call or worsens with congestion is usually network-induced. The fix is reducing jitter and applying QoS to RTP traffic.

Tracing echo back to its source?

Paste your SIP trace into SIPSymposium. The analyzer correlates RTP jitter, codec negotiation, and call quality metrics to help identify whether echo is acoustic, hybrid, or network-induced.
