Common Networking Issues and How to Diagnose Them Fast
Most production network problems fall into a handful of repeating patterns. Recognising the pattern from a few symptoms is the difference between a 10-minute fix and a 2-hour war room.
“It’s slow” — the latency family
Slowness has many causes. The right question is “slow where?”:
- High RTT on ping: the underlying network path is long or congested. Check traceroute for the hop where latency jumps.
- Fast ping but slow page load: your server is slow, or DNS resolution is slow, or you’re TLS-handshaking on every request.
- Fast first byte, slow download: bandwidth is limited, the TCP window is too small, or packet loss keeps shrinking the congestion window.
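
One way to answer “slow where?” is to split a single request into phases using curl's cumulative timers. The sketch below is illustrative rather than definitive: the function name `slowest_phase` and the phase labels are made up for this example, and the "server" phase also includes one round trip, so treat the result as a pointer, not a verdict.

```shell
# slowest_phase DNS CONNECT TLS TTFB TOTAL
# Inputs are curl's cumulative timers in seconds: time_namelookup,
# time_connect, time_appconnect, time_starttransfer, time_total.
# Prints the phase that took longest and how long it took.
slowest_phase() {
    awk -v dns="$1" -v conn="$2" -v tls="$3" -v ttfb="$4" -v total="$5" 'BEGIN {
        # curl reports 0 for time_appconnect when there is no TLS handshake.
        if (tls + 0 == 0) tls = conn
        d["dns"]      = dns + 0        # name resolution
        d["tcp"]      = conn - dns     # TCP handshake
        d["tls"]      = tls - conn     # TLS handshake
        d["server"]   = ttfb - tls     # request sent until first byte back
        d["download"] = total - ttfb   # first byte until last byte
        best = "dns"
        for (p in d) if (d[p] > d[best]) best = p
        printf "%s %.3f\n", best, d[best]
    }'
}

# Usage (hypothetical host "target"):
#   slowest_phase $(curl -s -o /dev/null \
#     -w '%{time_namelookup} %{time_connect} %{time_appconnect} %{time_starttransfer} %{time_total}' \
#     https://target)
```

A result of `dns` or `tls` points at the second bullet, `download` at the third, and a large `tcp` value usually means the path itself has a high RTT.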
“It’s broken” — the connectivity family
The classic symptoms and their usual cause:
| Symptom | Likely cause |
|---|---|
| “Connection refused” | Service is down or not listening on that port |
| “Connection timed out” | Firewall silently dropping packets, or the host is down |
| “No route to host” | No working route to the destination, or a host on the local subnet is not answering ARP |
| “Connection reset by peer” | Server crashed mid-connection or load balancer killed the conn |
| “Name or service not known” | DNS lookup failed |
| “SSL_ERROR” / “certificate has expired” | Cert issue (expired, wrong hostname, wrong CA) |
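
The table can be turned into a small first-pass triage helper. This is a minimal sketch: `likely_cause` is a hypothetical function name, and it assumes the client's error text contains the phrases from the table, which varies between tools and versions.

```shell
# likely_cause "ERROR TEXT": print the usual cause for a client-side error,
# following the symptom table. Matching is by substring only.
likely_cause() {
    case "$1" in
        *"Connection refused"*)        echo "service down or not listening on that port" ;;
        *"timed out"*)                 echo "firewall silently dropping packets" ;;
        *"No route to host"*)          echo "routing problem" ;;
        *"Connection reset by peer"*)  echo "server crashed mid-connection or load balancer closed it" ;;
        *"Name or service not known"*) echo "DNS lookup failed" ;;
        *"certificate has expired"*|*SSL_ERROR*)
                                       echo "certificate issue" ;;
        *)                             echo "unknown: capture traffic and check logs" ;;
    esac
}

# Usage (hypothetical host "target"): feed it the stderr of a failing command.
#   likely_cause "$(curl -sS -o /dev/null https://target 2>&1)"
```

The output is only a starting hypothesis; confirm it with the tools in the starter pack before acting on it.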
“It works for some users” — the partial-failure family
The hardest bugs. Common causes:
- DNS propagation — you changed an A record and only some resolvers see the new value yet.
- Stale CDN cache — one PoP serving old content, others serving new.
- Per-region failure — only one AZ is broken; users routed to others are fine.
- MTU mismatch — small requests work, large ones fail. Pings work but file transfers hang.
- Asymmetric routing — packets go out one path, return another, stateful firewall drops them.
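
Two of these causes are quick to test from a shell. The sketch below assumes Linux iputils `ping` (the `-M do` flag sets Don't Fragment; other platforms use different flags) and `dig`. The function names are hypothetical, and 1.1.1.1 and 9.9.9.9 are just two more public resolvers to compare against.

```shell
# icmp_payload MTU: largest ICMP echo payload that fits in one IPv4 packet
# of the given MTU (20-byte IPv4 header + 8-byte ICMP header).
icmp_payload() {
    echo $(( $1 - 28 ))
}

# probe_mtu HOST MTU: ping with Don't Fragment set and a full-size payload.
# If this fails while a default-size ping succeeds, suspect an MTU mismatch.
probe_mtu() {
    ping -c 3 -M do -s "$(icmp_payload "$2")" "$1"
}

# compare_resolvers NAME: show the A records each public resolver returns.
# Different answers suggest a DNS change that has not reached every cache.
compare_resolvers() {
    for resolver in 8.8.8.8 1.1.1.1 9.9.9.9; do
        printf '%s: ' "$resolver"
        dig +short "@$resolver" "$1" A | sort | tr '\n' ' '
        echo
    done
}

# Usage (hypothetical host "target"):
#   probe_mtu target 1500
#   compare_resolvers target
```

Default pings carry a small payload, which is why they keep working on a path that drops full-size packets.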
The “it suddenly stopped working” rule
If a system worked yesterday and doesn’t today, something changed. The change is almost never random hardware failure — it’s a deploy, a config push, a certificate expiry, a DNS change, a quota hit, or someone else’s outage upstream. Find the change first; debugging from scratch wastes hours.
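
Certificate expiry is one change on that list that happens without anyone touching the system, so it is worth ruling out early. A minimal sketch, assuming GNU `date` (for `-d`) and the `openssl` CLI; `days_left` and `cert_not_after` are hypothetical helper names.

```shell
# days_left "NOT_AFTER" NOW_EPOCH: whole days from NOW_EPOCH until the
# certificate's notAfter timestamp. A negative value means already expired.
days_left() {
    expiry_epoch=$(date -u -d "$1" +%s) || return 1
    echo $(( (expiry_epoch - $2) / 86400 ))
}

# cert_not_after HOST: print the notAfter date of the certificate that
# HOST serves on port 443, e.g. "Jun  1 12:00:00 2025 GMT".
cert_not_after() {
    openssl s_client -connect "$1:443" -servername "$1" </dev/null 2>/dev/null |
        openssl x509 -noout -enddate | cut -d= -f2
}

# Usage (hypothetical host "target"):
#   days_left "$(cert_not_after target)" "$(date +%s)"
```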
A debugging starter pack
# Reachability
ping -c 4 target
mtr target # combined ping + traceroute, in real time
# DNS
dig +trace target
dig @8.8.8.8 target # is it just my resolver?
# TCP
nc -vz target 443
ss -tnp | grep target # local connection state
# TLS
openssl s_client -connect target:443 -servername target
# HTTP
curl -v -o /dev/null https://target
curl -w "@curl-format.txt" ... # detailed timing breakdown
# Capture
sudo tcpdump -i any -nn -w trace.pcap host target
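
The `curl -w "@curl-format.txt"` line needs a format file, which the starter pack does not define. One possible version is sketched below: the `%{...}` variable names are curl's own, while the labels and layout are arbitrary choices for this example.

```shell
# Write a format file for curl's -w/--write-out option.
# Each timer is cumulative, measured in seconds from the start of the request.
cat > curl-format.txt <<'EOF'
   dns lookup: %{time_namelookup}s\n
  tcp connect: %{time_connect}s\n
tls handshake: %{time_appconnect}s\n
   first byte: %{time_starttransfer}s\n
        total: %{time_total}s\n
EOF

# Usage (hypothetical host "target"):
#   curl -s -o /dev/null -w "@curl-format.txt" https://target
```

Because the timers are cumulative, the time spent in a phase is the difference between its line and the one above it.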
What to learn next
For a structured approach to network debugging, see debugging by OSI layer. For specific tools, deep-dive into ping and traceroute, packet capture, and finally network performance tuning.