Networking Issues: Diagnose Them Fast

Most production network problems fall into a handful of repeating patterns. Recognising the pattern from a few symptoms is the difference between a 10-minute fix and a 2-hour war room.

“It’s slow” — the latency family

Slowness has many causes. The right question is “slow where?”:

High RTT on ping: the underlying network path is long or congested. Run traceroute (or mtr) and look for the hop where latency jumps and stays high on every later hop.
Fast ping but slow page load: the network path is fine, so suspect the server, DNS resolution, or a fresh TLS handshake on every request because connections are not being reused.
Fast first byte, slow download: bandwidth is limited, the TCP window is too small for the path's bandwidth-delay product, or packet loss keeps shrinking the congestion window.
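One way to answer "slow where?" with numbers is to split a single request into phases. The sketch below assumes you already have curl's cumulative timers (time_namelookup, time_connect, time_appconnect, time_starttransfer, time_total) and reports the largest phase. The function name slowest_phase and the phase labels are illustrative, not part of any tool.

```shell
# Print the slowest phase of a request and its duration in seconds.
# Args (curl's cumulative timers, in seconds):
#   $1 time_namelookup  $2 time_connect  $3 time_appconnect
#   $4 time_starttransfer  $5 time_total
slowest_phase() {
  awk -v dns="$1" -v conn="$2" -v tls="$3" -v ttfb="$4" -v total="$5" 'BEGIN {
    # Force numeric interpretation of the arguments.
    dns += 0; conn += 0; tls += 0; ttfb += 0; total += 0
    # curl timers are cumulative, so subtract the previous timer to get each phase.
    d["dns"]      = dns
    d["tcp"]      = conn - dns
    d["tls"]      = (tls > 0) ? tls - conn : 0      # time_appconnect is 0 for plain HTTP
    d["server"]   = ttfb - ((tls > 0) ? tls : conn) # request sent until first byte back
    d["download"] = total - ttfb
    best = "dns"
    for (k in d) if (d[k] > d[best]) best = k
    printf "%s %.3f\n", best, d[best]
  }'
}
```

A possible invocation, with target standing in for your host: slowest_phase $(curl -s -o /dev/null -w '%{time_namelookup} %{time_connect} %{time_appconnect} %{time_starttransfer} %{time_total}' https://target). A result of "dns" or "tls" points at the second symptom above, "download" at the third. Treat a single sample as a hint and repeat the request a few times before drawing conclusions.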

“It’s broken” — the connectivity family

The classic symptoms and their usual cause:

Symptom	Likely cause
“Connection refused”	Service is down or not listening on that port
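“Connection timed out”	Firewall silently dropping packets, or the host is down or unreachable
“No route to host”	No usable route to the destination, or the host did not answer ARP on the local subnet
“Connection reset by peer”	Server crashed mid-connection, or a load balancer or firewall killed the connection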
“Name or service not known”	DNS lookup failed
“SSL_ERROR” / “certificate has expired”	Cert issue (expired, wrong hostname, wrong CA)
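If you triage these errors often, the table can be encoded as a small lookup. This is only a sketch: likely_cause is a hypothetical helper, the matched substrings are the messages from the table above, and the exact wording printed by your client may differ.

```shell
# Map an error message to the usual first suspect from the table above.
# $1: error text as printed by a client such as curl, nc, or ssh.
likely_cause() {
  case "$1" in
    *"Connection refused"*)        echo "service down or not listening on that port" ;;
    *"Connection timed out"*)      echo "packets silently dropped, or host unreachable" ;;
    *"No route to host"*)          echo "routing or local-subnet reachability problem" ;;
    *"Connection reset by peer"*)  echo "server crash or middlebox closed the connection" ;;
    *"Name or service not known"*) echo "DNS lookup failed" ;;
    *"certificate has expired"*|*"SSL_ERROR"*) echo "certificate problem" ;;
    *)                             echo "unrecognised; inspect with the tools below" ;;
  esac
}
```

For example, likely_cause "$(nc -vz target 443 2>&1)" classifies whatever nc reported for that port. The answer is a starting hypothesis to confirm, not a diagnosis.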

“It works for some users” — the partial-failure family

The hardest bugs. Common causes:

DNS propagation: you changed an A record, and some resolvers still serve the old cached value until its TTL expires.
Stale CDN cache: one PoP is serving old content while others serve new.
Per-zone failure: a single AZ or region is broken; users routed to the others are fine.
MTU mismatch: small requests work, large ones fail. Pings work but file transfers hang, typically because full-size packets are dropped somewhere on the path.
Asymmetric routing: packets go out one path and return by another, and a stateful firewall that saw only one direction drops them.
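The MTU case is one of the few in this list you can test directly: send pings that may not be fragmented and see which sizes get through. The sketch below assumes Linux iputils ping (-M do sets the Don't Fragment bit, -s sets the ICMP payload size) and IPv4. The helper names icmp_payload_for_mtu and probe_mtu are made up for illustration.

```shell
# ICMP payload size that exactly fills a given IPv4 MTU:
# MTU minus the 20-byte IPv4 header minus the 8-byte ICMP header.
icmp_payload_for_mtu() {
  echo $(( $1 - 28 ))
}

# Send one unfragmentable ping sized for the given MTU.
# $1: host, $2: MTU in bytes. Exit status 0 means a reply came back.
probe_mtu() {
  ping -c 1 -W 2 -M do -s "$(icmp_payload_for_mtu "$2")" "$1" >/dev/null 2>&1
}

# Example loop (target is a placeholder host):
#   for mtu in 1500 1492 1400 1280; do
#     probe_mtu target "$mtu" && echo "$mtu ok" || echo "$mtu blocked"
#   done
```

If 1500 is blocked but a smaller size passes, something on the path has a lower MTU than your interface. This only works when the target answers ICMP echo at all, so check a plain ping first.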

The “it suddenly stopped working” rule

If a system worked yesterday and doesn’t today, something changed. The change is almost never random hardware failure — it’s a deploy, a config push, a certificate expiry, a DNS change, a quota hit, or someone else’s outage upstream. Find the change first; debugging from scratch wastes hours.
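Certificate expiry is the one item on that list you can rule out in seconds. As a rough sketch, assuming GNU date and the notAfter string printed by openssl x509 -noout -enddate, the hypothetical helper days_until_expiry below turns that date into a day count.

```shell
# Whole days from a reference time until a certificate's notAfter date.
# $1: notAfter value, e.g. "Jan 31 00:00:00 2025 GMT"
# $2: reference time (optional, defaults to now); any format GNU date -d accepts
# A negative result means the certificate has already expired.
days_until_expiry() {
  end=$(date -u -d "$1" +%s) || return 1
  ref=$(date -u -d "${2:-now}" +%s) || return 1
  echo $(( (end - ref) / 86400 ))
}

# Example (target is a placeholder host):
#   enddate=$(openssl s_client -connect target:443 -servername target </dev/null 2>/dev/null \
#             | openssl x509 -noout -enddate | cut -d= -f2)
#   days_until_expiry "$enddate"
```

A result of zero or below settles the question. For the other kinds of change (deploys, config pushes, DNS edits), the equivalent check is whatever change log or audit trail your own tooling keeps.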

A debugging starter pack

# Reachability
ping -c 4 target
mtr target                          # combined ping + traceroute, in real time

# DNS
dig +trace target
dig @8.8.8.8 target                 # is it just my resolver?

# TCP
nc -vz target 443
ss -tnp | grep TARGET_IP            # local connection state; -n prints numeric addresses, so match the target's IP, not its name

# TLS
openssl s_client -connect target:443 -servername target

# HTTP
curl -v -o /dev/null https://target
curl -w "@curl-format.txt" ...      # detailed timing breakdown

# Capture
sudo tcpdump -i any -nn -w trace.pcap host target
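The curl-format.txt file referenced in the HTTP step is something you create yourself; curl does not ship one. A minimal version, using curl's documented --write-out timing variables, could be generated like this (the layout and labels are just one possible choice):

```shell
# Create a write-out template for: curl -w "@curl-format.txt" ...
# The heredoc is quoted so the shell leaves %{...} and \n untouched;
# curl itself expands the variables and the \n escapes.
cat > curl-format.txt <<'EOF'
   time_namelookup: %{time_namelookup}s\n
      time_connect: %{time_connect}s\n
   time_appconnect: %{time_appconnect}s\n
  time_pretransfer: %{time_pretransfer}s\n
time_starttransfer: %{time_starttransfer}s\n
        time_total: %{time_total}s\n
EOF
```

Each value is cumulative from the start of the request, so a large gap between two consecutive lines marks the slow phase: DNS, TCP connect, TLS, waiting for the first byte, or the download itself.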

What to learn next

For a structured approach to network debugging, see debugging by OSI layer. For specific tools, deep-dive into ping and traceroute, packet capture, and finally network performance tuning.
