Debugging by OSI Layer: A Step-by-Step Network Troubleshooting Method
When something is broken on the network, the worst thing you can do is poke at it randomly. The fastest path to a fix is usually to work the OSI model from bottom to top: confirm each layer is healthy before assuming the problem is higher up.
The seven-layer checklist
| Layer | What to check | Tools |
|---|---|---|
| 1. Physical | Cable plugged in, link light on, NIC enabled | Eyeball, ip link, ethtool |
| 2. Data Link | MAC visible on switch, no port flapping, correct VLAN | ip neigh, switch logs, arp -a |
| 3. Network | Has IP, gateway reachable, route exists | ip addr, ip route, ping, traceroute |
| 4. Transport | Port open, no firewall drop, TCP handshake completes | ss -tn, nmap, nc -vz |
| 5–6. Session/Presentation | TLS handshake completes, certificate valid, ALPN matches | openssl s_client, browser devtools |
| 7. Application | HTTP status, payload, app-level auth | curl -v, app logs |
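For the Layer 1 row, Linux also exposes link state through sysfs, which is handy in scripts. The helper below is a minimal sketch rather than part of the checklist: the name link_state is made up for illustration, eth0 is a placeholder interface, and it assumes a Linux host where /sys/class/net/<interface>/operstate exists.

```shell
#!/usr/bin/env bash
# link_state IFACE [SYSFS_ROOT]
# Print the kernel's operational state for IFACE (for example "up" or "down").
# Hypothetical helper; assumes Linux sysfs. SYSFS_ROOT can be overridden for testing.
link_state() {
  local iface=$1 root=${2:-/sys/class/net}
  if [ ! -r "$root/$iface/operstate" ]; then
    echo "missing"   # no such interface, or state not readable
    return 2
  fi
  cat "$root/$iface/operstate"
}

# Example: link_state eth0   (eth0 is a placeholder interface name)
```

A result of down on a port that should be live points at the cable, the switch port, or a disabled NIC, which ip link and ethtool can then confirm. Loopback and some virtual interfaces report unknown, which is normal.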
The standard sequence
Faced with “I can’t reach the server,” first confirm the link is up (Layers 1 and 2 in the table above), then walk this script:
# Layer 3: do I have a working IP?
ip addr show
ip route get 8.8.8.8
# Layer 3: can I reach the gateway? (uses the first default route only)
ping -c 3 "$(ip route show default | awk '/^default via/ {print $3; exit}')"
# Layer 3: can I reach the internet?
ping -c 3 1.1.1.1
# Is DNS working?
dig +short example.com
# Layer 4: is the port open from here?
nc -vz example.com 443
# Layer 7: does HTTP respond?
curl -v https://example.com/health
The first command that fails tells you which layer to dig into. Skipping layers is how 30-minute outages become 4-hour ones.
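The "first failure wins" rule can be automated. The following is a minimal sketch, assuming bash: run_checks and its label|command argument format are invented here for illustration, and the commented example reuses the commands from the script above with a few extra flags so that each one produces a clean pass or fail.

```shell
#!/usr/bin/env bash
# run_checks "LABEL|COMMAND" ...
# Run each command in order and stop at the first one that fails.
# Prints "FAIL: LABEL" and returns 1 on the first failure;
# prints "all checks passed" and returns 0 otherwise.
run_checks() {
  local entry label cmd
  for entry in "$@"; do
    label=${entry%%|*}   # text before the first "|"
    cmd=${entry#*|}      # everything after the first "|" (may contain pipes)
    if ! bash -c "$cmd" >/dev/null 2>&1; then
      echo "FAIL: $label"
      return 1
    fi
  done
  echo "all checks passed"
}

# Example (commented out so that sourcing this file has no side effects):
# run_checks \
#   "L3 route|ip route get 8.8.8.8" \
#   "L3 internet|ping -c 3 1.1.1.1" \
#   "DNS|dig +short example.com | grep -q ." \
#   "L4 port|nc -vz -w 3 example.com 443" \
#   "L7 http|curl -fsS https://example.com/health"
```

The DNS check pipes through grep because dig +short can exit successfully with an empty answer, and curl gets -f so that an HTTP error status counts as a failure.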
The “is it me, the network, or the server?” triangle
When debugging a remote service, ask three questions in order:
- Can other people reach it? If yes, the problem is local to you. If no, it’s wider.
- Can you reach other things? If only this one service fails, it’s the service. If everything fails, it’s your network.
- Did anything change recently? Deploy, DNS update, certificate renewal, firewall rule. The answer is almost always “yes.”
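The second question can be answered mechanically by probing the failing target alongside a known-good control. This is a rough sketch under stated assumptions: compare_reach is a hypothetical helper, the probe commands are passed in as strings, and the hosts in the commented example are placeholders. It cannot answer the first or third question, which need another person or a change log.

```shell
#!/usr/bin/env bash
# compare_reach TARGET_CMD CONTROL_CMD
# TARGET_CMD probes the failing service; CONTROL_CMD probes something
# unrelated that is normally reachable. Prints a one-line verdict.
compare_reach() {
  local target_cmd=$1 control_cmd=$2
  if bash -c "$target_cmd" >/dev/null 2>&1; then
    echo "target reachable"
  elif bash -c "$control_cmd" >/dev/null 2>&1; then
    echo "only the target fails: suspect the service"
  else
    echo "control fails too: suspect your own network"
  fi
}

# Example (placeholder hosts; -W 2 is the Linux ping reply timeout in seconds):
# compare_reach "nc -z -w 3 example.com 443" "ping -c 1 -W 2 1.1.1.1"
```

Treat the verdict as a hint, not a diagnosis: a control that happens to share the broken path will make a network fault look like a service fault.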
Capture when in doubt
If you’ve ruled out the obvious and still don’t know what’s wrong, take a packet capture. Run tcpdump -i any -w trace.pcap host example.com (usually as root) for two minutes while you reproduce the problem, then read the capture back to see whether packets are leaving, returning, being reset, or vanishing.
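Counting TCP flags in the capture gives a quick first read of those outcomes. The sketch below assumes the default one-line-per-packet text output of tcpdump, where each TCP packet carries a field such as Flags [S] or Flags [R.]; summarize_flags is a hypothetical helper name, and trace.pcap is the file written by the capture command.

```shell
#!/usr/bin/env bash
# summarize_flags
# Read tcpdump text output on stdin and count connection attempts (SYN),
# replies to them (SYN-ACK), and resets (RST).
summarize_flags() {
  awk '
    /Flags \[S\]/   { syn++ }      # SYN without ACK: a connection attempt
    /Flags \[S\.\]/ { synack++ }   # SYN-ACK: the peer answered
    /Flags \[R/     { rst++ }      # RST or RST-ACK: the connection was reset
    END { printf "syn=%d synack=%d rst=%d\n", syn, synack, rst }
  '
}

# Example: tcpdump -nn -r trace.pcap | summarize_flags
```

Several SYNs with no SYN-ACK suggest packets are leaving but nothing is coming back, which usually means a silent drop somewhere on the path. A nonzero RST count means something is actively refusing or tearing down the connection. All zeros means no TCP traffic to that host was captured at all.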
What to learn next
This whole approach assumes you know what each layer does. Revisit OSI vs TCP/IP, then read about specific tools and topics: ping and traceroute, packet capture, and common networking issues.