How To Wiki

How to Find the Root Cause of a Network Outage




Here are some quick steps you can take to locate a network fault.

This example uses a simple network configuration. Headquarters has a server farm and also hosts the network management system (NMS). There are two remote sites. Each site has its own users and is connected through an MPLS virtual private network.

In this situation, let’s say that a syslog message comes in from your main router at headquarters, reporting a connectivity issue. It tells you that the connection to “Site A” is down. Your NMS, however, is showing green for that connection.

What should you do?

1. Think critically. Where is this data being pulled from? It is possible that your remote site is actually fine. Should you trust a syslog message, or should you rely on the polling data collected by your NMS? Using your NMS, poll “Site A” to see whether the router or switch is down.

2. In this instance, both the router and the switch are down.

3. You must now determine whether you are dealing with a WAN outage. To do that, check from the router at headquarters whether it can reach the “Site A” router. This is known as checking an adjacent interface.

4. Once you have gotten this far, start thinking about recent changes on your network. If the issue is not a WAN problem, you can often trace it back to a recent change, such as a misconfiguration on a router, switch or other network device. If you have taken proper backup steps, it shouldn’t be a problem to revert those changes.
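One way to look for a recent change is to compare a device's backed-up configuration with the one it is running now. The following is only a minimal sketch using Python's standard difflib module. The function name config_changes and the sample configuration lines are hypothetical and do not come from any particular vendor or NMS.

```python
import difflib
from typing import List


def config_changes(backup: str, running: str) -> List[str]:
    """Return only the lines that were removed (-) or added (+) since the backup."""
    diff = list(
        difflib.unified_diff(
            backup.splitlines(),
            running.splitlines(),
            fromfile="backup",
            tofile="running",
            lineterm="",
        )
    )
    # The first two lines of a non-empty unified diff are the file headers; skip them.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# Hypothetical configuration text, for illustration only.
backup_config = "hostname site-a-router\ninterface wan0\nmtu 1500\n"
running_config = "hostname site-a-router\ninterface wan0\nmtu 9000\n"

for change in config_changes(backup_config, running_config):
    print(change)
```

If the two configurations are identical, the function returns an empty list, which suggests looking somewhere other than that device's configuration.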

These few critical steps help you track down a network fault. If the adjacent interface check points to the MPLS connection, the problem is on the WAN side of your node. Otherwise, start checking for recent configuration changes on your network right away.
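The decision flow in the steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a replacement for your NMS: the device names, the addresses (taken from the 192.0.2.0/24 documentation range), the function names and the result labels are all hypothetical, and a plain TCP connection attempt stands in for whatever polling method your NMS actually uses.

```python
import socket
from typing import Callable, Dict

# Hypothetical device names and addresses, for illustration only.
SITE_A: Dict[str, str] = {
    "site-a-router": "192.0.2.1",
    "site-a-switch": "192.0.2.2",
}


def tcp_probe(address: str, port: int = 22, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection succeeds (a stand-in for an NMS poll)."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


def triage(
    devices: Dict[str, str],
    hq_reaches_site_router: bool,
    probe: Callable[[str], bool] = tcp_probe,
) -> str:
    """Classify the fault from poll results and the adjacent interface check."""
    reachable = [probe(address) for address in devices.values()]
    if all(reachable):
        # Every device answered the poll, so the syslog alarm is not confirmed.
        return "site-up"
    if not any(reachable) and not hq_reaches_site_router:
        # Nothing answers and headquarters cannot reach the site router.
        return "wan-outage"
    # The WAN carries traffic, so look at recent configuration changes.
    return "check-recent-changes"
```

Passing the probe as a parameter lets you try the logic without touching a real network, for example triage(SITE_A, False, probe=lambda address: False) returns "wan-outage".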

Learn more about how to use a network performance monitor to detect faults. Having your tools properly configured is a big step toward being prepared for a problem.

