This topic explains how to determine the symptoms and causes of network problems using a layered model.
Note: This topic is part of Module 12 of the Cisco CCNA 3 course. To follow the course in order, see the CCNA 3 section.
Physical Layer Troubleshooting
Now that you have your documentation, some knowledge of troubleshooting methods and the software and hardware tools to use to diagnose problems, you are ready to start troubleshooting! This topic covers the most common issues that you will find when troubleshooting a network.
Issues on a network often present as performance problems. Performance problems mean that there is a difference between the expected behavior and the observed behavior, and the system is not functioning as could be reasonably expected. Failures and suboptimal conditions at the physical layer not only inconvenience users but can impact the productivity of the entire company. Networks that experience these kinds of conditions usually shut down. Because the upper layers of the OSI model depend on the physical layer to function, a network administrator must have the ability to effectively isolate and correct problems at this layer.
The figure summarizes the symptoms and causes of physical layer network problems.
The table lists common symptoms of physical layer network problems.
Performance lower than baseline
Detecting this symptom requires previous baselines for comparison.
The most common reasons for slow or poor performance include overloaded or underpowered servers, unsuitable switch or router configurations, traffic congestion on a low-capacity link, and chronic frame loss.
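As a minimal sketch of the baseline comparison described above, the following Python function flags metrics that have dropped below a recorded baseline. The metric names, the sample values, and the 10 percent tolerance are illustrative assumptions, not values from the course material.

```python
def below_baseline(baseline, current, tolerance=0.10):
    """Return metrics whose current value is more than `tolerance` below baseline.

    Both arguments map a metric name to a "higher is better" number,
    such as throughput in Mbps. All names and values are hypothetical.
    """
    degraded = {}
    for name, expected in baseline.items():
        observed = current.get(name)
        if observed is None:
            continue  # no current sample, so nothing to compare
        if observed < expected * (1 - tolerance):
            degraded[name] = (expected, observed)
    return degraded


# Hypothetical measurements: the WAN link has dropped well below its baseline.
baseline = {"wan_throughput_mbps": 90.0, "lan_throughput_mbps": 940.0}
current = {"wan_throughput_mbps": 55.0, "lan_throughput_mbps": 925.0}
print(below_baseline(baseline, current))
```

Without a stored baseline there is nothing to pass as the first argument, which is why the symptom cannot be confirmed without earlier measurements.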
Loss of connectivity
Loss of connectivity could be due to a failed or disconnected cable.
Connectivity can be verified using a simple ping test.
Intermittent connectivity loss can indicate a loose or oxidized connection.
Network bottlenecks or congestion
If a router, interface, or cable fails, routing protocols may redirect traffic to other routes that are not designed to carry the extra traffic.
This can result in congestion or bottlenecks in parts of the network.
High CPU utilization rates
High CPU utilization rates are a symptom that a device, such as a router, switch, or server, is operating at or exceeding its design limits.
If not addressed quickly, CPU overloading can cause a device to shut down or fail.
Console error messages
Error messages reported on the device console could indicate a physical layer problem.
Console messages should be logged to a central syslog server.
The next table lists issues that commonly cause network problems at the physical layer.
Power-related
Power-related problems are the most fundamental reason for network failure.
Check the operation of the fans and ensure that the chassis intake and exhaust vents are clear.
If other nearby units have also powered down, suspect a power failure at the main power supply.
Hardware faults
Faulty network interface cards (NICs) can be the cause of network transmission errors due to late collisions, short frames, and jabber.
Jabber is often defined as the condition in which a network device continually transmits random, meaningless data onto the network.
Other likely causes of jabber are faulty or corrupt NIC driver files, bad cabling, or grounding problems.
Cabling faults
Many problems can be corrected by simply reseating cables that have become partially disconnected.
When performing a physical inspection, look for damaged cables, improper cable types, and poorly crimped RJ-45 connectors.
Suspect cables should be tested or exchanged with a known functioning cable.
Attenuation
Attenuation can be caused if a cable length exceeds the design limit for the media, or when there is a poor connection resulting from a loose cable, or dirty or oxidized contacts.
If attenuation is severe, the receiving device cannot always successfully distinguish one bit in the data stream from another bit.
Noise
Local electromagnetic interference (EMI) is commonly known as noise.
Noise can be generated by many sources, such as FM radio stations, police radio, building security systems, and avionics for automated landing. Other sources include crosstalk (noise induced by other cables in the same pathway or by adjacent cables), nearby electric cables, devices with large electric motors, and anything that includes a transmitter more powerful than a cell phone.
Interface configuration errors
Many things can be misconfigured on an interface to cause it to go down, such as an incorrect clock rate, an incorrect clock source, or an interface that has not been enabled.
This causes a loss of connectivity with attached network segments.
Exceeding design limits
A component may be operating sub-optimally at the physical layer because it is being utilized beyond specifications or configured capacity.
When troubleshooting this type of problem, it becomes evident that resources for the device are operating at or near the maximum capacity and there is an increase in the number of interface errors.
Symptoms include processes with high CPU utilization percentages, input queue drops, slow performance, SNMP timeouts, no remote access, and services such as DHCP, Telnet, and ping that are slow or fail to respond.
On a switch, the following could occur: spanning tree reconvergence, bouncing EtherChannel links, UDLD flapping, and IP SLA failures.
For routers, there could be no routing updates, route flapping, or HSRP flapping.
One of the causes of CPU overload in a router or switch is high traffic.
If one or more interfaces are regularly overloaded with traffic, consider redesigning the traffic flow in the network or upgrading the hardware.
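One way to make "regularly overloaded" measurable is to compare sampled traffic rates against the interface bandwidth. The sketch below is only an illustration: the function name, the 80 percent utilization threshold, and the rule that half of the samples must exceed it are assumptions, not Cisco recommendations.

```python
def regularly_overloaded(samples_bps, bandwidth_bps, threshold=0.8, min_fraction=0.5):
    """Return True if enough samples exceed the utilization threshold.

    samples_bps   -- observed traffic rates in bits per second
    bandwidth_bps -- interface bandwidth in bits per second
    threshold     -- utilization (0 to 1) treated as "overloaded" (assumed value)
    min_fraction  -- share of samples that must be overloaded (assumed value)
    """
    if not samples_bps:
        return False
    busy = sum(1 for rate in samples_bps if rate / bandwidth_bps >= threshold)
    return busy / len(samples_bps) >= min_fraction


# Hypothetical samples from a 100 Mbps interface.
print(regularly_overloaded([95e6, 90e6, 40e6, 85e6], 100e6))
print(regularly_overloaded([10e6, 90e6, 20e6, 30e6], 100e6))
```

An interface that returns True over several measurement periods is a candidate for the traffic redesign or hardware upgrade mentioned above.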
Data Link Layer Troubleshooting
Troubleshooting Layer 2 problems can be a challenging process. The configuration and operation of these protocols are critical to creating a functional, well-tuned network. Layer 2 problems cause specific symptoms that, when recognized, will help identify the problem quickly.
The figure summarizes the symptoms and causes of data link layer network problems.
The table lists common symptoms of data link layer network problems.
No functionality or connectivity at the network layer or above
Some Layer 2 problems can stop the exchange of frames across a link, while others only cause network performance to degrade.
Network is operating below baseline performance levels
There are two distinct types of suboptimal Layer 2 operation that can occur in a network.
First, the frames take a suboptimal path to their destination but do arrive, causing unexpectedly high bandwidth usage on some links.
Second, some frames are dropped as identified through error counter statistics and console error messages that appear on the switch or router.
An extended or continuous ping can help reveal if frames are being dropped.
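The results of an extended ping can be summarized with a few lines of Python. This sketch assumes the output has already been reduced to one character per probe, using the Cisco IOS convention of "!" for a reply and "." for a timeout. The function name and the sample string are hypothetical.

```python
def summarize_ping(replies):
    """Summarize an extended ping given one boolean per probe (True = reply)."""
    sent = len(replies)
    lost = replies.count(False)
    longest_gap = gap = 0
    for ok in replies:
        gap = 0 if ok else gap + 1          # length of the current run of losses
        longest_gap = max(longest_gap, gap)
    loss_pct = 100.0 * lost / sent if sent else 0.0
    return {"sent": sent, "lost": lost, "loss_pct": loss_pct, "longest_gap": longest_gap}


# Hypothetical output of a 20-probe ping: "!" is a reply, "." is a timeout.
results = [ch == "!" for ch in "!!!!.!!!..!!!!!!!!!!"]
print(summarize_ping(results))
```

Scattered single losses and long consecutive gaps point to different causes, so the sketch reports both the overall loss percentage and the longest run of lost probes.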
Excessive broadcasts
Operating systems use broadcasts and multicasts extensively to discover network services and other hosts.
Generally, excessive broadcasts are the result of poorly programmed or configured applications, a large Layer 2 broadcast domain, or an underlying network problem (for example, STP loops or route flapping).
Console messages
A router that recognizes a Layer 2 problem sends alert messages to the console.
Typically, a router does this when it detects a problem with interpreting incoming frames (encapsulation or framing problems) or when keepalives are expected but do not arrive.
The most common console message that indicates a Layer 2 problem is a line protocol down message.
The table lists issues that commonly cause network problems at the data link layer.
Encapsulation errors
An encapsulation error occurs because the bits placed in a field by the sender are not what the receiver expects to see.
This condition occurs when the encapsulation at one end of a WAN link is configured differently from the encapsulation used at the other end.
Address mapping errors
In topologies, such as point-to-multipoint or broadcast Ethernet, it is essential that an appropriate Layer 2 destination address be given to the frame. This ensures its arrival at the correct destination.
To achieve this, the network device must match a destination Layer 3 address with the correct Layer 2 address using either static or dynamic maps.
In a dynamic environment, the mapping of Layer 2 and Layer 3 information can fail because devices may have been specifically configured not to respond to ARP requests, the Layer 2 or Layer 3 information that is cached may have physically changed, or invalid ARP replies are received because of a misconfiguration or a security attack.
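A change in cached Layer 2 information can be spotted by comparing two snapshots of an ARP table. The following sketch assumes each snapshot has already been parsed into a dictionary of IPv4 address to MAC address. The function name and all addresses are made up for illustration.

```python
def changed_mappings(previous, current):
    """Return IPs whose MAC address differs between two ARP snapshots.

    Each snapshot maps an IPv4 address string to a MAC address string.
    MAC addresses are compared case-insensitively.
    """
    return {
        ip: (previous[ip], mac)
        for ip, mac in current.items()
        if ip in previous and previous[ip].lower() != mac.lower()
    }


# Hypothetical snapshots taken a few minutes apart.
previous = {"192.0.2.1": "00:11:22:33:44:55", "192.0.2.20": "00:aa:bb:cc:dd:01"}
current = {
    "192.0.2.1": "00:de:ad:be:ef:00",   # gateway now maps to a different MAC
    "192.0.2.20": "00:AA:BB:CC:DD:01",  # same MAC, different letter case
    "192.0.2.30": "00:aa:bb:cc:dd:02",  # new host, not a change
}
print(changed_mappings(previous, current))
```

A changed mapping is not necessarily an attack, because a replaced NIC produces the same result. An unexpected change for a gateway address is still worth investigating.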
Framing errors
Frames usually work in groups of 8-bit bytes.
A framing error occurs when a frame does not end on an 8-bit byte boundary.
When this happens, the receiver may have problems determining where one frame ends, and another frame starts.
Too many invalid frames may prevent valid keepalives from being exchanged.
Framing errors can be caused by a noisy serial line, an improperly designed cable (too long or not properly shielded), faulty NIC, duplex mismatch, or an incorrectly configured channel service unit (CSU) line clock.
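The byte-boundary rule above reduces to a simple check: a frame length in bits must be a multiple of 8. The sketch below applies that check to a list of frame lengths. The function name and the sample lengths are hypothetical.

```python
def framing_errors(frame_bit_lengths):
    """Return the frame lengths (in bits) that do not end on an 8-bit byte boundary."""
    return [bits for bits in frame_bit_lengths if bits % 8 != 0]


# 512 bits (64 bytes) and 12144 bits (1518 bytes) are aligned; the others are not.
print(framing_errors([512, 515, 12144, 1001]))
```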
STP failures or loops
The purpose of the Spanning Tree Protocol (STP) is to resolve a redundant physical topology into a tree-like topology by blocking redundant ports.
Most STP problems are related to forwarding loops, which occur when no ports in a redundant topology are blocked and traffic is forwarded in circles indefinitely, or to excessive flooding caused by a high rate of STP topology changes.
A topology change should be a rare event in a well-configured network.
When a link between two switches goes up or down, a topology change eventually occurs as the STP state of the port changes to or from forwarding.
However, when a port is flapping (oscillating between up and down states), this causes repetitive topology changes and flooding, or slow STP convergence or re-convergence.
This can be caused by a mismatch between the real and documented topology, a configuration error, such as an inconsistent configuration of STP timers, an overloaded switch CPU during convergence, or a software defect.
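Port flapping can be detected by counting state transitions within a time window. The sketch below assumes link events have already been collected as (timestamp in seconds, state) pairs, for example from syslog. The function names and the limit of four transitions are illustrative assumptions.

```python
def count_transitions(events, window_start, window_end):
    """Count up/down state changes inside a time window.

    events -- list of (timestamp_seconds, state) tuples sorted by time,
              where state is "up" or "down"
    """
    states = [state for ts, state in events if window_start <= ts <= window_end]
    return sum(1 for a, b in zip(states, states[1:]) if a != b)


def is_flapping(events, window_start, window_end, max_transitions=4):
    """Treat a port as flapping if it changes state more than max_transitions times."""
    return count_transitions(events, window_start, window_end) > max_transitions


# Hypothetical event logs for two ports over one minute.
flappy = [(0, "up"), (5, "down"), (7, "up"), (12, "down"), (15, "up"), (20, "down"), (22, "up")]
stable = [(0, "up"), (30, "down"), (31, "up")]
print(is_flapping(flappy, 0, 60), is_flapping(stable, 0, 60))
```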
Network Layer Troubleshooting
Network layer problems include any problem that involves a Layer 3 protocol, such as IPv4, IPv6, EIGRP, or OSPF. The figure summarizes the symptoms and causes of network layer network problems.
The table lists common symptoms of network layer network problems.
Network failure
Network failure is when the network is nearly or completely non-functional, affecting all users and applications on the network.
These failures are usually noticed quickly by users and network administrators and are obviously critical to the productivity of a company.
Suboptimal performance
Network optimization problems usually involve a subset of users, applications, destinations, or a type of traffic.
Optimization issues can be difficult to detect and even harder to isolate and diagnose.
This is because they usually involve multiple layers, or even a single host computer.
Determining that the problem is a network layer problem can take time.
In most networks, static routes are used in combination with dynamic routing protocols. Improper configuration of static routes can lead to less than optimal routing. In some cases, improperly configured static routes can create routing loops which make parts of the network unreachable.
Troubleshooting dynamic routing protocols requires a thorough understanding of how the specific routing protocol functions. Some problems are common to all routing protocols, while other problems are particular to the individual routing protocol.
There is no single template for solving Layer 3 problems. Routing problems are solved with a methodical process, using a series of commands to isolate and diagnose the problem.
The table lists areas to explore when diagnosing a possible problem involving routing protocols.
General network issues
Often a change in the topology, such as a link going down, may have effects on other areas of the network that might not be obvious at the time.
This may include the installation of new routes, static or dynamic, or removal of other routes.
Determine whether anything in the network has recently changed, and if there is anyone currently working on the network infrastructure.
Connectivity issues
Check for any equipment and connectivity problems, including power problems such as outages and environmental problems (for example, overheating).
Also check for Layer 1 problems, such as cabling problems, bad ports, and ISP problems.
Routing table
Check the routing table for anything unexpected, such as missing routes or unexpected routes.
Use debug commands to view routing updates and routing table maintenance.
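Checking a routing table for missing or unexpected routes amounts to comparing two sets of prefixes. The sketch below assumes the documented prefixes and those extracted from the routing table are available as lists of strings. The function name and the prefixes are hypothetical.

```python
def compare_routes(expected, actual):
    """Compare documented prefixes against those found in the routing table."""
    expected, actual = set(expected), set(actual)
    return {
        "missing": sorted(expected - actual),      # documented but not installed
        "unexpected": sorted(actual - expected),   # installed but not documented
    }


# Hypothetical documentation versus prefixes parsed from the routing table.
expected = ["10.1.0.0/16", "10.2.0.0/16", "0.0.0.0/0"]
actual = ["10.1.0.0/16", "0.0.0.0/0", "172.16.5.0/24"]
print(compare_routes(expected, actual))
```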
Neighbor issues
If the routing protocol establishes an adjacency with a neighbor, check to see if there are any problems with the routers forming neighbor adjacencies.
Topology database
If the routing protocol uses a topology table or database, check the table for anything unexpected, such as missing entries or unexpected entries.
Transport Layer Troubleshooting – ACLs
Network problems can arise from transport layer problems on the router, particularly at the edge of the network where traffic is examined and modified. For instance, both access control lists (ACLs) and Network Address Translation (NAT) operate at the network layer and may involve operations at the transport layer, as shown in the figure.
The most common issues with ACLs are caused by improper configuration, as shown in the figure.
Problems with ACLs may cause otherwise working systems to fail. The table lists areas where misconfigurations commonly occur.
Selection of traffic flow
Traffic is defined by both the router interface through which the traffic is traveling and the direction in which this traffic is traveling.
For an ACL to function properly, it must be applied to the correct interface and in the correct traffic direction.
Order of access control entries
The entries in an ACL should be from specific to general.
Although an ACL may have an entry to specifically permit a type of traffic flow, packets never match that entry if they are being denied by another entry earlier in the list.
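The effect of entry order can be shown with a simplified model of first-match evaluation. The sketch below matches only on the source address, whereas real ACLs can also match on protocol, destination, and ports. The function name and the addresses are hypothetical. It uses the standard `ipaddress` module.

```python
import ipaddress


def first_match(acl, src_ip):
    """Return (index, action) for the first entry whose network contains src_ip.

    acl -- ordered list of (action, network) tuples, e.g. ("permit", "10.1.1.10/32")
    If nothing matches, the implicit deny at the end of every ACL applies.
    """
    addr = ipaddress.ip_address(src_ip)
    for index, (action, network) in enumerate(acl):
        if addr in ipaddress.ip_network(network):
            return index, action
    return None, "deny"  # implicit deny any


# The same two entries in different orders.
general_first = [("deny", "10.1.0.0/16"), ("permit", "10.1.1.10/32")]
specific_first = [("permit", "10.1.1.10/32"), ("deny", "10.1.0.0/16")]
print(first_match(general_first, "10.1.1.10"))   # the permit entry is never reached
print(first_match(specific_first, "10.1.1.10"))  # the specific permit matches first
```

With the general entry first, the host is denied even though a permit entry exists for it. Ordering entries from specific to general avoids this.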
If the router is running both ACLs and NAT, the order in which each of these technologies is applied to a traffic flow is important.
Inbound traffic is processed by the inbound ACL before being processed by outside-to-inside NAT.
Outbound traffic is processed by the outbound ACL after being processed by inside-to-outside NAT.
Implicit deny any
When high security is not required on the ACL, this implicit access control element can be the cause of an ACL misconfiguration.
Addresses and IPv4 wildcard masks
Complex IPv4 wildcard masks provide significant improvements in efficiency but are more subject to configuration errors.
An example of a complex wildcard mask is using the IPv4 address 10.0.32.0 and wildcard mask 0.0.32.15 to select the first 15 host addresses in either the 10.0.0.0 network or the 10.0.32.0 network.
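Wildcard matching can be checked in code. In a wildcard mask, a 0 bit means the corresponding address bit must match and a 1 bit means it is ignored. The sketch below applies that rule with the standard `ipaddress` module. The function name is hypothetical.

```python
import ipaddress


def wildcard_match(address, base, wildcard):
    """Return True if address matches base under an IPv4 wildcard mask."""
    a = int(ipaddress.IPv4Address(address))
    b = int(ipaddress.IPv4Address(base))
    w = int(ipaddress.IPv4Address(wildcard))
    care = ~w & 0xFFFFFFFF          # bits that must match (wildcard bit is 0)
    return (a & care) == (b & care)


base, mask = "10.0.32.0", "0.0.32.15"
for candidate in ["10.0.0.7", "10.0.32.15", "10.0.32.16", "10.0.16.5"]:
    print(candidate, wildcard_match(candidate, base, mask))
```

The mask 0.0.32.15 has five ignored bits, so it matches 32 addresses in total: 10.0.0.0 through 10.0.0.15 and 10.0.32.0 through 10.0.32.15. Both .0 addresses match as well, which is the kind of detail that is easy to overlook with complex masks.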
Selection of transport layer protocol
When configuring ACLs, it is important that only the correct transport layer protocols be specified.
Many network administrators, when unsure whether a type of traffic flow uses a TCP port or a UDP port, configure both.
Specifying both opens a hole through the firewall, possibly giving intruders an avenue into the network.
It also introduces an extra element into the ACL, so the ACL takes longer to process, introducing more latency into network communications.
Source and destination ports
Properly controlling the traffic between two hosts requires symmetric access control elements for inbound and outbound ACLs.
Address and port information for traffic generated by a replying host is the mirror image of address and port information for traffic generated by the initiating host.
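The mirror-image relationship can be expressed by swapping the source and destination fields of a flow. The `Flow` type and `reply_flow` function below are hypothetical helpers for illustration only.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    protocol: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def reply_flow(flow):
    """Return the flow that the replying host generates: source and destination swapped."""
    return Flow(flow.protocol, flow.dst_ip, flow.dst_port, flow.src_ip, flow.src_port)


# Hypothetical client request to a web server and the matching reply.
request = Flow("tcp", "192.0.2.10", 49152, "198.51.100.5", 443)
print(reply_flow(request))
```

An ACL entry that permits the request but has no counterpart for the reply flow blocks the return traffic, which is why the entries must be symmetric.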
Use of the established keyword
The established keyword increases the security provided by an ACL.
However, if the keyword is applied incorrectly, unexpected results may occur.
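On Cisco IOS, the established keyword matches TCP segments that have the ACK or RST bit set, which is how it identifies traffic belonging to an existing connection. The sketch below models that check on the TCP flags byte. The constant and function names are hypothetical.

```python
# TCP flag bit values from the TCP header
SYN = 0x02
RST = 0x04
ACK = 0x10


def matches_established(tcp_flags):
    """Return True if a segment would match an ACL entry using the established keyword."""
    return bool(tcp_flags & (ACK | RST))


print(matches_established(SYN))        # initial connection request: no match
print(matches_established(SYN | ACK))  # reply to a connection request: match
print(matches_established(RST))        # connection reset: match
```

Because the check looks only at flag bits, it applies to TCP alone, and any segment with ACK set matches whether or not a real connection exists. Applying the keyword to the wrong direction or protocol is one way unexpected results arise.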
Uncommon protocols
Misconfigured ACLs often cause problems for protocols other than TCP and UDP.
Uncommon protocols that are gaining popularity are VPN and encryption protocols.
The log keyword is a useful option for viewing how individual ACL entries operate. It instructs the router to place an entry in the system log whenever that entry's condition is matched. The logged event includes details of the packet that matched the ACL element. The log keyword is especially useful for troubleshooting and also provides information on intrusion attempts being blocked by the ACL.
Transport Layer Troubleshooting – NAT for IPv4
NAT can cause several problems, such as failing to interoperate with services like DHCP and tunneling. Common causes include misconfigured NAT inside or NAT outside designations and misconfigured ACLs. Other issues include interoperability with other network technologies, especially those that contain or derive information from host network addressing in the packet.
The figure summarizes common interoperability areas with NAT.
The table lists common interoperability areas with NAT.
BOOTP and DHCP
Both protocols manage the automatic assignment of IPv4 addresses to clients.
Recall that the first packet that a new client sends is a DHCPDISCOVER broadcast IPv4 packet.
The DHCPDISCOVER packet has a source IPv4 address of 0.0.0.0.
Because NAT requires both a valid destination and source IPv4 address, BOOTP and DHCP can have difficulty operating over a router running either static or dynamic NAT.
Configuring the IPv4 helper feature can help solve this problem.
DNS
Because a router running dynamic NAT is changing the relationship between inside and outside addresses regularly as table entries expire and are recreated, a DNS server outside the NAT router does not have an accurate representation of the network inside the router.
Configuring the IPv4 helper feature can help solve this problem.
SNMP
As with DNS packets, NAT is unable to alter the addressing information stored in the data payload of an SNMP packet.
Because of this, an SNMP management station on one side of a NAT router may not be able to contact SNMP agents on the other side of the NAT router.
Configuring the IPv4 helper feature can help solve this problem.
Tunneling and encryption protocols
Encryption and tunneling protocols often require that traffic be sourced from a specific UDP or TCP port, or use a protocol at the transport layer that cannot be processed by NAT.
For example, IPsec tunneling protocols and generic routing encapsulation protocols used by VPN implementations cannot be processed by NAT.
Application Layer Troubleshooting
Most of the application layer protocols provide user services. Application layer protocols are typically used for network management, file transfer, distributed file services, terminal emulation, and email. New user services are often added, such as VPNs and VoIP.
The figure shows the most widely known and implemented TCP/IP application layer protocols.
The table provides a short description of these application layer protocols.
SSH/Telnet
Enables users to establish terminal session connections with remote hosts.
HTTP
Supports the exchanging of text, graphic images, sound, video, and other multimedia files on the web.
FTP
Performs interactive file transfers between hosts.
TFTP
Performs basic interactive file transfers, typically between hosts and networking devices.
SMTP
Supports basic message delivery services.
POP
Connects to mail servers and downloads email.
SNMP
Collects management information from network devices.
DNS
Maps IP addresses to the names assigned to network devices.
Network File System (NFS)
Enables computers to mount drives on remote hosts and operate them as if they were local drives. Originally developed by Sun Microsystems, it combines with two other application layer protocols, external data representation (XDR) and remote-procedure call (RPC), to allow transparent access to remote network resources.
The types of symptoms and causes depend upon the actual application itself.
Application layer problems prevent services from being provided to application programs. A problem at the application layer can result in unreachable or unusable resources when the physical, data link, network, and transport layers are functional. It is possible to have full network connectivity, but the application simply cannot provide data.
Another type of problem at the application layer occurs when the physical, data link, network, and transport layers are functional, but the data transfer and requests for network services from a single network service or application do not meet the normal expectations of a user.
A problem at the application layer may cause users to complain that the network or an application that they are working with is sluggish or slower than usual when transferring data or requesting network services.
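The idea that lower layers can be healthy while the application still fails can be sketched as a bottom-up decision over three test results. The function name and the three inputs are assumptions for illustration. In practice a failed ping can also mean that ICMP is filtered, so the result is a starting point, not a verdict.

```python
def suspect_layer(ping_ok, tcp_connect_ok, app_reply_ok):
    """Suggest where to look first, given three bottom-up test results.

    ping_ok        -- the server answers ICMP echo requests
    tcp_connect_ok -- a TCP connection to the service port succeeds
    app_reply_ok   -- the application returns a valid response
    """
    if not ping_ok:
        return "layers 1-3"
    if not tcp_connect_ok:
        return "transport layer"
    if not app_reply_ok:
        return "application layer"
    return "none"


# Full connectivity, but the application does not return usable data.
print(suspect_layer(ping_ok=True, tcp_connect_ok=True, app_reply_ok=False))
```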