Troubleshooting Ethernet collisions

Level: - Intermediate
Platform: - Cisco Catalyst switches and routers
Keywords: - meaning of different types of Ethernet collisions
Reference: - cisco.com
Author: - Surender Singh

This document provides an overview of the different counters related to Ethernet collisions and explains how to troubleshoot problems with Ethernet collisions.

Note: The information in this document only applies to half-duplex Ethernet. In full-duplex Ethernet, collision detection is disabled.

What Are Collisions?

A collision is the mechanism used by Ethernet to control access and allocate shared bandwidth among stations that want to transmit at the same time on a shared medium. Because the medium is shared, a mechanism must exist by which two stations can detect that they want to transmit at the same time. This mechanism is collision detection.
Ethernet uses CSMA/CD (Carrier Sense Multiple Access/Collision Detect) as its collision detection method. Here is a simplified example of Ethernet operation:

  1. Station A wishes to send a frame. First, it checks if the medium is available (Carrier Sense). If it isn't, it waits until the current sender on the medium has finished.
  2. Suppose Station A believes the medium is available and attempts to send a frame. Because the medium is shared (Multiple Access), other senders might also attempt to send at the same time. At this point, Station B tries to send a frame at the same time as Station A.
  3. Shortly after, Station A and Station B realize that there is another device attempting to send a frame (Collision Detect). Each station waits for a random amount of time before sending again (using the exponential backoff algorithm). The time after the collision is divided into time slots; Station A and Station B each pick a random slot for attempting a retransmission.
  4. Should Station A and Station B attempt to retransmit in the same slot, they extend the number of slots. Each station then picks a new slot, thereby decreasing the probability of retransmitting in the same slot.

In sum, collisions are a way to distribute the traffic load over time by arbitrating access to the shared medium. Collisions are not bad; they are essential to correct Ethernet operation.

Some useful facts:

  • The maximum number of time slots is limited to 1024.
  • The maximum number of retransmissions for the same frame in the collision mechanism is 16.
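
The backoff steps and the two limits above can be sketched in a few lines of Python. This is a minimal illustration, not Cisco code: it assumes the standard truncated binary exponential backoff, in which the number of slots doubles after each collision until it reaches 1024, and the function names (slot_range, pick_backoff) are hypothetical.

```python
import random

MAX_ATTEMPTS = 16    # the frame is given up after 16 failed attempts
BACKOFF_LIMIT = 10   # the slot range stops doubling at 2**10 = 1024


def slot_range(collision_count):
    """Number of slots a station chooses from after the nth collision."""
    if collision_count < 1:
        raise ValueError("collision_count must be at least 1")
    return 2 ** min(collision_count, BACKOFF_LIMIT)


def pick_backoff(collision_count, rng=random):
    """Return a random slot to wait for, or None if the frame is dropped."""
    if collision_count >= MAX_ATTEMPTS:
        # Too many collisions for this frame: it is not transmitted.
        return None
    # Choose uniformly from slot 0 up to slot_range - 1.
    return rng.randrange(slot_range(collision_count))


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 15):
        print(f"after collision {n}: choose from {slot_range(n)} slots")
```

Because each new collision doubles the range, two stations that keep colliding become less and less likely to pick the same slot again.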

The Deferred Counter

Here's an example of output from the show interface command:

router#show interface ethernet 0
Ethernet0 is up, line protocol is up 
  Hardware is Lance, address is 0010.7b36.1be8 (bia 0010.7b36.1be8)
  Internet address is 10.200.40.74/22
  MTU 1500 bytes, BW 10000 Kbit, DLY 1000 usec, 
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:00, output 00:00:06, output hang never
  Last clearing of "show interface" counters never
  Input queue: 1/75/1/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: random early detection(RED)
  5 minute input rate 1000 bits/sec, 2 packets/sec
  5 minute output rate 0 bits/sec, 0 packets/sec
     2058015 packets input, 233768993 bytes, 1 no buffer
     Received 1880947 broadcasts, 0 runts, 0 giants, 1 throttles
     3 input errors, 0 CRC, 0 frame, 0 overrun, 3 ignored
     0 input packets with dribble condition detected
     298036 packets output, 32280269 bytes, 0 underruns
     0 output errors, 10 collisions, 0 interface resets
     0 babbles, 0 late collision, 143 deferred
     0 lost carrier, 0 no carrier
     0 output buffer failures, 0 output buffers swapped out

The deferred counter counts the number of times the interface has tried to send a frame, but found the carrier busy at the first attempt (Carrier Sense). This does not constitute a problem, and is part of normal Ethernet operation.

The Collisions Counter

Here's another example of output from the show interface command:

router#show interface ethernet 0
Ethernet0 is up, line protocol is up 
  Hardware is Lance, address is 0010.7b36.1be8 (bia 0010.7b36.1be8)
  Internet address is 10.200.40.74/22
  MTU 1500 bytes, BW 10000 Kbit, DLY 1000 usec, 
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:00, output 00:00:06, output hang never
  Last clearing of "show interface" counters never
  Input queue: 1/75/1/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: random early detection(RED)
  5 minute input rate 1000 bits/sec, 2 packets/sec
  5 minute output rate 0 bits/sec, 0 packets/sec
     2058015 packets input, 233768993 bytes, 1 no buffer
     Received 1880947 broadcasts, 0 runts, 0 giants, 1 throttles
     3 input errors, 0 CRC, 0 frame, 0 overrun, 3 ignored
     0 input packets with dribble condition detected
     298036 packets output, 32280269 bytes, 0 underruns
     0 output errors, 10 collisions, 0 interface resets
     0 babbles, 0 late collision, 143 deferred
     0 lost carrier, 0 no carrier
     0 output buffer failures, 0 output buffers swapped out

As explained above, collisions do not constitute a problem. The collisions counter counts the number of frames for which one or more collisions occurred when trying to send them.
The collisions counter can be broken down into single collisions and multiple collisions, as in this output from the show controller command:

8 single collisions, 2 multiple collisions

This means that eight (out of 10) frames have been successfully transmitted after one collision; the two other frames required multiple collisions to arbitrate access to the medium.
An increasing collision rate (number of collisions divided by the number of packets output) doesn't indicate a problem: it is merely an indication of a higher offered load to the network. This can happen, for example, when another station is added to the network.
There is no set limit for "how many collisions are bad" or a maximum collision rate.
In conclusion, the collisions counter doesn't provide a very useful statistic when it comes to analyzing network performance or problems.
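
If you still want to track the rate over time, it can be computed from the show interface output. The following Python sketch is only an illustration: it takes the rate as collisions divided by packets output, the function name collision_rate is hypothetical, and the regular expressions assume the counter wording shown in the sample output above.

```python
import re


def collision_rate(show_interface_text):
    """Return collisions per output packet, or None if it cannot be computed."""
    packets = re.search(r"(\d+) packets output", show_interface_text)
    collisions = re.search(r"(\d+) collisions", show_interface_text)
    if not packets or not collisions:
        return None
    packets_out = int(packets.group(1))
    if packets_out == 0:
        return None  # avoid dividing by zero on an idle interface
    return int(collisions.group(1)) / packets_out


SAMPLE = """
     298036 packets output, 32280269 bytes, 0 underruns
     0 output errors, 10 collisions, 0 interface resets
     0 babbles, 0 late collision, 143 deferred
"""

if __name__ == "__main__":
    rate = collision_rate(SAMPLE)
    print(f"collision rate: {rate:.6%}")
```

With the sample counters, this is 10 collisions over 298036 output packets, which is a very small fraction of the traffic.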

Late Collisions

To allow collision detection to work properly, the time period in which collisions are detected is restricted to 512 bit-times. For Ethernet, this is 51.2 microseconds, and for Fast Ethernet, 5.12 microseconds. For Ethernet stations, collisions can be detected up to 51.2 microseconds after the beginning of the transmission, or in other words: up to the 512th bit of the frame.

When a collision is detected by a station after it has sent the 512th bit of its frame, this is counted as a late collision.
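
The relationship between 512 bit-times and the detection window is simple arithmetic, shown here as a short Python sketch. The function names are hypothetical, and treating exactly 512 bits sent as "late" is this sketch's reading of the definition above.

```python
SLOT_BITS = 512  # collisions must be detected within 512 bit-times


def collision_window_us(bit_rate_bps):
    """Length of the collision detection window in microseconds."""
    return SLOT_BITS * 1_000_000 / bit_rate_bps


def is_late_collision(bits_sent_before_detection):
    """True if the collision was detected after the 512th bit was sent."""
    return bits_sent_before_detection >= SLOT_BITS


if __name__ == "__main__":
    print("Ethernet (10 Mbit/s):", collision_window_us(10_000_000), "us")
    print("Fast Ethernet (100 Mbit/s):", collision_window_us(100_000_000), "us")
    print("collision after 600 bits is late:", is_late_collision(600))
```

The window shrinks by a factor of ten for Fast Ethernet because each bit takes one tenth of the time on the wire.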
Late collisions are reported by the following error messages:

%AMDP2_FE-5-LATECOLL: AMDP2/FE 0/0/[dec], Late collision 
%DEC21140-5-LATECOLL: [chars] transmit error 
%ILACC-5-LATECOLL: Unit [DEC], late collision error 
%LANCE-5-LATECOLL: Unit [DEC], late collision error 
%PQUICC-5-LATECOLL: Unit [DEC], late collision error 
%PQUICC_ETHER-5-LATECOLL: Unit [DEC], late collision error 
%PQUICC_FE-5-LATECOLL: PQUICC/FE([DEC]/[DEC]), Late collision    
%QUICC_ETHER-5-LATECOLL: Unit [DEC], late collision error 

The exact error message depends on the platform. You can check the number of late collisions in the output of a show interface ethernet [interface number] command.

router#show interface ethernet 0
Ethernet0 is up, line protocol is up 
  Hardware is Lance, address is 0010.7b36.1be8 (bia 0010.7b36.1be8)
  Internet address is 10.200.40.74/22
  MTU 1500 bytes, BW 10000 Kbit, DLY 1000 usec, 
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:00, output 00:00:06, output hang never
  Last clearing of "show interface" counters never
  Input queue: 1/75/1/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: random early detection(RED)
  5 minute input rate 1000 bits/sec, 2 packets/sec
  5 minute output rate 0 bits/sec, 0 packets/sec
     2058015 packets input, 233768993 bytes, 1 no buffer
     Received 1880947 broadcasts, 0 runts, 0 giants, 1 throttles
     3 input errors, 0 CRC, 0 frame, 0 overrun, 3 ignored
     0 input packets with dribble condition detected
     298036 packets output, 32280269 bytes, 0 underruns
     0 output errors, 10 collisions, 0 interface resets
     0 babbles, 0 late collision, 143 deferred
     0 lost carrier, 0 no carrier
     0 output buffer failures, 0 output buffers swapped out

Note that the station reporting the late collision is merely indicating the problem; it is generally not the cause of the problem. Possible causes are usually incorrect cabling or a non-compliant number of hubs in the network. Bad network interface cards (NICs) can also cause late collisions.

Excessive Collisions

As discussed before, the maximum number of retries in the backoff algorithm is set to 16. This means that if an interface fails 16 times to find a slot in which it can transmit its frame without another collision, it gives up. The frame is simply not transmitted, and is marked as an excessive collision.
Excessive collisions are reported by the following error messages:

%AMDP2_FE-5-COLL: AMDP2/FE 0/0/[DEC], Excessive collisions, TDR=[DEC], TRC=[DEC] 
%DEC21140-5-COLL: [chars] excessive collisions 
%ILACC-5-COLL: Unit [DEC], excessive collisions. TDR=[DEC] 
%LANCE-5-COLL: Unit [DEC], excessive collisions. TDR=[DEC]    
%PQUICC-5-COLL: Unit [DEC], excessive collisions. Retry limit [DEC] exceeded 
%PQUICC_ETHER-5-COLL: Unit [DEC], excessive collisions. Retry limit [DEC]
%PQUICC_FE-5-COLL: PQUICC/FE([DEC]/[DEC]), Excessive collisions, TDR=[DEC]
%QUICC_ETHER-5-COLL: Unit [DEC], excessive collisions. Retry limit  [DEC]

The exact error message depends on the platform.
Note: The Transmit Retry Count (TRC) counter is a 4-bit field which indicates the number of transmit retries of the associated packet. The maximum count is fifteen. However, if a Retry Error occurs, the count rolls over to zero. In this case only, the TRC value of zero should be interpreted as meaning sixteen. TRC is written by the controller into the last transmit descriptor of a frame, or when an error terminates a frame.
Note: The time domain reflectometer (TDR) counter is an internal counter that counts the time (in ticks of 100 nanoseconds (ns) each) from the start of a transmission to the occurrence of a collision. Because a transmission travels about 35 feet per tick, this value is useful to determine the approximate distance to a cable fault.
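
The two notes above describe how to interpret the TRC and TDR values in the error messages. A minimal Python sketch of both rules follows; the function names are hypothetical, and the 35 feet per tick figure is taken from the note as-is, so the result is only an approximation.

```python
FEET_PER_TICK = 35  # approximate distance per 100 ns tick, per the note above


def retry_count(trc, retry_error):
    """Interpret the 4-bit TRC field as a number of transmit retries."""
    if not 0 <= trc <= 15:
        raise ValueError("TRC is a 4-bit field (0 to 15)")
    if retry_error and trc == 0:
        # The counter rolled over: zero means sixteen in this case only.
        return 16
    return trc


def fault_distance_feet(tdr_ticks):
    """Approximate distance to a cable fault from the TDR tick count."""
    return tdr_ticks * FEET_PER_TICK


if __name__ == "__main__":
    print("retries:", retry_count(0, retry_error=True))
    print("approximate distance to fault (feet):", fault_distance_feet(4))
```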
You can check the number of excessive collisions in the output of a show controller ethernet [interface number] command.

router#show controller ethernet 0
LANCE unit 0, idb 0xFA6C4, ds 0xFC218, regaddr = 0x2130000, reset_mask 0x2
IB at 0x606E64: mode=0x0000, mcfilter 0000/0000/0100/0000
station address 0010.7b36.1be8  default station address 0010.7b36.1be8
buffer size 1524
RX ring with 16 entries at 0x606EA8
Rxhead = 0x606EC8 (4), Rxp = 0xFC244 (4)
00 pak=0x0FCBF4 Ds=0x60849E status=0x80 max_size=1524 pak_size=66
01 pak=0x10087C Ds=0x6133B6 status=0x80 max_size=1524 pak_size=66
02 pak=0x0FDE94 Ds=0x60BA7E status=0x80 max_size=1524 pak_size=203
03 pak=0x100180 Ds=0x611F82 status=0x80 max_size=1524 pak_size=66
04 pak=0x0FD09C Ds=0x609216 status=0x80 max_size=1524 pak_size=66
05 pak=0x0FE590 Ds=0x60CEB2 status=0x80 max_size=1524 pak_size=66
06 pak=0x100AD0 Ds=0x613A72 status=0x80 max_size=1524 pak_size=66
07 pak=0x0FD9EC Ds=0x60AD06 status=0x80 max_size=1524 pak_size=66
08 pak=0x0FF830 Ds=0x610492 status=0x80 max_size=1524 pak_size=348
09 pak=0x1003D4 Ds=0x61263E status=0x80 max_size=1524 pak_size=343
10 pak=0x0FEA38 Ds=0x60DC2A status=0x80 max_size=1524 pak_size=66
11 pak=0x100D24 Ds=0x61412E status=0x80 max_size=1524 pak_size=64
12 pak=0x0FC74C Ds=0x607726 status=0x80 max_size=1524 pak_size=64
13 pak=0x0FD798 Ds=0x60A64A status=0x80 max_size=1524 pak_size=66
14 pak=0x0FE7E4 Ds=0x60D56E status=0x80 max_size=1524 pak_size=64
15 pak=0x0FD2F0 Ds=0x6098D2 status=0x80 max_size=1524 pak_size=66
TX ring with 4 entries at 0x606F68, tx_count = 0
TX_head = 0x606F80 (3), head_txp = 0xFC294 (3)
TX_tail = 0x606F80 (3), tail_txp = 0xFC294 (3)
00 pak=0x000000 Ds=0x63491E status=0x03 status2=0x0000 pak_size=332
01 pak=0x000000 Ds=0x634FDA status=0x03 status2=0x0000 pak_size=327
02 pak=0x000000 Ds=0x630A9E status=0x03 status2=0x0000 pak_size=60
03 pak=0x000000 Ds=0x630A9E status=0x03 status2=0x0000 pak_size=60
3 missed datagrams, 0 overruns
0 transmitter underruns, 0 excessive collisions
8 single collisions, 2 multiple collisions
0 dma memory errors, 0 CRC errors
 
0 alignment errors, 0 runts, 0 giants
0 tdr, 0 spurious initialization done interrupts
0 no enp status, 0 buffer errors, 0 overflow errors
0 TX_buff, 1 throttled, 1 enabled
Lance csr0 = 0x73    

Excessive collisions indicate a problem. Common causes are devices connected as full-duplex on a shared Ethernet, broken NICs, or simply too many stations on the shared medium.

If you have any suggestions or want to add more to this article, email us at articles@knowurtech.com.
