NXOS Troubleshooting
@iamvinayvsawant
@iamcrissoto2024
TACDCN-2010
#CiscoLive
Agenda
• Introduction
• Unveiling the tools
• Real-world Applications and Success Stories
• Summary
TACDCN-2010 © 2024 Cisco and/or its affiliates. All rights reserved. Cisco Public 5
Tools Reviewed Today
Tool #1: Ethanalyzer
Network Topology
[Diagram: Spine-1 connects to Leaf-1 and Leaf-2. Leaf-1 has VLAN 10 (10.10.10.1) with Host 1 (10.10.10.2, MAC 10B3.D6A4.BA17) on Eth1/1. Leaf-2 has VLAN 20 (10.20.20.1) with Host 2 (10.20.20.20) on Eth1/2.]
Tool #1: Ethanalyzer
Ethanalyzer
Packet Capture Tool
Ethanalyzer
Real World Example [platform N9K-C93180YC-FX / 10.2(6)]
Example use case 1: Capture ICMP
[Diagram: Host 1 (10.10.10.2, MAC 10B3.D6A4.BA17) sends an ICMP request on Eth1/1 to Leaf-1 VLAN 10 (10.10.10.1), and Leaf-1 sends the reply.]
Leaf-1#
Ethanalyzer
Real World Example [platform N9K-C93180YC-FX / 10.2(6)]
Decode Internal
Maps to E1/1
Ethanalyzer
Real World Example [platform N9K-C93180YC-FX / 10.2(6)]
[Diagram: Host 1 (10.10.10.2) sends an ARP request in VLAN 10 to Leaf-1 (10.10.10.1), and Leaf-1 sends the reply. Callout: ARP Request Received.]
Leaf-1# ethanalyzer local interface inband display-filter "arp" limit-captured-frames 0
Capturing on 'ps-inb'
23 2024-04-18 16:08:07.492460719 10:B3:D6:A4:BA:17 → ff:ff:ff:ff:ff:ff ARP 64 Who has 10.10.10.1? Tell 10.10.10.2
24 2024-04-18 16:08:07.492902971 e4:1f:7b:2f:a5:c7 → 10:B3:D6:A4:BA:17 ARP 64 10.10.10.1 is at e4:1f:7b:2f:a5:c7
Ethanalyzer
Real World Example [platform N9K-C93180YC-FX / 10.2(6)]
Example use case 4: Capture ARP with more filters
ARP filter based on sender IP
Ethanalyzer
Real World Example [platform N9K-C93180YC-FX / 10.2(6)]
Example use case 5: Check if packets are getting software switched
[Diagram: Spine-1 (Eth1/47, Eth1/3) connects to Leaf-1 (Eth1/47) and Leaf-2 (Eth1/3, Lo1 192.168.100.2). Host 1 (10.10.10.2, MAC 10B3.D6A4.BA17) is on Leaf-1 Eth1/1 in VLAN 10 (10.10.10.1). Host 2 (10.20.20.20) is on Leaf-2 Eth1/2 in VLAN 20 (10.20.20.1).]
Ethanalyzer
Real World Example [platform N9K-C93180YC-FX / 10.2(6)]
Leaf-1# ethanalyzer local interface inband display-filter "ip.addr==10.10.10.1 && ip.addr==10.10.10.2" limit-captured-frames 0
Capturing on 'ps-inb'
34 2024-04-18 17:22:46 10.10.10.2 → 10.10.10.1 TCP 78 51278 → 179 [SYN] Seq=0 Win=29200 Len=0 MSS=1460
35 2024-04-18 17:22:46 10.10.10.1 → 10.10.10.2 TCP 78 179 → 51278 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460
36 2024-04-18 17:22:46 10.10.10.2 → 10.10.10.1 TCP 70 51278 → 179 [ACK] Seq=1 Ack=1 Win=29200 Len=0 TSval=22641527
37 2024-04-18 17:22:46 10.10.10.1 → 10.10.10.2 BGP 146 OPEN Message
44 2024-04-18 17:22:47 10.10.10.1 → 10.10.10.2 BGP 118 UPDATE Message, KEEPALIVE Message
45 2024-04-18 17:22:47 10.10.10.2 → 10.10.10.1 BGP 166 UPDATE Message, KEEPALIVE Message, UPDATE Message
46 2024-04-18 17:22:47 10.10.10.1 → 10.10.10.2 TCP 70 179 → 51278 [ACK] Seq=144 Ack=192 Win=65536 Len=0
Callouts: Syn, Syn Ack, Ack, Update
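The capture above shows the TCP three-way handshake completing before the BGP OPEN. As a minimal sketch, assuming the capture text has been saved off-box in the same one-line-per-frame layout, the handshake check can be automated. The helper names `tcp_flags` and `handshake_completed` are hypothetical and are not part of NX-OS or Ethanalyzer.

```python
import re

# Matches the bracketed TCP flag list, e.g. "[SYN]" or "[SYN, ACK]".
FLAG_RE = re.compile(r"\[([A-Z, ]+)\]")


def tcp_flags(lines):
    """Return the TCP flag strings, in capture order, for TCP frames only."""
    flags = []
    for line in lines:
        if " TCP " not in line:
            continue  # skip BGP (and any other decoded protocol) frames
        match = FLAG_RE.search(line)
        if match:
            flags.append(match.group(1))
    return flags


def handshake_completed(lines):
    """True when the first three TCP frames are SYN, SYN-ACK, ACK."""
    return tcp_flags(lines)[:3] == ["SYN", "SYN, ACK", "ACK"]


capture = [
    "34 2024-04-18 17:22:46 10.10.10.2 → 10.10.10.1 TCP 78 51278 → 179 [SYN] Seq=0 Win=29200 Len=0 MSS=1460",
    "35 2024-04-18 17:22:46 10.10.10.1 → 10.10.10.2 TCP 78 179 → 51278 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460",
    "36 2024-04-18 17:22:46 10.10.10.2 → 10.10.10.1 TCP 70 51278 → 179 [ACK] Seq=1 Ack=1 Win=29200 Len=0 TSval=22641527",
    "37 2024-04-18 17:22:46 10.10.10.1 → 10.10.10.2 BGP 146 OPEN Message",
]
print(handshake_completed(capture))
```

If the SYN-ACK never appears in the capture, the function returns False, which points at the reply path rather than at BGP itself.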
Ethanalyzer
Real World Example [platform N9K-C93180YC-FX / 10.2(6)]
Leaf-1# ethanalyzer local interface inbound-hi display-filter "bgp && ip.addr==10.20.20.1" limit-captured-frames 0
Capturing on inband
2024-05-04 21:51:28.977444 10.10.10.1 -> 10.20.20.1 BGP OPEN Message
2024-05-04 21:51:29.979955 10.10.10.1 -> 10.20.20.1 BGP KEEPALIVE Message
2024-05-04 21:51:30.996699 10.10.10.1 -> 10.20.20.1 BGP [TCP Retransmission] KEEPALIVE Message
2024-05-04 21:51:31.106224 10.10.10.1 -> 10.20.20.1 BGP UPDATE Message, UPDATE Message, UPDATE Message, UPDATE Message, UPDATE Message
Callout: Update packet dropped
Ethanalyzer
Real World Example [platform N9K-C93180YC-FX / 10.2(6)]
Example use case 7: Capture CoPP dropped traffic
Filter based on queue number
Ethanalyzer
Real World Example [platform N9K-C93180YC-FX / 10.2(6)]
Saving Capture and Ring Buffer
Save to bootflash
Ethanalyzer
Real World Example [platform N9K-C93180YC-FX / 10.2(6)]
Reading the capture locally
Local Read
Tool #2: SPAN-to-CPU
SPAN-to-CPU
Real World Example [platform N9K-C93180YC-FX / 10.2(6)]
How does it work?
SPAN-to-CPU
Real World Example [platform N9K-C93180YC-FX / 10.2(6)]
Configuration
[Diagram: traffic on Eth1/47 between Spine-1 and Leaf-1 is mirrored; the capture destination is the Nexus CPU.]
SPAN-to-CPU
Real World Example [platform N9K-C93180YC-FX / 10.2(6)]
Example
Leaf-1#
SPAN-to-CPU
Changing default SPAN Rate
Default: 50 pps
Leaf-1# show hardware rate-limiter span
Module: 1
R-L Class           Config      Allowed             Dropped             Total
+----------------+----------+--------------------+--------------------+--------------------+
span                50          0                   0                   0
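When a SPAN-to-CPU capture looks incomplete, the Dropped counter in the rate-limiter output is the first thing to check. The following is a minimal sketch, assuming the data row has the five whitespace-separated columns shown above; `parse_rate_limiter_row` is a hypothetical helper name.

```python
def parse_rate_limiter_row(line):
    """Split one data row of 'show hardware rate-limiter span' into named fields."""
    name, config, allowed, dropped, total = line.split()
    return {
        "class": name,
        "config_pps": int(config),  # configured rate; the slide states a 50 pps default
        "allowed": int(allowed),
        "dropped": int(dropped),
        "total": int(total),
    }


row = parse_rate_limiter_row("span 50 0 0 0")
# A non-zero Dropped counter suggests mirrored packets exceeded the configured rate.
span_is_dropping = row["dropped"] > 0
print(row["config_pps"], span_is_dropping)
```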
Tool #3: ELAM
ELAM
Packet Capture and Forwarding Verification Tool
ELAM
Nexus 9k Generations/Model Types
ELAM
Syntax, Top of Rack
ELAM
In-Select Options and Trigger options
ELAM
Scenario #1a: ICMP Failure – Does it reach me?
Leaf-1# debug platform internal tah elam
Leaf-1(TAH-elam)# trig init
Slot 1: param values: start asic 0, start slice 0, lu-a2d 1, in-select 6, out-select 0
Leaf-1(TAH-elam-insel6)# set outer ipv4 src_ip 10.10.10.2 dst_ip 10.10.10.1
Leaf-1(TAH-elam-insel6)# start
Leaf-1(TAH-elam-insel6)# report
[Diagram: Spine-1 (Eth1/47, Eth1/3) connects to Leaf-1 (Eth1/47, Lo1 192.168.100.1) and Leaf-2 (Eth1/3, Lo1 192.168.100.2). Host 1 (10.10.10.2) is on Leaf-1 Eth1/1 in VLAN 10 (10.10.10.1). Host 2 (10.20.20.20) is on Leaf-2 Eth1/2 in VLAN 20 (10.20.20.1).]
ELAM
ASIC Trigger Explanations and vPC Notes
Leaf-1(TAH-elam)# trig init
Slot 1: param values: start asic 0, start slice 0, lu-a2d 1, in-select 6, out-select 0
• The asic value is only necessary on modular (EoR) Nexus 9000 switches, because their linecards may use different ASIC values per port group
• The slice number corresponds to the partition of the ASIC to which the interfaces are associated
• In modern ELAM (post 7.0(3)I5(2)), defining the lu-a2d and out-select values is not necessary; they are associated with different aspects of the forwarding table
• When using vPC, set up the ELAM capture on both peers, since traffic can land on either one
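The parameter line printed by "trig init" can be turned into a small record when collecting ELAM settings from several switches. This is a sketch only, assuming the line keeps the "Slot N: param values: key value, ..." layout shown above; `parse_trig_init` is a hypothetical helper name.

```python
def parse_trig_init(line):
    """Parse a 'Slot N: param values: ...' line into a dict of integers."""
    slot_part, params_part = line.split(": param values: ")
    result = {"slot": int(slot_part.split()[1])}
    for item in params_part.split(","):
        # Each item looks like "start asic 0" or "in-select 6".
        key, value = item.strip().rsplit(" ", 1)
        result[key] = int(value)
    return result


line = "Slot 1: param values: start asic 0, start slice 0, lu-a2d 1, in-select 6, out-select 0"
params = parse_trig_init(line)
print(params)
```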
ELAM
Scenario #1a: ICMP Failure – Does it reach me?
Leaf-1(TAH-elam-insel6)# report
SUGARBOWL ELAM REPORT SUMMARY
slot - 1, asic - 0, slice - 0
============================
Incoming Interface: Eth1/1
Src Idx : 0x1, Src BD : 10
Outgoing Interface Info: dmod 0, dpid 0
Dst Idx : 0x5bf, Dst BD : 10
Packet Type: IPv4
<cont>
<prev>
Dst MAC address: 10:B3:D6:A4:75:A7
Src MAC address: 10:B3:D6:A4:BA:17
.1q Tag0 VLAN: 10, cos = 0x0
Sup hit: 1, Sup Idx: 2788
Dst IPv4 address: 10.10.10.1
Src IPv4 address: 10.10.10.2
Ver = 4, DSCP = 0, Don't Fragment = 0
ELAM
ELAM Report Component Notes
Leaf-1(TAH-elam-insel6)# report
ELAM not triggered yet on slot - 1, asic - 0, slice - 0
ELAM hit flop error on slot - 1, asic - 0, slice - 1. Try elam again.
• If you expect to receive the traffic (on a specific interface or in general) and see “ELAM not triggered”, one of three scenarios applies:
• You may need to run the ELAM again (you may have started the capture too late)
• You are not receiving the traffic on the interface you think you are (particularly relevant if you do not know which interface should receive the traffic)
• The Nexus 9k is not receiving the traffic
• If you see the error “ELAM hit flop error”, this is not a cause for concern. It simply means that you should enter “reset” and then set the trigger again with the “set” command
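The two non-result messages above lead to different next steps, so a script that drives ELAM needs to tell them apart. A minimal sketch, assuming the message strings match the slide text exactly; `classify_report` and the `NEXT_STEP` wording are hypothetical and only restate the notes above.

```python
def classify_report(text):
    """Classify ELAM 'report' output by the status strings shown on the slide."""
    if "ELAM not triggered yet" in text:
        return "not_triggered"
    if "ELAM hit flop error" in text:
        return "flop_error"
    if "ELAM REPORT" in text:
        return "triggered"
    return "unknown"


# Next steps paraphrased from the notes above.
NEXT_STEP = {
    "not_triggered": "re-run the ELAM, verify the ingress interface, or confirm the switch receives the traffic",
    "flop_error": "enter 'reset', then set the trigger again with 'set'",
    "triggered": "read the report",
    "unknown": "inspect the raw output",
}

status = classify_report("ELAM hit flop error on slot - 1, asic - 0, slice - 1. Try elam again.")
print(status, "->", NEXT_STEP[status])
```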
ELAM
ELAM Report Component Notes
HOMEWOOD ELAM REPORT SUMMARY
slot - 1, asic - 3, slice - 0
============================
• If your ingress or egress interface is an L3 port or loopback, the SRC/DST BD field for that interface shows a number outside the normal VLAN range allowed by the Nexus, normally in the 4096+ range
• When using dot1q tunnels, a transit switch may show a different SRC/DST BD than the .1q Tag field; this is expected behavior
• If the packet is not destined for the Nexus 9k and you see a supervisor hit, one of two scenarios applies:
• You have an active SVI on the switch
• Your switch is incorrectly punting the packet to the control plane, which can cause latency and/or drops
ELAM
ELAM Report Component Notes
Drop Info:
----------
LUA:
LUB:
LUC:
LUD:
Final Drops:
vntag:
vntag_valid : 0
vntag_vir : 0
vntag_svif : 0
• Seeing something under “Drop Info” in the second half of the output does not by itself mean that the packet will drop. If the ELAM correctly registers the packet drop, you should see a reason under BOTH the LUA/B/C/D section AND the Final Drops section
• Although ELAM is normally very reliable, it may sometimes show that it is forwarding even though it is not. To verify whether this is happening, do the following:
• Confirm, if possible, that the next hop receives the expected packet
• Confirm, through consistency checker and troubleshooting commands, that routes/L2 adjacencies/protocols are programmed correctly
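The "both sections" rule above can be checked mechanically. The sketch below assumes the Drop Info block is laid out one item per line as on the slides (the exact on-box formatting may differ); `parse_drop_info` and `drop_confirmed` are hypothetical helper names.

```python
LU_HEADERS = ("LUA:", "LUB:", "LUC:", "LUD:")


def parse_drop_info(report_lines):
    """Return (lu_drops, final_drops) found in the Drop Info part of a report."""
    lu_drops, final_drops = [], []
    section = None
    for raw in report_lines:
        line = raw.strip()
        if line == "Drop Info:":
            section = "lu"
            continue
        if section is None or not line or set(line) == {"-"}:
            continue  # before Drop Info, blank line, or the dashed rule
        if line == "Final Drops:":
            section = "final"
            continue
        if line.startswith("vntag"):
            break  # the vntag block follows the drop sections
        if line in LU_HEADERS:
            continue
        (final_drops if section == "final" else lu_drops).append(line)
    return lu_drops, final_drops


def drop_confirmed(report_lines):
    """A drop counts as registered only when both sections list a reason."""
    lu_drops, final_drops = parse_drop_info(report_lines)
    return bool(lu_drops) and bool(final_drops)


no_drop = ["Drop Info:", "----------", "LUA:", "LUB:", "LUC:", "LUD:",
           "Final Drops:", "vntag:", "vntag_valid : 0"]
acl_drop = ["Drop Info:", "----------", "LUA:", "LUB:", "LUC:", "LUD:",
            "ACL_DROP", "Final Drops:", "ACL_DROP"]
print(drop_confirmed(no_drop), drop_confirmed(acl_drop))
```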
ELAM
Scenario #1b: ICMP Failure – Do I forward it correctly?
• Determine your SRC Interface and its slice number and source ID:
Leaf-1# show system internal ethpm info interface eth1/1 | i dpid
IF_STATIC_INFO: port_name=Ethernet1/1,if_index:0x1a000000,ltl=6144,slot=0,nxos_port=0,dmod=1,dpid=16,unit=0,queue=65535,xbar_unitbmp=0x0,ns_pid=255,slice_num=0,port_on_slice=16,src_id=32
• Confirm that the packet is non-encapsulated
[Diagram: the SRC IF is Leaf-1 Eth1/1 in VLAN 10 (10.10.10.1), facing Host 1 (10.10.10.2). Host 2 (10.20.20.20) is on Leaf-2 Eth1/2 in VLAN 20 (10.20.20.1).]
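The slice number and source ID needed for the trigger are buried in a long key/value line. A minimal sketch for pulling them out, assuming the output has been joined onto one line and uses "," between fields and "=" or ":" between key and value, as shown above; `parse_ethpm_static_info` is a hypothetical helper name.

```python
import re


def parse_ethpm_static_info(line):
    """Turn the IF_STATIC_INFO line into a dict of string values."""
    body = line.split("IF_STATIC_INFO:", 1)[-1]
    fields = {}
    for item in body.split(","):
        item = item.strip()
        if not item:
            continue
        # Most fields use "=", but if_index uses ":" in the output above.
        key, value = re.split(r"[=:]", item, maxsplit=1)
        fields[key] = value
    return fields


output = ("IF_STATIC_INFO: port_name=Ethernet1/1,if_index:0x1a000000,ltl=6144,slot=0,"
          "nxos_port=0,dmod=1,dpid=16,unit=0,queue=65535,xbar_unitbmp=0x0,ns_pid=255,"
          "slice_num=0,port_on_slice=16,src_id=32")
info = parse_ethpm_static_info(output)
print(info["slice_num"], info["src_id"], info["dmod"], info["dpid"])
```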
ELAM
Scenario #1b: ICMP Failure – Do I forward it correctly?
Leaf-1# debug platform internal tah elam
?
Leaf-1(TAH-elam-insel6)# set outer ipv4 src_ip 10.10.10.2 dst_ip 10.20.20.20
Leaf-1(TAH-elam-insel6)# report
[Diagram: the SRC IF is Leaf-1 Eth1/1 in VLAN 10; Host 1 (10.10.10.2) sends to Host 2 (10.20.20.20) in VLAN 20.]
ELAM
Scenario #1b: ICMP Failure – Do I forward it correctly?
Leaf-1# report
SUGARBOWL ELAM REPORT SUMMARY
slot - 1, asic - 0, slice - 0
============================
Incoming Interface: Eth1/1
Src Idx : 0x1, Src BD : 10
Outgoing Interface Info: dmod 1, dpid 38
Dst Idx : 0xb9, Dst BD : 200
<snip>
ELAM
ELAM Report: Optional way to run it
• Another way to run the report, which saves you from having to decode the outgoing dmod/dpid values
ELAM
Scenario #1c: ICMP Failure – Do I forward it correctly? (End of Row/9500s)
Spine-1(config)# debug platform internal tah elam
Spine-1(TAH-elam)# trig init
Slot 1: param values: start asic 0, start slice 0, lu-a2d 1, in-select 6, out-select 0
Slot 22: param values: start asic 0, start slice 0, lu-a2d 1, in-select 6, out-select 0
Slot 23: param values: start asic 0, start slice 0, lu-a2d 1, in-select 6, out-select 0
Slot 24: param values: start asic 0, start slice 0, lu-a2d 1, in-select 6, out-select 0
Slot 26: param values: start asic 0, start slice 0, lu-a2d 1, in-select 6, out-select 0
switch(TAH-elam-insel6)#
• If you have a Nexus 9500, you likely have multiple linecards and fabric modules
• If you try to run an ELAM wide open on a Nexus 9500, it will attempt to trigger on all of the modules, and only on one slice of a specific ASIC of module 1
• To trigger for a packet correctly on a Nexus 9500, you must capture on the correct linecard/module, the right ASIC and the right slice where the traffic may appear
[Diagram: the SRC IF is on Spine-1, whose Eth1/31 and Eth1/36 connect to Leaf-1 (Eth1/47, Lo1 192.168.100.1) and Leaf-2 (Eth1/48, Lo1 192.168.100.2). Host 1 (10.10.10.2) is on Leaf-1 Eth1/1 in VLAN 10; Host 2 (10.20.20.20) is on Leaf-2 Eth1/2 in VLAN 20.]
ELAM
Scenario #1c: ICMP Failure – Do I forward it correctly? (End of Row/9500s)
Spine-1# show system internal ethpm info int e1/36 | i dpid
IF_STATIC_INFO: port_name=Ethernet1/36,if_index:0x1a004600,ltl=6004,slot=0,nxos_port=140,dmod=4,dpid=44,unit=3,queue=65535,xbar_unitbmp=0x0,ns_pid=255,slice_num=0,port_on_slice=44,src_id=88
• Keep in mind that if you have multiple interfaces (like ECMP, port-channels with multiple links, etc.), you will need the highlighted info for all of the interfaces
• If the interfaces are on different modules, you will need to attach to each module and check each one separately
Spine-1# attach mod 1
Attaching to module 1 ...
To exit type 'exit', to abort type '$.'
Last login: Sat May 11 16:44:59 UTC 2024 from sup27 on pts/1
ELAM
Scenario #2: Am I receiving the ARP Request and Flooding it?
Leaf-1# debug platform internal tah elam
Leaf-1(TAH-elam-insel6)# start
Leaf-1(TAH-elam-insel6)# report
[Diagram: Host 1 (10.10.10.2) and Host 3 (10.10.10.3) are both in VLAN 10.]
ELAM
Scenario #2: Am I receiving the ARP Request and Flooding it?
slot - 1, asic - 0, slice - 0
============================
Incoming Interface: Eth1/1
Src Idx : 0x1, Src BD : 10
Outgoing Interface Info: dst_ptr 10, dst_ptr_is_flood 1
Packet Type: ARP
Dst MAC address: FF:FF:FF:FF:FF:FF
Src MAC address: 10:B3:D6:A4:BA:17
.1q Tag0 VLAN: 10, cos = 0x6
<cont>
<prev>
Target Hardware address: FF:FF:FF:FF:FF:FF
Sender Hardware address: 10:B3:D6:A4:BA:17
Target Protocol address: 10.10.10.3
Sender Protocol address: 10.10.10.1
ARP opcode: 1
Sup hit: 1, Sup Idx: 2648
ELAM
Decode Flood Interfaces Option #1: Consistency Checker
• Non-detailed output:
Leaf-1# show consistency-checker membership vlan 10
Checks: Port membership of Vlan in vifvlanmbr, rwepgstate and qsmt_ovtbl tables
Additional Checks: Fex port membership of Vlan in vifvlanmbrsearchtable table
Ports configured as "switchport monitor" will be skipped
ELAM
Decode Flood Interfaces Option #1: Consistency Checker
• Detailed output:
Leaf-1# show consistency-checker membership vlan 10 detail
"expectedDetails": {},
"actualDetails": {
"hwTableName": "tah_sug_qsmt_ovtbl",
"hwIndexName": "data",
"tableData": [
{
"number": 1,
"units": [
{
"number": 0,
"slices": [
{
"number": 0,
"bitmap": "0x00000000:0x000000c0:0x00010000"
<snip>
Translated Bitmap on Slide 52
ELAM
Decode Flood Interfaces Option #2: ELAM and ASIC Hardware Tables
• Starting from the right and moving left in the above hex string, each hex digit holds 4 bits (bit 0 is on the far right). Therefore, if we count the bits:
000000c0:00010000 = 0000 0000 0000 0000 0000 0000 1100 0000 : 0000 0000 0000 0001 0000 0000 0000 0000
Set bits: 39, 38 and 16
• This bitmap corresponds to the SPort column values under “show interface hardware-mappings”
Name     Ifindex   Smod  Unit  HPort  FPort  NPort  VPort  Slice  SPort  SrcId  MacId  MacSP  VIF   Block  BlkSrcID
Eth1/1   1a000000  1     0     16     255    0      -1     0      16     32     4      0      1     0      32
<snip>
Eth1/47  1a005c00  1     0     38     255    184    -1     0      38     68     8      4      1537  0      68
Eth1/48  1a005e00  1     0     39     255    188    -1     0      39     70     8      6      1537  0      70
<snip>
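The manual bit counting above can be reproduced in a few lines. This is a sketch under two assumptions taken from the slide's own decode: each colon-separated word is 32 bits wide, and the rightmost word holds bits 0 to 31. The `SPORT_TO_INTERFACE` mapping is copied by hand from the "show interface hardware-mappings" rows above; `set_bits` is a hypothetical helper name.

```python
def set_bits(bitmap):
    """Return the positions of the set bits in a colon-separated hex bitmap."""
    value = 0
    for word in bitmap.split(":"):
        # Leftmost word is most significant; each word contributes 32 bits.
        value = (value << 32) | int(word, 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


# SPort -> interface, taken from the hardware-mappings rows shown above.
SPORT_TO_INTERFACE = {16: "Eth1/1", 38: "Eth1/47", 39: "Eth1/48"}

bits = set_bits("0x00000000:0x000000c0:0x00010000")
flood_interfaces = [SPORT_TO_INTERFACE.get(bit, f"SPort {bit}") for bit in bits]
print(bits, flood_interfaces)
```

Bits without an entry in the mapping are reported by SPort number, so a missing row in the hand-copied table is visible rather than silently dropped.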
ELAM
ELAM for ARP Reply
• For the ARP Reply, you can just get the results from an ARP ELAM for the reverse flow:
<snip>
ELAM
Scenario #3: VXLAN Flow – Is the fabric dropping the packet?
• Determine your SRC Interface and its slice number and source ID on each switch:
ELAM
Scenario #3: VXLAN Flow – Is the fabric dropping the packet?
[Diagram: encapsulated flow with outer SIP 192.168.100.1 and DIP 192.168.100.2 {inner SIP 10.10.10.2, DIP 10.20.20.20}, from Leaf-1 (Lo1 192.168.100.1, Eth1/47) through Spine-1 to Leaf-2 (Lo1 192.168.100.2, Eth1/3).]
Leaf-1# report
SUGARBOWL ELAM REPORT SUMMARY
slot - 1, asic - 0, slice - 0
============================
Incoming Interface: Eth1/1
Src Idx : 0x1, Src BD : 10
Outgoing Interface Info: dmod 1, dpid 38
Dst Idx : 0xb9, Dst BD : 200
Leaf-2# report
SUGARBOWL ELAM REPORT SUMMARY
slot - 1, asic - 0, slice - 0
============================
Incoming Interface: Eth1/3
Src Idx : 0x9, Src BD : 200
Outgoing Interface Info: dmod 1, dpid 17
Dst Idx : 0x5, Dst BD : 20
Outer Dst IPv4 address: 192.168.100.2
Outer Src IPv4 address: 192.168.100.1
ELAM
Scenario #4: Are we dropping the packet?
Leaf-1# debug platform internal tah elam
Leaf-1(TAH-elam-insel6)# start
Leaf-1(TAH-elam-insel6)# report
[Diagram: Host 1 (10.10.10.2, VLAN 10) sends to Host 2 (10.20.20.20, VLAN 20).]
ELAM
Scenario #4: Are we dropping the packet?
Leaf-1# report
SUGARBOWL ELAM REPORT SUMMARY
slot - 1, asic - 0, slice - 0
============================
Incoming Interface: Eth1/1
Src Idx : 0x1, Src BD : 10
Outgoing Interface Info: dmod 1, dpid 38
Dst Idx : 0xb9, Dst BD : 200
<snip>
Dst IPv4 address: 10.20.20.20
Src IPv4 address: 10.10.10.2
<snip>
Drop Info:
----------
LUA:
LUB:
LUC:
LUD:
ACL_DROP
Final Drops:
ACL_DROP
Even though we show a forwarding decision to another interface, we are dropping the packet due to this reason
ELAM
Scenario #4: Are we dropping the packet?
• Most common ELAM drop codes/codes that indicate bad forwarding:
• ACL_DROP: Frame/Packet matched a deny entry in an ACL
• IP_MTU_CHECK_FAILURE: Frame failed the MTU check for the interface
• IP_SELF_FWD_FAILURE: IP Redirects enabled on the SVI
• ROUTING_DISABLED: Routing is disabled for the particular VLAN
• SRC_VLAN_MBR: Packet/Frame received on an interface where the VLAN is not configured/programmed
• TTL_EXPIRED: Packet received on an interface that causes the TTL to be decremented to zero, resulting in a drop
• UC_DF_CHECK_FAILURE: vPC loop avoidance failure
• UC_PC_CFG_TABLE_DROP: No route in the VRF for the destination
• UC_RPF_FAILURE:
• UC_TENANT_MYTEP_BRIDGE_MISS: VXLAN leaf receiving traffic from a leaf for which it has not learned any hosts/routes; it does not have a peering with that VTEP
• UC_TENANT_MYTEP_ROUTE_MISS: VXLAN leaf in the particular tenant VRF does not have a route for the given destination
Tool #4: Consistency-Checker
Consistency-Checker
Example Customer Scenario
Problem Description:
You have devices in VLAN 100 in a (simplified) topology like the one in the diagram. Some can resolve ARP and ping, while others cannot ping or even resolve ARP.
How would you troubleshoot this?
[Diagram: Core-9k-1 and Core-9k-2 are connected by Po100 (Eth1/49-50). Host 1 (192.168.0.48), Host 2 (192.168.0.167, Eth1/2) and Host 3 (192.168.0.21) are in VLAN 100.]
Consistency-Checker
Example Customer Scenario
Normal Troubleshooting Reveals:
• Links from the switches to the affected devices (and between the switches themselves) were tested and found to be clean
• MACs of affected devices are learned on relevant interfaces
• STP found to be forwarding on all relevant interfaces
Consistency-Checker
Example Customer Scenario
Troubleshooting Using Our Session’s Tools (So Far) Reveals:
• No software switching of the traffic on either of the switches for any flows, as per Ethanalyzer
Eth1/50 appears to have some issue, but what is it and how can we figure it out?
Consistency Checker
Forwarding State/Feature Consistency Verification Tool
Consistency-Checker
Example Customer Scenario
• Using the consistency checker for our problem:
Core-9k-1# show consistency-checker member vlan 100
<snip>
Checking hardware for Module 1 Unit 0
<snip>
Consistency Check: FAILED
Vlan:10, Hardware state consistent for:
Ethernet1/1
Ethernet1/49
Vlan:10, Hardware state inconsistent for:
Ethernet1/50
• Other commands that would work in this instance:
show consistency-checker stp-state vlan 100
show consistency-checker l2 switchport int port-channel 100
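When the checker is run across many VLANs or switches, the interfaces listed under "inconsistent" are the part worth extracting. A minimal sketch, assuming the text layout shown above with one interface per line; `inconsistent_interfaces` is a hypothetical helper name.

```python
def inconsistent_interfaces(output):
    """Collect the interfaces listed under 'Hardware state inconsistent for:'."""
    found = []
    collecting = False
    for raw in output.splitlines():
        line = raw.strip()
        if "Hardware state inconsistent for:" in line:
            collecting = True
        elif "Hardware state consistent for:" in line:
            collecting = False
        elif collecting and line.startswith("Ethernet"):
            found.append(line)
    return found


sample = """Consistency Check: FAILED
Vlan:10, Hardware state consistent for:
Ethernet1/1
Ethernet1/49
Vlan:10, Hardware state inconsistent for:
Ethernet1/50"""
print(inconsistent_interfaces(sample))
```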
Consistency-Checker
Example Customer Scenario: Resolution Steps
• You’ve identified the underlying issue. Now what?
• Steps:
• Grab a [show tech detail] and/or relevant feature show techs
• Flap the relevant SVI/interface/route
• If flapping doesn’t work, perform [reload ascii]
• Regardless of whether the flap/reload works, take a show tech detail/feature show tech afterward
• For further analysis, you can send your troubleshooting results and the show techs to TAC
Tool #5:
Show Troubleshoot
Troubleshoot Command
• Helps us check MAC and IP route programming in hardware
• Gathers hardware and software table commands
• Nests and organizes commands and their output for more coherent viewing
Syntax
• L2:
• L3:
Troubleshoot Command
Example of L3 Troubleshoot Output
Leaf-1# show troubleshoot l3 ipv4 10.20.20.1 src-ip 10.10.10.1 vrf default
************************************
CHECK ROUTE IN PI RIB
************************************
show ip route 10.20.20.20 vrf default
Callout: Check Route in RIB
Troubleshoot Command
Example of L3 Troubleshoot Output
************************************
CHECK ROUTE IN UFIB
************************************
show forwarding internal trace v4-pfx-history module 1
PREFIX 10.20.20.20/32 TABLE_ID 0x1
Time usecs ha_handle next_obj next_obj_HH NH_cnt epoch operation
2024/04/24 13:59:57.239 58555 0xca9071fc V4 adj 0x7a127 1 1 Create
************************************
CHECK HOST ROUTE IN HARDWARE
************************************
show hardware internal tah l3 v4host | grep 10.20.20.20
HW Loc | Ip Entry | VRF | MPath | NumP | Base/L2ptr |CC|SR|DR|TD|DC|DE|LI|HR|
-----------|------------------------|---------|-----------|-----------|----------------|--|--|--|--|--|--|--|
3/5 | 10.20.20.20 | 1 | No |0 | 0xd0004 | | | | | | | | |
Callouts: Check prefix learning history; Check Host route entry in v4 host table; Adjacency Index Checks
Troubleshoot Command
Example of L2 Troubleshoot Output
Leaf-1# show troubleshoot l2 mac 689e.0b8b.0327 vlan 10 detail
MAC: 689e.0b8b.0327 Vlan: 10
Show spanning-tree VLAN 10
VLAN Name Status Ports
---- -------------------------------- --------- -------------------------------
10 VLAN0010 active Eth1/1
Tool #6: iCAM
Intelligent CAM Analytics and Machine-learning (iCAM)
• iCAM provides resource monitoring and analytics.
• You can obtain traffic, scale and resource (usage level) monitoring for the following resources and
functions:
Intelligent CAM Analytics and Machine-learning (iCAM)
Using iCAM to check L2-Switching Scale
Callout: VLAN Utilization Level Critical
Leaf-1# show icam scale l2-switching
-------------------------------------------------------------------------------------------
Scale Limits for L2 Switching
-------------------------------------------------------------------------------------------
Feature              Verified  Config  Cur    Cur     Threshold  Polled
                     Scale     Scale   Scale  Util    Exceeded   Timestamp
-------------------------------------------------------------------------------------------
MAC Addresses        -         -       -      -       -          -
 (Mod:1,FE:0)        92000     92000   16     0.01    None       2024-05-09 13:54:50
VLANs                3967      3000    3839   127.96  Critical   2024-05-09 13:54:50
 (VDC:1)             -         -       3839   127.96  Critical   2024-05-09 13:54:50
Isolated Port*Vlan   190000    190000  0      0       None       2024-05-09 13:54:50
 (VDC:1)             -         -       0      0       None       2024-05-09 13:54:50
Leaf-1#
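Rows whose "Threshold Exceeded" column is anything other than None are the ones that need attention. The sketch below assumes each data row ends with the seven columns shown above (two scale limits, current scale, utilization, threshold level, date, time) and that the feature name is whatever precedes them; `over_threshold` is a hypothetical helper name.

```python
def over_threshold(rows):
    """Return (feature, utilization_percent, level) for rows that exceed a threshold."""
    flagged = []
    for row in rows:
        tokens = row.split()
        if len(tokens) < 8:
            continue  # rulers, headers, prompts
        feature = " ".join(tokens[:-7])
        util, level = tokens[-4], tokens[-3]
        try:
            util_pct = float(util)
        except ValueError:
            continue  # placeholder rows such as "MAC Addresses - - - - - -"
        if level != "None":
            flagged.append((feature, util_pct, level))
    return flagged


rows = [
    "MAC Addresses - - - - - -",
    "(Mod:1,FE:0) 92000 92000 16 0.01 None 2024-05-09 13:54:50",
    "VLANs 3967 3000 3839 127.96 Critical 2024-05-09 13:54:50",
    "(VDC:1) - - 3839 127.96 Critical 2024-05-09 13:54:50",
    "Isolated Port*Vlan 190000 190000 0 0 None 2024-05-09 13:54:50",
]
for feature, util_pct, level in over_threshold(rows):
    print(feature, util_pct, level)
```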
Intelligent CAM Analytics and Machine-learning (iCAM)
Using iCAM to check Unicast Routing Scale
Callout: LPM Routes Scale High
Leaf-1# show icam scale unicast-routing
-------------------------------------------------------------------------------------------
Scale Limits for Unicast Routing
-------------------------------------------------------------------------------------------
Feature              Verified  Config  Cur    Cur     Threshold  Polled
                     Scale     Scale   Scale  Util    Exceeded   Timestamp
-------------------------------------------------------------------------------------------
IPv4 LPM Routes      -         -       -      -       -          -
 (Mod:1)             6000      6000    5468   91.13   Warning    2024-05-09 13:54:50
IPv6 LPM Routes      -         -       -      -       -          -
 (Mod:1)             1900      1900    3      0.15    None       2024-05-09 13:54:50
Tool #7: PIE
PIE
Platform Telemetry Tool
PIE
Scenario #1: Link Failure
[Diagram: Eth1/47 is shut on Spine-1, which takes down the link to Leaf-1. Host 1 (10.10.10.2) is on Leaf-1 Eth1/1 and Host 2 (10.20.20.20) is on Leaf-2 Eth1/2.]
Reason: No Signal from peer is detected. Please check peer configuration.
• Part of Link Debug Telemetry, to assist with more granular L1 link issues related to signaling
PIE
Scenario #2: Optic Health
[Diagram: Leaf-1 showing CRCs on E1/47.]
Health Metric: --------BAD------- Mod: 01
• Provides metrics for optics to indicate the source of link flaps or optic health with respect to current/voltage/power
PIE
Scenario #3: Environment Monitoring (PSUs and FANs)
PIE
Scenario #4: CPU and Memory Top Talkers
Summary
Tool #1: Ethanalyzer – Captures packets going to and from the CPU
Conclusion
In conclusion, NXOS is a feature-rich OS with many built-in tools.
If you know how to use them, they will expand your toolkit and empower you in your troubleshooting.
Question & Answer
Additional Reference Resources
ELAM
https://www.cisco.com/c/en/us/support/docs/switches/nexus-9000-series-switches/213848-nexus-9000-cloud-scale-asic-tahoe-nx-o.html
https://www.youtube.com/watch?v=s0PSHN2Qxhc
Ethanalyzer
https://www.cisco.com/c/en/us/support/docs/switches/nexus-7000-series-switches/116136-trouble-ethanalyzer-nexus7000-00.html
Thank you