
Nine gotchas after upgrading a Cat9K fleet in install-mode

Lessons captured after the project closed -- the kind of detail that doesn't make it into the Cisco docs.

Posted 2026-05-21 · Cisco Catalyst 9K / IOS-XE 17.x / Ansible

We upgraded 13 Cat9300X / 9500 switches in install-mode and came out with a list of things we wish someone had told us up front. In rough order of "most likely to bite you":

1. The 6-hour auto-abort timer is real

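If you don't issue install commit within 360 minutes of install activate, the switch rolls back to the previous image on its own, and the rollback involves a reload. That safety net sounds great until a post-upgrade debug session runs past the deadline. show install summary reports the time remaining before rollback, so check it before you start anything long.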
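
To keep that window visible during a long session, it helps to compute the deadline the moment install activate goes out. A minimal Python sketch, assuming the 360-minute window described above; the function names are ours for illustration, not part of any Cisco tooling:

```python
from datetime import datetime, timedelta

# Assumption: the 360-minute auto-abort window described in the text.
AUTO_ABORT_WINDOW = timedelta(minutes=360)


def commit_deadline(activated_at: datetime,
                    window: timedelta = AUTO_ABORT_WINDOW) -> datetime:
    """Return the moment the switch would roll back if nothing is committed."""
    return activated_at + window


def minutes_left(activated_at: datetime, now: datetime,
                 window: timedelta = AUTO_ABORT_WINDOW) -> int:
    """Whole minutes before auto-abort; negative once the window has passed."""
    remaining = commit_deadline(activated_at, window) - now
    return int(remaining.total_seconds() // 60)


if __name__ == "__main__":
    activated = datetime(2026, 5, 1, 9, 0)   # hypothetical activation time
    print(commit_deadline(activated))
    print(minutes_left(activated, datetime(2026, 5, 1, 14, 30)))
```

Treat the result as a reminder only; the switch's own timer is the authority.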

2. install commit needs an interactive shell

You can't fire-and-forget it as a one-shot SSH exec command. Open an interactive shell with paramiko invoke_shell(), run enable, then send install commit -- otherwise AAA exec authorization may quietly refuse the elevation.
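
The interactive flow can be sketched roughly as follows. This is not our production code. The prompt markers (">", "Password:", "#"), the timeouts and the helper names are assumptions to adjust for your AAA setup. paramiko is imported lazily so the prompt-handling helpers can be exercised without it:

```python
import time


def read_until(channel, marker: str, timeout: float = 30.0, poll: float = 0.2) -> str:
    """Collect shell output until `marker` appears or `timeout` seconds elapse."""
    buffer = ""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if channel.recv_ready():
            buffer += channel.recv(65535).decode("utf-8", errors="replace")
            if marker in buffer:
                return buffer
        else:
            time.sleep(poll)
    raise TimeoutError(f"did not see {marker!r}; last output: {buffer[-200:]!r}")


def commit_interactively(channel, enable_secret: str,
                         commit_timeout: float = 600.0) -> str:
    """Elevate with enable, then send install commit on an interactive shell."""
    read_until(channel, ">")            # assumption: login lands in user EXEC
    channel.send("enable\n")
    read_until(channel, "assword")      # matches "Password:"
    channel.send(enable_secret + "\n")
    read_until(channel, "#")            # privileged EXEC prompt
    channel.send("install commit\n")
    # Assumption: the next "#" is the prompt returning after the commit output.
    return read_until(channel, "#", timeout=commit_timeout)


def open_shell(host: str, username: str, password: str):
    """Open an interactive shell channel with paramiko."""
    import paramiko  # third-party: pip install paramiko

    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())  # lab convenience only
    client.connect(host, username=username, password=password,
                   look_for_keys=False, allow_agent=False)
    return client, client.invoke_shell()
```

Typical use would be client, channel = open_shell(...), then commit_interactively(channel, secret), then client.close(). Check the returned text for the commit result instead of assuming success.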

3. install activate, not install activate file ...

Once install add has staged the image, the bare install activate is the correct form. The longer install activate file ... form fails silently in some 17.x trains.

4. write memory before install add

If you skip this and the add fails part-way, you can lose unsaved config across the reload. Save first.

5. U state vs C state

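show install summary distinguishes "U" (activated and uncommitted) from "C" (activated and committed). They are not the same: a U image is running but will still roll back when the auto-abort timer expires. Don't claim "done" until the state reads C.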
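
A playbook or script can enforce that check instead of relying on someone eyeballing the table. The sketch below parses the package rows of show install summary. The row layout (type, state letter, version) matches what we saw, but the version strings in the sample are illustrative and the helper names are ours:

```python
import re

# One package row, e.g. "IMG   C    17.09.04a.0.6" (type, state, version).
ROW = re.compile(r"^([A-Z_]+)[ \t]+([IUCD])[ \t]+(\S+)[ \t]*$", re.MULTILINE)


def package_states(summary: str):
    """Return (type, state, version) tuples from show install summary output."""
    return ROW.findall(summary.replace("\r", ""))


def is_committed(summary: str) -> bool:
    """True only if an IMG row is in state C and no IMG row is still in state U."""
    states = [state for kind, state, _ in package_states(summary) if kind == "IMG"]
    return "C" in states and "U" not in states


# Illustrative sample, not captured from a real switch.
SAMPLE_UNCOMMITTED = """\
[ Switch 1 ] Installed Package(s) Information:
State (St): I - Inactive, U - Activated & Uncommitted,
            C - Activated & Committed, D - Deactivated & Uncommitted
--------------------------------------------------------------------------------
Type  St   Filename/Version
--------------------------------------------------------------------------------
IMG   I    17.06.05.0.5797
IMG   U    17.09.04a.0.6
"""

if __name__ == "__main__":
    print(package_states(SAMPLE_UNCOMMITTED))
    print(is_committed(SAMPLE_UNCOMMITTED))
```

On a stack, every member prints its own table, so run the check against the whole output rather than the first block only.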

6. AAA exec-chain lockouts

A chain such as aaa authorization exec default group radius local if-authenticated stops at an explicit AUTHZ-REJECT from RADIUS rather than falling through to local; later methods are tried only when an earlier one fails to respond. We locked ourselves out of one switch this way, and recovery required a serial console.
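
The fall-through rule is easy to misread, so here is a toy model of it in Python. It is a simplification for reasoning about method lists, not a description of how IOS-XE implements them; the Result values and evaluate_chain are names we made up:

```python
from enum import Enum


class Result(Enum):
    PASS = "pass"     # method answered and allowed the request
    FAIL = "fail"     # method answered and explicitly rejected it
    ERROR = "error"   # method did not answer (server down, timeout)


def evaluate_chain(methods):
    """Walk a method list: only ERROR moves on to the next method."""
    for name, result in methods:
        if result is Result.ERROR:
            continue            # unreachable method: try the next one
        return name, result     # PASS or FAIL is final; later methods are skipped
    return None, Result.FAIL    # every method errored: a denial in this model


if __name__ == "__main__":
    # RADIUS rejects explicitly: local is never consulted.
    print(evaluate_chain([("radius", Result.FAIL), ("local", Result.PASS)]))
    # RADIUS is unreachable: the chain falls through to local.
    print(evaluate_chain([("radius", Result.ERROR), ("local", Result.PASS)]))
```

The practical upshot is that local is a fallback for a dead server, not a second opinion on a reject.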

7. Image transport: switch-initiated HTTP, not Ansible SCP

SCP push is slow (~140 KB/s) and shares the AAA path. Switch-initiated copy http://... hit ~5.8 MB/s and didn't get killed by ANSIBLE_PERSISTENT_COMMAND_TIMEOUT on transient drops.
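
To put those two rates in perspective, here is the arithmetic as a small Python sketch. The 1 GB image size is a round hypothetical figure, not a measurement from the project, and the rates are taken as decimal units (1 KB = 1000 bytes):

```python
def transfer_minutes(image_bytes: int, bytes_per_second: float) -> float:
    """Minutes needed to move image_bytes at a steady bytes_per_second."""
    return image_bytes / bytes_per_second / 60


IMAGE_BYTES = 1_000_000_000   # hypothetical image size
SCP_RATE = 140_000            # ~140 KB/s, the SCP push rate quoted above
HTTP_RATE = 5_800_000         # ~5.8 MB/s, the switch-initiated HTTP rate

if __name__ == "__main__":
    print(f"SCP : {transfer_minutes(IMAGE_BYTES, SCP_RATE):.0f} min")
    print(f"HTTP: {transfer_minutes(IMAGE_BYTES, HTTP_RATE):.1f} min")
```

Under those assumptions, one image takes about 119 minutes over SCP versus just under 3 minutes over HTTP. That is long enough for a transient drop to matter in the first case and short enough not to in the second.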

8. Separate playbook for inactive-image cleanup

Don't fold install remove inactive into the upgrade playbook. Run it as a separate, deliberate pass once you've confirmed the new image is stable.

9. IPBASE / UNIVERSAL (non-K9) images can't do SNMPv3 priv

Several older 3750s in the same fleet were stuck on auth/noauth only. Audit the image with show version | inc Cisco IOS Software before planning an SNMPv3 push.
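
That audit is easy to script across a mixed fleet. The sketch below pulls the image name out of the show version line and applies the K9 rule of thumb from this gotcha. The regex and function names are ours, so test them against your own output before trusting the result:

```python
import re

# Image name in parentheses, e.g. "(C3750-IPBASEK9-M)". Requiring a hyphen
# skips other parenthesised tokens such as "(55)" or "(fc2)".
FEATURE_SET = re.compile(r"\(([A-Za-z0-9_]+-[A-Za-z0-9_-]+)\)")


def feature_set(version_line: str):
    """Return the image name from a 'Cisco IOS Software' line, or None."""
    match = FEATURE_SET.search(version_line)
    return match.group(1) if match else None


def supports_snmpv3_priv(version_line: str) -> bool:
    """Heuristic from the text: only crypto (K9) images can do SNMPv3 priv."""
    name = feature_set(version_line)
    return name is not None and "K9" in name.upper()


if __name__ == "__main__":
    # Illustrative line, not captured from our fleet.
    line = ("Cisco IOS Software, C3750 Software (C3750-IPBASE-M), "
            "Version 12.2(55)SE12, RELEASE SOFTWARE (fc2)")
    print(feature_set(line), supports_snmpv3_priv(line))
```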

