
Fixing AGV WiFi Roaming Disconnects: Why Your PLC Watchdog Times Out

Fixing AGV WiFi Roaming Disconnects with 802.11r

Are you struggling with AGV WiFi roaming disconnects? This is an infuriatingly predictable situation in a high-speed warehouse. Your Automated Guided Vehicle (AGV) or Autonomous Mobile Robot (AMR) is effortlessly gliding up the main aisle of the facility. It approaches the boundary between the assembly area and the warehouse—the exact physical midpoint between Access Point (AP) 1 and Access Point 2. The signal strength indicator shows full bars. The network link light is solid green.


And then, without warning, the AGV slams on its emergency brakes. The SCADA screen flashes red. The Siemens or Allen-Bradley PLC logs a Communication Fault, a Watchdog Timeout, or an I/O Device Failure. You check the system logs, reboot the robot, and it runs perfectly… until it crosses that exact same invisible line again.

The Root Cause of AGV WiFi Roaming Disconnects: The “Phantom Disconnect”

If you have spent hours replacing antennas, tweaking RF transmission power, or auditing your ladder logic, you need to step back. The problem is almost certainly not your control logic, and it isn’t a dead zone in your factory’s WiFi coverage. You are experiencing a network phenomenon known on the factory floor as a Phantom Disconnect.

🔍 Diagnostic Definition: The Phantom Disconnect

A state where the wireless link (Layer 1/Layer 2) appears completely intact, as evidenced by green link lights and strong RSSI values, but no data is actually forwarded while the client re-authenticates with the new access point. During this critical pause, time-sensitive packets like Modbus TCP keep-alives or Profinet I/O frames are silently dropped, triggering deterministic safety systems to instantly halt the machinery.

When an AGV moves through a large industrial space, it must hand over its connection from one access point to another to maintain continuous coverage. Standard enterprise and consumer WiFi networks execute this handover using a “break-before-make” logic. The AGV’s wireless client must drop its connection with AP 1 before it can begin negotiating a new connection with AP 2. For laptops downloading emails or smartphones streaming a video, this interruption is completely invisible due to software buffering. For hard real-time industrial control systems, it is catastrophic.

The 2-Millisecond Window: Why Industrial Protocols Can’t Survive a Normal AP Switch

To understand why the machinery stops dead in its tracks, we have to look at the mathematical mismatch between IT network protocols and OT (Operational Technology) safety tolerances.

With standard WPA2-PSK or WPA2-Enterprise security, when your AGV roams from AP 1 to AP 2 it has to re-authenticate from scratch. It reassociates with the new AP and then completes the full 4-way cryptographic handshake (preceded, under WPA2-Enterprise, by an 802.1X/EAP exchange), which verifies credentials and derives fresh session keys before any data can flow. In a noisy manufacturing facility, with plenty of 2.4GHz interference from welding arcs and VFDs (Variable Frequency Drives), this process can easily take 300 to over 500 milliseconds.

Now, compare that to your PLC. Industrial Ethernet protocols are entirely deterministic. They rely on strict, predictable timing. A typical Profinet I/O update cycle might be configured for just 2 milliseconds. If the PLC misses a predefined number of consecutive update cycles—often triggering a watchdog threshold as low as 32ms or 50ms—the controller logically assumes the remote I/O node has been physically severed or compromised.
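The mismatch is easy to put into numbers. The following is a minimal sketch, not a vendor tool: the function names are made up for illustration, and the inputs are simply the figures quoted in this article (a 2 ms update cycle, 32 ms and 50 ms watchdog thresholds, 300 to 500 ms standard handovers, and a roughly 20 ms fast-roaming handover).

```python
def missed_cycles(handover_ms: float, cycle_ms: float) -> int:
    """Whole I/O update cycles that elapse while the link is re-authenticating."""
    return int(handover_ms // cycle_ms)


def watchdog_trips(handover_ms: float, watchdog_ms: float) -> bool:
    """True if the communication gap outlasts the PLC watchdog threshold."""
    return handover_ms > watchdog_ms


CYCLE_MS = 2.0  # Profinet I/O update cycle quoted above

if __name__ == "__main__":
    for watchdog_ms in (32.0, 50.0):
        for handover_ms in (20.0, 300.0, 500.0):
            lost = missed_cycles(handover_ms, CYCLE_MS)
            verdict = "FAULT" if watchdog_trips(handover_ms, watchdog_ms) else "ok"
            print(
                f"watchdog {watchdog_ms:>4.0f} ms | handover {handover_ms:>5.0f} ms "
                f"| {lost:>3d} cycles missed | {verdict}"
            )
```

Plugging in your own cycle time and watchdog setting gives a quick sanity check on how much roaming delay your controller can actually tolerate.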

Your AGV isn’t breaking down; it is doing exactly what it was programmed to do: fail safe when communications are lost. The consumer-grade WiFi equipment managing the handover is simply too slow for industrial timing requirements.

Prove It Yourself: Spotting the Delay in Wireshark

You don’t have to take our word for it. If you want to definitively prove to your IT department that the wireless infrastructure is the root cause of the AGV stoppages, you can capture the exact moment of failure using a packet sniffer like Wireshark.

  1. Set up a capture point: Connect your laptop to a mirrored port on the switch that feeds the APs, or run a wireless capture interface in monitor mode near the “overlap zone” where the AGV typically stops. The handshake frames travel over the air, so the monitor-mode capture is usually the one that shows them; the mirror port shows the PLC traffic going unanswered.
  2. Filter for EAPOL: In Wireshark, set your display filter to eapol (Extensible Authentication Protocol over LAN). This will isolate the WPA2 4-way handshake frames.
  3. Analyze the Delta Time: Measure the time difference between Message 1 (sent by the AP to the AGV) and Message 4 (the last acknowledgment from the AGV).

In a standard roam, you will typically see a gap of several hundred milliseconds in the Delta Time column. If you clear the filter and look inside that gap, you will see the PLC sending TCP Retransmissions or Profinet Alarm frames like it’s going out of style, none of which get a reply, because the encrypted session has not been established yet. The smoking gun is in the packet capture.
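If you would rather not read the Delta Time column by eye, the same measurement can be scripted. This is a rough sketch under stated assumptions: it expects you to have already exported the handshake frames as (timestamp in seconds, handshake message number) pairs, and the function name and sample timestamps are hypothetical, not taken from a real capture.

```python
from typing import Iterable, Optional, Tuple


def handshake_gap_ms(frames: Iterable[Tuple[float, int]]) -> Optional[float]:
    """Milliseconds between the first Message 1 and the following Message 4.

    `frames` holds (timestamp_seconds, message_number) pairs in capture order.
    Returns None if the capture never shows a complete handshake.
    """
    start: Optional[float] = None
    for timestamp, message in frames:
        if message == 1 and start is None:
            start = timestamp  # keep the first attempt so retries count as delay
        elif message == 4 and start is not None:
            return (timestamp - start) * 1000.0
    return None


if __name__ == "__main__":
    # Hypothetical timestamps for illustration only.
    sample = [(10.000, 1), (10.120, 2), (10.310, 3), (10.412, 4)]
    gap = handshake_gap_ms(sample)
    print("incomplete handshake" if gap is None else f"handshake took {gap:.0f} ms")
```

Comparing the returned value against your PLC watchdog threshold turns the packet capture into a single pass/fail number you can hand to IT.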


[Figure: PLC Watchdog Timeout vs. Seamless Roaming. Panels: Standard WiFi (Break-before-make), AP 1 to AP 2; 802.11r Fast Transition (Make-before-break), AP 1 to AP 2.]

Standard roaming halts in the middle (red) to negotiate keys, triggering the PLC fault. 802.11r pre-authenticates for a glide-through.

“Physics is the Enemy”: Breaking the “Sticky Client” Trap

Faced with these frustrating roaming drops, the knee-jerk reaction of many maintenance engineers is to try and brute-force the physics. They remove the stock 3dBi antennas on the AGV’s onboard router and replace them with massive, high-gain 10dBi antennas, praying that a “stronger signal” will fill in the gap and avoid the disconnect from occurring.

In the metal- and concrete-heavy environments of modern warehousing, this is the worst possible adjustment you can make. Increasing the antenna gain artificially inflates the perceived signal strength and exacerbates a condition known as Sticky Client Syndrome.

Wireless clients (like the router on your AGV) make their own autonomous decisions about when to roam based on the RSSI (Received Signal Strength Indicator) of the current AP. If you put a massive antenna on the AGV, it will hold onto the fading signal of AP 1 long after it has physically driven under AP 2. The AGV becomes “sticky.” It clings to a heavily degraded, error-prone connection through metal warehouse racks, dropping packets frantically rather than cleanly letting go and negotiating with the strong AP directly above it.
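A toy model makes the stickiness visible. The sketch below assumes a simple log-distance path loss model and a client that only starts looking for a new AP once the current AP falls below a fixed RSSI threshold. Every constant (transmit power, path loss exponent, AP height and spacing, the -65 dBm threshold) is an illustrative assumption, not a measurement, and real clients use vendor-specific roaming logic.

```python
import math
from typing import Optional

# Illustrative assumptions, not measured values.
TX_POWER_DBM = 20.0         # AP transmit power
REF_LOSS_DB = 40.0          # path loss at 1 m
PATH_LOSS_EXP = 3.0         # cluttered indoor environment
AP_HEIGHT_M = 6.0           # AP mounting height above the AGV antenna
AP1_X_M = 0.0               # AGV starts directly under AP 1
AP2_X_M = 60.0              # AP 2 is 60 m down the aisle
ROAM_THRESHOLD_DBM = -65.0  # client starts roaming below this RSSI


def rssi_dbm(agv_x_m: float, ap_x_m: float, antenna_gain_dbi: float) -> float:
    """Received signal strength under a log-distance path loss model."""
    distance_m = math.hypot(agv_x_m - ap_x_m, AP_HEIGHT_M)
    path_loss_db = REF_LOSS_DB + 10.0 * PATH_LOSS_EXP * math.log10(distance_m)
    return TX_POWER_DBM + antenna_gain_dbi - path_loss_db


def roam_trigger_position(
    antenna_gain_dbi: float, track_m: float = 80.0, step_m: float = 0.5
) -> Optional[float]:
    """First position along the aisle where AP 1 drops below the roam threshold."""
    for i in range(int(track_m / step_m) + 1):
        x = i * step_m
        if rssi_dbm(x, AP1_X_M, antenna_gain_dbi) < ROAM_THRESHOLD_DBM:
            return x
    return None  # the client never lets go within the modelled aisle


if __name__ == "__main__":
    for gain in (3.0, 10.0):
        print(f"{gain:>4.0f} dBi antenna: roam triggers at {roam_trigger_position(gain)} m")
```

Under these assumptions, the 3 dBi antenna lets go of AP 1 well before the AGV reaches AP 2, while the 10 dBi antenna keeps holding on until the AGV has already driven past it.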

The Industrial Antidote: Fixing AGV WiFi Roaming Disconnects with 802.11r

You cannot fix a protocol issue with bigger antennas. The definitive solution to the AGV roaming problem is IEEE 802.11r-2008, the fast-roaming amendment to the 802.11 standard. It was originally driven by latency-sensitive traffic such as voice over WiFi, and the same mechanism suits time-critical industrial control traffic.

802.11r introduces Fast BSS Transition (FT). Instead of forcing the AGV to repeat the full authentication after it leaves AP 1, 802.11r lets the network infrastructure derive keys from the Pairwise Master Key (PMK) ahead of time and distribute them in the background to the access points that share a Mobility Domain, so the new AP already holds the key material when the AGV arrives.

Figure 1: Standard WiFi roaming delay vs. 802.11r Fast Transition seamless handover.

| Roaming Characteristics | Standard Consumer/IT WiFi | 802.11r Industrial WiFi |
| --- | --- | --- |
| Handover Logic | Break-before-make (Disconnects completely first) | Make-before-break (Pre-authenticates via backend) |
| Authentication Delay | 300 ms – 500+ ms (Requires full 4-way handshake) | < 20 ms (Utilizes cached PMK keys) |
| Impact on Deterministic PLC | Watchdog timer expires; AGV triggers Emergency Stop | Completely transparent to PLC; AGV continues motion |

Case Study: Zero Downtime in an Automotive Logistics Center

Theoretical standards are excellent, but how does this perform on the floor? Consider a Tier-1 automotive parts manufacturer operating a fleet of 50+ AGVs. Their chassis were equipped with standard commercial Wi-Fi bridges. Due to the high density of steel stamping machines and constant layout changes, their AGVs experienced an average of 15 network-related stoppages per shift. Every stoppage required a human operator to physically walk onto the floor, reset the safety relay, and manually clear the SCADA fault, costing thousands in lost throughput.

The facility replaced the onboard commercial bridges with robust industrial-grade Wi-Fi routers that natively support 802.11r, adopting a decoupled AGV onboard network architecture in the process. By configuring the APs and the AGV routers to share the same Mobility Domain and enabling FT Over-the-Air, the roaming handover times dropped from an erratic 450ms to a consistent 12ms. The result? Network-related AGV stoppages dropped to zero immediately after deployment.

Frequently Asked Questions (AGV Roaming)

Q: Does enabling 802.11r cause connection issues for legacy AGV clients?
A: Yes, some older non-802.11r wireless clients, such as the radio modules on legacy equipment, might fail to parse the modified beacon frames broadcast by 802.11r-enabled access points, causing them to drop off the network entirely. To prevent this, we recommend segregating legacy clients onto a dedicated SSID that does not have 802.11r enabled, or upgrading edge devices using a Serial-to-Ethernet Bridge.

Q: Should I use 2.4GHz or 5GHz for industrial AGV roaming?
A: 5GHz offers higher throughput and avoids congested IT network channels, whereas 2.4GHz has much better RF penetration through concrete pillars and metal warehouse racking. For ground-level AGV roaming, 2.4GHz with 802.11r FT is generally preferred for link stability, and 5GHz is better for stationary line-of-sight wireless backhaul.

Q: What is the difference between 802.11r, 802.11k, and 802.11v?
A: They work together but serve different roles in the roaming ecosystem. 802.11r handles the actual fast authentication (key caching) to prevent timeouts. 802.11k (Radio Resource Measurement) helps the AGV quickly scan and find the best nearby APs without wasting time on blind searches. 802.11v (BSS Transition Management) allows the network infrastructure to actively recommend the best AP to the AGV based on current AP load. For directly fixing PLC watchdogs, 802.11r is the most critical piece.

Q: Can I implement Fast Transition (FT) across different router brands?
A: Technically yes, if all APs and clients share the same Mobility Domain, SSID, and WPA2 passphrase. However, in an OT (Operational Technology) environment, mixing consumer-grade access points with industrial routers often breaks the pre-authentication cache due to proprietary firmware tweaks. We strongly recommend standardizing on a single industrial routing architecture for mission-critical mobile assets.
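Before a mixed-vendor rollout, it can help to audit the settings that have to match. The following is only a sketch: the `RadioConfig` record and its field names are hypothetical (you would fill them in by hand from each device's configuration page), and the passphrase comparison is deliberately left out so that no secrets end up in a script.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class RadioConfig:
    """Hypothetical summary of one AP or AGV client, entered by hand."""
    name: str
    ssid: str
    mobility_domain: str  # often shown as a 4-digit hex ID
    ft_enabled: bool


def ft_mismatches(configs: Sequence[RadioConfig]) -> List[str]:
    """List settings that differ from the first device and would block FT roaming."""
    problems: List[str] = []
    if not configs:
        return problems
    reference = configs[0]
    for cfg in configs:
        if not cfg.ft_enabled:
            problems.append(f"{cfg.name}: Fast Transition is disabled")
        if cfg.ssid != reference.ssid:
            problems.append(f"{cfg.name}: SSID differs from {reference.name}")
        if cfg.mobility_domain.lower() != reference.mobility_domain.lower():
            problems.append(f"{cfg.name}: Mobility Domain differs from {reference.name}")
    return problems


if __name__ == "__main__":
    fleet = [
        RadioConfig("AP 1", "plant-agv", "a1b2", True),
        RadioConfig("AP 2", "plant-agv", "A1B2", True),
        RadioConfig("AGV router", "plant-agv", "a1b3", True),
    ]
    for problem in ft_mismatches(fleet):
        print(problem)
```

An empty result does not guarantee interoperability, since firmware differences can still break key distribution, but a non-empty one points to a mismatch worth fixing before you test on the floor.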

💡 Looking for a comprehensive architecture? To explore our complete distributed modular design and learn how to decouple networks for maximum resilience, please visit our AGV Onboard Network Solutions.

Stop Guessing. Start Testing.

Don’t let standard consumer Wi-Fi behavior dictate your factory’s uptime. Equip your toughest AGV route with an industrial routing unit designed for sub-20ms seamless roaming.
