What I recall is that there were several layers to handle receiving actions from users and broadcasting state to other users, others to represent entities and a few more to represent locations.
Like WoW in the early and middle years, the Open World feel is a bit of a fiction. WoW started with sharding akin to availability zones, but at ten million subscribers there’s only so much sharding you can do before people can’t meet each other. There were artificial choke points designed into both games that could be leveraged to offload some of the most intense interactions to separate hardware. Later WoW introduced phasing, which turned the map into a four-dimensional space where actions were not seen by all observers (and map state could vary based on quest progress). The grouping system at that point had more situations where it could match you up with anyone in the same region, not just the same availability zone, with or without moving you to new hardware to do so.
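To make the phasing idea concrete, here is a minimal sketch of how a broadcast layer might filter observers. I have no knowledge of how Blizzard actually implemented it; `Player`, `phase`, and `observers_of` are hypothetical names, and treating the phase as a single integer derived from quest progress is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Player:
    name: str
    zone: str   # map location, e.g. a town or a valley
    phase: int  # hypothetical: a value derived from quest progress


def observers_of(actor, players):
    """Return the players who should see the actor's actions.

    Two players only observe each other when they share both the
    location and the phase, which is the extra "dimension".
    """
    return [
        p for p in players
        if p is not actor and p.zone == actor.zone and p.phase == actor.phase
    ]


players = [
    Player("ann", "valley", 1),
    Player("bob", "valley", 1),
    Player("cat", "valley", 2),  # same place, different quest progress
    Player("dan", "town", 1),    # same phase, different place
]
visible = [p.name for p in observers_of(players[0], players)]
```

Here only `bob` ends up in `visible` for `ann`, even though `cat` is standing in the same valley. A side effect of the filter is that each broadcast fans out to fewer observers.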
Everyone noted how brilliant it was going to be for scaling the system up, but I’ve found that the best scaling designs are bidirectional. Both of these games made a system where they could respond to post-content-release flash crowds, and also power more hardware down. The other thing Blizzard did, which I found particularly clever for a non-cloud world: before a major content release they would upgrade hardware, migrate users to the new machines, use the old hardware for load shedding tasks, release the content, wait for traffic to subside, then decommission the old hardware to get back to steady state. You get to test the new hardware before the Big Show, and rely on burned-in hardware to get you through to the other side. Very good operational intelligence IMO.
In Blizzard’s case some of the scaling tricks I mentioned earlier also eased the problem of users getting orphaned as declining subscriptions were spread out across too many AZs. In a pinch when traffic was low, you had access to the entire region (though I can’t say if that was controlled algorithmically or not). Previously the best they could do was a sort of traffic shaping by offering to move people from one server to another for free, where they usually charged a fee for doing so. That could take months to achieve load shedding. My suspicion is that at some points they had pools of hardware that belonged to the region and were parceled out between zones based on load. That allows you to absorb flash mobs better because those will be isolated to one zone instead of dependent on game state or time of day (after work on Friday everywhere with a low ping to that region, for instance).
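The regional pool is only my suspicion, but the idea can be sketched as follows. Everything here is hypothetical: `parcel_out`, the zone names, and the load numbers are made up, and splitting the pool in proportion to load (with largest-remainder rounding) is just one plausible policy, not something I know Blizzard used.

```python
def parcel_out(pool_size, zone_load):
    """Split a region-owned pool of servers between zones by load.

    Uses largest-remainder rounding so the whole pool is always
    assigned and no zone gets a fractional machine.
    """
    total = sum(zone_load.values())
    if total == 0:
        return {zone: 0 for zone in zone_load}

    # Ideal (fractional) share of the pool for each zone.
    exact = {zone: pool_size * load / total for zone, load in zone_load.items()}
    alloc = {zone: int(share) for zone, share in exact.items()}

    # Hand the leftover machines to the zones with the largest remainders.
    leftover = pool_size - sum(alloc.values())
    by_remainder = sorted(zone_load, key=lambda z: exact[z] - alloc[z], reverse=True)
    for zone in by_remainder[:leftover]:
        alloc[zone] += 1
    return alloc


# Quiet evening: load is even, so the pool is spread evenly.
quiet = parcel_out(10, {"zone_a": 100, "zone_b": 100, "zone_c": 100})

# Flash mob in one zone: most of the pool follows the load.
flash = parcel_out(10, {"zone_a": 700, "zone_b": 100, "zone_c": 100})
```

In the flash-mob case `zone_a` receives 8 of the 10 pooled machines while the other zones keep one each. This only helps when the spike is local to one zone; a spike that hits every zone at once leaves nothing to borrow.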