It can do it already; the trick is to prompt it to approach the refactor the way a human would.
1. Use a temp file as the reference for the entire refactor.
2. Make it plan the entire thing: tell it to keep high-level and low-level checklists, to take notes for itself, and to use the temp file as a scratchpad for those notes and for storing code blocks.
3. Tell it to make small, incremental changes and to work bottom-up.
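
For what it's worth, here is a rough sketch of what seeding that temp file could look like. Everything in it is my own assumption for illustration: the function names `build_scratchpad` and `write_scratchpad`, the section headings, and the checkbox format are all made up, and any layout the model can read and update would probably work just as well.

```python
import tempfile
from pathlib import Path


def build_scratchpad(goal, high_level, low_level):
    """Render a plain-text scratchpad: goal, two checklists, notes, code blocks.

    The layout is an illustrative assumption, not a required format.
    """
    lines = [f"REFACTOR GOAL: {goal}", "", "HIGH-LEVEL CHECKLIST"]
    lines += [f"[ ] {step}" for step in high_level]
    lines += ["", "LOW-LEVEL CHECKLIST (bottom-up: leaf modules first)"]
    lines += [f"[ ] {step}" for step in low_level]
    # Empty sections for the model to fill in as it works.
    lines += ["", "NOTES", "", "CODE BLOCKS", ""]
    return "\n".join(lines)


def write_scratchpad(text, directory=None):
    """Write the scratchpad to a new temp file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt", prefix="refactor_", dir=directory)
    with open(fd, "w", encoding="utf-8") as fh:
        fh.write(text)
    return Path(path)


if __name__ == "__main__":
    pad = build_scratchpad(
        "replace old_call with new_call",
        ["list affected modules", "refactor helpers", "refactor callers"],
        ["utils.py: swap old_call", "api.py: update imports"],
    )
    print(write_scratchpad(pad))
```

You would then point the model at the printed path and ask it to tick items off and append notes there as it goes.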
I find that even Copilot can do this pretty quickly if you do one example refactor and then prompt it to repeat that example on all the files returned by a find/search.
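
A minimal sketch of that "one example, then repeat" loop, assuming the search is a plain substring match over `.py` files. The helper names `find_files` and `build_prompt` and the prompt wording are hypothetical; it only builds the prompts, and how you actually send them to Copilot or another agent is up to your setup.

```python
from pathlib import Path


def find_files(root, needle, suffix=".py"):
    """Return files under root with the given suffix whose text contains needle."""
    hits = []
    for path in sorted(Path(root).rglob(f"*{suffix}")):
        if path.is_file() and needle in path.read_text(encoding="utf-8", errors="ignore"):
            hits.append(path)
    return hits


def build_prompt(before, after, target):
    """Build one prompt that shows the worked example and names one target file."""
    return (
        "Here is one example refactor.\n"
        f"Before:\n{before}\n"
        f"After:\n{after}\n"
        f"Apply the same change to {target}. "
        "Keep the edit small and change nothing else."
    )


if __name__ == "__main__":
    # Hypothetical example: every file still calling old_call() gets its own prompt.
    for path in find_files(".", "old_call("):
        print(build_prompt("old_call(x)", "new_call(x)", path))
        print("-" * 40)
```

One prompt per file keeps each change small, which lines up with the incremental approach in step 3.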
Since that's already a huge speedup, I'm sure many of these agents can do the same.