To be more accurate, binary trees are constructed only after a bucket ends up containing a small number of elements, which should only happen rarely unless a bad hash function is chosen.
That's defending against hash collisions which isn't what the grandparent is noticing (the slowing down of memory accesses due to not fitting into cache).
I think they implemented that change to deal with a denial of service attack in http servers. A malicious client could send a large list of parameters with an identical String::hashCode() value, resulting in a single large bucket. Other workarounds included a size limit on the parameter list or a hash function with some parameters randomized at program start.
https://stackoverflow.com/questions/24673567/change-to-hashm...