The most time-consuming part of running the balancer, or calc_pg_upmaps in particular, that I can think of is
the re-calculation of each pg mapping. For large clusters, that might take seconds or even minutes to finish.
I think a better fix would be to introduce a pg mapping cache, so that we don't have to re-calculate all pg mappings
if the osdmap epoch has not changed, plus some methods to update it in parallel (https://github.com/ceph/ceph/pull/28373).
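
To make the idea concrete, here is a rough sketch of what I mean by an epoch-keyed cache. It is Python only for brevity and is not taken from the Ceph tree: `PgMappingCache`, its `get` method, and the `compute` callback (standing in for the expensive per-pg mapping calculation) are all hypothetical names, and the parallel update part is left out.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical pg identifier: (pool id, placement seed).
PgId = Tuple[int, int]
# Hypothetical mapping result: the ordered OSD ids a pg maps to.
OsdSet = Tuple[int, ...]


class PgMappingCache:
    """Sketch of a pg -> OSD mapping cache that is valid for one osdmap epoch."""

    def __init__(self, compute: Callable[[PgId], OsdSet]) -> None:
        # `compute` stands in for the expensive mapping calculation (assumption).
        self._compute = compute
        self._epoch: Optional[int] = None
        self._mappings: Dict[PgId, OsdSet] = {}

    def get(self, epoch: int, pgid: PgId) -> OsdSet:
        # A new epoch may change any mapping, so drop everything cached so far.
        if epoch != self._epoch:
            self._mappings.clear()
            self._epoch = epoch
        # Within the same epoch, compute each pg mapping at most once.
        if pgid not in self._mappings:
            self._mappings[pgid] = self._compute(pgid)
        return self._mappings[pgid]
```

With something like this, repeated balancer passes over an unchanged osdmap would only pay the mapping cost once per pg, and the whole cache is thrown away as soon as the epoch moves.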