
The other problem with the "Bayes with flat prior = frequentist maximum likelihood" idea is that, even if you ignore the issues with improper priors, the concept of a "flat prior" is inherently dependent on arbitrary choices in the way a model is parameterised.

It's not possible for a prior to be "flat" with respect to all re-parameterisations of a continuous parameter in a model. E.g. a flat prior for the variance isn't flat for its inverse (precision) or its square root (the std. dev.), and the choice of which of these alternative parameterisations you use to express the unknown quantity in the model is arbitrary. In the frequentist case it doesn't affect the result of the inference; in the Bayesian case it matters which of the parameterisations you choose your prior to be flat with respect to.
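The variance example is easy to check numerically. Here's a minimal sketch (my own illustration, not from the comment): draw the variance from a flat distribution on (0, 1), then look at the same samples expressed as standard deviations. By the change-of-variables rule, a flat density on σ² induces the density p(σ) = 2σ on σ, which is anything but flat.

```python
import numpy as np

rng = np.random.default_rng(0)

# A "flat prior" on the variance: uniform samples of sigma^2 on (0, 1).
var = rng.uniform(0.0, 1.0, size=100_000)

# The same samples, re-expressed as standard deviations.
sd = np.sqrt(var)

# If the induced distribution on sd were also flat, about half the
# samples would fall below 0.5.  Instead:
#   P(sd < 0.5) = P(var < 0.25) = 0.25
# because flat-in-variance corresponds to p(sd) = 2*sd.
frac_below_half = float(np.mean(sd < 0.5))
print(frac_below_half)  # close to 0.25, not 0.5
```

So the *same* state of "indifference" assigns four times less probability to σ < 0.5 than a flat prior on σ itself would.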



If this seems a bit odd (and it did to me at first!) think about it this way:

Bayesian methods work by averaging over a bunch of different models / different values of the parameters.

What it means to compute a mean depends on the parameterisation in which you do it: simplest example being that an arithmetic mean is not in general the same as a geometric mean, or a harmonic mean.
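To make the mean comparison concrete (a small illustration of my own, using the standard library): the geometric mean is just the arithmetic mean computed in the log parameterisation, and the harmonic mean is the arithmetic mean computed in the reciprocal parameterisation, so all three are "the average" under different choices of scale.

```python
import statistics

data = [1.0, 4.0, 16.0]

arith = statistics.mean(data)            # (1 + 4 + 16) / 3  = 7.0
geom = statistics.geometric_mean(data)   # (1 * 4 * 16)**(1/3) = 4.0
harm = statistics.harmonic_mean(data)    # 3 / (1/1 + 1/4 + 1/16) ≈ 2.29

print(arith, geom, harm)
```

Same data, three different answers, purely because the averaging happens on a different scale each time.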

There's no "neutral" / parameterisation-independent way to specify how this averaging is done, so if you care about the average case, you're going to have to commit to doing it in some particular favoured parameterisation. Choosing that parameterisation is equivalent to choosing the prior.

Frequentist methods avoid the need for this decision; the price they pay is that without a prior they're unable to condition on the observed data. They must consider every parameter value and its resulting sampling distribution separately and can't average over them.
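Putting the two points together in a toy example (a grid-approximation sketch I put together, not from the comment): infer the scale of a zero-mean normal from a few observations, once with a prior flat in σ and once with a prior flat in σ². The likelihood — and hence the maximum-likelihood estimate — is identical in both cases; only the posterior average moves.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(0.0, 2.0, size=10)  # known mean 0, unknown scale sigma

# Grid over the standard deviation.
sigma = np.linspace(0.1, 10.0, 5000)
dsig = sigma[1] - sigma[0]

# Log-likelihood of a N(0, sigma^2) model at each grid point.
loglik = -len(data) * np.log(sigma) - np.sum(data**2) / (2 * sigma**2)
lik = np.exp(loglik - loglik.max())

# Posterior with a prior flat in sigma.
post_flat_sd = lik / (lik.sum() * dsig)

# Posterior with a prior flat in sigma^2: on the sigma axis that
# prior has density proportional to sigma (Jacobian of sigma^2).
post_flat_var = lik * sigma
post_flat_var /= post_flat_var.sum() * dsig

mean_flat_sd = np.sum(sigma * post_flat_sd) * dsig
mean_flat_var = np.sum(sigma * post_flat_var) * dsig

# The likelihood peak (the MLE) is the same either way,
# but the posterior means differ: the flat-in-variance prior
# puts more weight on large sigma and pulls the average up.
print(mean_flat_sd, mean_flat_var)
```

The frequentist answer (the location of the likelihood peak) never notices the reparameterisation; the Bayesian average does, because "flat" means something different on each axis.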



