I was thinking the same thing while reading, but the author does mention them at the end (together with the bee swarm plot or sina plot, which I think is the better version of a violin plot)
I use violin plots, but a complication is that the shape depends on the bandwidth hyperparameter of the kernel density estimator used internally. The plot can differ a lot for different bandwidth values.
Selection of the 'proper' bandwidth is a classic bias-variance tradeoff problem.
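To make the bandwidth point concrete, here is a rough sketch (not what any particular plotting library does internally) that builds a Gaussian kernel density estimate by hand and counts how many bumps the resulting curve would show. The helper names `gaussian_kde` and `count_modes`, the synthetic two-cluster sample, and the three bandwidth values are all made up for illustration.

```python
import math
import random


def gaussian_kde(sample, bandwidth):
    """Return a function that estimates the density with a Gaussian kernel."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return density


def count_modes(sample, bandwidth, lo=-6.0, hi=6.0, steps=241):
    """Count strict local maxima of the KDE evaluated on an even grid."""
    density = gaussian_kde(sample, bandwidth)
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    ys = [density(x) for x in grid]
    return sum(1 for i in range(1, steps - 1) if ys[i - 1] < ys[i] > ys[i + 1])


# Synthetic bimodal data (an assumption for the demo): two clusters at -2 and +2.
rng = random.Random(0)
sample = [rng.gauss(-2.0, 0.5) for _ in range(100)]
sample += [rng.gauss(2.0, 0.5) for _ in range(100)]

# Small bandwidth: low bias, high variance (many spurious bumps).
# Large bandwidth: low variance, high bias (the two clusters blur into one).
for bw in (0.05, 0.5, 3.0):
    print(f"bandwidth={bw}: {count_modes(sample, bw)} local maxima")
```

The same data would draw as a spiky violin, a two-lobed violin, or a single smooth blob depending only on that one parameter, which is the bias-variance tradeoff in action.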
While true, that's not an additional problem compared to box plots, which effectively just set the bandwidth to the maximum. So IMO violin plots are strictly better.
I agree, but so do box plots. I think probably the best thing is violin plots when there's lots of data and bee swarm plots when there isn't. But either is better than a box plot.
https://en.m.wikipedia.org/wiki/Violin_plot