Do you think the 8086/8088 were well-optimised/efficient in general? In other words, had the designers spent some more effort, could they have reduced the transistor count and the number of cycles the instructions take without changing the functionality?
The 8086/8088 look pretty well-optimized in many ways: the design of the instruction set, the architecture, the microcode, and the silicon layout. Studying them closely, I've seen multiple things that are more clever than I expected, and I haven't seen anything that makes me shake my head in dismay.