Very interesting. Does this mean certain optimizations run before the monomorphization step? Do you know why? Compilation performance is the obvious thing that comes to mind.
Yes, there are optimizations that run before monomorphization; this one happens during macro expansion.
It's a bit of a chicken-and-egg problem. To monomorphize clone(), someone must emit an implementation first, but an optimal implementation requires analyses that aren't available until later in the pipeline. Here the optimization kicks in for types that also derive Copy, but a single generic parameter is enough to defeat it.
IIRC, that "optimization" mostly avoids wasting time compiling a complex `Clone` implementation, when simply returning `*self` suffices (there are some crates with a lot of `#[derive(Copy, Clone)]` types). We try to avoid having a lot of logic like that too early, for precisely the reasons you mention.
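For concreteness, here is roughly what the two cases boil down to. This is a hand-written sketch of what `#[derive(Clone)]` effectively produces, not its literal expansion, and the `Concrete`/`Generic` names are made up:

    // Concrete type that also derives Copy: the derive can emit a trivial
    // shallow clone, effectively just `*self`.
    #[derive(Copy)]
    struct Concrete(u8, u32);

    impl Clone for Concrete {
        fn clone(&self) -> Self {
            *self
        }
    }

    // Add a type parameter and the derive falls back to the general
    // field-by-field implementation with a `T: Clone` bound, even if every
    // actual instantiation ends up being Copy.
    #[derive(Copy)]
    struct Generic<T>(T, u32);

    impl<T: Clone> Clone for Generic<T> {
        fn clone(&self) -> Self {
            Generic(self.0.clone(), self.1.clone())
        }
    }

    fn main() {
        let c = Concrete(1, 2).clone();
        let g = Generic(3u8, 4).clone();
        assert_eq!((c.0, g.0), (1, 3));
        assert_eq!(c.1 + g.1, 6);
    }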
I'd be interested in an example where LLVM can't optimize the general version, as it means we might want to do this through MIR shims instead (which can be generated when collecting the monomorphic instances to codegen; this is what happens when you clone a tuple or closure, for example).
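To make the tuple/closure case concrete: neither Clone implementation used below is written anywhere in source or produced by a derive; per the above, the compiler generates them as shims when it collects the monomorphic instances.

    fn main() {
        // The Clone impl for this tuple type is a compiler-generated shim.
        let t = (1u8, String::from("hi"));
        let t2 = t.clone();

        // Same for the closure: it is Clone because its capture (a String)
        // is, and the impl is again a generated shim, not library code.
        let s = String::from("captured");
        let f = move || s.len();
        let f2 = f.clone();

        assert_eq!(t2, (1u8, String::from("hi")));
        assert_eq!(f2(), 8);
    }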
The behavior differs between the current nightly and rustc 1.45.2; the nightly available when this link was posted matched the 1.45.2 behavior.
The output with 1.45.2 is as follows:
example::clone_concrete:
        mov eax, edi
        ret

example::clone_abstract:
        mov ecx, edi
        and ecx, -256
        xor eax, eax
        xor edx, edx
        cmp dil, 1
        sete dl
        cmove eax, ecx
        or eax, edx
        ret
Fascinating coincidence! It was probably the LLVM upgrade (https://github.com/rust-lang/rust/pull/73526) landing, most likely before the comment was even posted (though a nightly with the upgraded LLVM would only show up the next day).
For example, see how switching from a concrete to a generic type increases the size of clone() by 4x: https://rust.godbolt.org/z/qbYr3v
This is because, at the point where the compiler synthesizes clone(), it has less information about the type.
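For reference, the comparison has roughly this shape. The function names match the symbols in the assembly above, but the `Concrete`/`Generic` enums are a hypothetical stand-in, not the actual code behind the godbolt link:

    // Two versions of the same enum; the only difference is the type parameter.
    #[derive(Copy, Clone)]
    pub enum Concrete {
        A,
        B(u8),
    }

    #[derive(Copy, Clone)]
    pub enum Generic<T> {
        A,
        B(T),
    }

    // Clone here is derived as a shallow `*self`, so this is just a move.
    pub fn clone_concrete(x: Concrete) -> Concrete {
        x.clone()
    }

    // Clone here goes through the general derived implementation, which the
    // backend then has to optimize back down (or not, as the 1.45.2 output
    // above shows).
    pub fn clone_abstract(x: Generic<u8>) -> Generic<u8> {
        x.clone()
    }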