
Generics can increase code bloat because they give the compiler less information, or the information is available too late.

For example, see how switching from a concrete to a generic type increases the size of clone() by 4x: https://rust.godbolt.org/z/qbYr3v

This is because at the point the compiler synthesizes clone() it has less information about the type.



Very interesting. Does this mean certain optimizations run before the monomorphization step? Do you know why? Compilation performance is the obvious thing that comes to mind.


Yes, there are optimizations that run before monomorphization; this one happens during macro expansion, when #[derive(Clone)] is expanded.

It's a bit of a chicken-and-egg problem. To monomorphize clone(), someone must emit an implementation first. But an optimal implementation requires analyses that aren't available until later in the pipeline. Here the optimization kicks in for types deriving Copy, but a generic parameter is enough to defeat it.
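To make that concrete, here is a rough sketch of the two shapes the derive can produce. The types and function names are made up for illustration (they are not the ones behind the godbolt link), and the bodies are hand-written approximations of the expansion, not rustc's literal output.

```rust
// Hypothetical types, for illustration only.
#[derive(Copy, Clone, Debug, PartialEq)]
struct Concrete {
    tag: bool,
    value: u8,
}

#[derive(Copy, Clone, Debug, PartialEq)]
struct Generic<T> {
    tag: bool,
    value: T,
}

// Approximately what the derive does for a non-generic type that also
// derives Copy: a plain bitwise copy of the whole value.
fn clone_concrete_by_hand(c: &Concrete) -> Concrete {
    *c
}

// Approximately what the derive does once a generic parameter is involved:
// it cannot assume T is Copy at expansion time, so it clones field by field.
fn clone_generic_by_hand<T: Clone>(g: &Generic<T>) -> Generic<T> {
    Generic {
        tag: g.tag.clone(),
        value: g.value.clone(),
    }
}
```

Both functions return the same value for Generic<u8>; the only difference is how much work the backend has to do to recognize the field-by-field version as a plain copy.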


IIRC, that "optimization" mostly avoids wasting time compiling a complex `Clone` implementation when simply returning `*self` suffices (there are some crates with a lot of `#[derive(Copy, Clone)]` types). We try to avoid having a lot of logic like that too early, for precisely the reasons you mention.

I'd be interested in an example where LLVM can't optimize the general version, as it means we might want to do this through MIR shims instead (which can be generated when collecting the monomorphic instances to codegen - this is what happens when you clone a tuple or closure, for example).


Did you mean to link to a different example, or different compiler flags?

The link you provide shows only one function, because LLVM has optimized both to be identical, and deduplicated them.

(If you disable the "Directives" filter, you can see a `.set example::clone_concrete, example::clone_abstract`, which aliases one to the other)


The behavior differs between (the present) nightly and rustc 1.45.2; the nightly available when this link was posted matched the 1.45.2 behavior.

The output with 1.45.2 is as follows:

  example::clone_concrete:
        mov     eax, edi
        ret

  example::clone_abstract:
        mov     ecx, edi
        and     ecx, -256
        xor     eax, eax
        xor     edx, edx
        cmp     dil, 1
        sete    dl
        cmove   eax, ecx
        or      eax, edx
        ret


Fascinating coincidence! The likely cause is the LLVM upgrade (https://github.com/rust-lang/rust/pull/73526), which probably landed before the comment was even posted; the nightly built with the upgraded LLVM would only have shown up the next day.



