Essentially, without generics, any kind of polymorphic code needs to use dynamic dispatch. For example, if you add a generic sort() function for collections, you would need to look up the comparison function for the arguments at runtime. Generic programming - be it through type parameters, ad hoc polymorphism, or whatever - allows a language implementation to know what that comparison function is at compile time and inline it accordingly. The same is true for all kinds of things you may want to do generically: sorting, layout of objects, iteration, etc.
Ultimately, generic typing is a way of expressing the constraints of a program to the compiler more precisely, which means the compiler can do a better job of generating code for specific instances. The downside is that the compiler needs to generate code explicitly for each concrete instance of a generic function or type, which means longer compile times and larger code size.
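To make the point above concrete, here's a minimal Rust sketch (the function name `smallest` is illustrative, not from the thread). Rust monomorphizes generics: each concrete `T` gets its own copy of the function, so the `<` below compiles to a direct comparison instruction rather than a runtime lookup of a comparison function.

```rust
// Generic function: the compiler emits a separate, specialized copy
// for each concrete T used at a call site, so the `<` comparison is
// known at compile time and can be inlined.
fn smallest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut it = items.iter().copied();
    let mut best = it.next()?;
    for x in it {
        if x < best {
            best = x;
        }
    }
    Some(best)
}

fn main() {
    // Each call instantiates its own monomorphized copy:
    // one for i32, one for f64.
    println!("{:?}", smallest(&[3, 1, 2]));     // Some(1)
    println!("{:?}", smallest(&[2.5f64, 0.5])); // Some(0.5)
}
```

The i32 and f64 instantiations share no code at runtime; that duplication is exactly the compile-time/code-size cost mentioned above.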
Edit: I'd also like to point out that JIT compilers and sufficiently-smart AOT compilers can do this sort of inlining too, if they can prove that every combination of runtime types passed into a sort() uses the same comparison function, and inline it as needed. That said, it's the same (or higher) complexity as a generics system, with more pitfalls (for example, if they can't prove it but guess that the comparison function never changes, and the guess turns out wrong, they have to de-optimize).
Monomorphization generally opens the door for faster execution on a modern processor.
If you have a layer of indirection (i.e., no generics, so you need to dispatch at runtime) then you wind up with an additional jump instruction.
While modern processors and branch prediction make jumps relatively cheap, they can’t avoid instruction cache misses if the dispatch is happening frequently.
By removing the layer of indirection, the compiler can choose to inline the implementation instead of using a jmp; this keeps our icache clean, and can lead to faster execution.
However, there’s a real cost to this! Inlining is, in broad strokes, often faster; but that’s not always true if you’re e.g. inlining a large number of instructions into a tight loop. As with all performance, profile to know the truth.
Perhaps what matters more is not the missing jump itself, but the additional optimization passes that run after the inline. Say the compare function takes boxed objects (as in Java's case): the call site would create two new objects, pass them to the compare function, and jump there. After inlining and optimization, the call site might avoid the object creation entirely if the objects would only be destructured inside the compare.
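A rough Rust analogue of that point (function names here are illustrative): with a trait object, every comparison is an indirect call the optimizer usually can't see through; with a generic parameter, the closure's concrete type is known, so its body can be inlined and any temporaries it would have built can be optimized away.

```rust
use std::cmp::Ordering;

// Dynamic version: `cmp` is behind a trait object, so each element
// comparison is an indirect call through a vtable pointer.
fn min_index_dyn(items: &[i32], cmp: &dyn Fn(i32, i32) -> Ordering) -> usize {
    let mut best = 0;
    for i in 1..items.len() {
        if cmp(items[i], items[best]) == Ordering::Less {
            best = i;
        }
    }
    best
}

// Generic version: F is a distinct concrete type per call site, so the
// comparator body can be inlined straight into the loop.
fn min_index<F: Fn(i32, i32) -> Ordering>(items: &[i32], cmp: F) -> usize {
    let mut best = 0;
    for i in 1..items.len() {
        if cmp(items[i], items[best]) == Ordering::Less {
            best = i;
        }
    }
    best
}

fn main() {
    let data = [3, 1, 2];
    // Same answer either way; only the generated code differs.
    println!("{}", min_index_dyn(&data, &|a, b| a.cmp(&b))); // 1
    println!("{}", min_index(&data, |a, b| a.cmp(&b)));      // 1
}
```

This is the Rust spelling of the Java situation in the comment: the indirect version is what a boxed-comparator call looks like, the generic version is what the JIT hopes to recover by speculation.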
The ultra short version of what he is saying is that generics allow you to provide information to the compiler that enables inlining. So instead of doing something like this (pseudo assembly)
PUSH x
PUSH y
CALL compare   ; jump into compare...
CMP x y        ; ...which does the actual comparison
POP            ; ...then clean up and return
You end up with
CMP x y
Of course, this will vary based on the type you're optimizing for, but in all cases you remove the need to push to and pop from the stack for every comparison. I'm not smart enough to talk about other optimizations enabled by generics though.
Anytime you can move a choice from runtime to compile time, you have a chance to speed up the program. If a lot of what your sort routine does at runtime is figuring out the type of the objects being sorted, then moving that choice to compile time can recover all of that time.
Note, this is not just generics. Java could use overloading for years before generics to get similar benefits, and Common Lisp has type annotations that can give similar benefits. Generics are just a much easier and stronger construct for this in many cases.
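Here's a small Rust sketch of that runtime-vs-compile-time choice (the `Value` enum and function names are made up for illustration). The first version checks the element type on every single comparison; the second fixes the type at the call site, so the branch vanishes from the generated code entirely.

```rust
// Runtime choice: a dynamically-typed value whose type must be
// inspected on every comparison, like a sort over untyped objects.
enum Value {
    Int(i64),
    Float(f64),
}

fn less(a: &Value, b: &Value) -> bool {
    match (a, b) {
        (Value::Int(x), Value::Int(y)) => x < y,
        (Value::Float(x), Value::Float(y)) => x < y,
        _ => false, // mixed types: arbitrary choice for this sketch
    }
}

// Compile-time choice: T is fixed at each call site, so the type
// check above simply does not exist in the compiled code.
fn less_static<T: PartialOrd>(a: &T, b: &T) -> bool {
    a < b
}

fn main() {
    println!("{}", less(&Value::Int(1), &Value::Int(2))); // true
    println!("{}", less_static(&1, &2));                  // true
}
```

This is also why overloading and Lisp-style type declarations get similar wins: all three are ways of telling the compiler, before the program runs, which branch of that match it will need.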
This author is making the case in terms of improved performance, which I find interesting.