One of the close colleagues I was alluding to also works in HFT and they do in f...

logicchains · on Jan 25, 2020

I'm curious why they found Julia intractable. In my experience it's much quicker to write than Rust, C++ or Cython. It's also much more expressive than Cython.

Is it because they tried embedding it in C++? That can be painful, because it needs its own thread, and can only have one per process, but it's certainly doable.

mlthoughts2018 · on Jan 26, 2020

I’m not sure what you mean by saying Julia is more expressive than Cython, given that Cython is as expressive as C.

In this shop’s particular situation, it’s mostly the switching costs to Julia that cause it to lose the debate. The firm has lots of systems software, data fetchers, offline analytics jobs, research code, etc. With Python & Cython, they easily write all of it in one ecosystem, build shared libraries that span all these use cases, rely on shared testing frameworks, integration pipelines, packaging, virtual envs, etc.

If Julia offered some kind of crazy game changer advantage that required a huge amount of effort to get in Python/Cython, they might consider breaking off some subsystem that has to have new environment management, new tooling, etc., and is not sharable across as many use cases.

But there is no such case. They might get some sort of “5% more generic” or “5% benefit from seamless typing instead of a little rough around the edges typing in Cython”, and these differences would never justify the huge costs of switching or the missing third party packages that are heavily relied on.

I always like to remind people that in any professional setting, ~95% of the software you write is for reporting and testing, and 5% at best is for the actual application. Out of that 5%, another 95% never has serious resource bottlenecks and taking care to write super careful optimized code for the 5% of the 5% can be done in nearly any language. Choose your ecosystem based on what best solves your problems in that other 99.75% of cases.

This is especially true in HFT and quant finance, which is why so many of those firms use Python for everything except the 0.25% of the code where performance is insanely critical. For that part, they just use anything that plugs easily into Python, usually C++ or Cython.
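For anyone checking the arithmetic, here is a quick sketch of where the 0.25% and 99.75% come from. The function name is made up for illustration, and the 5% figures are the rules of thumb from above, not measurements.

```python
def critical_share(application_share: float, bottleneck_share: float) -> float:
    """Share of all code that is both application code and performance-critical."""
    return application_share * bottleneck_share


# Rule-of-thumb inputs from the comment above (assumptions, not measured data).
application_share = 0.05  # ~5% of the code is the actual application
bottleneck_share = 0.05   # ~5% of that has serious resource bottlenecks

critical = critical_share(application_share, bottleneck_share)
everything_else = 1.0 - critical

print(f"performance-critical: {critical:.2%}")
print(f"everything else:      {everything_else:.2%}")
```

The point is only that the two 5% estimates multiply, so the slice that truly needs hand-tuned code is a fraction of a percent of the whole.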

logicchains · on Jan 26, 2020

>I’m not sure what you mean by saying Julia is more expressive than Cython, given that Cython is as expressive as C.

I mean expressive in the sense of how much you can get done per unit of code/time. Perhaps a better way of phrasing it: for most problems X that I encounter in my work, I can write code in Julia to solve X faster than I could write C/C++ to solve X, and also faster than I could write Cython to solve X. Excellent type inference is a big part of this, along with macros, multiple dispatch, and libraries designed with performance in mind (e.g. https://juliacollections.github.io/DataStructures.jl/latest/...).

>In this shop’s particular situation, it’s mostly the switching costs to Julia that cause it to lose the debate. The firm has lots of systems software, data fetchers, offline analytics jobs, research code, etc. With Python & Cython, they easily write all of it in one ecosystem, build shared libraries that span all these use cases, rely on shared testing frameworks, integration pipelines, packaging, virtual envs, etc.

>I always like to remind people that in any professional setting, ~95% of the software you write is for reporting and testing, and 5% at best is for the actual application. Out of that 5%, another 95% never has serious resource bottlenecks and taking care to write super careful optimized code for the 5% of the 5% can be done in nearly any language. Choose your ecosystem based on what best solves your problems in that other 99.75% of cases.

That makes sense then. In my firm (or at least in my team) the case is different: we're mostly full stack, so each member is responsible for the whole pipeline from research->model_development->backtesting->production_algo_development->algo_testing/initial_trading. In this case 95% of my time is spent writing research code, running research, and writing production code, so if I can double the speed at which my research code runs or double the speed at which I can write it, that translates into a massive increase in my productivity/output.
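A rough way to put numbers on that, as an Amdahl's-law-style sketch. The function name is hypothetical, and it assumes the quoted percentages can be read as shares of working time, which is a simplification (the 0.25% above was a share of code, not time).

```python
def overall_speedup(affected_share: float, factor: float) -> float:
    """Estimate the whole-workflow speedup when only `affected_share`
    of the time gets `factor` times faster (Amdahl's law)."""
    if not 0.0 <= affected_share <= 1.0:
        raise ValueError("affected_share must be between 0 and 1")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - affected_share) + affected_share / factor)


# Full-stack case: ~95% of the time goes to research and production code.
print(f"95% of time doubled:   {overall_speedup(0.95, 2.0):.3f}x")

# Other shop's case, treating the 0.25% critical slice as a time share.
print(f"0.25% of time doubled: {overall_speedup(0.0025, 2.0):.3f}x")
```

Under those assumptions, doubling the speed of 95% of the work gives roughly a 1.9x gain overall, while doubling the speed of a 0.25% slice changes the total by about a tenth of a percent, which is why the two situations lead to different language choices.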