> I want to elaborate a bit on the tradeoffs of this decision. The reason Prisma uses this strategy is because in a lot of real-world applications with large datasets, DB-level JOINs can become quite expensive...
> The total cost of executing a complex join is often higher than executing multiple simpler queries. This is why the Prisma query engine currently defaults to multiple simple queries in order to optimise overall throughput of the system.
> But Prisma is designed towards generalized best practices, and in the "real world" with huge tables and hundreds of fields, single queries are not the best approach...
> All that being said, there are of course scenarios where JOINs are a lot more performance than sending individual queries. We know this and that's why we are currently working on enabling JOINs in Prisma Client queries as well You can follow the development on the roadmap.
Though this isn't a complete answer still. Part of it is that Prisma was, at its start, a GraphQL-centric ORM. This comes with its own performance pitfalls, and decomposing joins into separate subqueries with aggregation helped avoid them.
It's a completely ridiculous answer though. They're linking to High Performance MySQL's 2nd edition, which came out in June 2008, and was written for users of MySQL 5.0 running on 2008-era hardware.
My take, as a MySQL expert: that advice is totally irrelevant now, and has been for quite some time. It's just plain wrong in a modern context.
I’m not even sure it was correct for its time? The whole point of an RDBMS is to execute join operations. The only reason I’d suspect an RDBMS to be bad at its one fundamental job, in any point of time, is the N+1 query scenario or multiple left joins with unrelated dependencies, but that’s triggered by bad orm abstractions to begin with
It was absolutely valid advice for its time, but only in the highly specific cases/reasons outlined in the book. The 2nd edition was written by the top Percona folks, who pretty much had more experience scaling databases for large websites than anyone else.
The Prisma answer just does not summarize correctly what the book was saying.
It mainly boiled down to sharding and external caching. Storage and memory were much smaller back then, so there was a lot of sharding and functional partitioning, and major reliance on memcached; all of those are easier if you minimize excessive JOINs.
The query planner in MySQL wasn't great at the time either, and although index hints could help, huge complex queries sometimes performed worse than multiple decomposed simpler queries. But the bigger issue was definitely enabling sharding (cross-shard joins had to be handled at the application level) and enabling external caching (do a simple range scan DB query to get a list of IDs/PKs, then do point lookups in memcached, then finally do point lookups in the DB for any that weren't in memcached).
And, in light of that, the default is changing in a couple of months after the JOIN mode has had a significant period of time being tested in the real world.
This is answered at the very top of the link on the post you replied to. In no unclear language, no less. Direct link here: https://github.com/prisma/prisma/discussions/19748#discussio...
> I want to elaborate a bit on the tradeoffs of this decision. The reason Prisma uses this strategy is because in a lot of real-world applications with large datasets, DB-level JOINs can become quite expensive...
> The total cost of executing a complex join is often higher than executing multiple simpler queries. This is why the Prisma query engine currently defaults to multiple simple queries in order to optimise overall throughput of the system.
> But Prisma is designed towards generalized best practices, and in the "real world" with huge tables and hundreds of fields, single queries are not the best approach...
> All that being said, there are of course scenarios where JOINs are a lot more performance than sending individual queries. We know this and that's why we are currently working on enabling JOINs in Prisma Client queries as well You can follow the development on the roadmap.
Though this isn't a complete answer still. Part of it is that Prisma was, at its start, a GraphQL-centric ORM. This comes with its own performance pitfalls, and decomposing joins into separate subqueries with aggregation helped avoid them.