
Whenever I read a post with this premise and the first thing that pops up is qsort, I instinctively call BS. (Sorry to the author, no offense meant, I didn't read the post.)

qsort is the prime example of why (and when!) generics do not matter. Half of the time, if you need to sort, you're doing something wrong and the data should already be sorted (that's a sort in constant time). If that is not the case, but performance matters (which is quite unlikely by itself, since sorting is probably not where your program spends its time), then it is extremely likely that it's easy to write a much quicker sort than poor std::sort by taking advantage of the context. For example, a bucket sort, which you could say runs in linear time.
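To make the "linear time" remark concrete, here is a rough sketch of the kind of context-specific sort I mean, assuming (purely for this example, it is not from the article) that the sort key is known to be a small non-negative integer:

    /* Counting sort for records whose key is known to lie in [0, KEY_RANGE).
     * Runs in O(n + KEY_RANGE), i.e. linear time for a fixed key range.
     * The struct and the range are made up for the illustration. */
    #include <stdlib.h>
    #include <string.h>

    #define KEY_RANGE 256

    struct record { unsigned char key; int payload; };

    void counting_sort(struct record *items, size_t n)
    {
        size_t counts[KEY_RANGE] = {0};
        struct record *tmp = malloc(n * sizeof *tmp);
        if (!tmp)
            return; /* a real version would report the failure */

        for (size_t i = 0; i < n; i++)
            counts[items[i].key]++;

        /* turn the counts into starting offsets */
        size_t offset = 0;
        for (size_t k = 0; k < KEY_RANGE; k++) {
            size_t c = counts[k];
            counts[k] = offset;
            offset += c;
        }

        /* stable scatter into the final positions */
        for (size_t i = 0; i < n; i++)
            tmp[counts[items[i].key]++] = items[i];

        memcpy(items, tmp, n * sizeof *tmp);
        free(tmp);
    }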

The time spent waiting for bloated C++ templates to compile can instead be used to implement a sort routine that beats std::sort for the specific situation, and to improve on it 12 times over.

There is yet another point that is frequently heard, which is that compilers can inline function pointers if the "qsort" is defined as static inline. I'm just not sure how to guarantee that the inlining is performed.
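What I have in mind is something like the following sketch; whether the inlining actually happens depends on the compiler and flags, nothing here guarantees it:

    /* A small sort whose comparator is passed as a function pointer. Because
     * both the sort and the comparator are visible in the same translation
     * unit and the sort is static inline, an optimizing compiler *may*
     * specialize the call and inline the comparison - it is not required to. */
    #include <stddef.h>

    static inline void sort_ints(int *a, size_t n, int (*cmp)(int, int))
    {
        /* insertion sort, just to keep the sketch short */
        for (size_t i = 1; i < n; i++) {
            int key = a[i];
            size_t j = i;
            while (j > 0 && cmp(a[j - 1], key) > 0) {
                a[j] = a[j - 1];
                j--;
            }
            a[j] = key;
        }
    }

    static int cmp_asc(int x, int y) { return (x > y) - (x < y); }

    void sort_ascending(int *a, size_t n)
    {
        sort_ints(a, n, cmp_asc); /* the comparator is a compile-time constant here */
    }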

To me this goes mostly to show that most people that whine about generics & performance don't know a lot about performance.



This comment really misses the point of the post and kinda just feels like you have an ax to grind about sorting. This feels like when people say "don't worry about memory safety, just don't write bad code": "don't worry about generics, just don't sort."

Generic sorting is something all programmers are familiar with so it makes good examples.


It's not a good example if it isn't a realistic situation.

Kinda like the Dog->Animal inheritance hierarchies to prove why OOP syntax is required.


OOP syntax is not in fact required, and such hierarchies are not in fact presented for that reason. They are used to illustrate a concept using a familiar subject.

OOP is rarely the right way to organize a program, but almost any sufficiently large program has one or more subsystems that map naturally to OOP. A language that supports it is helpful in those places. Saying you could code the same thing without the feature wholly misses the point: of course you could, but it would be more work, and there would be more places to make mistakes.

The same applies to any powerful language feature. Where it is useful, using it helps. Where it is not useful, another may be. Languages lacking powerful features make the programmer do more work, leaving less time and attention to make the program, or other more important programs, better.


I wasn't saying there can't be a good way to put a feature to use. And, like any programmer who maintains a level of pride, I'm aware I am constantly looking for reasons why certain features aren't needed or are even bad in the big picture, at least in a given context. Looking for a minimal set of axioms to build the world on does make sense. Language features can be seen as axioms in the sense that it's hard or impossible to break them down, since they're hidden in the compiler.

Regarding OOP syntax specifically, let us agree that it is a question of religion whether writing foo.bar() instead of foo_bar(foo) is really a huge saving of work. There are numerous counter-arguments (language-specific ones as well as more general ones) why it might not be, and could in fact be detrimental.

I disagree with the view that using any feature to quickly get something done leaves more time to make the program better. In my view, less focus on syntactic superficialities like OOP method syntax, and on premature optimizations like using std::sort, leaves more time to focus on what really matters: understanding what a program does and needs to do on a global scale. And it holds me back less, since I spend less time waiting for slow builds.

All of that is pretty subjective, and there are certainly good examples for many different approaches, and naturally one is most familiar with those that one likes personally. The important thing is to be happy and be / feel productive.


OOP is absolutely not about syntax. At all. The differing syntax is nothing but a visual clue, for benefit of readers, that some late binding might be going on.

Literally anything that takes your attention necessarily takes it away from something else. Anything that does not need your attention anymore frees it up for important things you now have more time for. That is true for everyone, unless you waste it on looking at cat pictures (or, here) instead.

Failure to understand the uses and consequences of unfamiliar language features is not a virtue. Any single line of code anyone writes leaves almost all features of a language unused. All that is different in your case is the failure to use features in those places where using them correctly would have made your code better.

The important thing is good code. If your code is not improving, you short-change yourself.


I often like your comments but sometimes not. Here I would definitely downvote if I could, let me explain why.

- You are disagreeing with me on things I did not say and that were not the subject of discussion. ("OOP is not about syntax")

- You are "teaching" me, from a high point, quite arrogantly. I'm particularly irritated by this: "The important thing is good code. If your code is not improving, you short-change yourself." Suggesting I am not improving, and giving commonplace advice as if I wouldn't realize that.

- You're suggesting a "failure" of mine to take an appropriate action, without an existing situation to judge.

- You are disagreeing with something I said and that I made an effort to create a reasonable argument for, but you're not presenting a counter-argument and instead are just repeating what you said before.

- While I think you might be super smart, I feel it was altogether a low quality comment of yours with little grounding in reason. Let me go back to this: "Failure to understand the uses and consequences of unfamiliar language features is not a virtue". Not that I ever claimed it was. Conversely, do you think that understanding the uses and consequences of all or most language features is a virtue? Do you think tools matter over results? Do you think a mathematical theory is more beautiful if you introduce more concepts than necessary to elegantly transport the message? Do you agree that features have a utility but also a cost? Can you see how writing a piece of code using a new feature creates new interactions and complexities with the rest of the code (including seemingly unrelated parts), producing new headaches?

(Easy examples: isn't it enough to use functions? Why should I use methods, creating new problems when I need to take a function pointer and making it harder to find usage sites? Why should I use private methods, when there is no good reason to declare them in the header in the first place (static functions in the implementation file are just fine!), especially if this exposes implementation details?)
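For instance (file and function names invented for the example):

    /* foo.h: the public interface exposes only what callers need. */
    struct foo;                      /* opaque to callers */
    void foo_bar(struct foo *f);

    /* foo.c: helpers stay static, so they never appear in the header and
     * cannot be called (or even seen) from outside this file. */
    struct foo { int state; };

    static void foo_step(struct foo *f)
    {
        f->state++;                  /* implementation detail, not in foo.h */
    }

    void foo_bar(struct foo *f)
    {
        foo_step(f);
    }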

Do I have to apologize that, while there might theoretically be ways to accommodate uses of many different features, I'm simply not willing to invest the effort to figure those out - effort that I'm not sinking into problems that I'm more interested in? I simply don't feel like the programming language I use (I like C) is holding me back from what I'm interested in doing.


First: I apologize for my tone. You did not deserve to be talked down to.

In answer to questions. Yes, it is a virtue to understand more language features, even in languages one doesn't use. Nobody understands all features in all languages. Many think they understand features they do not. (It appears, by your remarks, that you do not understand features that support OOP, so cannot reason reliably about their potential value for particular cases.)

A program using more powerful features, where they contribute something useful, will be shorter and clearer, and offer less scope for mistakes, than one with the same semantics cobbled together from low-level primitives. Anyone reading the latter has to dig down to see whether it really attempts what it seems to, and really does what it attempts. There are no such doubts about standard features.

A proof that relies on previously proven results is usually considered more elegant, but one that relies more on axioms may have more generality, which is also desired. The analogy to code would be a program relying on fewer third-party libraries, making it more portable and less fragile. That reasoning would not apply to language features, which are more akin to axioms.

(Euclid proved as much as he could without using the fifth axiom, which turned out millennia later to make those results applicable in spherical and hyperbolic geometry. This does not seem analogous to anything on-topic.)

Do features have a cost? Everything costs, so the cost of one choice can only be compared to the cost of another. Some costs are borne once, e.g. learning a feature. Some cost at build time. Some cost at runtime. We need to know which is which. Any abstraction used imposes the cost of knowing its domain, and, where that is limited, ensuring each use entirely fits in it. This cost is balanced against the load of extra detail exposed in not using it.

Results and tools are not at odds. Results include more than program output; the program itself is a result. Drawing upon the full suite of available tools produces better results. A screwdriver handle might substitute for a hammer, but it is not a thing to brag on.

Using a feature that doesn't contribute anything is just flailing. Everything in a program should bring it closer to achieving its goals. Not knowing to use a feature where it would have helped is paying a cost in every such case just to avoid the one-time cost of learning the feature, generally without even knowing we are paying it.

Features are added to languages always against enormous resistance. Any that make it in have passed the test of making important programs better. Not learning an available feature has the consequence of not knowing where using it would have made one's program better.

Not using a feature because you don't understand it is very different from not using the same feature because you have adjudged that it adds not enough value. Our tools affect us more than we imagine. What we are interested in doing is always affected by what we know how to do, or know can be done. We attempt more if we know more.

Attention is always the scarcest resource. Spending it learning language features takes it away from other things. But learning language features is an investment that pays back anywhere the feature might be useful, even where we don't end up using it.


> (It appears, by your remarks, that you do not understand features that support OOP, so cannot reason reliably about their potential value for particular cases.)

I would be interested to learn what you think those features are; I have years of experience fooling around in languages such as Python, Javascript, Java, C++11+, Haskell, and also other less common ones. Maybe I have in fact not understood how to best make use of certain features. I've always been a little bit obsessed with performance & control, and I ended up in frustration about non-orthogonality / non-composability of modern language features and approaches. So I found myself on the path of exploring what's possible when limiting myself to basic flow control and basic read and write operations and function calls.

It's certainly been a journey of ups and downs, but I'm in good company, and I've gotten to know some people who are amazingly productive this way. Basically my standpoint as of today is that at least for systems programming, languages should get out of the way. There is little value in using features that make it easier to forget about certain aspects of programming, when those tasks are actually only a small part of the total work, and for good performance it's important to get the details right - which can be done mostly in a centralized location. Of course I'm thinking about resource and memory management. (Most critics of bare-bones C programming still think it is about matching malloc() and free() calls. That's a terrible way to program, as they rightly claim, and there are far better approaches to do it that do not involve language features or libraries).

What matters IME is mostly systemic understanding, and sinking time into getting the details right. Papering over them is not a solution. YMMV, maybe I'll soon have another series of attempts at "modern" approaches, and whatever the way to programming heaven ultimately is, I've already learned a couple of insights that transformed my thinking significantly, and that I wouldn't have learnt otherwise.


The essence of OO, from the standpoint you describe, is a struct containing a pointer to an array of function pointers. Such an object is what, in kernel memory, a file descriptor in Unix locates; read() on a file descriptor calls through one of those pointers. Syntax this, "private" that, "this" the other are window dressing. You obviously can cobble that together by hand, as has been done in Unixy kernels forever. But, as a language feature, it is always done exactly right, with no distracting apparatus. You never need to check if it is really meant for what it seems to be, or if it really does exactly what it seems meant to do.
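In hand-rolled form the shape is roughly this (names are invented for the sketch; this is not actual kernel code):

    #include <stddef.h>

    struct file;                          /* forward declaration */

    struct file_ops {                     /* the "vtable" */
        long (*read)(struct file *f, void *buf, size_t len);
        long (*write)(struct file *f, const void *buf, size_t len);
    };

    struct file {
        const struct file_ops *ops;       /* pointer to the table */
        void *private_data;               /* per-implementation state */
    };

    /* generic call site: late binding through the table */
    long file_read(struct file *f, void *buf, size_t len)
    {
        return f->ops->read(f, buf, len);
    }

    /* one concrete implementation, /dev/null style */
    static long null_read(struct file *f, void *buf, size_t len)
    {
        (void)f; (void)buf; (void)len;
        return 0;                         /* always EOF */
    }

    static long null_write(struct file *f, const void *buf, size_t len)
    {
        (void)f; (void)buf;
        return (long)len;                 /* swallow everything */
    }

    const struct file_ops null_ops = { null_read, null_write };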

It is the same for other powerful features. You might, in principle, write code by hand to do the same thing, but in practice no one has enough spare time and attention, so you do something more crude, instead. A destructor is not something sneaky going on behind your back: you scheduled it explicitly when you called the constructor, certain it will run at exactly the right time, every time, with no possibility of a mistake. You never need to check up on it, or wonder if it was missed this one time.

The destructor is still the single most powerful run-time feature of C++. It was the only instance of deterministic automation in any language, ever, until D and Rust adapted it. Most of the other powerful features in C++, as in other powerful languages, direct action at compile time. The closest analogs usable in C are cpp, yacc, and lex. You can't cobble up your own without abusing macros or making other programs like yacc. The world has not been kind to attempts at those. But build systems see no problem with powerful language built-ins.

Static types are far more powerful than what C does with them, just checking that function calls match declarations. (Function declarations, BTW, were copied into C from C++.) A powerful language puts types to work to do what you could never afford to do by hand.


Sure I understand all of that, and I'm really surprised you thought I don't.

You describe the essence of OO being implemented by vtables and how C++ does the right thing while many Unixy kernels do this by hand.

(I understand the part about manually set up vtables as well, and in fact one of the things I do for money is maintenance of a commercial Linux kernel module that has lots of these. I don't feel like Linux has a problem encoding these tables by hand, while it is one of the projects that has the most need for vtables because it fits the description of a system that needs to be very flexible. Linux has many more relevant problems than hand-crafted vtables.)

Maybe it will surprise you, but for the greenfield stuff that I do in my spare time and that I'm allowed to do as my job, I can write dozens of KLOC of code without ever needing to set up a vtable. This probably isn't even that special for how I approach things - if we go to any random C++ class, chances are that by far most of the methods there aren't virtual, or at least are prematurely virtual.

Personally I think of vtable situations as less than ideal from a technical perspective. I get that these situations can arise naturally in highly flexible systems where there are lots of alternative implementations for the same concept. On the other hand, I think vtables are problematic as a default, in particular for abstractions that aren't yet proven by time to work. They tend to create this kind of callback hell where stuff never gets done in the most direct way, and where you end up with much more repeated boilerplate than you'd like.

There are many other ways to go about "interfaces" that are not vtables, and it depends on the situation which is best. Vtables are callback interfaces, and I try to avoid callbacks if possible. As described, callbacks make for incoherent code, since the contexts of the caller and the callee are completely disjoint. Another problem is that they imply tight temporal coupling (a callback is a synchronous call)!

I would say the primary approach I use, in places where vtables are often advertised, is to decouple using queues (concurrency, not necessarily implying parallelism). This achieves not only decoupling of implementation but also temporal decoupling. Asynchronicity should never replace standard function calls, of course. But it is a great way to do communication between subsystems - not only for requests that could take a while to process (parallelism), but also from a perspective of modularity and maintainability.
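A minimal sketch of what I mean by a queue between subsystems (single-threaded and with invented names for brevity; a concurrent version would add locking or a lock-free ring):

    #include <stddef.h>

    enum request_kind { REQ_LOAD_FILE, REQ_REDRAW };

    struct request {
        enum request_kind kind;
        int id;                        /* whatever payload the subsystem needs */
    };

    #define QUEUE_CAP 256

    struct request_queue {
        struct request items[QUEUE_CAP];
        size_t head, tail;             /* head: next read, tail: next write */
    };

    /* producer side: post a request and return immediately */
    int queue_push(struct request_queue *q, struct request r)
    {
        size_t next = (q->tail + 1) % QUEUE_CAP;
        if (next == q->head)
            return 0;                  /* full; the caller decides what to do */
        q->items[q->tail] = r;
        q->tail = next;
        return 1;
    }

    /* consumer side: drained whenever that subsystem gets to run,
     * so the producer never calls into it directly */
    int queue_pop(struct request_queue *q, struct request *out)
    {
        if (q->head == q->tail)
            return 0;                  /* empty */
        *out = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        return 1;
    }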

read() and write() are the best examples; they are now starting to be replaced by io_uring in the Linux kernel. I think read() and write() maybe only ever made sense in the context of single-CPU machines. Doing these things asynchronously (io_uring) makes so much more sense from an architectural perspective, and also from a performance / scheduling perspective if requests aren't required (or aren't possible) to be served immediately.

Leaving that aside, there are other ways to do vtables than the one C++ has baked in. A few months ago I watched a video about this by Casey Muratori that I liked, but I can't find it right now. He talked about how he dislikes exactly what you described, because it promotes one way to "implement" OO (which might not be the best, for example because of the double indirection) over others.

Regarding (RAII) destructors, it's a similar situation. If I need a language feature that helps me call my destructors in the right order, this could mean I have too many of them and have lost control. I can see value in automated destruction for script-like, scientific, and enterprise code, and admittedly also for some usage code in kernel / systems programming a la "MutexLocker locker(mutex);". But as code gets less scripty and more system-y, the cases where resources get initialized and destroyed in the same code location (scope) become fewer. I have decided to see what happens if I leave destructors out completely and try to find alternative structures, which I feel has been fruitful so far.


As I said before, OO is not a valid organizing principle for programs. It is just a pattern very occasionally useful. The feature is there for such occasions. It does not merit multiple paragraphs.

Destructors, by contrast, are a key tool. The same is true of other actually-powerful language features. Avoiding them makes your code less reliable and slower. You may assert some moral advantage in chopping down trees with an ax, but professionals use a chainsaw for reasons.


Maybe the goal was never to chop trees?

One of the projects I'm working on right now is a vector GUI with one OpenGL-based and one software-based render pipeline. It's also networked (one server only). Right now I can't tell you a single thing that needs to be destroyed in this program. When you start it, it acquires and sets up resources from the OS, and it uses those. Well, there is a worker thread that reads files from the filesystem; that one needs to fopen() and fclose() files. There is also a job_pool_lock() and job_pool_unlock() to read/write work from a queue...

When the program is closed, the OS will clean up everything, no need for me to do that (that would also be slower). And note that what the OS does is not like RAII destruction of individual objects. It's maybe also not like a garbage collector. It is like the destruction of a (linear) arena allocator, which is how I try to structure my programs when destruction is required. This approach reduces the need for destruction calls so much that I seriously don't worry about not being able to use RAII.

(The frame rendering also works like that - when starting a new frame, I reset the memory cursor for temporary objects that have frame lifetime back to the beginning, reusing all pages from the last frame. I'm never releasing these pages.)
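Roughly like this (sizes and names are made up; my actual code differs in details):

    #include <stddef.h>

    struct arena {
        unsigned char *base;           /* one big block, reserved up front */
        size_t capacity;
        size_t used;                   /* the cursor */
    };

    /* bump allocation; align must be a power of two */
    void *arena_alloc(struct arena *a, size_t size, size_t align)
    {
        size_t p = (a->used + (align - 1)) & ~(align - 1);
        if (p + size > a->capacity)
            return NULL;               /* out of frame memory */
        a->used = p + size;
        return a->base + p;
    }

    /* "destruction" of everything with frame lifetime, in one step */
    void arena_reset(struct arena *a)
    {
        a->used = 0;                   /* pages stay mapped and get reused */
    }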

The widget hierarchy is constructed by nesting things in an HTML-like fashion. I have a simple macro that exploits for-loop syntax to "open" and automatically "close" these widgets. The macro works mostly fine in practice but can break if I accidentally return from this loop (a break is caught by the macro). An RAII-based solution would be better here, but I also feel that maybe there must be a better overall approach than nesting generic stuff in this way. The "closing" of elements in this case is needed not because they need to be destroyed (they are stored in the frame allocator after all), but because I don't want to name individual nodes in the hierarchy, and instead want to build the nesting implicitly from the way the nodes are opened and closed.
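The macro is along these lines (my guess at a shape that behaves as described; ui_begin()/ui_end() are invented stand-ins for the real push/pop functions):

    void ui_begin(const char *name);   /* pushes a node onto the hierarchy */
    void ui_end(void);                 /* pops it again */

    /* The inner loop runs the body exactly once; the outer loop's increment
     * runs ui_end() afterwards, even if the body leaves via break. An early
     * return skips it, which is the failure mode mentioned above. */
    #define UI_NODE(name)                                        \
        for (int _outer = (ui_begin(name), 0); !_outer;          \
             _outer = 1, ui_end())                               \
            for (int _inner = 0; !_inner; _inner = 1)

    /* usage:
     *   UI_NODE("toolbar") {
     *       UI_NODE("save_button") {
     *           ...
     *       }
     *   }
     */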

There is no way that you could convince me that "destruction" is an important concern in this program. It's probably not enough of a concern to make me switch to some idiomatic automation technology that has potentially large ramifications on how I need to structure my program.


You appear to be unaware of arena allocation.

Destructors are for scheduled future work. Anyplace you don't need work scheduled, you don't need a destructor. Anyplace you do, there they are.


> You appear to be unaware of arena allocation.

That is literally what I described, and named. It appears I appear a lot of things to you that aren't so.

> Anyplace you do, there they are.

My point was, maybe, after all, there aren't that many things that have to be scheduled. And there are other, sometimes better (depending on situation) ways to schedule these things, too.

After all, there is a LOT of value in using plain-old-data without any "scheduled" stuff attached to it. POD is a huge tool for modularization, because it allows abstracting from arbitrary objects to "untyped bytes".

But hey, I guess I might go back at some point and use RAII for some things once I've learned enough good ways to avoid it, so that I don't paint myself into a corner just because RAII is always easily available.


You should probably read the post first, my dude.

> Half of the time if you need to sort, you're doing something wrong and the data should already be sorted

OP is literally talking about the C implementation of Postgres. He probably needs to sort.


Why can't postgres just get its data in a pre-sorted way tho?


Stable sort and binary search is the bread and butter of pretty much every database operation.

When you bulk insert into indexed columns, it sorts the inserted data and merges that with the existing index. If you do joins on non-indexed columns it often sorts the intermediate results.


Only the least technical among many possible reasons: it cannot know in advance the row ordering that is requested by the user.


I hadn't known that we're on such close terms, my friend.

Seems like this case is falling in the other half then.


If “my dude” offended you, I’m sorry. I was just trying to make it lighthearted. I get frustrated at the number of comments on HN articles that are mostly irrelevant if you’ve actually read the article.


He knows the information about the author because he read the article.


> most people that whine about generics & performance don't know a lot about performance

Really, it is almost everyone who doesn't know a lot about performance. Any modern programming language is targeted not at an expert but at a beginner (look at Rust, which is supposed to be relatively difficult, yet is quite quick for people to pick up in my experience).

Languages should be able to be performant, even when used in poor ways. No matter what, people won't presort their data, since it likely comes from some database or csv file or json response or who knows what. So a language should be trying to accommodate that use case.


Sorting or hashing are the main ways to:

- construct sets

- remove duplicates

- construct and query dictionaries (especially multi-maps, btrees, etc)

- group equivalent elements

- find the number of occurrences

- etc

Hashing solutions tend to be popular in modern language standard libraries, but not in C or C++, where in-place algorithms and simpler implementations are preferred. You could also make the argument that it's much harder to use hashes with generics, as it's not always clear how to define good hashes for complex data types. Sorting, by contrast, requires only a < implementation.
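For example, deduplication needs nothing beyond a comparison (a sketch; it ignores NaN handling for doubles):

    #include <stdlib.h>

    /* the only type-specific piece, expressible purely in terms of < */
    static int cmp_double(const void *pa, const void *pb)
    {
        double a = *(const double *)pa, b = *(const double *)pb;
        if (a < b) return -1;
        if (b < a) return 1;
        return 0;
    }

    /* sorts the array and returns the number of unique elements,
     * which end up packed at the front */
    size_t dedup_doubles(double *v, size_t n)
    {
        if (n == 0)
            return 0;
        qsort(v, n, sizeof v[0], cmp_double);

        size_t out = 1;
        for (size_t i = 1; i < n; i++)
            if (cmp_double(&v[i], &v[out - 1]) != 0)
                v[out++] = v[i];
        return out;
    }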

> bucket sort which you could say runs in linear time.

This only works for integers in a small range. Also consider strings, doubles, or lexicographically ordered arrays.



