splice(2) and tee(2) were basically tailor-made for cat(1) and tee(1), and for some reason I was under the impression that GNU tee was already using splice(2) and/or tee(2). Using them could be trivial--just a few extra lines of code that fall through to the existing methods when the fast path isn't usable, for a huge speed-up. (Performance matters because the consumers could be CPU-bound, and an inefficient cat or tee might be taking away resources that the consumer could otherwise use.)
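To make the "few extra lines that fall through" idea concrete, here is a rough sketch of what I have in mind. It is not coreutils code; copy_fd() and copy_rw() are names I made up, the 64 KiB chunk size is arbitrary, and the set of errno values treated as "splice not usable here" is my assumption. Note that splice(2) needs at least one of the two descriptors to be a pipe, which is exactly why a fallback is required.

```c
#define _GNU_SOURCE            /* exposes splice() in <fcntl.h> on glibc */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Portable path: the classic read/write loop (hypothetical helper). */
static int copy_rw(int in_fd, int out_fd)
{
    char buf[65536];
    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n == 0)
            return 0;                       /* EOF */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
    }
}

/* Copy in_fd to out_fd until EOF. Returns 0 on success, -1 on error.
 * Tries splice(2) first; if the kernel rejects it for these descriptors,
 * falls through to the read/write loop above. */
int copy_fd(int in_fd, int out_fd)
{
#ifdef __linux__
    for (;;) {
        ssize_t n = splice(in_fd, NULL, out_fd, NULL, 65536, SPLICE_F_MOVE);
        if (n == 0)
            return 0;                       /* EOF */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            /* Assumption: these mean "splice can't be used here"
             * (e.g. neither fd is a pipe), not a real I/O error. */
            if (errno == EINVAL || errno == ENOSYS)
                break;
            return -1;
        }
    }
#endif
    return copy_rw(in_fd, out_fd);
}
```

A cat-like tool would call copy_fd(STDIN_FILENO, STDOUT_FILENO); whenever either side is a pipe the data never has to be copied through a userspace buffer, and in every other case the behavior is the same as before.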

Regarding portability, GNU tail uses the Linux-specific inotify(2) to respond faster to writes. Like a lot of OSS, coreutils uses the BSD .d_type member extension[1] of struct dirent to avoid unnecessary stat() calls. There are many other, more intrusive OS-specific details baked into coreutils, but often that's the nature of the problem--in many situations you're dependent on platform-specific details or features. For the most part, these nitty-gritty platform-specific details are far more intrusive in terms of code complexity than the performance optimizations would be.
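For anyone who hasn't seen the .d_type trick, a minimal sketch follows. Again, this is not lifted from coreutils: entry_is_dir() and count_subdirs() are hypothetical names, and I'm assuming the glibc/BSD convention of testing _DIRENT_HAVE_D_TYPE to detect the member. Even where the member exists, a filesystem may report DT_UNKNOWN, so the stat fallback has to stay.

```c
#define _DEFAULT_SOURCE        /* exposes d_type and the DT_* constants on glibc */
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>

/* Returns 1 if the entry itself is a directory, 0 if not, -1 on error.
 * Uses d_type when the platform and filesystem provide it; otherwise
 * falls back to one fstatat() call per entry. */
static int entry_is_dir(int dir_fd, const struct dirent *ent)
{
#ifdef _DIRENT_HAVE_D_TYPE
    if (ent->d_type != DT_UNKNOWN)
        return ent->d_type == DT_DIR;       /* no stat needed */
#endif
    struct stat st;
    if (fstatat(dir_fd, ent->d_name, &st, AT_SYMLINK_NOFOLLOW) != 0)
        return -1;
    return S_ISDIR(st.st_mode) ? 1 : 0;
}

/* Counts the immediate subdirectories of path, or returns -1 on error. */
int count_subdirs(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;
        if (entry_is_dir(dirfd(dir), ent) == 1)
            count++;
    }
    closedir(dir);
    return count;
}
```

The #ifdef plus the DT_UNKNOWN check is the whole cost of the optimization, which is the point: it's a small, contained conditional compared to the platform-specific plumbing elsewhere.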

[1] Missing from Solaris, and probably most other SysV derivatives.