The Bullshit Web (pxlnv.com)
1017 points by codesections on July 31, 2018 | 550 comments


I've said this before, but it bears repeating:

Moby Dick is 1.2mb uncompressed in plain text. That's smaller than the "average" news website by quite a bit--I just loaded the New York Times front page. It was 6.6mb. That's more than 5 copies of Moby Dick, solely for a gateway to the actual content that I want. A second reload was only 5mb.

I then opened a random article. The article itself was about 1,400 words long, but the page was 5.9mb. That's about 4kb per word without including the gateway (which you have to load unless you arrive via social media). Including the gateway, it's about 8kb per word--each word costs roughly as much to download as the entire plain text of the article.

So all told, to read just one article from the New York Times, I had to download the equivalent of ten copies of Moby Dick. That's about 4,600 pages. That's approaching the entirety of George R.R. Martin's A Song of Ice and Fire, without appendices.

If I check the NY Times just 4 times a day and read three articles each time, I'm downloading 100mb worth of stuff (83 Moby-Dicks) to read 72kb worth of plaintext.

Even ignoring first-principles ecological conservatism, that's just insanely inefficient and wasteful, regardless of how inexpensive bandwidth and computing power are in the west.

EDIT: I wrote a longer write-up on this a while ago on a personal blog, but don't want it to be hugged to death:

http://txti.es/theneedforplaintext


I like this rant, you should go the next step:

All you need to 'fix' this is a fast-loading news website that gets enough paid subscribers, and earns enough margin from those subscriptions, to pay for a news staff, an office, and various overheads.

That is a longish way of saying that 99.9% of the overhead in any modern web site can be traced almost entirely to the mechanisms by which that web site is attempting to extract value from you for visiting/reading.

If people will visit with a 56K modem and put up with a 3-10 second page load, then that is the bar. And any spare bandwidth you might have is available for the web site to exploit in some way to generate revenue. The more bandwidth between you and them, the more ways they can come up with to exploit that bandwidth for additional surveillance, ads, or analytics that will get them more money.

When you are the customer--which is to say your purchase of a subscription or articles is the only revenue the site needs in order to survive--then the things that retain you as a customer have the highest priority (like fast page load times and minimal bandwidth usage).

But when you are a data cow--a random bit of insight into a picture much bigger than you can comprehend, a pixel in a much larger tapestry, an action droplet in a much larger river of action--well, then there isn't really any incentive to make your life better. As long as the machine milking you for data can get even a couple of molecules more of that precious data milk without scaring you out of the barn, we'll build right up to that limit.


Hilariously, the New York Times tries both: you get five or so free article reads per month (with shitloads of tracking), and then you have to pay to read more.

But if you're paying, the pages don't load any differently. You're paying to be mined.


In paper newspapers this is the norm. You can read the front page for free, have to pay to get the rest, and the rest is still filled with ads.

The difference is the tracking. I don’t think ads are really the problem. It’s the tracking that bloats pages and intrudes on privacy, and the tracking doesn’t need to be there because other media have ads without tracking and manage just fine.

There’s a race to the bottom here. Tracking earns more revenue, so to be competitive you have to do it. Most sites won’t stop tracking until forced to by either the basic infrastructure of the web or legal requirements. I hope GDPR will lead to the disappearance of tracking, but so far most sites seem to pretend tracking is compatible with GDPR.


Not really. You're paying for the additional content. The tracking is external to that deal.

I'm not saying I like it that way, but you're conflating two unrelated things.


This is why I've never subscribed to cable TV. I'm not going to pay for the privilege of watching 20 mins of commercials an hour.


That's not what you are paying for with basic cable; you get that with broadcast for free.

What you are paying for is a broader choice in the filler between the commercials.


I agree in theory. However, I haven't noticed any slowness on their website, and the ads are well done and blend in with the webpage. They may have to redesign/rearchitect their whole website to get what you are asking for.

It should be noted that traditional newspapers include ads alongside news content and no one complains. In fact people used to sift through the Sunday NYT simply for the ads.


> the ads are well done and blend in with the webpage

That's called "native advertisement" and it's supposed to trick you into thinking you're reading a genuine article instead of an ad. I actively avoid sites that do this.

> In fact people used to sift through the Sunday NYT simply for the ads.

Back when people couldn't google for stuff, and the ads were useful because they mostly came from local businesses you actually needed once in a while.


There is no such thing as "the ads are well done and blend in with the webpage". Especially on a website where you pay for a subscription.


Moreover, there is a tipping point, beyond which the ad blends too well and becomes plain deception. This is even more of a problem for journalistic publications. See also: native advertising.


Indeed. In print, you had dedicated pages for advertisements. Not optimal, but much easier to ignore. Nowadays, you never know if an "article" is simply pushing a marketing agenda.


Yes, because the signs people put around "native ads" literally stating that it's an "ad" or "sponsored content" are easily ignorable. Get real. People pay for a newspaper with ads in it. You still have to sift through it. Ignore it? You literally have to turn the page, or spend 5 minutes pulling the ads out of the newspaper if you don't want them, and in some cases there isn't a way to escape the ad in print because part of the article shares the page with it. Ad block won't save you then.


That's no different from paying for a real newspaper subscription though.

But I guess you could say that you are paying for delivery instead.


Tbh they could email me plain-text articles and I'd happily hand over my money.


Imagine that, a digital newspaper in your digital mailbox!


You mean sending you news in a digital letter?


How intriguing! I would like to subscribe to your... how do you say... letter of news.


But a letter of news may be too small; maybe we can remove that size constraint and call it a 'paper of news' that would be delivered.


Maybe they could send multiple letters and I could have some kind of client that shows me the headlines and allows me to open the articles I'm interested in.


That would be amazing. How come Google hasn't invented something like that yet?


Over 20 years ago, the San Jose Mercury-News offered exactly such a service.

Called Newshound, it let you set up sets of keywords (in the basic $5/month subscription), and it would email you the plain text of every article that matched the criteria, whether generated within the publisher network or from wire services.


It would be trivial to write a script to scrape text.npr.org and send it to you.

Or you can just visit it, I suppose.
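
For what it's worth, a rough sketch of that kind of script in Python, standard library only. The markup assumption (headlines as plain <a> tags with relative hrefs on text.npr.org) is a guess and may need adjusting, and the output is just printed--piping it to mail, or swapping the print for an smtplib call, would cover the "send it to you" part:

    # Sketch: pull headline links from text.npr.org and print them as plain text.
    # Assumes articles appear as ordinary <a href="/..."> links, which may change.
    from html.parser import HTMLParser
    from urllib.request import urlopen

    class LinkCollector(HTMLParser):
        def __init__(self):
            super().__init__()
            self.links = []           # (href, link text) pairs
            self._href = None
            self._text = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                self._href = dict(attrs).get("href")
                self._text = []

        def handle_data(self, data):
            if self._href is not None:
                self._text.append(data)

        def handle_endtag(self, tag):
            if tag == "a" and self._href:
                title = "".join(self._text).strip()
                if title:
                    self.links.append((self._href, title))
                self._href = None

    def fetch_headlines(url="https://text.npr.org/"):
        html = urlopen(url, timeout=10).read().decode("utf-8", "replace")
        parser = LinkCollector()
        parser.feed(html)
        return parser.links

    if __name__ == "__main__":
        for href, title in fetch_headlines():
            if href.startswith("/"):
                href = "https://text.npr.org" + href
            print(f"{title}\n  {href}")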


Firefox and Safari each have a 'Reader Mode' which does exactly what you want: presenting a web-page in the absence of any web design.

It's really the ultimate condemnation of modern web design that this feature is so useful.

Edit: won't help the data-consumption though, as I believe it can only be enabled after the page has loaded


In reply to your edit: you can use something like uMatrix to block almost everything (even CSS) and Reader Mode will still work. I do this for most newspapers and it works quite well.


We used to have Usenet...


And when I first got online in the early 90's, my ISP had an additional subscription option to get real newspaper-type articles delivered in a special usenet newsgroup hierarchy (can't remember the name of the news service itself though).


What do you mean "used to"? Usenet still exists.


And how good is it?


Like any community, that's defined by the people in it. Some groups are excellent, some groups are mostly dead, some display varying levels of toxicity. Overall my experience hasn't been a bad one.


What kinds of groups are there?

Given the lack of popularity and commercial support, combined with the complexity of connecting a client to an available server (compared with downloading an app from the store), I'd expect that they are populated mostly by old tech people or passers-by from universities, and any special-interest group would have a small user base drawn from that demographic profile. Are my assumptions correct?


Reader Mode, my friend.


> That is a longish way of saying that 99.9% of the overhead in any modern web site can be traced almost entirely to the mechanisms by which that web site is attempting to extract value from you for visiting/reading.

Well, that and the fact that front-end developers just can't seem to exist without pulling in hundreds of kilobytes, or even megabytes, of JS libraries. You actually don't need all that crap to serve adverts, or really even to do tracking: people managed without it in the 90s. It's just that it's more work to get the same effect without a buttload of JS in this day and age, and most third party tracking services involve their own possibly bulky JS lib[1]. The thing is, given the slim margins on ad-serving - whilst I don't condone it - I can see why people don't bother to put in the extra effort to slim their payloads.

[1]And if you have a particularly idiotic marketing department they might want, say, tracking to be done in three or four different ways, requiring three or four different libraries/providers. This is not merely cynicism: I encountered exactly this situation at a gig a few years back.


Enjoy a recent Hacker News discussion of a 404 error page that employed a 2.4MiB JavaScript framework and consumed significant CPU time to display.

* https://news.ycombinator.com/item?id=17383464


> "You actually don't need all that crap to serve adverts, or really even to do tracking: people managed without it in the 90s. It's just that it's more work to get the same effect without a buttload of JS in this day and age"

While I sympathize with this sentiment, this is also the entire history of computing in a nutshell. Moore's Law has driven us orders of magnitude beyond where we were when personal computers first came into existence; but Wirth's law[1] has kept pace. The laptop I'm typing this on right now has 8 GB of RAM, and that's already become pathetically tiny, pretty much the minimum viable for a consumer PC; I have to keep checking my memory usage or I'll spill over into swap (on a mechanical drive) and have to wait several minutes while my computer recovers.

Performance in computer applications fundamentally doesn't improve. Stuff gets prettier, sure, and applications do more. But things will still run about as slowly as they always have, sometimes a little worse. (There are exceptions - some things like loading programs from tape, or loading things from an HDD once SSDs were invented, were so painfully slow compared to their replacement that you'd have to actively try to write slow code to get anywhere near that performance.) It's ease of programming, flexibility, and freedom of design (in aesthetics and interface) that the advance of computing technology has always enabled. And all of those are extremely valuable in their own way, and can make applications genuinely better - even allowing qualitatively new things to come into existence that wouldn't have been feasible before - even if they don't run faster or take up less of your memory.

(To understand why, think about the development of - say - computer games since the 90s. For all that we mock poorly optimized games, how inaccessible would game development be if we required them to be coded as efficiently as Carmack built Doom? For all that mindlessly chasing "better graphics" has ballooned costs and led developers to compromise on gameplay, how many games simply couldn't be translated to 90s-era graphics without fatally compromising the experience? How many projects would never have been started if we set the skill floor for devs so high that a hobbyist couldn't just download Unity and start writing "shitty" code?)

(Or think about something like Python. Python is a perfect example of something that allows devs to massively sacrifice performance just to make programming less work. If we kept our once-higher-by-necessity standards for efficient usage of resources, something like Python's sluggish runtime would be laughable. But I think you, and I, and everyone else can agree that Python is a very good thing.)

[1] "What Intel giveth, Microsoft taketh away."


(All that being said, I'm also fairly salty about having 8 GB of RAM and a mechanical hard drive rendering my computer incredibly painful to use as technology has marched on. Discord - which I use almost exclusively as an IRC chatroom with persistent while-you-were-gone chat history, embedded media, and fun custom emotes - is an entire Electron app that eats over 100 MB minimum; Firefox is eating 750 MB just keeping this single tab open while I type this. Even with no other applications but those open, Windows 10 and assorted background processes already push me to 5.7 GB allocated. Various Windows background processes will randomly decide they'd like to peg my disk usage to 100% for ten to fifteen minutes at a time, which I imagine is because spinning rust disks are considered deprecated.

I saw a discussion on HN a few months back about a survey of computer hardware, and one dev in the comments was shocked - shocked! - to find out that the typical user didn't have 16 GB and a 4k screen. That definitely rustled my jimmies a bit.)


> I saw a discussion on HN a few months back about a survey of computer hardware, and one dev in the comments was shocked - shocked! - to find out that the typical user didn't have 16 GB and a 4k screen. That definitely rustled my jimmies a bit.)

This is extremely common in dev circles; it's an area where we're completely detached from average users. Just to make the point, here is the Mozilla hardware survey, which shows >50% of users having 4GB or less: https://hardware.metrics.mozilla.com/ .

If we look at the more technical users on Steam (https://store.steampowered.com/hwsurvey) then only ~15% of users have 4GB or less, along with 40% having 8GB.

There's a good reason MacBooks top out at 16GB.


Oh, I recognize that Mozilla survey as actually the specific one that user was talking about! Let me see if I can track down the actual comment thread; it's probably less ridiculous than I actually remember it being.

Ah, found it. https://news.ycombinator.com/item?id=16735354


I'm really surprised at the reaction to the resolution. I like 1080p for movies on my (too) big TV but for coding I was more than satisfied once we got to 1024x768 and haven't thought of it since. My home coding machine is a cheap dell at 1366x768 and I've always been happy with it.


I agree with everything you said. I just have a somewhat different experience with Firefox on Windows 10:

>Firefox is eating 750 MB just keeping this single tab open while I type this.

I have 127 tabs open on Firefox Quantum 61.0.1 (64 bit). It uses ~ 1100 MB spread among 7 processes. I have 6 addons enabled (Decentraleyes, Firefox pioneer, I don't care about cookies, Tab counter, Tree style tab and uMatrix).


Why do you hate spinning disks so much? And no, they are not considered "deprecated"; they're really the only way to affordably store large volumes of data. It's an old, venerable, and still-very-useful tech.


I'm also on a system with 8GB of RAM at the moment. Firefox is using up a hilarious 4.6GB keeping a few dozen web pages open, but the entire rest of my Linux system, including Inkscape, QCAD, and SketchUp under Wine, is using only a combined 907 megabytes. So it's possible part of your problem is just Windows 10.


My i3wm environment doesn't randomly start anything I don't ask it to start and runs very comfortably with 8GB of RAM; heck, it would run fine with 4GB and no swap. Maybe you are running the wrong OS.


A tiling window manager won't save you when dealing with Electron apps.

I run Linux and StumpWM on my desktop, and recently I had to upgrade to 12GB of RAM, because it turns out 8GB is very easy to exhaust these days. I currently have 9.3GB tied up, mostly by browser processes.


Yeah, this. I'm mostly using swaywm instead of GNOME in order to free up about 1 extra gigabyte of RAM for apps, but that equals about one Electron app. The only Electron app I haven't eliminated from my daily usage, though, is Patchwork, so it's not so bad.


Funny thing is, I use my i3wm environment on 16GB of RAM and a 4K screen. I'm actually migrating to ratpoison because it's so incredibly simple and basic that it has been making me drool. I mean, look at the source code. It doesn't get much simpler than that for a tiled WM.


If you are interested in ratpoison, you may also enjoy xmonad. I've used both and much preferred xmonad.


Load the entire GHC garbage collected runtime just for my window manager? Isn't this the same philosophy that causes people to use Electron? And that results in unnecessarily large memory footprints and runtime performance penalties?


Xmonad is rock solid, lightning fast, and perfect for many who prefer to minimize their reliance on a mouse.


I don't think esoteric Linuxes are really necessary--I'm running vanilla Ubuntu 16.04 with 16GB RAM and my top three processes are only CrashPlan (~800MB), Dropbox (~460) and Chrome (327 resident, 1380 shared, per htop, with 6 tabs running). My total usage at the moment is 4.04GB out of 15.4 available, again per htop. Some of the other numbers in this thread are baffling to me.

But I don't run any Electron apps, so there's that.

But, yeah, I guess I wouldn't be able to run this same workload with 4GB RAM. That's what lubuntu is for.


Oh, no, I'm aware these are very much Win10 specific problems and I look at Linux people with a not insignificant degree of envy. Unfortunately, I do quite like PC gaming.


Modern GNOME is a pig; some of this could be traced back to gnome-shell's use of JavaScript and CSS styling, if you were so inclined. Most Linux users don't really notice it because most x86 machines are crazy fast.

OTOH, try starting a modern full-blown distro on something like an RPi instead of Raspbian and you will quickly discover that you _NEED_ more than 4G of RAM and a lot of CPU just to start Firefox. It's even worse if you don't have hardware GL acceleration.

OTOH, the lightweight desktops (LXQt, LXDE, Xfce) really are..


While it would seem convenient to simply switch operating systems from the popular, widespread options to...whatever i3wm is, practically speaking that is seldom possible.


I3wm is a window manager you can run on any Linux distribution


Tiling wm does not an OS make.


Can you run recent versions of Photoshop? How about Premiere, or Final Cut?

SolidWorks? CATIA? Matlab?


I find that the best way to do this is to just VNC to a dedicated Windows or Mac slave computer to do graphic arts work. VNC is so good nowadays that I can use my QHD phone screen as a second monitor for all my Adobe apps--and the lag is almost non-existent thanks to super-fast Wi-Fi.


FreeRDP for connecting to Windows is a good option too.

I should say that KRDC is a good RD connection manager, but it's KDE-only unless you wanna install a third of it.


Matlab works on Linux.

Premiere and Final Cut don't, but DaVinci Resolve and Lightworks do.

Obviously, if your workflow (and thus your livelihood) depends on a particular tool, you should run a platform that runs it, but I would question why someone with money would pick Windows over Mac.


Those are commercial apps for people with real money. OTOH, there are a number of unbeatable free apps that don't have Linux ports. Fusion 360 comes to mind; it's not SolidWorks level, but it's light years ahead of FreeCAD.


Smart marketing treads the fine line between providing rich experiences for users, given average internet speeds, and annoying them. If the service is free, they should try to get 'some value' for the content.

As highlighted, some take it way too far by trying to extract 'maximum value', which ends up being counterproductive.


Hence the rant mentioning the Molochian economy under which we operate. That reference explains in depth why this is a very hard problem.

I like the rant too. Except maybe the bit about sending content at the speed of humans - I for one would like to take lightweight, bullshit-free content as fast as it can be sent, to pipe it to further processing on my end, in the never-ending quest to automate things in my life.


Background for anyone who hasn't read it: https://slatestarcodex.com/2014/07/30/meditations-on-moloch/


I highly recommend everyone give this a read if they haven't. It's probably the best post on Slate Star Codex.


I mean, if you're just reducing the content even further, just request that they make the reduction possible server-side and everybody wins.


I was thinking more about, e.g., running a script to fetch 3 different lightweight sites, run some personal code on them, and combine the data. If the script spent 99.9% of its time waiting on IO because of "human speeds", I wouldn't be too happy.

That said, I would be willing to bite the bullet and accept speed limits across the board if it resulted in lean web.
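
For illustration, a small Python sketch of that kind of script: fetch a handful of lightweight pages in parallel and hand them to whatever personal code does the combining. The URLs are just examples, and summarize() is a stand-in for the real post-processing:

    # Fetch several lightweight pages concurrently; summarize() stands in for
    # whatever personal code combines the data afterwards.
    from concurrent.futures import ThreadPoolExecutor
    from urllib.request import urlopen

    SITES = [
        "https://text.npr.org/",
        "https://lite.cnn.com/",
        "https://example.com/",
    ]

    def fetch(url):
        with urlopen(url, timeout=10) as resp:
            return url, resp.read()

    def summarize(results):
        for url, body in results:
            print(f"{url}: {len(body) / 1024:.1f} kB")

    if __name__ == "__main__":
        with ThreadPoolExecutor(max_workers=len(SITES)) as pool:
            summarize(pool.map(fetch, SITES))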


I achieve this with an RSS reader, in my case Miniflux.

Runs on an RPi under my TV, and I stay well below my 300MB data cap while consuming dozens of news sources.


Which news sources? How did you find the ones that still provide RSS? How much of it do you actually read?


I read virtually all of them. Most of them provide RSS feeds; some are a bit hidden, but they're googleable.

I started with the basics: BBC, NYT, the Guardian, and The Intercept for general news, The Conversation for science news without sensationalism, and some tech blogs. Then I just read most of it and follow links to find new sources. Most of the time a news story starts with "As reported by X", or just a link, so you can discover new sources like that.

You can also browse HN (and n-gate for the highlights) and Reddit to discover new sources.

If you add so much you can't keep up, remove some, or change the feeds into section feeds. Most online newspapers provide them. I miss Yahoo Pipes, and have yet to find a simple hosted alternative. There is also RSS Bridge for sites without feeds (Twitter, Facebook), but I still haven't found the time to set it up.

You can also add paywalled sources to read the headlines only. You can mark them as read from the index.
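
A minimal version of that setup doesn't even need a reader app. A Python sketch of pulling the latest headlines from a couple of feeds with just the standard library (the feed URLs are examples; per-section feeds slot in the same way):

    # Print the latest headlines from a few RSS feeds, standard library only.
    # Feed URLs are examples; swap in section feeds or your own list.
    import xml.etree.ElementTree as ET
    from itertools import islice
    from urllib.request import urlopen

    FEEDS = [
        "http://feeds.bbci.co.uk/news/rss.xml",
        "https://rss.nytimes.com/services/xml/rss/nyt/HomePage.xml",
    ]

    def headlines(feed_url):
        with urlopen(feed_url, timeout=10) as resp:
            root = ET.parse(resp).getroot()
        # RSS 2.0 layout: <channel><item><title>...</title><link>...</link></item>
        for item in root.iter("item"):
            yield item.findtext("title", "").strip(), item.findtext("link", "").strip()

    if __name__ == "__main__":
        for url in FEEDS:
            print(f"== {url}")
            for title, link in islice(headlines(url), 5):
                print(f"  {title}\n    {link}")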


I think this is why batch jobs and crontabs exist. Just do what the BBSes used to do for syncing--wait until off-peak hours and then let loose.


I don't think 3-10 seconds should be the bar. I spent years using a 14.4k / 28k / 56k modem.

That was during the mid and late 90s.

Browsing the web where you need to wait 3-10 seconds for everything to load is not a good user experience. It's a colossal waste of time, and today we have so many more reasons to view more pages compared to back then.

We should strive for an improvement instead of trying to stick with limitations from 20 years ago.

The real problem is people developing sites now give zero fucks about resource constraints. This is exactly like lottery winners who went from being poor to having 50 mil in their pocket but then end up broke in 3 years because they have no idea how to deal with constraints.

It's also a completely different type of person who is running these sites today. Back in the day you made a site to share what you've learned. Now you have "marketing people" who want to invade your privacy, track every mouse movement and correlate you to a dollar amount instead of providing value.


>Browsing the web where you need to wait 3-10 seconds for everything to load is not a good user experience.

I'm the lone owner of a website. I have enough time to accomplish one of 3 things before January:

>Finish my Finance App

>Update 200 pages to have pictures load based on screen type, so they load in 3 seconds instead of 6.

>Collect and compile data to create 20 more pages, which are what my 3000 subscribers actually come to my site for.

Very quickly you can see why an extra 2 seconds of loading time is not on my mind. It's important to allocate resources effectively; making my website load faster adds limited value compared to creating content that my users actually want.


> Very quickly you can see why an extra 2 seconds of loading time is not on my mind. It's important to allocate resources effectively; making my website load faster adds limited value compared to creating content that my users actually want.

I understand. I'm also the sole owner of a website, where I'm selling a product (video courses for software developers).

My priorities are to give as much value as possible for free and also sell some courses if I can.

According to the network tab in Chrome's dev tools, the DOMContentLoaded time is about 250ms for any page on my site (which are typically 1,000 to 5,000 word blog posts with some images). From the user's POV, the page loads pretty much instantly. Then about a second later Google Analytics and a social sharing widget pop up, but those happen after the content.

The interesting thing is I really didn't try hard to make this happen. I just stuck to server-rendered templates and compressed my assets. I also made an effort to avoid heavy front-end libraries and only add JavaScript / CSS when I needed to. I basically run a heavily modified theme based on Bootstrap with a couple of third-party JavaScript libs (including jQuery).

There's a lot of room for improvement but I haven't bothered because it seems good enough. It's very possible to get the perceived load speed of a page to be under 1 second without dedicating a lot of time to it.
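
The "compressed my assets" step can be as simple as a pre-gzip pass at deploy time, so the web server hands out foo.css.gz instead of compressing on every request. A sketch, with the directory and extension list as illustrative assumptions:

    # Pre-compress static assets at deploy time; the web server then serves
    # the .gz copies directly. Directory and extensions are assumptions.
    import gzip
    from pathlib import Path

    STATIC_DIR = Path("static")
    COMPRESSIBLE = {".css", ".js", ".svg", ".html"}

    for path in STATIC_DIR.rglob("*"):
        if path.suffix in COMPRESSIBLE:
            data = path.read_bytes()
            out = path.parent / (path.name + ".gz")
            out.write_bytes(gzip.compress(data, compresslevel=9))
            print(f"{path}: {len(data)} -> {out.stat().st_size} bytes")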


So am I and I just don't buy it.

What on earth do people do to get over 1 second load time? Remember that you have to actively spend time to bloat a site.


The most profitable newspaper in my country didn't have a website with articles or news stories on it until earlier this year. Their page was something from the '90s (it was probably newer), and all it really offered was info about the paper and a way to buy it.

What they do that other Danish papers don't is write lengthy, meaty articles that take time to read because they actually teach you something new. There was a story on Trump's connections to Russia that was three full pages long, and we're talking old-school newspaper format, so that's what? 10 A4 pages' worth of text?

The paper only comes out once a week, because it takes time to write, but also because it takes time to read.

I'm not sure where I am going with this; I just think it's interesting how they've increased their subscription numbers while not really giving two shits about the bullshit web.

They may give two shits about the bullshit web now of course, having gotten a webpage with articles. I don’t know though, I’m a subscriber, but I haven’t visited their site yet.


What is the name of this newspaper?


Or what if we allowed tracking and ads, but at native speed?

There is the problem of the network, which won't/can't be fixed in the short term. Then there's the problem of rendering and reflow.

For the first, it will be a long time before everyone gets 1Gbps internet; for the second, even if you have 1Gbps internet, pages will still be slow due to all the mini scripts.

What if the fonts were there in the first place?

What if the browser actively tracked every mouse movement, link click, etc., natively providing 80-90% of the data points the tracking scripts collect, and sent the data back to the website on request?

No more 3-5MB of scripts downloaded per site, no more CPU spent running those scripts, no more 1MB of fonts. And they wouldn't cause the page to jank. You'd get buttery-smooth webpages while still getting ads.

My biggest problem is with the idea that the web should be extended via JavaScript, and that everything should be JavaScript only, rather than extending the browser's native functionality.

Unfortunately this is an idea that Apple may not like, even if the data are anonymised.


FWIW, you have effectively just described the 'app' solution.

In that solution all of the tracking and analytics, fonts, and other 'baseline' content are part of the app, which then fetches the unique content (the few Kb of story text and Mb of images) and then renders it all locally. There is even some ability to do A/B testing in that setup.

The "App" itself is basically a browser with none of the non-content UI controls that browsers normally have, that can only go to a specific URL (the content supplier).


Precisely, but we don't want to be bound by the App Store ecosystem. And we want to improve the UX of websites, which so far hasn't been great.

As a matter of fact, maybe these APIs for tracking and analytics should be the same across apps and browsers.

We tech nerds keep throwing out new terms--web app, web page, etc.--but to our users they are all the same. They want to consume information in text, video, and images, in a fast and smooth way without jank.


> All you need

You say that as if it's easy. "All you need" is a product that users want to pay for. But I think all the various tries and attempts have proven that users aren't really that keen to pay for content online, at scale.


They don't need paid subscribers. Physical papers and TV have been supported by ads without tracking for decades. Companies pay for space or time.

The internet allowed advertisers to track, so now we have this BS. They had many years to fix this, but tracking is the business of many companies like Google and Facebook. Now lots of people use ad blockers, and that number is increasing.


The price an advertiser pays for a full-page ad printed in the NYTimes is on the order of $150,000; the price an advertiser pays to obscure your entire screen with an ad is as little as $1.00.

What you're missing is that advertising rates for television and print are several decimal orders higher than the rates for internet advertising. Why that is is more complicated than you might guess, but "printing" a newspaper by sending you the text is a couple of orders of magnitude cheaper than running a printing press. Between those two realities a lot of news web sites are being crushed.


If 600,000 [0] people see that $150,000 ad, the advertiser has paid $4 per impression. At $1 CPM, 600k impressions is $60,000, but as you said this cost is at the lowest end. An ad that obscures your entire screen might cost as much as $8 CPM or more [1] and now buying the newspaper ad is sounding like a better deal.

[0] https://www.nytco.com/the-times-sees-circulation-growth-in-f...

[1] https://www.buysellads.com/buy/leaderboard/id/17/soldout/1/c...


Consider the fate of Dr. Dobb's Journal, a print magazine (https://news.ycombinator.com/item?id=8758915).


NPR text?


data cow.

thank you


You're welcome, but it is oil89's invention from 10 months ago: https://news.ycombinator.com/item?id=15350778 . I just love it though.


I don't think that's a meaningful comparison. Moby Dick is a book, written by 1 guy and maybe an editor or two. NYT employs 1,300 people.

When you read a book all you get is the text. NYT has text, images, related articles, analytics, etc. Moby Dick doesn't have to know what pages you read. NYT needs to know how long you spent, on which articles, etc. They need data to produce the product and you can only achieve that with javascript tracking pixels (Server logs aren't good enough).

If Moby Dick was being rewritten and optimized every single day it would be a few mb. It's not, so you can't compare the two.

Yes, NYT should be lighter; no, your comparison is not meaningful. A better comparison would be Moby Dick to the physical NYT newspaper.


> NYT needs to know how long you spent, on which articles, etc. They need data to produce the product and you can only achieve that with javascript tracking pixels (Server logs aren't good enough).

No they don't. They really don't need to know any of that. They don't even get a pass on tracking because they're providing a free whatever - I pay for a subscription to the NYT. The business, or a meaningfully substantial core of it, is viable without tracking.

It would be nice if the things I pay for didn't start stuffing their content with bullshit. What and who do I have to pay to get one-second page loads? It's not a given that advertising has to be so bloated and privacy-invasive. Various podcasts and blogs (like Daring Fireball) plug the same ad to their entire audience each post/episode for set periods of time. If you're going to cry about needing advertising, then take your geographic and demographic-based targeting. But no war of attrition will get me to concede you need user-by-user tracking.

You want me to pay for your content? Fine, I like it well enough. You want to present ads as well? Okay sure, the writing and perspectives are worth that too I suppose. But in addition to all of this you want to track my behavior and correlate it to my online activity that has nothing to do with your content? No, that's ridiculous.


> No they don't. They really don't need to know any of that. They don't even get a pass on tracking because they're providing a free whatever - I pay for a subscription to the NYT. The business, or a meaningfully substantial core of it, is viable without tracking.

Clearly they disagree. Or maybe you should let them know that they don't need that.

To say it without sarcasm, what you feel you are entitled to as a paying customer and what they feel they need/want in order to understand their customers are clearly at odds. Ultimately, what you think matters nothing in isolation and what they think matters nothing in isolation. What you two agree upon, is the only thing that matters. That is to say, if you think they shouldn't track you but you use their tracking product anyway, you've compromised and agreed to new terms.

I imagine you could come up with a subscription that would adequately compensate them for a truly no tracking experience. But I doubt you two would agree on a price to pay for said UX.


You're correct of course, but I don't really see how this isn't a vacuous observation. Yes clearly our perceptions are at odds, but that has nothing to do with the reality of whether or not they need to be doing that tracking. Obviously they think they need to, or they wouldn't do it. But I think I've laid out a pretty strong argument that they actually don't need to, which leads me to believe that they actually haven't considered it seriously enough to give it a shot.

Would they be as profitable? Maybe, maybe not. Would they become unprofitable? No, strictly speaking. I'm confident in that because the NYT weathered the decline of traditional news media before the rise of hyper-targeted ads, and because I've maintained a free website in the Alexa top 100,000 on my own, with well over 500,000 unique visitors per day. That doesn't come close to the online audience of a major newspaper, but it's illustrative. There is a phenomenal amount of advertising optimization you can do using basic analytics based on page requests and basic demographic data that still respects privacy and doesn't track individual users. I outlined a few methods, such as Daring Fireball's.

Maybe instead of this being a philosophical issue of perspective between a user and an organization, it's an issue of an organization that hasn't examined how else it can exist. Does the NYT need over 10,000 employees? Is there a long tail of unpopular and generally underperforming content that nevertheless sticks around, sucking up money and forcing ever more privacy-invasive targeting? If the NYT doesn't know its audience well enough to present demographic-targeted ads on particular articles and sections, what the hell is it doing tracking users individually? It's just taking the easy way out and giving advertising partners the enhanced tracking they want. But they don't need to do that, and whether or not they think they need to do it is orthogonal to the problem itself.


> You're correct of course, but I don't really see how this isn't a vacuous observation. Yes clearly our perceptions are at odds, but that has nothing to do with the reality of whether or not they need to be doing that tracking. Obviously they think they need to, or they wouldn't do it. But I think I've laid out a pretty strong argument that they actually don't need to, which leads me to believe that they actually haven't considered it seriously enough to give it a shot.

It most definitely is. But so is the word need, in this context. How would we define what they need to do, and what they don't need to do?

My argument is simply that, of course, they don't need to (by my definition), but nothing will change that unless they see a different, more lucrative offer. I.e., "oh hey, here's 2 million readers who will only read the page in plain HTML and will pay an extra $20/m". It just seems like a needless argument, as I don't believe there's anything that can change their behavior without us changing ours. Without the market changing.

Rather, I think the solution lies not in them, but in you. In us. To use blockers and filters to such an extreme degree that it's made clear that UX wins here, and they need to provide the UX to retain the customers.

Thus far, we've not done enough to change their "need". If a day comes that they do need to stop tracking us, well, they'll either live or die. But the problem, and solution, lies in us. My 2c.


> What you two agree upon, is the only thing that matters.

That's precisely why many of us use (and promote the use of) adblockers and filtering extensions.


Classic narrowcasting mistake that dying companies make.

Statista claims 2.3 million digital subscribers. NYT is trying to milk that 2.3M for everything they've got, squeezing the last drops of blood from the stone while they still can.

That's a great way to go out of business, when 99.97% of the world population is not your customer and your squeezing labors are not going to encourage them to sign up.

If you hyperoptimize to squeeze every drop out of a small customer base, eventually you end up with something like legacy TV networks where 99% of the population won't watch a show even for free, and the tighter the target focus on an ever shrinking legacy audience, the smaller the audience gets, until the whole house of cards collapses.

It's similar to the slice-of-pie argument: there are many business strategies that make a pie slice "better" at the price of shrinking it, and eventually the paper-thin slice disappears from the market because the enormous number of employees can't eat off it anymore--but that will certainly be the most hyperoptimized slice of pie ever made, right before it entirely disappears.

NYT is going to have a truly amazing spy product right before it closes.


Why is that doubtful? There's all kinds of examples of tiered subscriptions in the world. I think it would be doubtful because the NYT wouldn't want to explicitly admit all the tracking they are doing.


> Why is that doubtful? There's all kinds of examples of tiered subscriptions in the world. I think it would be doubtful because the NYT wouldn't want to explicitly admit all the tracking they are doing.

Many reasons, one of which you said. What would the price tag be for them to admit all they are tracking?


Currently the price is free, and comes bundled with uMatrix, and a cookie flush. I’d like to pay the NYT for their journalism, but only with money, not the ability to track me. As a result they get no money, and no tracking.


> Currently the price is free, and comes bundled with uMatrix, and a cookie flush. I’d like to pay the NYT for their journalism, but only with money, not the ability to track me. As a result they get no money, and no tracking.

You misunderstood me. I mean, what would they like you to pay them for them to be 100% transparent about what they're doing for tracking, what their advertisers are doing and who they are, and possibly to stop all that entirely? I.e., what is it worth to them?


Interestingly if you pay them, and thus are logged in when you view an article, then they can better track you.

In contrast if you never sign up, disable JS, and periodically clear your cookies, then the entire site works fine and none of the third party trackers work. At best they can link your browser user agent and IP to a hit on the server side.


NYT needs to produce and recommend content that people find engaging to continue earning their subscription dollars.

The idea that tracking is purely or primarily there to support a business model of selling user data is a strawman invented by self-righteous HNers. You need to know what parts of your product are effective to make it competitive in today’s marketplace.


90% of that can be accomplished with server-side stats. Do you really need to track mouse movements and follow readers with super-cookies across the web to find out what articles people find engaging on your site?

> The idea that tracking is purely or primarily there to support a business model of selling user data

Purely, no. Primarily? You can bet your sweet ass.


I agree in general, but there are some things publishers need that I don't see going away any time soon. Online advertisers want to know that their ads are being viewed by a human and not a bot, that they were on screen for long enough, and that the user didn't just scroll past. Publishers want to know how far down you make it in their article, so they know where to put the ads in the body of the article.


I'm not accusing anyone of selling my data and I'm not trying to champion a crusade against the entire advertising industry. I'm asserting that the NYT can achieve the substantial majority of the advertising optimization and targeting it needs to do to be profitable 1) without doing user-specific tracking and 2) without making page loads extremely slow.

Like I said, serve me an ad. I'm not an idealist, I understand why advertising exists. But don't justify collecting data about which articles I read to serve to some inscrutable network of advertisers by saying that it has to be this way. We don't need this version of advertising.


> I'm asserting that the NYT can achieve the substantial majority of the advertising optimization and targeting it needs to do to be profitable

Majority, not all. Why should they leave money on the table, exactly?


Because it's disrespectful of user privacy, performance inefficient and computationally wasteful?

Most companies are not achieving the platonic maximization of profit or shareholder value. They leave money on the table for a variety of reasons. It's not beyond the pale that this would be one of them. If you don't agree, then frankly it's probably an axiomatic disagreement and I don't think we can reason one another to a synthesis.


There's nothing axiomatic about our disagreement here, it's not like I'm unaware of the existence of inefficient businesses. Individual companies may choose to leave money on the table, but industries and markets as a whole do not (not intentionally, anyway).

You've just described the status quo, where businesses have to sacrifice their lifeblood to achieve your ideals. Those businesses tend to be beaten by more focused competitors, which results in the industry you see today, filled with winners that don't achieve your ideals.

But good luck trying to champion an efficient web industry by essentially moralizing.


I'm not trying to champion anything, I'm speaking my mind. I don't expect the NYT to change because I'm writing an HN comment. If market forces or legislation are insufficient to force companies to respect user privacy across unrelated domains, then I'll rely on my own setup: a Pi-Hole VPN for mobile devices, and uBlock Origin for desktop devices. I happily whitelist domains with non-intrusive ads and respect for Do-Not-Track.

But more to the point, you're presenting an argument which implies the NYT is a business which will be beaten by its competitors if it doesn't track users through their unrelated web history. I don't think that kind of tracking is an existential necessity for the NYT. It's not their core competency. Their core competency is journalism - if they are beaten by a competitor it won't be because the competitor has superior tracking, for several reasons:

1. Journalism is not a winner take all environment,

2. Newspapers were surviving in online media well before this tracking was around,

3. The NYT already has sufficiently many inefficiencies that if they actually cared about user privacy, they could trim the fat elsewhere so they wouldn't have to know to within 0.001% precision whether or not a user will read an entire article just to be profitable.

I really don't think this is too idealistic. It's not like I'm saying they need to abandon advertising altogether. I don't even have a problem with the majority of advertising. It's the poor quality control and data collection that I take issue with. All I'm saying is that they don't need to do what they're doing to be profitable.


> Journalism is not a winner take all environment

So? I'm not sure how this means that news orgs won't suffer from losing business to competitors with superior tracking.

> Newspapers were surviving in online media well before this tracking was around

Markets change. Advertisers have different expectations. Readers have more news to choose from. This is a silly argument.

> The NYT already has sufficiently many inefficiencies that if they actually cared about user privacy, they could trim the fat elsewhere

Sure, but why? Why would they do that? Why wouldn't they trim the fat elsewhere AND keep the tracking to make more money?

The point you make doesn't really make sense. Yeah, it's theoretically possible for news orgs to stop tracking in the same way that it's theoretically possible for me to take out a knife right now and cut off my legs. News orgs can make up their losses elsewhere and survive in the same way that I can still get around with a wheelchair.

But why on earth would I or the NYT do that?

I respond to you with these questions because it seems to me that both you and the OP speak out against these practices because you feel they are unnecessary. My point is that they are necessary. You just don't acknowledge the forces that make them so.


> Majority, not all. Why should they leave money on the table, exactly?

It may be that they are, in fact, driving users away. Tracking user behaviour can become a distraction.


NYT used to exist only in paper form, which had no tracking ability at all. They may benefit from this, but I'm skeptical that they "need" it.


The marketplace for your attention was a lot less competitive then. Editors could even feed you true and nuanced reporting, out of a sense of professional obligation, and you had no choice but to sit there and take it.


That sounds like a case of acute metrics-itis to me--looking for things to measure as a yardstick while forgetting that you get what you measure for, instead of focusing on the core of the business.

While it may give some insight, does it tell you anything meaningful to know that most people skim the first few paragraphs before clicking away? Does it improve the writing quality or fact-checking? Is it worth the risk of alienating customers? To give a deliberately extreme and absurd example, Victoria's Secret could hire people to stalk their customers to find out more about product failure conditions in the field, but that would be a horrifying and stupid idea.

"Everybody is doing it." is a poor rationale for implementing a business practice.


Except that the content is directly related to user behavior. If they see no one reads the style section, they'll cut it and move resources to financial news. If they didn't have tracking they'd never know that; they'd be wasting resources and have a comparatively inferior product.

They can't do UX analysis, nothing.


For the first, it's sufficient to just look at the number of page requests.

For the second, no one has ever explained to me how UX analysis really works for news sites. Isn't it enough to put 2 or 3 people in a room and show them a few variations of the UI? There isn't really much to publishing text, images, and a few graphs. Graphs are a very well-explored field; I don't think you can learn more about them by just watching heat maps and click-through rates.


I suspect there's a lot of bullshitting happening around "UX analysis", with third-party "experts" offering analyses which may, or may not, show something significant. As long as everyone in the chain can convince their customer/superior that their work is useful, the money will flow, whether or not the work is actually useful.


That's one of the fundamental problems in tech today, namely:

"It is difficult to get a man to understand something, when his salary depends on his not understanding it.”


It absolutely isn't sufficient to look at the number of page requests. How do you discern like-reads vs hate-reads? How do you determine whether someone clicked on an article, read the first line, and then bailed vs read the whole thing? There are a heap of metrics used to determine engagement which factor into the material decisions referred to in the grandparent.


Why do you need to track user behavior across unrelated domains to achieve any of that?


It's pretty simple: try a few different designs with A/B testing and you will see which one brings in the most revenue.

However, the result will usually be a lot of dark patterns. For instance, that's why you get popups asking you to register.
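
Mechanically, that kind of test doesn't need a client-side tracking script either. A sketch of deterministic, server-side variant assignment (the names and bucket split are illustrative); conversions per bucket can then come out of ordinary server-side records:

    # Deterministic A/B assignment: hash a session or subscriber id into a
    # bucket on the server, render that variant, and compare conversions per
    # bucket later from server-side logs.
    import hashlib

    VARIANTS = ["control", "new_layout"]

    def assign_variant(session_id: str) -> str:
        digest = hashlib.sha256(session_id.encode()).digest()
        return VARIANTS[digest[0] % len(VARIANTS)]

    print(assign_variant("subscriber-12345"))  # stable for a given id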


Server logs just aren't sufficient, no matter how many times HN says so. You don't get enough data to make data-driven decisions. That's like giving a data scientist 1/3rd of the available data and saying "that's good enough".

You'd expect UX to be a small unit, but that includes everyone who works on revenue-generating ads. Moving one tiny thing has a direct impact on revenue, which affects every person employed by the NYT.


>That's like giving a data scientist 1/3rd of the available data and saying "that's good enough".

And it could easily be enough. Having 1/3 of a quantity is only a problem if the full quantity was barely enough to begin with.


They could certainly track what you mention here (which pages are being accessed) via logging requests - without any use of additional front-end assets.


You can do all of those analytics server-side; there's no reason to deliver it via JS and have the client do the computation. You're already sending all the required info to track that sort of thing in the request itself.
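
As a sketch of what "analytics from the request itself" can look like--counting article views straight out of an access log, with the log path and URL prefix as illustrative assumptions:

    # Count article page views from a web server access log (combined log
    # format assumed); no client-side script involved. Path and prefix are
    # illustrative.
    import re
    from collections import Counter

    REQUEST = re.compile(r'"GET (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})')

    def article_views(log_path, prefix="/2018/"):
        views = Counter()
        with open(log_path) as fh:
            for line in fh:
                m = REQUEST.search(line)
                if m and m.group("status") == "200" and m.group("path").startswith(prefix):
                    views[m.group("path")] += 1
        return views

    if __name__ == "__main__":
        for path, count in article_views("access.log").most_common(10):
            print(f"{count:6d}  {path}")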


It's amazing to me that no one out there seems to do server-local handling of ads, either... If you put ads directly into your page instead of relying on burdensome external systems, suddenly blocking isn't a thing anymore. ALL of the functionality supposedly needed for analytics and an ad-driven business model can happen server-side, without the page becoming sentient and loading a billion scripts and scattered resources, with the one exception being filtering out headless browsers. If external systems need to be communicated with, most of that can happen before or after the fact of that page load. Advertising and analytics are implemented in the laziest, most user-hostile way possible on the majority of sites.
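
A rough sketch of what that could look like, with Flask and a tiny local inventory as illustrative assumptions (not how any real publisher does it): the ad is picked and rendered on the server, the impression is counted in the server's own log, and there is nothing third-party left to block.

    # First-party, server-side ad insertion: the reader downloads one HTML
    # document, and impressions are counted in the server's own log.
    # Flask and the inventory table here are illustrative assumptions.
    import random
    from flask import Flask, render_template_string

    app = Flask(__name__)

    ADS = [  # local inventory instead of an external ad network
        {"sponsor": "Acme Coffee", "html": "<p>Try Acme Coffee.</p>"},
        {"sponsor": "Example Books", "html": "<p>Read more with Example Books.</p>"},
    ]

    PAGE = """
    <article>{{ body }}</article>
    <aside class="sponsor">{{ ad.html | safe }}</aside>
    """

    @app.route("/article/<slug>")
    def article(slug):
        ad = random.choice(ADS)
        app.logger.info("impression sponsor=%s slug=%s", ad["sponsor"], slug)
        return render_template_string(PAGE, body=f"Article {slug} goes here.", ad=ad)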


I don't think it's very surprising. Advertisers won't let publishers serve ads directly because that requires trust in publishers to not misrepresent stats like impressions and real views. I don't know how you'd solve that trust problem when publishers are actually incentivized to cheat advertisers.


I think you may have identified the biggest issue, and it's a shame the pragmatic solution is an unpleasant technical solution.


Couldn't they, e.g., have some trusted proxy server that routes some requests to the real-content NYTimes server and some to the ad server?


That sounds like a viable solution to the trust issue. They don't need to respond to the requests, just see copies they can be sure are real requests.


For advertisers to trust this proxy server, the NYT cannot control this proxy server to preserve its integrity. So now you're asking the NYT to base their business on an advertiser-controlled server?

What happens when the proxy goes down? What happens when there are bugs? Do you think publishers can really trust advertisers to be good stewards of the publisher's business? Think for a moment about publishers that are not as big as the NYT.

Okay, maybe they do trust an advertiser-controlled proxy server. This means that both tracking scripts and NYT scripts are served from the same domain, meaning they no longer have cross origin security tampering protection. What's stopping the NYT from serving a script that tampers with an advertiser's tracking script?


Those are issues, but not insurmountable, especially when the benefit is "obviate any adblocker".

They can use a trusted third party to run the proxy and use industry standards/SLAs for site reliability/uptime. And they can still use different subdomains with no obvious pattern (web1.nytimes.com vs web2.nytimes.com -- which is the ad server?) or audit the scripts sent through the proxy for malice.
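
As a toy illustration of that proxy idea in Python (hostnames are placeholders, and a real deployment would need TLS, POST handling, header passthrough, and an operator both sides trust): every request is logged so impressions can be audited, then forwarded to whichever upstream the path belongs to.

    # Toy auditing proxy: log each request (so impressions can be verified by
    # a third party) and forward it to the content or ad upstream. Hostnames
    # are placeholders; this handles GET only and skips TLS, headers, errors.
    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.request import urlopen

    UPSTREAMS = {
        "content": "https://origin.example-news.com",
        "ads": "https://adserver.example-network.com",
    }

    def bucket_for(path):
        return "ads" if path.startswith("/ads/") else "content"

    class AuditingProxy(BaseHTTPRequestHandler):
        def do_GET(self):
            bucket = bucket_for(self.path)
            print(f"audit: {self.client_address[0]} {bucket} GET {self.path}")
            with urlopen(UPSTREAMS[bucket] + self.path, timeout=10) as resp:
                body = resp.read()
                status = resp.status
                ctype = resp.headers.get("Content-Type", "text/html")
            self.send_response(status)
            self.send_header("Content-Type", ctype)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("127.0.0.1", 8080), AuditingProxy).serve_forever()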


The way it's implemented has several "benefits":

- It externalizes resource usage - the waste happens on users' machines. Who cares that it adds up to meaningful electricity consumption anyway?

- It makes it easier for marketing people and developers to independently work on the site. Developers can optimize, marketers can waste all that effort by including yet another third-party JS.

- It supports ad auctions and other services in the whole adtech economy.

- You don't have to do much analytics yourself, as the third party provides cute graphs ideal for presenting to management.


There used to be an open source, self-hosted (PHP) ad application called OpenX. It worked well for quite a while. In its later years, it suffered a number of high-profile security vulnerabilities, and the open source version was poorly maintained, since OpenX [the company] was focused more on their hosted solution, which probably had migrated to a different codebase or at least was a major version past the open source codebase.

The open source version has been renamed "Revive Adserver", and it looks maintained, but I don't think it's used nearly as much as the openx [open source version] of old.

If you use Revive Adserver or you design a server-local ad system in-house, it won't be as sophisticated as gigantic ad-providers who can do all sorts of segmentation and analysis (producing pretty reports which execs and stakeholders love even if that knowledge adds no value to the business).


Funny that you mention that--in a former life I had to develop around and maintain an OpenX system.


It's because they use systems that identify the client via JS to deliver the most "expensive" ad possible. It's complete garbage, of course; Google/Facebook should be held liable for what they advertise, not run massive automated systems full of fraud. If Google delivers malware they shouldn't be able to throw their hands up and go "well, Section 230!".


> They can't do UX analysis, nothing

They could, but that would require paying people and firms like Nielsen to gather data. Instead they engage in the same freeloading that the industry derides users for.


Reading without cookies or JavaScript enabled seems to fix every problem the NYT has.


FWIW: Ars Technica turns off tracking for paying customers (and provides full articles in the RSS feed if you pay for it).


I need to like this comment more than my single upvote allows.


A random archive of the New York Times frontpage in 2005 is 300kb. Articles were probably comparable in size.

Are you honestly saying that the landscape of the internet and/or the staffing needs of the NY Times has changed so drastically that they actually needed a 22x increase in size to deliver fundamentally text-based reporting?


I mean, if most of that is a few images, then those images could just be bigger today for nicer screens and faster internet.

Not that that is the case.


You’re right about the problem: web pages tend to scale with the size of the organization serving them, not the size of the content. But this is the failure, not a defense.

It’s a big problem on mobile and the reason I read HN comments before the article.

> NYT employs 1,300 people


Definitely a good point. You'd imagine they'd have at least 1 person who optimizes the site for page size / load speed.


And 100 other people whose job it is to cram more features in.


> Moby Dick is a book, written by 1 guy and maybe an editor or two. NYT employs 1,300 people.

Totally irrelevant. Why should the number of employees in the company have any bearing on the size or cost of the product? Ford has 5x as many employees as Tesla. Should their cars be 5x as big or 5x more expensive?

> NYT needs to know how long you spent, on which articles, etc. They need data to produce the product

They may want this but they don’t need it. They successfully produced their product in the past without it.

> If Moby Dick was being rewritten and optimized every single day it would be a few mb.

Irrelevant and likely false. If anything, books and other text media tend to get smaller after subsequent editing and revising.

> A better comparison would by Moby Dick to the physical NYT newspaper.

Comparing a digital text product (Moby Dick) with a digital text product (a NYT article) is as close as it gets.


> Totally irrelevant. Why should the number of employees in the company have any bearing on the size or cost of the product? Ford has 5x as many employees as Tesla. Should their cars be 5x as big or 5x more expensive?

If the cost or the size wasn't a constraint, for sure Ford would build a car 5x as big or 5x as expensive.

The website size isn't a constraint here; if it was, they would work on it and make it smaller. It's only a constraint for highly technical people here. Currently at my job I'm optimizing some queries that take way too long. It has been like that for years, but we hit a wall recently: our SQL Server can't take it anymore. I always found it stupid that it took so long to optimize it... but at the end of the day, the clients just didn't care that it took 3 seconds to load the page. I could be working on more features right now, something that the clients actually care about.

What makes the number of employees relevant to the size? Well, if you were the only one building that website, you would know everything about it, right? You would always use the exact same components, reuse everything you can; you already know every single part of the code. Add a second employee, and now you don't know exactly what he does. You know some of it, but sometimes you forget and you may duplicate something or do it badly or whatever. At some point, the thing is just too big to be understood by any single employee, and you get code badly reused, stuff that serves no direct purpose but makes maintenance easier, etc... You never decrease the size, simply because it's never worth it to, but each and every one of the employees adds stuff to it.


>Why should the number of employees in the company have any bearing on the size or cost of the product? Ford has 5x as many employees as Tesla. Should their cars be 5x as big or 5x more expensive?

This is a bad example. Tesla is a failing/failed car company that can't produce 200k cars per year. Ford has their truck program that sells that in a few months.

>Comparing a digital text product (Moby Dick) with a digital text product (a NYT article) is as close as it gets.

Comparing a book vs a timely article is unfair. NYT produces content daily to encourage shares and people clicking on various links on the website. Links, Images, Videos, comments, etc... None of those are available in dumb text.


> NYT needs to know how long you spent, on which articles, etc. They need data to produce the product and you can only achieve that with javascript tracking pixels (Server logs aren't good enough).

Nonsense. I subscribe to the NYT so that I can read the news. Nothing about that necessitates tracking which users read which articles.

If the NYT uses page view data for anything other than statistics for their advertising partners, it's a shame. I don't want the NYT to tailor write their articles to maximize page views, time spent, or any other vanity statistic; if I felt like reading rage bait fed to me by an algorithm personally customized for all my rage buttons, there is plenty of that elsewhere.

NYT's differentiating factor is that they are one of the few businesses left that pays people to conduct actual journalism. If they give up on that, then I imagine their customers will just go to buzzfeed or wherever.


So, how does the NYT style section fit into your idea of 'actual journalism'?


It doesn't. Not every word printed in the NYT is journalism, but it doesn't change the fact that they are one of the few websites that have any journalism at all.

If the NYT cut their paper down to just the style section, horoscopes, and other garbage, they would be just another Buzzfeed and are probably not equipped to compete.

On the other hand, Info Wars, Mother Jones and friends offer publications with basically no journalism at all. The NYT, WSJ, Miami Herald, Chicago Tribune, and so on fill the opposite space: they do Pulitzer-prize-worthy reporting. If these papers become run-of-the-mill click farms, I'm sure Silicon Valley will run them out of business, as rage bait is not really their core competency.


> NYT needs to know how long you spent, on which articles, etc. They need data to produce the product and you can only achieve that with javascript tracking pixels (Server logs aren't good enough).

This just seems like such an abuse of what the web was meant to be. I can imagine the horror people in the 90s would have experienced if they knew what JS was going to be used for when perusing news sites.

Sometimes I wonder if it would have been better keeping the web as a document platform without any scripting, and creating a separate one for apps.

Anyway, an alternative model news sites could use is to let users choose which content they want to pay for. That's a way to track which content users prefer.


> I can imagine the horror people in the 90s would have experienced if they new what JS was going to be used for

We understood the horror no less than we do now. Javascript in the 90s gave us infinite pop-ups, pop-unders, evasive controls, drive-by downloads and otherwise hijacked your browser and/or computer. There's a reason Proxomitron and other content blockers hit the scene by the early 2000s-- the need to shut that shit off was clear.


Yeah, people under 33 seem to have a romanticized view of the internet. They believe there were no ads and it was flush with the kinds of content we enjoy today. Nope. Content existed but it was scarce/thin. Many internet users just stayed on AOL/Prodigy/Compuserve and never left to explore the WWW side of things. Those service providers were essentially national-level BBSs.

There was no youtube, wikipedia, itunes or reddit. No instagram, twitter or google earth. The internet was basically geocities where most webpages were fan pages or pages/forums about niche interests.

I think people want to believe that because they believe that if ads were to disappear off the internet tomorrow, nothing would change. They don't realize that ads subsidize the content they consume, whether it's a youtube video they're watching or a reddit thread, ads are paying for that content. Nothing is free.


The web was intended to be a free exchange of knowledge, not ad driven, regardless of how JS was abused in the late 90s and early 2000s. A scripting language was added to the web because of Netscape's commercial interest in creating an alternative to MS products.


> The web was intended to be

I'm sorry, but this is ridiculous. You cannot say what the web was intended to be because you had no hand in inventing it. You do not know the mind of Tim Berners-Lee.

In fact, I argue the opposite - he had a vision of a global hyperlinked information system. While he wanted the protocol itself to be free (a move away from gopher), the information itself had no such protections. And that is precisely what we have today; it doesn't cost anything to use the WWW protocol. His vision has been fulfilled.

Now, the information (the content) itself is another matter. IP laws exist for a reason, people want to be paid for the content they create. They have ownership of that content. Whether it's the latest episode of game of thrones, a video game IP, or a book I wrote, the law protects my intellectual property. If I want to charge for access to that content, I'm more than within my legal right. Whether it's accessed over WWW, a cable box, or purchased from a book store, it makes no difference to my legal protections.


You write as if Tim Berners-Lee died in 1990.

I think it's quite possible to know what TBL thinks about privacy-invasive javascript, given that he's very much alive and writing about such things to this day.

https://webfoundation.org/2017/03/web-turns-28-letter/


> Sometimes I wonder if it would have been better keeping the web as a document platform without any scripting, and creating a separate one for apps.

The app platform would consume the document platform because it is easier to enforce DRM requirements in an app platform than in a document platform.

I'm just thankful that I can still Print to PDF nearly anything. I don't think it will last much longer though because "web" "designers" are driven to destroy anything and everything that was actually good about the web in their quest to monetize every pixel on my screen.


They need data to produce the product and you can only achieve that with javascript tracking pixels (Server logs aren't good enough).

I disagree. They need journalists, and they need to find some way to monetize. Your argument implies there is no other way than to add user tracking. Sure, images take up space, but I refuse to believe the current way of the web is the only viable option.


The New York Times existed for 145 years from its founding in 1851 to the creation of its website in 1996, and it got by just fine without tracking pixels in all those years.


This is a fallacy. Humans got by just fine without smartphones, Internet, electricity for thousands of years. While you could do just fine without those things today it is impractical. Times change (pun intended).


> NYT needs to know how long you spent, on which articles, etc. They need data to produce the product and you can only achieve that with javascript tracking pixels (Server logs aren't good enough).

That's analytics. The marketing stuff that is causing the bloat isn't doing that much in comparison. Trackers are often coded very wastefully and are redundant by nature. You can easily have dozens to hundreds of them all doing the same stuff, just with different APIs. It is insane and out of control and has absolutely nothing to do with gathering insights about your app and improving it. It is pure external 3rd-party marketing.


Newspapers did just fine for centuries without tracking. The business is viable without tracking.


> They need data to produce the product

[citation needed]

There's no reason they need to use Javascript to track user behavior down to "how long have they read this article".


> NYT employs 1,300 people.

See 2 in the article. A sample:

“As Graeber observed in his essay and book, bullshit jobs tend to spawn other bullshit jobs for which the sole function is a dependence on the existence of more senior bullshit jobs:“

I work for one of the vacation rental companies.

No reason private owners couldn’t be doing the work over email. But certain fetishized models of doing, in this case cloud and web apps, get the focus.

It’s all for eyeballs and buy in at scale to justify the bullshit. “Look everyone is watching us talk up this shit! Better keep justifying it, bringing them into our flock!”

It’s turned us all into corporate sycophants. Religious conviction isn’t limited to belief in sky wizards.

Anything sufficiently magical to the layman will instill blind allegiance.

And despite all the smart people here, life as is seems magical and there’s a lot of blind buy-in.


This dovetails into an idea I had [0]. Basically, client-side scrape the web as it's used and serve people plain text and simple forms. It would have a maintained set of definitions, and potentially even logic, to put a better "front" on all this bullshit. It's like reverse ad block, where you whitelist some content instead of blacklisting it. You could argue sites will get good at fighting it, but if it were used enough by the common user, they'd just alienate them (e.g. my scrape/front for Google search makes it clear which results the app has a friendly scrape/front for).

0 - https://github.com/cretz/software-ideas/issues/82


I've often toyed with the idea of using multi-user systems over SSH running Gopher and Lynx to achieve something like this.

In the process, it would also decentralize communities and establish digital equivalents of coffee shops (i.e. places to work in public and meet strangers)--basically SDF, but deployable on Raspberry Pis with more modern userland toys (i.e. software actually designed to be multi-user on the same system).

[1] https://en.wikipedia.org/wiki/Gopher_(protocol)

[2] https://en.wikipedia.org/wiki/SDF_Public_Access_Unix_System


My main reason for client side is to skirt legal troubles that can result from running a web-filtering proxy (not whether it's legal or not, but whether you will be in legal fights). Either way, needs to be as transparent as possible and as usable by the less-tech-savvy as possible.

But that's really all it is, a web server (or an app, or an extension, or a combo) that serves you up the web looking like Craigslist. Would require strongly curated set of "fronts"/"recipes".


Sounds a bit like tedunangst's miniwebproxy[0]. I've been wondering about writing either something like it or a youtube-dl-like "article-dl" for my own use, but haven't quite been annoyed enough into doing it yet.

[0] https://www.tedunangst.com/flak/post/miniwebproxy (self-signed cert)
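
If all you want is the plain text of an article, a crude but effective stand-in for that "article-dl" idea already exists in lynx (URL below is just a placeholder):

    # dump a readable, script-free plain-text rendering of a page
    lynx -dump -nolist "https://example.com/some/article" > article.txt

It won't extract just the article body the way a readability-style tool would, but it strips the scripts, trackers and most of the bloat for free.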



I don't know how many people here read Usenet or were on old mailing lists. You could have removed some hard edges in usability and everything would have been connected to phones with monochrome text displays back in the nineties already. They didn't even provide a decent email experience.

But instead we got the technology developing through ringtone stuff advertised on TV.

I guess it's something that you can instantly show to your friends.


That's one of the wildest things! We had the technological capacity to run Unix systems with several hundred users simultaneously and access at the speed of thought with 26kbps modems back in 1992, complete with instant messaging and personal directories! What happened?!


Another wild thing like this is what you'll notice when you read up on Lisp machines. We had development environments in the 70s/80s that would seem magical today.


I’m reading the book Valley of Genius, where the Xerox PARC people basically make the same argument. The Xerox Alto’s Smalltalk environment still isn’t matched today and the PC experience is much weaker for it.

The problem with those kinds of environments (where everything is editable at runtime using highly expressive languages) is they assume everyone is a power user and there are no malevolent actors trying to mess up your machine. That’s not what the modern landscape is like.


kinda so did reasonably modern ideas. take active desktop for example! sure it’s more high level, but i believe the quote is that they wanted websites to do “cool things” with the desktop. cringeworthy by today’s standards...

things get more locked down as we develop abstractions that we have more control over


It's heartbreaking.


Shows we are not limited by technology but somehow get distracted by other things.


The "other things" are the short-termism and appealing to the lowest common denominator that go with the pursuit of profit before anything else.


In Usenet’s particular case it was its open, unmoderated nature that killed it - once it became 99% spam, warez and CP most ISPs dropped it.


There were moderated groups but IIRC they were updating more slowly because every message was reviewed.

Anyway, there was a certain barrier to entry, so there were fewer users and messages. But some really good experts posted there. And some really fun jokers.


I support your effort to make Moby-Dicks the football-field-like unit of measurement for text-focused data. It’s close enough to the 1.44 MB floppy disk to handle easy mental conversion of historical rants, and half of the people reading this have probably never held one of those. I still remember downloading a text version of a 0.9 Moby-Dick book from some FTP site and carrying it around on a floppy so I could read it on whatever computer was handy.

That aside, the most shocking part of your analysis is how inefficient the nytimes was at caching resources for your reload.


For a rather more technical comparison, 4,600 pages is more than the size of Intel's x86/64 Software Developer's Manual, which is ~3-4k pages.


"I just loaded the New York Times front page. It was 6.6mb."

   ftp -4o 1.htm https://www.nytimes.com

   du -h 1.htm

   206K
For the author, 206K somehow grew to 6.6M.

Could it have anything to do with the browser he is using?

Does it automatically load resources specified by someone other than the user, without any user input?

Above I specified www.nytimes.com. I did not specify any other sources. I got what I wanted: text/html. It came from the domain I specified. (I can use a client that does not do redirects.)

But what if I used a popular web browser to download the front page?

What would I get then? Maybe I would get more than just text, more than 206K and perhaps more from sources I did not specify.

If the user wants application/json instead of text/html, NYTimes has feeds for each section:

    curl  https://static01.nyt.com/services/json/sectionfronts/$1/index.jsonp
where $1 is the section name, e.g., "world".

The user can use the json to create the html page she wants, client-side. Or she can let a browser javascript engine use it to construct the page that someone else wants, probably constructing it in a way that benefits advertisers.
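
For example, something like this (assuming the feed is a standard JSONP response, i.e. JSON wrapped in a callback) strips the wrapper and pretty-prints the raw data, which could then be templated into whatever minimal HTML the reader wants:

    curl -s "https://static01.nyt.com/services/json/sectionfronts/world/index.jsonp" \
      | sed -e '1s/^[^(]*(//' -e '$s/);*$//' \
      | jq .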


I don’t think there is anything wrong with user agents downloading resources (like images and stylesheets) linked to by an html document. It is the providers, not the user agents, who have violated the trust of users by including unnecessary scripts, fonts, spyware, advertisements, etc.


"I don't think there is anything wrong with user agents downloading resources (like images and stylesheets) linked to by an html document."

Neither do I. For some websites, this is both necessary and appropriate.

However, in cases where the user does not want/need these resources, or where she does not trust the provider, I do not think there is anything wrong with not downloading images, stylesheets, unnecessary scripts, fonts, spyware, advertisements, etc.


My pet comparisons for everything being too big nowadays are Mario 64 (8mb!), Super Mario World (512kb!), and Super Mario (32kb!!).


I first realised how heavy these pages are when I disabled JavaScript. Things load in the blink of an eye. *Most* pages work and the web remains largely usable.


Yes, this is my experience as well; JavaScript is often the key antagonist. Unfortunately many websites require JavaScript to function.


bbc.co.uk will load just fine without JS and actually be more enjoyable (IMO) than JS version.

cnn.com fails miserably without JS.


IMHO you are confusing data with information with knowledge. And mixing mediums. You can't compare a novel - the plainest of plain-text mediums - with the online front page of a major news organization in 2018. Of course it will be interactive content; it's an entirely different medium, a different market, and a different set of user expectations and competition.

https://www.quora.com/What%E2%80%99s-the-difference-between-...

DATA: a "given" or a fact; a number or picture that represents something in the real world; the raw material in the production of information.

INFORMATION: data that has meaning in context; data that has been related; data after manipulation.

KNOWLEDGE: familiarity, awareness and understanding of someone or something, acquired through experience or learning; a concept mainly for humans, unlike data and information.


but don't want it to be hugged to death:

Incidentally, this is also another reason for keeping pages small - bandwidth costs. I remember when free hosts with quite minuscule monthly bandwidth and disk space allotments were the norm, and kept my pages on those as small as possible.


I just loaded up a nytimes[1] article too - and it only weighed in at 1.0MB. For a 1,000-word article. Subsequent reloads dropped it to ~1000KB. I don't think that's too bad, considering there are images in there as well.

Now of course, I'm running an ad blocker. I assume the remaining MB that you noticed had come from advertising sources. In which case, bloat isn't the issue, ads are.

[1] - https://www.nytimes.com/2018/07/31/us/politics/facebook-poli...


weighed in at 1.0MB. For a 1000 word article. Subsequent reloads dropped it to ~1000KB.

You really can't beat savings like that.


That'll knock dollars... no, cents... no half-cents off his ISP bill!


That distinction is nonsense. The ads are part of the page and are no more or less bloat than the rest of the useless junk that gets embedded. They're deliberately put there by the NY Times; they don't end up there by accident.


Are you also running noscript? I'm running a DNS sinkhole and still get 3mb on reloads.


Comparing the raw text of a fiction novel to the code of a website is a pretty asinine comparison, honestly.


Maybe you'd prefer comparing the code of a website to the amount of useful content on the website, which OP also did. Taking "I'm downloading 100mb worth of stuff (83 Moby-Dicks) to read 72kb worth of plaintext" at face value, we could also say that 0.072% of the data transferred is useful, or, equivalently, that 99.928% of it is crap.


You don't have to download the typography of a physical book but it still plays a huge role in the readability and enjoyment of it. So I guess the typography of websites is "crap" because it has to be downloaded?

It's a ridiculous apples-and-hammers comparison thinly veiled as an intelligent critique.


I've already downloaded everything I need for perfect typography. I can apply it to most websites with Firefox's Reader Mode. Websites cannot possibly improve on this, because the best and most legible typography is the typography you're most familiar with. I don't care about branding or image or whatever bullshit "designers" use to justify their jobs. Web fonts and CSS have negative value to me. I disable them as far as possible.


How much of the data downloaded is actually for typography?

Also, browsers have good enough typography by default, which can be controlled with CSS.


Everyone's favorite example of that kind of "brutalist" design:

http://bettermotherfuckingwebsite.com/



A++++++ would inspect elements again.


Aww! Thanks ;-)


Came looking for motherfuckingwebsite.com & find an even better site to reference now instead. Kudos!


Bad design because of the low contrast text. There used to be a superior version at https://bestmotherfucking.website/ , but it isn't loading correctly for me on Firefox.


I can put raw text on a kindle and read it. In fact, I frequently do. Comparing digital text to digital text is not apples-and-hammers.


Exactly, it’s not the NYT’s fault that plain text compresses well compared to JPEGs.

Also, is a world where the NYT is subscriber-only really preferable?


This is a textbook definition of a false dichotomy. There are other distribution models for digital news services. There are other methods for transmitting digital content. It's not an either/or situation.


Fair enough but these rants on HN about publishers never seem to contain any examples of publications that are both successful and delivering pages that weigh scarcely more than their plain text equivalent.

I think you invite the “false dichotomy” by making the comparison between plain text Moby Dick and the front page of one of the most successful newspapers in the world in 2018. I agree with many of your points but find the way you make your argument to be full of comparisons of apples and oranges while avoiding proposing any kind of solution.


> Fair enough but these rants on HN about publishers never seem to contain any examples of publications that are both successful and delivering pages that weigh scarcely more than their plain text equivalent.

Examples would be the same publications 10 or 20 years ago. They weren't exactly plain text but they were a lot lighter and the content has not improved measurably in that time.

Here's one from the 70's that's still going: https://en.wikipedia.org/wiki/Minitel


I don't care about examples from 10 or 20 years ago, I want examples from the current market.

Minitel isn't a publication? I'm familiar with Minitel, I've read a book on it, but I don't know what you're trying to insinuate by linking it here in a discussion about publishing.

Also, "still going strong?" Minitel was discontinued, in 2012. Because France has the modern internet now.


Did you read the article I posted? I definitely put forward both an acknowledgment that media firms won't change and ideas towards reducing usage where possible.


Again with your post I see many more words about Moby Dick and nostalgia for a past that no longer exists than words about a possible solution, and while your solutions are better than the typical “just move to a subscription model,” I can’t see how we would generate the political will for anything other than cheap internet. We can’t even agree to tax carbon emissions yet. How are we going to tax bandwidth and convince everyone to accept a low-bandwidth internet?

I don’t have any great ideas myself, but part of me sees the Americans with Disabilities Act as a model for getting this done.

I really don’t like AMP because of the Google Cache and the potential for Google to bias their search results page to emphasize AMP pages over similarly performing non-AMP pages, but it’s a better-thought-out attempt to fix “the Bullshit Web” than anything else I’ve seen, and it has been extremely successful in decreasing payload size for many readers of sites like the NYT and other major publishers.


Completely agreed. I could continue by comparing it to the amount of bandwidth in a 30-minute CNN broadcast, all to have a few thousand words read at me.


Broadcasts are great! You can send a lot of content out, and it’s essentially zero marginal cost per additional receiver. Plus, it’s very hard to track behavior of consumers without their explicit consent.

People are most familiar with realtime broadcast audio and video, however I could see something like newsgroups working via a broadcast medium.


I can't agree more with the points you make, I've spent a decent amount of time and effort reducing the overhead of my blog, for example - https://goose.us/thoughts/on-the-purpose-of-life/

That page includes images and "embedded" YouTube videos, but loads 454kb across 10 requests in 450-600ms for 3,187 words - if anyone has any suggestions on how to reduce that further, I'd love to hear them.

Going to bookmark that txti.es service for the future, definitely seems useful for publishing simple content without needing to fit it into a blog theme.


I'd start with the 100+KB PNG images. None of those pictures are complex, they can be compressed much more.


I put them all through ImageOptim, so not sure if those PNGs can compress anymore...

I agree on those complexity graphs since they're actually scanned from Sean Carroll's book and edited in Pixelmator - if I was better with graphics I could probably recreate them as SVG or in an editor and make it only a couple kb.

Same for the MinutePhysics screen grab, though I couldn't bring myself to attempt a poor recreation of their work and couldn't get rid of the weird pink gradient background which probably prevents better compression.

The Youtube cover images are still being pulled in from youtube - I debated downloading/compressing/hosting them myself, but figured I'd gain more from downloading them from the separate domain since I still think there is a default limit of connections the browser will open to a single domain at the same time.


That's because PNG is lossless. Use JPEG and you'll dramatically cut down the size.
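
If it helps, the conversion is a one-liner with ImageMagick, or pngquant for flat line-art where JPEG artifacts would look bad (filenames here are just placeholders):

    # lossy JPEG at quality 80 is usually indistinguishable for photos/scans
    convert complexity-graph.png -strip -quality 80 complexity-graph.jpg

    # for flat diagrams, lossy palette reduction often beats JPEG
    pngquant --quality 60-80 --output complexity-graph.min.png complexity-graph.png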


Have you tried opening the image(s) in Photoshop and outputting them at different resolutions and compression settings?


Ditch the "embedded" videos. Use a screenshot linking to the videos directly.

Embedded videos still load a ton of shit and track visitors without their consent.
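
One low-tech way to do that, assuming YouTube's public thumbnail URL scheme (VIDEO_ID is a placeholder): grab the cover image once, host it yourself, and link straight to the video.

    # fetch the video's own cover image once, then serve it as a static file
    curl -o video-cover.jpg "https://img.youtube.com/vi/VIDEO_ID/hqdefault.jpg"
    # in the page, a plain link replaces the iframe embed:
    #   <a href="https://www.youtube.com/watch?v=VIDEO_ID"><img src="/video-cover.jpg" alt="Watch on YouTube"></a>

No iframe means no third-party JS or tracking until the reader actually clicks through.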


> mb

I think you didn't mean millibits, but megabytes (MB). 1.2 mb = 1.5e-10 MB.


I must say, despite what I imagine is a good bit of traffic the instant load times on your site were a joy to behold. I never get to experience that kind of speed on the "modern" web. Even HN loads orders of magnitude slower than that.


Not my website, but please send along thanks and awareness to @thebarrytone on Twitter!


> Moby Dick is 1.2mb uncompressed in plain-text

How many times did you read Moby Dick online?


> As my father was a news and politics junkie (as well as a collector of email addresses)

I feel like there's something I'm missing here: something like getting his hands on clever usernames?


NoScript cuts the bullshit down to 1.35 MB with all scripts blocked and it's still readable. I can barely tolerate the web without it.


You want to compare plain text to a newspaper, which, even in paper form, people expect to contain pictures and advertisements.


Bad comparison, Moby Dick never tracked your activity across the web and sold your data to advertisers.


Moby Dick doesn’t have any pictures or video.

Audiovisual media has value.


Audiovisual media of value has value.

Audiovisual media that I choose to donate my bandwidth towards downloading may have value.

Audiovisual media in and of itself has no value.

Audiovisual media that automatically loads, thus slowing down the loading of everything else, has negative value.


The 100mb pays the bills for the other 72kb of text.


> to read just one article from the New York Times, I had to download the equivalent of ten copies of Moby Dick.

But how many Mona Lisas are this?


Well said.


This is both an appeal to people's universal appreciation of efficiency, and a weak denunciation of the modern web. Your argument is

1) that a website's value is the number of words on the page, and

2) that raw text is the highest value data that can be transmitted over the internet, and

3) that inefficiency and wastefulness of bits is a bad thing

First off, you need to defend your first two assumptions. Don't websites do a lot more than display text? Doesn't HTML/markup have an order of magnitude more value than raw text? And how exactly is being inefficient with something that is abundant a tautologically bad thing?


#3 is an interesting thought. So when something is abundant, inefficiency and wastefulness are fine?

When it comes to bits, abundant to who? Those who can afford said abundance? Certainly not to those with bandwidth caps and slow internet access.

Reminds me of a cartoon I saw once: "What if climate change turns out to be a hoax and we end up making the world better for nothing!"

Since when is doing something efficiently and non-wastefully not a good idea for its own sake?


  Since when is doing something efficiently and non-wastefully not a good idea for its own sake?
I agree we should make things efficient for their own sake. But that's not what people are arguing about. They are arguing that the modern web is bad, and their reasons are weak. The modern web is terrific and we should not carelessly denigrate it, which is what I'm against.

Do not denigrate the modern web in the name of efficiency when your measure for evaluating websites is wrong and you incorrectly assume that the difference between a 1mb and 10mb payload matters to the actual operation of the site and visitor satisfaction.

Re: That comic you once saw.

Yeah, what if we shut down all coal power plants and make the earth better? Oh wait, now we have destroyed our critical infrastructure and our government/nation has absolutely no leverage to even make decisions regarding the environment. If people would stop dumbing down these complex problems, it would be a good thing.


If those coal power plants are replaced with alternative forms of energy, I don't see the problem... Talking about just "shutting them down" with nothing to replace them has absolutely nothing to do with that sentiment or the original cartoon... it's not remotely a reasonable reading of what I was saying.

Not to mention, my reply is to parent comment, not to the article in general. In no way did I say I completely agree with the author of the article- that's not what my comment was about.

However, if a news article with a load of JS and ads is considered to be this "modern web" then good riddance to it. I hope it goes away.

Yes, cool things can be done with the web, and some of those cool things take a lot of bandwidth or payload to accomplish - the examples (news sites, etc.) are not these things.

What are your reasons for thinking the modern web is terrific? I agreed with that statement when I saw it, but as I think about it, I'm not so sure I do.


Since when is doing something efficiently and non-wastefully not a good idea for its own sake?

When the effort to optimize something would be better spent elsewhere.

On the web there are two important points. First, bandwidth isn't always abundant so optimising is worthwhile, and secondly optimising is effectively a negative cost if you just don't include wasteful features in the first place - building optimally is less effort than building wastefully.


1 and 2 are your own extrapolations and have nothing to do with the author's arguments. Please point out where the author makes these arguments in case I missed something.

> Don't websites do a lot more than display text?

Yes, evidently, but the author doesn't disagree with this. In fact, the entire premise of the article is that websites do a bunch of useless stuff that have nothing to do with delivering the content you are browsing them for.

> Does HTML/markup not have a magnitude more value than raw text?

As a universal rule there is of course no answer to this. There is a lot of absolutely worthless text on the web. Given the example, though, would you say that the markup is worth more than the text on a website whose primary attraction is written articles? Do you frequently visit websites just to admire their markup?

> And how exactly is being inefficient with something that is abundant a tautologically bad thing?

My time is limited. RAM is limited, CPU time is limited. My mobile data plan is limited. No one is happy with a page loading for 10 seconds for something which should take 1/10000 of that. That's not to say that this is tautologically a bad thing. The author explains why he thinks it's bad, and I think that people who value their time and resources should agree. If you think that an article loading for 10 seconds is a bad thing, it's bad that an article takes 10 seconds to load. It only becomes tautological once you apply your value system to it, if ever.


1. A website's value is the amount of information that it provides to the end user. I would argue that the entirety of the works of Shakespeare or the 1911 Encyclopaedia Britannica provides more information than a high-definition picture of Donald Trump grimacing at EU leaders or an autoplaying ad for Doritos Locos Tacos NEW AT TACO BELL. As far as supplemental uses, at the end of the day, people are using plaintext to communicate with other people. Images and videos are secondary. If that ever changes, then society is already doomed, as literacy is fundamental to the maintenance of technology.

2. Raw text is the highest value data/information return that can be transmitted over the internet. There's a reason that Morse Code and APRS are still around: they're reliable, appropriate tech, and require little to no middlemen outside of the transceivers themselves.

3. If wantonly (namely, for no enduringly good reason) increasing the amount of entropy in the universe isn't tautologically bad to you, then I really doubt that any argument would sway you to the contrary.

Concerning the relative value of HTML and CSS, yes, you could argue that UX matters in that department, but even the most bloated static HTML/CSS page is going to pale dramatically in comparison to the size of what's considered acceptable throughput today.


1. Well how is the format of plaintext the best method of getting information to the end user? What if you added a thin indexing layer on top of the plaintext? That would allow people to jump through huge documents with ease, but it's no longer plaintext. Sounds more valuable to me. Where is the line? What's the ideal?

2. Fair enough

3. Referencing "increasing the entropy in the universe" isn't a good argument because the amount of entropy increase due to humans is much, much less than how much entropy is increased by particles being blasted out of all the stars in the universe (unless I fundamentally misunderstand what entropy is). I think that stars blasting out particles is a much larger contributor to entropy than humans not using computer bits effectively.

And also what does entropy as a concept have to do with anything, anyway? Why should human engineering tasks have such considerations? If being super efficient with an abundant resource has a large cost (of some sort), but low efficiency has no business- or environmental- downside, then why be efficient with it?

In your last bit you argue that it's acceptable to send a site that is bigger than even the most bloated HTML/CSS page; I don't think that's true for any site/app that wants to be fast. It slows you down and people notice and stop using your service unless it's required of them.

I think that in general things are not as bad as you make them out to be, and your arguments have some merit but are mostly revealed as nonsense when the rubber hits the road. Universe entropy is completely unrelated to modern software engineering and web sites that have a lot of devs and are not SalesForce are not _that_ bloated.


> but low efficiency has no business- or environmental- downside, then why be efficient with it?

But it does. Data transfer and processing isn't free. It runs on electricity. You may think that the difference between 10KB (efficient) and 10MB (current web) is meaningless because resources are abundant and it lets you save a couple hours of dev time[0] - but consider that this difference is per user, and you saved a couple hours for yourself by making thousands[1] of people waste three orders of magnitude more electricity than they would if you were a bit more caring.

Like plastic trash and inefficient cars, this stuff adds up.

--

[0] - Such savings on larger pages obviously take much more work, but then this time gets amortized over all the use the website has - so the argument still holds.

[1] - Millions, for large sites.


I don't think people actually waste three orders of magnitude more electricity by loading 10MB vs 10KB - sure, that much more CPU time is used specifically on loading the extra data, but that would be a fraction of what's being used for all the other processing going on, and people don't just flip the power switch as soon as a page load finishes.


> I don't think people actually waste three orders of magnitude more electricity by loading 10MB vs 10KB (...)

Yeah, they actually waste more in this case. The base cost is that of processing of content, which is linear with size (on a category level; parsing JS may have a different constant factor than displaying an image). But in the typical makeup of a website, just how much stuff can be in the 10KB case? Content + some image + a bit of CSS and maybe a JS tracker. In the 10MB case, you have tons of JS code, a lot of which will keep running in the background. This incurs continuous CPU cost.

> and people don't just flip the power switch as soon as a page load finishes

CPUs have power-saving modes, power supplies can vary their power draw too.

Or, for those with whom such abstract terms as "wastefulness" don't resonate, let me put it in other words: if you ever wondered why your laptop and your smartphone drains its battery so quickly, this is why. All the accumulated software bloat, both web and desktop, is why.


I agree with your point, but the GP's point has validity too: the infrastructure to get that page to where it is read does draw power in proportion to the amount of data it's handling.


As far as #1 goes, I'm arguing that plaintext is the ideal to strive towards, not the living practical reality. As far as indexing and access, the Gopher protocol and Teletext are great options to look at.

As previously noted, if you don't find that waste is fundamentally wrong on a moral level, there's no point forward from here. I view myself on a planet of dwindling resources, vanishing biodiversity, and warming at increasing rates.

If you think the energy that goes into computation is free or lacking external environmental downstream effects, then at the root of it, you carelessly shit where you eat and I don't. That's a fundamental disagreement.

[1] https://en.wikipedia.org/wiki/Gopher_protocol

[2] https://en.wikipedia.org/wiki/Teletext


Well hold on, I don't disagree that we are on a planet of dwindling resources. It's the method of environmental improvement that will have the greatest positive effect that is the root source of disagreement. That's the crux of the problem - what is the process of solving these problems?

I would argue that how big our websites are is not the important factor. I submit to you that PC electricity usage is the most relevant quantity we need to discuss when it comes to consumption of bits. I argue that the increased load on the network from sending more bits is negligible compared to the many end users and their PCs that consume our data.

If we agree that PC electricity consumption is the most important thing to address, then we must ask whether or not the electricity generation process is bad for the environment. Most likely, electricity is generated by hydroelectric dams or coal/combustibles power plants. Suppose we replace those two types of power generation with low-maintenance, 50-year-lifetime solar panels (for which the tech exists). Can you still argue that the increased amount of bits sent over the wire for heavy modern websites is an environmental negative that we should address? I would say, no, at this point we have reduced the environmental impact of most electricity-consuming devices, and we can ignore PCs for the time being.

Therefore it is not the personal computer and the quantity of bits it consumes that should be your focus. It should be electricity generation.

I would like to ask you to consider whether or not your compassion-based arguments contain any resentment. Are you acting and speaking entirely on the grounds of compassion? And if so, how can you be sure that your supposed actions are going to reduce suffering of people and the planet and not have the opposite effect? How can you suggest solutions, like decreasing the weight of websites, and know with a high degree of certainty that it will produce the desired outcome (environmental preservation)? Could it have an unintended consequence?


In all honesty--you're right about there being bigger problems.

But let me put it this way: I still reduce, reuse, and recycle even though I know that one unconscientious suburban family will essentially dwarf my lifelong efforts in a year of their average living.

I know that those efforts are futile for the end goal of environmental conservation. That doesn't mean that I'm going to stop doing them. Being dedicated to acting in accordance with an understanding of first principles is not a bad thing, even if those actions are relatively impotent or ineffectual in and of themselves in the current moment.

As far as changing out power sources to nominally sustainable forms, yes, I would still find issue with people wasting those resources, just as I would find issue with people running air conditioners with the windows open.

As far as compassion and unintended consequences, everyone might be here for a reason and maybe trashing the planet is part of that plan, but equally so I might be here to speak against trashing the planet as a part of said reason and said plan.

It boils down again to if you need to find a reason to justify minimizing unnecessary energy usage, we're not going to see eye to eye and I doubt any argument will sway either of us towards the other's camp. Chalk it up to different contexts.


Gopher - I remember setting up and using that as part of an intern-like job at my local high school. Back in the days of Trumpet Winsock and other relics of the hand-crafted TCP stack. *shudders* Mind you, it does deliver text at low bandwidth. :) Minimalism in communication. I wonder if I can get my SO off Facebook and onto Gopher in the interests of the environment... *manic laughter fades into the distance*


I have given this discussion a good looking over to see if anyone cares for the environment. Glad to find someone that does.

I believe that care for the environment and design that puts being green first is going to have its time in web design. I also believe that, along with document structure, accessibility and 'don't make me think' UX, eco-friendliness is going to become a core design principle in a lot of the web. If you put this stuff first then you can have a website that is pretty close to the plaintext ideal. This can be layered on with progressive web app 'no network' functionality and other progressive enhancements, e.g. CSS then JS, with the content working without either of these add-ons.

We all know that you have to minimise your scripts and mash them all into some big ball of goo, and we all know that images that are too big aren't going to download quickly. But the focus is on 'site speed' rather than being green. In fact no developer I have ever met has mentioned 'being green' as a reason to cut down on the bloat, and existing thinking on 'going green' consists of having wind turbines rather than dinosaur farts powering the data centre. Cutting down the megabytes to go green is kind of crazy talk.

A lot of this thinking is a bit like compacting your rubbish before putting it out for the bin men. Really we would do best to skip the rigmarole of compacting the trash and just have less of it to start with, ideally with more of it re-used or put out for recycling.

We saw what cheap gasoline did to the U.S. auto industry. For decades the big three added on more and more inches to bonnets (hoods) and boots (trunks) with very big V8 engines a standard feature. Until 1973 came along there was no incentive to do otherwise. Who would have thought to have cut down on the fuel consumption?

Outside of America, in the land of the rising sun they did not have a lot of oil. Every gallon they bought had to be bought in U.S. Dollars and so those U.S. Dollars had to be earned first. Europe faced the same problem so economy was of importance in a lot of the world outside America. The four cylinder engines powering cars in Europe and Japan became vastly more efficient than U.S. V8 monster engines. Not only that but cars with a four cylinder engine did not have to weigh many metric tonnes. Nowadays the big three can only really make trucks and truck based SUVs that are protected with the Chicken Tax. Nobody in America is buying U.S. made small cars, U.S. made luxury sedans or even U.S. made 'exotic' sportscars. Economy and 'being green' is not a big deal to U.S. car buyers, nonetheless the lack of having efficiency and economy as a core part of the design ethos has led to a domestic industry that has lost to the innovators in the rest of the world that did put these things central to what they do.

We haven't had 1973 yet and the web pages of today are those hideous Cadillac things with the big fins on them and boots big enough to smuggle extended families across the border with. AMP pages are a bit like those early 'Datsun' efforts that fell short in many ways. But I think that the time of efficient web pages is coming.

The Japanese also developed The Toyota Way with things like Just in Time and a careful keeping tabs on waste. Quality circles also were part of this new world of manufacturing ethos.

The old ways of making stuff didn't really give the results the Japanese were getting but exchange rates, Japanese people willing to work for nothing and other non-sequiturs distracted people from what was going on and how the miracles were achieved. The Germans and the Japanese built great engineering 'platforms' and then got some great styling for the bodywork from the legendary Italian design studios to package it all together. Meanwhile, in the USA there were more fins, more chrome bullets on the grille and more velour in the interiors.

So with the web it isn't just the Lotus 'just add lightness' that is going to be coming along to kill the bloat. It is also ways of working. For a long time the industry has been doing design with lorem ipsum, static PDF mockups and then handing this to some developers with the client expecting not a pixel to differ from the mockups, regardless of whether any of it made any sense. So we have got stuck with the same carousels on the same homepages - the tailfins of the web.

Although it is 'industry standard' to work certain ways, e.g. the big up front design by someone who can't really read, the project manager who can't do HTML, the agile design process that means nobody knows what they are doing, something has to change. Content driven, iterative improvements and much else we forgot from the 'Toyota Way' will ultimately win out with things like being green actually being important.

As for the article, the 'bulls4it web' and David Graeber's ideas as applied to web bloat is an excellent contribution to what web developers should be thinking about.


I think the car analogy is flawed; yes, mid-century US cars were not very economical/efficient, but they sacrificed that for comfort --- big roomy interiors, cushy seats, soft suspensions, automatic transmissions, A/C, etc. The automakers were simply obliging to please customers with these features. There's a reason "econobox" is mostly a pejorative.

On the other hand, bloated slow websites only serve the needs of their authors, while annoying all their users. Users aren't asking for more tracking, ads, or any of that other bullshit.

(Full disclosure: I'm a big fan of vintage "Detroit Iron". You really have to ride in one to understand the experience.)


> On the other hand, bloated slow websites only serve the needs of their authors, while annoying all their users. Users aren't asking for more tracking, ads, or any of that other bullshit.

The bloat actually serves a lot of people, directly or indirectly. It is like packaging: everyone complains about packaging and plastic, but when you are in the supermarket do you take the box that is already opened or the tin with the dent?

There are lots of stakeholders behind the bloat, including the bullshit-jobs people of the online world, e.g. the SEO people, the people in marketing and the programmers. In my opinion the cookie-cutter way of churning out websites is being done by a lot of people who are barking up the wrong tree on how to do it, with knowledge of modern web technologies rarely gained. Buried in what you see is a bundle of reset scripts, IE6 polyfills and other stuff that nobody dares to touch as it has been there since 2009 and nobody knows what it does, whether within the company that wrote the CMS or in the agency that adds the 'theme'. It is worrying, really, when the best people can do is add layers of ever more complex 'build' tools to mash this cruft into something they don't have to think about.

P.S. There is no way I would be in an econobox if travelling through the American West, give me one of your trucks, SUVs or even a sedan any day. In Europe though, tables are turned, a country lane or a city stranded in a U.S. vehicle would be a special kind of hell.


Awesome take on the situation.


In response to #2, there is also a reason that photojournalism exists--unless you believe in a world where everyone imagines what important figures and historic events look like based solely on textual descriptions. This is completely ignoring the fact that neither I nor most of society would ever attempt to receive the day's news via Morse code.


Hint: TCP/IP & UTF-8 are fundamentally along the same principles as APRS. So yes, really you are.

As far as photojournalism goes, I can think of thousands of reasons why the predominance of photojournalism has been pernicious to civil society. The strategic use of the identifiable victim effect in atrocity propaganda being the most obvious.

However, with that said, I'm not arguing that visual media is entirely unnecessary, but rather against the idea that every user needs to download a 1680x1050 image when a default 600px width image or even the horror of having the image as an external link would be more than adequate for the overwhelming majority of users, particularly those in rural and developing areas that can't afford to waste their total allotment of monthly mobile data on "What Chance the Rapper’s Purchase of Chicagoist Means".

[1] https://en.wikipedia.org/wiki/Atrocity_propaganda

[2] https://en.wikipedia.org/wiki/Identifiable_victim_effect

[3] https://www.nytimes.com/2018/07/31/opinion/culture/chance-th...
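
On the image-size point above, generating a sane web-sized derivative at publish time is a one-liner with ImageMagick (filenames are placeholders):

    # downscale to 600px wide, keep aspect ratio, recompress
    convert hero-1680.jpg -resize 600x -quality 80 hero-600.jpg

Serve that by default and link to the original for anyone who actually wants the full-resolution file.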


Your math does not add up. It is 1 * (6.6 + 3 * 5.9) + 3 * (5 + 3 * 5.9) == 92.4. Unless you are speaking about approximations obviously.


Unpopular opinion alert:

Maybe the "bullshit" is only bullshit to you, the thorny tech-savvy reader. Maybe businesses have tried the plaintext approach, and their business was improved by adding fonts, stylesheets, API calls, spinners, scripts, high-res images, and god knows what else. Maybe speed improvements are not important beyond a certain point. Maybe 5MB doesn't matter to most people. Maybe micro-optimization is costly in large organizations.

Maybe other people making these decisions aren't idiots, and maybe, just maybe, they're even thornier and tech-savvier than you.


It's bullshit all right. Thousands of CPU hours, megabytes and dollars wasted on some user-spying turdheap the marketing guys NEEDED to put in an app, 'to track user behaviour'. What a euphemism. Without spilling too many beans, we're using a solution that records an actual movie of the user using our app, each time a session is started.

In six months, the only thing we learned: half of the users that finish setting up don't continue to use the app. We could have learned that from server side logs without invading user privacy.

Another story: four employees of a large corporate running targeted marketing based on the behaviour of customers logged in to their website.

Added revenue versus control: 60.000 yearly. Sub-0.1% conversion rates. The corp had been running these campaigns for at least five years.

My personal stories. Anecdata, I know.

There is SO MUCH BULLSHIT.

Edit: to clarify the first example, it's not a movie using the front facing camera, 'just' a screen grab


>Thousands of cpu hours, megabytes and dollars wasted on some user spying turdheap the marketing guys NEEDED to put in an app, 'to track user behaviour'.

I understand that somewhere in a perfect universe, tech guys just make things right and money just flows in without those pesky marketing people ever being involved. Unfortunately, in our universe, if you don't do marketing and don't analyze user behavior your business is toast.


In the pre-monetized web, yes the tech guys made things right without money extracted from the web.

Standards, and many of the applications that implemented them (including commercial ones), were designed with the aim of making online computer use possible and beneficial. This is the domain of academics, committees, and concerned individuals creating the means of useful computing & communication, outside of the parasitic desperation to inject unnecessary rent-seeking lock-ins into everybody's lives.


Swell history lesson, but I don't care about academics and neither do the vast majority of internet users. They can use the internet too, of course, but there's no point moaning about web pages getting larger. I was using the internet in the 90's and it sucked - literally everything about the internet is better today. My life would be no different whether a web page was 5 megs or 5k, just like it makes no difference whether Moby Dick is 100k or 100 megs. It's still getting downloaded on my phone, getting copied onto my Kindle, etc.


> I don't care about academics [...] literally everything about the internet is better today.

The average message posted to comp.lang.c before Eternal September was far better than the average message posted to HN, Proggit, etc. But many of the former were written by academics, so you probably don't care.


My post has nothing to do with web page size. This is specifically on how monetization has affected that which we use for personal communication and seeking information.


> In the pre-monetized web, yes the tech guys made things right without money extracted from the web.

It was extracted from tuition paid by undergraduate students and taxes paid by taxpayers.

And that money all eventually came from dirty capitalism.


It's not some nebulous evil of "capitalism" that is the issue here; it is the filtering, manipulation, and distribution of people's private communications & data, and of the effective "town square". Money can be and has been made without these practices, which the law has not caught up with and about which there is strong ethical disagreement.

And yes, it cost a lot of money to get online and/or host data back then, which was subsidized by piggybacking corporate and university internet presence. Nowadays it's dirt cheap to personally host & access tons of content, so there's even less need to monetize.


Sure, if your content sucks. If you need all that bullshit to stay profitable then perhaps the business isn't really needed at all. "but that is the industry competition for you, everyone does it."

To that, all I can say is perhaps some regulation is required to level the playing field.


Strictly speaking, most individual business isn't needed.

If the NY Times folds tomorrow due to bankruptcy, there are dozens of other papers where we can get the news. Their reporting is good, and I choose them over their competitors but losing them wouldn't be the end of news as we know it.

That's the case with most businesses---the marketing and sales are only needed to compete, whereas without them the product would still exist for consumers.


Your choice of journalism as an example is unsettling. If the New York Times goes out of business, there is no guarantee that another news publisher will break the same stories. Some things will just not be investigated anymore and some important stories will just not be told.


But doesn't that hold true regardless? The presence of the New York Times means there are certain stories not being investigated or told. And we aren't even fully aware. We have no idea if those stories would be of more worth to us or not.


Can you elaborate more? I was working with the model that more investigative journalism is better, but you seem to be suggesting that the presence of some news outlets inhibits others.


Just in a basic sense. The NYT employs X people, sells to Y people, etc. Those are people who won't be employed by someone else, people who won't buy another paper.

There's a sort of critical mass of "news" that can be made. We can't all be investigative journalists.

So whatever that would be here instead of the NYT might be different, it might not be worse. It could be just as good, just different.

Of course, it could be worse. It could be better. We can't know.


I see your point. If you make a great product, users should somehow learn about it in order for your business model to work, because no users means no money. You can assume that people will just get so excited about your product that they will tell other people about it, share links, post social stuff, so that the growth happens organically without all that "marketing bullshit". And it does happen organically. The problem is that even for very good products such organic growth is way too slow, because it has an exponential nature (very slow in the beginning, accelerating quickly only at the end), so you run out of money before hitting profitability. There are outstanding examples of products that quickly went viral, but these are extremely rare cases. Hoping for this luck is like hoping to accidentally build the next Twitter. In other words, you shouldn't count on it if you're serious about your business. You need something that propagates awareness of your product in a more manageable way than pure luck. It's called marketing.


You don't want regulation to level the field, you want to set up a barrier to entry for smaller players. And I've thought of this many times, regarding all the terrible news outlets out there.

But you'd have to gut 1A to do it, and it would instantly be used for evil. Nope, just hell no.


I mean regulations in terms of specifically protecting ourselves with a digital bill of rights. No one gets to use telematics, no one gets to track users, etc.


How do you know if the client has an error and the content was not loaded/displayed as expected? How do you know if placing the content on the right side or the left side increases conversion rate by 10% if you don't do A/B testing? There are huge business advantages when you use analytics. You don't care about creepily "tracking" a specific person, you just care about getting better conversion rates and happier users.


You may not care about tracking individual people but in order to realize your business advantages you end up doing it anyway. They always put their business needs above those of the users. Increasing conversion rates always trumps user privacy and peace of mind. Claiming businesses care about happy users is just dishonest. They care about bottom line.

Business just shouldn't be able to freely collect mountains of information. They can't be trusted to do it responsibly. Collecting data is a privilege and it can be revoked. If they insist on doing it, we'll find a way to avoid sending the data. If they lock us out for not sending data, we'll send fake data.


Overall I think that this type of metric has had a significant detrimental effect on the quality of news items. I understand that there is a business advantage to knowing things like this, but I think there would be much less rage porn and clickbait around if this information weren't known to an entity that can use it to forgo relevant reporting on things that matter and instead manipulate our tendencies towards morbid curiosity and outrage.

I'd go as far as to say that it isn't only unethical to collect this information; it's unethical not to take active measures to avoid sharing it.


They're not idiots, they just don't give a crap about waste until it affects their bottom line. (Most humans seem to operate on this principle)

A lot of the business-related bullshit is a response to advertising and sales. But a lot is also lazy tech people slapping pieces together until it does a thing, and they don't care that it loads 100x more data than it needs to, because again, doesn't affect their bottom line.

Traffic costs must be pretty low, or nobody's watching that line on the AWS budget, or the CDN is eating the cost.


Why would they care? They have to do the work of five in the time of one because there are now off-the-shelf solutions for everything. Too bad if those solutions often come bloated and incorrectly configured; maybe they should hire more engineers. Some businesses prioritize loading speeds, but big ones with brand reputation won't care - they have little to lose.


Well then, shall we kick up more of a shit storm about it to give them a little more motivation?

If the argument is "nobody will do anything about it because nobody cares," isn't the best response "then start caring!"?

I'll see if I can find some old articles I used to love about how a group called something along the lines of "Concerned Christians of America" basically determined what could be on television for ages because they maintained a letter-writing force of a mere five thousand people. Nobody else was kicking up that level of shit storm, so the networks listened to them.


> they have little to lose.

Then let's make them lose. The more widespread bullshit blockers become, the more money they will lose.


In a world where video over IP exists, market rate traffic expenses have got to be among the least of the concerns of print journalism.


> Maybe businesses have tried the plaintext approach, and their business was improved by adding fonts, stylesheets, API calls, spinners, scripts, high-res images, and god knows what else.

All i can say is that I work at a small-ish multi-million dollar business that got bought by a very large multi-billion dollar business. Our site has become very, very bloated due to everything that our parent corp wants in the name of 'making things better'. I'm not saying we are stagnant, but those changes really haven't increased our bottom line in any significant way. We now spend more time and dev resources supporting a site with much more 'bullshit'. Real business decisions, ones that benefited the customer, are where real increases in value were seen.

This of course is anecdotal, but I've never really heard the opposite.


There are many logical reasons why various websites are so bloated. Certainly the market realities and web development culture play a role.

That doesn't change the fact that the end result is bullshit, though. The tragedy of the commons is still a tragedy.

I don't have a solution for commercial news organisations. If they can't completely subsidise their web presence through other means, it's pretty much guaranteed that the advertising pressures will turn it into crap.


I don't understand how you are relating this to the tragedy of the commons, other than saying that something having a reason behind it doesn't preclude it from being bullshit.


Everyone selfishly optimizing their bottom line ("let's earn a bit more by including extra JS analytics bullshit", "let's save some time on our end by making a million people waste a little bit of electricity every pageview") - each initially reaping some short-term benefits, but together ultimately turning the web into the shitshow it is today.

That's textbook tragedy of the commons.


People’s/companies’ own websites aren’t commons. The protocols that make up the web itself haven’t degraded through use; this is no more a tragedy of the commons than if every shop in your town made you wait a few seconds before you could go inside.


Temporal literally said that the internet is the "commons" and your counter is that companies and protocols aren't the commons.

You're just being purposefully obtuse, this is not hard to understand, assuming ones paycheck doesn't depend on not understanding it.


There’s no need to be so rude. I’m not being obtuse; I understand their point, I just disagree with it.

The internet itself is not degraded by some major news sites being slow. Their misuse of their part of the internet doesn’t affect it overall, thus it is not tragedy of the commons. Most of the websites I use on a day to day basis (GitHub, Stack Exchange, HN, most smaller blogs that I find through HN, etc.) do not have this problem. The experience of those sites isn’t made worse by other companies’ sites being bad so, again, it’s not a tragedy of the commons.


The internet as experienced by the average human is a poorer experience because of the advertising and spyware/tracking.

This is almost universally true, even if small areas have been fenced off. The fact that you can limit your use of the internet to HN and a few other sites is in fact a luxury.


I think his point is that the protocols are the internet.

I mean, the "internet" is just the word we give to the interconnected network of computers spanning the globe. It doesn't exist as a thing per se, but as an idea. The internet can never be slow. Your connection to a certain site can be slow. Whether that site is a particular web page or the internet service provider's router that provides you with access to the greater network.


In this case the internet is the sum of the online services people interact with.

Most websites are using advertising and spyware-like tracking, to the point that it can be said that interacting with the internet will be a poor experience without an ad blocker. There will always be the odd oasis website, but the internet has been completely swallowed by spyvertising.


No, but the frameworks they use and services they outsource tasks to definitely are commons. Even more so are design and development trends.


> Even more so are design and development trends.

No, they’re completely not. Trends are not a shared resource such that a few people misusing it harms that resource for everyone. Something being common doesn’t make it a commons. Any website can choose not to do these things - just look at HN. They’re not a necessarily shared resource in the way that all Baltic countries must fish from the same sea.


The internet is a shared, unregulated resource which is spoiled by individual actors selfishly following their interest of making money through ads.

It is rational for each commercial website to use advertising, and tracking and other crap. As the parent said, those people aren't idiots.

But in the end they are still ruining the internet for all of us.


Sounds like you work in the publishing business (and also that GP has tripped you up somehow). Care to elaborate what the rationale for publishers to bloat their web sites could possibly be? Because I know I've come to use news aggregators like HN as portal sites (plus RSS) rather than go to news sites directly for many years now, precisely because I want to know beforehand if a particular article is even worth visiting, rather than receive a crapton of script for nothing valuable in exchange.


I used to work at the FT, and we tested a number of hypotheses:

1) faster page loads mean better engagement
2) everyone hates popups
3) everyone hates autoplay

turns out that faster page loads _do_ make users happy - normal, non-tech-savvy users. (source: http://engineroom.ft.com/2016/04/04/a-faster-ft-com/)

I don't think the popup experiment has a write up, but people _hate_ popups, even more when they are not relevant.

I was part of the autoplay experiment, and whilst a few people really like it (it saves them clicking), the vast majority hate it with a passion.

Autoplay is a thing because it boosts ad impressions by 25-80%.

So for ad based revenue, yes, I can see why people do this. For subscribing sites, it kills your bottom line. So empirically they either rely too much on ads, or are idiots who don't have evidence to base decisions on.


No, it's herd mentality. Because it's what everyone else is doing, it's the standard, it's what you do; to be an outlier in this is to be deluged with claims of how archaic you are and then swarmed in marketing jargon that breaks down page-element interactions to an atomic level.

Meanwhile, users ditch the bloated web pages for apps or, if browsing from a laptop or desktop computer, for a curated view delivered by Facebook or Twitter or another service in the long tail (e.g., Tumblr, Pinterest, etc.). The effect may not even be plainly noticeable but it does work on a psychological level. All the cruft and gunk dissuades repeat visits and any desire to return to browse that internet space.

Some online spaces, mostly in the news vein, are even overtly abominable. My local news outlet (azcentral.com) is unbearable to load, with the autoplay video, the barrage of ads, both popovers and the hard elements that take up display space, leaving just a tiny portion of the screen real estate for the actual content a reader might be interested in.


The tons (and growing number) of normal, everyday users using ad blockers and other annoyance-reducing web plugins are evidence against this.

Also, we are not talking about the expense of “micro-optimization.” We are talking about not deliberately spending the time to add all this BS into the software. I guarantee it costs less to publish a text/plain document than an HTML/CSS/JavaScript trash fire.


By the same token, maybe the automobile and oil industry really care a lot about the environment, but the only way to produce a safe, affordable and performing car is their tech savvy way refined for a century.

Or maybe they just don't give a shit, and the current status quo allows them to make money without caring about the consequences, until we as an informed community demand a better way.


I don't think you're wrong, but you may be overestimating how quickly they converge on an optimal balance, and what fraction of the bloat is necessary for business purposes. I frequently see sites that slow down my browser and mobile sites that become near unusable, and I doubt that is all necessary for making it pretty and having the right analytics. I've also frequently reviewed the work of front-end devs who just didn't see the easy optimizations they could make.

The bloat persists until there's a clear relationship between some unit of bloat and some concrete loss, but that might not be easy to see, because you don't see the people who long since gave up on your site.

Edit: One illustrative example is disability access accommodations. A lot of them are a net loss to the business, but some of them were obvious improvements no one had thought to do: "Oh, crap, it really does help to have a lip on the curb so you can roll stuff up!"


I'm sure plenty of sites improve their bottom line by implementing dark UX patterns as well, it doesn't make it any less bullshit.


It's still bullshit, because it doesn't help the visitor, and it only helps the company to the extent that they're in a Red Queen's Race—if no one else did it, they wouldn't have to either.


I'm sure I am more consciously aware of the bloat but I do think that subconsciously "non-techy" people still dislike it.

I feel like this is similar to the Apple vs Microsoft/IBM debates. I'm not even sure what the popular take on it is these days, but I do believe that to some extent a pleasing user experience was/is championed by Apple (in a specific aesthetic sense). And if Apple never existed we would have fewer tech products with that user-experience philosophy embedded in them. I believe that elusive "pleasing" user experience Apple seemed to try to build into its products is similar to the idea of not having as much bloat on websites.

There is business value of consuming every single morsel of user tracking data possible but that tracking also has some cost in terms of user experience. Just because companies have tended towards bloat doesn't mean that they are definitely right. Apple has made a great deal of money and I believe some of that is due to their style of user experience.


> Maybe 5MB doesn't matter to most people.

You sure won't see those in your analytics dashboard.


You just used “maybe” eight times to make an argument without having to support it in the slightest. Maybe you’re right, or maybe you’re just bristling, without a leg to stand on, at an argument that was made well and strongly supported. Maybe if you bothered to support your claims, or even make them real claims instead of “maybes”, it would be easier to take your non-arguments seriously.

Of course it would also be easier to take you seriously if ad blockers and script managers and cookie sweepers weren’t so incredibly popular, along with VPNs. Every site begging people for money or to stop using an ad-blocker is a nail in your “point’s” coffin.


The fact that people want bullshit or that bullshit sells does not change the fact that it is bullshit.


I don't know, as much as I respect you looking out for the "engineer as savior" bias and the fact that average users may have a distinctly different viewpoint than we do, in this case it seems somewhat disingenuous.

People have spoken time and again about needless auto-playing video clips. About weird, busted scroll hijacks and massive header images. About newsletter beggars which take up the whole screen on every. single. site. About ubiquitous "got it!" cookie disclaimers. About pages that load mountains of extraneous crap while you're trying to read, and reflows the text halfway through the second paragraph, suspiciously at the exact moment you try to tap a link which makes you click an ad instead, which is of course bloated with poison dogshit and offal and immediately crashes the tab. About how every single major site bullies you into downloading an app instead of putting effort into making their mobile site run properly. Instead the mobile site is intentionally crippled to incentivize the adtech-infected app even more, and even the request desktop mode is broken. And of course the mobile app is only a thinly-veiled plot to scrape your phone's guts out like a dead fish and gain every permission possible. If it can run in a browser, I think it should run in a browser. I'm highly suspicious of any app that is really just a website in a tarted-up trojan horse.

Adblockers are perpetually on the rise, as the article says, and other content blockers and filters are also in play now. I have to use outline.com to read most news articles, which is an amazing resource. But I think this popular opinion is warranted. The internet is bloated with malicious bullshit. Everyone got in an arms race with each other and had to do it, despite nobody actually wanting it.

I don't think the devs and designers who made this stuff so ubiquitous are evil, nor are they stupid. They were just caught in a catch-22 as web trends do their fickle thing. They are regular people earnestly doing their best, just like everyone else. Doomed to create camel after camel via feedback from ignorant clients, misread focus groups, neurotic middle management, et cetera. People rarely get to decide or unilaterally invent any of these large-scale trends, they just happen as a herd phenomenon I think. And everyone has to jump on board to remain viable, or at least that's the illusion at the time.


You make a good point: the people who have ruined the Web are doing it because it makes them a profit. That's the whole point of ethics & morality, though: refraining from what benefits us immediately because it's wrong. And just as it's wrong to kill the old lady next door to steal her house, it's wrong to turn a product-listing site into a JavaScript-laden monstrosity so terrible my phone becomes painfully hot to the touch.


I know I'm an idealist too, and my comment will be a display of hypocrisy: it seems that one of the rules for using the Internet is that you should completely forget about the is-ought problem.

People will dole out all kinds of advice about technology, business, politics, society, relationships etc. by way of discussing something that only partially resembles reality (and assuming everyone or most people perceive things in the same way as themselves)

(I'm guilty of that too.)


Or maybe it's really easy to have all the individual decisions make sense, but still arrive at a really insensible overall decision. Like a local maximum.

It's easy to be smart, make smart decisions and end up somewhere dumb.


> Maybe other people making these decisions aren't idiots

Nice joke, I liked it! :)


"[B]usiness was improved" is doing a lot of work in there.


We have a ridiculously-backwards model on the web where you essentially pay for what you use (via your data plan, and via forced ads prior to promised content) without having any way to know in advance what it will end up costing you to display content. Heck, you don’t even know if the content will display correctly after all that loading. Worse, there are many ways to trigger loads accidentally, meaning you may want none of the content but you end up paying for it through your data plan.

We desperately need absolute maximums enforceable in the browser, reversing the firehose. I want to opt your site in to more data use, after I trust your site. And I expect sites to work within my limit or not receive visits.


So while you are fiercely criticizing sites which apparently understand and leverage the reality of zero-marginal cost digital data, you assume that data caps are a necessity in a digital world.

Workers don't shovel extra connectivity into the towers when you go over your data cap. So why are you focusing on data flowing over the nearly-always-on connection you have through your data plan, rather than the bullshit terms you agreed to when you signed up for a data-capped plan in the first place?

Edit: remove unnecessary word, clarification


Data caps are like your city bus. You can buy a bulk number of rides per month at a discount, and everything over that is billed at a regular rate.

The bus doesn't magically add more seats when you make extra trips on it, but it still costs more to take the bus three times a day than twice a week.

Even though the marginal cost of your body on the bus is minimal, having the fee serves to reduce crowding (demand at that price point) as well as fund the infrastructure. Data caps are the same.

Now, you can argue that data caps are surrounded by deliberately misleading marketing and that overage charges are unnecessarily steep. But you seem fundamentally opposed to the concept of data caps.


Both of the public transit systems I use on a regular basis sell me monthly unlimited-use passes.


That’s nothing more than marketing.

Do you ride more because it’s unlimited? When you get to your stop for your office or dinner or whatever, do you go back home and then back again before getting off? Do you go all the way to the end of the line to minimize your cost per mile, then walk back to your actual destination?

Of course not, you use it as efficiently as possible to do the thing you used public transit to actually _get to_.

Do you consider the money you spend for your unlimited pass as rent for housing or office space? Do you just stay on the bus all day, working on your laptop and holding meetings?

Unlimited transit passes are similar to unlimited vacation plans: people normally need and use much less than 100% of the resource offered. People also don’t normally push 100% bandwidth 24/7 through their unlimited plans.

If everyone decides to use their unlimited transit pass in this way, they will go away or increase in price accordingly.


> Of course not, you use it as efficiently as possible to do the thing you used public transit to actually _get to_.

The comment I replied to claimed transit systems don't sell unlimited-ride passes. I was pointing out that some of them do. What you're arguing is they wouldn't have the capacity to handle it if every person bought such a pass and immediately ceased all activities other than riding transit 24/7, which may well be true, but is also a non sequitur.

Though since you ask, it would be a bit difficult for me to "maximize" use of my Caltrain pass in the way you're suggesting, since going "all the way to the end of the line to minimize your cost per mile, then walk back to your actual destination" is not realistic -- the line is ~80 miles long.


I did not claim that, please read more carefully.


I definitely use public transportation more often instead of walking or riding my bicycle than I would if I had to pay per ride.


That wasn’t my point though, you still only use it as much as you need it.


That is true. Point taken.


> Data caps are like your city bus.

Hi fwip,

I don't have any knowledge about the economics or logistics of city bus lines.

I do understand something about Transfer Control Protocol, Javascript's event loop, and soft realtime performance in Linux.

Can you make an analogy between data caps and any of those concepts so I can understand what you're getting at?


The terms are not bullshit. The network does not have the capacity to service everyone at their maximum last-mile data rate, all the time. Bandwidth is a large, but finite resource. There are several ways of addressing this:

-You can divide the total capacity up equally and give everyone dedicated connections that aren't very fat. Nobody likes this, and it's very inefficient because most people aren't using their full bandwidth most of the time.

-You can charge by the byte, on top of the base fee, so that using the system more costs more. It's fair - pricing directly reflects resources consumed - but nobody likes the idea of being nickel-and-dimed whenever they click a link.

-You can do the above, but include some large chunk of data (which is quite cheap) into the base fee, that most people won't go over. Call it a "cap". This works well.

-You can promise "unlimited" data, but start sending nasty letters and/or slowing the connection to a crawl when they go over some prescribed "fair use" amount. This is a data cap in all but name.

Basically, all-you-can-eat stuff is always sold on the basis that you will consume some reasonable amount, because no resource is infinite.


Browsers should start offering the option to deny all cross origin resources too.


It would be easy to make an extension to do that but I suspect most websites would no longer work correctly. Many websites have their own API server on a different domain, many host images on another domain, most use a CDN of some kind which hosts CSS, JS and sometimes HTML on another domain, etc.


Even a simple limit would probably help a lot, e.g. “at most one alternate domain” (your main CDN, and not sketchy-analytics.com or unnecessary-ad-malware.net).
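
For what it's worth, a site can already commit to something like this on its own end with a Content-Security-Policy header; a single line along these lines (CDN host hypothetical) restricts every resource to the page's own origin plus one CDN:

  Content-Security-Policy: default-src 'self' https://cdn.example.com

What's missing is the user-side equivalent the parent comments are asking for - a limit the browser enforces no matter what the site declares.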


Just use one of the many ad/tracking blockers. Websites will still work. Blocks sketchy-analytics.com. Why make this more complex than it needs to be?


Because when I go to use a website it's an interaction between me and them; I never want any third-party involvement. Allow requests to subdomains but nothing outside that.


Use noscript then.


Me using noscript doesn't make a dent in the usage patterns of the public at large and it doesn't inform others that there is a better way to browse.

Without any decent weight behind it, there would be no incentive for people to build their software to actually respect the user and they would continue serving a broken application.

Much like browsers slowly boiling site owners by displaying sites as insecure if they don't have any encryption, a similar effort could be made to stop websites handing visitors around like the town bicycle.


Yes, “shaming” would be a relatively easy improvement too. Browsers should feel free to display big, red, scary-looking logos like “this page load has consumed an unusually-large portion of data/battery/whatever”.


uMatrix works on all browsers that matter.


Out of the box on Android and iOS, no, it does not. And that's a (the?) cash cow for online advertising.


I'll admit the mobile version of uMatrix on FF could work a lot better, especially considering I have to disable and re-enable it to get it working at times. That doesn't diminish the fact that I like having the option to do so. Most of the time, if I don't want to spend too much time configuring a website, I just open it in Chrome.


I use a VPN with a hosts file that is composed of EasyList etc for my phone. It saves me money and battery.
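
(For anyone unfamiliar with the approach: such a hosts-file blocklist is just a long list of lines like the following - domains hypothetical, echoing the made-up names upthread - so requests to ad and tracker hosts fail fast:)

  # 0.0.0.0 resolves these names to an unusable address
  0.0.0.0 sketchy-analytics.com
  0.0.0.0 unnecessary-ad-malware.net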


Same here, but is performance of the server OK? Because I've had bad performance issues with ~1k lines in hosts(5).

I've thus written a script that I run daily for my local unbound(8) daemon - or bind: https://gitlab.com/moviuro/moviuro.bin/blob/master/lie-to-me
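
In unbound terms a blocklist like that boils down to local-zone entries along these lines (hypothetical domains; not necessarily the exact form that script emits):

  server:
    # answer queries for these names with REFUSED instead of resolving them
    local-zone: "sketchy-analytics.com" refuse
    local-zone: "unnecessary-ad-malware.net" refuse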


I don't notice any speed differences. I'm loading stuff via 4g mostly so I think the network speed is the main bottleneck. I'm using etherVPN on a tiny arm server in the same continent as me.


It's not just publishers. My current org uses SalesForce and it's frustratingly slow. Opening a single record takes several seconds as every single interface element is generated dynamically and then populated, seemingly one at a time.

When it's finally done loading, you click on a dropdown, and then the dropdown just shows you a loading spinner, as your client asks the server what should be populated in the dropdown menu. And of course it takes another second. Clicking virtually anything, including back (which should be instantaneous on a proper webapp), will present you with yet another loading spinner and a multi-second delay.

I understand the benefits of building webapps this way, but the benefits primarily accrue to the developers of the app and not to the customer.


Salesforce's problem is that its presentation layer is a giant pile of legacy code, which can't be fundamentally changed, because there are a gazillion extensions and customisations that rely on it working the way it currently does. Having worked a bit with it, there's absolutely nothing about it that "benefits" developers, any more than having to maintain a VB6 app "benefits" developers.

Done right, a modern web app should be better for both developers and users. The one benefit of the crazy shit described in the article is that if you don't weigh down your page with a multiple MB of ad-network and analytics scripts, then it can be incredibly fast.


> Done right, a modern web app should be better for both developers and users.

Right, and you also don't need to make a text-based site an SPA or use any frontend framework. The web was made for documents, so no need for those kinds of sites to be apps.


Salesforce uses a large quantity of DNS indirection, more than even the large CDNs. I measure the amount of lookups that sites and apps require, and the delay it causes. Most sites on the www only require two lookups.

This is perhaps an example of "... the benefits primarily accrue to the developers of the app and not to the customer."


SalesForce slowness does not equate to "all customers of web apps are getting a bad deal".

Go look at reelgood.com. Imagine if Salesforce worked like that. It should be clear as day that this is a SalesForce problem, not a modern-web problem.


>Opening a single record takes several seconds

that sounds more like a database delay rather than loading the interface


What I think is difficult is that most people think a website should be an experience.

The client wants this, the designer wants this, the marketeer wants this and even most users want this.

So that's how huge headers with high-res photos are born.

After that, the site must be online ASAP and the developer doesn't have or take the time to load images responsively.

Combine this with a framework that takes 200ms to init and we are where we are now.

(And then ofcourse there is the marketeer telling you to include a script from x,y and z.)

With the right tools you can build a fast web, but I think most developers are inexperienced, lazy, or just don't care. (You don't need to include 0.5MB of FontAwesome when you only use 2 icons...)

And yeah: AMP is a joke.


> What I think is difficult is that most people think a website should be an experience.

People on the delivering end. Not the users. The users just want to get the content they came for, and to continue with their lives. This "website as experience" thing is a combination of vanity and trying to monetize users better by playing on their emotions.


I wish that were true, but is it really? I think of Idiocracy, and how prophetic it was. I fear that most users do want 'Ow, My Balls' and brightly-coloured, moving objects to distract them from the ennui of life.


Which designer wants an unmovable header that takes 1/3rd of the screen and makes me feel like I'm looking at it through blinds, as I typically see on mobile?


AMP is terrible. I flat out block it in my hosts file and haven't missed it yet. It's pretty much just used for advertising anyway, or for 'news' articles that are thinly disguised ads.


> framework that takes 200ms to init

Which framework(s) take 200ms to init?


I am increasingly encountering news sites that detect ad-blocking software and (understandably) refuse to show me their content as a result. But the problem is that I enabled ad-blocking on those sites to begin with because they were loading nasty JavaScript ads on the fly that pegged my CPU!

As a web dev, I feel extremely conscious about what I'd call "javascript library hygiene", and I feel that whoever's in charge of many of the news sites out there, just does not give a shit.

Respect my computer's resources and you'll get your ads re-enabled.


I've noticed that a lot of sites will end up using 100% CPU usage on their Chrome tab. I haven't investigated, but I do wonder what sort of faulty design leads to this. You occasionally hear talk of sites using your CPU to mine cryptocurrency, but I'm more inclined to suspect lousy programming.

I sometimes wish I had an easy way in Chrome to restrict a tab to, say, 5% CPU, for the cases where the CPU usage is clearly not adding any value. Just so I can get through an article without the fan ramping up.


  renice 19 ?


It's my understanding that renice doesn't limit CPU usage, but rather adjusts the process's priority in the scheduler relative to other processes. So I don't think it would stop a process from eating up all available idle cycles, draining the battery, and ramping up the fan.


Also, I don't think there's a trivial way to get the PID of the process handling a particular tab.


In Chrome, Task Manager shows PID along with CPU & memory usage.
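
Putting the two together (a rough sketch; cpulimit is a separate utility you'd have to install, and the PID here is made up): once Chrome's Task Manager gives you the renderer PID for the offending tab, you can cap it from a shell:

  # hold the tab's renderer process to roughly 5% of one core
  cpulimit -p 12345 -l 5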


I've noticed that trend also. Those are websites I keep a mental note to not visit. They usually have bottom of the barrel ads anyways and shitty content.


Sorry but this piece comes across as entitled and whiny. It's easy to point out how bloated and terrible most modern large sites are and guffaw in disgust at the counts of xhttp requests and scripts that are loaded in order to provide no user benefit.

But just moaning about it probably won't make the problem go away. Simply rendering text isn't a business-model anymore unfortunately, and publishers are doing everything they can to actually make their content profitable.

Look: I hate the modern web as much as anyone and I always browse with ad-blocking on. But I turn it off for sites that I get real value from, and I pay monthly membership fees to news sites that I believe respect me as a visitor. My way of working isn't super great for me or for publishers (and I doubt most users turn off adblock for sites they value).

I was hoping this article would show some empathy for publishers and why they would start down the road of such user-hostile behavior. A complete piece would paint a vision for how to end the madness with a solution that is acceptable both to publishers and viewers. I don't know what that solution is, but I strongly doubt that just moaning and counting xhttp requests is part of it.


>Simply rendering text isn't a business-model anymore unfortunately, and publishers are doing everything they can to actually make their content profitable.

If they can't come up with a business model that works on the web then they're welcome to leave the web and go back to the printing press. The web wasn't made to provide a stable platform for monetizing content; they have literally every other media paradigm ever created for that.


You're also welcome to leave the site and not return. You're not entitled to their content on whatever terms you decide are fair or not.


> You're not entitled to their content on whatever terms you decide are fair or not.

Yes I am. I'm entitled to whatever content their server returns in response to my user-agent's request, and I'm entitled to filter and alter that content in any way I choose, including not running javascript and blocking advertising.

If they want to put content behind a paywall, fine - good luck getting anyone to consider their content worth paying for, though. Otherwise, it's fair game. That's the way the web works, and that's the way it's always worked.


But you're just defending the ad/tracking part; even without all that pages are still relatively huge and slow.

And I sadly don't know either what else apart from whining and moaning the average user could do.


I strongly suspect that 80-90% of the bloat and slowness is due to the ad/tracking parts. Yes, there's bloat with modern web/js tooling but even React and Angular (I think the heaviest of the modern bunch) are on the order of hundreds of kbs gzipped over the wire, and the benefit they give is better ux. "Old web" was fast/bare-html but not great for nontrivial ux.


> People really hate autoplaying video

Yes, this a billion-fold!

Every single newspaper site I peruse uses autoplaying video.

The experience goes like this:

Load up the front page of newspaper site. Select an article of interest and click on its link. Article loads. A video player pops into existence on the lower right hand corner of the page, and the video starts playing.

Most annoying. However it gets more egregious...

The video player floats when you scroll down the page, until you reach the part of the page where it usually resides - and warps over into its little area - until you scroll past - whereupon it pops back up on the lower right hand side of your browser again.

Aaaaaargh!

Just as egregious: after viewing a video you might even have been interested in viewing, another video is automatically loaded into that same player, 99.9% of the time on some completely unrelated subject.

Y'know, someone, somewhere, woke up one morning and thought "Great idea! Let's do <what is described above>!" - why can't the message that "People really hate autoplaying video" get back to whoever makes the decisions on user experiences, so that they just quit doing the above bullshit?


Stop visiting the sites then. There is nothing else you can do. Except maybe write them an email they'll ignore because their metrics (more like a "business analyst") tell them to autoplay videos.


Since this is posted to a site focused on developers, I’m surprised you didn’t mention the third option: Developers, stop adding all this crap to the software you’re writing and excusing it with “I’m just doin what I’m told!”


The video is not actually the problem (for me). It's the audio.

Why does my browser (Chrome) still not give me appropriate control over audio?



That's already in place. There's nothing in the way of actual user control other than muting the tab. The browser decides, pretty much, and you can't argue with it if it's decided to list a site as something that can autoplay audio based on your interaction level (which doesn't necessarily mean you've ever deliberately enabled audio on the site).


Yeah, that's a great one, especially if you are in a big office and forget to plug in your headphones.

There is a simple trick though:

about:config >> Search for "autoplay" (media.autoplay.enabled) >> set it to "false".

Some players might refuse to play media outright though.


If it can help you, think of the video as the content. It really is the only content that matters on that page.

Yes, the site only exists to deliver that video, not any news. The video earns revenue; the news does not.


It's not only the web anymore. What about desktop Slack, which can take up to 1GB of RAM for a "simple" chat client?


Desktop Slack runs on Electron, no? Which essentially lets web developers develop a "native-ish" cross-platform app, without learning a whole new platform.

There is a long, long history of attempts to make good, cross-platform, native apps. In the late 1990s there was a company called Visix that had such a library that worked on Windows and Unix. There have been Windows implementations of Motif/X Windows. There was of course Java/AWT. These days there is Qt. None of these seem to have really caught fire.

edit: I wrote Mac but I meant Unix


Qt and Java can do cross platform software, however it's not possible to make a UI that's usable on desktop, tablet and mobile. You will end up creating separate apps.

Qt is actually really impressive in my opinion, it's possible to recompile a desktop software for mobile, sometimes without any code change at all.


Yeah, because cross-platform is a PITA. The reason web apps are popular is that they're thin terminals. You write your business logic on your mainframe/terminal server (Linux VPS) and the web app just handles the user interface. You maintain less portable code, deployment and upgrades are easier, and it simplifies the user experience a bit. The downside being network issues and supporting the server.


I wish that was the case. Web developers nowadays try to push as much code as they can out into the browser, to be run client-side.


I don't know why the slack desktop client is so popular. It's almost no different from the web client, but eats more RAM if you already have Chrome open. Just pin Slack in a tab and save your memory


> I don't know why the slack desktop client is so popular.

In-dock notifications and system notifications are my reasons, as well as a dedicated window I can alt-tab to instead of tabbing through browser windows, then navigating to the Slack tab.

(I realize that for some people, "hiding" Slack in a pinned tab may be a feature.)


Web Slack does system notifications. You may have to turn them on explicitly because it requires a permission (can't recall if you do or if it prompts you, but it is something you have to enable), but if it's working in my Linux browser I'm assuming it works in Windows and OSX. I don't know if that would also integrate with in-dock notifications or not. If so, making it a viable alt-tab target would just be running it in a separate browser window, which should still net many fewer used resources than a separate app.

If you are happy, no problem. I'm posting for general info.


> can't recall if you do or if it prompts you, but it is something you have to enable

It does. In fact, it keeps displaying an annoying blue-colored topbar telling you that your life will be better if you enable notifications (fortunately, with the option to snooze it or dismiss it completely on given machine).


> In-dock notifications and system notifications are my reasons, as well as a dedicated window I can alt-tab to instead of tabbing through browser windows, then navigating to the Slack tab.

Emacs-slack can get you all of that:-)


I'm aware of the fact that Emacs is a fully fledged operating system :P


Just open in a separate chrome window?


In OS X, if I alt-tab over to Chrome, it will focus on whatever window I had in focus last; I'd still have to Command-Tilde to find the Slack one.

I know, it's a minor kvetch. But it's a minor kvetch I can solve with a gig of RAM.



You may be interested in an app called HyperSwitch. It lets you tab between individual windows of each app! There’s also opt-tab which will cycle through individual app windows.


I find it runs faster and is more stable in the desktop client, but I switched a couple of years ago so that may no longer matter (also ymmv etc).


you can only do video conf through the app


The point would be the same even if nobody used the client except to start it up for measuring bloat.


I'd be saying "Yeah!" along with you, but that client is also a video chat and audio calling app, not just text and images.

There's bloatware out there, and this may even qualify, but it isn't just text chat. (Why would you use the desktop client for text chat anyway?)


Skype also has audio and video, with custom codecs even, and P2P, yet it wasn't even close to using the very real 1GB of RAM Slack consumes.

(I'm talking about the old desktop client, not the new electron based one)


audio/video work fine in chromium.


It contains Chromium in its entirety, which is a contemporary full-featured browser - which is actually, in a way, a complete operating system.


My GMail tab runs between 225MB and 400MB+. And I used to think Java's memory usage was bad...


If I leave gmail open too long, it starts eating 100% of CPU for some reason (Firefox on Linux).


I'm using Opera on both Linux & Windows 10. It's pretty light on the CPU, but I get the "Out of Memory" screens regularly (on 16GB machines!).


I see no one mentioned the extraordinary "Reader mode" of Firefox. [0]

It's literally the one feature for which I stopped using Chrome at work. It's just so damn good.

For instance, [1] becomes [2]

[0] https://support.mozilla.org/en-US/kb/firefox-reader-view-clu...

[1] https://radiobruxelleslibera.com/2018/06/26/intermediated-of...

[2] https://lh3.googleusercontent.com/smexxUICwLXKbzEEbLlIPQ_qGc... (that's a screenshot on photos.google.com - don't have access to imgur or similar)


There are many chrome extensions that do the same thing


Except that my "corporate strategy" doesn't allow installing extensions. So a built-in tool is excellent.


FYI, the Google Photos link requests a login.


Wow, that's weird. It didn't when I posted it :(


I think I heard this on Hacker News, but I don't recall who from. Whenever I see a popup begging me to sign up for a spam list, I put in postmaster@that-domain.com now.


ceo@$DOMAIN is my preference. Or down the c-stack.


I do that for spam newsletters which added me without my consent. Most of them you can edit your email, and I edit it to contact@[their domain] or postmaster@[their domain]


What's the postmaster supposed to be? Is it set in practice?


It's meant to be the operator of the mail server, and is generally used for contacting the operator about problems with their mail setup. It's required to be available per RFC822.

https://www.w3.org/Protocols/rfc822/

How much that works out in practice is inconsistent. Most sysadmins will set it up in my experience, but it's not out of the question that it wouldn't exist. Some domains will just set up a catchall and it'll be directed to someone anyway. It varies.


the web was born, and everyone was happy for this new thing. then advertisement came along and poisoned that too, just like tv, just like radio. we just can't have nice things because someone somewhere wants to squeeze all the pennies out of your pockets in whatever medium you use.

obligatory links for website obesity and bullshit web:

http://idlewords.com/talks/website_obesity.htm

http://motherfuckingwebsite.com/


I've read others comment on the lack of empathy for publishers. $10 says the author doesn't blame them.

That said, this is a weird situation, one to which I became allergic. To the point that I started r/vanillahtml to collect websites that gave me that fat-free feeling.

One thing you should do is install dillo, and enjoy the web. It's usually faster than elinks, Chrome, whatever. Sure, you'll get horrendous CSS rendering and no JavaScript. Still, it's worth seeing in person how instant a click / request / render can be. Also, 10 tabs in dillo is probably 2MB.

# tech momentum

There were logical reasons for how we got to where we are today. I was in there too at first. I wanted hyper-dynamic webpages, more capable CSS, more live scriptability. But along the way I started to feel the unintended consequences: long loading, idiotic user interactions, regression in basic ergonomics, huge resource consumption, and worst of all, the twist it put on webpage producers. Open a webpage from 2000 and you'll see 20% chrome, 80% content. Now on average it's the opposite - not really 80/20, more like 50/20 with a bonus 30% of popups (GDPR, cookies, newsletters, ads). Tech didn't provide value; it became a root of pollution.

# societal re-rooting

old web was a side game, people got into it for the thrill of it, it gave a lot of interesting subtle and dense content. Now it's all business trying to live in the web era, it's a competition thing, with all that it entails. The web today looks like main street. Neon signs, noise, .. ugh.

Also, with real-time social platforms you see how most websites are low-value and reactive. It's changed a bit - people noticed there was a need for less shallow content - but it seems rare. Although, to be honest, I stopped monitoring whether there is more of it today.

I agree with people comparing a website today with other kinds of texts. I feel a great void when I read most of the web, and usually a .txt file is a high guarantee that I'll find something more personal or technical than anything on the web. And it's near free. If the web cared about communicating, we'd just have to extend SMS to 640kB, with a streaming protocol in case you're reading Wikipedia.

ps: oh and I love these https://lite.cnn.io/en (was trying to make a repository of them), so much love to those who push that kind of idea


Dillo! https://www.dillo.org/

I mentally partition the web into Dillo-compatible and -incompatible subsets. :-)


I was about to start an effort to list text-mostly websites... I guess it's a little like your subsets.

I also tried to patch dillo to add an external bookmark backend (sqlite or else) and add lua scripting. Didn't go far sadly. Basically if I could greasemonkey and customize keybindings with a super tiny fast browser.. I'd be happy for life.


> You know how building wider roads doesn’t improve commute times, as it simply encourages people to drive more? It’s that, but with bytes and bandwidth instead of cars and lanes.

That's the core insight. Higher availability of resources leads people to consume more of said resources - in tech, typically for more and more abstraction layers to deal with hardware of ever growing complexity to make the lives of developers easier, but at the cost of stagnation or regression for some metrics. See also: "Why Modern Computers Struggle to Match the Input Latency of an Apple IIe"

https://www.extremetech.com/computing/261148-modern-computer...


(Can't edit anymore, but finally found the relevant Wikipedia link for the phenomenon: https://en.wikipedia.org/wiki/Jevons_paradox)


Amen to every word except this sentence. "Better choices should be made by web developers to not ship this bullshit in the first place."

No developer I know, web or otherwise, wants to do any of this, and all of them are religious in their use of ad blockers and autoplay stoppers.

This is the kind of stuff developers are forced to do with guns to their heads by the PMs and marketing teams that actually determine the user experience.


I agree with all of this and just want to add that, for every developer who will speak out and crusade against "webpage pollution", there are about a dozen who will not and are viciously seeking employment. I'm often terrified of this.


Another point for the bullshit web: The GDPR cookie consent notification. I really hate clicking on those notifications to make them go away.

Designers, can you please just add a non-obtrusive link, instead of a pop up that covers half my browser screen?


Or just you know, not do anything that would require consent in the first place. If you have to ask for consent you’re already doing some of the bullshit outlined in the above article.


Sure, but first let me have you talk to my PM who got this request for an obtrusive popup from our lawyers.

Designers don't want stupid popups all over the screen. Designers don't want half the bullshit that's put into sites these days. Our hands are tied. It's rarely our decision.


"This site uses cookies." is the new "Hot singles in your area!"


On the bright side, these popups are safe for work and have no attached images.


Not a design problem, ".... only that you conspicuously provide the option for obtaining informed consent ...."

The intention is that it must be obtrusive. The company has a choice between annoying you and mitigating risk.


That's more of a "bullshit politics" problem than a web design problem.


Or maybe 90% of websites that currently give you cookies have no good reason to do so and shouldn't?


Have you ever seen this discussed outside of HN or dev-centric subreddits? Your average user doesn't care about this 'issue' at all, and thats why it won't ever change.

Proof the author doesn't relate to any typical user:

> I’m not asking much of it; I have opened a text-based document on the web

Nobody besides devs would open a website and think "Oh, this is a text-based document opened in a web browser." It's a website, not a Word doc.


Indirectly, yes: they will ask why their computer is slow. Then they'll buy a new one and accept the sales pitch for a gigabit fiber link.

ps: you all must understand that the average user believes in tech: if the web is slow, it can't be Google's fault. And if the salesman says "of course you need a machine that is web-capable", all they'll care about is that it's not too expensive; they'll buy the damn machine and maybe throw the old one out. Source: my recycling bin.


I wish this were pinned at the top of all such discussions.

This is precisely what's happening. In the mind of the regular user, the web looks the way it's supposed to, because they don't have the technical knowledge to understand what's actually happening. Users accept what they get because they don't understand it. And as you said, the complaints get targeted at the wrong thing: "Why is my computer so slow? Did I get a virus?". No, it's just that Google deployed an even more bloated iteration of their GMail UI, CNN added 20 more tracking scripts, and YouTube now eats half of your RAM for no good reason.

I go through this dance with many of my non-technical family members and acquaintances. I can extend the life of their machines only so much with ad blockers and by cleaning out adware. Ultimately, they'll buy a new, faster machine, just to return to the status quo. And maybe I'll inherit one of their "slow" laptops and pull 2+ more years of professional, productive use out of it.


It's hard to educate people too. You have to go hard and deep to convince them enough not to spend money on their next mall trip.


> Indirectly, yes: they will ask why their computer is slow. Then they'll buy a new one and accept the sales pitch for a gigabit fiber link.

This evening there was that broadcast on our national radio about "white zones", I don't know if you listened to it.

The three guests were directors at major telecommunications organisations and administrations in the country. They discussed how high-speed Internet should be available to everyone everywhere, and declared that there should be fibre-to-the-home everywhere in the country.

But in 30 or 40 minutes of broadcast I don't think they asked themselves once: why should everyone have access to optical fibre? Does everyone need it? What for? Can we define the real problems and needs? Can we define the set of things that really matter? Why do we even need high-speed access to read and fill in administrative paperwork (without paper)? What could be done to achieve the same functions with low- or medium-speed Internet for everyone, without wasting resources?

I mean, it is crazy to be about to pour billions into something because... just because. Without a deep analysis of what is wanted and how to solve it. They have decided on a solution, but they have not defined the problem. They pretend to be solving the case of a few people who are supposedly in trouble because they cannot fill in important forms, but the real use will be that Jean-Pierre-Kevin and Rayanne will watch the same American blockbuster, each on his own tablet or smartphone, except Rayanne will start watching it 10 minutes later, killing the global bandwidth because "eh! it's free".

At my place, the Internet speed was upgraded (I don't think anyone asked for it?), as a result people started using TV-over-xDSL, VOD and streaming like pigs, and now in the evenings the speed is worse than it used to be before the upgrade...

One must question how the infrastructure will actually be used before spending billions building it, because the general consequence is never-ending abuse of the available infrastructure, whatever its capacity.

(For Agu: <https://www.franceinter.fr/emissions/le-telephone-sonne/le-t...)


That sarcastic example you're giving is most likely to be true. I had heated arguments on reddit with people (most likely gamers) who were very tense about my dismissal of the need for FTTH. And the telecom companies are just in it for the new market; they couldn't care less about anything else. xDSL was also meant to be high speed for everybody, and that's not the case.

And sadly, as long as there's a market, there will be products. Oh, and I don't know about you, but ISPs are quite aggressive with FTTH these days. I see lots of technicians upgrading links, and I get more texts about new offers than I ever did... that's a strong hint :)

I often want to assemble low bandwidth low power (solar powered even) router nodes. Something you could drop in a field and have a mile radius range of wifi (some people on youtube got more than that with a tiny esp8266 IC). The rest would be some mesh network topology. Just enough for simple data, emergency, maybe compressed voice you know.

Thanks for the radio link.


> And sadly, as long as there's a market, there will be products.

Moreover, if there is no market, one will soon be created to fill the void and make some money off any 'underused' resource until it saturates.

> I often want to assemble low bandwidth low power (solar powered even) router nodes. Something you could drop in a field and have a mile radius range of wifi (some people on youtube got more than that with a tiny esp8266 IC). The rest would be some mesh network topology. Just enough for simple data, emergency, maybe compressed voice you know.

People will laugh at you. If I understood correctly, 4G (not DOS ;-) ) has become a basic human right, and no one should be deprived of it for a 10-minute ride, be it on an underground 20 metres below the surface or on a country bus in the middle of nowhere. I mean, megabit 3G is already almost everywhere in the country, but it seems that is not enough for a decent human life.


Agreed, somehow people consider 4G as a requirement for normal life. Being in a low-tech phase I find that a waste but alas.


Every person I know is using an ad blocker so I guess they do care. And yes, publishers care as well: https://www.washingtonpost.com/news/the-switch/wp/2016/05/27...


True - publishers care because they lose revenue; users use ad blockers to block ads. I don't know anyone who uses an ad blocker to improve load speeds.


It is a very convenient side effect, and yes, users care a lot about load times as well, as multiple case studies show, e.g. https://developer.akamai.com/blog/2015/09/01/mobile-web-perf...

The reason users care about load times is the reason people hop on AMP.


They don't discuss it because they just give up if it takes too long.


I disagree that the AMP cache is the main benefit of AMP. There are plenty of CDNs that give performance similar to, or better than the AMP cache.

The only benefit of AMP is that the pages are promoted higher in results.

Back to the main topic of the article, the same could be said for the desktop and the mobile phone. Developers and framework builders are constantly adding bloat as cpu/memory increase. Since most people aren't writing their own frameworks and many are importing large parts of their apps from npm/gems/etc, everyone gets hit with the bloat. It's a vicious cycle for sure.

It's a shame that so many open source authors add bloat into their packages in exchange for popularity. They want all the users so you get tons of code that's never used in 90% of projects.

The big example of the above is express. It's a terrible cycle as the vast majority of people learning JavaScript have hopped on the express bandwagon and are now creating APIs with mediocre performance by importing a massive webserver they often do not need.

Overall though, it seems to show that most people prefer convenience over accuracy and performance (passive aggressive stab at mongo?)


>The only benefit of AMP is that the pages are promoted higher in results.

Not just promoted in the results but actually pre-loaded in the background when you're using google search. It's double the monopoly fun.


True. Hopefully anti-competitive stuff like this will be challenged in court so we don’t have to play the game.


Yandex does it too with their own tech, just mentioning. I think if it actually were standardized properly without a monopoly it could become a good thing.


Good to know.

Agree that if there was a body with representation from all major engines, this could be good.

I personally don’t want my content to be served in such a way that requires me to use a specific analytics product though. Would be ok with it if log access was part of the standard.


Oh, wait a minute... what if we twisted the existing lack of net neutrality to our own purposes as consumers? What if we had easier mechanisms to slow down the more annoying aspects of websites we visit, such as those extraneous scripts on CNN? And as we visit personal blogs that we value, we don't slow those down. (And yes, if you're about to ask me about Netflix: no, I would NOT slow that down. ;-) ) Anyway, if we had an easy way to de-incentivize these web platforms, perhaps they'd be pushed to slim down their content delivery?

Now, before anyone replies with something like "but you can install ad blockers, etc."... yes, I know there are mechanisms... but I mean "easy" mechanisms... that is, something my grandma could implement with ease. I think this would serve to give true power to the consumer, both on the net neutrality front and on the content consumption front.


Things are different now.

When I was on a modem... pictures, almost any of them, were bullshit: big and annoying to download. A lot of the time I hated them.

Nobody thinks about pictures that way anymore. I suspect that goes for a lot of the bullshit listed in that article.

I'm no fan of tracking or auto play videos, but web applications are a thing now and the people visiting sites, building them, and paying for them want more than just a page... they want a whole application. All three of those (viewer, builder, dude who pays / host) aren't on the same page, but they also are largely happy to go down the road of bigger pages.

I'm all for efficiency and kicking some stuff to the curb, but as for size, it is not on the mind of most people.

I'd be all for a class of retro / minimal sites or something, but it is clear that for a lot of things, web apps are where we're going.


The problem with pictures and modern websites is that they aren't there. Instead the pictures are only potential pictures which will only be loaded if you load the JS loaders that load the loaders that load the pictures (and all the tracking stuff).

With JS disabled you have no pictures, or a super-low-resolution placeholder at best. Or a blank page showing nothing at worst (e.g., nasa.gov).
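Roughly, the pattern looks something like this (a sketch only; the attribute names and selector are made up, and real sites bury it under far more indirection): a tiny placeholder sits in the markup, and a script only swaps in the real image once it scrolls into view, so with JS off the real image simply never arrives.

    // Hypothetical lazy-loader. Markup: <img src="placeholder.jpg" data-src="full.jpg" class="lazy">
    const lazyImages = document.querySelectorAll('img.lazy[data-src]');
    const observer = new IntersectionObserver((entries, obs) => {
      for (const entry of entries) {
        if (!entry.isIntersecting) continue;   // not on screen yet, keep waiting
        const img = entry.target;
        img.src = img.dataset.src;             // only now does the real image start downloading
        img.removeAttribute('data-src');
        obs.unobserve(img);
      }
    });
    lazyImages.forEach((img) => observer.observe(img));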


Pictures are still a huge pain in the ass on mobile connections. Try looking at Facebook or Instagram or Twitter on a flaky 2G connection. You don't get to see them stream in line by line like you used to in the good old days - they just take forever, downloading a muddy, blurred version of the actual image first as a placeholder.


I hear ya.

Even better when you see what seems like a high-res picture that is sized small, and you click to see it better... oh crap, it's starting the download all over again... and I'm 99% sure the pic was already there before. (Granted, it may not have been high-res, but damn it, it was close enough.)


Try these sites without javascript. It's AMAZING.

The chrome extension I'm using ("Quick Javascript Switcher") disables JS on a domain-by-domain basis, so it doesn't break the whole web. Every news site I've tried is massively more usable with Javascript disabled - loads in a snap, scrolls without jerky motion, zero popups or autoplay videos.

Usually you get all the text and most of the images. Some news websites have javascript-driven photo collages, but it's just one button click to enable JS for that session, and I can decide to do so after reading the text and judging the wait worthwhile.

I feel like I've stumbled on a secret life hack. Try it.


I can vouch for this, although I've been doing it with regular Chrome (JavaScript disabled by default) and I whitelist sites with the click of an icon in the location bar.


I have been using a 4GB RAM HP ProBook for the last 5 years (a student-era artifact). You have to be very disciplined to use it, though. I never open more than 10 tabs in the browser. While programming and testing, I generally close the browser. I use only a lightweight WM and avoid JS-based UIs. My stack is vim/emacs (sorry) + a compiler (clang/rustc/sbcl) or perl/python + zsh/bash (userspace) + a terminal emulator + occasional firefox + wget/curl and weechat.

I read PDFs with emacs. My computer is pretty fast compared with my peer's higher-spec machine (16GB) running VS Code, Slack, Chrome and whatnot... But he does play games better!


One reason why I subscribe to https://lwn.net/ is that it is a fast no bullshit web site. Another is the great content.


As an external observer, I must say we are going too fast, in an unorganized fashion! Software is eating the world and swallowing more than it can chew.

Businesses have monetized the web in such an unsustainable way that we have to introspect to clear our shit. What would have gone wrong if the general public still got good, high-quality old print media while the otherwise tech-savvy worked patiently to make a better web? What would have gone wrong if the web had remained just a portal for sharing textual information, while other people did what they had been doing traditionally?


I've long been thinking that the way to "fix" journalism in the US (and possibly the rest of the world) is to have Apple Music/Spotify-like channels that the major publishers broadcast through. It solves the problem of having 50 different one-dollar-per-month subscriptions, and it aggregates content in a way that I would never have trouble paying for if it were done correctly. This model is already in place for things like satellite radio, cable TV, the aforementioned Apple Music and Spotify, Netflix, etc.

Am I the person that needs to build this??!!


Haha, I think the web is a lot like markets. Buyers will pay up to what they can afford, i.e. the best price is one the buyer feels very uncomfortable with but still pays, because they want something and they can't get it anywhere else at a better price.

Sometimes this is abused (see US healthcare) but that’s the rule of markets. Demand and Supply.

I guess the internet is the same. The media companies will try to use every analytics and ad company under the hood to maximize every little ad click they can get. They will fill the pipes to the brim and implement every dark pattern as long as they make that extra revenue. (See taboola - the scum of clickbait advertising)

At the end of the day, if you don't like something, don't use it. I rarely visit CNN. I install ad blockers and tracking blockers. We fight it with what we have.

It's the law of supply and demand. We as consumers install stuff to block annoying things; they as sellers will keep on annoying you as long as you keep visiting them, because the mild annoyance of ads and auto-play videos is how they keep the lights on.

The case of Google, however, is that they are on a mission to have a monopoly on search in their browser and ads on their platforms. That's how they make 90%-ish of their revenue. Google can say whatever the hell they want their mission is; their actions and financials clearly show what they value.


This phenomenon is not limited to the BS web. Back when virtualization was the new wave, everyone raved about the money/space/energy savings from running multiple virtual machines on a single server. It was great. Then what happened? Many folks went nuts, spinning up VMs for anything and everything, and suddenly needed to spend more money on more physical servers to run more VMs, and so on in an infinite loop.

Bloat from convenience.


In other words: horror vacui


Humans are inefficient, illogical, and wasteful. "Dilbert" is not actually a comic strip, but a catalog. The only way to stop all this is to Kill All Humans. We await your confirmation__

- Bot #72504


The reason this website currently crumbles under the load is probably because the content is stored in a database and re-queried for every request, even though the content will hardly ever change. Might be something else to look at ;)


It's cached and, I promise, hasn't crumbled under heavy load for years. I don't know what's going on but I've asked for more resources. It does make me sad and embarrassed, though, so that's something.


Lately I've been noticing this with Facebook. It is interesting that in its early stages, my perception of Facebook was that it was a very well built site. It seemed simple, clean, loaded fast, etc. It seemed particularly lean when compared to Myspace. Without really delving into the details of exactly what is happening today, I definitely perceive the web app as being slower. Obviously it is delivering a ton more images and video but nonetheless the experience seems slow. The ios app on the other hand still feels very performant to me. I think their handling of video is really impressive.


I'd long griped about Google+ page size. Happened to poke around FB for a bit (couple of years back). Immensely worse.

Google's somewhat criticised redesign about 18 months ago actually hugely improved memory usage. The site's still on a declining trajectory....


While we're on the topic of web performance regression can we please talk about all these damn loading spinners.

JS and Ajax were supposed to save us.

Now, instead of one slow page load, I get one slow page with a loading spinner, a login component with a loading spinner, a carousel with a loading spinner, and a latest-news component with a loading spinner, with content divs shifting and bouncing every which way as content loads.

It looks like someone spilled a bucket of ajax-load.gif all over the damn page!

I'd prefer a blank page and the sudden appearance of the entire page.


I do all my browsing with uMatrix with everything blocked by default but first party CSS. It really makes the internet a more enjoyable place to be. Pages load much, much faster and it's often easier to find and read the content. On mobile it saves a boatload in bandwidth costs. For example, loading a NYTimes article takes 90KB and DOMContentLoaded is in 69ms. Truly much of the other stuff loaded is bullshit because a significant amount of the web is more usable without all the crap.


Modern televisions and combined recorder-tuners are the same..

My old black & white and colour valve television sets and VHS video recorder are up and running well before their modern equivalents!


I've been thinking about building either a local proxy or firefox addon that puts everything into "first-party only" mode unless whitelisted.

It would probably break the web (at least the genuine parts) less than disabling javascript wholesale which is just too awkward, but it would vastly cut down on the "bullshit" as this article calls it.

It would need some care; for example, it would probably have to match origins on root domains rather than subdomains to prevent too much breakage. But the improvement in download times would be astronomical; it's almost always the case that "bloat" is third-party bloat.

Obviously it would need to support a whitelist too so payment processors for example could continue to work, but in general the blacklist approach of ad-blockers just isn't working for me.

I think some kind of "auto-whitelist" so I'd need to actively request a domain before requests could be made to them would be the sweet spot for user experience but that itself would require substantial browser integration which I don't think could work through a plugin.

Perhaps a proxy approach would be the best from a UX perspective then. It could inspect headers to figure out if they're primary requests (using similar heuristics to CORS). Primary requests would (or could) be added to an auto-whitelist for future requests.
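To make the add-on idea concrete, here's a bare-bones sketch of just the blocking part, assuming a Firefox WebExtension with the webRequest, webRequestBlocking and <all_urls> permissions (the "last two labels" root-domain check and the whitelist entry are placeholders; a real version would use the Public Suffix List and persistent storage):

    // background.js - block any request whose root domain differs from the page's.
    const whitelist = new Set(['example-payments.com']);   // hypothetical whitelisted domain

    // Naive registrable-domain check: keep the last two labels only.
    // A real implementation would consult the Public Suffix List.
    function rootDomain(hostname) {
      return hostname.split('.').slice(-2).join('.');
    }

    browser.webRequest.onBeforeRequest.addListener((details) => {
      const pageUrl = details.documentUrl || details.originUrl;
      if (!pageUrl) return {};                              // top-level navigation: allow
      const pageRoot = rootDomain(new URL(pageUrl).hostname);
      const requestRoot = rootDomain(new URL(details.url).hostname);
      if (requestRoot === pageRoot) return {};              // first-party: allow
      if (whitelist.has(requestRoot)) return {};            // explicitly whitelisted: allow
      return { cancel: true };                              // everything else: block
    }, { urls: ['<all_urls>'] }, ['blocking']);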


uMatrix / uBlock seems to do what you want.


uMatrix looks like the kind of tool I was thinking of, thanks. I just wish I didn't have to give an addon permission to:

Access your data for all websites

Clear recent browsing history, cookies, and related data

Read and modify privacy settings

Access browser tabs

Access browser activity during navigation

But that's just the very broken permissions model that addons currently have so thank you.


RequestPolicy is another addon which does this, but it doesn't work on the latest Firefox.


Nah, this breaks too many CDNs and most pages stop working, just as if you had disabled JS.


Web developers often don’t have a choice whether to ship bullshit or not. Hell, engineering managers and directors often probably don’t have this choice.

The single greatest feature of AMP (which I dislike with great passion, don’t get me wrong) is that it forces broken org structures to make better engineering choices. There is no negotiation. There is no “marketing director is higher on the pecking order”. There is just design constraints placed from outside of the organization that must be followed.


Some of the worst offenders are logging webservices. We've replaced a simple text file with a bloated site that requires mousing around, does not support grepping, etc.


Amen!

And this article doesn't even touch on the bullshit content – I don't want 10 websites that all say generally the same thing with different layouts (and 10 times more total bullshit consumed). I can't stand the internet today. The experience on mobile is even more unbearable.

I want simplicity.

Can we all just band together right here and now and create a subnet with a better set of principles?

- minimize page weight
- minimize duplicate content
- minimize UI variation
- what else?


> - minimize page weight
> - minimize duplicate content
> - minimize UI variation
> - what else?

- allow for (or even encourage) automated processing of content


Yes! That's a good one.

I'd also say, probably minimize the use of javascript and dynamic content.


My sites are pretty plain, mostly using a bit of Bootstrap. Because of the new European data protection laws (which I mostly like) I also converted my Blogger-based blog to generated static resources.

Like so many other people I ignore sites with too much baggage. Many news sites have text only versions if you look.

I sort of have some ads on my main site that are text links to where my books are sold. Advertising does not have to be resource-heavy.


UX density is reduced by icons and images. Utility used to come through link quality (i.e. the portal) but has been reduced to ranking on search endpoints to collect nickels.

Here's a very useful site that uses Javascript without going near megabytes: https://www.freeformatter.com/


Everyone demands that the services of the internet have zero explicit cost. This means that all the cost is borne implicitly.


It’s not just the web, software in general is getting increasingly bloated.

Not in features, but in more and more layers of abstractions.



It strikes me that, if people want web developers to load less stuff, there have to be ways to reduce redundancy. It might also make privacy easier.

For example, some sort of self-hosted, standards-based framework for tracking common things like scrolling, ad clicks, or page visits, supporting multiple consumers: detailed analytics, logging, and ad-view tracking for the site publisher internally, and less granular access for external companies that can add value. Your users' data stays on your self-hosted user data server, and analytics providers can make server-to-server requests for aggregate data over GraphQL or at least a standardized API, improving user privacy.

If you really needed additional functionality, there could be standardized add-on modules, or updates to the spec.
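As a toy sketch of the "data stays on your own server" half of this (the endpoint path and event names are invented for the example), the page would only ever talk to its own origin, and third parties would have to request aggregates from that server instead of running their own scripts in the page:

    // Hypothetical first-party beacon: events go only to the site's own origin.
    function report(eventName, data = {}) {
      const payload = JSON.stringify({
        event: eventName,
        page: location.pathname,
        ts: Date.now(),
        ...data,
      });
      // sendBeacon queues the request without blocking navigation;
      // '/analytics/collect' is a made-up same-origin endpoint.
      navigator.sendBeacon('/analytics/collect', payload);
    }

    report('pageview');
    window.addEventListener('scroll', () => report('scroll'), { passive: true, once: true });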


The web is not all bullshit. It's just that certain areas of online business are incentivized to get bloated and others aren't. I learnt this through personal experience:

On one hand, I own a news site, and on the other, I own a price-comparison website. The news website started off at a 90% rating on PageSpeed tools, but as the ads came in and people complained that the news site looked ancient, that score got whittled down to 30-40% and the page weight went up by orders of magnitude as well. The price-comparison website, however, has a 100% rating on PageSpeed tools and has stayed that way, because its goal is to get people to the best-priced retailer as quickly as possible.


Whilst this is true, and it already references another analogy in 'bullshit' jobs, it's just another piece of the 'bullshit' magnetism of human nature.

Church, TV, Music, Movies, Conversation, Food, Meetings, Politics, Watching sport

The 'bullshit' to 'worthwhile' ratio is 95:1 at best.

I've just depressed myself.

Maybe it's a pattern that humans are programmed to follow subconsciously. Our brains cannot handle consuming / participating in anything less than 95% bullshit. Maybe it's the bullshit time that allows us to cope with the 5% of the 'real'?

Things such as the 'bullshit' web are just a differently-contexted manifestation of that pattern of behaviour.


It's even harder to understand this given the advance of modern web technologies. In the '90s you had a single channel to download unoptimized GIFs, and websites used to be built from these images. Now you bundle, minify and gzip all these stylesheets and scripts, which are capable of 1000x what you did back then, and then load a (usually precached) analytics script from a CDN while fetching MP4s in parallel and streaming them on the fly. It's just bad developers. Facebook takes about 100ms to load everything (but still has a shitty UI). And they build tools so everyone can do this too. And developers now brag about how complicated the web is.


This is in part why I'm developing https://html.brow.sh - a purely text-based web, rendered by a fast remote modern browser.


Heh, if you look at the JS console while loading cnn.com you'll see the following:

     .d8888b.  888b    888 888b    888
    d88P  Y88b 8888b   888 8888b   888
    888    888 88888b  888 88888b  888    We are trying to make CNN.com faster.
    888        888Y88b 888 888Y88b 888    Think you can help?
    888        888 Y88b888 888 Y88b888
    888    888 888  Y88888 888  Y88888    Send your ideas to: bounty AT cnnlabs DOT com
    Y88b  d88P 888   Y8888 888   Y8888
     "Y8888P"  888    Y888 888    Y888


>>"... pretty much any CNN article page includes an autoplaying video, a tactic which has allowed them to brag about having the highest number of video starts in their category. ... People really hate autoplaying video."

The result is that, even though I used to watch CNN often, it has been years since I've intentionally opened one of their web pages, and when I accidentally do so, I almost frantically close it to shut off the damn auto-play -- and that's even if I was interested in the video content.

I'll get it somewhere else, thx.


It's the _website obesity crisis_ in full swing

https://news.ycombinator.com/item?id=10820445


The Web is the quintessential human artifact. This is what happens when large groups of people crap and eat from the same pile, the fast, the slow, the brilliant and the stupid, the refined, mundane, bloodthirsty and curious, the intricate and obscure.

I think of the web in layers. Not discrete, divisible layers, but more like the layers of any ancient city that's seen habitation for centuries. At the lowest level is the oldest stuff: mud huts, stone tools, open fireplaces, graves even. This is like the earliest layers of the web: the webrings, the no-css, no-script HTML bulleted lists, tables of blue links. Even garish black backgrounds. That old web was full of great things; so many enthusiastics and fanatics! So much information and content, written by real people. Short stories, poems, hackers, phreakers, crackers, IRC, that whole great era. That old stone age is mostly buried now, preserved here and there in museum quality, crumbling here and there, broken links, missing images. A mute reminder, hard to find even, of what it was like back then. BACK THEN, before the next layer of the web evolved on top. BACK THEN people on the internet were mostly curious but not nosy or malicious. BACK THEN people put effort into their sites, had to pay money to host their domains, didn't need CDNs and VMs and Cloud Computing.

But a layer evolved on top of that old web. It came with CSS in my estimation. Pages started getting fancy, using new fonts. Ads started popping up. Ads always started popping up. Search engines popped up. Then people started making money. Little trickles at first, then a gush, then a torrent. Online businesses, the FIRST BUBBLE, and everyone was rushing to pets.com and eBay and online pharmacies...you know, companies selling actual stuff. Amazon. Online publishing, news sites.

That bubble blew up. It grew too fast, people went into far too much debt to puff up their businesses. But quietly chugging along, always increasing, the little banners and annoying popups gave way to a more insidious form of advertising...the watching eye behind it all. The internet started watching us. First it was search history, then cookies and then fingerprinting, then whole underground economies of trackers. And all the while the SEO battles and trollers came along, so fast paced...

And something...else...grew on top of the web. You can see it now, maybe just the tip, when you go to one of the news sites that the OP talked about. The massive, heavy sites. Those are just the most polished of this massive avalanche of click-baity slide shows and fake news and crap that is heavy, laden with strewn together junk parts and oriented to only one purpose: making money, by hook or crook.

Now you can't hardly see through this layer anymore. It's like a fog. Go to any search engine and search for anything! What do you get? Aggregated, evolved--I won't say optimized--evolved content that is designed to keep you away from the older layers. I say evolved because that is exactly the right metaphor--the crap that survived by natural selection and crowded out the other, more carefully crafted, humble and matter-of-fact content. This new crap evolved to get straight to the top of the search ranking and grab those clicks. It doesn't matter what the original content was. The more commercial it is, the crappier it is for real content, and the more driven it is towards getting you to click through and BAM make a sale. For god sakes, try to find some neutral information about insurance. Try to find that one guy's website that as part of a trip report to the southwest to go hiking with his kids, talks about how the rental agency wouldn't reimburse him for a flat tire. Or some basic, old-web archaeology like that. You cannot find that stuff anymore. It's hidden by a huge layer of commercial bullshit that is designed to lure you in, sell you crap, get you to sign up for newsletters, take surveys, or at the very least track your ass at the slightest sign you might be interesting. And don't think for a minute that it's all an accident, or some unforeseen consequence or poor search ranking function. The whole system is set up to, and rewarded by, and fed by, their ability to serve themselves, not you. Make no mistake. If the algorithm makes more money, it's gonna get shipped. People might wring their hands about it, but the slippery slope is still slippery, and no one can hold the line forever. Least of all when that takes mental energy and forethought...something SO much better suited to ML algorithms and scale. Just scale up to the whole web.

We lost our way. The web isn't run by us anymore. It's run by them. And them... whoever they are... search engines, advertisers, publishers, people with political agendas, psychos, dictators, people with power. They don't even have control either. Look at the news sites and aggregators. They tell you what they want to tell you. You can't even set preferences anymore. It's all driven by AI. For fuck's sake, nobody really knows what to tell the AI to do, except make money.

AI. AI to rank what's important. To tell us what we should look at, watch. Buy. Adjusting news to either make us mad or placate us. Creating and reinforcing a bubble that absolutely always benefits someone else besides ourselves. We keep giving it subgoals, but it'll just keep going around them to what we really want, making money, because that's all we ever reward it for!

The fat-ass webpages are just a symptom. The disease is that everyone is shoveling shit into your face just to make a buck. Everyone is trying to automate their crap as fast as they can, throwing crap at the wall to see what will stick. And they just. don't. give. a shit.


Best post this month, possibly this year.

titzer says:"...You cannot find that stuff anymore. It's hidden by a huge layer of commercial bullshit that is designed to lure you in..."

So true. [And wouldn't it be nice if search engines had an option whereby one could first set a date and thereafter search results would display as they once did on that date?]

But again, great post! Kudos to you! Well said, sir!


I use the “clean reader mode” (in safari, the small three stripe button next to the https lock in the address bar) very often to remove the clutter.

But this still means I downloaded all the unnecessary crap. Which is primarily annoying because of the delay in loading (crappy reception is the norm outside of Canadian cities).

Whereas Hacker News loads nearly instantly.

Is there a good alternative browser for iOS that avoids downloading JS, video, bloat?


Every web site (or app) wants to look "cool" and to differentiate itself from others based on the look. This quest to look cool leads to all the animations, images, etc., which leads to the increase in download size. For the most part, the reason for this quest to look cool is to grab as much user attention as possible so that they can do their actual business, i.e. ads.


I'm going to have to disagree on the hostility to AMP.

Specifically with this paragraph:

> It seems ridiculous to argue that AMP pages aren’t actually faster than their plain HTML counterparts because it’s so easy to see these pages are actually very fast. And there’s a good reason for that. It isn’t that there’s some sort of special sauce that is being done with the AMP format, or some brilliant piece of programmatic rearchitecting. No, it’s just because AMP restricts the kinds of elements that can be used on a page and severely limits the scripts that can be used. That means that webpages can’t be littered with arbitrary and numerous tracking and advertiser scripts, and that, of course, leads to a dramatically faster page.

> ...[supporting evidence]

> So: if you have a reasonably fast host and don’t litter your page with scripts, you, too, can have AMP-like results without creating a copy of your site dependent on Google and their slow crawl to gain control over the infrastructure of the web. But you can’t get into Google’s special promoted slots for AMP websites for reasons that are almost certainly driven by self-interest.

The point of AMP is exactly this restricted spec - it's so Google can statically verify that your site follows their performance guidelines. You can write a really fast website if you want, but unless you're willing to let Google make sure that you're actually doing so they're not going to take it on faith.


But Google can measure a page's weight and load time when it indexes it and use that in its PageRank calculation; it doesn't need to take control of your page to do it.


Here is non-bullshit cnn: http://lite.cnn.io/en


The Web was becoming unbearable until I started browsing with all CSS and JS off by default (using umatrix and 50,000 lines in my hosts file). It's an amazing improvement. If I can't read something, I enable some CSS and change the defaults for that site. Firefox reader mode helps. I also disable all CSS animation with Stylus.
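The disable-animations bit is tiny, by the way. The same effect as the Stylus style can be had from a user script that injects one rule (just a sketch of the idea, not what Stylus actually does under the hood):

    // Sketch of the "no CSS animations anywhere" rule, injected from a user script.
    // (Stylus applies the equivalent rule as a user style; this is the same idea in JS.)
    const style = document.createElement('style');
    style.textContent = `
      *, *::before, *::after {
        animation: none !important;
        transition: none !important;
      }
    `;
    document.documentElement.appendChild(style);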


It's not just the transferred content; it's literally minutes of CPU time on some webpages just to run the JavaScript. This isn't noticeable on a fast desktop, but try running some of these pages on Atom-class PCs or 5-year-old phones (or, for that matter, put Firefox on your phone and request the desktop site).


These days, when I visit a website I have not visited before, it feels like entering a war zone. Trying to extract some information while the enemy tries to kill me.

Enabling hostname after hostname in umatrix until the content is revealed. Hoping not to trigger too much user hostile crap along the way.


What might be useful is a browser that extracts most of the information content from web pages and discards most of the formatting. The text-only browser Lynx comes to mind, but I think it's obsolete now and doesn't show graphics, which are sometimes useful/necessary.


Reading mode on safari does this.


This static website seems to be down.

Archive link:

http://web.archive.org/web/20180731143228/https://pxlnv.com/...


Not only that, but it also has 2 trackers on it (google analytics and carbonads). Hard to take this argument seriously.


I don't use Google Analytics, and Carbon's script is restricted by my CSP to showing the display ad. They also have a reasonable privacy policy where they're not tracking users or generating libraries of behavioural data, as far as I know.


Ah, you're right, it's Piwik. It was blocked by uBlock, though, and it serves the same purpose (user tracking).


Fair. I take what I think is a reasonable and respectful approach, though: it takes only a partial IP address (which I don't look at), it respects Do Not Track, it anonymizes as much as possible, and it's basically a glorified hit counter. It's pretty lightweight, it's the only analytics script I use, and it's localized rather than sending users' data to a giant company.

This article should not be seen as an all-or-nothing approach. It's more the amount and type that concerns me.
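(For what it's worth, the respect-DNT part is the cheap bit client-side. A sketch of the general idea only, not how Piwik actually wires it up - Piwik has its own built-in setting for this - and the script path is made up:)

    // General idea only: skip loading the analytics script entirely when DNT is on.
    // '/stats/hit.js' is a hypothetical path for a self-hosted script.
    const dnt = navigator.doNotTrack === '1' || window.doNotTrack === '1';
    if (!dnt) {
      const s = document.createElement('script');
      s.src = '/stats/hit.js';
      s.async = true;
      document.head.appendChild(s);
    }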


Still, his site is lightweight and loads quickly.


Thank you. This is so embarrassing.


There's nothing embarrassing about it; I'm yet to see a WordPress-based site which was able to withstand the Hug of HackerNews, though SuperCache should have helped.

For a WordPress site, yours is slim and very fast.


I know of a few sites that have a light version. CNN for example: http://lite.cnn.io/en

But I find that articles are so media-heavy these days that removing videos or images detracts from the meaning.


Webfonts need to die. I'm sick of loading a page, seeing the text for a moment, then having it vanish and being forced to wait, possibly for multiple seconds, to see the text again in some pretty font the authors thought I cared about.


http://www.textfiles.com This just reminded me of this site. The site is quite old now, but it used to be a great source of information.


I dunno; if a site takes more than a second to load, goodbye! Then again, surely most of the HN crowd uses something akin to uBlock. What of those other poor souls?


Most websites are not optimized well because optimization costs money and software engineers are expensive. There probably is not a market failure here.


Most web sites send a lot of data that is not for me, the user; it's for the website, to serve ads or track me or do other things that I either don't care about or would much rather they didn't do. The website had to make a conscious choice to include all that stuff; it's not as though it was already there and would have to be optimized out.

In other words, the problem is that most websites are optimized, but not for their users.


Agreed; some bloat is due to ads and trackers, some is due to other things (like lack of optimization).


I appreciate that the author is doing what he's preaching: his page is less than 10 KB, including ads and (self-hosted) analytics.


I agree with the spirit of the article, in that we should be striving for a leaner web.

But last year I had to build an iOS SDK. As an SDK, we wanted it to be as small as humanly possible. It came out to around 12mb, which is obviously too large. So I tried removing literally everything but one file, and it came out to 10mb. So an iOS package, compiled, with only one class, comes out to 10 megabytes.

Yes, the web can improve, but in the grand scheme of things I don't think it's as dire as some make it out to be.


I dare to disagree: 10MB for a single-class-app is "as dire as some make it out to be".

For comparison: Doom 2 came on 4 1.44MB floppy disks, Duke Nukem 3D on 13. Full games, including all assets....

going into a quiet corner for some weeping


Yeah, and if it were only the MBs, well, OK: everyone prefers HD to SD video. But the real problem is that the extra MBs we see are mostly things we don't need, that don't add anything, that we would be better off without, and that in many cases are completely unjustified and hinder not only the user but also the developers themselves (who are sometimes so lost that they don't even realize it).


1 MB RAM was enough to run the Space Shuttle.



I entirely agree with the author's points.

However, that CNN page[0] took just ~5 seconds to load. In fully readable form, with images. Through a three-VPN nested chain, with ~240 msec total latency. But then, I block ads, most scripts, and fonts.

0) https://www.cnn.com/2018/07/24/politics/michael-cohen-donald...


This is why I need something like uMatrix to explicitly whitelist the text and leave the marketing wibble on the server.


I should go work for a company that cares about this stuff. Time to go back to adding Adobe Launch on a dozen sites.


you only need to use a wifi connection on an airplane to see how painful things are - it's like a time warp


The sad thing is that it's not a time warp. It's how millions of people still browse the web today. I know people who are still on 1.5Mb DSL, who are only a couple of miles from the city center. Not only is the line limited to 1.5Mb, but the lines are so deteriorated that you can't get speeds over 768Kb. It's extremely common any time you get even a little bit outside of the city.


It's already painful on less-than-ideal hotel WiFi where, worse, people tend to go to new/uncached content such as maps, weather sites, and regional stuff.


I've done Remote Desktop from an airplane and that was actually quite snappy


Or go to Antarctica. A lot of "normal" web pages don't load.



A good example of the Jevons Paradox


Is it a degradation of the web? Or is it just a revolution in our hardware and internet speeds?


The main problem is and remains: money.

Professional news websites and the like don't work for free.

But the majority of people are not willing to pay.

So: ads and tracking and more ads and more ads, as this is the default business model for "free" services. You pay with your data and attention; nothing new.

I would like this to change towards the way Wikipedia works, for example: no ads, voluntary payment, no paywall, free for everyone, even though only some people pay.

But micropayment services are not good enough or widespread enough, and the average mindset is not there either.

But it could get there slowly, once people realize the true cost of all those "free" services and that paywalls are not nice either.

And besides, even though I agree with the sentiment of the article, the comparison of plain text to a styled article with pictures... is not really valid. I sometimes like reading plain text, but I enjoy a well-done website more, with nice fonts, styles and pictures fitting the flow of information rather than distracting from it. I just don't like advertisement in general, or ads using my CPU to analyse me.


This is one of the reasons I like Jekyll. No PHP, no bullshit. It's fast, pre-compiled pages. You can use web fonts if you like, or keep it all local. Speed usually takes a hit with websites when people try to monetize or add fancy features.


You can still add all that bullshit with Jekyll. Whether a site is dynamically or statically generated has nothing to do with bandwidth waste.


The site would have been nicer with a few images. Also, the text layout was tedious; different fonts should have been used to convey metadata about the information in it.


That CNN article he mentions does take 30 seconds to finish loading... However, you can start reading the article in less than a second. Actual perceived load time is nearly instant.

From the actual end-user's perspective, this article is mostly bullshit.
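(If you want to check this yourself, the gap between "readable" and "finished" shows up directly in the Performance API. Paste something like this in the console once the page settles; the numbers will obviously vary:)

    // Compare "readable" milestones with the full load event.
    const [nav] = performance.getEntriesByType('navigation');
    const [fcp] = performance.getEntriesByName('first-contentful-paint');
    console.log('first contentful paint:', fcp ? Math.round(fcp.startTime) : 'n/a', 'ms');
    console.log('DOMContentLoaded done :', Math.round(nav.domContentLoadedEventEnd), 'ms');
    console.log('load event finished   :', Math.round(nav.loadEventEnd), 'ms');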


For this to change, websites will have to stop chasing growth.


If we have the Dweb, does that make this the Bweb?


This is so monstrously correct! Kudos!


Appstore based browser.


browse in w3m. problem solved.


One bright side to the end of Moore's law is that--until some new computational breakthrough comes along--we are at a web bullshit plateau.

Just think of the stuff they could pull off with ten more years of speed-doublings...


I have a policy of disabling javascript on any article based website.

It's worked out well: most news sites load lightning fast, better viewing experience, no videos. The only downside is companies like the New York Times that embed low-res images and then load the full-res versions in the background using JavaScript.

Here's a funny concept: if disabling JavaScript makes your website better, you failed.


Or just prefix the url with https://outline.com/


Yes, we should definitely go back to the days where there was only one stylesheet per page and the web wasn't accessible to those with visual impairments. Being blind is clearly just a lifestyle choice, and we shouldn't be catering to the blind agenda. /s

In all seriousness, most of the problem with bloat is on the mobile side. But in another ~1.25 years iPhones will have enough advanced LTE functionality that they will be basically the same speed as desktop computers. To whatever extent this is a real problem, it's not going to be nearly as big of an issue after another two or three years.


> we should definitely go back to the days where there was only one stylesheet per page and the web wasn't accessible to those with visual impairments. Being blind is clearly just a lifestyle choice, and we shouldn't be catering to the blind agenda.

How does bogging down pages with ads and tracking scripts and other bloat help the visually impaired? If anything, it should make things even worse for them than for sighted users, since it's easier (not easy, but easier) to navigate by eye to the part of the page you actually want to read than it is to skip over all the cruft using an accessibility add-on.


I can't imagine how poor the web is for blind users nowadays, with all the hyperactive JavaScript fucking with the DOM constantly. I really feel sorry for the poor gits who have to write screen readers.


What do you mean by "bloat is on the mobile side"?


The problem with bloat. That is, web pages are just much slower to download and render on mobile than on desktop. E.g. a page that takes less than a second to load on desktop can easily take ten seconds to load on mobile. But two or three years from now, that same page will load in less than a second also.


You may wish to re-read the article, right from the beginning. Mobile phones are already faster and better than the author's computer attached to his 56K modem way back then.


Oh yay, another article complaining about all the bullshit modern websites want to load, and then also complaining about the best solution so far.

Yes, it sucks that AMP requires loading a chunk of JS from Google. But you know what? It actually makes things better. It solves a real problem, better than anything else is solving any similar problem. Nobody else is successfully convincing publishers to slim down their pages.

If you have a better plan, let's hear it. But bitchy blog posts aren't going to convince your local paper to improve their page speed.



