Hacker News

Does YouTube have any incentive to “preserve” anything once it costs them more money than it brings in? If not, maybe it's a little early to say they are actually preserving anything. (The article does kind of acknowledge this.) Maybe its real role is as a link in a chain that leads to /r/datahoarders


Isn't this true of any entity? Even a Museum will eventually have to pay bills and have to decide to downsize or make changes. Even government funded preservation can have voters decide it isn't worth the costs and preservation can be dropped.

I do think private companies will be the quickest to drop preservation and the least likely to look for an external entity to hand it off to, which makes YouTube worse than a museum.


I don't think this is a fair comparison, as a museum is normally opened with the sole desire of preserving history (profit is often desired, but it is not the incentive for opening).

Do you think Google bought YouTube originally to preserve history or to create a new line of profit and enhance its brand?


Museums don’t own all of their collections though, and they too make decisions on what to display at any moment. Not all objects are worthy of display, some get rotated out back into some rich person’s attic.


Almost the entire collection of a museum is in storage or in archives if we consider number of objects as a whole. Only a very small proportion is on display and some of those on display will be from other museums.


I'd be very interested to hear about any respectable museum that sends off items that are not displayed to be stored into some rich person's attic (or for display at a private residence as I assume you are implying).


If you have a valuable artifact that you don't have on display, renting it for a large sum to someone is a reasonable thing to do. Of course it goes without saying that you ensure whoever gets it takes care of it. Better it be on display at a rich person's house where I might get an invite to see it than in the basement where nobody can see it.

As has been noted museums tend to be on the brink of financial insolvency. Getting a rich person to rent some art is a valid way to get enough money to stay solvent and continue to provide art to others. There is a lot more great art worth preserving than there are museums in the world.

Note, the above is not a comment on IF it is done. I have no idea if it is possible for someone to actually do this.


Nothing like "loaning" out cultural heritage on the basis of who's the most solvent, museums or rich people.


As opposed to letting it sit in a basement/attic unseen?


> It is not from the benevolence of the butcher, the brewer, or the baker that we expect our dinner, but from their regard to their own interest.


The big museums all have a warehouse somewhere: https://en.wikipedia.org/wiki/Smithsonian_Museum_Support_Cen...

If you constantly collect items, and never discard them, then your collection will just keep growing (as will your warehousing bills). There's far more than can be put on display at many locations.


The modern art museum here in Stockholm often borrows items for its exhibitions from other museums, artists, or private collectors.


Borrowing items to display is not the same as giving away your own when you’re bored with them.


It’s kind of the opposite — lots of museum pieces are privately owned and loaned to museums.


What else would a museum do with a piece on loan to them that they don't want anymore besides return it to the owner? That could be another museum or a private collection.


That's neither what you said nor how museum loans work though.

Loaned items are specifically loaned to be displayed for a set length of time; it's not like borrowing a grass trimmer off of Facebook.


Considering their mission statement about making the world’s information universally accessible, and their contemporary goals around literary archival, it’s not a stretch to believe they had history on their mind.


“Mission”


What's that supposed to mean?


Aliens


Furthermore, museums curate what they hold whereas YouTube probably has far more i can haz cheezburger videos than humanity will ever need


Nothing lasts forever, but I still keep my milk in a refrigerator that might stop working rather than leaving it on the counter.


If you make it condensed milk you can store it on the counter for longer than it would've otherwise lasted in a fridge. Is maximizing longevity the only metric we can optimize for?


I think you’re looking at it the wrong way. YouTube is not preserving video game history by providing a place to host these videos. Instead, they’re offering an incentive for people to create these videos that are currently hosted on YouTube but will eventually be archived somewhere else.

In other words, YouTube is somewhat responsible for the creation of this new content, but has no expectation or responsibility to maintain its legacy.


And the same thing can be said for any type of video, not just video game videos. YouTube gave an incentive (not necessarily a monetary one) to preserve something to those who wouldn't have done so if such a service didn't exist.


I was going to say something similar. I actually think YouTube (as a hosting/sharing platform) is a net-negative for the internet because there are already millions of videos that were embedded in pages that are no longer working because YouTube removed them, the account was deleted, or any number of things. If the internet worked the way it was intended, these videos would have been hosted on the site rather than embedded and would be still working. If the video wasn't there anymore, that usually means the content isn't there anymore. It's almost worse than dead hyperlinks because it's almost always supplemental content that references a video that can't be accessed anymore. :(

That being said, it's a net-positive when it comes to content created. The internet wouldn't be as crazy without YouTube, so thanks for that.


In nearly all of those cases, the content was removed because either the uploader wanted it removed, or the owner of the intellectual property in the video wanted it removed.

In both of those cases, I don't see how this is YouTube's fault or how decentralized hosting would be better.

In the first case (uploader wants it gone) you have a sticky situation where loosely managed websites refuse to (or can't) respond to requests to remove, which can start interfering with various laws in various places.

In the second case, where IP has been violated, those same loosely managed sites run a significant risk of a lawsuit by not responding.

From the perspective of a content owner or video creator, why would you want your work decentrally stored in places where you can't control it?

The other angles here have to do with monetization, too. When you use YouTube, you can serve user-aware ads and track who is watching your video as it spreads. If it is self-hosted, your content has been stolen and is enriching someone else illegally.

From a creator's perspective I can't see how decentrally stored videos would be good.

Perhaps what we need is a Library of Congress type organization to begin a serious long-term archive of significant portions of this type of media.


I would agree with you if thousands of videos and accounts weren't removed daily due to false "intellectual property" claims and automated reporting. YouTube's system is terrible because they're so large that they've had to try to automate a system that can't really be automated. A decentralized system would require those claims to actually be validated before they're processed because, otherwise, it would be a waste of time and money for the companies making the false claims.

I agree with you on the Library of Congress. We need a digital LoC.


After thinking about it, a decentralized system of intellectual property theft would not work the way you think. The owners would stop contacting individual sites for individual confirmation and would send safe harbor / DMCA reports to the host who would nuke the content without spending ten seconds on it to preserve their business.

The only sites safe would be those self-hosting, and those wouldn't be able to serve a lot of bandwidth without having the means to be sued.


Even in that case, you'd have to deal with every hosting provider which, at the very least, would require an actual legal DMCA request rather than an automated one and, since people would be able to self-host, a way to contact their ISP who would have to verify that the claim and request are even valid. With YouTube, the automation is what's killing content creators and the archival value of the platform.


> I actually think YouTube (as a hosting/sharing platform) is a net-negative for the internet because there are already millions of videos that were embedded in pages that are no longer working because YouTube removed them, the account was deleted, or any number of things. If the internet worked the way it was intended, these videos would have been hosted on the site rather than embedded and would be still working.

If it wasn't for YouTube, most of those videos never would have been embedded in those pages in the first place. If the situation you describe was in any way feasible, people would have done that in the first place instead of using YouTube. You're asking for an imaginary free lunch type of situation.


What? That's exactly how the internet worked before YouTube. YouTube became the standard because you could easily upload things for free since the content was no longer the product. YouTube is only free because they can sell the data surrounding the videos to advertisers. Without that, we'd have exactly the situation that I'm describing because that's exactly the situation that we did have before YouTube came around (and that YouTube helped before it became beholden to companies rather than creators).


> What? That's exactly how the internet worked before YouTube.

Yeah--there weren't very many videos because most web hosting services didn't let you consume that much hard drive space.


I don't think that HD space was the limiting factor. We'd still have gotten to the point of unlimited storage space like we have now.


You overestimate the cost of storage at this scale.

It's totally worth them storing the data given the marginal cost.

A 10TiB HDD is only 150$ retail. These videos can likely be stored in less than 200MiB on average.

Storage has gotten cheaper YoY
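
For what it's worth, the arithmetic behind those two figures can be sketched roughly as below. This is only a back-of-the-envelope check in Python that takes the 150$ / 10TiB / 200MiB numbers from the comment at face value; the function names are made up for illustration, and it counts raw retail disk only (no replicas, no extra resolutions, no power, cooling, or staff).

```python
# Back-of-the-envelope check of the figures quoted in the comment.
# Assumption: raw retail disk cost only; replication, multiple
# resolutions, power, cooling, and staff are all ignored.

MIB_PER_TIB = 1024 * 1024


def videos_per_drive(drive_tib: float, video_mib: float) -> float:
    """How many videos of `video_mib` MiB fit on a `drive_tib` TiB drive."""
    return drive_tib * MIB_PER_TIB / video_mib


def disk_cost_per_video(drive_price_usd: float, drive_tib: float,
                        video_mib: float) -> float:
    """Raw disk cost in USD attributable to one stored video."""
    return drive_price_usd / videos_per_drive(drive_tib, video_mib)


if __name__ == "__main__":
    count = videos_per_drive(10, 200)
    cost = disk_cost_per_video(150, 10, 200)
    print(f"videos per drive: {count:.0f}")
    print(f"disk cost per video: ${cost:.4f}")
```

Under those assumptions a single drive holds on the order of fifty thousand such videos, so the raw disk cost per video comes out to a fraction of a cent. Whether 200MiB is a fair average is a separate question.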


Just a few days ago there was a discussion about how Google is now indexing less content from years ago [1].

If Google can't keep indexes, which are relatively inexpensive, why should they keep videos that are more expensive and that possibly no one will watch? Especially as the length and number of videos are increasing faster than ever?

[1] https://news.ycombinator.com/item?id=19762907


You're not scanning through bulk storage like that all the time. Indexes have to be managed to be useful too which is another cost.


Videos still need to be indexed, why would Google keep the videos without indexing them?


Let’s not confuse the videos with the index of the videos. Videos can be stored on the slowest and cheapest medium, while indexes might be kept in memory or flash. Each video takes hundreds of MB compressed, whereas an entry in the index could be a constant size per video.


Indexing and storage are two different things. If something is not accessed regularly, then you don't want to index it. You can still access it, it just takes longer to find.


Okay, so the videos may become preserved but inaccessible.


Google might have thrown out some old pages, but the internet keeps growing exponentially. I highly doubt that Google is indexing fewer pages than a couple of years ago.


The Google indices are very expensive to create and serve. For one, they use a ton of memory.

YouTube has TPMs whose only focus is making storage very cheap.


I'd like someone to do the math on that 200MiB estimate. YouTube stores every video at multiple resolutions, and has replicas in data centers all around the world.


Ah! Small note. There are methods of compression and encoding that allow for scalability. There's some fancy signal processing you can do to encode multiple <framerates / resolutions / compression qualities> into a single bitstream without necessarily storing redundant data.

But, I'm not in industry, and the last time I poked my head in I recall things being way uglier and more complicated than I had imagined. There was a good conference talk that was linked here once but I've lost it. Talked about the sort of awful, buggy things (formatting/file wise) that people try to upload that break everything.


> Small note. There are methods of compression and encoding that allow for scalability. There's some fancy signal processing you can do to encode multiple <framerates / resolutions / compression qualities> into a single bitstream without necessarily storing redundant data.

A big note: none of this was used by YT, because they've always used widely used standard codecs and media containers. All the lower qualities of a video together take nearly as much storage as the highest one (so 2x total), and the bitrates are the same for all videos. As for the download part, a player in auto quality would first fetch the maximum quality it can at the current download speed and, if there's room or the connection isn't stable, also the 480p version. If the speed was enough to download the current and next chunks in the lower quality, it would download the best quality and switch over to it on completion. (There were times when the player downloaded the whole video in the best quality after it had already finished downloading it in the optimal one.)

> the sort of awful, buggy things (formatting/file wise) that people try to upload that break everything.

YT has a very narrow list of formats allowed for upload, with hard breaks in the converting process, especially for the audio.


Interesting, thanks! I'd love to read more about ways to encode video that allows streaming at a variety of resolutions with a minimum of redundancy, if anyone has any links.


For me, it came up in a course I took last term which used the textbook "Video Processing and Communications" by Y. Wang, J.Ostermann and Y-Q. Zhang.

It's a bit out of date (2002), but the core video encoder material was solid. It's a bit heavy in representing concepts using symbolic/equation notation. That, for me, made the jump from 1D signal processing to multidimensional signal processing tougher than it needed to be.

There are probably better resources so definitely don't just take my word for it! :)

(For scalability, IIRC the terms "enhancement layer" and "base layer" are particularly important, as are the block diagrams that generate them.)


For low-traffic videos they likely only keep a copy in a single region.


Interestingly they could also just keep the best resolution version of rare videos, and have really powerful encoders if anyone suddenly asked for a lowres version. Then they can deliver as well as cache the lowres version in case someone else wants to look at it an hour later, but maybe drop it after a week of no access.
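
A rough sketch of that idea in Python, purely for illustration: keep only the source, transcode a lower resolution when someone asks for it, cache the result, and drop it after a week without access. Nothing here reflects how YouTube actually works. The class name, the `transcode` callable, and the one-week TTL are all hypothetical stand-ins taken from the wording of the comment.

```python
import time
from typing import Callable, Dict, Tuple

WEEK_SECONDS = 7 * 24 * 3600  # "drop it after a week of no access"


class OnDemandRenditions:
    """Hypothetical cache: only the source is stored permanently;
    other resolutions are produced on request and kept while in use."""

    def __init__(self, transcode: Callable[[str, str], str],
                 ttl: float = WEEK_SECONDS,
                 clock: Callable[[], float] = time.time) -> None:
        self._transcode = transcode  # stand-in for a real encoder farm
        self._ttl = ttl
        self._clock = clock
        # (video_id, resolution) -> (rendition, last access time)
        self._cache: Dict[Tuple[str, str], Tuple[str, float]] = {}

    def get(self, video_id: str, resolution: str) -> str:
        """Return a rendition, transcoding only on a cache miss."""
        key = (video_id, resolution)
        entry = self._cache.get(key)
        data = entry[0] if entry else self._transcode(video_id, resolution)
        self._cache[key] = (data, self._clock())  # refresh last access
        return data

    def evict_idle(self) -> int:
        """Drop renditions not requested within the TTL; return the count."""
        now = self._clock()
        stale = [k for k, (_, last) in self._cache.items()
                 if now - last > self._ttl]
        for key in stale:
            del self._cache[key]
        return len(stale)
```

The tradeoff is the one described in the thread: a shorter TTL saves storage but means more repeat transcodes, and a longer one does the opposite.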


They don't even have to be all that powerful. GPU transcoding is fast as heck. You really just need to decide where the line is between the storage tradeoff and needing a bigger GPU farm to handle more frequent transcoding requests. I'm betting you could get away with having just one copy for 90-95% of YouTube's content. There is an ocean of video on YouTube that is accessed less than once a month or even once a year.


That, and I’m sure they employ “cold storage” using slower (and cheaper) spinning disks.


You are correct about the cost of storage. However, that is not the cost of storing these videos. When you set up a server, you just buy a hard disk and some other hardware, correct? But YouTube also has to pay for staff to maintain and update services and networks, offices for them, legal stuff, free food, insurance, electricity, and every other thing a huge business needs, of course.

Google after 2007 has been about doing things that make business sense. Just storing these videos is a cost. It's why there's been a glacial pace in new features, APIs and support in YouTube for over a decade - it's not a financial priority.


That was the retail price. Most of the operational aspects at Google are somewhat fixed costs.

Power and cooling are per server. However, given the low load on long-tail video content and Google's ability to co-mingle high-IOPS workloads with low-IOPS workloads, it's not that simple...

TLDR, storage is fairly cheap in the scheme of things. Even if having infinite video storage costs 100mil/year, Google likely makes more from content monetization.


These videos are hours long, and unless you're doing low-quality 480p, which will make them hard to follow, they would definitely not fit in 200MB.


There is an incredibly long tail of videos that are (almost) never watched. All those 200MB videos add up to a lot of money.


> A 10TiB HDD is only 150$ retail.

Where is that? I can't find anything so cheap.



The bizarre thing is I can't find bare drives cheaper than that on Amazon - how are they making a profit on these?


Generally speaking the drives in external enclosures are from the bottom slice of the still-viable drive pool. You'll often find that they have shorter/less comprehensive warranties.


Larger market of average, undemanding home users for external hard drives.

Smaller market of more demanding enthusiast users for internal drives. (Larger businesses aren't sourcing drives from Amazon or Newegg, unless the company is called Backblaze.)

The latter costs more to support, and niche users generally pay more. (The more price-conscious resort to shucking external drives.)

At least, that's how it's been explained to me. I don't know if that's the entire story though it sounds reasonable enough.


Maybe it's one of those drives that comes with a demo of the backup software and they're hoping you'll buy the full version?


How many megabytes of storage are subsidized by, say, 100 views per year?


Maybe I'm mistaken, but my understanding is that YouTube has been operating at a loss for its entire existence.


Sources:

https://www.wsj.com/articles/viewers-dont-add-up-to-profit-f...

http://fortune.com/2016/10/18/youtube-profits-ceo-susan-wojc...

https://www.bloomberg.com/news/articles/2019-02-04/alphabet-...

While there's nothing concrete concerning how much YouTube costs and brings in, it seems to be generally understood that YouTube is an investment that hasn't reached profitability yet and only exists in its current state because Google is willing to lose money on it in the meantime.


> Does YouTube have any incentive to “preserve” anything once it costs them more money than it brings in?

They clearly have little incentive to do so.

If there were a law that prevented websites from using technical or legal means to stop people scraping user-generated content, then I would agree that YouTube has no such obligation.

But since they make it hard for people to make archives of their content, then I think they do have a moral obligation to keep it up.


It is a ponzi scheme right now. As long as cost of storage is falling exponentially, they can afford to keep the old stuff.


That the cost of storage is falling exponentially certainly helps but it isn't necessary if we want to keep the old stuff.

In fact most of the exponential increase in storage space is used to cover the even more exponential increase in video file size.

But there is a limit on how big video can get. Our perception is limited and past a certain point, we won't notice the improvement. Audio has been "perfect" for quite a while now, and with 4k video, we are getting there too. There is also a limit on the amount of content users can produce before people do nothing but film themselves.

We may find ourselves running out of stuff to store before we need to delete the old stuff.


You are thinking only about quality of video increases. I agree that is essentially finite if we ignore upcoming technologies such as 360 video and eventually VR ready video (fully 3d, any perspective).

My argument was primarily based on the number of videos uploaded (see https://tubularinsights.com/hours-minute-uploaded-youtube/). Looking at the area under the curve, supporting the historic videos is a fraction of what it takes to store the most recent ones.

I am arguing that as long as YouTube continues to be able to afford to store the recent videos, storing the historic ones is not that much more expensive.

edit: also, running out of stuff to store is not going to be a thing ever. The number of cameras in the world is increasing and people will keep shooting more videos.
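
To make the "area under the curve" point a bit more concrete, here is a small Python sketch. It assumes, purely for illustration, that the amount uploaded each year grows geometrically by a fixed factor; the function name and the growth factor are hypothetical and not taken from the linked data.

```python
def historic_share(growth: float, years: int, recent_years: int) -> float:
    """Share of everything ever stored that is older than `recent_years`,
    assuming yearly uploads grow by a constant factor `growth`.

    Year 0 uploads 1 unit, year 1 uploads `growth` units, and so on.
    """
    if not 0 <= recent_years <= years:
        raise ValueError("recent_years must be between 0 and years")
    yearly = [growth ** k for k in range(years)]
    total = sum(yearly)
    historic = sum(yearly[: years - recent_years])
    return historic / total


if __name__ == "__main__":
    # Hypothetical: uploads double every year for ten years.
    share = historic_share(growth=2.0, years=10, recent_years=3)
    print(f"share older than 3 years: {share:.1%}")
```

With flat uploads (`growth=1.0`) the old material dominates; the faster uploads grow, the smaller the share taken by anything older than the last few years, which is the shape of the argument above.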



