Does YouTube have any incentive to “preserve” anything once it costs them more money than it brings in? If not, maybe it's a little early to say they are actually preserving anything. (The article does kind of acknowledge this.) Maybe its real role is as a link in a chain that leads to /r/datahoarders
Isn't this true of any entity? Even a museum eventually has to pay its bills and decide whether to downsize or make changes. Even with government-funded preservation, voters can decide it isn't worth the cost, and the preservation can be dropped.
I do think private companies will be the fastest to drop preservation and the least likely to look for an external entity to hand it over to, which makes YouTube worse than a museum.
I don't think this is a fair comparison, as a museum is normally opened with the sole aim of preserving history (profit is often desired, but it is not the incentive for opening).
Do you think Google bought YouTube originally to preserve history, or to create a new line of profit and enhance its brand?
Museums don’t own all of their collections though, and they too make decisions on what to display at any moment. Not all objects are worthy of display, some get rotated out back into some rich person’s attic.
Almost the entire collection of a museum is in storage or in archives if we consider number of objects as a whole. Only a very small proportion is on display and some of those on display will be from other museums.
I'd be very interested to hear about any respectable museum that sends off items that are not displayed to be stored into some rich person's attic (or for display at a private residence as I assume you are implying).
If you have a valuable artifact that you don't have on display, renting it for a large sum to someone is a reasonable thing to do. Of course it goes without saying that you ensure whoever gets it takes care of it. Better it be on display at a rich person's house where I might get an invite to see it than in the basement where nobody can see it.
As has been noted museums tend to be on the brink of financial insolvency. Getting a rich person to rent some art is a valid way to get enough money to stay solvent and continue to provide art to others. There is a lot more great art worth preserving than there are museums in the world.
Note, the above is not a comment on IF it is done. I have no idea if it is possible for someone to actually do this.
If you constantly collect items, and never discard them, then your collection will just keep growing (as will your warehousing bills). There's far more than can be put on display at many locations.
What else would a museum do with a piece on loan to them that they don't want anymore besides return it to the owner? That could be another museum or a private collection.
Considering their mission statement about making the world’s information universally accessible, and their contemporary goals around literary archival, it’s not a stretch to believe they had history on their mind.
If you make it condensed milk you can store it on the counter for longer than it would've otherwise lasted in a fridge. Is maximizing longevity the only metric we can optimize for?
I think you’re looking at it the wrong way. YouTube is not preserving video game history by providing a place to host these videos. Instead, they’re offering an incentive for people to create these videos that are currently hosted on YouTube but will eventually be archived somewhere else.
In other words, YouTube is somewhat responsible for the creation of this new content, but has no expectation or responsibility to maintain its legacy.
And the same can be said for any type of video, not just video game videos. YouTube gave an incentive (not necessarily a monetary one) to preserve something to people who wouldn't have done so if such a service didn't exist.
I was going to say something similar. I actually think YouTube (as a hosting/sharing platform) is a net-negative for the internet because there are already millions of videos that were embedded in pages that are no longer working because YouTube removed them, the account was deleted, or any number of things. If the internet worked the way it was intended, these videos would have been hosted on the site rather than embedded and would be still working. If the video wasn't there anymore, that usually means the content isn't there anymore. It's almost worse than dead hyperlinks because it's almost always supplemental content that references a video that can't be accessed anymore. :(
That being said, it's a net-positive when it comes to content created. The internet wouldn't be as crazy without YouTube, so thanks for that.
In nearly all of those cases, the content was removed because either the uploader wanted it removed, or the owner of the intellectual property in the video wanted it removed.
In both of those cases, I don't see how this is YouTube's fault or how decentralized hosting would be better.
In the first case (uploader wants it gone) you have a sticky situation where loosely managed websites refuse to (or can't) respond to removal requests, which can start to conflict with various laws in various places.
In the second case, where IP has been violated, those same loosely managed sites run a significant risk of a lawsuit by not responding.
From the perspective of a content owner or video creator, why would you want your work stored decentrally in places where you can't control it?
The other angles here have to do with monetization, too. When you use YouTube, you can serve user-aware ads and track who is watching your video as it spreads. If it is self-hosted, your content has been stolen and is enriching someone else illegally.
From a creator's perspective, I can't see how decentrally stored videos would be good.
Perhaps what we need is a Library of Congress type organization to begin a serious long-term archive of significant portions of this type of media.
I would agree with you if thousands of videos and accounts weren't removed daily due to false "intellectual property" claims and automated reporting. YouTube's system is terrible because they're so large that they've had to try to automate a system that can't really be automated. A decentralized system would require those claims to actually be validated before they're processed because, otherwise, it would be a waste of time and money for the companies making the false claims.
I agree with you on the Library of Congress. We need a digital LoC.
After thinking about it, a decentralized system of intellectual property theft would not work the way you think. The owners would stop contacting individual sites for individual confirmation and would send safe harbor / DMCA reports to the host who would nuke the content without spending ten seconds on it to preserve their business.
The only sites safe would be those self-hosting, and those wouldn't be able to serve a lot of bandwidth without having the means to be sued.
Even in that case, you'd have to deal with every hosting provider which, at the very least, would require an actual legal DMCA request rather than an automated one and, since people would be able to self-host, a way to contact their ISP who would have to verify that the claim and request are even valid. With YouTube, the automation is what's killing content creators and the archival value of the platform.
> I actually think YouTube (as a hosting/sharing platform) is a net-negative for the internet because there are already millions of videos that were embedded in pages that are no longer working because YouTube removed them, the account was deleted, or any number of things. If the internet worked the way it was intended, these videos would have been hosted on the site rather than embedded and would be still working.
If it wasn't for YouTube, most of those videos never would have been embedded in those pages in the first place. If the situation you describe was in any way feasible, people would have done that in the first place instead of using YouTube. You're asking for an imaginary free lunch type of situation.
What? That's exactly how the internet worked before YouTube. YouTube became the standard because you could easily upload things for free since the content was no longer the product. YouTube is only free because they can sell the data surrounding the videos to advertisers. Without that, we'd have exactly the situation that I'm describing because that's exactly the situation that we did have before YouTube came around (and that YouTube helped before it became beholden to companies rather than creators).
Just a few days ago there was a discussion about how Google is now indexing less content from years ago[1].
If Google can't keep indexes which are relatively inexpensive why should they keep videos that are more expensive and possibly no one will watch? Especially as the length and number of videos is increasing faster than ever?
Let’s not confuse the videos with the index of the videos. Videos can be stored on the slowest and cheapest medium, while indexes might be kept in memory or flash. Each video takes hundreds of MB compressed, whereas an entry in the index could be a constant size per video.
Indexing and storage are two different things. If something is not accessed regularly, then you don't want to index it. You can still access it, it just takes longer to find.
Google might have thrown out some old pages, but the internet keeps growing exponentially. I highly doubt that Google is indexing fewer pages than a couple of years ago.
I'd like someone to do the math on that 200MiB estimate. YouTube stores every video at multiple resolutions, and has replicas in data centers all around the world.
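For what it's worth, here is a rough back-of-envelope sketch of that math. Everything in it is an assumption for illustration: the 200 MiB top rendition is just the estimate being questioned, while the halving ratio between renditions, the rendition count, and the replica count are made-up placeholders, not YouTube's real numbers.

```python
def storage_per_video_mib(top_mib, ratio=0.5, renditions=5, replicas=3):
    """Return (total MiB stored, per-rendition sizes) for one video.

    Assumes each lower rendition is `ratio` times the size of the one
    above it, and that every rendition is copied to `replicas` sites.
    All defaults are hypothetical placeholders.
    """
    sizes = [top_mib * ratio ** i for i in range(renditions)]
    return sum(sizes) * replicas, sizes


total, sizes = storage_per_video_mib(200)
print(f"one copy of all renditions: {sum(sizes):.1f} MiB")
print(f"with replicas: {total:.1f} MiB")
```

Under the halving assumption the renditions form a geometric series, so all of them together stay under 2x the top one. In this toy model the replica count matters far more than the number of resolutions.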
Ah! Small note. There are methods of compression and encoding that allow for scalability. There's some fancy signal processing you can do to encode multiple <framerates / resolutions / compression qualities> into a single bitstream without necessarily storing redundant data.
But, I'm not in industry, and the last time I poked my head in I recall things being way uglier and more complicated than I had imagined. There was a good conference talk that was linked here once but I've lost it. Talked about the sort of awful, buggy things (formatting/file wise) that people try to upload that break everything.
> Small note. There are methods of compression and encoding that allow for scalability. There's some fancy signal processing you can do to encode multiple <framerates / resolutions / compression qualities> into a single bitstream without necessarily storing redundant data.
A big note: none of this was used by YT, because they've always used widely adopted standard codecs and media containers. All the lower-quality versions of a video together take up nearly as much storage as the highest-quality one (so 2x total), and all bitrates are the same for all videos. As for the download side, a player in auto quality would first fetch the maximum quality it could at the current download speed and, if there was headroom or the connection wasn't stable, the 480p version as well. If the speed was enough to download the current and next chunk in the lower quality, it would download the best quality and switch over to it on completion. (There were times when the player would download the whole video in the best quality after it had already finished downloading it in an optimal one.)
> the sort of awful, buggy things (formatting/file wise) that people try to upload that break everything.
YT has a very narrow list of formats allowed for upload, with hard breaks in the conversion process, especially for audio.
Interesting, thanks! I'd love to read more about ways to encode video that allows streaming at a variety of resolutions with a minimum of redundancy, if anyone has any links.
For me, it came up in a course I took last term which used the textbook "Video Processing and Communications" by Y. Wang, J.Ostermann and Y-Q. Zhang.
It's a bit out of date (2002), but the core video encoder material was solid. It's a bit heavy in representing concepts using symbolic/equation notation. That, for me, made the jump from 1D signal processing to multidimensional signal processing tougher than it needed to be.
There are probably better resources so definitely don't just take my word for it! :)
(For scalability, IIRC the terms "enhancement layer" and "base layer" are particularly important, as are the block diagrams that generate them.)
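To make the base layer / enhancement layer idea concrete, here is a toy sketch on a 1D list of integer samples. It is only an illustration under my own assumptions (pairwise averaging as the downsampler, sample repetition as the upsampler, and the function names are invented); real scalable video codecs are far more involved.

```python
def encode_layers(samples):
    """Split an even-length signal into a base and an enhancement layer."""
    if len(samples) % 2:
        raise ValueError("expected an even number of samples")
    # Base layer: half-resolution signal, one average per pair of samples.
    base = [(samples[i] + samples[i + 1]) // 2
            for i in range(0, len(samples), 2)]
    # Enhancement layer: what the upsampled base layer gets wrong.
    upsampled = [b for b in base for _ in range(2)]
    enhancement = [s - u for s, u in zip(samples, upsampled)]
    return base, enhancement


def decode_full(base, enhancement):
    """Rebuild the full-resolution signal from both layers."""
    upsampled = [b for b in base for _ in range(2)]
    return [u + e for u, e in zip(upsampled, enhancement)]


signal = [10, 12, 20, 26, 30, 30]
base, enhancement = encode_layers(signal)
print(base, enhancement, decode_full(base, enhancement))
```

A low-resolution client would use `base` alone, while a full-resolution client adds `enhancement` on top, so the low-resolution data is not stored twice. The residual values tend to be small, which is what would make them cheap to compress in a real encoder.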
Interestingly they could also just keep the best resolution version of rare videos, and have really powerful encoders if anyone suddenly asked for a lowres version. Then they can deliver as well as cache the lowres version in case someone else wants to look at it an hour later, but maybe drop it after a week of no access.
They don't even have to be all that powerful. GPU transcoding is fast as heck. You really just need to decide where the line is between the storage tradeoff and needing a bigger GPU farm to handle more frequent transcoding requests. I'm betting you could get away with having just one copy for 90-95% of YouTube's content. There is an ocean of video on YouTube that is accessed less than once a month or even once a year.
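The keep-one-copy, transcode-on-demand idea from the last two comments could be sketched roughly like this. The class name, the `transcode` callback, and the one-week default are all hypothetical; nothing here reflects how YouTube actually works.

```python
import time


class RenditionCache:
    """Cache of on-demand transcodes that drops entries nobody has used."""

    def __init__(self, transcode, max_idle_seconds=7 * 24 * 3600,
                 clock=time.time):
        self._transcode = transcode      # callable(video_id, resolution)
        self._max_idle = max_idle_seconds
        self._clock = clock              # injectable so it can be tested
        self._entries = {}               # (video_id, resolution) -> (data, last_access)

    def get(self, video_id, resolution):
        """Return the rendition, transcoding from the master copy on a miss."""
        key = (video_id, resolution)
        entry = self._entries.get(key)
        data = entry[0] if entry else self._transcode(video_id, resolution)
        self._entries[key] = (data, self._clock())  # refresh last access
        return data

    def evict_idle(self):
        """Drop renditions not accessed within the idle window."""
        now = self._clock()
        stale = [key for key, (_, last) in self._entries.items()
                 if now - last > self._max_idle]
        for key in stale:
            del self._entries[key]
        return len(stale)


# Stand-in for a real encoder, purely for demonstration.
cache = RenditionCache(lambda vid, res: f"{vid} at {res}")
print(cache.get("some-rare-video", "240p"))
```

The tradeoff mentioned above shows up as the choice of `max_idle_seconds`: a longer window costs storage, a shorter one costs more repeat transcodes.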
You are correct about the cost of storage. However, that is not the full cost of storing these videos. When you set up a server you just buy a hard disk and some other hardware, correct? But YouTube also has to pay for staff to maintain and update services and networks, offices for them, legal stuff, free food, insurance, electricity, and everything else a huge business needs, of course.
Google after 2007 has been about doing things that make business sense. Just storing these videos is a cost. It's why there's been a glacial pace of new features, APIs and support in YouTube for over a decade: it's not a financial priority.
That was retail price. Most of the operational aspects at Google are somewhat fixed costs.
Power and cooling are per server. However, given the low load on long-tail video content and Google's ability to co-mingle high-IOPS workloads with low-IOPS workloads, it's not that simple...
TL;DR: storage is fairly cheap in the scheme of things. Even if having infinite video storage costs 100 million/year, Google likely makes more from content monetization.
Generally speaking the drives in external enclosures are from the bottom slice of the still-viable drive pool. You'll often find that they have shorter/less comprehensive warranties.
Larger market of average, undemanding home users for external hard drives.
Smaller market of more demanding enthusiast users for internal drives. (Larger businesses aren't sourcing drives from Amazon or Newegg, unless the company is called Backblaze.)
The latter cost more to support and, as niche users, generally pay more. (The more price-conscious resort to shucking external drives.)
At least, that's how it's been explained to me. I don't know if that's the entire story though it sounds reasonable enough.
While there's nothing concrete concerning how much YouTube costs and brings in, it seems to be generally understood that YouTube is an investment that hasn't reached profitability yet and only exists in its current state because Google is willing to lose money on it in the meantime.
> Does YouTube have any incentive to “preserve” anything once it costs them more money than it brings in?
They clearly have little incentive to do so.
If there were a law that prevented websites from using technical or legal means to stop people scraping user-generated content, then I would agree that YouTube has no such obligation.
But since they make it hard for people to make archives of their content, then I think they do have a moral obligation to keep it up.
That the cost of storage is falling exponentially certainly helps but it isn't necessary if we want to keep the old stuff.
In fact most of the exponential increase in storage space is used to cover the even more exponential increase in video file size.
But there is a limit on how big video can get. Our perception is limited and, past a certain point, we won't notice the improvement. Audio has been "perfect" for quite a while now, and with 4k video, we are getting there too. There is also a limit on the amount of content users can produce before people do nothing but film themselves.
We may find ourselves running out of stuff to store before we need to delete the old stuff.
You are thinking only about quality of video increases. I agree that is essentially finite if we ignore upcoming technologies such as 360 video and eventually VR ready video (fully 3d, any perspective).
My argument was primarily based on the number of videos uploaded (see https://tubularinsights.com/hours-minute-uploaded-youtube/). Looking at the area under the curve, supporting the historic videos is a fraction of what it takes to store the most recent ones.
I am arguing that as long as YouTube can continue to afford storing the recent videos, storing the historic ones is not that much more expensive.
edit: also, running out of stuff to store is never going to be a thing. The number of cameras in the world is increasing and people will keep shooting more videos.
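The area-under-the-curve argument above can be sketched with a toy model. I'm assuming uploads grow by a constant factor every year, which is a simplification, and the growth factor used below is a made-up input rather than a figure from the linked article.

```python
def historic_share(growth, years, recent_years):
    """Fraction of all stored video that is older than `recent_years`.

    Assumes uploads in year y are proportional to growth ** y for
    y = 0 .. years - 1. The growth factor is a hypothetical input.
    """
    per_year = [growth ** y for y in range(years)]
    historic = sum(per_year[: years - recent_years])
    return historic / sum(per_year)


share = historic_share(growth=2, years=10, recent_years=3)
print(f"{share:.1%} of storage holds videos older than 3 years")
```

With uploads doubling yearly over ten years, everything older than three years comes to roughly an eighth of the total in this model. With flat uploads (`growth=1`) the same call gives 70%, so the argument depends entirely on growth continuing.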