
I'd like someone to do the math on that 200MiB estimate. YouTube stores every video at multiple resolutions, and has replicas in data centers all around the world.
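Back-of-envelope version of what I mean (every number below is a placeholder assumption, not a real YouTube figure):

    # Back-of-envelope storage estimate. Every number here is an
    # assumed placeholder, not a real YouTube figure.
    source_size_gib = 0.2          # assumed ~200 MiB source upload
    rendition_overhead = 1.0       # assumed: the lower-res ladder adds ~1x the source size
    replicas = 3                   # assumed number of data-center copies

    per_video_gib = source_size_gib * (1 + rendition_overhead) * replicas
    print(f"~{per_video_gib:.2f} GiB stored per video under these assumptions")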


Ah! Small note. There are methods of compression and encoding that allow for scalability. There's some fancy signal processing you can do to encode multiple <framerates / resolutions / compression qualities> into a single bitstream without necessarily storing redundant data.

But, I'm not in industry, and the last time I poked my head in I recall things being way uglier and more complicated than I had imagined. There was a good conference talk that was linked here once but I've lost it. Talked about the sort of awful, buggy things (formatting/file wise) that people try to upload that break everything.


> Small note. There are methods of compression and encoding that allow for scalability. There's some fancy signal processing you can do to encode multiple <framerates / resolutions / compression qualities> into a single bitstream without necessarily storing redundant data.

A big note. None of that is used by YT, because they've always stuck to widely used standard codecs and media containers. All the lower-quality renditions of a video together take up about as much storage as the highest one (so roughly 2x total), and the bitrate ladder is the same for every video. As for the download side, a player in auto quality first fetches the highest quality it can sustain at the current download speed; if there's headroom, or the connection isn't stable, it also fetches the 480p version, and if the speed was enough to download the current and next chunk at the lower quality, it downloads the best one and switches over to it on completion (there was a time when the player would download the whole video at the best quality after it had finished downloading the optimal one).
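Roughly the selection logic I'm describing, as a simplified sketch (not YouTube's actual player code; the bitrate ladder and thresholds below are made up):

    # Simplified sketch of the auto-quality behaviour described above.
    # Not YouTube's player; the bitrate ladder is made up.
    LADDER = {  # resolution -> assumed bitrate in kbit/s
        "1080p": 5000, "720p": 2500, "480p": 1000, "360p": 600,
    }
    FALLBACK = "480p"

    def pick_quality(measured_kbps, connection_stable=True, headroom=1.25):
        """Pick the best rendition the current download speed can sustain."""
        if not connection_stable:
            return FALLBACK
        # Highest rendition whose bitrate (plus some headroom) fits the measured speed.
        for res, kbps in sorted(LADDER.items(), key=lambda kv: -kv[1]):
            if kbps * headroom <= measured_kbps:
                return res
        return min(LADDER, key=LADDER.get)

    print(pick_quality(4000))          # -> 720p
    print(pick_quality(4000, False))   # -> 480p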

> the sort of awful, buggy things (formatting/file wise) that people try to upload that break everything.

YT has a very narrow list of formats allowed for upload, and the conversion process breaks hard on anything outside it, especially for the audio.


Interesting, thanks! I'd love to read more about ways to encode video that allow streaming at a variety of resolutions with a minimum of redundancy, if anyone has any links.


For me, it came up in a course I took last term, which used the textbook "Video Processing and Communications" by Y. Wang, J. Ostermann and Y-Q. Zhang.

It's a bit out of date (2002), but the core video encoder material was solid. It leans heavily on symbolic/equation notation to represent concepts, which, for me, made the jump from 1D signal processing to multidimensional signal processing tougher than it needed to be.

There are probably better resources so definitely don't just take my word for it! :)

(For scalability, IIRC the terms "enhancement layer" and "base layer" are particularly important, as are the block diagrams that generate them.)
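A toy illustration of the base layer / enhancement layer split, spatial scalability only, nothing codec-accurate:

    # Toy spatial-scalability sketch: a low-res base layer plus a residual
    # enhancement layer reconstructs the full-res frame. Nothing here is
    # codec-accurate; it just shows why the layers aren't redundant copies.
    import numpy as np

    frame = np.random.rand(8, 8)                 # pretend full-resolution frame

    base = frame[::2, ::2]                       # base layer: 2x downsampled
    upsampled = np.kron(base, np.ones((2, 2)))   # crude nearest-neighbour upsample
    enhancement = frame - upsampled              # enhancement layer: the residual

    reconstructed = upsampled + enhancement      # base + enhancement = full quality
    assert np.allclose(reconstructed, frame)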


For low-traffic videos they likely only keep a copy in a single region.


Interestingly, they could also just keep the best-resolution version of rarely watched videos and have really powerful encoders ready if anyone suddenly asked for a low-res version. Then they can deliver the low-res version and cache it in case someone else wants to watch it an hour later, but maybe drop it after a week of no access.
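Something like this flow (all the names below are hypothetical stand-ins, not anything YouTube actually runs):

    # Sketch of "store only the master, transcode low-res on demand, cache it
    # for a while". All names here are hypothetical stand-ins.
    import time

    CACHE_TTL = 7 * 24 * 3600          # drop cached renditions after a week of no access
    _cache = {}                        # (video_id, resolution) -> (data, last_access)

    def get_rendition(video_id, resolution, fetch_master, transcode):
        key = (video_id, resolution)
        now = time.time()
        entry = _cache.get(key)
        if entry and now - entry[1] < CACHE_TTL:
            _cache[key] = (entry[0], now)          # refresh last-access time
            return entry[0]
        # Cache miss (or expired): transcode from the stored master copy.
        data = transcode(fetch_master(video_id), resolution)
        _cache[key] = (data, now)
        return data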


They don't even have to be all that powerful. GPU transcoding is fast as heck. You really just need to decide where the line sits between paying for the extra storage and needing a bigger GPU farm to handle more frequent transcoding requests. I'm betting you could get away with having just one copy for 90-95% of YouTube's content. There is an ocean of video on YouTube that is accessed less than once a month, or even once a year.
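Toy break-even arithmetic for where that line might sit (every number below is an assumption I made up):

    # Toy break-even comparison: store extra renditions vs. transcode on demand.
    # Every number below is an assumed placeholder.
    extra_renditions_gib = 0.2      # assumed extra storage per video for the low-res ladder
    storage_cost_gib_month = 0.01   # assumed $/GiB-month of storage
    transcode_cost = 0.02           # assumed $ per on-demand transcode (GPU time)

    store_cost = extra_renditions_gib * storage_cost_gib_month   # $/video/month
    breakeven_views = store_cost / transcode_cost
    print(f"Pre-storing wins above ~{breakeven_views:.2f} low-res views per video per month")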


That, and I’m sure they employ “cold storage” using slower (and cheaper) spinning disks.



