I worked on a service a year ago that would stream a video from a source and upload it to a video hosting service. A few concurrent transfers would saturate the NIC. Putting each transfer job in a separate Lambda let us run any number of them in parallel, which was much faster than queuing up jobs on standalone instances.
That’s true, the cost needs to be factored into the model. But the near-infinite bandwidth scalability is what allows the service to exist in the first place. If every job saturates your up and down bandwidth and takes 10 minutes, and you have 100 coming in per minute, you would need to design a ridiculous architecture that could spin instances up and down and handle queuing on the scale of thousands based on demand. Or you can write a simple Lambda function that can be triggered from the AWS SDK and let their infrastructure handle the headache. I’m sure a homegrown solution becomes more cost-effective at massive scale, but Lambda fits the bill for a small/medium project.
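To put a rough number on "the scale of thousands", here is a back-of-the-envelope sketch using Little's law (average jobs in flight = arrival rate × time per job). The function name is just for illustration, and it assumes a steady arrival rate and a fixed job duration, which real traffic won't have.

```python
def steady_state_concurrency(jobs_per_minute, job_minutes):
    """Little's law: average jobs in flight = arrival rate * time per job."""
    return jobs_per_minute * job_minutes


# Numbers from the comment above: 100 jobs arriving per minute, 10 minutes each.
in_flight = steady_state_concurrency(100, 10)
print(in_flight)  # 1000
```

So about 1,000 transfers are in flight at steady state. If each one saturates a NIC on its own, that is roughly 1,000 instances' worth of bandwidth before accounting for bursts.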
Right, but without the Lambda infrastructure it would be infeasible, from both an infrastructure and a cost perspective, to regularly spin up, let’s say, 10,000 instances, complete a 10-minute job on each of them, and then turn them off to save money.
Isn't that also possible with EC2? Just set the startup script to something that installs your software (or build an AMI with it). Dump the videos to be processed into SQS and have your software pull videos from there.
You'd need some logic to shut down the instances once they're done, but the simplest approach would be to have the software self-destruct the EC2 VM if it's unable to pull a video to process for X time, where X is something sensible like 5 minutes.
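A minimal sketch of that idle-timeout loop, to make the idea concrete. Everything here is hypothetical naming: `fetch_job`, `process`, and `shut_down` are callables you would supply yourself (for example, a wrapper around an SQS receive call, the transfer itself, and whatever terminates the instance). The clock and sleep functions are parameters only so the loop can be exercised without waiting.

```python
import time
from typing import Callable, Optional


def run_worker(
    fetch_job: Callable[[], Optional[str]],   # returns a job, or None if the queue is empty
    process: Callable[[str], None],           # does the actual video transfer
    shut_down: Callable[[], None],            # terminates this instance
    idle_limit_s: float = 300.0,              # "X" from the comment: 5 minutes
    poll_interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Process jobs until none has arrived for idle_limit_s, then shut down.

    Returns the number of jobs processed.
    """
    processed = 0
    last_work = clock()
    while clock() - last_work < idle_limit_s:
        job = fetch_job()
        if job is None:
            # Nothing to do right now: wait a bit, then poll again.
            sleep(poll_interval_s)
            continue
        process(job)
        processed += 1
        # Restart the idle timer only after a job completes.
        last_work = clock()
    shut_down()
    return processed
```

The idle timer resets after each completed job, so an instance stays up as long as the queue keeps feeding it and goes away once demand drops. It doesn't cover scaling up, which would still need something watching queue depth.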