
The problem with fixed concurrency is that the memory required per Chrome process varies a lot.

We (https://www.apify.com) solve this by autoscaling the number of parallel Puppeteer instances based on memory and CPU usage. Our open source SDK (http://github.com/apifytech/apify-js) implements this in the PuppeteerCrawler class (https://www.apify.com/docs/sdk/apify-runtime-js/latest#Puppe...), which internally uses AutoscaledPool to provide the autoscaling:

- https://github.com/apifytech/apify-js/blob/master/src/autosc...

- https://www.apify.com/docs/sdk/apify-runtime-js/latest#Autos...

Sadly, this feature is currently limited to our platform because it's mainly built for running in Docker containers. Since a Docker container doesn't know its CPU consumption relative to its limits, it needs a notification from the underlying platform when it reaches its CPU limit. We have solved this using WebSocket events, but we are currently working on extending this to work anywhere, including locally.
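
To make the idea concrete, here is a minimal sketch of the kind of scaling decision described above: scale up while there is headroom, scale down when memory or CPU is overloaded. This is not the actual AutoscaledPool code. The names `LoadSnapshot`, `ScalingOptions`, `nextConcurrency`, and all threshold values are hypothetical and chosen only for illustration.

```typescript
// Hypothetical sketch of a concurrency autoscaling decision.
// None of these names come from the Apify SDK.

interface LoadSnapshot {
  memoryUsedRatio: number; // fraction of the memory limit in use, 0..1
  cpuOverloaded: boolean;  // e.g. signalled by the platform or measured locally
}

interface ScalingOptions {
  minConcurrency: number;
  maxConcurrency: number;
  maxMemoryRatio: number; // above this ratio the system counts as overloaded
  scaleUpStep: number;    // instances to add when there is headroom
  scaleDownStep: number;  // instances to remove when overloaded
}

// Illustrative values only (assumptions, not recommendations).
const exampleOptions: ScalingOptions = {
  minConcurrency: 1,
  maxConcurrency: 10,
  maxMemoryRatio: 0.7,
  scaleUpStep: 1,
  scaleDownStep: 2,
};

// Returns the concurrency to use for the next interval, clamped to [min, max].
function nextConcurrency(
  current: number,
  load: LoadSnapshot,
  opts: ScalingOptions,
): number {
  const overloaded =
    load.cpuOverloaded || load.memoryUsedRatio > opts.maxMemoryRatio;
  const proposed = overloaded
    ? current - opts.scaleDownStep
    : current + opts.scaleUpStep;
  return Math.min(opts.maxConcurrency, Math.max(opts.minConcurrency, proposed));
}
```

A caller would sample the load on a timer and start or stop browser instances to match the returned value. In plain Node.js the memory ratio could be approximated from `os.freemem()` and `os.totalmem()`, but inside a container those report host values rather than the container's limit, which is the reason the CPU signal has to come from the platform.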



I've definitely noticed this too and have been working on an "auto-scale" feature (to tone down concurrency when under load). It's not an easy task, but you can see my first pass here: https://github.com/joelgriffith/browserless/blob/master/src/...



