Scaling Playwright Crawler on GCP Cloud Run

Yo everyone. I'm starting to hit my crawler with some real load. It's a Playwright setup on GCP Cloud Run. Do you think a tonne of info logs I haven't cleaned up could be impacting performance? If so, that's my lowest-hanging fruit 😂 Otherwise, should I skip straight to looking at parallelization? How does performance compare between running lots of small/average/cheap instances and overprovisioning a single Cloud Run instance and smacking it with load? I haven't looked at the internals of Crawlee too deeply, so I'm hoping the community can save me some time here 🙏
1 Reply
variable-lime • 15mo ago
Wow, didn't know this...but:
console.log calls in nodejs are synchronous(!) and block the event loop. I just experienced that when I logged the results from executing (asynchronous) sql queries with pg. Logging only 20 items and their (few) properties decreased the performance from 3ms to 300ms on my local machine.
from stackoverflow
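
One way to check whether the info logs matter is to measure the same loop with logging on and off. Below is a minimal sketch, not anything taken from Crawlee or Playwright: `createLogger`, `processItems`, and `compareLevels` are hypothetical names, and the "work" is a stand-in for a request handler. It assumes logs go to stdout and that skipping a message before it is serialized is the cheapest fix.

```javascript
// Hypothetical helpers for illustration; none of this is a Crawlee or Playwright API.
const { performance } = require('node:perf_hooks');

const LEVELS = { debug: 10, info: 20, warn: 30, error: 40 };

// Returns a logger that drops messages below `minLevel` before any
// serialization or I/O happens. `sink` receives each finished line
// (defaults to stdout).
function createLogger(minLevel = 'info', sink = (line) => process.stdout.write(line)) {
  const threshold = LEVELS[minLevel];
  if (threshold === undefined) throw new Error(`Unknown log level: ${minLevel}`);
  const logger = {};
  for (const [name, value] of Object.entries(LEVELS)) {
    logger[name] = (message, data) => {
      if (value < threshold) return; // cheap early exit, nothing is formatted
      const suffix = data === undefined ? '' : ' ' + JSON.stringify(data);
      sink(`${name.toUpperCase()} ${message}${suffix}\n`);
    };
  }
  return logger;
}

// Times a synchronous function and returns elapsed milliseconds.
function timeMs(fn) {
  const start = performance.now();
  fn();
  return performance.now() - start;
}

// Simulated work: handle `count` items and emit one info log per item.
function processItems(logger, count) {
  let total = 0;
  for (let i = 0; i < count; i++) {
    total += i;
    logger.info('processed item', { id: i, title: `item-${i}` });
  }
  return total;
}

// Runs the same loop at 'info' (every line written) and 'warn' (info lines
// skipped) and returns both timings in milliseconds.
function compareLevels(count) {
  return {
    infoMs: timeMs(() => processItems(createLogger('info'), count)),
    warnMs: timeMs(() => processItems(createLogger('warn'), count)),
  };
}
```

Calling `console.error(compareLevels(5000))` prints both timings to stderr, so the summary stays separate from the info lines. The numbers depend heavily on where stdout is connected (terminal, file, or pipe), so it is worth running this in an environment close to the real container rather than trusting a local result.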
