AutoscaledPool trying to scale up without sufficient memory

Hi all, I'm running a Playwright crawler and am running into an issue with crawler stability. Have a look at the log output below:
{
"service": "AutoscaledPool",
"time": "2024-10-30T16:42:17.049Z",
"id": "cae4950d568a4b8bac375ffa5a40333c",
"jobId": "9afee408-42bf-4194-b17c-9864db707e5c",
"currentConcurrency": "4",
"desiredConcurrency": "5",
"systemStatus": "{\"isSystemIdle\":true,\"memInfo\":{\"isOverloaded\":false,\"limitRatio\":0.2,\"actualRatio\":0},\"eventLoopInfo\":{\"isOverloaded\":false,\"limitRatio\":0.6,\"actualRatio\":0},\"cpuInfo\":{\"isOverloaded\":false,\"limitRatio\":0.4,\"actualRatio\":0},\"clientInfo\":{\"isOverloaded\":false,\"limitRatio\":0.3,\"actualRatio\":0}}"
}
The AutoscaledPool is trying to increase its concurrency from 4 to 5, since the system was, in its view, idle. Twenty seconds later, though:
{
"rejection": "true",
"date": "Wed Oct 30 2024 16:42:38 GMT+0000 (Coordinated Universal Time)",
"process": "{\"pid\":1,\"uid\":997,\"gid\":997,\"cwd\":\"/home/myuser\",\"execPath\":\"/usr/local/bin/node\",\"version\":\"v22.9.0\",\"argv\":[\"/usr/local/bin/node\",\"/home/myuser/FIDO-Scraper-Discovery\"],\"memoryUsage\":{\"rss\":337043456,\"heapTotal\":204886016,\"heapUsed\":168177928,\"external\":30148440,\"arrayBuffers\":14949780}}",
"os": "{\"loadavg\":[3.08,3.38,3.68],\"uptime\":312222.44}",
"stack": "response.headerValue: Target page, context or browser has been closed\n at Page.<anonymous> (/home/myuser/FIDO-Scraper-Discovery/dist/articleImagesPreNavHook.js:15:60)"
}
This suggests memory was much tighter than the AutoscaledPool thought, likely due to the additional RAM that Chromium was using. Crawlee was running in a Kubernetes pod with a 4 GB memory limit. Is this behaviour intended, and how might I improve stability? Does the AutoscaledPool account for how much RAM is actually in use, or just how much the Node process uses?
6 Replies
Hall
Hall•7mo ago
This post has been pushed to the community knowledgebase. Any replies in this thread will be synced to the community site.
xenial-black
xenial-blackOP•7mo ago
Here's a log export from my service. After this, the pod auto-restarts due to hitting the memory limit.
quickest-silver
quickest-silver•7mo ago
The AutoscaledPool doesn't ensure that memory never goes above the limit; it just doesn't scale to more requests when it's close. So a sudden memory spike, like on a very heavy page, can still cause trouble. You can either limit maxConcurrency or play with the autoscaledPoolOptions to reduce memory-driven scaling.
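For reference, capping the pool with maxConcurrency might look like the following sketch (the handler body and URL are placeholders):

```typescript
import { PlaywrightCrawler } from 'crawlee';

const crawler = new PlaywrightCrawler({
    // Hard upper bound on parallel pages. The AutoscaledPool will never
    // scale past this, no matter how idle the system looks to it.
    maxConcurrency: 3,
    async requestHandler({ page, request }) {
        // ... your scraping logic
    },
});

await crawler.run(['https://example.com']);
```

A lower cap trades throughput for headroom: each Chromium page can spike memory well beyond what the Node process itself reports.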
xenial-black
xenial-blackOP•7mo ago
But it seems to me that the pool was still trying to scale up even when there was no extra memory to be had?
Pepa J
Pepa J•7mo ago
Hi @Crafty, if the default settings don't work for you, you may adjust the ratios for scaling up via https://crawlee.dev/api/core/interface/AutoscaledPoolOptions in the crawler options.
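Those options are passed through autoscaledPoolOptions on the crawler. A sketch of making scale-ups more conservative, assuming option names from the AutoscaledPoolOptions page linked above (the values here are illustrative, not recommendations):

```typescript
import { PlaywrightCrawler } from 'crawlee';

const crawler = new PlaywrightCrawler({
    autoscaledPoolOptions: {
        // Require actual concurrency to sit closer to the desired
        // concurrency before the pool scales up further.
        desiredConcurrencyRatio: 0.98,
        // Grow desired concurrency in smaller increments per scaling tick,
        // so a single scale-up has less room to overshoot memory.
        scaleUpStepRatio: 0.02,
    },
    async requestHandler({ page }) {
        // ... your scraping logic
    },
});
```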
xenial-black
xenial-blackOP•7mo ago
Thanks for these. Eventually I found the Snapshotter's used-memory ratio and turned it down. 🙂
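For anyone finding this later: the setting referred to is presumably the Snapshotter's maxUsedMemoryRatio, which can be passed through autoscaledPoolOptions.snapshotterOptions. A sketch (the 0.5 value is illustrative):

```typescript
import { PlaywrightCrawler } from 'crawlee';

const crawler = new PlaywrightCrawler({
    autoscaledPoolOptions: {
        snapshotterOptions: {
            // Treat memory as overloaded at 50% usage, keeping the rest
            // as headroom for Chromium's own allocations.
            maxUsedMemoryRatio: 0.5,
        },
    },
    async requestHandler({ page }) {
        // ... your scraping logic
    },
});
```

In a container it may also help to tell Crawlee the real limit explicitly, e.g. by setting the CRAWLEE_MEMORY_MBYTES environment variable in the pod spec.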
