POST payload is too large

I often receive the error: The POST payload is too large (limit: 9437185 bytes, actual length: 9453568). The scraping completes, but the actor fails to save the results to the dataset. Does Apify have a restriction on result byte size? How can I overcome this issue? Thanks
8 Replies
Alexey Udovydchenko•3y ago
There is no way to overcome it except splitting the data into smaller chunks.
!!!Joefree!!! 👑 OP•3y ago
I always save to the dataset in one request: dataset.push_items(items). Do you mean I have to save it in smaller chunks? For example, with 1000 items I could divide it into 10 saves (100 items each): dataset.push_items(first 100 items), etc.
Alexey Udovydchenko•3y ago
Yes, you need to save smaller arrays or push to the dataset item by item.
!!!Joefree!!! 👑 OP•3y ago
Ahh, it makes sense. Thanks @Alexey Udovydchenko!
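
A minimal sketch of the chunking approach suggested above, assuming the Apify Python client; the token, dataset ID, and the 100-item chunk size are placeholders for illustration, not values from this thread:

```python
from apify_client import ApifyClient

# Placeholder token and dataset ID for illustration only.
client = ApifyClient("MY-API-TOKEN")
dataset = client.dataset("MY-DATASET-ID")

CHUNK_SIZE = 100  # arbitrary batch size chosen to keep each POST under the limit

def push_in_chunks(items):
    """Push items to the dataset in fixed-size batches so that no single
    POST payload exceeds the platform's ~9 MB limit. A fixed item count
    only approximates the byte limit; unusually large items may need a
    smaller batch size."""
    for start in range(0, len(items), CHUNK_SIZE):
        dataset.push_items(items[start:start + CHUNK_SIZE])
```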
conscious-sapphire•3y ago
Actually, the Python client should be doing this splitting, so I will report it.
!!!Joefree!!! 👑 OP•3y ago
Thanks @Lukas Krivka. I split it manually (100 results at a time). It would be great if it could split automatically.
conscious-sapphire•3y ago
GitHub: crawlee/dataset.ts at 5ec089d5628cab096e0f67955694af700a603cc3 · apify/crawlee
Crawlee—A web scraping and browser automation library for Node.js that helps you build reliable crawlers. Fast.
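The linked dataset.ts shows the Node.js library splitting payloads by serialized byte size before uploading, which is the automatic behavior the Python client lacked at the time. A rough sketch of that idea in Python; the limit value is taken from the error message above, and chunk_by_size is a hypothetical helper, not a client API:

```python
import json

PAYLOAD_LIMIT = 9_437_185  # byte limit reported in the error message above

def chunk_by_size(items, limit=PAYLOAD_LIMIT):
    """Yield batches of items whose JSON-serialized size stays under the
    limit, mirroring the size-based splitting done in Crawlee's dataset.ts.
    Illustrative only; raises if a single item alone exceeds the limit."""
    batch, batch_bytes = [], 2  # 2 bytes for the enclosing "[]"
    for item in items:
        item_bytes = len(json.dumps(item).encode("utf-8")) + 1  # +1 for ","
        if item_bytes + 2 > limit:
            raise ValueError("single item exceeds the payload limit")
        if batch and batch_bytes + item_bytes > limit:
            yield batch
            batch, batch_bytes = [], 2
        batch.append(item)
        batch_bytes += item_bytes
    if batch:
        yield batch
```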
!!!Joefree!!! 👑 OP•3y ago
Oh, so sad 🥲
