How to ensure a dataset is created before pushing data to it?

I have a public actor, and for some of my users the default and/or named datasets do not seem to exist and are not created when data is pushed to them. This is the error message, which affects only a handful of user runs:
ERROR PlaywrightCrawler: Request failed and reached maximum retries. ApifyApiError: Dataset was not found
2025-03-06T17:37:21.112Z clientMethod: DatasetClient.pushItems
2025-03-06T17:37:21.113Z statusCode: 404
2025-03-06T17:37:21.115Z type: record-not-found
2025-03-06T17:37:21.119Z httpMethod: post
2025-03-06T17:37:21.120Z path: /v2/datasets/<redacted>/items
2025-03-06T17:37:21.122Z stack:
2025-03-06T17:37:21.124Z at makeRequest (/home/myuser/node_modules/apify-client/dist/http_client.js:187:30)
2025-03-06T17:37:21.125Z at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
2025-03-06T17:37:21.127Z at async DatasetClient.pushItems (/home/myuser/node_modules/apify-client/dist/resource_clients/dataset.js:104:9)
2025-03-06T17:37:21.129Z at async processSingleReviewDetails (file:///home/myuser/dist/helperfunctions.js:365:5)
2025-03-06T17:37:21.131Z at async Module.processReviews (file:///home/myuser/dist/helperfunctions.js:379:13)
2025-03-06T17:37:21.133Z at async getReviews (file:///home/myuser/dist/main.js:37:5)
2025-03-06T17:37:21.135Z at async PlaywrightCrawler.requestHandler [as userProvidedRequestHandler] (file:///home/myuser/dist/main.js:98:13)
2025-03-06T17:37:21.137Z at async wrap (/home/myuser/node_modules/@apify/timeout/cjs/index.cjs:54:21)
2025-03-06T17:37:21.139Z data: undefined {"id":"<redacted>","url":"<redacted>?sort=recency&languages=all","method":"GET","uniqueKey":"https://www.trustpilot.com/review/<redacted>?languages=all&sort=recency"}
How can I ensure that the datasets are created ahead of time, before the scraper collects data and then fails because a dataset can't be created or does not exist?
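One way to approach the question above is to open every dataset once at startup and reuse the same handle in all request handlers, so that a missing dataset surfaces before any scraping happens. The following is only a minimal sketch under stated assumptions: `DatasetLike`, `DatasetOpener`, `createDatasetRegistry`, and `warmUp` are hypothetical names made up for illustration, and the opener is injected so the snippet runs without any SDK. In an Apify actor the opener would presumably wrap `Actor.openDataset(name)`.

```typescript
// Hypothetical minimal surface of a dataset handle used by this sketch.
interface DatasetLike {
    pushData(items: Record<string, unknown>[]): Promise<void>;
}

// Hypothetical opener type. Assumption: in a real actor this would be
// something like (name) => Actor.openDataset(name) from the Apify SDK.
type DatasetOpener = (name?: string) => Promise<DatasetLike>;

// Caches one open() promise per dataset name so that every handler
// reuses the same handle instead of opening the dataset again.
function createDatasetRegistry(open: DatasetOpener) {
    const opened = new Map<string, Promise<DatasetLike>>();

    // Returns the cached promise, opening the dataset on first use.
    // An empty-string key stands for the default (unnamed) dataset.
    function get(name?: string): Promise<DatasetLike> {
        const key = name ?? "";
        let pending = opened.get(key);
        if (!pending) {
            pending = open(name);
            opened.set(key, pending);
        }
        return pending;
    }

    // Opens all listed datasets up front and rejects if any of them
    // cannot be opened, before the crawler starts collecting data.
    async function warmUp(names: (string | undefined)[]): Promise<void> {
        await Promise.all(names.map((n) => get(n)));
    }

    return { get, warmUp };
}
```

A possible usage, again as an assumption rather than the actor's actual code: call `await registry.warmUp([undefined, "reviews"])` right after initialization, then use `(await registry.get("reviews")).pushData(items)` inside the request handler. This does not protect against a dataset being deleted later in the run; it only makes sure the dataset was opened successfully before the first push.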
6 Replies
Pepa J•3mo ago
Hi @Casper, can you send me the ID of a run where the problem happened, so we can investigate?
metropolitan-bronzeOP•3mo ago
I have just sent them to you.
Pepa J•2mo ago
Hi @Casper, does the issue still occur? Based on the logs, it really seems that there is an attempt to push data into a non-existent dataset. Can you share the code where you manage the datasets and push items into them?
metropolitan-bronzeOP•2mo ago
@Pepa J I can give you access to the git repository to make it easier to troubleshoot. Just send me your GitHub username in a DM 🙂 The issue only happens in a subset of customer runs.
Pepa J•2mo ago
@Casper The code looks good. I tried to reproduce the issue, but without success. Does it happen often? Would you be able to put together a minimal code example that shows this behavior? We found out there was a custom implementation of a Dataset drop function that was meant for development purposes but behaved differently on the Apify platform.
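The thread does not show the custom drop function, so the following is only a hedged sketch of one way to keep a development-only cleanup from deleting a dataset on the platform. A dataset dropped mid-run would be consistent with the 404 "Dataset was not found" error in the log. `Droppable`, `shouldDropDataset`, and `dropIfLocal` are hypothetical names. The `isOnPlatform` flag is passed in explicitly; in an Apify actor it could presumably come from `Actor.isAtHome()`.

```typescript
// Hypothetical minimal surface of something that can be dropped.
interface Droppable {
    drop(): Promise<void>;
}

// Decides whether a development-time cleanup may delete the dataset.
// Dropping is allowed only when cleanup was requested AND the code is
// not running on the platform.
function shouldDropDataset(isOnPlatform: boolean, devCleanupRequested: boolean): boolean {
    return devCleanupRequested && !isOnPlatform;
}

// Drops the dataset only when the guard allows it.
// Resolves to true if the dataset was dropped, false otherwise.
async function dropIfLocal(
    dataset: Droppable,
    isOnPlatform: boolean,
    devCleanupRequested: boolean,
): Promise<boolean> {
    if (!shouldDropDataset(isOnPlatform, devCleanupRequested)) {
        return false;
    }
    await dataset.drop();
    return true;
}
```

Keeping the decision in a pure function makes it easy to unit test without touching any storage, and the environment check stays in one place instead of being scattered across helpers.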
