Using Apify asynchronously
Hi there,
I've been using Apify for quite a while now - great product!
However, to get the dataset items of a scraped shop we seem to have only 2 options: "Run actor synchronously and get dataset items" or "Get last run dataset items".
This becomes a problem because long synchronous requests can get interrupted, and it looks as if only the last run's results can be retrieved, which means we can work on just 1 website at a time (no parallelism).
Is there any workaround?
A good option would be to send a request to scrape website X, get back the run ID Y, and then just "get dataset items for run ID Y".
I also checked the "Integrations" - "HTTP Webhook" option, but the available variables don't seem to give access to the dataset items themselves. (<- This would be the best)
Any ideas?
Thanks, Arseni
3 Replies
automatic-azure•3y ago
That's actually a common approach.
If you want to do it with API calls instead of the SDK, click the "API" button at the top right of the actor's web page; you will see the endpoints and a link to the API manual.
exotic-emerald•3y ago
The async flow is explained fully in this article - https://docs.apify.com/tutorials/run-actor-and-retrieve-data-via-api
Run actor and retrieve data via API · Apify Documentation: Learn how to run an actor/task via the Apify API, wait for the job to finish, and retrieve its output data. Your key to integrating actors with your projects.
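For anyone landing here later, this is roughly what the flow from the question (start a run, keep its run ID, fetch that run's dataset) can look like in Python with only the standard library. It is a minimal sketch, not official client code: the function names and the injectable `send` parameter are made up for illustration, and it assumes the v2 endpoints `acts/<actorId>/runs`, `actor-runs/<runId>` and `datasets/<datasetId>/items` behave as described in the linked article, with the token sent as a bearer header.

```python
import json
import time
import urllib.request

API_BASE = "https://api.apify.com/v2"
# Statuses after which a run will not change any more.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"}


def http_json(method, url, token, body=None):
    """Send one JSON request with a bearer token and return the parsed response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data, method=method)
    request.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def start_run(actor_id, run_input, token, send=http_json):
    """Start a run without waiting for it and return the run object."""
    # The run object is assumed to carry "id", "status" and "defaultDatasetId".
    return send("POST", f"{API_BASE}/acts/{actor_id}/runs", token, run_input)["data"]


def wait_for_run(run_id, token, send=http_json, poll_seconds=5, sleep=time.sleep):
    """Poll one specific run until it reaches a terminal status."""
    while True:
        run = send("GET", f"{API_BASE}/actor-runs/{run_id}", token)["data"]
        if run["status"] in TERMINAL_STATUSES:
            return run
        sleep(poll_seconds)


def get_items(dataset_id, token, send=http_json):
    """Fetch the items of one specific dataset (a JSON array)."""
    return send("GET", f"{API_BASE}/datasets/{dataset_id}/items", token)
```

To scrape several shops in parallel, call `start_run` once per shop, keep the returned run objects, and then call `wait_for_run` and `get_items` for each of them; every run has its own `defaultDatasetId`, so the results do not overwrite each other.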
genetic-orangeOP•3y ago
Thank you very much!
I would really suggest adding the dataset items endpoint ('https://api.apify.com/v2/datasets/<defaultDatasetId>/items') to the menu that opens when you click the API button at the top right. It's really confusing that it doesn't show up there.
Solved 🙂
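On the HTTP Webhook idea from the question: the webhook does not deliver the items themselves, but as far as I know the default payload includes the run object under `resource`, and that object has a `defaultDatasetId`. Treat those field names as an assumption to check against the webhook docs. Under that assumption, a receiver could turn the payload into the dataset items URL like this (the function name is just illustrative):

```python
import json

API_BASE = "https://api.apify.com/v2"


def dataset_items_url_from_webhook(payload_text):
    """Return the dataset items URL for the run in a webhook payload, or None."""
    payload = json.loads(payload_text)
    # Assumed payload shape: {"eventType": ..., "resource": {<run object>}}
    resource = payload.get("resource") or {}
    dataset_id = resource.get("defaultDatasetId")
    if not dataset_id:
        return None
    return f"{API_BASE}/datasets/{dataset_id}/items"
```

The receiver would then issue a normal GET request against the returned URL, with its API token, to download the items.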