Build an asynchronous pipeline

Hi there, I'm building a pipeline with Apify. I want to run many jobs in parallel, check their status later, and export the completed jobs. Is there anything I can use to export the data of a job using its job ID or something similar?
metropolitan-bronze
metropolitan-bronze2y ago
Hey there! There's the runs endpoint - https://docs.apify.com/api/v2/#/reference/actor-runs/run-collection/get-user-runs-list - which returns the list of runs for a given actor. The list contains the status of each run, so you could iterate through it, grab the run IDs, and then fetch the default storages or whatever else you need.
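A minimal sketch of the iteration described above, assuming the list response wraps the runs in `data.items` and that each item carries `id`, `status`, and `defaultDatasetId` (the same keys the run object shows later in this thread). The `sample` payload is made up for illustration and is not real API output.

```python
def succeeded_runs(list_response: dict) -> list[dict]:
    """Return the id and default dataset id of every run that finished successfully."""
    items = list_response.get("data", {}).get("items", [])
    return [
        {"id": run["id"], "defaultDatasetId": run.get("defaultDatasetId")}
        for run in items
        if run.get("status") == "SUCCEEDED"
    ]


# Hypothetical payload shaped like a "list of runs" response.
sample = {
    "data": {
        "items": [
            {"id": "run-a", "status": "SUCCEEDED", "defaultDatasetId": "ds-a"},
            {"id": "run-b", "status": "RUNNING", "defaultDatasetId": "ds-b"},
            {"id": "run-c", "status": "FAILED", "defaultDatasetId": "ds-c"},
        ]
    }
}

print(succeeded_runs(sample))
```

Only runs with status `SUCCEEDED` are kept here; runs that are still `RUNNING` would be picked up on a later pass.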
foreign-sapphire
foreign-sapphireOP2y ago
Thank you, Andrey. Yes, I used this API, but I thought there might be a direct API to check the status of a specific job. This is how I'm building my pipeline: 1. Start a job and save the job information to a database/file. 2. Later, check all jobs in my actor, and if my job is completed, export the data. Is this the best way to do it? Thank you again.
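The two steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: `remember_run` and `export_finished` are hypothetical helper names, the JSON file stands in for the "database/file", and the status lookup and the export are passed in as callables so the sketch stays independent of any particular HTTP client.

```python
import json
from pathlib import Path
from typing import Callable


def remember_run(store: Path, run: dict) -> None:
    """Step 1: save the fields needed later (run id and default dataset id)."""
    records = json.loads(store.read_text()) if store.exists() else []
    records.append(
        {"id": run["id"], "defaultDatasetId": run["defaultDatasetId"], "exported": False}
    )
    store.write_text(json.dumps(records, indent=2))


def export_finished(
    store: Path,
    get_status: Callable[[str], str],
    export: Callable[[str], None],
) -> list[str]:
    """Step 2: export every saved run that has succeeded and mark it as exported."""
    records = json.loads(store.read_text()) if store.exists() else []
    exported = []
    for record in records:
        if record["exported"]:
            continue  # already handled on an earlier pass
        if get_status(record["id"]) == "SUCCEEDED":
            export(record["defaultDatasetId"])
            record["exported"] = True
            exported.append(record["id"])
    store.write_text(json.dumps(records, indent=2))
    return exported
```

Calling `export_finished` on a schedule would pick up runs as they complete, and the `exported` flag keeps a run from being exported twice.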
foreign-sapphire
foreign-sapphireOP2y ago
Thank you for that, it's really helpful. I found that the API looks something like this: https://api.apify.com//v2/actor-runs/{runId}{?token} But how do I get the runId? When I run the task, I get this information: dict_keys(['id', 'actId', 'userId', 'startedAt', 'finishedAt', 'status', 'meta', 'stats', 'options', 'createdByOrganizationMemberUserId', 'buildId', 'defaultKeyValueStoreId', 'defaultDatasetId', 'defaultRequestQueueId', 'buildNumber', 'containerUrl', 'usage', 'usageTotalUsd', 'usageUsd']) I tested every ID, but the API always returns {'error': {'type': 'page-not-found', 'message': 'We have bad news: there is no API endpoint at this URL. Did you specify it correctly?'}} So where can I get the runId for my job so I can check it later? Also, can we save videos directly to AWS S3? I need to store them there later anyway, so I'd like to save on Apify storage costs. Or is there at least an API for deleting the Apify dataset after exporting it to my S3 storage?
metropolitan-bronze
metropolitan-bronze2y ago
It should be just the id from the list you sent above. How do you start the run? Mainly, I need to know which call returns this set of properties. Generally, though, the error message says that something's off with the endpoint/URL, so double-check that you're using it correctly. As for saving the videos: you can do it, but you would need to implement the upload yourself. Alternatively, you could delete the store after exporting: https://docs.apify.com/api/v2/#/reference/key-value-stores/store-object/delete-store
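As a rough sketch of the "export, then delete" idea for a dataset: the code below assumes the `GET /v2/datasets/{datasetId}/items` and `DELETE /v2/datasets/{datasetId}` endpoints of the Apify API v2. The `save` callable is a hypothetical placeholder for whatever upload you implement yourself (for example, to S3). The same pattern would apply to the key-value store endpoint linked above.

```python
import json
import urllib.request
from typing import Callable
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, token: str) -> str:
    """URL that returns the dataset items as JSON."""
    query = urlencode({"token": token, "format": "json"})
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"


def dataset_url(dataset_id: str, token: str) -> str:
    """URL of the dataset itself, used here for the DELETE request."""
    return f"{API_BASE}/datasets/{dataset_id}?" + urlencode({"token": token})


def export_then_delete(dataset_id: str, token: str, save: Callable[[list], None]) -> None:
    """Download the items, hand them to `save`, and delete the dataset afterwards."""
    with urllib.request.urlopen(dataset_items_url(dataset_id, token)) as response:
        items = json.load(response)
    save(items)  # caller-supplied upload; if it raises, the dataset is not deleted
    request = urllib.request.Request(dataset_url(dataset_id, token), method="DELETE")
    urllib.request.urlopen(request).close()
```

Deleting only after `save` returns means a failed upload leaves the data on Apify, so the export can be retried.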
foreign-sapphire
foreign-sapphireOP2y ago
Thank you, Andrey. I appreciate your response; it will help us start using Apify. I used this API to trigger the job: url = "https://api.apify.com/v2/acts/clockworks~tiktok-scraper/runs?token=***" It returned the list above. * Then I tried several APIs to get the task status, for example this one: f"https://api.apify.com/v2/acts/GdWCkxBtKWOsKjdch/runs?token={token}" It should return all the runs for the actor, but it returned only some of them, and my job wasn't among them. * I also tested this API: url = f"https://api.apify.com//v2/actor-runs/{runId}?{token}" I set runId to the id I got when I started the run, as you described, but this is the response: {'error': {'type': 'page-not-found', 'message': 'We have bad news: there is no API endpoint at this URL. Did you specify it correctly?'}} Please help me with this step. I can't continue and start using Apify until I have an API to check the status of a previous job by its ID. Yes, I started the run and the task is working; I checked that in the console. I even took the task_id from the URL in the console and used it as runId with the same API, but it returned the same response.
metropolitan-bronze
metropolitan-bronze2y ago
So do you start an actor directly, or a task? The endpoints above are for actor runs, but there's a separate set of API endpoints for tasks: https://docs.apify.com/api/v2#/reference/actor-tasks/get-list-of-task-runs
foreign-sapphire
foreign-sapphireOP2y ago
What's the difference between starting an actor and starting a task? I'm using this API to run: url = f"https://api.apify.com/v2/acts/{actor}/runs?token={token}" So I think I'm running an actor, right?
foreign-sapphire
foreign-sapphireOP2y ago
This is the API I used to run the job.
foreign-sapphire
foreign-sapphireOP2y ago
So I create runs on actors, and then I want to check the status of those runs by passing the run ID or something similar. This is the information I get when I start a run: dict_keys(['id', 'actId', 'userId', 'startedAt', 'finishedAt', 'status', 'meta', 'stats', 'options', 'createdByOrganizationMemberUserId', 'buildId', 'defaultKeyValueStoreId', 'defaultDatasetId', 'defaultRequestQueueId', 'buildNumber', 'containerUrl', 'usage', 'usageTotalUsd', 'usageUsd'])
metropolitan-bronze
metropolitan-bronze2y ago
It should be https://api.apify.com/v2/actor-runs/{runId}?token={token} - I just checked and it works. From your list, id is the runId. In your URL, it seems you're missing the actual parameter name (token={token}):
url = f"https://api.apify.com//v2/actor-runs/{runId}?{token}"
There's also a double slash before v2.
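To make both fixes concrete, here is a small sketch that builds the run URL with a single slash and a named `token` parameter. `run_url` and `fetch_status` are hypothetical helper names, and reading `data.status` from the response is an assumption based on the run object keys listed earlier in the thread.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"  # no trailing slash, so no "//v2" in the result


def run_url(run_id: str, token: str) -> str:
    """Build the 'get run' URL; urlencode writes the query as token=<value>."""
    return f"{API_BASE}/actor-runs/{run_id}?" + urlencode({"token": token})


def fetch_status(run_id: str, token: str) -> str:
    """Fetch a run and return its status (assumes the run object sits under 'data')."""
    with urllib.request.urlopen(run_url(run_id, token)) as response:
        return json.load(response)["data"]["status"]


print(run_url("RUN_ID", "TOKEN"))
# https://api.apify.com/v2/actor-runs/RUN_ID?token=TOKEN
```

The token passed here has to belong to the same account that started the run; otherwise the run is not visible to the request.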
foreign-sapphire
foreign-sapphireOP2y ago
Thank you, Andrey. It seems I used two different tokens, one for creating the task and another for checking its status, so the second one couldn't see the task. I've solved it now, and the resources you sent were very helpful. I really appreciate your help.
metropolitan-bronze
metropolitan-bronze2y ago
Glad this is resolved 👍
