Prevent an actor from running in parallel
I want to prevent users from requesting too much. Is there a way to prevent an actor from running in parallel? When a user sends too many requests in parallel, the upstream server responds with error 429 (Too Many Requests), 503, or 403. I wish actors had a flag like
Prevent Parallel
or Wait until previous request finished
or something like that.
11 Replies
correct-apricot•3y ago
As far as I know, this is only available for schedules, but not for actor runs. Also, different runs would normally use different proxies. I will pass it to the team, but would not expect it to land anytime soon.
thanks @Andrey Bykov
I have not studied this in detail, so I do not know if this is possible. But in theory: change the code so that it checks via the API whether another run of the actor is in progress, and if so, logs a message and exits the run.
Thanks @HonzaS, I've implemented my code exactly as you suggested. After several hours of being tortured by documentation all over the internet, I finally got it working. :perfecto: I used a key-value store to register the current run_id so that the following requests could read from it. However, some requests can still escape. I still hope there is an easier way to do this via the Apify platform.
Not sure why you need to register the current run_id. I was thinking that you just get all runs of the actor via the API, and if more than one has status: running then exit the current run. Seems quite straightforward, but maybe I am missing something.
Ahh yes, that should work! @HonzaS thanks for inspiring me. I shall try it and let you know the result.
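For anyone reading later, the "list the runs and exit if another one is running" check could be sketched roughly like this in Python. This is only a sketch under assumptions: the runs are assumed to already be fetched from the API as dictionaries with id, status and startedAt fields (an ISO 8601 string), the current run is assumed to be in that list, and the function name should_exit is hypothetical. The tie-break by start time is an addition beyond what was suggested above: if two runs start at almost the same moment and each simply exits when it sees more than one running, both could exit, so here the oldest running run is allowed to continue.

```python
from datetime import datetime
from typing import Iterable, Mapping


def _started_at(run: Mapping[str, str]) -> datetime:
    # Assumption: startedAt is an ISO 8601 UTC string such as
    # "2023-01-01T10:00:00.000Z"; the trailing "Z" is rewritten so that
    # datetime.fromisoformat accepts it on older Python versions.
    return datetime.fromisoformat(run["startedAt"].replace("Z", "+00:00"))


def should_exit(current_run_id: str, runs: Iterable[Mapping[str, str]]) -> bool:
    """Return True if the current run should exit because an older run is still running.

    Assumes `runs` includes the current run itself.
    """
    running = [r for r in runs if r["status"] == "RUNNING"]
    if len(running) <= 1:
        # Only the current run (or nothing) is running: carry on.
        return False
    # Let the oldest running run continue; every other running run exits.
    # The run id is used as a secondary key so ties are broken consistently.
    oldest = min(running, key=lambda r: (_started_at(r), r["id"]))
    return oldest["id"] != current_run_id


if __name__ == "__main__":
    # Hypothetical sample data standing in for the API response.
    sample_runs = [
        {"id": "a", "status": "RUNNING", "startedAt": "2023-01-01T10:00:00.000Z"},
        {"id": "b", "status": "RUNNING", "startedAt": "2023-01-01T10:00:05.000Z"},
        {"id": "c", "status": "SUCCEEDED", "startedAt": "2023-01-01T09:00:00.000Z"},
    ]
    print(should_exit("b", sample_runs))
    print(should_exit("a", sample_runs))
```

Inside the actor, the list would come from the API call that returns the actor's runs, and the current run's id from the run's own environment; both are left out here on purpose. Note that this does not fully close the gap: a run that starts between the API call and the check can still slip through, which may be why some requests still escape.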
correct-apricot•3y ago
I've also got a similar response from the team, that it's not possible directly, but first suggestion was:
another suggestion was from @Alexey Udovydchenko (but I cannot comment on it, so sending as-is, so maybe he could comment on his own):
Thank you @Andrey Bykov @Alexey Udovydchenko for your suggestions, I have noted them all. Meanwhile, I am implementing what @HonzaS suggested, using the API. The next step is to try to explain to users why I am applying this restriction and wasting their precious time waiting for a run to finish ... 😔
correct-apricot•3y ago
What's actually the website (out of curiosity)? Is it really that sensitive (even when accessed from different IPs)?
It's just an ordinary website for searching properties/stays. I don't know why, but it errors very often: 429, 403, and worst of all 503.
Even with this restriction, I still got error 429: Too Many Requests.
I know limiting requests is annoying for a user; I just want to minimize the impact of the scraping activity on the upstream website.
@Andrey Bykov does an actor always use a different IP, even without Apify proxy?
correct-apricot•3y ago
We highly recommend forcing proxies on pretty much any actor. Without a proxy it will use AWS IPs, which means the pool of IPs is probably known to any website using Cloudflare or something like that. So if it's not using a proxy, the IP could be different, but still from the same pool. Another thing, if proxies are used, is to limit open pages per browser; this way, even with higher concurrency, requests should go from different IPs/fingerprints. I mean, it's all kinda basic advice, but on the other hand, sometimes all you need is such basic restrictions.