adverse-sapphire
Can I pass metadata onto a request that I can retrieve when getting the dataset from a webhook?
I'd like to pass some metadata (say, id=1234) onto a request so when the webhook event happens, I'll have access to the metadata when requesting the actor run's dataset.
7 Replies
afraid-scarlet•3y ago
There's already some info passed (actor/run ID, etc.). You could also add a webhook dynamically in the code.
adverse-sapphireOP•3y ago
When I'm passing pageUrls to one of my actors, I added a #id=X to the pageUrl which is a hacky workaround, but it does work. Ideally, though, in the array of pageUrls that I'm passing to the actor via the API, I could attach an internal ID so I could grab it when the webhook reports back.
I could, to your point, set up a custom webhook with /webhook?ids=3487,899,12 etc., but that isn't tightly bound to the input pageUrls.
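A minimal sketch of the #id=X workaround described above, assuming the scraper reports each page URL back with its fragment intact (which this thread suggests but does not guarantee for every actor). The helper names tag_url and read_tag are hypothetical and only illustrate the round trip:

```python
from typing import Optional
from urllib.parse import parse_qs, urldefrag


def tag_url(url: str, internal_id: int) -> str:
    """Append the internal ID as a URL fragment, e.g. ...#id=1234."""
    base, _ = urldefrag(url)  # drop any fragment already on the URL
    return f"{base}#id={internal_id}"


def read_tag(url: str) -> Optional[int]:
    """Recover the internal ID from a tagged URL, or None if it has no id."""
    _, fragment = urldefrag(url)
    values = parse_qs(fragment).get("id")
    return int(values[0]) if values else None


tagged = tag_url("https://somepage.com/whatever", 1234)
recovered = read_tag(tagged)
```

The fragment is never sent to the target server, so it should not change which page is fetched, but it does overwrite any fragment the original URL already had.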
afraid-scarlet•3y ago
What I'm trying to say is that you could keep it basically the same way, and inside the actor do an additional step that checks this param and then calls Actor.addWebhook() (https://docs.apify.com/sdk/js/reference/class/Actor#addWebhook), or even Actor.call() or Actor.callTask().
Which actor are we talking about, actually? Because if I understand correctly, you could add some custom data to the request.userData object: https://crawlee.dev/api/core/class/Request#userData
adverse-sapphireOP•3y ago
It's on the FB scraper. I've tried adding userData to the POST request as well as to the JSON payload, but I haven't found a way to retrieve it after the webhook. I've tried calling /request-queue/requests, /request-queue/head and /dataset/items, but I can't seem to find the userData. I can get the data (from /dataset/items), just not the metadata.
afraid-scarlet•3y ago
Can you share your account ID or email via DM? I can't really wrap my head around exactly how you are doing it, so I just want to have a look at your runs.
adverse-sapphireOP•3y ago
DM'd, thanks
For posterity, the solution is to add the userData like this:
{
  "startUrls": [
    {
      "url": "https://somepage.com/whatever",
      "userData": {
        "id": 34567
      }
    }
  ],
  "proxyConfigurations": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}
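As a rough sketch, the startUrls part of that input can be generated from a mapping of internal IDs to page URLs. The function name build_input and the pages mapping are hypothetical; the field names are copied from the JSON above. Whether userData comes back with each dataset item depends on the actor, so treat that as something to verify on a test run:

```python
import json
from typing import Dict


def build_input(pages: Dict[int, str]) -> dict:
    """Build an actor input where every start URL carries its internal ID."""
    return {
        "startUrls": [
            {"url": url, "userData": {"id": internal_id}}
            for internal_id, url in pages.items()
        ],
    }


payload = build_input({34567: "https://somepage.com/whatever"})
body = json.dumps(payload, indent=2)  # JSON body to send as the run input
```

Other input fields, such as the proxy settings shown above, would be merged into the same dictionary before sending.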