stormy-gold

Remove items from dataset output

Hi guys, I'm using this Instagram scraper to get some post info from Instagram. It works fine, except it outputs far more info than I need, and because of that the data ends up in different columns every time. What I'm trying to do is use the extendOutputFunction to remove the items below from the output. This used to work fine, but something in the scraper changed and now the code looks all janky and I can't work out where it's wrong. I have a feeling it's those \n parts, as I don't remember those being there previously. Any ideas? Thanks heaps, fam.
{
"extendOutputFunction": "async ({ data, item, helpers, page, customData, label }) => {\ndelete item.hashtags;\ndelete item.mentions;\ndelete item.position;\ndelete item.commentsCount;\ndelete item.dimensionsHeight;\ndelete item.dimensionsWidth;\ndelete item.latestComments;\ndelete item.childPosts;\ndelete item.id;\ndelete item.likesCount;\ndelete item.locationName;\ndelete item.locationId;\ndelete item.ownerFullName;\ndelete item.ownerId;\ndelete item.shortCode;\ndelete item.type;\ndelete item.firstComment;\ndelete item.alt;\ndelete item.timestamp;\ndelete item.url;\ndelete item.images;\n\n\n return item;\n}\n",
"resultsLimit": 1,
"username": [
"subscription.box.australia"
]
}
7 Replies
plain-purple, 3y ago
Hello, yes, just remove all the \n. Those are newline characters; you probably copied the raw string that includes them after it was converted from the editor. Actually, in this case they might not matter, so send me a link privately if you want. There is also a second option now: when you download the dataset, you can use the advanced options to choose which fields to include or omit.
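For illustration, a minimal sketch of both points in this reply. It uses a shortened two-field version of the input from the question. `datasetItemsUrl` and the `DATASET_ID` placeholder are hypothetical, and the `omit` query parameter on the dataset items endpoint is worth checking against the Apify API docs before relying on it.

```javascript
// Raw JSON input as in the question, shortened to two deleted fields.
// String.raw keeps each \n as the two characters backslash + n,
// exactly as they appear in the raw JSON view.
const rawInput = String.raw`{"extendOutputFunction": "async ({ item }) => {\ndelete item.hashtags;\ndelete item.mentions;\nreturn item;\n}\n"}`;

// Parsing the JSON turns every \n escape into a real line break,
// so the function source is ordinary multi-line JavaScript.
const source = JSON.parse(rawInput).extendOutputFunction;
const lines = source.trim().split('\n');
console.log(lines.join('\n'));

// Hypothetical helper: build a download URL that drops fields at export time
// instead of deleting them inside the scraper.
function datasetItemsUrl(datasetId, omittedFields) {
  const params = new URLSearchParams({
    format: 'csv',
    omit: omittedFields.join(','),
  });
  return `https://api.apify.com/v2/datasets/${datasetId}/items?${params}`;
}

const exampleUrl = datasetItemsUrl('DATASET_ID', ['hashtags', 'mentions']);
console.log(exampleUrl);
```

The first part suggests why the \n sequences are probably harmless: they only exist in the JSON encoding of the string. The second part is one way to get stable columns without touching the scraper input at all.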
stormy-gold (OP), 3y ago
Hi Lukas, thanks so much for your reply, I really appreciate it. I've removed all the \n mentions, but the exported file still has all the fields that should be removed. I'd love to send you a link privately, but I'm unsure exactly what link would be helpful. If you can let me know what to send, that would be amazing, as I spent hours and hours on this yesterday and it's driving me mental, haha. Thanks.
Alexey Udovydchenko
Please consider https://apify.com/lukaskrivka/dedup-datasets, since the custom output in the IG actor is kept for backwards compatibility.
absent-sapphire, 3y ago
Hi Alexey, I need this to happen automatically on every run, before the data is passed over to another piece of software. I'm trying to achieve no manual input on my part, so I don't think this will work, as it's asking for a dataset ID? Thanks.
plain-purple, 3y ago
@Jacpat You can connect that dedup actor to a webhook so it is automatically started after the scrape ends
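A rough sketch of what such a webhook definition could look like, not a confirmed setup. `buildDedupWebhook` and `MY_API_TOKEN` are hypothetical, and the `datasetIds` input field of the dedup actor and the `{{resource.defaultDatasetId}}` payload variable are assumptions to verify against the actor's README and the Apify webhook docs.

```javascript
// Hypothetical helper that assembles a webhook definition to attach
// to the Instagram scraper (for example under its Integrations settings).
function buildDedupWebhook(apiToken) {
  return {
    // Fire only when the scrape finishes successfully.
    eventTypes: ['ACTOR.RUN.SUCCEEDED'],
    // Calling this URL starts a run of the dedup actor through the Apify API.
    requestUrl: `https://api.apify.com/v2/acts/lukaskrivka~dedup-datasets/runs?token=${apiToken}`,
    // Assumption: the dedup actor reads its source datasets from "datasetIds",
    // and the template variable expands to the finished run's dataset ID.
    payloadTemplate: '{"datasetIds": [{{resource.defaultDatasetId}}]}',
  };
}

const webhook = buildDedupWebhook('MY_API_TOKEN');
console.log(JSON.stringify(webhook, null, 2));
```

With something along these lines, the dataset ID no longer has to be typed in by hand: the scraper's finished run supplies it each time.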
absent-sapphire, 3y ago
Hi Lukas, any tips on how to do this? Thanks