Pepa J
Crawlee & Apify
Created by IrshaiD' on 5/7/2025 in #apify-platform
How do we handle authenticated scrapers on apify cli locally
@IrshaiD' Yes, in most cases you need to handle the initial login on your side and then persist the browser cookies (for example, to the key-value store). In a pre-navigation hook, you can then load the cookies from the key-value store and set them on the page context. You may also need to handle cookie expiration.
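A minimal sketch of that flow, under a few assumptions: `store` stands in for a Crawlee key-value store (anything with async `getValue`/`setValue`), `page` for a Playwright page, and the key name `SESSION_COOKIES` plus the helper names are made up for illustration.

```javascript
// Hypothetical record key under which the cookies are persisted.
const SESSION_COOKIES_KEY = 'SESSION_COOKIES';

// Keep session cookies (`expires` missing or -1) and cookies that are still valid.
// Playwright reports `expires` as Unix time in seconds.
function dropExpiredCookies(cookies, nowSeconds = Date.now() / 1000) {
    return cookies.filter(
        (c) => c.expires === undefined || c.expires === -1 || c.expires > nowSeconds,
    );
}

// Call once after the login succeeded: reads the cookies from the
// browser context and persists them to the store.
async function saveCookies(store, page) {
    const cookies = await page.context().cookies();
    await store.setValue(SESSION_COOKIES_KEY, cookies);
}

// Returns a function that can be used as an entry in `preNavigationHooks`.
// It loads the persisted cookies and sets the non-expired ones on the page context.
function createCookieHook(store) {
    return async ({ page }) => {
        const saved = (await store.getValue(SESSION_COOKIES_KEY)) ?? [];
        const fresh = dropExpiredCookies(saved);
        if (fresh.length > 0) {
            await page.context().addCookies(fresh);
        }
    };
}
```

With Crawlee, the store would typically come from KeyValueStore.open(), and the hook would be passed as preNavigationHooks: [createCookieHook(store)]. If no fresh cookies are left, the login step has to be repeated; how to do that depends on the target site.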
10 replies
Crawlee & Apify
Created by IrshaiD' on 5/7/2025 in #apify-platform
How do we handle authenticated scrapers on apify cli locally
Hi @IrshaiD', I am not sure I understand, but maybe you mean the apify login command of apify-cli? Or can you describe in a bit more detail what you are trying to achieve?
10 replies
Crawlee & Apify
Created by SJ(Jena) on 5/4/2025 in #apify-platform
Pricing clarification!
Hi @SJ(Jena), the subscription is pre-paid for 1 month, so removing your credit card will not affect the already pre-paid credits. After the current billing period runs out, and with no credit card on file, you'll be switched to the free tier for the next month.
6 replies
Crawlee & Apify
Created by secure-lavender on 5/1/2025 in #apify-platform
How to get the run workflow via API?
Hi @Youcef, unfortunately this is currently not available via the API.
5 replies
Crawlee & Apify
Created by wise-white on 5/4/2025 in #crawlee-js
Wiping session between inputs
Hi @BageDevimo, you can try clearing the cookies and other state in a pre-navigation hook configured in the PlaywrightCrawler options (preNavigationHooks):
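import { PlaywrightCrawler } from 'crawlee';

const crawler = new PlaywrightCrawler({
    // ...
    preNavigationHooks: [
        async ({ page }) => {
            // Remove all cookies from the browser context of this page.
            await page.context().clearCookies();
            // localStorage is scoped per origin; this clears it for the
            // document currently loaded in the page.
            await page.evaluate(() => {
                localStorage.clear();
            });
        },
    ],
    // ...
});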
7 replies
Crawlee & Apify
Created by metropolitan-bronze on 3/7/2025 in #crawlee-js
How to ensure dataset is created before pushing data to it?
We found out that there was a custom implementation of the Dataset drop function that was meant for development purposes but behaved differently on the Apify Platform.
8 replies
Crawlee & Apify
Created by optimistic-gold on 3/7/2025 in #crawlee-js
How to ensure dataset is created before pushing data to it?
@Casper The code looks good. I tried to reproduce the issue, but without success. Does it happen often? Would you be able to put together a minimal code example that shows this behavior?
8 replies
Crawlee & Apify
Created by other-emerald on 3/7/2025 in #crawlee-js
How to ensure dataset is created before pushing data to it?
Hi @Casper, does the issue still occur? Based on the logs, it really seems that there is an attempt to push data into a non-existent Dataset. Can you share the code where you manage the datasets and push items into them?
8 replies
Crawlee & Apify
Created by correct-apricot on 3/7/2025 in #crawlee-js
How to ensure dataset is created before pushing data to it?
Hi @Casper, can you send me the ID of a Run in which the problem happened, so we can investigate?
8 replies
Crawlee & Apify
Created by adverse-sapphire on 3/5/2025 in #apify-platform
Python Template Issues
It was fixed today.
5 replies
Crawlee & Apify
Created by adverse-sapphire on 2/16/2025 in #apify-platform
[URGENT] Video Scraper for Reddit
Hi @! rami, the best way to get information about a specific Actor is to raise an issue in the Issues tab on the Actor's detail page.
3 replies
Crawlee & Apify
Created by conscious-sapphire on 3/6/2025 in #apify-platform
Tracking custom stats
Hi @Ryan A, there is no such feature on RequestQueue. You may want to propose it in the #💫feature-request channel. To work with statistics like the ones you mentioned on the Platform, we usually see scheduled Runs (e.g. once an hour) of custom-made Actors that can browse the other Runs' storages and data, evaluate them, and then save the result to a named KV-store, as you mentioned. Because there is only one scheduled Run in this setup, there is a lower chance of running into race conditions.
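A rough sketch of the evaluation step such a scheduled Actor could perform. Everything here is illustrative: the run summary shape ({ status, itemCount }), the key name CUSTOM_STATS and both helper names are assumptions, and `store` stands in for a named key-value store with an async `setValue`.

```javascript
// Hypothetical record key under which the aggregated stats are saved.
const STATS_KEY = 'CUSTOM_STATS';

// Aggregates a list of run summaries. Each summary is assumed to look like
// { status: 'SUCCEEDED' | 'FAILED' | ..., itemCount: number }.
function summarizeRuns(runs) {
    const summary = { totalRuns: runs.length, succeeded: 0, failed: 0, totalItems: 0 };
    for (const run of runs) {
        if (run.status === 'SUCCEEDED') summary.succeeded += 1;
        else if (run.status === 'FAILED') summary.failed += 1;
        summary.totalItems += run.itemCount ?? 0;
    }
    return summary;
}

// Writes the aggregated stats to the store. Only the single scheduled run
// writes this record, so concurrent writers are not expected.
async function saveSummary(store, runs, now = new Date()) {
    const summary = { ...summarizeRuns(runs), updatedAt: now.toISOString() };
    await store.setValue(STATS_KEY, summary);
    return summary;
}
```

How the list of runs and their item counts is collected is left out here, since it depends on how the other Actors store their data.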
4 replies
Crawlee & Apify
Created by like-gold on 3/5/2025 in #apify-platform
Python Template Issues
Hi @Max Forbang, thank you. I was able to reproduce the issue and I raised it internally.
5 replies
Crawlee & Apify
Created by equal-aqua on 3/3/2025 in #apify-platform
Cannot create an account on the website with gmail address. Do not want to use Google SSO
Hi @Benj, unfortunately this is currently not possible.
3 replies
Crawlee & Apify
Created by adverse-sapphire on 3/5/2025 in #crawlee-js
Routing issue
Hi @Scai, would it be possible to put together a minimal reproducible example of when it happens, and probably an example of such URLs? enqueueLinks usually puts new links at the end of the queue. You can use forefront: true to change that: https://github.com/apify/crawlee/issues/389 .
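A small sketch of how forefront: true could be used in a request handler. The handler name, the CSS selectors and the labels are made up for illustration; `enqueueLinks` is the function Crawlee passes in the crawling context.

```javascript
// Hypothetical handler for a listing page.
async function handleListingPage({ enqueueLinks }) {
    // Detail links are put at the front of the queue, so they are
    // processed before requests that were enqueued earlier.
    await enqueueLinks({ selector: 'a.detail', label: 'DETAIL', forefront: true });

    // Pagination links use the default behaviour and go to the end of the queue.
    await enqueueLinks({ selector: 'a.next', label: 'LIST' });
}
```

Whether this resolves the ordering problem depends on how the routes are set up, which is why a reproducible example is still useful.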
4 replies
Crawlee & Apify
Created by extended-salmon on 3/5/2025 in #crawlee-js
Using BrightData's socks5h proxies
Hi @Jeno, yes, unfortunately, as far as I know, socks5 is not supported by Crawlee yet. There is an issue about it on GitHub where people discuss a workaround using proxy-chain, just as you mentioned: https://github.com/apify/crawlee/issues/389
4 replies
Crawlee & Apify
Created by rival-black on 3/5/2025 in #apify-platform
📌 Passing a Custom Tag in Apify Actor & Webhook for Make.com
No description
4 replies
Crawlee & Apify
Created by ambitious-aqua on 3/1/2025 in #crawlee-js
How to stop following delayed javascript redirects?
Thank you @Nth. I believe there might be an issue/bug that only shows up on a specific website. Would it be possible to put together a minimal reproducible example with "real URLs"?
7 replies
Crawlee & Apify
Created by foreign-sapphire on 3/2/2025 in #apify-platform
Force language?
Ah yes, in case you are not the developer of the Actor, there is no default way to do this from the API or the Actor's input. It needs to be implemented directly in the Actor.
6 replies
Crawlee & Apify
Created by provincial-silver on 2/21/2025 in #crawlee-js
Disable write to disk
I'll just add another example:
import { MemoryStorage } from '@crawlee/memory-storage';
import { PlaywrightCrawler } from 'crawlee';
import { RequestQueue } from 'apify';

// Open the default request queue, backed by the given MemoryStorage client.
export const memoryRequestQueue = await RequestQueue.open(null, {
    storageClient: new MemoryStorage(),
});

const crawler = new PlaywrightCrawler({
    // proxyConfiguration is assumed to be defined elsewhere.
    proxyConfiguration,
    requestQueue: memoryRequestQueue,
    // ...
});
etc.
7 replies