Actor doesn't run on Console

I started an Actor from a template. When I run it locally in Docker it crawls all the requests it should, but when it runs on Apify Console it stops after the first request without giving any error message. The unmodified template worked fine, and using a proxy doesn't affect the result. Any idea what the cause could be?
10 Replies
ratty-blush
ratty-blushOP6mo ago
Docker on Apify Console
2024-11-27T18:43:11.609Z ACTOR: Pulling Docker image of build wkLEvzADcUINe4Q5f from repository.
2024-11-27T18:43:12.988Z ACTOR: Creating Docker container.
2024-11-27T18:43:13.146Z ACTOR: Starting Docker container.
2024-11-27T18:43:13.671Z Starting X virtual framebuffer using: Xvfb :99 -ac -screen 0 1920x1080x24+32 -nolisten tcp
2024-11-27T18:43:13.674Z Executing main command
2024-11-27T18:43:14.785Z INFO System info {"apifyVersion":"3.2.6","apifyClientVersion":"2.9.7","crawleeVersion":"3.11.5","osType":"Linux","nodeVersion":"v20.18.0"}
2024-11-27T18:43:15.778Z INFO PlaywrightCrawler: Starting the crawler.
2024-11-27T18:43:27.137Z INFO PlaywrightCrawler: enqueueing new URLs
2024-11-27T18:43:31.438Z INFO PlaywrightCrawler: All requests from the queue have been processed, the crawler will shut down.
2024-11-27T18:43:36.256Z INFO PlaywrightCrawler: Final request statistics: {"requestsFinished":1,"requestsFailed":0,"retryHistogram":[1],"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":14859,"requestsFinishedPerMinute":3,"requestsFailedPerMinute":0,"requestTotalDurationMillis":14859,"requestsTotal":1,"crawlerRuntimeMillis":21027}
2024-11-27T18:43:36.259Z INFO PlaywrightCrawler: Finished! Total 1 requests: 1 succeeded, 0 failed. {"terminal":true}
Docker local
Starting X virtual framebuffer using: Xvfb :99 -ac -screen 0 1920x1080x24+32 -nolisten tcp
Executing main command
INFO System info {"apifyVersion":"3.2.6","apifyClientVersion":"2.9.7","crawleeVersion":"3.11.5","osType":"Linux","nodeVersion":"v20.18.0"}
WARN ProxyConfiguration: The "Proxy external access" feature is not enabled for your account. Please upgrade your plan or contact [email protected]
INFO PlaywrightCrawler: Starting the crawler.
INFO PlaywrightCrawler: enqueueing new URLs
INFO PlaywrightCrawler: BLUE WEEK - Ingressos - EVENTIM {"url":"https://www.eventim.com.br/artist/blue-week/"}
INFO PlaywrightCrawler: Final request statistics: {"requestsFinished":10,"requestsFailed":0,"retryHistogram":[10],"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":4598,"requestsFinishedPerMinute":11,"requestsFailedPerMinute":0,"requestTotalDurationMillis":45981,"requestsTotal":10,"crawlerRuntimeMillis":56069}
INFO PlaywrightCrawler: Finished! Total 10 requests: 10 succeeded, 0 failed. {"terminal":true}
extended-salmon
extended-salmon6mo ago
Please share a code reproduction of your case and, if possible, a link to your run on the platform. Without more details, it’s difficult to assist effectively. Maybe you have different target sites in the input locally and on the platform? Also, for future reference, consider following best practices when asking for help to ensure faster and better support: https://stackoverflow.com/help/how-to-ask
Stack Overflow
How do I ask a good question? - Help Center
Stack Overflow | The World’s Largest Online Community for Developers
ratty-blush
ratty-blushOP6mo ago
Code: almost no change from the basic Actor template.
main.ts
const {
    startUrls = ['https://www.eventim.com.br/'],
    maxRequestsPerCrawl = 10,
} = await Actor.getInput<Input>() ?? {} as Input;

const crawler = new PlaywrightCrawler({
    proxyConfiguration,
    maxRequestsPerCrawl,
    requestHandler: router,
    navigationTimeoutSecs: 20,
    sameDomainDelaySecs: 5,
    maxConcurrency: 10,
    launchContext: {
        launcher: firefox,
    },
    browserPoolOptions: {
        fingerprintOptions: {
            fingerprintGeneratorOptions: {
                browsers: [
                    {
                        name: BrowserName.firefox,
                        minVersion: 115,
                    },
                ],
                devices: [DeviceCategory.desktop],
                locales: ["pt-BR"],
                operatingSystems: [OperatingSystemsName.windows],
            },
        },
    },
});
routes.ts
router.addDefaultHandler(async ({ enqueueLinks, log }) => {
    log.info(`enqueueing new URLs`);
    await enqueueLinks({
        globs: ['https://www.eventim.com.br/artist/*'],
        label: 'detail',
    });
});

router.addHandler('detail', async ({ request, page, log }) => {
    const title = await page.title();
    log.info(`${title}`, { url: request.loadedUrl });

    await Dataset.pushData({
        url: request.loadedUrl,
        title,
    });
});
ratty-blush
ratty-blushOP6mo ago
Apify
Apify Console
Manage Apify, a full-stack web scraping and data extraction platform.
ratty-blush
ratty-blushOP6mo ago
If needed, I can provide a GitHub link to the full files. proxyConfiguration is empty, as in the template.
ratty-blush
ratty-blushOP6mo ago
I tried a first run changing only the request addresses from the template, and it just didn't work. It advanced only locally. @Oleg V.
extended-salmon
extended-salmon6mo ago
@didiraja Check your input settings on the platform. There’s a default "startUrl" value that overrides the one from your code, which is why your glob logic isn’t working.
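A minimal sketch of why this happens, assuming the Console input form supplies its own start URL. The helper name `resolveInput` and the URL `https://example.com/` are hypothetical and only illustrate the behavior. The destructuring defaults in main.ts apply only when a field is missing from the input, so any value coming from the platform input takes priority over the one written in code:

```typescript
// Shape of the Actor input; both fields are optional because the input may omit them.
interface Input {
    startUrls?: string[];
    maxRequestsPerCrawl?: number;
}

// Mirrors the destructuring in main.ts: a default is used only when the field is undefined.
// `resolveInput` is a hypothetical helper, not part of the Apify SDK.
function resolveInput(input: Input | null): Required<Input> {
    const {
        startUrls = ['https://www.eventim.com.br/'],
        maxRequestsPerCrawl = 10,
    } = input ?? {};
    return { startUrls, maxRequestsPerCrawl };
}

// Local run with no stored input: the defaults from the code are used.
const local = resolveInput(null);

// Platform run where the input form already carries a start URL (hypothetical value):
// the code default for startUrls is ignored.
const onPlatform = resolveInput({ startUrls: ['https://example.com/'] });

console.log(local.startUrls);
console.log(onPlatform.startUrls);
```

With a start URL on a different site, the glob `https://www.eventim.com.br/artist/*` matches none of the links found on the page, so nothing is enqueued and the crawler finishes after a single request, which is consistent with the Console log above.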
ratty-blush
ratty-blushOP6mo ago
@Oleg V. I'll take a look, thanks.
@Oleg V. Just confirming that was the case, thanks so much.
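One way to catch this kind of mismatch early is to compare the effective start URLs against the glob before starting the crawler. This is only a sketch: `findMismatchedStartUrls` is a hypothetical helper, not a Crawlee or Apify API, and it assumes the glob starts with a full origin followed by a wildcard path, as in routes.ts.

```typescript
// Returns the start URLs whose hostname differs from the hostname in the glob pattern.
// Hypothetical helper; assumes the glob looks like "https://host/path/*".
function findMismatchedStartUrls(startUrls: string[], glob: string): string[] {
    // Drop everything from the first wildcard on, so the rest parses as a normal URL.
    const globHost = new URL(glob.replace(/\*.*$/, '')).hostname;
    return startUrls.filter((url) => new URL(url).hostname !== globHost);
}

const glob = 'https://www.eventim.com.br/artist/*';

// Start URL from the code default: same host as the glob, nothing is reported.
const okCase = findMismatchedStartUrls(['https://www.eventim.com.br/'], glob);

// Start URL coming from a leftover input value (hypothetical): reported as a mismatch.
const badCase = findMismatchedStartUrls(['https://example.com/'], glob);

console.log(okCase);
console.log(badCase);
```

Logging the mismatched URLs at startup would make a leftover input value visible in the run log instead of the run ending silently after one request.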
