Playwright crawler failing when element is not found
I have written a crawler using Playwright. I have a bunch of page.locator calls that find elements and scrape text from them. Most of the elements are always on the page, but a few, like reviews, are not always there, since a product may be new and not have any reviews yet. That would be no problem at all if Playwright / Crawlee did not fail because of it. What I saw is that when page.locator can't find a given element it throws an error, and that's okay. But Crawlee treats this error as a failure of the whole page and marks the request to the page as failed. Even though the other locators work and a lot of data has been found with them, I'm getting messages that the request to url someshop/product-55 failed. How can I fix this and tell Crawlee / Playwright not to fail the request when a page.locator call fails? I'm okay with having an empty string if no reviews are found, but I'm not okay with ignoring the other data because of one page.locator failure. Example code:
const a = await page.locator(a_locator).textContent(); // element found
const reviews = await page.locator(reviews_locator).textContent(); // element not found, error thrown
const b = await page.locator(b_locator).textContent(); // element found
const c = await page.locator(c_locator).textContent(); // element found
2 Replies
afraid-scarlet•2y ago
That is Playwright's design, not Crawlee's. You should either catch the error or check that the element exists first, so the error does not occur.
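One way to "test if it exists" without leaving the locator API is to ask the locator how many elements it matches before reading the text. This is only a minimal sketch: the helper name textOrDefault and its fallback parameter are made up for illustration, and it assumes the page has already finished rendering the part you are interested in, because count() does not wait for elements to appear.

```javascript
// Hypothetical helper (not part of Playwright or Crawlee).
// Returns the text of the first matching element, or `fallback` when nothing matches.
async function textOrDefault(page, selector, fallback = "") {
  const locator = page.locator(selector);

  // count() resolves with the number of elements currently matching the selector.
  // It does not wait for the element, so a missing element does not throw here.
  if ((await locator.count()) === 0) {
    return fallback;
  }

  // textContent() can resolve to null, so fall back in that case too.
  const text = await locator.first().textContent();
  return text ?? fallback;
}
```

With that in place, the optional field from the question could be read as `const reviews = await textOrDefault(page, reviews_locator);` while the other fields keep using page.locator directly.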
harsh-harlequinOP•2y ago
I tried doing it with try/catch statements, assigning null to the reviews variable when there was an error, but nothing changed 😕
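For reference, a try/catch version might look like the sketch below. The helper name textOrNull and the 2000 ms value are assumptions for illustration only. Two details matter: the await has to sit inside the try block, otherwise the rejection escapes the catch, and textContent() keeps waiting for the element until its timeout expires, so a short explicit timeout keeps a missing element from stalling the handler. Whether either of these was the cause of the behaviour described above is a guess.

```javascript
// Hypothetical helper (not part of Playwright or Crawlee).
// Returns the element's text, or null if reading it fails.
async function textOrNull(page, selector, timeoutMs = 2000) {
  try {
    // The await must be inside the try block so the rejection is caught here.
    // The `timeout` option limits how long Playwright waits for the element.
    return await page.locator(selector).textContent({ timeout: timeoutMs });
  } catch (error) {
    // Any failure (including a timeout for a missing element) is treated as "no value".
    return null;
  }
}
```

Usage would be along the lines of `const reviews = (await textOrNull(page, reviews_locator)) ?? "";`. Note that this version swallows every error from that one call, which is a deliberate simplification.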
I was also wondering if I could check whether the HTML node exists before calling page.locator, but in Node.js (Crawlee runs on Node.js) I'm not able to call, for example, document.querySelector, since document is not available there. Do you have any other suggestion for how to check whether an HTML node exists?
Figured it out, here's the example code:
const reviewsQuery = await page.$(reviewsSelector);
const reviews = reviewsQuery
? await page.locator(reviewsSelector).textContent()
: "";
page.$ returns an element handle for the first matching node, or null if nothing matches, so we can branch on that 🙂
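The same page.$ check can be applied to every field at once, so that no single missing element affects the rest. The sketch below is only an illustration: the function name scrapeOptionalFields and the shape of the selectors object are assumptions, not something from Crawlee or Playwright. It reads the text from the handle that page.$ already returned, which avoids looking the element up a second time.

```javascript
// Hypothetical helper (not part of Playwright or Crawlee).
// `selectors` maps a field name to a CSS selector, e.g. { reviews: ".reviews" }.
// Every field that has no matching element ends up as an empty string.
async function scrapeOptionalFields(page, selectors) {
  const result = {};
  for (const [name, selector] of Object.entries(selectors)) {
    // page.$ resolves to an element handle, or null when nothing matches.
    const handle = await page.$(selector);
    result[name] = handle ? (await handle.textContent()) ?? "" : "";
  }
  return result;
}
```

For the code in the question this could be called as `await scrapeOptionalFields(page, { a: a_locator, reviews: reviews_locator, b: b_locator, c: c_locator })`.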