Crawlee & Apify•3y ago
eastern-cyan

How to handle variable selectors on pages to be scraped?

Hi, I am running code with the Apify Playwright crawler and scraping with selectors. The issue is that not all pages have the same selectors: some pages have 2, some have 3. How do I manage that? The code I am using is:
const currentInvestingPosition = await page.locator('//*[@id="vc-profile"]/div/div[2]/div[2]/div[2]').innerHTML();
const investmentRange = await page.locator('//*[@id="vc-profile"]/div/div[2]/div[2]/div[3]').innerHTML();
const sweetSpot = await page.locator('//*[@id="vc-profile"]/div/div[2]/div[2]/div[4]').innerHTML();
const investmentsOnRecord = await page.locator('//*[@id="vc-profile"]/div/div[2]/div[2]/div[5]').innerHTML();
const currentFundSize = await page.locator('//*[@id="vc-profile"]/div/div[2]/div[2]/div[6]').innerHTML();
4 Replies
national-gold•3y ago
You can use page.locator to grab all items under a single selector, then check how many matched with locator.count(): https://playwright.dev/docs/api/class-locator#locator-count Which page are you scraping, by the way?
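A minimal sketch of that idea, assuming the rows all sit under the same parent as in the question's XPaths. ROWS_XPATH and the function name scrapeProfileRows are made up for illustration, and the XPath is the question's path with the final index dropped, so it also returns the first div that the original code skips.

```javascript
// Assumption: every value lives in a sibling <div> under this parent,
// as the XPaths in the question suggest. Adjust to the real page.
const ROWS_XPATH = '//*[@id="vc-profile"]/div/div[2]/div[2]/div';

// Hypothetical helper: returns the innerHTML of each matching row,
// whether the page has 2, 3, or any other number of them.
async function scrapeProfileRows(page) {
    const rows = page.locator(ROWS_XPATH);
    // count() resolves to 0 when nothing matches instead of throwing.
    const count = await rows.count();
    const values = [];
    for (let i = 0; i < count; i++) {
        values.push(await rows.nth(i).innerHTML());
    }
    return values;
}
```

Inside a Playwright crawler's request handler this would be called as `await scrapeProfileRows(page)`. Since count() does not wait for elements to appear, it is probably worth waiting for the #vc-profile container to load first.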
vicious-gold•3y ago
You can also use CSS selectors, which support multiple selectors separated by a comma, e.g. page.$(`${selector1},${selector2}`)
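A rough sketch of the comma approach, assuming the field can be reached by one of a few CSS selectors depending on the page layout. The helper name firstMatchHtml and the selectors in the usage line are hypothetical, since the real page's class names are not known here.

```javascript
// Hypothetical helper: tries several alternative CSS selectors at once
// and returns the innerHTML of the first element that matches any of them.
async function firstMatchHtml(page, selectors) {
    // A comma-separated CSS selector list matches elements
    // that satisfy at least one of the selectors.
    const combined = selectors.join(',');
    // page.$ resolves to null when nothing matches, so a missing
    // field becomes null instead of an error.
    const handle = await page.$(combined);
    return handle ? handle.innerHTML() : null;
}
```

Usage would look like `await firstMatchHtml(page, ['#vc-profile .range', '#vc-profile .investment-range'])`. Note that the comma only works for CSS selectors. For XPath strings like the ones in the question, the equivalent is the XPath union operator `|`.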
eastern-cyanOP•3y ago
Thanks for the tip, let me try it. The actual URL I am trying to scrape is https://signal.nfx.com/investors/james-currier
