🧩 Plasmo Developers•12mo ago•
1 reply
Guibod

Scrape many pages with plasmo

Hello, I’m new to the framework and feel a little overwhelmed, so I decided to ask the Plasmo people directly.

My need is to scrape a few pages as the browser owner, extract some JSON, and offer it as a download. Since the website sits behind authentication shenanigans and my use case relies on authenticated users to extract the data, I’ve opted to build a web extension.

I’ve been able to use PlasmoCSConfig and PlasmoGetInlineAnchor to add a button to the website. That’s great. I’ve also been able to change the current URL on a click event, and I’ve read and explored a little around WORLD script injection.

But how would you design an extension that visits a page, determines a list of 6 to 20 pages to explore, visits each of them as the current user, extracts some info from each DOM, and finally creates a JSON payload?
Do I need to use a background script? If so, how can the background act on the current tab? If the scraping takes too long, is there a risk that the background would die?
Alternatively, should I embed the website in a frame to keep context between the pages visited? Or can this be achieved from a sidebar or the project’s popup?
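
To make the flow concrete, here is a minimal sketch of the loop I have in mind, written so it does not depend on where it ends up running (content script, background, or popup). Every name in it (scrapeAll, ScrapeDeps, fetchHtml, listUrls, extract, PageRecord) is hypothetical and made up for illustration; none of it is Plasmo API.

```typescript
// Hypothetical shape of one scraped page; the real fields depend on the site.
type PageRecord = { url: string; data: Record<string, string> }

// The three site-specific steps are injected, so the loop itself stays generic.
type ScrapeDeps = {
  // Returns the HTML of a page, fetched as the current (authenticated) user.
  fetchHtml: (url: string) => Promise<string>
  // Reads the first page and returns the 6 to 20 URLs to explore.
  listUrls: (indexHtml: string) => string[]
  // Pulls the interesting fields out of one page.
  extract: (html: string) => Record<string, string>
}

// Visits the index page, then each listed page in sequence,
// and returns the collected records as a JSON string.
async function scrapeAll(indexUrl: string, deps: ScrapeDeps): Promise<string> {
  const indexHtml = await deps.fetchHtml(indexUrl)
  const urls = deps.listUrls(indexHtml)

  const records: PageRecord[] = []
  for (const url of urls) {
    // Sequential on purpose: one request at a time is gentler on the site.
    const html = await deps.fetchHtml(url)
    records.push({ url, data: deps.extract(html) })
  }
  return JSON.stringify(records, null, 2)
}
```

If this ran in a content script on the site itself, I assume fetchHtml could be a same-origin fetch(url, { credentials: "include" }) followed by response.text(), with DOMParser used inside listUrls and extract. That would only work if the data is present in the server-rendered HTML; pages that build their DOM with JavaScript would still need real navigation.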