Figured this one out. Turns out someone made a neat library that extracts text from PDFs and works in workers: https://github.com/unjs/unpdf
Once you have the text, you can use the regular recursive character text splitter from LangChain. Figured I'd share it in case someone runs into this thread looking for the same thing.
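For anyone landing here later, here is a rough sketch of what the splitting step does once the PDF text is a plain string. This is a simplified stand-in written for illustration, not LangChain's actual implementation: `splitRecursively` and its parameters are hypothetical names, and it skips chunk overlap entirely. The commented lines at the bottom show how I'd expect the unpdf side to be wired up based on its README, so treat those calls as an assumption and check the repo.

```typescript
// Simplified recursive character splitter (illustrative stand-in, not LangChain's code).
// It tries the coarsest separator first and only falls back to finer ones
// for pieces that are still longer than chunkSize.
export function splitRecursively(
  text: string,
  chunkSize: number,
  separators: string[] = ["\n\n", "\n", " ", ""],
): string[] {
  if (text.length <= chunkSize) return text.length > 0 ? [text] : [];

  const [sep, ...finer] = separators;
  if (sep === undefined || sep === "") {
    // No separator left: hard-cut into fixed-size slices.
    const out: string[] = [];
    for (let i = 0; i < text.length; i += chunkSize) {
      out.push(text.slice(i, i + chunkSize));
    }
    return out;
  }

  const chunks: string[] = [];
  let current = "";
  for (const piece of text.split(sep)) {
    // Greedily merge neighbouring pieces while they still fit in one chunk.
    const candidate = current === "" ? piece : current + sep + piece;
    if (candidate.length <= chunkSize) {
      current = candidate;
      continue;
    }
    if (current !== "") chunks.push(current);
    if (piece.length > chunkSize) {
      // The piece alone is too long: split it again with finer separators.
      chunks.push(...splitRecursively(piece, chunkSize, finer));
      current = "";
    } else {
      current = piece;
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}

// Assumed wiring with unpdf (per its README, not verified here):
//   import { extractText, getDocumentProxy } from "unpdf";
//   const pdf = await getDocumentProxy(new Uint8Array(arrayBuffer));
//   const { text } = await extractText(pdf, { mergePages: true });
//   const chunks = splitRecursively(text, 1000);
```

In a real project you would probably swap the stand-in for LangChain's own `RecursiveCharacterTextSplitter`, which also handles overlap between chunks.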



