Figured this one out. Turns out someone made a neat library that can extract text from PDFs and works in Workers: https://github.com/unjs/unpdf

Once you have the text, you can use the regular RecursiveCharacterTextSplitter from LangChain. Figured I'd share it in case someone runs into this thread looking for the same thing.
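For anyone landing here later, this is a rough sketch of how the two steps fit together. The names `pdfToChunks`, `ExtractText`, and `SplitText` are made up for illustration, and the extract and split steps are passed in as functions so the glue stays independent of library versions. The commented wiring at the bottom assumes unpdf's `getDocumentProxy` / `extractText` and LangChain's `RecursiveCharacterTextSplitter`; the import path for the splitter and the chunk sizes are assumptions, so check them against the versions you have installed.

```typescript
// Hypothetical shapes for the two steps (not part of unpdf or LangChain).
type ExtractText = (pdfBytes: Uint8Array) => Promise<string>;
type SplitText = (text: string) => Promise<string[]>;

// Extract the text from the PDF bytes, then hand it to the splitter.
// Returns an empty list when the PDF yields no usable text
// (for example, a scanned document with no text layer).
async function pdfToChunks(
  pdfBytes: Uint8Array,
  extract: ExtractText,
  split: SplitText,
): Promise<string[]> {
  const text = await extract(pdfBytes);
  if (text.trim().length === 0) {
    return [];
  }
  return split(text);
}

// Possible wiring inside a Worker (assumed, untested here):
//
// import { extractText, getDocumentProxy } from "unpdf";
// import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
//
// const extract: ExtractText = async (bytes) => {
//   const pdf = await getDocumentProxy(bytes);
//   const { text } = await extractText(pdf, { mergePages: true });
//   return text;
// };
//
// // chunkSize and chunkOverlap are placeholder values, tune for your use case.
// const splitter = new RecursiveCharacterTextSplitter({
//   chunkSize: 1000,
//   chunkOverlap: 200,
// });
// const split: SplitText = (text) => splitter.splitText(text);
//
// const chunks = await pdfToChunks(pdfBytes, extract, split);
```

With `mergePages: true`, unpdf returns the whole document as a single string, which is the shape the splitter's `splitText` expects.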