Loading files along with HTML-scraped content via LangChain's ApifyDatasetLoader
The ApifyDatasetLoader for LangChain loads the records, which include the text, metadata, and fileUrl fields. All of the examples show loading content via the text or metadata fields — but what about fileUrl? Assuming the run has records for PDF, XLSX, and/or other files, is there an example of how to load those files alongside the scraped HTML content?
2 Replies
It's outside of the SDK's functionality: https://llamahub.ai/l/apify-dataset. Check their Git repo or post the question there, I guess.
flat-fuchsiaOP•17mo ago
Got it, thanks, will check via the integration repo.
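For anyone landing here later, here is a rough sketch of one way to carry fileUrl along with the scraped text. It is not from the integration repo. It assumes each dataset record is a dict with the text, metadata, and fileUrl fields mentioned in the question, and that metadata is itself a dict. `Doc`, `map_item`, and `split_docs` are made-up names for illustration; `Doc` is only a stand-in for LangChain's Document so the snippet runs without any dependencies.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from urllib.parse import urlparse


@dataclass
class Doc:
    """Hypothetical stand-in for LangChain's Document (page_content + metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def map_item(item: dict) -> Doc:
    """Map one dataset record to a Doc, keeping fileUrl in the metadata."""
    # Assumption: "metadata" is a dict (or missing) on each record.
    metadata = dict(item.get("metadata") or {})
    file_url = item.get("fileUrl")
    if file_url:
        metadata["file_url"] = file_url
        # Lowercased extension from the URL path (query string ignored),
        # so a later step can pick a PDF/XLSX parser.
        metadata["file_ext"] = PurePosixPath(urlparse(file_url).path).suffix.lower()
    return Doc(page_content=item.get("text") or "", metadata=metadata)


def split_docs(docs):
    """Separate plain scraped-text docs from docs that point at a file."""
    html_docs = [d for d in docs if "file_url" not in d.metadata]
    file_docs = [d for d in docs if "file_url" in d.metadata]
    return html_docs, file_docs


# Made-up sample records, shaped like the fields named in the question.
items = [
    {"text": "Hello page", "metadata": {"url": "https://example.com/"}},
    {"text": "", "metadata": {}, "fileUrl": "https://example.com/files/report.PDF?dl=1"},
]
docs = [map_item(i) for i in items]
html_docs, file_docs = split_docs(docs)
```

In a real setup, a function like `map_item` (returning a LangChain Document instead of `Doc`) could probably be passed as the `dataset_mapping_function` argument of ApifyDatasetLoader. The entries in `file_docs` would then need a separate step that downloads each `file_url` and hands it to a PDF or XLSX loader, since the dataset loader itself only sees the records.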