Crawlee & Apify•2y ago

How to parse out results in Langchain ApifyDatasetLoader

I'm using the Google Search Results Scraper, which returns a single JSON item with Paid, Organic and a few other keys. I'd like to parse out the Organic titles and URLs into a Langchain agent, but it's not clear how to iterate over them. Any suggestions?

loader = apify.call_actor(
    actor_id="apify/google-search-scraper",
    # Prepare the Actor input
    run_input={
        "queries": query,
        "maxPagesPerQuery": 1,
        "resultsPerPage": 100,
        "customDataFunction": """async ({ input, $, request, response, html }) => {return {pageTitle: $('title').text(),};}""",
    },
    # Map each dataset item to a single Document
    dataset_mapping_function=lambda item: Document(
        page_content=item["url"] or "", metadata={"source": item["url"]}
    ),
)

1 Reply

correct-apricot•2y ago

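Well, the problem is that one dataset item contains many organic results, so you will need to create more than one document per item. I think that would need to be done separately, since the mapping function returns a single document for each dataset item.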
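
A minimal sketch of doing that flattening step separately, under a few assumptions: each dataset item is taken to hold its organic results in a list under an `organicResults` key, with `title` and `url` fields on each entry (check the actual Actor output for the exact names). `Doc` is a hypothetical stand-in for Langchain's `Document` so the example runs without dependencies, and `organic_results_to_docs` and `sample_items` are illustrative names, not part of any library.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for Langchain's Document (same two fields)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def organic_results_to_docs(items):
    """Create one Doc per organic result instead of one per dataset item."""
    docs = []
    for item in items:
        # Assumed key name; a missing or null list is treated as empty.
        for result in item.get("organicResults") or []:
            url = result.get("url")
            if not url:
                continue  # skip entries that have no URL
            docs.append(
                Doc(
                    page_content=result.get("title") or "",
                    metadata={"source": url},
                )
            )
    return docs


# Made-up data shaped like the assumed output: one item, several results.
sample_items = [
    {
        "paidResults": [{"title": "An ad", "url": "https://ads.example.com"}],
        "organicResults": [
            {"title": "First result", "url": "https://example.com/a"},
            {"title": "Second result", "url": "https://example.com/b"},
            {"title": "Result without a URL"},
        ],
    }
]

docs = organic_results_to_docs(sample_items)
for doc in docs:
    print(doc.page_content, doc.metadata["source"])
```

In real use, `sample_items` would be replaced by the dataset items from the Actor run, and `Doc` by Langchain's `Document`, before handing the list to the agent.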
