unwilling-turquoise

create new

I am using the following code to run the Apify Website Content Crawler:


from apify_client import ApifyClient

# Initialize the ApifyClient with your API token
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [{ "url": "https://docs.apify.com/academy/web-scraping-for-beginners" }],
    "crawlerType": "playwright:firefox",
    "includeUrlGlobs": [],
    "excludeUrlGlobs": [],
    "maxCrawlDepth": 20,
    "maxCrawlPages": 9999999,
    "initialConcurrency": 0,
    "maxConcurrency": 200,
    "initialCookies": [],
    "proxyConfiguration": { "useApifyProxy": True },
    "requestTimeoutSecs": 60,
    "dynamicContentWaitSecs": 10,
    "maxScrollHeightPixels": 5000,
    "removeElementsCssSelector": """nav, footer, script, style, noscript, svg,
[role=\"alert\"],
[role=\"banner\"],
[role=\"dialog\"],
[role=\"alertdialog\"],
[role=\"region\"][aria-label*=\"skip\" i],
[aria-modal=\"true\"]""",
    "removeCookieWarnings": True,
    "clickElementsCssSelector": "[aria-expanded=\"false\"]",
    "htmlTransformer": "readableText",
    "readableTextCharThreshold": 100,
    "aggressivePrune": False,
    "debugMode": False,
    "debugLog": False,
    "saveHtml": False,
    "saveMarkdown": True,
    "saveFiles": False,
    "saveScreenshots": False,
    "maxResults": 9999999,
}

# Run the Actor and wait for it to finish
run = client.actor("aYG0l9s7dbB7j3gbS").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

I want to create a separate dataset for each of my projects. When I change run["defaultDatasetId"] to project_id, the Actor is not executed. Is there a way to create a dataset ID for each of my projects?

2 Replies

MEE6•2y ago

@elmatero just advanced to level 1! Thanks for your contributions! 🎉

eastern-cyan•2y ago

Every run is associated with its own unique dataset, and the ID of that dataset is stored in the run object under the key "defaultDatasetId". When you replace run["defaultDatasetId"] with project_id, you are trying to open a dataset whose ID is your project ID, and no such dataset exists.
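
One way to get a dataset per project is to let the Actor write to its default dataset as usual, then copy the items into a named dataset once the run finishes. The sketch below assumes the Python client's `client.datasets().get_or_create(name=...)` and `client.dataset(...).push_items(...)` calls. The helper name `copy_run_items_to_named_dataset`, the dataset name "project-a", and the batch size of 500 are illustrative choices, not something from the original post.

```python
def copy_run_items_to_named_dataset(client, run, dataset_name, batch_size=500):
    """Copy a finished run's default dataset into a named dataset.

    `client` is expected to be an apify_client.ApifyClient and `run` the
    dict returned by client.actor(...).call(...). The batch size is an
    arbitrary assumption, chosen only to keep each upload small.
    """
    # Returns the existing dataset with this name, or creates it.
    project_dataset = client.datasets().get_or_create(name=dataset_name)
    target = client.dataset(project_dataset["id"])

    # The run's results always land in its own default dataset first.
    source = client.dataset(run["defaultDatasetId"])

    copied = 0
    batch = []
    for item in source.iterate_items():
        batch.append(item)
        if len(batch) >= batch_size:
            target.push_items(batch)
            copied += len(batch)
            batch = []
    if batch:  # upload whatever is left over
        target.push_items(batch)
        copied += len(batch)

    return project_dataset["id"], copied


# Usage with the code from the question (needs apify-client and a token):
# run = client.actor("aYG0l9s7dbB7j3gbS").call(run_input=run_input)
# dataset_id, count = copy_run_items_to_named_dataset(client, run, "project-a")
# for item in client.dataset(dataset_id).iterate_items():
#     print(item)
```

Calling the helper again with the same name appends to the same dataset, so several runs for one project end up together. If you only need to read the results, keeping a mapping of project name to run["defaultDatasetId"] on your side would also work, without copying anything.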
