Unable to scrape and extract data. Starts to Parallelise then nothing happens

URL: https://am-arrowmax.com/collections/*

Prompt:

Follow every product link to ensure all variants, including sold-out ones, are included. Capture the title, all variants (size, color, etc.), images per product and variant, price, and product/variant URLs. Ensure a clear mapping of product to variants to images to price to URLs.

The schema was auto-generated:

{
  "type": "object",
  "required": [],
  "properties": {
    "products": {
      "type": "array",
      "items": {
        "type": "object",
        "required": [],
        "properties": {
          "title": {
            "type": "string"
          },
          "variants": {
            "type": "array",
            "items": {
              "type": "object",
              "required": [],
              "properties": {
                "size": {
                  "type": "string"
                },
                "color": {
                  "type": "string"
                },
                "images": {
                  "type": "array",
                  "items": {
                    "type": "object",
                    "required": [],
                    "properties": {}
                  }
                },
                "price": {
                  "type": "number"
                },
                "url": {
                  "type": "string"
                }
              }
            }
          },
          "images": {
            "type": "array",
            "items": {
              "type": "object",
              "required": [],
              "properties": {}
            }
          },
          "price": {
            "type": "number"
          },
          "url": {
            "type": "string"
          }
        }
      }
    }
  }
}
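For reference, here is a rough Python sketch of what one result matching this schema could look like, and how it might be flattened into one row per variant so the product to variant to image to price to URL mapping stays explicit. The sample values, the `src` key on image objects (the auto-generated schema leaves image items as empty objects), and the `flatten_variants` helper are all assumptions for illustration, not output from the actual extract run.

```python
from typing import Any

# Hypothetical record shaped like the schema above; all values are invented.
sample: dict[str, Any] = {
    "products": [
        {
            "title": "Example product",
            "url": "https://example.com/products/example-product",
            "price": 19.99,
            # "src" is an assumed key; the schema does not define image fields.
            "images": [{"src": "https://example.com/img/main.jpg"}],
            "variants": [
                {
                    "size": "S",
                    "color": "Red",
                    "price": 19.99,
                    "url": "https://example.com/products/example-product?variant=1",
                    "images": [{"src": "https://example.com/img/red.jpg"}],
                },
                {
                    "size": "M",
                    "color": "Blue",
                    "price": 21.5,
                    "url": "https://example.com/products/example-product?variant=2",
                    "images": [],
                },
            ],
        }
    ]
}


def flatten_variants(data: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one row per variant, carrying the parent product's fields."""
    rows = []
    for product in data.get("products", []):
        for variant in product.get("variants", []):
            # Fall back to product-level images when the variant has none.
            images = variant.get("images") or product.get("images", [])
            rows.append(
                {
                    "product_title": product.get("title"),
                    "product_url": product.get("url"),
                    "size": variant.get("size"),
                    "color": variant.get("color"),
                    # Fall back to the product price if the variant lacks one.
                    "price": variant.get("price", product.get("price")),
                    "variant_url": variant.get("url"),
                    "image_urls": [img.get("src") for img in images],
                }
            )
    return rows


rows = flatten_variants(sample)
```

Since every `required` list in the schema is empty, any field may be missing from a result, which is why the sketch reads everything with `.get()`.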

9 Replies

VENGEΛNCEOP•4mo ago

I messaged the AI on the website and it passed it to a human, but it's Sunday in most of the world (Monday here in NZ) and I'd like to get this resolved today. Nothing shows in Activity Logs, and I'm at 0/5 active concurrent browsers.

micah.stairs•4mo ago

Can you try again? We fixed a bug this morning that was likely causing this.

VENGEΛNCEOP•4mo ago

Just kicked off a new extract process, so will see how it goes. @micah.stairs I tried again and nothing happened. I can't even see the attempt in the extract overview, just the previous failed attempts with empty files.

micah.stairs•4mo ago

Hmm, so if you want more reliable (and cheaper) extract-like functionality, you should check out our JSON scraping mode, which is supported by endpoints like /scrape and /crawl. You can also pair that with custom actions, which is a powerful combo! Would this work for your use case?

Harsh•4mo ago

Hey @VENGEΛNCE, this seems to be working fine. I spotted a couple of issues in your setup:

1. You have included https:// when adding the URL; please remove it.
2. Your prompt doesn't seem right, e.g. you have included variants twice. Make sure your prompt is simple and easy to parse for the parameters.

Attaching a screenshot for your reference. Let me know if you have any further issues.
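The two fixes suggested above could be applied before submitting with something like the sketch below. Both helper names are hypothetical, and it assumes that the URL field expects a bare host and path and that the prompt duplication is a repeated paragraph; neither is confirmed beyond what is said in this thread.

```python
def strip_scheme(url: str) -> str:
    """Drop a leading http:// or https:// from a URL, if present."""
    for prefix in ("https://", "http://"):
        if url.lower().startswith(prefix):
            return url[len(prefix):]
    return url


def dedupe_paragraphs(prompt: str) -> str:
    """Keep only the first occurrence of each blank-line-separated paragraph."""
    seen = set()
    kept = []
    for paragraph in prompt.split("\n\n"):
        # Normalise whitespace so trivially different copies still match.
        key = " ".join(paragraph.split())
        if key and key not in seen:
            seen.add(key)
            kept.append(paragraph.strip())
    return "\n\n".join(kept)


url = strip_scheme("https://am-arrowmax.com/collections/*")
prompt = dedupe_paragraphs(
    "Follow every product link.\n\nFollow every product link."
)
```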

Harsh•4mo ago

@micah.stairs The team might consider adding/improving these:

1. The URL for sharing is too long; can we shorten it?
2. Can we add the prompt/schema that was used for a run (under recent runs)?

micah.stairs•4mo ago

Those are good ideas! I just passed along those feature requests to the team. I will send a message here if we end up implementing either of those.

Harsh•4mo ago

sweet!

micah.stairs•2mo ago

The prompt and schema are now included in the activity logs! And the playground links should now be shorter!
