Passing memory allocation per run in Apify reel scraper

How do you go about changing the memory limit for an actor via the API? I am using the Python client. I had read some documentation implying that the memory limit is passed in the URL of the API call with the option &memory=32 to set the memory limit to 32 MB. However, that is not working: my actor is still defaulting to 4 GB. I have also tried a few different guesses at putting the memory limit in the run_input, but I have been unsuccessful. Does anyone have documentation on how to set the memory limit for the Apify Reel Scraper (apify/instagram-reel-scraper) API?
8 Replies
xenial-black
xenial-black•15mo ago
You can set the memory when you call the actor in your code (the "memory_mbytes" param): https://docs.apify.com/sdk/python/reference/class/Actor#call or in the API call (the "memory" param): https://docs.apify.com/api/v2/#/reference/actors/run-actor-synchronously/with-input It should work; otherwise you're doing something wrong, and it would help to share a code snippet with your logic.
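A minimal sketch of both options (the token is a placeholder and the input shape is only illustrative; check the actor's input schema for the exact fields):

import requests
from apify_client import ApifyClient

TOKEN = "apify_api_..."  # placeholder; use your own token
run_input = {"username": "some_user", "resultsLimit": 10}  # illustrative input only

# Option 1: Python client - pass memory_mbytes (in megabytes) to .call()
client = ApifyClient(TOKEN)
run = client.actor("apify/instagram-reel-scraper").call(
    run_input=run_input,
    memory_mbytes=1024,  # 1 GB for this run
)

# Option 2: raw API call - pass "memory" (also in megabytes) as a query parameter
response = requests.post(
    "https://api.apify.com/v2/acts/apify~instagram-reel-scraper/run-sync",
    params={"token": TOKEN, "memory": 1024},
    json=run_input,
)
response.raise_for_status()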
metropolitan-bronze
metropolitan-bronzeOP•15mo ago
Thanks Oleg. If I understand the documentation for the call correctly, this is how I am building my API call. Note the memory_mbytes at the end of the string. Does this look correct? https://api.apify.com/v2/acts/apify~instagram-reel-scraper/run-sync?token=apify_api_*******************&memory_mbytes=32
xenial-black
xenial-black•15mo ago
No, it should be "memory=256". Also note that the amount of memory must be a power of 2 with a minimum of 128 MB, i.e., 256 MB, 512 MB, 1024 MB, 2048 MB, ..., 32768 MB. https://docs.apify.com/platform/actors/running/usage-and-resources#memory
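To illustrate that constraint, a quick standalone check (this helper is not part of the Apify client, just an illustration of the documented rule):

def is_valid_run_memory(mbytes: int) -> bool:
    # Valid values are powers of 2 from 128 MB up to 32768 MB (32 GB).
    return 128 <= mbytes <= 32768 and (mbytes & (mbytes - 1)) == 0

print(is_valid_run_memory(32))     # False - below the 128 MB minimum
print(is_valid_run_memory(256))    # True
print(is_valid_run_memory(32768))  # True - 32 GB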
metropolitan-bronze
metropolitan-bronzeOP•15mo ago
Thank you. I clearly was not thinking straight. Unfortunately, even after modifying my code to build what I believe is the proper URL, the actor is still only getting access to 4 GB of RAM. I have confirmed through the Apify Console that the actor will take advantage of 32 GB of RAM if I configure it manually. The code that generates the API call is below, and here is the value of apiURL: https://api.apify.com/v2/acts/apify~instagram-reel-scraper/run-sync?token=apify_api************&memory=32768
import requests
from apify_client import ApifyClient
# from apify_client import ApifyClientAsync
# import asyncio

# Apify URL build
memLimit = "&memory=32768"
resultsLimit = 10
filePath = "/folderpath/apifySecret.txt"
apiURL = "https://api.apify.com/v2/acts/"
actor_id = "apify~instagram-reel-scraper/run-sync"
apiURL = apiURL + actor_id + "?token="

# Read the API token from a local file
with open(filePath, 'r') as file:
    fileContent = file.read().strip()

apiURL = apiURL + fileContent + memLimit

# Initialize the ApifyClient with my API Token
client = ApifyClient(fileContent)

def scrape_instagram(user):
    # Function to call Apify API and scrape Instagram data
    # Build the Apify API payload
    run_input = {
        "username": user,
        "resultsLimit": resultsLimit,
    }

    scraped_data = []  # List to hold the results

    try:
        # Run the Actor and wait for it to finish
        run = client.actor("xMc5Ga1oCONPmWJIa").call(run_input=run_input)

        # Collect the fields of interest from the run's default dataset
        for item in client.dataset(run["defaultDatasetId"]).iterate_items():
            data_entry = {
                'id': item.get('id'),
                'type': item.get('type'),
                'ownerUsername': item.get('ownerUsername'),
                'hashtags': item.get('hashtags'),
                'url': item.get('url'),
                'timestamp': item.get('timestamp'),
                'childPosts': item.get('childPosts', [])
            }
            scraped_data.append(data_entry)

        return scraped_data
    # The original paste cut off here; a generic except clause is assumed so the snippet parses
    except Exception as err:
        print(f"Apify call failed: {err}")
        return scraped_data
xenial-black
xenial-black•15mo ago
But I see that you start the actor with the client: run = client.actor("xMc5Ga1oCONPmWJIa").call(run_input=run_input). So in .call() you should just pass the "memory_mbytes" param. See the docs: https://docs.apify.com/api/client/python/reference/class/ActorClient#call
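For reference, a minimal sketch of how that looks applied to the snippet above (the value is in megabytes, so 32768 corresponds to the 32 GB seen in the Console):

run = client.actor("xMc5Ga1oCONPmWJIa").call(
    run_input=run_input,
    memory_mbytes=32768,  # 32 GB; must be a power of 2 between 128 and 32768 MB
)

Note that the apiURL string built earlier in the snippet is never actually sent anywhere, which is why the memory query parameter in it had no effect on the run.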
metropolitan-bronze
metropolitan-bronzeOP•15mo ago
Thank you so much @Oleg V.! I had missed how to pass it to .call() properly. I sincerely appreciate your help!!
metropolitan-bronze
metropolitan-bronzeOP•15mo ago
I just tested this and it works (not surprisingly) as expected.
