Self-host Firecrawl (store data)
I might be missing something, but the self-host docs are kind of sparse: where can I configure persistence for scraped data? I ran a crawl, saw data when checking on the crawl via the API, took down the containers, brought them back up, and now the data is empty when I check via the API again.
Hi @fjorn, assuming you're following the self-host setup here: https://docs.firecrawl.dev/contributing/self-host.
Firecrawl uses Redis for queuing and likely stores crawl results there temporarily. When you restart the containers without persistent volumes, all the data in Redis gets wiped.
To fix: you'll need to use
docker volumes
and need to add volumes for Redis.
example config for docker compose
: volumes:
I'll suggest you use AI to update the docker-compose.yml
and redis.cofig files
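Here's a minimal sketch of what the Redis service could look like with persistence enabled. The service name, image tag, and flags here are assumptions based on a typical compose setup, so adapt them to the repo's actual docker-compose.yml:

```yaml
services:
  redis:
    image: redis:alpine
    # Enable persistence: snapshot every 60s if at least 1 key changed,
    # plus an append-only log for durability between snapshots.
    command: redis-server --save 60 1 --appendonly yes
    volumes:
      - redis-data:/data   # Redis writes dump.rdb / the AOF files here

volumes:
  redis-data:   # named volume, survives `docker compose down` (without -v)
```

The same settings can live in a mounted redis.conf instead of command-line flags; the key point is that /data inside the container maps to a named volume, so the data survives `docker compose down && docker compose up`.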
Let me know if this helps and clarifies; we'll update the docs.
Ah cool, yea, I can set up a Redis volume. Should I set up a volume for the Postgres data too? Looking at the tables, that stuff is mostly bookkeeping for the jobs.
You can skip that for now; only add PostgreSQL persistence later if you decide to enable Supabase features or set up a custom PostgreSQL instance for additional data storage.
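If you do get to that point, the pattern is the same as for Redis. This is a hypothetical sketch, not the repo's actual service definition, so match the service name and credentials to your compose file:

```yaml
services:
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_PASSWORD: example   # placeholder, set your own
    volumes:
      - pg-data:/var/lib/postgresql/data   # Postgres data directory

volumes:
  pg-data:   # named volume so job/bookkeeping tables persist across restarts
```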