SAME_HOSTNAME not working on non-www URLs
When using EnqueueStrategy.SAME_HOSTNAME, I noticed it does not work properly on non-www URLs.
In the debugger I noticed that it passes origin to _check_enqueue_strategy, but it uses context.request.loaded_url if available.
So every URL that is checked will mismatch because of the difference in hostname.
I tested this with multiple URLs, with and without the www prefix, and got the same behaviour.
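
To illustrate the kind of mismatch described above, here is a minimal sketch. It is not Crawlee's actual implementation: the helper same_hostname and the example.com URLs are hypothetical, and it assumes a strict hostname comparison in which the URL that was requested and the URL that was finally loaded differ only by the www prefix.

```python
from urllib.parse import urlparse


def same_hostname(origin_url: str, target_url: str) -> bool:
    """Hypothetical helper: strict hostname equality (an assumption, not Crawlee's code)."""
    return urlparse(origin_url).hostname == urlparse(target_url).hostname


# Assumed scenario: the crawl was started on the bare domain, while the
# loaded URL ended up on the www host (for example after a redirect).
requested_url = "https://example.com/"
loaded_url = "https://www.example.com/"
link_on_page = "https://example.com/about"

# Comparing against the requested URL matches; comparing against the
# loaded URL does not, because the two hostnames are different strings.
matches_requested = same_hostname(requested_url, link_on_page)
matches_loaded = same_hostname(loaded_url, link_on_page)
print(matches_requested, matches_loaded)
```

Under these assumptions, which of the two URLs is used as the origin decides whether the link is enqueued or skipped.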

2 Replies
This post was marked as solved by ROYOSTI.
Hi @ROYOSTI
Feel free to create an issue about this bug:
https://github.com/apify/crawlee-python/issues