Does Crawlee crawl both root-relative and base-relative URLs?
Root-relative: prefixed with '/', i.e. href=/ASDF brings you to example.com/ASDF
Base-relative: no prefix, i.e. href=ASDF from example.com/test/ brings you to example.com/test/ASDF
If someone could point me to where in the library this logic occurs, I would be forever grateful.
2 Replies
modern-teal•4mo ago
Hey @Besteon
https://github.com/apify/crawlee-python/blob/master/src/crawlee/_utils/urls.py
Crawlee uses yarl for this: https://github.com/aio-libs/yarl
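For reference, here is a minimal sketch of how the two cases from the question resolve under standard RFC 3986 rules. It uses `urljoin` from the Python standard library, not Crawlee's own code, and the `resolve` helper is a hypothetical name made up for illustration. It only assumes that Crawlee's yarl-based joining follows the same standard rules; check the linked urls.py for edge cases.

```python
from urllib.parse import urljoin


def resolve(base_url: str, href: str) -> str:
    """Resolve an href against the page URL it was found on.

    Hypothetical helper for illustration only; this is not Crawlee's API.
    """
    return urljoin(base_url, href)


page = "https://example.com/test/"

# Root-relative: the leading '/' replaces the whole path of the page URL.
root_relative = resolve(page, "/ASDF")

# Base-relative: no prefix, so the href is appended to the page's directory.
base_relative = resolve(page, "ASDF")

# Without a trailing slash, 'test' counts as a file name and gets replaced.
no_trailing_slash = resolve("https://example.com/test", "ASDF")

print(root_relative)      # https://example.com/ASDF
print(base_relative)      # https://example.com/test/ASDF
print(no_trailing_slash)  # https://example.com/ASDF
```

Note the last case: whether the page URL ends in a slash changes where a base-relative link ends up, which is usually the cause when such links look like they were resolved incorrectly.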