Selenium driver unable to fetch page in Railway deployment
I am facing an issue while deploying my app via railway.app.
Here is the description of the app:
It is a Python script that uses the Selenium driver to scrape certain webpages.
The Python script is containerised in Docker.
Since it is a standalone Docker image, the behaviour of the image should be consistent regardless of the underlying hardware. The whole point of containerisation is to achieve consistency and not introduce a new source of hidden package-dependency issues.
The Python script, along with the Docker container, runs fine on my local machine, a MacBook with an M1 chip.
Now I push it to GitHub and use railway to deploy from GitHub repo.
The problem is this: the build from the GitHub repo succeeds and the app itself runs fine on Railway (i.e. the deployment is successful), but the driver.get(webpage) call hangs for one minute and then times out. It looks like driver.get() is running into issues fetching the webpage from Railway's server, even though the same call works fine from my local machine.
Here is a snapshot of how I am constructing the Selenium driver.
I am using the Firefox Selenium driver, since the Chrome driver was having package-dependency issues when I built the Dockerfile around it. This is not the full program, just the gist.
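The original snippet is not reproduced here, but a headless Firefox setup along these lines is a reasonable sketch. The argument list and the `make_driver` name are my assumptions, not the poster's actual code:

```python
def firefox_container_args():
    # Flags Firefox generally needs when running headless inside a
    # container (assumed defaults; not taken from the poster's code).
    return ["--headless", "--width=1920", "--height=1080"]


def make_driver():
    # Imports are local so the helper above stays importable
    # in environments where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    opts = Options()
    for arg in firefox_container_args():
        opts.add_argument(arg)
    # With Selenium 4.6+, Selenium Manager can locate or download
    # geckodriver automatically, so no explicit driver path is needed.
    driver = webdriver.Firefox(options=opts)
    driver.set_page_load_timeout(60)  # seconds before driver.get() gives up
    return driver
```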
Here is the Dockerfile I use
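The actual Dockerfile is not reproduced here; a minimal one for this kind of setup might look like the following. The base image, package choices, and the `scraper.py` entry point are assumptions on my part:

```dockerfile
FROM python:3.11-slim

# firefox-esr comes from the Debian repos; geckodriver is resolved at
# runtime by Selenium Manager (Selenium 4.6+), so it is not installed here.
RUN apt-get update \
    && apt-get install -y --no-install-recommends firefox-esr \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

CMD ["python", "scraper.py"]
```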
I exposed the app to the public Internet under Deployment -> Settings -> Public HTTP Networking.
As already mentioned above, the problem is the driver.get() call timing out in the Railway deployment. I have tried setting a manual timeout, but that does not help. It seems the webpage cannot be reached from Railway's server.
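For reference, this is roughly how a manual timeout plus a simple retry can be wired up. This is a sketch only: `fetch_with_retry` is a name I made up, and it assumes any object exposing a `.get(url)` method (such as a Selenium driver with a page-load timeout already set):

```python
import time


def fetch_with_retry(driver, url, attempts=3, backoff=5):
    # Retry driver.get() a few times, since a transient network failure
    # and a hard "host unreachable" look identical on the first attempt.
    # `driver` is anything with a .get(url) method.
    last_err = None
    for i in range(attempts):
        try:
            driver.get(url)
            return True
        except Exception as err:  # e.g. selenium TimeoutException
            last_err = err
            time.sleep(backoff * (i + 1))  # linear backoff between tries
    raise last_err
```

If every attempt times out, the failure is persistent rather than transient, which points at the network path rather than a flaky page load.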
The specific error I am getting looks like this:
Message: Reached error page: about:neterror?e=netTimeout&u=h…
There is no firewall or rule on the side of the webpage I want to scrape that would block such requests.
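One way to narrow down whether this is a Selenium problem or a network problem is a plain-HTTP probe, with no browser involved, run from the same container. `check_reachable` is a hypothetical helper, not part of the original script: if it also times out on Railway, the issue is at the network level rather than in Firefox or the driver.

```python
import urllib.request


def check_reachable(url, timeout=10):
    # Fetch the URL with plain urllib and report whether the server answered.
    # Any exception (DNS failure, connection refused, timeout) counts as
    # unreachable, mirroring what driver.get() would run into.
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except Exception:
        return False
```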
3 Replies
Please provide your project ID or reply with N/A. Thread will automatically be closed if no reply is received within 10 minutes. You can copy your project's ID by pressing Ctrl/Cmd + K -> Copy Project ID.
No project ID was provided. Closing thread.
Sorry to burst your bubble, but web scraping is a big grey area, so we will not be providing support for this use case.