dependent-tan

SSL Error

Hi guys! I'm getting an SSL error when I request a TripAdvisor URL. Other URLs work fine. Please help me out.

26 Replies

Pepa J•2y ago

Hi @Deleted User, based on the error I would expect that you are being timed out by TripAdvisor. Are you using any anti-bot-detection approach, such as proxies, fingerprints, etc.?

dependent-tanOP•2y ago

No, it's simple code for making a request. It works for other URLs but not for TripAdvisor.

Pepa J•2y ago

@Deleted User can you share a reproducible code example?

conscious-sapphire•2y ago

I don't think this is related to Apify directly, might be some config in requests library or similar. We need to see the code ofc.

dependent-tanOP•2y ago

This is the code picture. Please check.

dependent-tanOP•2y ago

@Lukas Krivka @Pepa J this is the Apify boilerplate.

dependent-tanOP•2y ago

message.txt

dependent-tanOP•2y ago

please take a look. @Lukas Krivka Hi lukas hope you're doing well, can you please take a look.

conscious-sapphire•2y ago

I will try to reproduce it

dependent-tanOP•2y ago

Thanks, waiting for your reply. Hi @Lukas Krivka, have you checked? Please let me know. Thank you.

conscious-sapphire•2y ago

I can reproduce it, checking with the team

dependent-tanOP•2y ago

Thank you, really appreciate it. What is the error? Were you able to check it, @Lukas Krivka? Please reply, I need this working so I can run my crawler. Please help me.

conscious-sapphire•2y ago

Sorry, this might take a while before the Python team figures this out cc @Vlada Dusek

dependent-tanOP•2y ago

Thanks for the update. Really appreciate it. @Lukas Krivka Any luck? Guys @Lukas Krivka

robust-apricot•2y ago

might be related to this: https://github.com/encode/httpx/discussions/2602

GitHub

ssl.SSLWantReadError when using pytest fixture with AsyncClient · e...

It seems httpx (tested with 0.23.0) does not play nicely with pytest fixtures which are not function-scoped, with both pytest-asyncio and anyio. I see such an error for every second test: FAILED [ ...

conscious-sapphire•2y ago

Maybe using the requests library or something else would fix it, or just finding an option to ignore SSL errors.

dependent-tanOP•2y ago

Any example code? @Lukas Krivka

robust-apricot•2y ago

with little help from chatgpt 🙂 @Deleted User

from urllib.parse import urljoin
from bs4 import BeautifulSoup
import requests

# Define a function to scrape the given URL up to the specified maximum depth
def scrape(url, depth, max_depth):
    if depth > max_depth:
        return
    
    print(f'Scraping {url} at depth {depth}...')

    # Try to send a GET request to the URL
    try:
        # A timeout keeps the call from hanging forever on an unresponsive site
        response = requests.get(url, timeout=10)
        soup = BeautifulSoup(response.content, 'html.parser')

        # If we haven't reached the max depth, look for nested links and enqueue their targets
        if depth < max_depth:
            for link in soup.find_all('a'):
                link_href = link.get('href')
                if link_href and link_href.startswith(('http://', 'https://')):
                    link_url = urljoin(url, link_href)
                    print(f'Found link: {link_url}')
                    scrape(link_url, depth + 1, max_depth)

        # Extract and print the title of the page
        title = soup.title.string if soup.title else "No Title"
        print(f'Title: {title}')

    except requests.exceptions.RequestException as e:
        print(f'An error occurred: {e}')

# Main function to start scraping
def main():
    start_urls = [{'url': 'https://www.tripadvisor.com/'}]  # Example start URL
    max_depth = 1  # Example max depth

    # Start scraping from the first URL
    for start_url in start_urls:
        url = start_url.get('url')
        print(f'Starting scrape for: {url}')
        scrape(url, 0, max_depth)

if __name__ == "__main__":
    main()

dependent-tanOP•2y ago

Thanks, will check this. The site is secured by SSL, so I don't think we can ignore it. Stuck on the request...

conscious-sapphire•2y ago

JS libraries allow ignoring SSL errors, let me check.

conscious-sapphire•2y ago

Try this - https://stackoverflow.com/questions/68702930/how-to-turn-off-ssl-verification-for-authlib-client-with-httpx-starlette

Stack Overflow

How to turn off SSL verification for Authlib client with HTTPX / St...

I can't seem to find a way to make Authlib / HTTPS respect the self-signed certs no matter how hard I try, so I want to turn SSL verification off when making requests as the OAuth client. How can I...

dependent-tanOP•2y ago

I’m using python.

conscious-sapphire•2y ago

the link is for the Python httpx library
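For reference, a minimal sketch of what "turning off SSL verification" could look like with httpx, assuming the error really comes from certificate or hostname verification (the thread does not confirm that, and a block or timeout on TripAdvisor's side would not be fixed by this). The helper names `make_unverified_context` and `fetch_status` are made up for illustration. Disabling verification removes protection against tampered connections, so it is only reasonable for debugging.

```python
import ssl


def make_unverified_context() -> ssl.SSLContext:
    """Build an SSL context that skips certificate and hostname checks."""
    ctx = ssl.create_default_context()
    # check_hostname has to be turned off before verify_mode can be CERT_NONE
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def fetch_status(url: str) -> int:
    """Hypothetical helper: GET the URL without verification, return the status code."""
    import httpx  # third-party; imported lazily so the context helper works without it

    # httpx accepts either verify=False or an ssl.SSLContext for `verify`
    with httpx.Client(verify=make_unverified_context(), timeout=10.0) as client:
        return client.get(url).status_code
```

Calling `fetch_status("https://www.tripadvisor.com/")` would then show whether the request gets through once verification is out of the picture. With the requests library, the rough equivalent is passing `verify=False` to `requests.get`.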

dependent-tanOP•2y ago

let me take a look.
