Who is familiar with web crawling?

Hello, everyone! Who is familiar with web crawling? I have a paid project that I can't do myself. Who is confident they can?
Budget: $500
Requirements: Millions of Chinese magazine articles. They must be publication-level magazine articles (or e-zine articles published by magazines), not news articles from websites. Each article must have more than 500 words and be accompanied by more than one picture (JPG format only, no less than 128 pixels per side). The final delivery is a JSON file plus the images. The text should contain only the body content, with no annotations, author information, editing date, publication date, etc. If an article contains formulas or equations, they need to be converted to LaTeX format. Only the body content and the accompanying images should be captured, and the placeholder <image> should be inserted at the position where each accompanying image appears in the article. The images in each article are named 0000, 0001, 0002... in the order they appear.
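To make the delivery format concrete, here is a rough sketch of how one article record could be assembled and checked against the rules above. The post does not specify the JSON layout or how words are counted, so the field names `text` and `images`, the `.jpg` suffix on the file names, the `blocks` input shape, and counting each Chinese character as one word are all assumptions, and the helper names are hypothetical.

```python
import json
import re

MIN_WORDS = 500        # "more than 500 words"
MIN_IMAGES = 2         # "more than one picture"
MIN_SIDE = 128         # "no less than 128 pixels per side"
PLACEHOLDER = "<image>"


def count_words(text):
    """Rough count (assumption): one CJK character = one word, plus Latin/digit tokens."""
    text = text.replace(PLACEHOLDER, " ")
    cjk = len(re.findall(r"[\u4e00-\u9fff]", text))
    latin = len(re.findall(r"[A-Za-z0-9]+", text))
    return cjk + latin


def build_record(blocks):
    """Build one article record, or return None if it fails the requirements.

    blocks: list of ("text", str) or ("image", (width, height)) in reading order.
    """
    parts, images = [], []
    for kind, value in blocks:
        if kind == "text":
            parts.append(value)
        elif kind == "image":
            width, height = value
            if min(width, height) < MIN_SIDE:
                continue  # too small: drop the image and emit no placeholder
            # Name images 0000, 0001, ... in order of appearance.
            images.append(f"{len(images):04d}.jpg")
            parts.append(PLACEHOLDER)
    text = "\n".join(parts)
    if count_words(text) <= MIN_WORDS or len(images) < MIN_IMAGES:
        return None
    return {"text": text, "images": images}


if __name__ == "__main__":
    sample = [
        ("text", "字" * 300),
        ("image", (640, 480)),
        ("text", "字" * 250),
        ("image", (100, 500)),   # rejected: one side is under 128 px
        ("image", (128, 128)),
    ]
    print(json.dumps(build_record(sample), ensure_ascii=False)[:80])
```

Checking image dimensions and converting formulas to LaTeX would need real image and markup parsing, which this sketch leaves out; here the width and height are simply passed in.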
4 Replies
Hall•5mo ago
Someone will reply to you shortly. This post was marked as solved by foxt141.
eastern-cyan•5mo ago
Hi! Could you please post in #💻hire-freelancers? This topic is for support with Python SDKs.
secure-lavender•4mo ago
Hi, do you have a target website for scraping?
