Is it possible to make the /markdown API return both the original HTML and the extracted markdown? I'm building something that requires verifying results afterwards, which needs a "snapshot" of the original HTML DOM (like the Wayback Machine).
Rate limits are enforced with a fixed per-second fill rate. For example, the Workers Paid limit of 60 requests per minute translates to 1 request per second. This means you cannot send all 60 requests at once; the API expects them to be spread evenly over the minute.
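To illustrate the per-second fill rate described above, here is a minimal client-side pacing sketch. The helper names (`intervalMs`, `schedule`, `runPaced`) are hypothetical and not part of any Cloudflare SDK; the only fact taken from the text is that a per-minute limit is enforced as an even per-second rate.

```typescript
// Minimum spacing between requests for a given per-minute limit.
// 60 requests/minute -> 1000 ms between requests.
function intervalMs(requestsPerMinute: number): number {
  if (requestsPerMinute <= 0) {
    throw new RangeError("requestsPerMinute must be positive");
  }
  return 60000 / requestsPerMinute;
}

// Offsets (in ms from the start) at which each of `count` requests may be sent.
function schedule(count: number, requestsPerMinute: number): number[] {
  const gap = intervalMs(requestsPerMinute);
  return Array.from({ length: count }, (_, i) => Math.round(i * gap));
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run tasks sequentially, never starting two closer together than the interval.
async function runPaced<T>(
  tasks: Array<() => Promise<T>>,
  requestsPerMinute: number,
): Promise<T[]> {
  const gap = intervalMs(requestsPerMinute);
  const results: T[] = [];
  let nextStart = Date.now();
  for (const task of tasks) {
    const wait = nextStart - Date.now();
    if (wait > 0) await sleep(wait);
    nextStart = Date.now() + gap;
    results.push(await task());
  }
  return results;
}
```

With a limit of 60 per minute, `schedule(3, 60)` yields offsets of 0, 1000 and 2000 ms, so a batch of 60 calls takes about a minute instead of being rejected as a burst.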
The /content endpoint instructs the browser to navigate to a website and capture the fully rendered HTML of a page, including the head section, after JavaScript execution. This is ideal for capturing content from JavaScript-heavy or interactive websites.
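A rough sketch of calling the /content endpoint over REST follows. The URL path, the Bearer token header, and the `{ "url": ... }` body reflect my understanding of the Cloudflare REST API and should be checked against the current docs; the helper names and the `FetchLike` type are hypothetical. Because I am not certain whether the body comes back as raw HTML or as a JSON envelope with a `result` string, `extractHtml` accepts either.

```typescript
interface ContentRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Minimal structural type so the sketch does not depend on a global fetch type.
type FetchLike = (
  url: string,
  init: ContentRequest["init"],
) => Promise<{ ok: boolean; status: number; text(): Promise<string> }>;

// Build the POST request for the /content endpoint (assumed path and body shape).
function buildContentRequest(
  accountId: string,
  apiToken: string,
  targetUrl: string,
): ContentRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${encodeURIComponent(accountId)}/browser-rendering/content`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ url: targetUrl }),
    },
  };
}

// Accept either a JSON envelope with a string `result` or a raw HTML body.
function extractHtml(bodyText: string): string {
  try {
    const parsed = JSON.parse(bodyText);
    if (parsed && typeof parsed.result === "string") return parsed.result;
  } catch {
    // Not JSON: treat the body as the HTML itself.
  }
  return bodyText;
}

async function fetchRenderedHtml(
  fetchFn: FetchLike,
  accountId: string,
  apiToken: string,
  targetUrl: string,
): Promise<string> {
  const { url, init } = buildContentRequest(accountId, apiToken, targetUrl);
  const res = await fetchFn(url, init);
  if (!res.ok) throw new Error(`/content request failed: HTTP ${res.status}`);
  return extractHtml(await res.text());
}
```

In a Worker or Node 18+, the global `fetch` can be passed as `fetchFn`.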
I don’t want to make 2 requests for a single scrape. I’m thinking that for the /json or /markdown endpoint, the original content is already available, so there could be an option to return both the JSON/markdown and the original content.
I just saw "Currently, the Cloudflare dashboard displays usage metrics exclusively for the Workers Bindings method. Usage data for the REST API is not yet available in the dashboard. We are actively working on adding REST API usage metrics to the dashboard.". How do I monitor my usage somewhere else?
Does anyone use this api for retrieving the opengraph meta data of a website? Are there any configurations which I have to consider for best practice? Appreciate any help
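One possible approach, sketched under assumptions: since the /content endpoint returns the rendered HTML including the head section, Open Graph tags can be pulled out of that string afterwards. `extractOpenGraph` is a hypothetical helper, and the regex-based parsing is a simplification (it will miss attribute values containing `>`); a real HTML parser would be more robust.

```typescript
// Collect <meta property="og:..." content="..."> pairs from an HTML string.
function extractOpenGraph(html: string): Record<string, string> {
  const tags: Record<string, string> = {};
  const metaTags = html.match(/<meta\b[^>]*>/gi) ?? [];
  for (const tag of metaTags) {
    // Attribute order varies between sites, so match each attribute separately.
    const property = /\bproperty\s*=\s*["'](og:[^"']+)["']/i.exec(tag);
    const content = /\bcontent\s*=\s*(?:"([^"]*)"|'([^']*)')/i.exec(tag);
    if (property && content) {
      tags[property[1]] = content[1] ?? content[2] ?? "";
    }
  }
  return tags;
}
```

For example, passing the HTML returned by /content to `extractOpenGraph` gives an object keyed by `og:title`, `og:image`, and so on, for whichever tags the page defines.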
Browser Rendering uses a managed Chromium environment that includes a standard set of fonts. When you generate a screenshot or PDF, text is rendered using the fonts available in this environment.
Hey, I was trying to use the REST API endpoint /content to get client-side rendered content from Algolia's InstantSearch. I tried everything but I didn't get any items.
Browser Rendering is for running a browser on the server, and from your question it looks like you want to control a browser running on your own computer instead? If so, I'd recommend either looking into making a browser extension, or using Selenium or Puppeteer.
we've worked on it a bit for REST API! but not working 100% of time yet (still fixing bugs). are you needing it for REST API or workers bindings? bc if REST API, it might work for you now
Any insights into this timeout issue? "Page.printToPDF timed out. Increase the 'protocolTimeout' setting in launch/connect calls for a higher timeout if needed."
I've set the protocolTimeout to 0 in my config: const browser = await puppeteer.launch(env.BROWSER,{ headless: 'new', keep_alive: 600000, args: [ "--disable-setuid-sandbox", "--no-sandbox", "--single-process", "--no-zygote", ], protocolTimeout: 0, });
Any other suggestions for avoiding the timeout?
What my app does: I pass it an HTML string, which it renders via page.setContent(html...), then it converts it to PDF using page.pdf({...,timeout: 0})
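The flow described above (render an HTML string, then print it to PDF) can be sketched roughly as follows. The `PdfPage` and `PdfBrowser` interfaces are hypothetical minimal shapes covering only the Puppeteer methods used here, so the sketch compiles without the library; the `waitUntil`, `format` and `printBackground` values are illustrative choices, not the poster's actual settings.

```typescript
// Minimal structural types for the Puppeteer methods this sketch uses.
interface PdfPage {
  setContent(
    html: string,
    options?: { waitUntil?: "load" | "domcontentloaded"; timeout?: number },
  ): Promise<void>;
  pdf(options?: {
    format?: string;
    printBackground?: boolean;
    timeout?: number;
  }): Promise<Uint8Array>;
  close(): Promise<void>;
}

interface PdfBrowser {
  newPage(): Promise<PdfPage>;
}

// Render an HTML string in a fresh page and return the PDF bytes.
async function htmlToPdf(browser: PdfBrowser, html: string): Promise<Uint8Array> {
  const page = await browser.newPage();
  try {
    // Wait for the load event so images and stylesheets are in place.
    await page.setContent(html, { waitUntil: "load" });
    return await page.pdf({ format: "A4", printBackground: true });
  } finally {
    // Always release the page, even if rendering fails or times out.
    await page.close();
  }
}
```

In a Worker, the `browser` argument would presumably be the object returned by `puppeteer.launch(env.BROWSER)`; closing the page in `finally` keeps a failed render from leaving it open.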
I switched to use the playwright provider instead of puppeteer and it loaded significantly faster with no time out, so will stick with that and likely not turn back to puppeteer
Would be great to have better observability, e.g. which workers caused browser rendering events in the logs. Just spent a long time trying to figure out why a worker was causing so many browser rendering hours, when it turns out it was a totally unsuspected one instead!
We're working on implementation of the rendering api for generating PDF files with the puppeteer provider. In evaluating this against our current solution I'm seeing that some of the custom fonts our customers use aren't supported (I've seen/reviewed the web page you guys have published for this). On that page you mention to come here to the discord to request new fonts to be supported. Before we start bombarding you with requests I wanted to get an idea for what that process and timeline looks like. For now we're going to encourage our customers to use supported fallback fonts in their CSS rules but if a particularly important customer feels strongly about something like Calibre or Open Sans - how likely is that to become supported and in what span of time?
Hey guys, I noticed the Browser Rendering API is getting detected as a bot, so I would like to use my own custom implementation. Is there a way I can connect to my own Playwright instance through WebSockets?
Nope, won't work, because Cloudflare Workers has a limit on WebSocket message size, so even if you patch it, the CDP messages will be too big inside Workers. See https://github.com/cloudflare/workerd/issues/4649
I’m working on adding support for connecting to any arbitrary WebSocket Chrome DevTools Protocol server for @cloudflare/playwright (in draft cloudflare/playwright#59) but seeing errors when the incoming ...
This adds support for using any ws CDP based server to connect to, without breaking existing usage of the browser rendering api. Example usage with Browserbase: import { Browserbase } from '...