Cloudflare Developers•7mo ago

Extract images from a PDF in a Worker via unpdf, and recognize text via AI

Any suggestions here? I have tried several approaches without luck. Providing a real image file works perfectly fine, of course, but passing an image extracted from the PDF fails with this error:

Failed to process PDF: pdf/17457.pdf AiError: 3010: Invalid or incomplete input for the model: Unsupported image data

import { extractImages, extractText, getDocumentProxy } from 'unpdf';

// Load the PDF and pull the images from page 1.
const pdf = await getDocumentProxy(pdfBuffer);
const extractedImages = await extractImages(pdf, 1);

// extractedImage is one entry of extractedImages; how it is selected is not shown here.
const { description } = await env.AI.run('@cf/llava-hf/llava-1.5-7b-hf', {
    image: [...extractedImage],
    // Also tried: image: extractedImage,
    prompt: 'If you can recognize the text in the logo, provide it in the output and nothing else.',
    max_tokens: CONFIG.AI.MAX_TOKENS,
});
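
One possible cause, offered as a guess rather than a confirmed diagnosis: if the installed unpdf version returns decoded raw pixels for each image (an object with `data`, `width`, `height` and `channels`) instead of an encoded file, the model receives bytes that are not a PNG or JPEG and may reject them as unsupported image data. The sketch below shows one way to wrap raw pixels in a minimal, uncompressed PNG without any dependencies. The `RawImage` shape and the `encodePng` helper are hypothetical names introduced for illustration and are not part of unpdf or Workers AI.

```typescript
// Assumed shape of one extracted image: raw 8-bit pixels plus metadata.
interface RawImage {
  data: ArrayLike<number>; // width * height * channels bytes
  width: number;
  height: number;
  channels: 1 | 3 | 4; // grayscale, RGB, or RGBA
}

// Big-endian bytes of a 32-bit unsigned integer.
function u32(n: number): number[] {
  return [(n >>> 24) & 255, (n >>> 16) & 255, (n >>> 8) & 255, n & 255];
}

// CRC-32 as required for PNG chunks (bitwise, slow but short).
function crc32(bytes: Uint8Array): number {
  let crc = 0xffffffff;
  for (let i = 0; i < bytes.length; i++) {
    crc ^= bytes[i];
    for (let k = 0; k < 8; k++) crc = crc & 1 ? (crc >>> 1) ^ 0xedb88320 : crc >>> 1;
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Adler-32 checksum that terminates a zlib stream.
function adler32(bytes: Uint8Array): number {
  let a = 1;
  let b = 0;
  for (let i = 0; i < bytes.length; i++) {
    a = (a + bytes[i]) % 65521;
    b = (b + a) % 65521;
  }
  return ((b << 16) | a) >>> 0;
}

// One PNG chunk: length, type, body, CRC over type + body.
function chunk(type: string, body: Uint8Array): Uint8Array {
  const out = new Uint8Array(12 + body.length);
  out.set(u32(body.length), 0);
  for (let i = 0; i < 4; i++) out[4 + i] = type.charCodeAt(i);
  out.set(body, 8);
  out.set(u32(crc32(out.subarray(4, 8 + body.length))), 8 + body.length);
  return out;
}

// Hypothetical helper: wrap raw pixels in an uncompressed (stored-deflate) PNG.
function encodePng(img: RawImage): Uint8Array {
  const stride = img.width * img.channels;
  if (img.data.length !== stride * img.height) {
    throw new Error('pixel buffer size does not match width * height * channels');
  }
  const colorType = img.channels === 1 ? 0 : img.channels === 3 ? 2 : 6;

  // Each scanline is prefixed with filter byte 0 (no filtering).
  const raw = new Uint8Array((stride + 1) * img.height);
  for (let y = 0; y < img.height; y++) {
    for (let x = 0; x < stride; x++) raw[y * (stride + 1) + 1 + x] = img.data[y * stride + x];
  }

  // zlib stream made of stored blocks of at most 65535 bytes each.
  const blocks = Math.max(1, Math.ceil(raw.length / 65535));
  const zlib = new Uint8Array(2 + 5 * blocks + raw.length + 4);
  zlib[0] = 0x78;
  zlib[1] = 0x01;
  let pos = 2;
  for (let i = 0; i < blocks; i++) {
    const part = raw.subarray(i * 65535, (i + 1) * 65535);
    zlib[pos++] = i === blocks - 1 ? 1 : 0; // final-block flag, block type 00
    zlib[pos++] = part.length & 255;
    zlib[pos++] = part.length >>> 8;
    zlib[pos++] = ~part.length & 255;
    zlib[pos++] = (~part.length >>> 8) & 255;
    zlib.set(part, pos);
    pos += part.length;
  }
  zlib.set(u32(adler32(raw)), pos);

  const ihdr = new Uint8Array(u32(img.width).concat(u32(img.height), [8, colorType, 0, 0, 0]));
  const parts = [
    new Uint8Array([137, 80, 78, 71, 13, 10, 26, 10]), // PNG signature
    chunk('IHDR', ihdr),
    chunk('IDAT', zlib),
    chunk('IEND', new Uint8Array(0)),
  ];
  let total = 0;
  for (const p of parts) total += p.length;
  const png = new Uint8Array(total);
  let offset = 0;
  for (const p of parts) {
    png.set(p, offset);
    offset += p.length;
  }
  return png;
}
```

If that assumption holds, something like `image: Array.from(encodePng(extractedImages[0]))` could be passed to `env.AI.run` in place of the raw pixels. The output is not compressed, so large images produce large payloads; a real encoder or a resize step may be needed if the request size becomes a problem.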
