BL4CKLIST・CODING•5d ago

I'm having a problem with a project using Python, a Raspberry Pi, and a VLM

I'm building an AI robot with a Raspberry Pi, a camera, and a VLM to detect the environment and interact with it. The problem is that the AI model cannot return precise x,y coordinates. Can someone tell me how to prompt-engineer it to make this work, or what the optimal way is to obtain the x,y coordinates of the objects detected in the image? Thank you

2 Replies

Bl4cklist🔥System•5d ago

:hack: - Thanks for your question! › Our community is happy to help you with your problem! Please support the people who helped you by accepting the answer that solved your problem. - :accept: = Accepts the answer and marks your problem as solved. Alternatively, you can use /solved if you figured it out yourself. Push your post for more attention with /push.

Jannikjbi•2d ago

You won't get exact x,y coordinates just by prompting a vision language model. You'll need something like YOLO or another object detection model that outputs those coordinates directly.
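To make that concrete, here is a rough sketch of the step that comes after the detector runs: turning bounding boxes into x,y points the robot can use. It assumes the detector gives you boxes as pixel corners (x1, y1, x2, y2), which is a common format but not guaranteed, so check what your library actually returns. The `Detection` class and the function names are made up for this example and are not part of YOLO or any other library.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2) in pixels


@dataclass
class Detection:
    """Hypothetical container for one detector result."""
    label: str
    confidence: float
    box: Box


def box_center(box: Box) -> Tuple[float, float]:
    """Return the pixel coordinates of the center of a corner-format box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def normalized_center(box: Box, image_width: int, image_height: int) -> Tuple[float, float]:
    """Return the box center scaled to 0..1, independent of camera resolution."""
    cx, cy = box_center(box)
    return (cx / image_width, cy / image_height)


def locate(detections: List[Detection], min_confidence: float = 0.5) -> List[Tuple[str, float, float]]:
    """Keep detections at or above min_confidence and return (label, x, y) per object."""
    return [
        (d.label, *box_center(d.box))
        for d in detections
        if d.confidence >= min_confidence
    ]


if __name__ == "__main__":
    # Made-up example values, not real detector output.
    found = [
        Detection("cup", 0.9, (100, 50, 300, 250)),
        Detection("chair", 0.3, (0, 0, 10, 10)),
    ]
    print(locate(found))
```

If you still want the VLM in the loop, one option is to let the detector produce the coordinates and pass the labels and positions to the VLM as text, so the language model only reasons about them instead of guessing pixel values.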
