VLM Query

Uses a Vision Language Model (VLM) to answer natural-language questions about images by analyzing their visual content.

Quick Start

To get started:

  • Choose a model from the Model to use dropdown
  • Send image path via msg.payload.image_path
  • Send prompt via msg.payload.prompt
  • Receive answer in msg.payload
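The steps above can be sketched in a Function block that assembles the input message. The helper name and the sample values are illustrative, not part of the block's API:

```javascript
// Build the input message for the VLM Query block.
// Both fields are required; the values below are illustrative.
function buildVlmQueryInput(imagePath, prompt) {
    return {
        payload: {
            image_path: imagePath, // relative path on shared storage
            prompt: prompt         // natural-language question about the image
        }
    };
}

const msg = buildVlmQueryInput(
    "images/document.jpg",
    "What objects are visible in this image?"
);
```

Wiring this Function block directly into the VLM Query block ensures both required fields are always present.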

Configuration

VLM Query configuration showing model and query options

Model to use (required)

Select a vision language model from the dropdown menu.

Common Input Format (All Algorithms)

msg.payload.image_path (string)

Relative path of the image file on shared storage.

Example: "images/document.jpg"

msg.payload.prompt (string)

Question or prompt about the image.

Example: "What objects are visible in this image?"

Common Output Format (All Algorithms)

msg.payload (object)

msg.payload contains an output field with the answer to the query.

Example: {"output": "The image contains a car, a tree, and a building."}

Example

Input (msg.payload)

{
  "image_path": "images/document.jpg",
  "prompt": "What objects are visible in this image?"
}

Output (msg.payload)

{
  "output": "The image contains a car, a tree, and a building."
}
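A downstream Function block can pull the answer out of the output message. This is a sketch assuming the output shape shown above; the helper name is illustrative:

```javascript
// Extract the answer string from a VLM Query output message.
// Throws if the message does not match the documented shape,
// so malformed output fails loudly instead of propagating.
function extractAnswer(msg) {
    if (!msg.payload || typeof msg.payload.output !== "string") {
        throw new Error("Unexpected VLM Query output format");
    }
    return msg.payload.output;
}
```

For the example output above, `extractAnswer` returns the plain answer string, ready to route to logging or a UI block.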

Errors

When the block fails, it raises an error. Use a Catch block in your flow to handle failures and inspect the error payload.
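In Node-RED-style flows, a Catch block typically attaches the error details to `msg.error`; the exact shape depends on your runtime, so treat the following Function-block sketch (and the `handleVlmError` name) as an assumption to adapt:

```javascript
// Sketch of handling a VLM Query failure downstream of a Catch block.
// Assumes the runtime attaches the error as msg.error.message.
function handleVlmError(msg) {
    const reason = (msg.error && msg.error.message) || "unknown error";
    // Replace the payload with a structured report so later blocks
    // can distinguish a failed query from a normal answer.
    msg.payload = { output: null, error: reason };
    return msg;
}
```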

Common mistakes

  • Missing image path: msg.payload.image_path is required and must point to a file on shared storage.
  • Empty prompt: msg.payload.prompt must be a non-empty string.
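Both mistakes can be caught before the message reaches the VLM Query block with a small validation Function block. This is a sketch; the helper name is illustrative:

```javascript
// Validate the input message before it reaches the VLM Query block.
// Throws on bad input so a Catch block in the flow can handle it.
function validateVlmInput(msg) {
    const p = msg.payload || {};
    if (typeof p.image_path !== "string" || p.image_path.length === 0) {
        throw new Error("msg.payload.image_path is required");
    }
    if (typeof p.prompt !== "string" || p.prompt.trim().length === 0) {
        throw new Error("msg.payload.prompt must be a non-empty string");
    }
    return msg;
}
```

Failing fast here gives a clearer error than letting the query block reject the message.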

Best Practices

  • Use clear, specific questions for better answers
  • Ensure images are well-lit and clear
  • Test different models to find the best fit
  • Validate answers in production applications