I am running the following model: ai/gemma3n (Quantization: IQ2_XXS/Q4_K_M)
I have n8n installed as a Docker container and can successfully connect to the model, prompt, and receive a response.
However, when I submit an image (base64-encoded in the JSON payload, as the OpenAI-compatible API requires), I receive the error included at the end of this message, which suggests that the mmproj file (the llama.cpp multimodal projector, which Ollama also uses under the hood) is not loaded in Docker. Am I correct in saying that I cannot use the Docker OpenAI-compatible API for multimodal vision?
If I am incorrect and I can submit images, audio, and video (the multimodal options listed for the Docker model), how do I proceed?
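For context, here is roughly the shape of the request I am sending (sketched in Python for clarity; in my setup it is actually an n8n HTTP Request node, and the URL/port assume Docker Model Runner's default host-side TCP endpoint, so adjust if yours differs):

import base64
import requests

# Assumption: Docker Model Runner's default host-side TCP endpoint;
# change host/port if you enabled a different one or call it from a container.
URL = "http://localhost:12434/engines/v1/chat/completions"

with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "model": "ai/gemma3n",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {
                    # Standard OpenAI vision format: base64 image as a data URL
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        }
    ],
}

response = requests.post(URL, json=payload, timeout=120)
print(response.status_code, response.text)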
{
  "errorMessage": "The service was not able to process your request",
  "errorDescription": "image input is not supported - hint: if this is unexpected, you may need to provide the mmproj",
  "errorDetails": {
    "rawErrorMessage": [
      "500 - \"{\\\"error\\\":{\\\"code\\\":500,\\\"message\\\":\\\"image input is not supported - hint: if this is unexpected, you may need to provide the mmproj\\\",\\\"type\\\":\\\"server_error\\\"}}\""
    ],
    "httpCode": "500"
  },
  "n8nDetails": {
    "nodeName": "HTTP Request",
    "nodeType": "n8n-nodes-base.httpRequest",
    "nodeVersion": 4.2,
    "itemIndex": 0,
    "time": "7/16/2025, 12:56:11 PM",
    "n8nVersion": "1.102.3 (Self Hosted)",
    "binaryDataMode": "default",
    "stackTrace": [
      "NodeApiError: The service was not able to process your request",
      "    at ExecuteContext.requestWithAuthentication (/usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-core@file+packages+core_openai@5.8.1_ws@8.17.1_zod@3.25.67_/node_modules/n8n-core/src/execution-engine/node-execution-context/utils/request-helper-functions.ts:1476:10)",
      "    at processTicksAndRejections (node:internal/process/task_queues:105:5)",
      "    at ExecuteContext.requestWithAuthentication (/usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-core@file+packages+core_openai@5.8.1_ws@8.17.1_zod@3.25.67_/node_modules/n8n-core/src/execution-engine/node-execution-context/utils/request-helper-functions.ts:1762:11)"
    ]
  }
}
I see this behaviour too. I've configured Docker Model Runner on Docker Engine on Ubuntu, added the OpenAI API endpoint to Open WebUI, and I get the same response:
500: image input is not supported - hint: if this is unexpected, you may need to provide the mmproj
This is not my official response, but as of this writing (Aug 2025), Docker Model Runner appears to be built on llama.cpp, the same inference engine Ollama builds on. Ollama (also as of this writing) supports multimodal inputs (describing an image, transcribing audio, etc.); Docker Model Runner, however, does not appear to accept binary inputs like that with its models. So when you see "multimodal" or "supports vision, images, audio, etc." on a model card, that means the model supports it, not necessarily that Docker does. I might be proven wrong here, and I am sure Docker is working on it, but currently I think the runner handles text only.
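One way to check that it is the runtime rather than the model rejecting the input is to send the same request with and without the image part (a sketch under the same default-endpoint assumption as the example above; the base64 string is a throwaway placeholder, not a real image):

import requests

# Assumption: default Docker Model Runner endpoint, as in the earlier sketch.
URL = "http://localhost:12434/engines/v1/chat/completions"

text_part = {"type": "text", "text": "Say hello."}
# Placeholder base64 (just the PNG magic bytes). If the mmproj is missing,
# the request is rejected before the image would ever be decoded.
image_part = {"type": "image_url",
              "image_url": {"url": "data:image/png;base64,iVBORw0KGgo="}}

for content in ([text_part], [text_part, image_part]):
    payload = {"model": "ai/gemma3n",
               "messages": [{"role": "user", "content": content}]}
    r = requests.post(URL, json=payload, timeout=120)
    print(f"{len(content)} part(s): HTTP {r.status_code}")

Text-only succeeding while text-plus-image returns the 500 above points at the missing mmproj in the runtime, not the model weights, as the blocker.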
Okay, so the minute I responded here, I received an update from the Model Runner team on GitHub stating that the feature is available, and my ticket was closed, so I assume someone was watching this thread. That is good news! I have not tried the feature yet, but I will share the link here for your reference: