I am running the following model: ai/gemma3n (Quantization: IQ2_XXS/Q4_K_M)
I have n8n installed as a Docker container and can successfully connect to the model, prompt, and receive a response.
However, when I submit an image (base64-encoded in the JSON payload, as the OpenAI-compatible API requires), I receive the error included at the end of this message, which suggests that the mmproj file (the llama.cpp multimodal projector, which Ollama also uses under the hood) is not loaded in Docker. Am I correct in saying that I cannot use the Docker OpenAI-compatible API for multimodal vision?
If I am incorrect and I can submit images, audio, and video (the multimodal options listed for the Docker model), how do I proceed?
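For context, here is roughly the shape of the request I am sending (sketched in Python for clarity; in my setup it is actually an n8n HTTP Request node, and the URL/port assume Docker Model Runner's default host-side TCP endpoint, so adjust if yours differs):

import base64
import requests

# Assumption: Docker Model Runner's default host-side TCP endpoint;
# change host/port if you enabled a different one or call it from a container.
URL = "http://localhost:12434/engines/v1/chat/completions"

with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "model": "ai/gemma3n",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {
                    # Standard OpenAI vision format: base64 image as a data URL
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        }
    ],
}

response = requests.post(URL, json=payload, timeout=120)
print(response.status_code, response.text)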
{
  "errorMessage": "The service was not able to process your request",
  "errorDescription": "image input is not supported - hint: if this is unexpected, you may need to provide the mmproj",
  "errorDetails": {
    "rawErrorMessage": [
      "500 - \"{\\\"error\\\":{\\\"code\\\":500,\\\"message\\\":\\\"image input is not supported - hint: if this is unexpected, you may need to provide the mmproj\\\",\\\"type\\\":\\\"server_error\\\"}}\""
    ],
    "httpCode": "500"
  },
  "n8nDetails": {
    "nodeName": "HTTP Request",
    "nodeType": "n8n-nodes-base.httpRequest",
    "nodeVersion": 4.2,
    "itemIndex": 0,
    "time": "7/16/2025, 12:56:11 PM",
    "n8nVersion": "1.102.3 (Self Hosted)",
    "binaryDataMode": "default",
    "stackTrace": [
      "NodeApiError: The service was not able to process your request",
      "    at ExecuteContext.requestWithAuthentication (/usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-core@file+packages+core_openai@5.8.1_ws@8.17.1_zod@3.25.67_/node_modules/n8n-core/src/execution-engine/node-execution-context/utils/request-helper-functions.ts:1476:10)",
      "    at processTicksAndRejections (node:internal/process/task_queues:105:5)",
      "    at ExecuteContext.requestWithAuthentication (/usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-core@file+packages+core_openai@5.8.1_ws@8.17.1_zod@3.25.67_/node_modules/n8n-core/src/execution-engine/node-execution-context/utils/request-helper-functions.ts:1762:11)"
    ]
  }
}
I see this behaviour too. I've configured Docker Model Runner on Docker Engine on Ubuntu, added the OpenAI API endpoint to Open WebUI, and I get the same response:
500: image input is not supported - hint: if this is unexpected, you may need to provide the mmproj
This is not my official response, but as of this writing (Aug 2025), Docker Model Runner appears to be built on llama.cpp, the same inference engine Ollama builds on. Ollama (also as of this writing) supports multimodal inputs (describing an image, transcribing audio, etc.); Docker Model Runner, however, does not appear to accept binary inputs like that with its models. So when you see "multimodal" or "supports vision, images, audio, etc." on a model card, that means the model supports it, not necessarily that Docker does. I might be proven wrong here, and I am sure Docker is working on it, but currently I think the runner handles text only.
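One way to check that it is the runtime rather than the model rejecting the input is to send the same request with and without the image part (a sketch under the same default-endpoint assumption as the example above; the base64 string is a throwaway placeholder, not a real image):

import requests

# Assumption: default Docker Model Runner endpoint, as in the earlier sketch.
URL = "http://localhost:12434/engines/v1/chat/completions"

text_part = {"type": "text", "text": "Say hello."}
# Placeholder base64 (just the PNG magic bytes). If the mmproj is missing,
# the request is rejected before the image would ever be decoded.
image_part = {"type": "image_url",
              "image_url": {"url": "data:image/png;base64,iVBORw0KGgo="}}

for content in ([text_part], [text_part, image_part]):
    payload = {"model": "ai/gemma3n",
               "messages": [{"role": "user", "content": content}]}
    r = requests.post(URL, json=payload, timeout=120)
    print(f"{len(content)} part(s): HTTP {r.status_code}")

Text-only succeeding while text-plus-image returns the 500 above points at the missing mmproj in the runtime, not the model weights, as the blocker.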
Okay, so the minute I responded here, I received an update from the Model Runner team on GitHub stating that the feature is available, and my ticket was closed, so I assume someone was watching this thread. That is good news! I have not tried the feature yet, but I will share the link here for your reference: