How to monitor/diagnose LLM model execution?

Good Day,

I am currently evaluating the new Docker Desktop feature for running LLM models with Docker. Is there any way to diagnose/monitor the execution of an LLM? Currently, after calling an LLM, quite some time passes (due to my PC) before any output comes back. It would be great to have some means of probing the current state the LLM is in / what it is doing, e.g. tool calls, MCP actions, …

Thank you and best regards,
Uli

You could try

docker model logs --follow

in another terminal. This is what I got from docker model run ai/llama3.2:latest up to sending a simple “Hi” message to the model.

[2025-08-03T18:01:04.780505000Z][inference.model-manager] Getting model by reference: ai/llama3.2:latest
[2025-08-03T18:01:04.781339000Z][inference.model-manager][E] Failed to get model: model not found reference: ai/llama3.2:latest
[2025-08-03T18:01:04.781964000Z][inference] Pulling model: ai/llama3.2:latest
[2025-08-03T18:01:04.782415000Z][inference.model-manager] Starting model pull: ai/llama3.2:latest
[2025-08-03T18:01:05.911036000Z][inference.model-manager] Remote model digest: sha256:436bb282b41968a83638482999980267ca8d7e8b5574604460efa9efff11cf59
[2025-08-03T18:01:05.911479000Z][inference.model-manager] Model not found in local store, pulling from remote: ai/llama3.2:latest
[2025-08-03T18:03:38.700910000Z][inference.model-manager] Getting model by reference: ai/llama3.2:latest
[2025-08-03T18:03:38.708529000Z] srv  params_from_: Chat format: Content-only
[2025-08-03T18:03:38.709646000Z] slot launch_slot_: id  0 | task 100 | processing task
[2025-08-03T18:03:38.709780000Z] slot update_slots: id  0 | task 100 | new prompt, n_ctx_slot = 4096, n_keep = 0, n_prompt_tokens = 36
[2025-08-03T18:03:38.709908000Z] slot update_slots: id  0 | task 100 | need to evaluate at least 1 token for each active slot, n_past = 36, n_prompt_tokens = 36
[2025-08-03T18:03:38.710120000Z] slot update_slots: id  0 | task 100 | kv cache rm [35, end)
[2025-08-03T18:03:38.710214000Z] slot update_slots: id  0 | task 100 | prompt processing progress, n_past = 36, n_tokens = 1, progress = 0.027778
[2025-08-03T18:03:38.710395000Z] slot update_slots: id  0 | task 100 | prompt done, n_past = 36, n_tokens = 1
[2025-08-03T18:03:39.614485000Z] slot      release: id  0 | task 100 | stop processing: n_past = 43, truncated = 0
[2025-08-03T18:03:39.614719000Z] slot print_timing: id  0 | task 100 |
[2025-08-03T18:03:39.614760000Z] prompt eval time =     593.08 ms /     1 tokens (  593.08 ms per token,     1.69 tokens per second)
[2025-08-03T18:03:39.614799000Z]        eval time =     307.84 ms /     8 tokens (   38.48 ms per token,    25.99 tokens per second)
[2025-08-03T18:03:39.614834000Z]       total time =     900.92 ms /     9 tokens
[2025-08-03T18:03:39.614937000Z] srv  update_slots: all slots are idle
[2025-08-03T18:03:39.615102000Z] srv  log_server_r: request: POST /v1/chat/completions  200
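
For anyone following along, this is the two-terminal workflow the output above came from (a minimal sketch using the same ai/llama3.2:latest reference; substitute your own model):

# terminal 1: pull the model if needed and start an interactive chat
docker model run ai/llama3.2:latest

# terminal 2: stream the inference engine's logs while you interact with it
docker model logs --follow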

Hello rimelek,
thank you very much for your reply!
I have already looked at the logs section of the models in Docker Desktop. Actually, I am looking for more detailed updates / more insight into the interactions currently going on.
Thank you and best regards,
Uli

I’m not aware of more detailed logs. There is a --debug flag for docker model run, but I didn’t notice any difference.
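
For completeness, a sketch of how that flag is typically passed (assuming it is given like other docker model run options; as noted, any extra output it produces should show up in the same logs):

docker model run --debug ai/llama3.2:latest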

If you find anything in the OpenAI API reference that would help you, you can enable “host-side TCP support” in Docker Desktop’s settings on the “Beta features” tab, but I’m sure you already know that, as it is also in the mentioned documentation.
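
As a hedged illustration of what host-side TCP support gives you: once it is enabled, the same /v1/chat/completions endpoint that appears in the log above can be called from the host, for example with curl. The port 12434 and the /engines/v1 prefix are assumptions based on the documented defaults; adjust them to whatever your “Beta features” tab shows.

# assumption: host-side TCP support enabled on the default port 12434
curl http://localhost:12434/engines/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "ai/llama3.2:latest",
        "messages": [{"role": "user", "content": "Hi"}]
      }'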

If you have a specific feature you would like to see in the model runner, you can ask for it on the Roadmap.


Dear rimelek,
Thank you for your reply. Then I’ll do so; Docker will surely provide excellent solutions. :slight_smile:
