Running:
Windows 11 24H2, OS Build 26100.4652
Docker Desktop 4.43.2
The issue/feature:
The model I use is large and takes time to load in memory. If I don’t ask it a question before the end of 5 minutes, it unloads. It will happily reload after, but now I have to wait for it to reload. Is there a keepalive setting somewhere to keep the model alive in memory for an indeterminate amount of time?
Steps to replicate issue/feature:
- Open Docker Desktop
- Select [Models] tab
- Select an existing model to open the chat window
- Ask a question …
… the com.docker.llama-server.exe starts and loads the model into memory, answers the question - Let sit idle and after 5 minutes, com.docker.llama-server.exe unloads and the model is cleaned from memory
Tried with no success
- Disabled Docker Desktop Settings → Resources → Resource Saver
- Looked over docker *-options.json files for possible relevant parameters