Run other Models
Do you already have a model file? Skip to Run models manually.
To load models into LocalAI, you can either set up models manually or configure LocalAI to pull them from external sources, such as Huggingface, and configure the model for you.
To do that, you can point LocalAI to the URL of a YAML configuration file. However, LocalAI also ships with a number of popular model configurations embedded in the binary. Below you can find the list of pre-built model configurations; see Model customization for how to configure models from URLs.
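As a minimal sketch of the URL-based approach, a YAML configuration hosted somewhere reachable can be passed as the model argument when starting the container (the URL below is only a placeholder for your own configuration file):

```bash
# Start LocalAI and load a model from a YAML configuration hosted at a URL.
# The URL is a placeholder for illustration; point it at your own config file.
docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg-core \
  https://example.com/configurations/phi-2.yaml
```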
There are different categories of models: LLMs, Multimodal LLMs, Embeddings, Audio to Text, and Text to Audio, depending on the backend being used and the model architecture.
💡 Don’t need GPU acceleration? Use the CPU images, which are lighter and do not have Nvidia dependencies. The first table below uses these CPU images.
| Model | Category | Docker command |
| --- | --- | --- |
| phi-2 | LLM | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg-core phi-2` |
| 🌋 bakllava | Multimodal LLM | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg-core bakllava` |
| 🌋 llava-1.5 | Multimodal LLM | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg-core llava-1.5` |
| 🌋 llava-1.6-mistral | Multimodal LLM | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg-core llava-1.6-mistral` |
| 🌋 llava-1.6-vicuna | Multimodal LLM | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg-core llava-1.6-vicuna` |
| mistral-openorca | LLM | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg-core mistral-openorca` |
| bert-cpp | Embeddings | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg-core bert-cpp` |
| all-minilm-l6-v2 | Embeddings | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg all-minilm-l6-v2` |
| whisper-base | Audio to Text | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg-core whisper-base` |
| rhasspy-voice-en-us-amy | Text to Audio | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg-core rhasspy-voice-en-us-amy` |
| 🐸 coqui | Text to Audio | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg coqui` |
| 🐶 bark | Text to Audio | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg bark` |
| 🔊 vall-e-x | Text to Audio | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg vall-e-x` |
| mixtral-instruct Mixtral-8x7B-Instruct-v0.1 | LLM | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg-core mixtral-instruct` |
| tinyllama-chat original model | LLM | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg-core tinyllama-chat` |
| dolphin-2.5-mixtral-8x7b | LLM | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg-core dolphin-2.5-mixtral-8x7b` |
| 🐍 mamba | LLM | GPU-only |
| animagine-xl | Text to Image | GPU-only |
| transformers-tinyllama | LLM | GPU-only |
| codellama-7b (with transformers) | LLM | GPU-only |
| codellama-7b-gguf (with llama.cpp) | LLM | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg-core codellama-7b-gguf` |
| hermes-2-pro-mistral | LLM | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg-core hermes-2-pro-mistral` |
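Once a model is running, it can be queried through the OpenAI-compatible API exposed on port 8080. A minimal sketch, assuming the phi-2 container from the first row of the table above is running locally:

```bash
# Send a chat completion request to the phi-2 model started above.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "phi-2",
    "messages": [{"role": "user", "content": "How are you doing?"}],
    "temperature": 0.1
  }'
```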
The images below are built with CUDA 11 support. To know which CUDA version you have available, check with `nvidia-smi` or `nvcc --version`; see also GPU acceleration.
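For example, to check the host driver and confirm that Docker can actually reach the GPU (this sketch assumes the NVIDIA Container Toolkit is installed; the CUDA image tag is only an illustration):

```bash
# Show the driver and the highest CUDA version it supports
nvidia-smi

# Show the installed CUDA toolkit version, if any
nvcc --version

# Confirm that containers can see the GPU (requires the NVIDIA Container Toolkit;
# the image tag below is just an example)
docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi
```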
| Model | Category | Docker command |
| --- | --- | --- |
| phi-2 | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda11-core phi-2` |
| 🌋 bakllava | Multimodal LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda11-core bakllava` |
| 🌋 llava-1.5 | Multimodal LLM | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-cublas-cuda11-core llava-1.5` |
| 🌋 llava-1.6-mistral | Multimodal LLM | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-cublas-cuda11-core llava-1.6-mistral` |
| 🌋 llava-1.6-vicuna | Multimodal LLM | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-cublas-cuda11-core llava-1.6-vicuna` |
| mistral-openorca | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda11-core mistral-openorca` |
| bert-cpp | Embeddings | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda11-core bert-cpp` |
| all-minilm-l6-v2 | Embeddings | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda11 all-minilm-l6-v2` |
| whisper-base | Audio to Text | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda11-core whisper-base` |
| rhasspy-voice-en-us-amy | Text to Audio | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda11-core rhasspy-voice-en-us-amy` |
| 🐸 coqui | Text to Audio | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda11 coqui` |
| 🐶 bark | Text to Audio | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda11 bark` |
| 🔊 vall-e-x | Text to Audio | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda11 vall-e-x` |
| mixtral-instruct Mixtral-8x7B-Instruct-v0.1 | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda11-core mixtral-instruct` |
| tinyllama-chat original model | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda11-core tinyllama-chat` |
| dolphin-2.5-mixtral-8x7b | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda11-core dolphin-2.5-mixtral-8x7b` |
| 🐍 mamba | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda11 mamba-chat` |
| animagine-xl | Text to Image | `docker run -ti -p 8080:8080 -e COMPEL=0 --gpus all localai/localai:v2.13.0-cublas-cuda11 animagine-xl` |
| transformers-tinyllama | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda11 transformers-tinyllama` |
| codellama-7b | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda11 codellama-7b` |
| codellama-7b-gguf | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda11-core codellama-7b-gguf` |
| hermes-2-pro-mistral | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda11-core hermes-2-pro-mistral` |
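The Text to Audio entries can be exercised through LocalAI's TTS endpoint. A hedged sketch, assuming the rhasspy-voice-en-us-amy container above is running; the `model` value is an assumption about the voice file that configuration loads:

```bash
# Generate speech with the Piper voice loaded by rhasspy-voice-en-us-amy.
# The "model" value is an assumption; check the exact name with
#   curl http://localhost:8080/v1/models
# if the request is rejected.
curl http://localhost:8080/tts \
  -H "Content-Type: application/json" \
  -d '{"model": "en-us-amy-low.onnx", "input": "Hello from LocalAI!"}' \
  --output hello.wav
```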
The images below are built with CUDA 12 support. As above, you can check which CUDA version you have available with `nvidia-smi` or `nvcc --version`; see also GPU acceleration.
| Model | Category | Docker command |
| --- | --- | --- |
| phi-2 | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda12-core phi-2` |
| 🌋 bakllava | Multimodal LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda12-core bakllava` |
| 🌋 llava-1.5 | Multimodal LLM | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-cublas-cuda12-core llava-1.5` |
| 🌋 llava-1.6-mistral | Multimodal LLM | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-cublas-cuda12-core llava-1.6-mistral` |
| 🌋 llava-1.6-vicuna | Multimodal LLM | `docker run -ti -p 8080:8080 localai/localai:v2.13.0-cublas-cuda12-core llava-1.6-vicuna` |
| mistral-openorca | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda12-core mistral-openorca` |
| bert-cpp | Embeddings | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda12-core bert-cpp` |
| all-minilm-l6-v2 | Embeddings | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda12 all-minilm-l6-v2` |
| whisper-base | Audio to Text | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda12-core whisper-base` |
| rhasspy-voice-en-us-amy | Text to Audio | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda12-core rhasspy-voice-en-us-amy` |
| 🐸 coqui | Text to Audio | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda12 coqui` |
| 🐶 bark | Text to Audio | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda12 bark` |
| 🔊 vall-e-x | Text to Audio | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda12 vall-e-x` |
| mixtral-instruct Mixtral-8x7B-Instruct-v0.1 | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda12-core mixtral-instruct` |
| tinyllama-chat original model | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda12-core tinyllama-chat` |
| dolphin-2.5-mixtral-8x7b | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda12-core dolphin-2.5-mixtral-8x7b` |
| 🐍 mamba | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda12 mamba-chat` |
| animagine-xl | Text to Image | `docker run -ti -p 8080:8080 -e COMPEL=0 --gpus all localai/localai:v2.13.0-cublas-cuda12 animagine-xl` |
| transformers-tinyllama | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda12 transformers-tinyllama` |
| codellama-7b | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda12 codellama-7b` |
| codellama-7b-gguf | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda12-core codellama-7b-gguf` |
| hermes-2-pro-mistral | LLM | `docker run -ti -p 8080:8080 --gpus all localai/localai:v2.13.0-cublas-cuda12-core hermes-2-pro-mistral` |
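Audio to Text models such as whisper-base are served through the OpenAI-compatible transcription endpoint. A minimal sketch, assuming a whisper-base container from one of the tables above is running and `audio.wav` is a local file of your own:

```bash
# Transcribe a local audio file with the whisper-base model.
curl http://localhost:8080/v1/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F file="@audio.wav" \
  -F model="whisper-base"
```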
Tip: you can specify multiple models when starting an instance so that they are all loaded, for example to have both llava and phi-2 configured:
`docker run -ti -p 8080:8080 localai/localai:v2.13.0-ffmpeg-core llava phi-2`
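With more than one model loaded, each request selects a model by name through its `model` field, and the models the instance exposes can be listed as follows:

```bash
# List the models currently available on the instance; both llava and phi-2
# should appear in the response.
curl http://localhost:8080/v1/models
```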