Downloading an LLM model manually
Download pre-trained models in GGUF format.
Before you begin
Make sure that you have more available system RAM than the size of the model.
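The RAM check above can be sketched as follows. This is a minimal illustration, not part of the product: the `SC_PHYS_PAGES` lookup is POSIX/Linux-specific, and the 1.2 headroom factor is an assumption rather than a documented Domino requirement.

```python
import os

def total_ram_bytes() -> int:
    """Total physical RAM; SC_PHYS_PAGES is POSIX/Linux-specific and may
    be unavailable on other platforms."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")

def approx_model_bytes(n_params: float, bits_per_weight: int) -> float:
    """Rough size of a quantized model: parameters * bits / 8."""
    return n_params * bits_per_weight / 8

def fits_in_ram(model_bytes: float, ram_bytes: int, headroom: float = 1.2) -> bool:
    """True when RAM exceeds the model size by a safety margin
    (the 1.2 headroom factor is an illustrative assumption)."""
    return ram_bytes >= model_bytes * headroom
```

For example, a 3B-parameter model at 4-bit quantization is roughly 1.5 GB, so it fits comfortably on a machine with 8 GB of RAM.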
Procedure
- Go to the Models tab on the Hugging Face site.
- Filter the downloadable GGUF models by selecting Text Generation under Tasks, GGUF under Libraries, any specific language, and specific license types under Licenses. Here is an example of the results if you filter Text Generation GGUF files in English with an MIT license.
- Select a model that fits your application needs. If you selected the Llama 3.x license, you can choose, for example, the lmstudio-community/Llama-3.2-1B-Instruct-GGUF model.
- On the same page, select one of the available bit-quantized models. We recommend using 3B or 7B Llama 3.x models with 3-bit or 4-bit quantization levels, which are much smaller to load while keeping acceptable text generation quality. This example shows the metadata on the models.
- When you've made your choice, click the Download button to save the file on your computer.
- Copy the downloaded models in GGUF format (.gguf file extension) to the llm_models subdirectory under the Domino data directory, as described in Enabling Domino IQ servers. Make sure these files are readable by the user account running the Domino server.
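The download and install steps above can be sketched in a short script. This is a hedged illustration: the repo ID, quantized filename, and the Models-tab query parameter names are assumptions based on what the Hugging Face UI shows, and the destination path depends on your Domino data directory.

```python
import os
import shutil
import stat
from urllib.parse import urlencode

# Example repo and filename; check the exact names on Hugging Face.
REPO_ID = "lmstudio-community/Llama-3.2-1B-Instruct-GGUF"
FILENAME = "Llama-3.2-1B-Instruct-Q4_K_M.gguf"

def hf_filter_url(task: str = "text-generation", library: str = "gguf",
                  language: str = "en", license_id: str = "mit") -> str:
    """Models-tab search URL; the query parameter names mirror the
    filters in the Hugging Face UI and are assumptions."""
    query = urlencode({
        "pipeline_tag": task,
        "library": library,
        "language": language,
        "license": f"license:{license_id}",
    })
    return f"https://huggingface.co/models?{query}"

def hf_download_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Direct 'resolve' URL for a single file in a Hugging Face model repo."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

def install_model(gguf_path: str, llm_models_dir: str) -> str:
    """Copy a downloaded .gguf file into the llm_models subdirectory and
    make it readable by other accounts, such as the Domino server user."""
    os.makedirs(llm_models_dir, exist_ok=True)
    dest = os.path.join(llm_models_dir, os.path.basename(gguf_path))
    shutil.copy2(gguf_path, dest)
    # Owner read/write, group and other read.
    os.chmod(dest, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH)
    return dest
```

You can fetch the file with any HTTP client (for example, `urllib.request.urlretrieve(hf_download_url(REPO_ID, FILENAME), FILENAME)`), then pass the local path to `install_model` with your Domino data directory's llm_models path.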