Downloading an LLM model manually

Download pre-trained data models in GGUF format.

Before you begin

Alternatively, you can configure dominoiq.nsf so that the Domino IQ administration server downloads LLM models for you. In a Domino domain with Domino IQ servers running in both local and remote mode, do not make a remote-mode Domino IQ server the Domino IQ administration server; in such deployments, the model download and copy from the administration server won't work. For more information, see
Adding an LLM Model document.

About this task

Make sure that the available system RAM exceeds the size of the model you plan to load.
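As a quick sanity check before downloading, you can compare the model file size against available RAM. This sketch reads /proc/meminfo, so it assumes a Linux host; the headroom factor and function names are illustrative, not part of Domino IQ.

```python
import os

def available_ram_bytes():
    """Return available system RAM in bytes (Linux only; reads /proc/meminfo)."""
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) * 1024  # value is reported in kB
    raise RuntimeError("MemAvailable not found in /proc/meminfo")

def model_fits_in_ram(model_path, headroom=1.2):
    """Check that available RAM exceeds the model size with some headroom
    (headroom=1.2 is an illustrative safety margin, not a Domino requirement)."""
    return available_ram_bytes() > os.path.getsize(model_path) * headroom
```

Run the check against the downloaded .gguf file before copying it to the server.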

Procedure

  1. Go to the Models tab on the Hugging Face site.
  2. Filter the downloadable GGUF models by selecting Text Generation under Tasks, GGUF under Libraries, any specific language, and specific license types under Licenses. Here is an example of the results if you filter Text Generation GGUF files in English with an MIT license.
  3. Select a model that fits your application needs. If you selected the Llama 3.x license, you can choose, for example, the lmstudio-community/Llama-3.2-1B-Instruct-GGUF model.
  4. On the same page, select one of the available bit-quantized models. We recommend using 3B or 7B Llama 3.x models with 3-bit or 4-bit quantization levels, which are much smaller to load while retaining acceptable text generation quality. This example shows the metadata on the models.
  5. When you've made your choice, click the Download button to save the file on your computer.
  6. These models in GGUF format (.gguf file extension) can be copied to the llm_models subdirectory under the Domino data directory, as described in Enabling Domino IQ servers. Make sure these files are readable by the user account running the Domino server.
  7. Save the document.
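The copy-and-permissions step above can be sketched as follows. The function name and the data-directory argument are illustrative; only the llm_models subdirectory name comes from the procedure, and you should substitute your actual Domino data directory path.

```python
import os
import shutil
import stat

def install_gguf_model(model_file, domino_data_dir):
    """Copy a downloaded .gguf model into the llm_models subdirectory of the
    Domino data directory and add read permission for all users, so the
    account running the Domino server can load it. Paths are examples;
    pass your real Domino data directory."""
    dest_dir = os.path.join(domino_data_dir, "llm_models")
    os.makedirs(dest_dir, exist_ok=True)
    dest = shutil.copy(model_file, dest_dir)
    # Add read permission for owner, group, and others.
    os.chmod(dest, os.stat(dest).st_mode | stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return dest
```

On Windows the chmod call is largely a no-op; there, check the file's ACLs for the Domino service account instead.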