Large Language Models#

NeMo Framework has everything needed to train Large Language Models, including setting up the compute cluster, downloading data, and selecting model hyperparameters. NeMo 2.0 uses NeMo-Run to make it easy to scale LLMs to thousands of GPUs.

The following LLMs are currently supported in NeMo 2.0:

Default configurations are provided for each model and are outlined in the model-specific documentation linked above. Every configuration can be modified in order to train on new datasets or test new model hyperparameters.
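As a rough illustration of how a default configuration can be adjusted, the sketch below starts from a pretraining recipe and overrides a few hyperparameters before launching with NeMo-Run. This is a hedged example, not a definitive workflow: it assumes a Llama 3 8B recipe is available under `nemo.collections.llm`, and the exact attribute paths (`trainer.max_steps`, `optim.config.lr`) and recipe arguments may differ by NeMo version. It also requires a GPU machine or cluster, so it is shown as a configuration sketch rather than a runnable snippet.

```python
# Illustrative sketch only -- assumes NeMo 2.0 and NeMo-Run are installed,
# and that the llama3_8b recipe module exists in this NeMo version.
import nemo_run as run
from nemo.collections import llm

# Start from a provided default configuration (recipe).
recipe = llm.llama3_8b.pretrain_recipe(
    name="llama3_8b_custom",   # hypothetical experiment name
    num_nodes=1,
    num_gpus_per_node=8,
)

# Override default hyperparameters before launching
# (attribute paths are assumptions; check your recipe's structure).
recipe.trainer.max_steps = 1000
recipe.optim.config.lr = 3e-4

# Run locally; swap in a Slurm executor to scale out to a cluster.
run.run(recipe, executor=run.LocalExecutor())
```

The same pattern applies to dataset changes: replace the recipe's data module with one pointing at your own corpus, then launch as above.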

Training long context models, or extending the context length of pre-trained models, is also supported in NeMo:

For information on deploying LLMs, see:

LLM Deployment Overview