Posted on Aug 5, 2023

Closure

After a pause, this series comes to a conclusion, mostly because of the rapid developments in the area of large language models.

Original intention

At the beginning, I intended to create a language model that would take a prompt such as "Geschirrabwaschgesetz" (a law about washing dishes) and write a corresponding law text in German.

I was discouraged from training the original char RNN by the daunting amount of training time that 110 M of training data would require. Therefore I went with fine-tuning a German GPT-2 (and later the better one; thanks Jo!). The fine-tuning process for such a model is described here or here, for example.

(Un-)expected discovery

I happened to discover that my intended use case is covered almost perfectly by the LLAMA 2 Chat German model (only almost, because of a few grammatical errors). This is most likely because it was fine-tuned with the German legal SQuAD dataset, among others.

I do not want to withhold the result from you (produced in LM Studio):