
parameters. Although OpenAI has not confirmed the number of parameters in GPT-4, it is thought to exceed 1 trillion parameters. Even newer, “small” LLMs, like Microsoft’s Phi-2, still contain billions of parameters. While imagining a complex mathematical formula is helpful for visualizing how an LLM functions, it is worth keeping in mind the actual vast scale of these models.
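To make the scale of this “formula” concrete, the short sketch below counts the adjustable weights and biases in a deliberately tiny two-layer network. It assumes the PyTorch library, and the layer sizes are made up purely for illustration; even this toy model contains roughly two million parameters, while frontier LLMs contain several orders of magnitude more.

```python
# A minimal sketch, assuming the PyTorch library, of what "parameters" are:
# the adjustable numbers (weights and biases) inside a model's layers.
# The layer sizes here are hypothetical, chosen only for illustration.
import torch.nn as nn

tiny_model = nn.Sequential(
    nn.Linear(512, 2048),   # weights: 512 x 2048, plus 2,048 biases
    nn.ReLU(),
    nn.Linear(2048, 512),   # weights: 2048 x 512, plus 512 biases
)

total = sum(p.numel() for p in tiny_model.parameters())
print(f"{total:,} parameters")   # about 2.1 million in this toy example
```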


In practice, an LLM is given input data (typically the user prompt), which is processed by this complex “formula” to predict subsequent text (the LLM’s output). The LLM training process fundamentally involves the model practicing this process, learning from mistakes, and adjusting its parameters (the variables in our analogy) to produce better predictions in the future, thereby learning and internalizing the language patterns from its training data.

LLMs are trained using vast data sets of text, which often include books, articles, websites, and other written material. This text provides the raw material from which the model learns language, as well as the information and, potentially, the underlying logic patterns contained in these data sets. Accordingly, the quality of the training set is critical for the model’s performance. Before training, this data undergoes preprocessing, which involves cleaning and organizing the data, such as removing irrelevant information, correcting errors (where feasible), and standardizing formats.

The training process begins by inputting a portion of a document into the LLM. The model then uses its current parameters (weights and biases) to predict the next part of the document. After making a prediction, the model compares its output with the actual text. If there is a discrepancy, the model adjusts its weights and biases to reduce this error. This process is a bit like tweaking the variables in our imagined formula to get a more accurate result. The cycle is repeated with millions of documents, with each iteration refining the model’s parameters. Each complete pass through the training data, or “epoch,” further refines the model’s performance. Through repeated exposure to diverse language patterns, the model progressively improves its ability to understand and generate text.

Throughout this training phase, various techniques are employed to optimize the model’s performance. These include adjusting the model’s architecture, fine-tuning parameters, and employing strategies to handle overfitting (where a model becomes too tailored to the training data and loses its ability to generalize).

This process enables LLMs not just to generate text, but to understand and learn the intricacies of human language, particularly its syntax and semantics. LLMs observe how words are commonly ordered, how sentences are structured, and how various grammatical elements are used, allowing them to generate text that is not only grammatically correct but also stylistically consistent with the input they receive. LLMs learn semantics by
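To make the training loop described above more concrete, the following simplified sketch shows the basic cycle: predict the next token, compare the prediction with the actual text, and nudge the weights and biases to reduce the error, repeating over several epochs. It assumes the PyTorch library, and the tiny model and randomly generated “documents” are stand-ins for illustration only; real LLM training uses transformer architectures and vastly larger data sets.

```python
# A simplified sketch of next-token-prediction training, assuming the
# PyTorch library. The tiny model and made-up data are illustrative only.
import torch
import torch.nn as nn

vocab_size, embed_dim = 100, 32           # toy vocabulary and embedding size

# A deliberately tiny "formula": embedding -> linear layer -> next-token scores
model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),
    nn.Linear(embed_dim, vocab_size),
)
loss_fn = nn.CrossEntropyLoss()           # measures how far predictions are from the actual text
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)

# Made-up "documents": each row is a sequence of token ids
data = torch.randint(0, vocab_size, (8, 16))

for epoch in range(3):                                # one epoch = one full pass through the data
    inputs, targets = data[:, :-1], data[:, 1:]       # at each position, the target is the next token
    logits = model(inputs)                            # the model's predictions
    loss = loss_fn(logits.reshape(-1, vocab_size),    # compare predictions with the actual text
                   targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()                                   # work out how each weight contributed to the error
    optimizer.step()                                  # nudge the weights and biases to reduce it
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```

The weight_decay setting in this sketch is one simple example of the kind of strategy mentioned above for limiting overfitting; production systems layer many additional optimization techniques on top of this basic loop.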



