Showing all terms starting with L
Latent space: A compressed, abstract, multi-dimensional representation of data learned by an AI model, in which similar items cluster together.
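As a rough illustration of the clustering described above, the sketch below compares a few items by cosine similarity. The three-dimensional vectors and the names `latent`, `cosine_similarity`, and `nearest` are invented for this example; a real model learns vectors with hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical latent vectors, made up for illustration only.
latent = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

def nearest(name):
    """Return the other item whose vector points in the most similar direction."""
    others = (key for key in latent if key != name)
    return max(others, key=lambda key: cosine_similarity(latent[name], latent[key]))

print(nearest("cat"))
```

With these made-up vectors, "cat" and "dog" sit close together while "car" lies in a different direction, which is the sense in which similar items cluster.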
Large Language Model (LLM): A deep learning model trained on vast text datasets to understand and generate human-like language. Examples include GPT-4, Claude, and Gemini.
Loss function: A mathematical measure of how far a model's predictions are from the correct answers. Training aims to minimise it, typically via gradient descent.
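Two common ways of measuring that distance are sketched below: mean squared error for numeric predictions and cross-entropy for a predicted probability distribution over classes. The function names are chosen for this example, and the entry itself does not say which measure is meant.

```python
import math

def mean_squared_error(predictions, targets):
    """Average squared difference between predicted and correct values."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

def cross_entropy(probabilities, true_index):
    """Negative log of the probability the model assigned to the correct class."""
    return -math.log(probabilities[true_index])

# A confident correct prediction gives a small loss, a confident wrong one a large loss.
good = cross_entropy([0.9, 0.05, 0.05], true_index=0)
bad = cross_entropy([0.05, 0.9, 0.05], true_index=0)
print(good < bad)
```

In both cases a lower value means the predictions are closer to the correct answers, which is the signal that training tries to reduce.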
LoRA: Low-Rank Adaptation, an efficient fine-tuning method that injects small trainable low-rank matrices into a frozen pre-trained model, cutting the number of trainable parameters and therefore compute costs.
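A minimal sketch of the idea, assuming the usual formulation in which a frozen weight matrix W is augmented with a product of two small matrices B and A, so the layer computes W·x + (alpha / r)·B·(A·x). The class name `LoRALinear`, the pure-Python matrix helper, and the deterministic initial values for A are assumptions made for this illustration; real implementations use a tensor library and typically initialise A randomly.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

class LoRALinear:
    def __init__(self, frozen_weight, rank, alpha=1.0):
        d_out = len(frozen_weight)
        d_in = len(frozen_weight[0])
        self.W = frozen_weight  # pre-trained weights, never updated
        # A: rank x d_in, small non-zero values (illustrative, not random here).
        self.A = [[0.01 * (i + j + 1) for j in range(d_in)] for i in range(rank)]
        # B: d_out x rank, zeros, so the adapted layer starts identical to the base layer.
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_parameter_count(self):
        # Only A and B would be trained; W stays frozen.
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
layer = LoRALinear(identity, rank=1)
print(layer.forward([1, 2, 3, 4]), layer.trainable_parameter_count())
```

For this 4 x 4 layer the adapter has 8 trainable values against 16 in the full matrix; the saving grows quickly with layer size because the adapter scales with r·(d_in + d_out) rather than d_in·d_out.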
Label: The ground-truth answer or category associated with a training example in supervised learning, used to compute the loss during training.
Language model: A probabilistic model trained to predict the next word or token in a sequence, forming the basis for modern text AI systems.
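Next-token prediction can be shown at toy scale with a bigram model, which estimates the probability of each following word purely from counts. This is a deliberately simplified stand-in for illustration; modern systems use neural networks, and the corpus and function names here are invented.

```python
from collections import Counter, defaultdict

def train_bigram_model(tokens):
    """Count how often each token is followed by each other token."""
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts

def next_token_probabilities(model, token):
    """Turn the follow-up counts for one token into a probability distribution."""
    following = model[token]
    total = sum(following.values())
    return {tok: n / total for tok, n in following.items()}

corpus = "the cat sat on the mat the cat ran".split()
model = train_bigram_model(corpus)
print(next_token_probabilities(model, "the"))
```

In this tiny corpus "the" is followed by "cat" twice and "mat" once, so the model assigns "cat" a probability of two thirds and "mat" one third.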
Latency: The time delay between sending a request to an AI model and receiving the first token or full response, critical for real-time applications.
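The two delays mentioned above, time to first token and time to full response, can be measured separately when a response is streamed. The sketch below uses a simulated stream (`fake_token_stream`, a hypothetical stand-in for a real model API) so that it runs without any network access.

```python
import time

def fake_token_stream(tokens, delay_seconds):
    """Simulated model response: yields one token after each delay."""
    for token in tokens:
        time.sleep(delay_seconds)
        yield token

def measure_latency(stream):
    """Return (time to first token, time to full response, token count)."""
    start = time.perf_counter()
    first_token_latency = None
    count = 0
    for _ in stream:
        if first_token_latency is None:
            first_token_latency = time.perf_counter() - start
        count += 1
    total_latency = time.perf_counter() - start
    return first_token_latency, total_latency, count

first, total, n = measure_latency(fake_token_stream(["Hello", ",", " world"], 0.01))
print(f"first token after {first:.3f}s, full response after {total:.3f}s ({n} tokens)")
```

For interactive use the first number usually matters most, since the user sees output as soon as the first token arrives; for batch jobs the total is the more relevant figure.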
Layer normalisation: A normalisation technique applied within transformer layers that stabilises training by normalising activations across the feature dimension.
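For a single vector of activations, the normalisation described above amounts to subtracting the mean of the features, dividing by their standard deviation, and then applying a learned scale and shift. The sketch below follows that standard formulation in plain Python; the parameter names `gamma`, `beta`, and `eps` are conventional choices rather than anything specified in this glossary.

```python
import math

def layer_norm(features, gamma=None, beta=None, eps=1e-5):
    """Normalise one activation vector across its feature dimension."""
    n = len(features)
    mean = sum(features) / n
    variance = sum((x - mean) ** 2 for x in features) / n
    inv_std = 1.0 / math.sqrt(variance + eps)  # eps avoids division by zero
    if gamma is None:
        gamma = [1.0] * n  # learned scale, identity by default
    if beta is None:
        beta = [0.0] * n   # learned shift, zero by default
    return [g * (x - mean) * inv_std + b for x, g, b in zip(features, gamma, beta)]

normalised = layer_norm([1.0, 2.0, 3.0, 4.0])
print(normalised)
```

With the default scale and shift, the output has a mean of approximately zero and a variance of approximately one, regardless of the scale of the input.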
Learning rate: A hyperparameter controlling how much model weights are updated per training step. Too high a value causes instability; too low a value causes slow convergence.
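The trade-off can be seen on a toy problem. The sketch below runs gradient descent on f(w) = w², whose gradient is 2w and whose minimum is at w = 0; the function and the three rates are chosen only for illustration.

```python
def gradient_descent(learning_rate, steps=20, start=1.0):
    """Minimise f(w) = w**2 from a starting point and return the final w."""
    w = start
    for _ in range(steps):
        gradient = 2.0 * w              # derivative of w**2
        w -= learning_rate * gradient   # the update scaled by the learning rate
    return w

for rate in (0.001, 0.1, 1.1):
    print(rate, gradient_descent(rate))
```

Each step multiplies w by (1 - 2 × rate). With a rate of 0.1 the weight shrinks steadily towards zero, with 0.001 it has barely moved after 20 steps, and with 1.1 the factor exceeds 1 in magnitude, so the weight overshoots and grows on every step.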
Long-term memory: Mechanisms that allow AI agents to persist and retrieve information across multiple sessions, enabling personalisation and continuity.
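One very simple way to persist and retrieve information across sessions is a key-value store backed by a file. The sketch below is an assumption-laden illustration: the `FileMemory` class, the JSON file, and the example key are hypothetical, and production agents often use databases or vector stores instead.

```python
import json
import tempfile
from pathlib import Path

class FileMemory:
    """Hypothetical key-value memory persisted as a JSON file."""

    def __init__(self, path):
        self.path = Path(path)

    def _load(self):
        if self.path.exists():
            return json.loads(self.path.read_text(encoding="utf-8"))
        return {}

    def remember(self, key, value):
        data = self._load()
        data[key] = value
        self.path.write_text(json.dumps(data), encoding="utf-8")

    def recall(self, key, default=None):
        return self._load().get(key, default)

with tempfile.TemporaryDirectory() as folder:
    store_path = Path(folder) / "memory.json"
    # First "session": the agent stores a user preference.
    FileMemory(store_path).remember("preferred_language", "English")
    # Second "session": a fresh object reads it back from disk.
    recalled = FileMemory(store_path).recall("preferred_language")
    missing = FileMemory(store_path).recall("unknown_key", default="not stored")

print(recalled, missing)
```

Because the second `FileMemory` object shares nothing with the first except the file, the recalled value shows information surviving from one session to the next.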