



Term Description
A100 NVIDIA A100, a data-center GPU built on the Ampere architecture, which is named after the French mathematician and physicist André-Marie Ampère.
AGI Artificial General Intelligence, a hypothetical form of AI more advanced than today's systems, one that can match or outperform humans across a wide range of tasks while also teaching itself and advancing its own capabilities.
ALIGN A Large-scale ImaGe and Noisy-Text Embedding, a Google vision-language model trained on a noisy dataset of 1.8 billion image-text pairs.
AWQ Activation-aware Weight Quantization, an algorithm for quantizing LLMs, similar in purpose to GPTQ.


Term Description
BART Bidirectional and Auto-Regressive Transformers, a sequence-to-sequence language model by Facebook AI.
BELEBELE A Bambara word meaning "big, large, fat, great". This is a multilingual reading-comprehension dataset containing 900 unique multiple-choice questions, each associated with one of 488 distinct passages.
BERT Bidirectional Encoder Representations from Transformers, an LLM by Google.
BIG-Bench Beyond the Imitation Game Benchmark, a benchmark for measuring the performance of language models across a diverse set of tasks.
BiT Big Transfer, a family of transfer learning models pre-trained on large datasets.
BLEU BiLingual Evaluation Understudy, a precision-oriented metric that scores a generated sentence against one or more reference sentences by their n-gram overlap.
BLOOM BigScience Large Open-science Open-access Multilingual Language Model
BPE Byte Pair Encoding, a tokenization method.
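
To make the BPE entry above concrete, here is a minimal sketch of how BPE merges might be learned: repeatedly find the most frequent adjacent symbol pair and fuse it into a new symbol. The function names (`most_frequent_pair`, `merge_pair`, `learn_bpe`) are illustrative assumptions, not the API of any tokenizer library, and real implementations add details such as end-of-word markers and byte-level fallbacks.

```python
from collections import Counter


def most_frequent_pair(words):
    """Return the most frequent adjacent symbol pair, weighted by word count."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get) if pairs else None


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single fused symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a whitespace-split corpus."""
    # Start from characters: each word is a tuple of single-character symbols.
    words = dict(Counter(tuple(w) for w in corpus.split()))
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        merges.append(pair)
        words = merge_pair(words, pair)
    return merges, words


merges, vocab = learn_bpe("low low low lower lowest", num_merges=2)
```

Each learned merge becomes a vocabulary entry, so frequent words end up as single tokens while rare words are split into smaller pieces.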


Term Description
C4 The Colossal Clean Crawled Corpus, a dataset of 800GB of English text collected from the web.
Chinchilla Chinchilla is a 70B-parameter model by DeepMind, trained compute-optimally on 1.4 trillion tokens.
CLIP Contrastive Language-Image Pre-training, a model by OpenAI that maps text and images into a shared embedding space.
CoQA CoQA is a large-scale dataset for building Conversational Question Answering systems. CoQA contains 127,000+ questions with answers collected from 8000+ conversations.


Term Description
DALL-E A text-to-image model by OpenAI. The name is a portmanteau of WALL-E, the animated Pixar robot character, and the Spanish surrealist artist Salvador Dalí.
DPO Direct Preference Optimization, a method that reparameterizes the RLHF reward model so that the policy can be trained directly on preference pairs with a simple classification loss.
DPR Dense Passage Retrieval
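
The "simple classification loss" in the DPO entry above can be sketched for a single preference pair as follows. This is a minimal illustration, assuming the four inputs are summed log-probabilities of the chosen and rejected responses under the policy and under a frozen reference model; the function name `dpo_loss` and the default `beta` are illustrative choices, not taken from any library.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair: -log(sigmoid(margin))."""
    # Implicit rewards: how much more likely the policy makes each response
    # compared with the reference model, scaled by beta.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # Numerically stable form of -log(sigmoid(margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the policy equals the reference, the margin is zero.
baseline = dpo_loss(-1.0, -2.0, -1.0, -2.0)
```

The loss shrinks as the policy raises the chosen response's likelihood relative to the rejected one, measured against the reference model.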


Term Description
ELMo Embeddings from Language Models
ERNIE Enhanced Representation through kNowledge IntEgration
ELECTRA Efficiently Learning an Encoder that Classifies Token Replacements Accurately


Term Description
FAIR Facebook AI Research
FLAN Finetuned Language Net, a family of instruction-tuned language models by Google.
FLOPS Floating Point Operations Per Second
FLoRes Facebook Low Resource Machine Translation Benchmark, an evaluation dataset for low-resource machine translation.


Term Description
GAVIE GPT4-Assisted Visual Instruction Evaluation, an approach for evaluating visual instruction tuning that needs no human-annotated ground-truth answers and adapts to diverse instruction formats.
GGML Georgi Gerganov Machine Learning, a C library focused on machine learning
GLaM Generalist Language Model, a family of language models that use a sparsely activated mixture-of-experts architecture to scale model capacity while incurring substantially less training cost than dense variants.
GSM8K Grade School Math 8K, a dataset of 8.5K high-quality, linguistically diverse grade school math word problems created by human problem writers.
GOFAI Good Old-Fashioned Artificial Intelligence


Term Description
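HNSW Hierarchical Navigable Small World, a graph-based index for approximate nearest-neighbor search.
HH-RLHF Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback, a human-preference dataset released by Anthropic.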


Term Description
ILSVRC2012 ImageNet Large Scale Visual Recognition Challenge 2012, a competition to estimate the content of photographs for the purpose of retrieval and automatic annotation, using a subset of the large hand-labeled ImageNet dataset (10,000,000 labeled images depicting 10,000+ object categories) as training data.


Term Description
JFT JFT-300M is an internal Google dataset used for training image classification models. Images are labeled by an algorithm that uses a complex mixture of raw web signals, connections between web pages, and user feedback.


Term Description
LAION-400M Large-scale Artificial Intelligence Open Network, an open dataset of CLIP-Filtered 400 Million Image-Text Pairs
LaMDA Language Model for Dialogue Applications
LCM Latent Consistency Models
LLaMA Large Language Model Meta AI
LLaSM Large Language and Speech Model
LLM Large Language Model, an AI model trained on massive amounts of text data to understand language and generate novel content in human-like language.
LLaVA Large Language and Vision Assistant, an end-to-end trained large multimodal model that connects a vision encoder and LLM for general purpose visual and language understanding.
LMM Large Multimodal Models, models that handle several modalities such as text and images, for example DeepMind’s Flamingo, Google’s PaLM-E, Salesforce’s BLIP, Microsoft’s KOSMOS-1, and Tencent’s Macaw-LLM. Chatbots like ChatGPT and Gemini are LMMs.
LoRA Low Rank Adaptation, a fine-tuning method that uses low-rank matrices to adapt a pre-trained model to a new task.
LRV Large-scale Robust Visual, a large and diverse visual instruction tuning dataset.
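
As a rough illustration of the LoRA entry above: instead of updating a full weight matrix W, LoRA trains two small matrices B (d_out × r) and A (r × d_in) and uses W + (alpha / r) · B·A at inference. The sketch below uses plain Python lists to stay dependency-free; the helper names and the `alpha` scaling argument are illustrative assumptions rather than any library's API.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the rank (rows of A)."""
    r = len(a)
    delta = matmul(b, a)  # (d_out x r) @ (r x d_in) -> d_out x d_in
    scale = alpha / r
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d_out, d_in, r):
    """Parameters trained by LoRA versus full fine-tuning of one matrix."""
    return {"lora": r * (d_out + d_in), "full": d_out * d_in}


w = [[1.0, 0.0], [0.0, 1.0]]   # frozen pre-trained weight (2 x 2)
a = [[1.0, 2.0]]               # trainable A, rank r = 1 (1 x 2)
b = [[1.0], [0.0]]             # trainable B (2 x 1)
adapted = lora_effective_weight(w, a, b, alpha=2.0)
```

For a 4096 × 4096 matrix and rank 8, `trainable_params` shows the appeal: about 65 thousand trainable values instead of roughly 16.8 million.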


Term Description
M3W MultiModal MassiveWeb, an image and text dataset by DeepMind, used to train Flamingo, a multimodal LLM.
MAWPS A Math Word Problem Repository, an online repository of math word problems that provides a unified testbed for evaluating different algorithms.
ML Machine Learning, a branch of AI in which computers learn patterns from data and improve their predictions without being explicitly programmed for each task.
MLP Multilayer Perceptron, a feedforward artificial neural network made of multiple fully connected layers of perceptrons.
MLLM Multimodal Large Language Model
MLM Masked Language Model
MMLU Massive Multitask Language Understanding, a benchmark that measures a text model's multitask accuracy across 57 subjects.
MRC Machine Reading Comprehension
MTPB Multi-Turn Programming Benchmark, a benchmark consisting of 115 diverse problem sets that are factorized into multi-turn prompts.


Term Description
NEFTune Noisy Embedding Instruction Fine Tuning, a technique that adds random noise to the embedding vectors during the forward pass of fine-tuning.
NeurIPS Conference on Neural Information Processing Systems
NLP Natural language processing. A branch of AI that uses machine learning and deep learning to give computers the ability to understand human language, often using learning algorithms, statistical models and linguistic rules.
NLG Natural Language Generation. A branch of AI that uses machine learning and deep learning to generate human-like language.
NLU Natural Language Understanding, to understand the relationship and meaning in text data.
NSP Next Sentence Prediction
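
The NEFTune entry above can be sketched in a few lines. This is a minimal illustration, assuming uniform noise in [-1, 1] scaled by alpha / sqrt(L · d) for a sequence of L token embeddings of dimension d; the function name, the default `alpha`, and the list-of-lists representation are illustrative assumptions. The noise is applied only during training, never at inference.

```python
import math
import random


def neftune_noise(embeddings, alpha=5.0, rng=random):
    """Return a noisy copy of a (seq_len x dim) embedding matrix."""
    seq_len = len(embeddings)
    dim = len(embeddings[0])
    # The noise magnitude shrinks as the sequence gets longer or wider.
    scale = alpha / math.sqrt(seq_len * dim)
    return [[x + scale * rng.uniform(-1.0, 1.0) for x in row]
            for row in embeddings]


token_embeddings = [[0.1, -0.2, 0.3, 0.0], [0.5, 0.4, -0.1, 0.2]]
noisy = neftune_noise(token_embeddings, alpha=5.0, rng=random.Random(0))
```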


Term Description
PaLM Pathways Language Model
PEFT Parameter Efficient Fine-Tuning
POMDP Partially Observable Markov Decision Process, a model for sequential decision making in which outcomes are partly random and partly under the control of a decision maker who cannot directly observe the underlying state.
POPE Polling-based Object Probing Evaluation, an evaluation method for probing object hallucination in large vision-language models (LVLMs).
PPO Proximal Policy Optimization, a policy-gradient RL algorithm widely used in RLHF to learn from human preferences.


Term Description
QLoRA Quantized Low Rank Adaptation, a fine-tuning method that combines Quantization and LoRA (Low-Rank Adapters).
Quantization Quantization is the process of reducing the numerical precision of a model's tensors, making the model more compact and the operations faster in execution.
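
To illustrate the Quantization entry above, here is a minimal sketch of symmetric int8 quantization for a single tensor, represented as a flat list of floats. It assumes one scale per tensor and a [-127, 127] integer range; production schemes such as GPTQ or AWQ are considerably more involved, and the function names here are illustrative.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integer codes."""
    return [q * scale for q in quantized]


weights = [0.0, 1.27, -1.27, 0.5]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
```

Each value is stored in 8 bits instead of 32, at the cost of a rounding error of at most half a quantization step per value.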


Term Description
RAG Retrieval-Augmented Generation, an AI framework that combines an information retrieval component with a text generation model to improve the quality of responses generated by LLMs.
ResNet A Residual Neural Network (a.k.a. Residual Network, ResNet) is a deep learning model in which the weight layers learn residual functions with reference to the layer inputs.
RLHF Reinforcement Learning from Human Feedback
RoBERTa Robustly Optimized BERT Approach
ROUGE Recall-Oriented Understudy for Gisting Evaluation, a recall-oriented metric that scores a generated text against a reference text, commonly used for summarization.
RoPE Rotary Position Embedding, an alternative to the traditional sinusoidal positional embedding in the Transformer architecture that encodes position by rotating query and key vectors.
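
The RoPE entry above can be illustrated with a small sketch: each pair of dimensions in a query or key vector is rotated by an angle proportional to the token position, so the dot product between a rotated query and a rotated key depends only on their relative distance. The pairing of adjacent dimensions and the `base` of 10000 are assumptions of this sketch; implementations differ in how they pair dimensions.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate consecutive dimension pairs of `vec` by position-dependent angles."""
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(0, d, 2):
        # Lower dimension pairs rotate quickly, higher ones slowly.
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.0]
# Same relative offset (2) at different absolute positions.
score_near = dot(rope(q, 3), rope(k, 1))
score_far = dot(rope(q, 7), rope(k, 5))
```

Here `score_near` and `score_far` agree up to floating-point error, which is the relative-position property that motivates the method.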


Term Description
SFT Supervised Fine Tuning, fine-tuning a pre-trained model on labeled examples such as prompt-response pairs, commonly used with LLMs.
SQuAD Stanford Question Answering Dataset
SVAMP Simple Variations on Arithmetic Math word Problems is a challenge set to enable more robust evaluation of automatic MWP (Math Word Problem) solvers
SLM Either Small Language Model or Statistical Language Model, depending on context.


Term Description
T5 Text to Text Transfer Transformer
TRL Transformer Reinforcement Learning, a Hugging Face library for fine-tuning transformer language models with methods such as SFT, PPO, and DPO.


Term Description
VIGC Visual Instruction Generation and Correction, a framework that enables multimodal large language models to generate instruction-tuning data and progressively enhance its quality on-the-fly.
ViT Vision Transformer, a vision model based as closely as possible on the Transformer architecture originally designed for text-based tasks.
VLU Vision Language Understanding, like Natural Language Understanding (NLU) but for images
VRAM Video Random Access Memory, a special type of memory that stores graphics data for the GPU.


Term Description
Woodpecker Woodpecker is a training-free, five-stage method for correcting hallucinations in MLLMs.


Term Description
XLM Cross-lingual Language Models
XLU Cross-lingual Understanding
XNLI Cross-lingual Natural Language Inference
XLNet A generalized autoregressive pretraining method for language understanding, built on Transformer-XL.


Term Description
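ZeRO Zero Redundancy Optimizer, a memory optimization for distributed training that partitions optimizer states, gradients, and parameters across data-parallel workers instead of replicating them.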

Note: PRs are accepted. Feel free to add more terms and their details.