# Glossary
## A

| Term | Description |
| --- | --- |
| A100 | Ampere 100, a GPU variant named after the French mathematician and physicist André-Marie Ampère. |
| AGI | Artificial General Intelligence, a concept describing a more advanced form of AI than we know today, one that can perform tasks much better than humans while also teaching and advancing its own capabilities. |
| ALIGN | A Large-scale ImaGe and Noisy-Text Embedding, a dataset of 1.8 billion image-text pairs by Google. |
| AWQ | Activation-aware Weight Quantization, an algorithm for quantizing LLMs, similar to GPTQ. |
## B

| Term | Description |
| --- | --- |
| BART | Bidirectional and Auto-Regressive Transformers, an LLM by Facebook AI (Meta). |
| BELEBELE | A Bambara word meaning "big, large, fat, great". A dataset containing 900 unique multiple-choice reading comprehension questions, each associated with one of 488 distinct passages. |
| BERT | Bidirectional Encoder Representations from Transformers, an LLM by Google. |
| BIG-Bench | Beyond the Imitation Game Benchmark, a benchmark for measuring the performance of language models across a diverse set of tasks. |
| BiT | Big Transfer, a family of transfer learning models pre-trained on large datasets. |
| BLEU | BiLingual Evaluation Understudy, a metric for comparing a generated sentence against a reference sentence. |
| BLOOM | BigScience Large Open-science Open-access Multilingual Language Model. |
| BPE | Byte Pair Encoding, a tokenization method that repeatedly merges the most frequent pair of adjacent symbols; see the sketch after this table. |
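
Since BPE comes up throughout this glossary, here is a minimal training sketch in plain Python (toy corpus; real tokenizers such as GPT-2's add byte-level details on top of this loop): it counts adjacent symbol pairs and merges the most frequent one, repeatedly.

```python
from collections import Counter

def bpe_train(words, num_merges):
    """Learn BPE merges. `words` maps a word (tuple of symbols) to its frequency."""
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs across the corpus.
        pairs = Counter()
        for word, freq in words.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Replace every occurrence of the best pair with a single merged symbol.
        merged = {}
        for word, freq in words.items():
            out, i = [], 0
            while i < len(word):
                if i < len(word) - 1 and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            merged[tuple(out)] = merged.get(tuple(out), 0) + freq
        words = merged
    return merges

# Toy corpus: "low" x5, "lower" x2, "newest" x6
corpus = {tuple("low"): 5, tuple("lower"): 2, tuple("newest"): 6}
print(bpe_train(corpus, 3))  # [('w', 'e'), ('l', 'o'), ('n', 'e')]
```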
## C

| Term | Description |
| --- | --- |
| C4 | The Colossal Clean Crawled Corpus, a dataset of 800GB of English text collected from the web. |
| Chinchilla | A 70B-parameter model by DeepMind, trained as a compute-optimal model on 1.4 trillion tokens; see the worked example after this table. |
| CLIP | Contrastive Language-Image Pre-training, maps data of different modalities, text and images, into a shared embedding space. |
| CoQA | A large-scale dataset for building Conversational Question Answering systems, containing 127,000+ questions with answers collected from 8,000+ conversations. |
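
A back-of-the-envelope check on the Chinchilla numbers, using the paper's roughly 20-tokens-per-parameter rule and the common C ≈ 6·N·D FLOPs approximation (a sketch, not the paper's exact derivation):

```python
N = 70e9                # parameters
tokens_per_param = 20   # Chinchilla compute-optimal rule of thumb
D = N * tokens_per_param
print(f"optimal tokens: {D:.2e}")  # ~1.40e12, i.e. the 1.4 trillion above

# Approximate training compute: C ~= 6 * N * D FLOPs
C = 6 * N * D
print(f"training FLOPs: {C:.2e}")  # ~5.88e23
```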
## D

| Term | Description |
| --- | --- |
| DALL-E | OpenAI's text-to-image model; the name is a portmanteau of the animated Pixar robot character WALL-E and the Spanish surrealist artist Salvador Dalí. |
| DPO | Direct Preference Optimization, a reparameterization of the reward model in RLHF that needs only a simple classification loss; see the sketch after this table. |
| DPR | Dense Passage Retrieval, a method for retrieving relevant passages using dense vector embeddings. |
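
The "simple classification loss" fits in a few lines. A minimal sketch following the loss from the DPO paper (toy numbers; PyTorch assumed; `dpo_loss` and its input names are illustrative):

```python
import torch
import torch.nn.functional as F

def dpo_loss(pi_logps_w, pi_logps_l, ref_logps_w, ref_logps_l, beta=0.1):
    """DPO loss over (chosen, rejected) response pairs.

    *_w: summed log-probs of the chosen responses, *_l: of the rejected ones,
    under the trainable policy (pi) and the frozen reference model (ref).
    """
    # Implicit reward margin between chosen and rejected, scaled by beta.
    logits = beta * ((pi_logps_w - ref_logps_w) - (pi_logps_l - ref_logps_l))
    # Binary classification: push the chosen response above the rejected one.
    return -F.logsigmoid(logits).mean()

# Toy batch of 3 preference pairs (hypothetical log-probabilities).
loss = dpo_loss(torch.tensor([-12.0, -9.5, -20.1]),
                torch.tensor([-14.2, -11.0, -25.3]),
                torch.tensor([-12.5, -9.8, -21.0]),
                torch.tensor([-13.9, -10.5, -24.8]))
print(loss.item())
```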
## E

| Term | Description |
| --- | --- |
| ELECTRA | Efficiently Learning an Encoder that Classifies Token Replacements Accurately. |
| ELMo | Embeddings from Language Models. |
| ERNIE | Enhanced Representation through kNowledge IntEgration. |
## F

| Term | Description |
| --- | --- |
| FAIR | Facebook AI Research. |
| FLAN | Finetuned LAnguage Net, a family of instruction-tuned language models by Google. |
| FLOPS | Floating Point Operations Per Second. |
| FLoRes | Facebook Low Resource Machine Translation Benchmark, a low-resource MT dataset. |
## G

| Term | Description |
| --- | --- |
| GAVIE | GPT4-Assisted Visual Instruction Evaluation, an approach for evaluating visual instruction tuning that needs no human-annotated ground-truth answers and can adapt to diverse instruction formats. |
| GGML | Georgi Gerganov Machine Learning, a C library focused on machine learning. |
| GLaM | Generalist Language Model, a family of language models that uses a sparsely activated mixture-of-experts architecture to scale model capacity while incurring substantially less training cost than dense variants. |
| GOFAI | Good Old-Fashioned Artificial Intelligence. |
| GSM8K | Grade School Math 8K, a dataset of 8.5K high-quality, linguistically diverse grade school math word problems created by human problem writers. |
## H

| Term | Description |
| --- | --- |
| HH-RLHF | Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback, a human preference dataset by Anthropic. |
| HNSW | Hierarchical Navigable Small Worlds, a graph-based algorithm for approximate nearest-neighbor search. |
## I

| Term | Description |
| --- | --- |
| ILSVRC2012 | ImageNet Large Scale Visual Recognition Challenge 2012, a competition to estimate the content of photographs for the purposes of retrieval and automatic annotation, using a subset of the large hand-labeled ImageNet dataset (10,000,000 labeled images depicting 10,000+ object categories) for training. |
## J

| Term | Description |
| --- | --- |
| JFT | JFT-300M, an internal Google dataset used for training image classification models. Images are labeled by an algorithm that uses a complex mixture of raw web signals, connections between web pages, and user feedback. |
## L

| Term | Description |
| --- | --- |
| LAION-400M | Large-scale Artificial Intelligence Open Network, an open dataset of 400 million CLIP-filtered image-text pairs. |
| LaMDA | Language Model for Dialogue Applications, a conversational LLM by Google. |
| LCM | Latent Consistency Models. |
| LLaMA | Large Language Model Meta AI. |
| LLaSM | Large Language and Speech Model. |
| LLaVA | Large Language and Vision Assistant, an end-to-end trained large multimodal model that connects a vision encoder and an LLM for general-purpose visual and language understanding. |
| LLM | Large Language Model, an AI model trained on massive amounts of text data to understand language and generate novel content in human-like language. |
| LMM | Large Multimodal Model, a model that handles multiple modalities, such as DeepMind's Flamingo, Google's PaLM-E, Salesforce's BLIP, Microsoft's KOSMOS-1, and Tencent's Macaw-LLM; chatbots like ChatGPT and Gemini are LMMs. |
| LoRA | Low-Rank Adaptation, a fine-tuning method that uses low-rank matrices to adapt a pre-trained model to a new task; see the sketch after this table. |
| LRV | Large-scale Robust Visual, a large and diverse visual instruction tuning dataset. |
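
A minimal sketch of the LoRA idea (PyTorch assumed; `LoRALinear` and the shapes are illustrative): the frozen weight W is augmented with a trainable low-rank product B·A, so only r·(d_in + d_out) parameters are trained instead of d_in·d_out.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update."""
    def __init__(self, d_in, d_out, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)              # freeze W
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)  # r x d_in
        self.B = nn.Parameter(torch.zeros(d_out, r))        # d_out x r, zero init
        self.scale = alpha / r

    def forward(self, x):
        # Equivalent to multiplying by (W + scale * B A); update starts at zero.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(768, 768, r=8)
y = layer(torch.randn(4, 768))
print(y.shape)  # torch.Size([4, 768])
```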
## M

| Term | Description |
| --- | --- |
| M3W | MultiModal MassiveWeb, an image and text dataset by DeepMind, used to train Flamingo, a multimodal LLM. |
| MAWPS | A Math Word Problem Repository, an online repository of math word problems that provides a unified testbed for evaluating different algorithms. |
| ML | Machine Learning, a component of AI that allows computers to learn and make better predictive outcomes without explicit programming. Can be coupled with training sets to generate new content. |
| MLP | Multilayer Perceptron, a feedforward artificial neural network built from more than one layer of perceptrons; see the sketch after this table. |
| MLLM | Multimodal Large Language Model. |
| MLM | Masked Language Model. |
| MMLU | Massive Multitask Language Understanding, a test to measure a text model's multitask accuracy. |
| MRC | Machine Reading Comprehension. |
| MTPB | Multi-Turn Programming Benchmark, a benchmark of 115 diverse problem sets factorized into multi-turn prompts. |
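
For the MLP entry, the smallest useful example (PyTorch assumed; the sizes are hypothetical):

```python
import torch
import torch.nn as nn

# Two linear layers with a nonlinearity in between: the smallest MLP.
mlp = nn.Sequential(
    nn.Linear(784, 128),  # input layer -> hidden layer
    nn.ReLU(),            # the nonlinearity is what makes depth useful
    nn.Linear(128, 10),   # hidden layer -> output layer
)
logits = mlp(torch.randn(32, 784))  # batch of 32 flattened 28x28 images
print(logits.shape)                 # torch.Size([32, 10])
```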
## N

| Term | Description |
| --- | --- |
| NEFTune | Noisy Embedding Instruction Fine-Tuning, an algorithm that adds noise to the embedding vectors during the forward pass of fine-tuning; see the sketch after this table. |
| NeurIPS | Neural Information Processing Systems, a machine learning conference. |
| NLG | Natural Language Generation, a branch of AI that uses machine learning and deep learning to generate human-like language. |
| NLP | Natural Language Processing, a branch of AI that uses machine learning and deep learning to give computers the ability to understand human language, often using learning algorithms, statistical models, and linguistic rules. |
| NLU | Natural Language Understanding, understanding the relationships and meaning in text data. |
| NSP | Next Sentence Prediction. |
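
A minimal sketch of the NEFTune trick (PyTorch assumed; `neftune_noise` is an illustrative name), following the paper's α/√(L·d) scaling rule; the noise is applied only during fine-tuning, not at inference:

```python
import torch

def neftune_noise(embeds, alpha=5.0):
    """Add NEFTune-style uniform noise to token embeddings.

    embeds: (batch, seq_len, dim). The noise magnitude is scaled by
    alpha / sqrt(seq_len * dim), as in the NEFTune paper.
    """
    batch, seq_len, dim = embeds.shape
    scale = alpha / (seq_len * dim) ** 0.5
    noise = torch.empty_like(embeds).uniform_(-scale, scale)
    return embeds + noise

x = torch.randn(2, 16, 512)        # toy batch of token embeddings
x_noisy = neftune_noise(x)         # applied on the fine-tuning forward pass
print((x_noisy - x).abs().max())   # bounded by alpha / sqrt(16 * 512)
```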
## P

| Term | Description |
| --- | --- |
| PaLM | Pathways Language Model, an LLM by Google. |
| PEFT | Parameter-Efficient Fine-Tuning. |
| POMDP | Partially Observable Markov Decision Process, a model for decision making in situations where outcomes are partly random and partly under the control of a decision maker. |
| POPE | Polling-based Object Probing Evaluation, an evaluation metric for probing object hallucination in LVLMs. |
| PPO | Proximal Policy Optimization, a foundational RL algorithm for learning from human preferences. |
## Q

| Term | Description |
| --- | --- |
| QLoRA | Quantized Low-Rank Adaptation, a fine-tuning method that combines quantization and LoRA (low-rank adapters). |
| Quantization | The process of reducing the numerical precision of a model's tensors, making the model more compact and its operations faster to execute; see the sketch after this table. |
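
To make the Quantization entry concrete, a sketch of symmetric absmax int8 quantization (one simple scheme; methods like AWQ and GPTQ above are more sophisticated):

```python
import numpy as np

def quantize_int8(x):
    """Symmetric absmax quantization: float32 -> int8 plus one scale factor."""
    scale = np.abs(x).max() / 127.0          # map the largest magnitude to 127
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
print(np.abs(w - w_hat).max())  # small reconstruction error, 4x less memory
```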
## R

| Term | Description |
| --- | --- |
| RAG | Retrieval-Augmented Generation, an AI framework that combines an information retrieval component with a text generation model to improve the quality of responses generated by LLMs. |
| ResNet | Residual Neural Network (a.k.a. Residual Network, ResNet), a deep learning model in which the weight layers learn residual functions with reference to the layer inputs. |
| RLHF | Reinforcement Learning from Human Feedback. |
| RoBERTa | Robustly Optimized BERT Approach. |
| RoPE | Rotary Position Embedding, an upgrade to the traditional sinusoidal positional embedding in the Transformer architecture; see the sketch after this table. |
| ROUGE | Recall-Oriented Understudy for Gisting Evaluation, a metric for comparing a generated sentence against a reference sentence. |
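
A minimal NumPy sketch of the rotary idea from the RoFormer paper (toy vector; `rope` is an illustrative name): each pair of dimensions is rotated by a position-dependent angle, which makes attention dot products depend only on relative offsets.

```python
import numpy as np

def rope(x, pos, base=10000.0):
    """Apply rotary position embedding to one token's vector x (even dim).

    Consecutive pairs (x[2i], x[2i+1]) are rotated by pos * theta_i,
    with theta_i = base ** (-2i / d), as in the RoFormer paper.
    """
    d = x.shape[-1]
    theta = base ** (-np.arange(0, d, 2) / d)  # one angle per dimension pair
    ang = pos * theta
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[0::2], x[1::2]
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin            # 2D rotation per pair
    out[1::2] = x1 * sin + x2 * cos
    return out

q = np.random.randn(8)
# Relative-position property: dot products depend only on the offset.
print(np.dot(rope(q, pos=3), rope(q, pos=7)))
print(np.dot(rope(q, pos=10), rope(q, pos=14)))  # same offset (4) -> same value
```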
## S

| Term | Description |
| --- | --- |
| SFT | Supervised Fine-Tuning, a fine-tuning method in which a pre-trained model is trained on labeled examples, commonly used with LLMs. |
| SLM | Either Small Language Model or Statistical Language Model, depending on context. |
| SQuAD | Stanford Question Answering Dataset, a reading comprehension dataset. |
| SVAMP | Simple Variations on Arithmetic Math word Problems, a challenge set for more robust evaluation of automatic MWP (Math Word Problem) solvers. |
## T

| Term | Description |
| --- | --- |
| T5 | Text-to-Text Transfer Transformer, an LLM by Google. |
| TRL | Transformer Reinforcement Learning, a library for training transformer language models with reinforcement learning on language generation tasks. |
## V

| Term | Description |
| --- | --- |
| VIGC | Visual Instruction Generation and Correction, a framework that enables multimodal large language models to generate instruction-tuning data and progressively enhance its quality on the fly. |
| ViT | Vision Transformer, a vision model based as closely as possible on the Transformer architecture originally designed for text-based tasks. |
| VLU | Vision Language Understanding, like Natural Language Understanding (NLU) but for images. |
| VRAM | Video Random Access Memory, a special type of memory that stores graphics data for the GPU. |
## W

| Term | Description |
| --- | --- |
| Woodpecker | A training-free, five-step method to correct hallucinations in MLLMs. |
## X

| Term | Description |
| --- | --- |
| XLM | Cross-lingual Language Models. |
| XLNet | Generalized Autoregressive Pretraining for Language Understanding. |
| XLU | Cross-lingual Understanding. |
| XNLI | Cross-lingual Natural Language Inference. |
## Z

| Term | Description |
| --- | --- |
| ZeRO | Zero Redundancy Optimizer, a memory-optimization technique for large-scale distributed training, used in DeepSpeed. |
Note: PRs are accepted. Feel free to add more terms and their details.