Glossary
A
| Term | Description |
|------|-------------|
| A100 | Ampere 100, an NVIDIA GPU based on the Ampere architecture, named after the French mathematician and physicist André-Marie Ampère. |
| AGI | Artificial General Intelligence, a hypothesized form of AI more advanced than today's systems, one that can perform tasks much better than humans while also teaching and advancing its own capabilities. |
| ALIGN | A Large-scale ImaGe and Noisy-Text Embedding, a dataset of 1.8 billion image–text pairs by Google. |
| AWQ | Activation-aware Weight Quantization, an algorithm for quantizing LLMs, similar in purpose to GPTQ. |
B
| Term | Description |
|------|-------------|
| BART | Bidirectional and Auto-Regressive Transformers, an LLM by Facebook AI (Meta). |
| BELEBELE | A Bambara word meaning "big, large, fat, great". A dataset containing 900 unique multiple-choice reading comprehension questions, each associated with one of 488 distinct passages. |
| BERT | Bidirectional Encoder Representations from Transformers, an LLM by Google. |
| BIG-Bench | Beyond the Imitation Game Benchmark, a benchmark for measuring the performance of language models across a diverse set of tasks. |
| BiT | Big Transfer, a family of transfer learning models pre-trained on large datasets. |
| BLEU | BiLingual Evaluation Understudy, a metric for comparing a generated sentence against a reference sentence. |
| BLOOM | BigScience Large Open-science Open-access Multilingual Language Model. |
| BPE | Byte Pair Encoding, a tokenization method that iteratively merges the most frequent adjacent symbol pairs (see the sketch below this table). |
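
The following is a minimal Python sketch of the BPE merge-learning loop: count adjacent symbol pairs weighted by word frequency, merge the most frequent pair, and repeat. The toy vocabulary and the `learn_bpe`/`merge_pair` helpers are illustrative, not taken from any particular library.

```python
from collections import Counter

def merge_pair(word, a, b):
    """Merge every adjacent occurrence of symbols (a, b) in a space-separated word."""
    symbols, out, i = word.split(), [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and symbols[i] == a and symbols[i + 1] == b:
            out.append(a + b)
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return " ".join(out)

def learn_bpe(vocab, num_merges):
    """vocab maps words written as space-separated symbols to their corpus counts."""
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            symbols = word.split()
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq                      # count adjacent pairs, weighted by word frequency
        if not pairs:
            break
        a, b = max(pairs, key=pairs.get)                 # most frequent pair becomes the next merge rule
        merges.append((a, b))
        vocab = {merge_pair(w, a, b): f for w, f in vocab.items()}
    return merges

# toy corpus: frequent endings such as ('e', 's') and ('es', 't') are merged first
print(learn_bpe({"l o w </w>": 5, "l o w e r </w>": 2, "n e w e s t </w>": 6, "w i d e s t </w>": 3}, 5))
```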
C
| Term | Description |
|------|-------------|
| C4 | The Colossal Clean Crawled Corpus, a dataset of roughly 800GB of English text collected from the web. |
| Chinchilla | A 70B-parameter model trained by DeepMind as a compute-optimal model on 1.4 trillion tokens. |
| CLIP | Contrastive Language-Image Pre-training, which maps data of different modalities, text and images, into a shared embedding space (see the sketch below this table). |
| CoQA | A large-scale dataset for building Conversational Question Answering systems, containing 127,000+ questions with answers collected from 8,000+ conversations. |
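
As a rough illustration of the shared embedding space, the sketch below scores image embeddings against text embeddings with temperature-scaled cosine similarity, which is the quantity CLIP's contrastive loss is computed over. The function name and the fixed temperature value are illustrative assumptions, not CLIP's actual API.

```python
import numpy as np

def clip_style_logits(image_emb, text_emb, temperature=0.07):
    """image_emb: (n_images, d), text_emb: (n_texts, d) — outputs of the two encoders."""
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)   # L2-normalize rows
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    return (img @ txt.T) / temperature   # (n_images, n_texts); matched pairs sit on the diagonal

logits = clip_style_logits(np.random.randn(4, 512), np.random.randn(4, 512))
print(logits.shape)  # (4, 4)
```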
D
| Term | Description |
|------|-------------|
| DALL-E | OpenAI's text-to-image generation model. The name is a portmanteau of the animated Pixar robot character WALL-E and the Spanish surrealist artist Salvador Dalí. |
| DPO | Direct Preference Optimization, a new parameterization of the reward model in RLHF that reduces preference training to a simple classification loss (see the sketch below this table). |
| DPR | Dense Passage Retrieval. |
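
A minimal sketch of that classification loss, assuming you already have per-sequence log-probabilities of the chosen and rejected responses under the policy and under the frozen reference model; the function and variable names are illustrative.

```python
import numpy as np

def dpo_loss(pi_logp_chosen, pi_logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """-log sigmoid(beta * ((log pi/ref on chosen) - (log pi/ref on rejected)))."""
    margin = beta * ((pi_logp_chosen - ref_logp_chosen) - (pi_logp_rejected - ref_logp_rejected))
    return -np.log(1.0 / (1.0 + np.exp(-margin)))   # binary classification loss on the preference pair

print(dpo_loss(-12.0, -15.0, -13.0, -14.0))   # lower loss when the policy prefers the chosen response
```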
E
| Term | Description |
|------|-------------|
| ELMo | Embeddings from Language Models. |
| ERNIE | Enhanced Representation through kNowledge IntEgration. |
| ELECTRA | Efficiently Learning an Encoder that Classifies Token Replacements Accurately. |
F
| Term | Description |
|------|-------------|
| FAIR | Facebook AI Research. |
| FLAN | Finetuned LAnguage Net, Google's family of instruction-finetuned language models. |
| FLOPS | Floating Point Operations Per Second. |
| FLoRes | Facebook Low Resource Machine Translation Benchmark, a low-resource MT dataset. |
G
| Term | Description |
|------|-------------|
| GAVIE | GPT4-Assisted Visual Instruction Evaluation, an approach for evaluating visual instruction tuning that needs no human-annotated ground-truth answers and can adapt to diverse instruction formats. |
| GGML | Georgi Gerganov Machine Learning, a C library focused on machine learning. |
| GLaM | Generalist Language Model, a family of language models that uses a sparsely activated mixture-of-experts architecture to scale model capacity while incurring substantially less training cost than dense variants. |
| GSM8K | Grade School Math 8K, a dataset of 8.5K high-quality, linguistically diverse grade school math word problems created by human problem writers. |
| GOFAI | Good Old-Fashioned Artificial Intelligence. |
H
| Term | Description |
|------|-------------|
| HNSW | Hierarchical Navigable Small Worlds, a graph-based index for approximate nearest-neighbor search (see the sketch below this table). |
| HH-RLHF | Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback. |
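
A small usage sketch, assuming the third-party `hnswlib` Python package (one common HNSW implementation); the data and parameter values are illustrative.

```python
import numpy as np
import hnswlib  # assumed installed: pip install hnswlib

dim, n = 128, 10_000
vectors = np.random.rand(n, dim).astype(np.float32)

index = hnswlib.Index(space="cosine", dim=dim)           # build an HNSW graph over cosine distance
index.init_index(max_elements=n, ef_construction=200, M=16)
index.add_items(vectors, np.arange(n))
index.set_ef(50)                                         # query-time recall/speed trade-off

labels, distances = index.knn_query(vectors[:3], k=5)    # approximate 5 nearest neighbors per query
print(labels.shape)  # (3, 5)
```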
I
| Term | Description |
|------|-------------|
| ILSVRC2012 | ImageNet Large Scale Visual Recognition Challenge 2012, a competition to estimate the content of photographs for the purpose of retrieval and automatic annotation, using a subset of the large hand-labeled ImageNet dataset (10,000,000 labeled images depicting 10,000+ object categories) as training data. |
J
| Term | Description |
|------|-------------|
| JFT | JFT-300M, an internal Google dataset used for training image classification models. Images are labeled by an algorithm that uses a complex mixture of raw web signals, connections between web pages, and user feedback. |
L
| Term | Description |
|------|-------------|
| LAION-400M | Large-scale Artificial Intelligence Open Network, an open dataset of 400 million CLIP-filtered image–text pairs. |
| LaMDA | Language Model for Dialogue Applications. |
| LCM | Latent Consistency Models. |
| LLaMA | Large Language Model Meta AI. |
| LLaSM | Large Language and Speech Model. |
| LLM | Large Language Model, an AI model trained on massive amounts of text data to understand language and generate novel content in human-like language. |
| LLaVA | Large Language and Vision Assistant, an end-to-end trained large multimodal model that connects a vision encoder and an LLM for general-purpose visual and language understanding. |
| LMM | Large Multimodal Models, models for visual instructions like DeepMind's Flamingo, Google's PaLM-E, Salesforce's BLIP, Microsoft's KOSMOS-1, and Tencent's Macaw-LLM; chatbots like ChatGPT and Gemini are LMMs. |
| LoRA | Low-Rank Adaptation, a fine-tuning method that uses low-rank matrices to adapt a pre-trained model to a new task (see the sketch below this table). |
| LRV | Large-scale Robust Visual, a large and diverse visual instruction tuning dataset. |
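
A minimal sketch of the LoRA idea for a single linear layer: the frozen weight `W` is left untouched and a trainable low-rank update `A @ B` is added to its output. The names, shapes, and the `alpha` scaling convention here are illustrative.

```python
import numpy as np

def lora_linear(x, W, A, B, alpha=16):
    """x: (batch, d_in); W: frozen (d_in, d_out); A: (d_in, r); B: (r, d_out)."""
    r = A.shape[1]
    return x @ W + (alpha / r) * (x @ A @ B)       # only A and B receive gradient updates

d_in, d_out, r = 768, 768, 8
x = np.random.randn(2, d_in)
W = np.random.randn(d_in, d_out) * 0.02            # pretrained weight, kept frozen
A = np.random.randn(d_in, r) * 0.02                # trainable down-projection
B = np.zeros((r, d_out))                           # trainable up-projection, zero at start
print(np.allclose(lora_linear(x, W, A, B), x @ W)) # True: no change to the model before training
```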
M
| Term | Description |
|------|-------------|
| M3W | Multi Modal Massive Web, an image and text dataset by DeepMind, used to train Flamingo, a multimodal LLM. |
| MAWPS | A Math Word Problem Repository, an online repository of math word problems that provides a unified testbed to evaluate different algorithms. |
| ML | Machine Learning, a component of AI that allows computers to learn and make better predictive outcomes without explicit programming; can be coupled with training sets to generate new content. |
| MLP | Multilayer Perceptron, a deep artificial neural network composed of more than one perceptron (see the sketch below this table). |
| MLLM | Multimodal Large Language Model. |
| MLM | Masked Language Model. |
| MMLU | Massive Multitask Language Understanding, a test to measure a text model's multitask accuracy. |
| MRC | Machine Reading Comprehension. |
| MTPB | Multi-Turn Programming Benchmark, a benchmark consisting of 115 diverse problem sets that are factorized into multi-turn prompts. |
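
A tiny sketch of a two-layer MLP forward pass; the layer sizes, the ReLU choice, and the helper name are illustrative.

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """linear -> ReLU -> linear"""
    h = np.maximum(0.0, x @ W1 + b1)                 # hidden layer with ReLU activation
    return h @ W2 + b2                               # output layer

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 16))                         # batch of 4 inputs with 16 features
W1, b1 = rng.normal(size=(16, 32)), np.zeros(32)     # 16 -> 32 hidden units
W2, b2 = rng.normal(size=(32, 3)), np.zeros(3)       # 32 -> 3 outputs
print(mlp_forward(x, W1, b1, W2, b2).shape)          # (4, 3)
```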
N
| Term | Description |
|------|-------------|
| NEFTune | Noisy Embedding Instruction Fine Tuning, an algorithm that adds noise to the embedding layer during the forward pass of fine-tuning (see the sketch below this table). |
| NeurIPS | Neural Information Processing Systems, a machine learning conference. |
| NLP | Natural Language Processing, a branch of AI that uses machine learning and deep learning to give computers the ability to understand human language, often using learning algorithms, statistical models, and linguistic rules. |
| NLG | Natural Language Generation, a branch of AI that uses machine learning and deep learning to generate human-like language. |
| NLU | Natural Language Understanding, understanding the relationships and meaning in text data. |
| NSP | Next Sentence Prediction. |
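
A minimal sketch of the noise step, following the commonly described recipe of uniform noise scaled by alpha/sqrt(sequence_length * embedding_dim); treat the exact scaling and the default alpha as assumptions rather than the paper's reference code.

```python
import numpy as np

def neftune_noise(embeddings, alpha=5.0, rng=None):
    """embeddings: (seq_len, dim) token embeddings for one training example."""
    rng = rng or np.random.default_rng()
    seq_len, dim = embeddings.shape
    scale = alpha / np.sqrt(seq_len * dim)                       # noise magnitude shrinks with L*d
    noise = rng.uniform(-1.0, 1.0, size=embeddings.shape) * scale
    return embeddings + noise                                    # applied only during fine-tuning, not at inference

noisy = neftune_noise(np.random.randn(128, 4096).astype(np.float32))
print(noisy.shape)  # (128, 4096)
```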
P
| Term | Description |
|------|-------------|
| PaLM | Pathways Language Model. |
| PEFT | Parameter-Efficient Fine-Tuning. |
| POMDP | Partially Observable Markov Decision Process, a model for decision making in situations where outcomes are partly random and partly under the control of a decision maker. |
| POPE | Polling-based Object Probing Evaluation, an evaluation metric for probing the knowledge of LVLMs. |
| PPO | Proximal Policy Optimization, a reinforcement learning algorithm widely used for learning from human preferences in RLHF (see the sketch below this table). |
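
A sketch of PPO's clipped surrogate objective for a single action (a quantity the policy is updated to maximize); the function name and clip ratio are illustrative.

```python
import numpy as np

def ppo_clipped_objective(logp_new, logp_old, advantage, clip_eps=0.2):
    """Per-sample clipped surrogate objective from PPO."""
    ratio = np.exp(logp_new - logp_old)                        # probability ratio pi_new / pi_old
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    return np.minimum(ratio * advantage, clipped * advantage)  # take the more pessimistic of the two

# if the new policy over-weights an action with positive advantage, the clip limits the gain
print(ppo_clipped_objective(logp_new=-0.1, logp_old=-0.9, advantage=1.0))
```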
Q
| Term | Description |
|------|-------------|
| QLoRA | Quantized Low-Rank Adaptation, a fine-tuning method that combines quantization and LoRA (Low-Rank Adapters). |
| Quantization | The process of reducing the numerical precision of a model's tensors, making the model more compact and its operations faster to execute (see the sketch below this table). |
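
A minimal sketch of symmetric per-tensor int8 quantization, the simplest version of the idea; real LLM quantizers such as GPTQ and AWQ are considerably more sophisticated.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor quantization: x ≈ scale * q, with q stored as int8."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
print(q.nbytes / w.nbytes)                          # 0.25: int8 takes a quarter of float32 storage
print(np.abs(w - dequantize_int8(q, scale)).max())  # worst-case rounding error, roughly scale / 2
```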
R
| Term | Description |
|------|-------------|
| RAG | Retrieval-Augmented Generation, an AI framework that combines an information retrieval component with a text generation model to improve the quality of responses generated by LLMs. |
| ResNet | A Residual Neural Network (a.k.a. Residual Network, ResNet), a deep learning model in which the weight layers learn residual functions with reference to the layer inputs. |
| RLHF | Reinforcement Learning from Human Feedback. |
| RoBERTa | Robustly Optimized BERT Approach. |
| ROUGE | Recall-Oriented Understudy for Gisting Evaluation, a metric for comparing a generated text against a reference text. |
| RoPE | Rotary Position Embedding, an upgrade over the traditional sinusoidal positional embedding in the Transformer architecture (see the sketch below this table). |
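
A small numpy sketch of one common "rotate half" formulation of RoPE: each query/key vector is split into two halves and rotated by position-dependent angles. The base of 10000 matches the usual convention; treat the helper and its layout as illustrative.

```python
import numpy as np

def rope(x, base=10000.0):
    """x: (seq_len, dim) query or key vectors, dim even; returns the rotated vectors."""
    seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)        # per-dimension rotation frequencies
    angles = np.outer(np.arange(seq_len), freqs)     # (seq_len, half): angle grows with position
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

q = np.random.randn(16, 64)
print(rope(q).shape)                                 # (16, 64)
print(np.allclose(np.linalg.norm(rope(q), axis=1),
                  np.linalg.norm(q, axis=1)))        # True: rotations preserve vector norms
```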
S
| Term | Description |
|------|-------------|
| SFT | Supervised Fine-Tuning, a fine-tuning method, used especially with LLMs, in which the model is trained on labeled input–output examples. |
| SQuAD | Stanford Question Answering Dataset. |
| SVAMP | Simple Variations on Arithmetic Math word Problems, a challenge set to enable more robust evaluation of automatic MWP (Math Word Problem) solvers. |
| SLM | Can refer to either a Small Language Model or a Statistical Language Model. |
T
| Term | Description |
|------|-------------|
| T5 | Text-to-Text Transfer Transformer. |
| TRL | Transformer Reinforcement Learning, a Hugging Face library for post-training transformer language models with methods such as supervised fine-tuning, reward modeling, PPO, and DPO. |
V
| Term | Description |
|------|-------------|
| VIGC | Visual Instruction Generation and Correction, a framework that enables multimodal large language models to generate instruction-tuning data and progressively enhance its quality on-the-fly. |
| ViT | Vision Transformer, a vision model based as closely as possible on the Transformer architecture originally designed for text-based tasks. |
| VLU | Vision Language Understanding, like Natural Language Understanding (NLU) but for images. |
| VRAM | Video Random Access Memory, a special type of memory that stores graphics data for the GPU. |
W
| Term | Description |
|------|-------------|
| Woodpecker | A training-free, five-step method to correct hallucinations in MLLMs. |
X
| Term | Description |
|------|-------------|
| XLM | Cross-lingual Language Models. |
| XLU | Cross-lingual Understanding. |
| XNLI | Cross-lingual Natural Language Inference. |
| XLNet | Generalized Autoregressive Pretraining for Language Understanding. |
Z
| Term | Description |
|------|-------------|
| ZeRO | Zero Redundancy Optimizer. |
Note: PRs are accepted. Feel free to add more terms and their details.