acronym community-consensus
GQA is most commonly pronounced "jee kyoo ay" (/ˌdʒiː kjuː ˈeɪ/). This is the widely-used reading among engineers, though edge cases exist.
Grouped-Query Attention — the default attention in Llama 2/3, Qwen, Gemma, GPT-OSS. Read as an initialism, letter by letter: G-Q-A, jee-kyoo-AY, STRESS on the final letter. Said-as-a-word never caught on because the Q has no following vowel, so spelling it out is the dominant convention in papers, model cards, and talks.
Alternate readings you might hear:
Source: arXiv: GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints
Pronouncing project and product names correctly avoids the small but persistent friction of being gently corrected during standups, conference Q&As, and team calls. Hearing the word a few times locks in the right reading better than reading IPA ever will. Pronounce is a community-maintained dictionary — every entry tagged with a confidence level and (where possible) a citable source.
Install the CLI: git clone https://github.com/anzy-renlab-ai/pronounce.git && cd pronounce && ./install.sh
This whole page exists because of a community-maintained TSV. If it saved you a cringey moment, drop a star.
★ Star on GitHub