Etymology Explorer
Trace the origins of words through time
Click any word in the graph to see its definition
Search for a word to see its etymological tree, or press 🎲 to get a random word
Etymology for All
Etymology should be a public good.
Many apps gatekeep etymological knowledge behind paywalls or proprietary databases. We believe the history of language belongs to everyone.
This project exists to make word origins accessible, explorable, and free. Every word has a story—discover where yours came from.
Data Sources
Etymology: EtymDB 2.1, an open etymological database derived from Wiktionary containing 1.9 million words across 2,500+ languages.
Fourrier & Sagot (2020), "Methodological Aspects of Developing and Managing an Etymological Lexical Resource", LREC 2020
Definitions: Free Dictionary API, a community-driven dictionary service also sourced from Wiktionary.
Glossary
Etymology
The study of word origins and how their meanings have changed throughout history. An etymology traces a word back through time to its earliest known form.
Ancestors
Earlier words from which a modern word was inherited or derived. These form the direct lineage of a word through time.
Example: English "mother" ← Old English "mōdor" ← Proto-Germanic "*mōdēr" ← Proto-Indo-European "*méh₂tēr"
Cognates
Words in different languages that share a common ancestor. They evolved separately but have the same root.
Example: English "friend", German "Freund", Dutch "vriend", Gothic "frijōnds" — all from Proto-Germanic "*frijōndz"
Compound Etymology
A word formed by combining multiple source words or morphemes. These connections are shown with blue edges in the graph.
Example: "uplander" = "upland" + "-er" (the suffix meaning "one who")
Morpheme
The smallest meaningful unit of language. Words are built from morphemes, including roots, prefixes, and suffixes.
Example: "unhappiness" contains three morphemes: "un-" (not) + "happy" (root) + "-ness" (state of)
Graph Traversal
The graph follows etymology connections recursively up to 5 levels deep, including both ancestors and cognates.
Example: "friend" connects to 12 words directly. Following each of those recursively for 5 levels yields 28 total nodes in the graph.
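The depth-limited traversal described above can be sketched in a few lines. This is a minimal illustration, not the app's actual code: the LINKS map is toy data, the names build_graph and max_depth are hypothetical, and it uses an iterative breadth-first walk rather than literal recursion, which reaches the same set of words.

```python
from collections import deque

# Toy adjacency map (illustrative only): each word lists the words it is
# linked to, ancestors and cognates alike. Real data would come from EtymDB.
LINKS = {
    "friend": ["frēond"],
    "frēond": ["*frijōndz"],
    "*frijōndz": ["Freund", "vriend", "*frijōną"],
    "*frijōną": ["*preyH-"],
}

def build_graph(start, links, max_depth=5):
    """Return {word: depth} for every word within max_depth hops of start."""
    depth_of = {start: 0}
    queue = deque([start])
    while queue:
        word = queue.popleft()
        if depth_of[word] == max_depth:
            continue  # depth limit reached: do not expand this word further
        for neighbor in links.get(word, []):
            if neighbor not in depth_of:  # skip words already in the graph
                depth_of[neighbor] = depth_of[word] + 1
                queue.append(neighbor)
    return depth_of

nodes = build_graph("friend", LINKS)
print(len(nodes), "nodes in the graph")
```

Tracking visited words matters because cognates link back to shared ancestors; without that check the same node would be added repeatedly.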
Language Family
A group of languages descended from a common ancestral language.
Example: English, German, Dutch, and Swedish are all part of the Germanic branch of the Indo-European family.
Proto-language
An ancestral language that left no written records and has been reconstructed by comparing its descendants. Linguists use the prefix "Proto-" for these languages and mark reconstructed forms with an asterisk (*).
Example: Proto-Indo-European (*méh₂tēr) is the reconstructed ancestor of words for "mother" across many languages, from English to Hindi.
Language Codes
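Languages are identified by standardized codes based on ISO 639, similar to how countries have two-letter codes (US, GB, DE). These codes help linguists and researchers categorize the world's ~7,000 languages consistently. Reconstructed proto-languages have no ISO 639 code of their own, so they use Wiktionary-style codes ending in "-pro".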
Examples: en = English, la = Latin, grc = Ancient Greek, ang = Old English, gem-pro = Proto-Germanic, ine-pro = Proto-Indo-European
Why Random Words Are Obscure
Word usage follows Zipf's Law: a small number of words make up most of what we read, while thousands of rare words form a "long tail." Our random button samples uniformly from the word list, so every word is equally likely to come up. Because rare words far outnumber common ones, most draws are hidden gems like "cystolithic" or "auxotrophy."
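To see why uniform sampling surfaces obscure words, compare it with sampling weighted by frequency. The sketch below is an illustration under stated assumptions: the vocabulary is synthetic, frequency is modeled as exactly 1/rank (an idealized Zipf distribution), and share_from_top is a hypothetical helper.

```python
import random

# Synthetic ranked vocabulary (assumption): word1 is the most frequent word.
vocab = [f"word{rank}" for rank in range(1, 10_001)]

# Idealized Zipf's Law: frequency is proportional to 1 / rank.
zipf_weights = [1 / rank for rank in range(1, len(vocab) + 1)]

rng = random.Random(0)  # fixed seed so repeated runs give the same draws

def share_from_top(samples, top_n=100):
    """Fraction of samples that fall among the top_n most frequent words."""
    top = set(vocab[:top_n])
    return sum(word in top for word in samples) / len(samples)

uniform = rng.choices(vocab, k=10_000)                         # like the random button
weighted = rng.choices(vocab, weights=zipf_weights, k=10_000)  # like running text

print("uniform :", share_from_top(uniform))
print("weighted:", share_from_top(weighted))
```

In this toy setup the 100 most frequent words account for about half of the frequency-weighted draws but only around 1% of the uniform ones, which is why a uniformly random word rarely looks familiar.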
How It Works
The Data Pipeline
- Source: EtymDB extracts etymology data from Wiktionary
- Curation: We filter for clean English words with valid etymology links (~40K words)
- Language metadata: Each word is tagged with its language family (e.g., "Germanic → Indo-European")
- Definitions: Enriched from Free Dictionary API (~21K definitions)
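The curation step might look something like the following. This is a rough sketch only: the record layout (id, lang, lexeme), the link format, and the rule that "clean" means lowercase ASCII letters are all assumptions made for illustration, not the project's actual schema or criteria.

```python
import re

# Assumed notion of a "clean" word: lowercase ASCII letters only.
CLEAN_WORD = re.compile(r"[a-z]+")

def curate(entries, links):
    """Keep clean English entries that have at least one valid etymology link.

    entries: list of dicts with hypothetical keys "id", "lang", "lexeme"
    links:   set of (source_id, target_id) pairs; a link counts as valid
             only if both ends refer to known entries
    """
    known_ids = {entry["id"] for entry in entries}
    linked_ids = {
        src for src, dst in links if src in known_ids and dst in known_ids
    }
    return [
        entry
        for entry in entries
        if entry["lang"] == "en"
        and CLEAN_WORD.fullmatch(entry["lexeme"])
        and entry["id"] in linked_ids
    ]

entries = [
    {"id": 1, "lang": "en", "lexeme": "friend"},
    {"id": 2, "lang": "ang", "lexeme": "frēond"},
    {"id": 3, "lang": "en", "lexeme": "e-mail 2.0"},  # rejected: not clean
    {"id": 4, "lang": "en", "lexeme": "orphan"},      # rejected: dangling link
]
links = {(1, 2), (3, 2), (4, 99)}
kept = curate(entries, links)
```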
Reading the Graph
- Arrows point from modern words to their ancestors
- Click any word to see its language family and definition
- Language families show how languages are historically related
Definition Matching
EtymDB provides a sense field for each word entry (e.g., "bank" might have senses like "financial institution" or "side of a river"). When the sense differs from the word itself, we display it directly.
Key assumption: When a word's sense equals its lexeme (e.g., sense="bank" for word "bank"), we fall back to the first definition from the Free Dictionary API. We assume this primary definition corresponds to the word's main etymological meaning. This may not always be accurate for words with multiple distinct origins.
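The fallback rule above can be expressed as a small function. This is a sketch under assumptions: pick_definition and its parameters are hypothetical names, and the case-insensitive, whitespace-trimmed comparison is a guess at how "equals" might be implemented.

```python
from typing import Optional

def pick_definition(
    lexeme: str, sense: Optional[str], api_definitions: list[str]
) -> Optional[str]:
    """Choose the text to display for a word.

    lexeme:          the word itself, e.g. "bank"
    sense:           the sense field from the etymology entry, if any
    api_definitions: definitions returned by the dictionary service,
                     in the order the service lists them
    """
    # Assumed normalization: ignore case and surrounding whitespace.
    if sense and sense.strip().lower() != lexeme.strip().lower():
        return sense  # the sense carries real information, show it directly
    if api_definitions:
        return api_definitions[0]  # fall back to the first dictionary definition
    return None  # no definition available for this word

print(pick_definition("bank", "side of a river", ["A financial institution"]))
print(pick_definition("bank", "bank", ["A financial institution"]))
```

The second call shows the weak spot noted above: when the sense is uninformative, the first dictionary definition is used even if it belongs to a different origin of the word.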
Limitations
Not all words have definitions available. Some etymology connections may be incomplete or reflect Wiktionary's editorial choices. Compound word breakdowns (e.g., "magn-animus") are not yet supported. Definition matching between EtymDB senses and dictionary entries is approximate—there is no shared identifier between the two data sources.