Since google is charging for their translate service and bing translator is crap, I decided to switch to Wiktionary. I downloaded main dump marked “en” only to discover, that it contains a bit more than just English. Below is the list of languages with their corresponding word count. Are all of those languages real?
Language | Word count |
---|---|
Latin | 582310 |
Italian | 472491 |
English | 450604 |
French | 239073 |
Spanish | 209495 |
Finnish | 104013 |
Esperanto | 99474 |
Swedish | 79173 |
German | 65890 |
Dutch | 56233 |
Catalan | 54316 |
Bulgarian | 36639 |
Translingual | 36572 |
Portuguese | 35189 |
Japanese | 33950 |
Mandarin | 31910 |
Polish | 29709 |
Hungarian | 29223 |
Greek | 22373 |
Danish | 21547 |
Lithuanian | 20202 |
Pronunciation | 19447 |
Czech | 19004 |
Galician | 16475 |
Russian | 14887 |
Georgian | 12348 |
Scottish Gaelic | 12276 |
Turkish | 9994 |
Gothic | 9305 |
Romanian | 8943 |
Armenian | 8635 |
Irish | 8398 |
Icelandic | 8064 |
Etymology | 7642 |
Latvian | 7008 |
Noun | 6290 |
Alternative forms | 5877 |
Korean | 5706 |
Ancient Greek | 5548 |
Manx | 5483 |
Ido | 5432 |
Persian | 4956 |
Hebrew | 4896 |
Telugu | 4188 |
Norwegian | 3966 |
Venetian | 3350 |
Arabic | 3304 |
Malagasy | 3292 |
Luxembourgish | 3264 |
Kurdish | 3239 |
Hindi | 3168 |
Old English | 3048 |
Faroese | 2987 |
Estonian | 2979 |
Old Armenian | 2518 |
Vietnamese | 2512 |
Norwegian Nynorsk | 2464 |
Thai | 2396 |
Old Church Slavonic | 2157 |
Lojban | 2015 |
Albanian | 2001 |
Navajo | 1981 |
Crimean Tatar | 1831 |
Bengali | 1817 |
Aramaic | 1676 |
Slovene | 1626 |
Asturian | 1598 |
Swahili | 1524 |
Classical Syriac | 1491 |
Old French | 1485 |
Welsh | 1408 |
Hiligaynon | 1342 |
Urdu | 1258 |
Middle French | 1250 |
Sanskrit | 1214 |
Adjective | 1209 |
Indonesian | 1203 |
Maltese | 1179 |
Romansch | 1153 |
Interlingua | 1108 |
Azeri | 1099 |
Verb | 974 |
Scots | 968 |
Classical Nahuatl | 921 |
Occitan | 915 |
Min Nan | 907 |
Sicilian | 904 |
Breton | 887 |
Ladino | 864 |
Basque | 836 |
Middle English | 779 |
Old Irish | 776 |
Greenlandic | 775 |
Yiddish | 774 |
Mapudungun | 707 |
Cornish | 706 |
Khmer | 665 |
Northern Sami | 665 |
Macedonian | 636 |
Ukrainian | 631 |
Afrikaans | 630 |
Tajik | 628 |
Proper noun | 613 |
Slovak | 586 |
West Frisian | 583 |
Haitian Creole | 578 |
Pashto | 571 |
Hawaiian | 562 |
Cantonese | 559 |
Lao | 559 |
Ottoman Turkish | 532 |
Old High German | 512 |
Taos | 510 |
Ewe | 504 |
Old Saxon | 497 |
Malay | 492 |
Old Norse | 490 |
Tagalog | 476 |
Chamicuro | 434 |
Tamil | 431 |
Vilamovian | 414 |
Kashubian | 413 |
Tarantino | 409 |
Neapolitan | 405 |
Dalmatian | 401 |
Ugaritic | 389 |
Tatar | 383 |
Aromanian | 382 |
Belarusian | 354 |
Chickasaw | 340 |
Baluchi | 330 |
Cherokee | 316 |
Vai | 311 |
Kannada | 304 |
Tok Pisin | 297 |
Romani | 291 |
Norman | 283 |
Limburgish | 272 |
Nahuatl | 257 |
Turkmen | 256 |
Yurok | 253 |
Low German | 245 |
Mongolian | 243 |
Tigrinya | 241 |
American Sign Language | 236 |
Burmese | 236 |
Khakas | 222 |
Old Prussian | 216 |
Ojibwe | 207 |
Lower Sorbian | 203 |
Saanich | 202 |
Walloon | 198 |
Tibetan | 196 |
Uyghur | 196 |
Adverb | 195 |
Karelian | 186 |
Uzbek | 184 |
Marshallese | 183 |
Bashkir | 175 |
Kazakh | 175 |
Wiradhuri | 173 |
Gujarati | 170 |
Sranan Tongo | 170 |
Amharic | 169 |
Egyptian | 161 |
Aragonese | 158 |
Maori | 157 |
Yucatec Maya | 155 |
Ossetian | 154 |
Chechen | 150 |
Egyptian Arabic | 150 |
Old Frisian | 150 |
Rapa Nui | 149 |
Polabian | 148 |
Inuktitut | 143 |
Kyrgyz | 139 |
Tahitian | 139 |
Corsican | 128 |
Malayalam | 127 |
Ngarrindjeri | 126 |
Samoan | 126 |
Kott | 125 |
Libyan Arabic | 125 |
Gamilaraay | 122 |
Livonian | 122 |
Central Atlas Tamazight | 117 |
Mycenaean Greek | 117 |
Inari Sami | 116 |
Old Persian | 116 |
Punjabi | 115 |
Sinhalese | 114 |
Suffix | 112 |
Yoruba | 111 |
Pitjantjatjara | 105 |
Tulu | 104 |
Old South Arabian | 101 |
Saterland Frisian | 100 |
Phoenician | 92 |
Kumyk | 91 |
Rohingya | 89 |
Akkadian | 88 |
Martuthunira | 88 |
Veps | 88 |
Guugu Yimidhirr | 86 |
Quechua | 85 |
Darkinjung | 84 |
Upper Sorbian | 84 |
Fiji Hindi | 82 |
Chamorro | 79 |
Fijian | 78 |
Santali | 78 |
Pronoun | 77 |
Skolt Sami | 74 |
Rajasthani | 73 |
Cardinal number | 72 |
Samogitian | 68 |
Alabama | 66 |
Cebuano | 66 |
Tswana | 66 |
Choctaw | 65 |
Abenaki | 63 |
Hittite | 62 |
Numeral | 61 |
Papiamentu | 61 |
Erzya | 59 |
Marathi | 59 |
Ainu | 58 |
Aleut | 58 |
Latgalian | 58 |
Lingala | 57 |
Middle Dutch | 55 |
Gooniyandi | 51 |
Seri | 51 |
Shabo | 51 |
Zulu | 51 |
Carian | 50 |
Chinook Jargon | 48 |
Luwian | 48 |
Okinawan | 48 |
Sichuan Yi | 48 |
Tocharian B | 48 |
Elfdalian | 47 |
Southern Altai | 47 |
Torres Strait Creole | 46 |
Undetermined | 46 |
Yakut | 46 |
Kabyle | 45 |
Pulaar | 45 |
Mandinka | 44 |
Bandjalang | 43 |
Evenki | 43 |
Gallo | 43 |
Abkhaz | 42 |
Alemannic German | 42 |
Amanab | 42 |
Sumerian | 42 |
Koryak | 41 |
Tuvan | 41 |
Votic | 41 |
Tocharian A | 40 |
Novial | 39 |
Phrase | 39 |
Mari | 38 |
Woiwurrung | 38 |
Northern Yukaghir | 37 |
Old Swedish | 37 |
Jingpho | 36 |
Lydian | 36 |
Pumpokol | 36 |
Sardinian | 36 |
Manchu | 35 |
Old Dutch | 35 |
Balinese | 34 |
Interjection | 34 |
Lycian | 34 |
Northern Thai | 34 |
Pali | 34 |
Arabela | 33 |
Chuvash | 33 |
Luhya | 32 |
Mirandese | 32 |
Silesian | 32 |
Cree | 31 |
Japanese | 30 |
Middle High German | 30 |
Middle Irish | 30 |
Conjunction | 29 |
Lakota | 29 |
Laz | 29 |
Lezgi | 29 |
Shan | 29 |
Arin | 28 |
Aymara | 28 |
Dacian | 28 |
Mazanderani | 28 |
Miyako | 28 |
Zhuang | 28 |
Buginese | 27 |
North Frisian | 27 |
Dutch Low Saxon | 26 |
Preposition | 26 |
Tetum | 26 |
Meriam | 25 |
Alcozauca Mixtec | 24 |
Betawi | 24 |
Igbo | 24 |
Oriya | 24 |
Sindhi | 24 |
Friulian | 23 |
Hiri Motu | 23 |
Ladin | 23 |
Somali | 23 |
Laki | 22 |
Primitive Irish | 22 |
Tai Dam | 22 |
Campidanese Sardinian | 21 |
Cappadocian Greek | 21 |
Cheyenne | 21 |
Mingrelian | 21 |
Niuean | 21 |
Old Portuguese | 21 |
Yindjibarndi | 21 |
Assamese | 20 |
Hunsrik | 20 |
Shor | 20 |
Goguryeo | 19 |
Krisa | 19 |
Moroccan Arabic | 19 |
Southern Sami | 19 |
Tongan | 19 |
Xhosa | 19 |
Adangme | 18 |
Aguaruna | 18 |
Lombard | 18 |
Luo | 18 |
Middle Low German | 18 |
Phrygian | 18 |
Aari | 17 |
Afar | 17 |
Bavarian | 17 |
Determiner | 17 |
Ga | 17 |
Gagauz | 17 |
Jamaican Creole | 17 |
Nenets | 17 |
Norfuk | 17 |
Old Georgian | 17 |
Chinese | 16 |
Dolgan | 16 |
Kriol | 16 |
Michif | 16 |
Umbrian | 16 |
Warlpiri | 16 |
Abaza | 15 |
Archi | 15 |
Extremaduran | 15 |
Nigerian Pidgin | 15 |
Piedmontese | 15 |
Acholi | 14 |
Amuzgo | 14 |
Assan | 14 |
Gheg Albanian | 14 |
Moabite | 14 |
Sundanese | 14 |
Article | 13 |
Bislama | 13 |
Coptic | 13 |
Hmong | 13 |
Nepali | 13 |
Samoan Plantation Pidgin | 13 |
Sotho | 13 |
Wolof | 13 |
Abbreviation | 12 |
Ama | 12 |
Avar | 12 |
Dhuwal | 12 |
Dungan | 12 |
Isthmus Zapotec | 12 |
Middle Persian | 12 |
Sydney | 12 |
Tachelhit | 12 |
Udi | 12 |
Warao | 12 |
Yatzachi Zapotec | 12 |
Akan | 11 |
Baure | 11 |
Dhivehi | 11 |
Gaulish | 11 |
Inupiak | 11 |
Javanese | 11 |
Kedah Malay | 11 |
Kildin Sami | 11 |
Middle Korean | 11 |
Mon | 11 |
Nias | 11 |
Oscan | 11 |
Punic | 11 |
Tambora | 11 |
Uab Meto | 11 |
Acehnese | 10 |
Aklanon | 10 |
Baekje | 10 |
Ilocano | 10 |
Lenape | 10 |
Lule Sami | 10 |
Old East Slavic | 10 |
Old Polish | 10 |
Twi | 10 |
Woi | 10 |
Aghul | 9 |
Bambara | 9 |
Berbice Creole Dutch | 9 |
Chiquitano | 9 |
Cuiba | 9 |
Hawaiian Pidgin | 9 |
Hopi | 9 |
Kongo | 9 |
Nauruan | 9 |
Nyankole | 9 |
Pennsylvania German | 9 |
Seraiki | 9 |
Wallisian | 9 |
Yola | 9 |
Zoogocho Zapotec | 9 |
Abau | 8 |
Ambonese Malay | 8 |
Ankave | 8 |
Catawba | 8 |
Chichewa | 8 |
Dzongkha | 8 |
Eblaite | 8 |
Ket | 8 |
Logudorese Sardinian | 8 |
Mauritian Creole | 8 |
Nogai | 8 |
Old Korean | 8 |
Potawatomi | 8 |
Rotuman | 8 |
Tacana | 8 |
Udmurt | 8 |
Avestan | 7 |
Comox | 7 |
Etruscan | 7 |
Kanuri | 7 |
Karakalpak | 7 |
Kiput | 7 |
Kwanyama | 7 |
Letter | 7 |
Penobscot | 7 |
Picard | 7 |
Tuvaluan | 7 |
Western Apache | 7 |
Western Arrernte | 7 |
Akkala Sami | 6 |
Arapaho | 6 |
Contraction | 6 |
Hausa | 6 |
Kalmyk | 6 |
Kven | 6 |
Minangkabau | 6 |
Number | 6 |
Old Spanish | 6 |
Shuar | 6 |
Svan | 6 |
Tiwi | 6 |
English | 5 |
Ammonite | 5 |
Angaataha | 5 |
Annobonese | 5 |
Banjarese | 5 |
Dieri | 5 |
Gilbertese | 5 |
Idiom | 5 |
Ingush | 5 |
Kamba | 5 |
Karamojong | 5 |
Kinyarwanda | 5 |
Mansi | 5 |
Moksha | 5 |
Panyjima | 5 |
Prefix | 5 |
Shoshone | 5 |
Weyewa | 5 |
Yonaguni | 5 |
Igbo | 4 |
Adyghe | 4 |
Alternative form | 4 |
Alutiiq | 4 |
Aneme Wake | 4 |
Angor | 4 |
Balti | 4 |
Chagatai | 4 |
Chol | 4 |
Croatian | 4 |
Ingrian | 4 |
Jurchen | 4 |
Kabardian | 4 |
Kapampangan | 4 |
Kashmiri | 4 |
Khitan | 4 |
Lak | 4 |
Lozi | 4 |
Luganda | 4 |
Mangarevan | 4 |
Middle Welsh | 4 |
Min Dong | 4 |
Nganasan | 4 |
Northern Sotho | 4 |
Oromo | 4 |
Pangasinan | 4 |
Rusyn | 4 |
Southern Ohlone | 4 |
Sursurunga | 4 |
Swati | 4 |
Vandalic | 4 |
Wageman | 4 |
Russian | 3 |
Spanish | 3 |
Akawaio | 3 |
Ambai | 3 |
Bishnupriya Manipuri | 3 |
Brahui | 3 |
Coastal Kadazan | 3 |
Faliscan | 3 |
Istriot | 3 |
Itelmen | 3 |
Kirundi | 3 |
Lavukaleve | 3 |
Madurese | 3 |
Makhuwa | 3 |
Marti Ke | 3 |
Old Tupi | 3 |
Ordinal number | 3 |
Shona | 3 |
Tigre | 3 |
Yanyuwa | 3 |
Yidiny | 3 |
Czech | 2 |
Abu | 2 |
Alutor | 2 |
Ansus | 2 |
Anuta | 2 |
Awabakal | 2 |
Awadhi | 2 |
Bactrian | 2 |
Beja | 2 |
Biak | 2 |
Bihari | 2 |
Bikol | 2 |
Blackfoot | 2 |
Buryat | 2 |
Central Tarahumara | 2 |
Chiricahua | 2 |
Cimbrian | 2 |
Creek | 2 |
Dakota | 2 |
Fula | 2 |
Gabi | 2 |
Garifuna | 2 |
Gronings | 2 |
Gunai | 2 |
Gusii | 2 |
Haitian Vodoun Culture Language | 2 |
Interlingue | 2 |
Kabuverdianu | 2 |
Kalami | 2 |
Kalenjin | 2 |
Limbu | 2 |
Literary Chinese | 2 |
Marsian | 2 |
Mende | 2 |
Mizo | 2 |
Mohawk | 2 |
Munggui | 2 |
Ndonga | 2 |
Newari | 2 |
Ngaju | 2 |
Ngawun | 2 |
Northern Dagara | 2 |
Paiwan | 2 |
Particle | 2 |
Rutul | 2 |
Sasak | 2 |
Sayula Popoluca | 2 |
Seneca | 2 |
Shelta | 2 |
South Picene | 2 |
Taimyr Pidgin Russian | 2 |
Tamashek | 2 |
Ter Sami | 2 |
Tumbuka | 2 |
Wu | 2 |
Zuni | 2 |
Bulgarian | 1 |
Chinese | 1 |
Esperanto | 1 |
Estonian | 1 |
French | 1 |
Hawaiian | 1 |
Hebrew | 1 |
Korean | 1 |
Latin | 1 |
Macedonian | 1 |
Mandarin | 1 |
Min Nan | 1 |
Old Norse | 1 |
Sanskrit | 1 |
Ukrainian | 1 |
Verb | 1 |
Welsh | 1 |
Abanyom | 1 |
Achumawi | 1 |
Adele | 1 |
Adioukrou | 1 |
Alamblak | 1 |
Amaimon | 1 |
Ambulas | 1 |
Antillean Creole | 1 |
Arawak | 1 |
Argobba | 1 |
Atayal | 1 |
Aynu | 1 |
Baruga | 1 |
Batak Toba | 1 |
Bikol Central | 1 |
Bintulu | 1 |
Bondei | 1 |
Bribri | 1 |
Broome Pearling Lugger Pidgin | 1 |
Bube | 1 |
Buhid | 1 |
Bunurong | 1 |
Buyeo | 1 |
Central Dusun | 1 |
Central Siberian Yupik | 1 |
Central Tagbanwa | 1 |
Chinese Pidgin English | 1 |
Chinook | 1 |
Chukchi | 1 |
Comanche | 1 |
Comorian | 1 |
Cowlitz | 1 |
Dadibi | 1 |
Dargwa | 1 |
Diitidaht | 1 |
Dogrib | 1 |
Dyirbal | 1 |
Eastern Arrernte | 1 |
Eastern Canadian Inuktitut | 1 |
Eastern Cham | 1 |
Esan | 1 |
Eteocretan | 1 |
Flemish | 1 |
Fon | 1 |
Frankish | 1 |
Gallurese Sardinian | 1 |
Gayo | 1 |
Gazi | 1 |
Golin | 1 |
Greenlandic Eskimo Pidgin | 1 |
Gullah | 1 |
Haida | 1 |
Hakka | 1 |
Hurrian | 1 |
Iban | 1 |
Initialism | 1 |
Iranun | 1 |
Kalo Finnish Romani | 1 |
Kamassian | 1 |
Kashaya | 1 |
Kaurna | 1 |
Kelabit | 1 |
Kemi Sami | 1 |
Khanty | 1 |
Khazar | 1 |
Khmu | 1 |
Khvarshi | 1 |
Kikuyu | 1 |
Kilivila | 1 |
Korean | 1 |
Kuanua | 1 |
Kuvi | 1 |
Laal | 1 |
Latvia | 1 |
Lemnian | 1 |
Leonese | 1 |
Lepontic | 1 |
Louisiana Creole French | 1 |
Lubuagan Kalinga | 1 |
Lusitanian | 1 |
Maasai | 1 |
Mainstream Kenyah | 1 |
Makah | 1 |
Mandari | 1 |
Mara Chin | 1 |
Marau | 1 |
Maricopa | 1 |
Marrucinian | 1 |
Marwari | 1 |
Massachusett | 1 |
Median | 1 |
Meru | 1 |
Monguor | 1 |
Montagnais | 1 |
Motu | 1 |
Muduapa | 1 |
Mungaka | 1 |
Ngaanyatjarra | 1 |
Nhanda | 1 |
Nivkh | 1 |
Nootka | 1 |
Old Japanese | 1 |
Old Javanese | 1 |
Old Lithuanian | 1 |
Old Welsh | 1 |
Palauan | 1 |
Papuma | 1 |
Parthian | 1 |
Pinyin | 1 |
Pohnpeian | 1 |
Polari | 1 |
Puyuma | 1 |
Rarotongan | 1 |
Rejang | 1 |
Rhymes | 1 |
Rotokas | 1 |
Russenorsk | 1 |
Russia Buryat | 1 |
Sassarese Sardinian | 1 |
Sebat Bet Gurage | 1 |
Serbian | 1 |
Sherpa | 1 |
Slovincian | 1 |
Soqotri | 1 |
Southeastern Tepehuan | 1 |
Southern Hindko | 1 |
Southern Sama | 1 |
Southern Yukaghir | 1 |
Swabian | 1 |
Syriac | 1 |
Tabassaran | 1 |
Tausug | 1 |
Tsakhur | 1 |
Tshiluba | 1 |
Tuamotuan | 1 |
Umbundu | 1 |
Ume Sami | 1 |
Usage notes | 1 |
Venda | 1 |
Venetic | 1 |
Verb form | 1 |
Vinza | 1 |
Volscian | 1 |
Wabo | 1 |
Wandamen | 1 |
Waropen | 1 |
Wawa | 1 |
West Coast Bajau | 1 |
Western Kayah | 1 |
Western Panjabi | 1 |
Winnebago | 1 |
Worimi | 1 |
Xavante | 1 |
Yaeyama | 1 |
Yami | 1 |
Yankunytjatjara | 1 |
Yapese | 1 |
Yopno | 1 |
Zenaga | 1 |