Hasn't it already become kitsch that 10,000 of something can get you mastery over something else, or something? Well, I'm no linguist (and my friends will tell you how slow I am at learning languages), but it seems that far fewer words than that can give you a significant chunk of a language. How many fewer, and how much of a chunk? Read on.





I looked into this because I wanted to boost my Hebrew skills, but I didn't know which words were the most important to learn, or in what order. I was memorizing words more or less at random, picking them up from books, conversations, and the like, and often never heard them used again. This felt frustrating and futile. It is often said (and makes perfect sense) that the most beneficial words to study are those used most often in speech. This left me in a quandary, because I couldn't find any ready-made list of the most common spoken words in modern Hebrew.





I was discussing this with a friend over lunch one day and we came upon an idea: movie subtitles are nearly 100% dialogue, so subtitle files are perfect candidates for measuring how often words are used in speech. Luckily, my friend is an avid movie hoarder, and offered me a large supply of subtitle files on the spot. By aggregating enough of these files, we figured it should be possible to parse out the individual words and get a pretty good idea of how often each one is used in bona fide, modern, spoken Hebrew. So that's what I went ahead and did.
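For anyone who wants to try the same thing, here is a minimal sketch of the counting step. It is not the exact script I used. It assumes the subtitles are in SubRip (.srt) format and treats any run of Hebrew letters as a word; the names count_words, TIMESTAMP and HEBREW_WORD are made up for this example.

```python
import re
from collections import Counter

# Assumption: SubRip (.srt) layout, where each cue has an index line,
# a timestamp line, and then one or more lines of dialogue.
TIMESTAMP = re.compile(r"\d{2}:\d{2}:\d{2},\d{3}\s*-->\s*\d{2}:\d{2}:\d{2},\d{3}")
# Runs of Hebrew letters (alef through tav, final forms included).
HEBREW_WORD = re.compile(r"[\u05D0-\u05EA]+")

def count_words(srt_text, counts=None):
    """Add the Hebrew words found in one subtitle file's text to a Counter."""
    counts = Counter() if counts is None else counts
    for line in srt_text.splitlines():
        line = line.strip()
        # Skip blank lines, cue index lines, and timestamp lines.
        if not line or line.isdigit() or TIMESTAMP.search(line):
            continue
        line = re.sub(r"<[^>]+>", "", line)  # drop formatting tags such as <i>
        counts.update(HEBREW_WORD.findall(line))
    return counts

sample = """1
00:00:01,000 --> 00:00:02,500
אני לא יודע

2
00:00:03,000 --> 00:00:04,000
<i>לא, אני לא רוצה</i>
"""
word_counts = count_words(sample)
print(word_counts.most_common(2))
```

To aggregate many files, pass the same Counter back in for each file's text. Two simplifications to be aware of: words written with an apostrophe (common in transliterated names) get split in two, and older subtitle files may not be UTF-8, so they may need decoding first.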





Aside from being extremely useful for my own studies, I found the results of this little experiment to be rather fascinating. To see why, take a look at the most common Hebrew words that came up on my list, in order from 1st to 36th:





| Hebrew word | Meaning (Google Translate) | Pronunciation | Cumulative % of word usage covered |
| --- | --- | --- | --- |
| לא | Not / no | loh | 2.8 |
| את | You (f) / * | at / eht | 5.4 |
| אני | I | anee | 7.9 |
| זה | It / that | zeh | 9.8 |
| אתה | You (m) | atah | 11.1 |
| מה | What | mah | 12.4 |
| הוא | He | hoo | 13.2 |
| לי | To me | lee | 14.1 |
| על | About / On | al | 14.9 |
| לך | To you | lekhah / lakh (m/f) | 15.5 |
| כן | Yes | ken | 16.1 |
| של | Of | shel | 16.7 |
| יש | There is | yaysh | 17.2 |
| רוצה | Want | rotzeh / rotzah (m/f) | 17.6 |
| טוב | Good | tov | 18.1 |
| כל | All | kol | 18.5 |
| אבל | But | a-vahl | 19.0 |
| בסדר | OK | beseder | 19.4 |
| אם | If | eem | 19.8 |
| שלי | My | shel-ee | 20.2 |
| עם | With | eem | 20.6 |
| יודע | Knows | yo-dey-ah | 21.0 |
| היא | She | he | 21.4 |
| היה | Was | hayah | 21.7 |
| שלך | Your | shelkhah / shelakh (m/f) | 22.1 |
| הם | They | hehm | 22.4 |
| אותך | You | oh-takh | 22.8 |
| אז | Then / so | ahz | 23.1 |
| אותו | Same | oh-toh | 23.4 |
| רק | Only | rahk | 23.7 |
| אנחנו | We | ah-nakh-noo | 23.9 |
| יותר | More | yoh-tair | 24.2 |
| יכול | Might / Can | yah-kol | 24.5 |
| אותי | Me | oh-tee | 24.7 |
| למה | Why | lamah | 25.0 |





(The * is for the second meaning of the word 'eht,' which marks a definite direct object. For example, to say "I ate the banana" in Hebrew, you would say something like "I ate eht the banana.")





Note the last column in the table, which marks the cumulative percentage of total word usages (out of >600,000) that are accounted for by each individual word plus all the words preceding it on the list. The amazing thing is that with only 36 words, we have covered 25% of total word usages in the language!
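If you'd like to build that last column yourself, here is one way it could be computed from a table of word counts. This is only a sketch: cumulative_coverage is a name invented for the example, and the toy counts are made up rather than taken from my data.

```python
from collections import Counter
from itertools import accumulate

def cumulative_coverage(counts):
    """Return (word, cumulative % of all usages) pairs, most frequent first."""
    total = sum(counts.values())
    ranked = counts.most_common()
    # Running total of usages as we walk down the ranked list.
    running = accumulate(n for _, n in ranked)
    return [(word, 100.0 * seen / total)
            for (word, _), seen in zip(ranked, running)]

# Toy counts, made up for illustration (not the real subtitle data).
toy = Counter({"a": 5, "b": 3, "c": 2})
table = cumulative_coverage(toy)
for word, pct in table:
    print(f"{word}\t{pct:.1f}")
```

Each row's percentage counts that word plus every word ranked above it, which is why the column can only go up.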





That's 36 words. How much comprehension will we have if we dutifully study and learn, say, 10,000 words? The answer is: around 80%. If you learn words in the order they appear on my list, here's how much more Hebrew you will have gained from your effort in learning each individual new word (i.e., the marginal amount of Hebrew gained per word):
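The marginal gain for a word is simply that word's own share of all word usages, i.e. the difference between its cumulative figure and the one before it. In the table above, the first word adds 2.8 percentage points by itself, while the last one (lamah) adds only 25.0 - 24.7 = 0.3. Here is a rough sketch of that calculation, plus a helper that asks how many words it takes to reach a given coverage. The function names and toy counts are invented for the example.

```python
from collections import Counter

def marginal_gain(counts):
    """Return (rank, word, % of all usages that this one word adds)."""
    total = sum(counts.values())
    return [(rank, word, 100.0 * n / total)
            for rank, (word, n) in enumerate(counts.most_common(), start=1)]

def words_needed(counts, target_pct):
    """Smallest number of top-ranked words whose usages reach target_pct."""
    total = sum(counts.values())
    seen = 0
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        seen += n
        if 100.0 * seen / total >= target_pct:
            return rank
    return None  # only reached if there are no counts at all

# Toy counts, made up for illustration (not the real subtitle data).
toy = Counter({"a": 50, "b": 25, "c": 15, "d": 10})
gains = marginal_gain(toy)
print(gains)
print(words_needed(toy, 80))
```

With real counts, the gains shrink very quickly as you move down the list, which is the whole story of diminishing returns in one column.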













It looks pretty good up until around 100 words, at which point you're already drowning in an abyss of seemingly futile vocabulary. Of course, this is somewhat misleading, since multiple forms of the same verb come up separately on this list, and there's also much more to learning a language than just vocabulary. Usage frequency aside, not all words are equally useful. For example, if you're visiting Israel, you'll probably be most interested in knowing how to say 'bathroom' (sheh-roo-teem) and 'food' (okhel), among many others. But even so, if you want to put in minimal effort for maximal linguistic return, you're probably best off with my list.





If you want this list for the top 10,000 words in Hebrew (actually more), you can download it as an Excel file by clicking HERE.





Enjoy!







































