The short answer is "yes, pretty much"! As I showed in an earlier post, the HSK does a pretty good job of covering the majority of common words across all six levels, but it is interesting to see how early on the really high frequency words are covered. The results aren't too surprising: each HSK level gives you a mix of high frequency words and lower frequency (but probably still very useful) words. For example, the least frequently used word in level 1 is 汉语, although it is quite useful to be able to say the name of the language that you are learning! Nouns such as 北京 and 苹果 have relatively low usage frequency because there are so many nouns that each one is used comparatively rarely, but they are included in HSK 1 because an early learner's vocabulary wouldn't be much use if all he or she knew were the most common prepositions and verbs.

The two graphs below show exactly the same data, just presented slightly differently: the second graph stacks the HSK levels on top of each other. They are both histograms, with the 'buckets' on the horizontal axis showing the natural logarithm of the usage frequency of the words at each HSK level. Log frequency is used because word frequency data is heavily right-skewed: a few words are used a lot, and the vast majority are used very rarely. The vertical axis shows how many words at each HSK level fall into each frequency bucket.
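
For anyone who wants to try the same bucketing on their own word list, here is a minimal sketch of how a log-frequency histogram like this could be built. It is not the code behind the graphs: the rows in `SAMPLE`, the bucket width of 1, and the function names are all assumptions made up for illustration, and the usage counts are invented rather than taken from any real corpus.

```python
import math
from collections import Counter, defaultdict

# Hypothetical (word, HSK level, usage count) rows. The counts are invented
# for illustration only and are NOT the data plotted in the graphs.
SAMPLE = [
    ("的", 1, 1_000_000),
    ("是", 1, 400_000),
    ("汉语", 1, 900),
    ("苹果", 1, 1_500),
    ("虽然", 2, 20_000),
    ("眼镜", 3, 700),
]


def log_frequency_histogram(rows, bucket_width=1.0):
    """Count words per HSK level in buckets of ln(usage count)."""
    hist = defaultdict(Counter)
    for _word, level, count in rows:
        # Natural log compresses the long right tail of raw frequencies;
        # the bucket is labelled by its lower edge.
        bucket = math.floor(math.log(count) / bucket_width) * bucket_width
        hist[level][bucket] += 1
    return hist


def stacked_totals(hist):
    """Sum the per-level counts in each bucket (the stacked bar height)."""
    totals = Counter()
    for counts in hist.values():
        totals.update(counts)
    return totals


hist = log_frequency_histogram(SAMPLE)
totals = stacked_totals(hist)

for level in sorted(hist):
    print(f"HSK {level}:", dict(sorted(hist[level].items())))
print("Stacked:", dict(sorted(totals.items())))
```

The per-level counters correspond to the first graph (one series per HSK level), and `stacked_totals` gives the overall bar heights of the second, stacked graph.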

