Language learning progression and vocabulary lists are not the same thing. However, good lists, meaning those that focus on the most frequently occurring lemmas, can be enormously helpful for efficient language acquisition.
How many words are learned at each CEFR level?
CEFR levels are meant to show meaningful degrees of foreign language acquisition: A1, A2, B1, B2, C1, C2. Vocabulary comprehension can be tested at each level, and an average vocabulary size for each level can help learners focus on word acquisition. However, CEFR is based on communication skills, not on vocabulary size, and the two are hardly the same.
There is some research that provides benchmarks, but the studies were carried out in different contexts, so it is difficult to generalize from them. In addition, the distinction between active and passive vocabulary is hard to measure meaningfully.
One publication that did try to map vocabulary size to CEFR levels is Milton & Meara (2003), which provided the following figures:
- A1: 0-2,000
- A2: 2,000-2,750
- B1: 2,750-3,250
- B2: 3,250-3,750
- C1: 3,750-4,500
- C2: 4,500-5,000
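As a rough illustration, the bands above can be turned into a simple lookup. This is a minimal sketch, not part of the original study: the function name `cefr_band` is hypothetical, and because adjacent bands share a boundary value (2,000 appears in both A1 and A2), the code assumes a boundary score belongs to the higher band. It also assumes any estimate at or above 4,500 is reported as C2.

```python
from bisect import bisect_right

# Upper bounds of the A1..C1 bands, taken from the figures above.
# Assumption: a score exactly on a boundary (e.g. 2,000) is assigned
# to the higher band; the published ranges do not say which applies.
BAND_UPPER_BOUNDS = [2000, 2750, 3250, 3750, 4500]
BAND_LABELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def cefr_band(vocab_size: int) -> str:
    """Return the CEFR band for an estimated vocabulary size (hypothetical helper)."""
    if vocab_size < 0:
        raise ValueError("vocabulary size cannot be negative")
    # bisect_right counts how many upper bounds are <= vocab_size,
    # which is the index of the matching band label.
    return BAND_LABELS[bisect_right(BAND_UPPER_BOUNDS, vocab_size)]


if __name__ == "__main__":
    for size in (800, 2000, 3400, 4800):
        print(size, cefr_band(size))
```

A lookup like this only restates the ranges; it says nothing about communicative ability, which is what CEFR actually describes.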
The test used was XLEX, which can only measure up to 5,000 words, so C2 should really be read as 4,500+ or 4,500-5,000+. This scheme has pedagogical value (namely, it justifies learning a set number of words at each level) but is altogether too neat and tidy. The question is: how many words are needed for the practical ability to continue learning a language?
Minimum number of words to enable self-learning progression
There is a minimum number of words that anyone needs in order to progress in a self-initiated way, chiefly through reading but also through other media. CEFR levels are nice and neat, but they really address a testing need rather than marking learning milestones.
It is also the words in a particular context that matter most. For academic learners, low-frequency technical words are as important as, or more important than, high-frequency words that fall outside that context. At the very lowest level, a minimum of 500 headwords might be needed for A1. The next level that can be measured with more confidence is B2, which requires at least passive comprehension of the first 2,000 words in the General Service List. All other things being equal, this means 500 words per level across A1, A2, B1, and B2. This is a minimum, and it applies more to learners focused on academic or technical fields than to those in general fields.
What this is meant to show is that the Milton & Meara figures are a bit inflated. A1 could simply be 500-1,500 words. B2 could simply be 2,000-3,000 words. C2 could indeed be the 5,000 words noted above, which would mean adding another 1,000 words at C1. This also shows that the CEFR levels are not all equal, nor should they necessarily have such discrete and standardized numbers of words partitioned across them.
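The arithmetic behind the 500-words-per-level minimum (2,000 General Service List words reached by B2, spread across four levels) can be sketched as follows. The function name `cumulative_minimums` is hypothetical, and the equal step per level is the "all other things being equal" assumption from the text, not a measured result.

```python
# Levels covered by the 2,000-word minimum described in the text.
LEVELS = ["A1", "A2", "B1", "B2"]
B2_TARGET = 2000  # first 2,000 words of the General Service List


def cumulative_minimums(target: int, levels: list) -> dict:
    """Split a cumulative target evenly across levels (illustrative assumption)."""
    step = target // len(levels)  # 2000 // 4 = 500 words per level
    # Each level's minimum is the running total up to and including that level.
    return {level: step * (i + 1) for i, level in enumerate(levels)}


if __name__ == "__main__":
    for level, words in cumulative_minimums(B2_TARGET, LEVELS).items():
        print(f"{level}: {words} words")
```

Under this even split the cumulative minimums are 500, 1,000, 1,500, and 2,000 words, which is well below the corresponding Milton & Meara bands.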
Native Speaker Vocabulary Size
A vocabulary of 20,000 to 30,000 words is considered standard for a native speaker. If CEFR C2 is around 5,000 words, it is quite a long way from there to 20,000-30,000 words. The average 8-year-old native speaker already knows 10,000 words.
The most important metric, however, is not words but lemmas. For 75% of daily language, only 800 lemmas are needed. For TV shows, 3,000 lemmas are needed. For novels and quality newspapers, 8,000-9,000 lemmas are needed. It is these lemmas that are the target of the General Service List as well as of specialist vocabulary lists.
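To make the idea of coverage concrete, here is a minimal sketch of how the share of a text covered by a set of known lemmas could be computed. It is not the method behind the figures above: the function name `coverage` is hypothetical, the tokenizer is a crude regular expression, and the code assumes the text is already reduced to lemmas (real studies use a proper lemmatizer and much larger corpora).

```python
import re
from typing import Iterable


def coverage(text: str, known_lemmas: Iterable[str]) -> float:
    """Fraction of running tokens in `text` found in `known_lemmas`.

    Assumption: tokens are already lemmas, so no lemmatization is done here.
    """
    known = {lemma.lower() for lemma in known_lemmas}
    # Crude tokenizer: runs of ASCII letters and apostrophes, lowercased.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in known)
    return hits / len(tokens)


if __name__ == "__main__":
    sample = "the cat sat on the mat"
    print(coverage(sample, {"the", "cat", "on"}))  # 4 of 6 tokens are known
```

Note that coverage counts running tokens, so a small set of very frequent lemmas can account for a large share of a text.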
Generalist and Specialist Vocabulary Lists
Beyond the General Service List, there are four other vocabulary lists that are in wide use:
- New General Service List (NGSL): 2,800 core general English words, plus 700 core spoken English words (NGSL-S)
- New Academic Word List (NAWL): 960 academic words
- TOEIC Service List (TSL): 1,200 TOEIC words
- Business Service List (BSL): 1,700 business words
Use of Vocabulary Lists in Teaching and Learning a Foreign Language
Because these lemmas yield so much comprehension for the effort involved, focused vocabulary learning could and should make good use of them. Incidental lessons at the primary and secondary levels should also draw on these lists.