How many new words should you teach per lesson?

Introduction – The wrong question

The question in the title is one of the most common ones I am asked by colleagues from all corners of the globe. And whenever I have googled that question in the past ten years I have invariably found the same answer crop up in EFL and MFL forums, blogs and websites: 8 to 10 words per contact hour. I have always wondered where those numbers came from, as there is no consensus amongst researchers as to what constitutes an ideal number of new words to teach per lesson. Unsurprisingly so. As I will argue below, it is impossible to answer the question with a precise figure unless we define clearly what we mean by ‘teaching’ and ‘learning’ new words and have a 360-degree awareness of the target learning contexts, with their unique interaction of affective and cognitive factors as well as other important individual variables such as the methodology in use, available resources, logistics, timelines, socio-economic factors, etc.

I personally ‘teach’ 20 to 25 words minimum per lesson, but what the word ‘teach’ means to me may not be what other colleagues take it to mean.

The good news 

The good news for MFL teachers in England and Wales is that by the end of a typical GCSE course the estimated vocabulary size of a successful MFL student should be 2,000 words at GCSE Higher and 1,000 at GCSE Lower (Milton, 2006). If we divide the Higher figure by 5 years of learning French (from yr 7 to yr 11) at two hours per week, that equates to roughly 5.2 words per lesson, in truth a very manageable burden. In 2006, however, the national average showed that GCSE students in English state schools had accrued a vocabulary amounting to fewer than 1,000 words each (see picture, below, from Milton, 2006).
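The division above can be reproduced with a few lines of Python. This is only a rough sketch: the number of teaching weeks per year is not stated in the text, so the 38 to 39 weeks used here is an assumption, as is treating each of the two weekly hours as one lesson.

```python
def words_per_lesson(target_words, years, weeks_per_year, lessons_per_week):
    """Average number of new words per lesson needed to reach target_words."""
    total_lessons = years * weeks_per_year * lessons_per_week
    return target_words / total_lessons


# Assumption: 38 to 39 teaching weeks per school year, two one-hour lessons a week.
for weeks in (38, 39):
    higher = words_per_lesson(2000, years=5, weeks_per_year=weeks, lessons_per_week=2)
    lower = words_per_lesson(1000, years=5, weeks_per_year=weeks, lessons_per_week=2)
    print(f"{weeks} weeks/year: Higher {higher:.1f}, Lower {lower:.1f} words per lesson")
```

Under those assumptions the Higher target works out at between 5.1 and 5.3 words per lesson, which brackets the 5.2 quoted above.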


Why the title question is the wrong question to ask yourself

In deciding how many words to teach per lesson one has to take into account a number of contextual factors which play a decisive role in vocabulary acquisition and, more importantly, the depth and range of one’s learning intentions. The question ‘How many words should I teach?’ cannot be answered unless we first consider the following:

(1) Depth of knowledge – Knowing a word entails knowing many things about the word: its literal meaning, its various connotations, its spelling, its derivations, collocations (knowing the words that usually co-occur with the target word), frequency, pronunciation, the syntactic constructions it is used in, the morphological options it offers and a rich variety of semantic associates such as synonyms, antonyms, homonyms (Nagy and Scott, 2000). The deeper one intends to go, the more time one will need to spend on each word, hence the fewer words one can teach. E.g., if I teach a set of French irregular adjectives in terms of how they change from masculine to feminine, rather than just focusing on their main meaning and the pronunciation of the masculine form, I will evidently have less time, which will in turn limit the number of words I can teach.

(2) Receptive vs Productive knowledge – as Nation (1990) notes, vocabulary items in the learners’ receptive vocabulary might not be readily available for productive purposes, since vocabulary reception does not guarantee production. In other words, students may learn to recognize words whilst not being able to use them in speech or in writing. This difference is often overlooked, yet it is crucial in planning a vocabulary lesson. If one is planning to simply teach new words for receptive use, one can teach, in my experience, as many as 40 with an able group, as recognition – especially through the written medium – is easier than production.

Moreover, although they are both receptive modalities, learning vocabulary through listening and learning it through reading obviously require providing students with two different types of extensive training, which means that if you really aim to thoroughly develop the two skill sets – as you should – you will inevitably have less time available.

(3) Speed of recognition and production and degree of contextualisation – When we talk of recognition and production we need to consider (a) the element of speed and (b) the ability to understand the target words in unfamiliar contexts as markers of mastery. How fast a student recognizes a word (in familiar and unfamiliar contexts) as heard or read will tell us to what degree it has been automatized. The same applies to written and oral production (the hardest to automatize).

A vocabulary item can only be said to be fully acquired when it can be produced spontaneously (and correctly) within the context in which it was taught as well as in unfamiliar contexts. With this in mind, to say ‘I taught ten words in yesterday’s lesson’ is flawed. I may have presented those words and got the students to practise them, and maybe they could recall them in isolation at the end of the lesson or even in one or more sentences. However, that does not mean the words have been learnt, because words are never used in isolation, nor simply in two or three sentences learned by rote. Moreover, acquiring a vocabulary item takes weeks and in certain cases even months of practice in context.

(4) Word learnability – the learnability of the target word places further constraints on the number of words one decides to teach. ‘Learnability’ refers to the level of challenge a word poses to the learner. For instance: long polysyllabic words with unfamiliar phonemes will be harder for beginners to retain; abstract and connotative words are usually more difficult to acquire than concrete and denotative lexis; cognates are easier to recognize, etc. When deciding how many words to teach, the learnability factor is crucial.

(5) Shallow vs Deep processing – the method you use will also play an important role in deciding how many words you aim to teach. The deeper the degree of semantic processing, the more likely the students are to recall the words in the future. Deep processing includes activities such as: establishing associations between new and old words; categorizing them; finding opposites and synonyms; writing the definition; inferring their meanings from context; creating mnemonics to enhance future recall; odd one out; etc. Shallow processing involves little cognitive effort (e.g. learning by repeating aloud; the games). Teachers with effective vocabulary teaching methods are usually more successful at teaching larger numbers of words.

(6) Time, recycling opportunities and learning habits – the number of words you can teach will also depend on how many chances you can find in your lesson to recycle them. Do you have enough time, resources or activities in your repertoire for you to recycle each word you set out to teach a minimum of 5 to 8 times (through deep processing tasks) within the lesson? Do you have resources to ensure the recycling of the same items in subsequent lessons?

It takes me a lot of time and effort to create resources that allow me to effectively recycle all the target words I set out to teach in lesson 1, as well as in all the subsequent lessons in which I revisit them. The more words you aim to teach, the more effort you will have to put into follow-up lessons to create recycling opportunities. This is something you have to factor in when you decide on the number of words to teach in a given lesson, or your teaching will have been in vain.
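The recycling target of 5 to 8 encounters per word can be turned into a back-of-the-envelope check on how many new words a lesson plan can carry. The sketch below is purely illustrative: the idea of counting ‘encounters’ per task, the function name and the figures (six tasks, each touching 20 items) are all assumptions, not data from my lessons.

```python
def max_new_words(encounters_available, recycles_per_word):
    """Largest number of new words that can each be met recycles_per_word times."""
    return encounters_available // recycles_per_word


# Hypothetical lesson plan: six deep-processing tasks, each touching 20 items.
encounters = 6 * 20

for recycles in (5, 8):
    print(f"{recycles} encounters per word: up to {max_new_words(encounters, recycles)} new words")
```

With these made-up figures, the same lesson plan supports 24 new words at 5 encounters each but only 15 at 8 encounters each.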

Connected with this is the issue of homework and learning habits and strategies. Are your students the kind of learners who do their homework consistently? If you flip vocabulary learning to them, will they actually do it? What the students do at home and how effective their learning strategies are will have an impact on how many words you plan to teach. In the case of one of the two year 9 groups I currently teach, the amount of work they do outside the classroom – not their aptitude – profoundly affects the number of words I plan to teach each day.

(7) Chunks – The memorization of chunks is productive and powerful. It serves two objectives: it enables the student to have chunks of language available for immediate use and it also provides the student with information that can be broken down and analysed at later stages. Chunks allow you to teach more words in one go, as Working Memory can hold around 7 +/- 2 chunks at a time (Miller, 1956) and each chunk can contain several words. Moreover, in real life we rarely process words in isolation.

The main advantage of lexical chunks is that they build the language learner’s fluency: they facilitate clear, relevant and concise language and are stored as ready-to-use units that can be retrieved and used without the need to compose on-line through word selection and grammatical sequencing. This means that there is less demand on cognitive processing capacity.

I hardly ever teach vocabulary in isolation, unless I am focusing on speed of recognition, decoding/pronunciation or spelling (e.g. through the games). I always present vocabulary for the first time either through texts containing comprehensible input which allows easy inferencing from context or through sentence builders (see figure below). Teaching in chunks and short sentences allows me to recycle old material whilst presenting new material, but also to include more vocabulary.


(8) Chunking and word awareness – Chunks have another important impact on how many words you will be able to teach. Once you have unpacked each chunk you taught, made the students notice the underlying grammatical pattern (e.g. I want you to go to the cinema) and got them to use that pattern over and over again with new lexical items, you will have enhanced the generative learning power of that chunk. The more morphological (e.g. prefixes, suffixes) and syntactic patterns (rather than grammar rules) you teach your students, the greater the chances for them to learn new words by ‘hooking’ them to those patterns. This process, known as ‘chunking’, happens in the brain at incredibly high speed in L1 acquisition and plays a crucial role in L2 vocabulary acquisition; hence, the more automatized your students’ ability to recognize those patterns in aural and written input, the more likely they will be to learn more words in your lessons.

Word awareness refers to a learner’s ability to ‘unpack’ the way words work in relation to other words (synonyms, antonyms, collocations, etc.), in terms of their word class (adjectives, nouns, etc.) and in terms of how they are formed (prefixes, suffixes, etymology, similarities with mother tongue words, etc.). Word awareness promotes chunking, hence acquisition. Creating a culture of word awareness in your classroom does not require much preparation, just asking lots of questions such as: Is it an adjective or a noun? Does this go before or after the verb? Does it remind you of a word in our language? Why does this word end in ‘-ly’?, etc. Research in word-awareness (also referred to as word-consciousness) is still pretty scant, but many scholars believe that a strong emphasis on it in the classroom can greatly impact vocabulary acquisition. The more word-aware your students are, the greater the number of words you will be able to teach them in lessons.

(9) The students – last but not least. This is self-evident. Your students are the best source of evidence that you are gauging the amount of vocabulary input correctly. Regular low-stakes assessment will tell you how much of what you have taught gets retained or lost along the way as the term advances. Online surveys through Google Forms or the like will allow you to find out in a few minutes how they feel about their vocab learning and whether you are being too ambitious or spot on. They can also help you find out about their learning habits.

Not all students have the same ability to learn vocabulary. Students who are low in any of the crucial components of language aptitude, especially Working Memory span and Phonemic sensitivity, will be particularly disadvantaged, and their presence in your class will have to be taken into account as they will be more prone to cognitive overload. Differentiated instruction will be a must in mixed-ability classes.

The students’ current level of proficiency will also be an important variable to consider. The more advanced the learner is, the easier it will be for them to use conscious and subconscious learning strategies to acquire vocabulary. Hence you will be able to teach way more new words per lesson to your advanced-level students than to your GCSE ones.

Motivation is obviously another crucial factor. I am not going to discuss it as it is beyond the scope of this post. It will suffice to say that motivation enhances cognitive and affective arousal, which in turn increases Working Memory span and the chances of memorizing words. Hence, the more fun and relevant to your students’ lives and interests your vocabulary teaching is, the more words you will be able to teach effectively.

Concluding remarks

The issues above refer to but a few of the many factors one needs to consider in deciding how many words to teach per lesson. The most important thing I would like the reader to take home from this post is that, vocabulary acquisition being a long process, planning a successful vocabulary lesson is about zooming out and thinking about the bigger picture and the longer term: what matters is not how many words you teach in a given lesson but how your subsequent teaching is going to ensure that those words will be automatized both receptively and productively by your learners across a wide range of contexts, both familiar and unfamiliar. In order to do so, the language instructor must master effective vocabulary teaching strategies, know the students well and implement skillful and systematic recycling, never losing sight of the challenges that words and the contexts those words are taught in pose to the learner. A culture of word awareness that you build day in, day out through regular questioning, both metalinguistic and metacognitive in nature, will also facilitate your task and allow you to teach an increasingly large number of words per lesson, as your students become more alert to the morpho-syntactic properties of the target language words. Ultimately, it will be student feedback and regular low-stakes assessments that will tell you whether you are teaching the correct number of words per lesson.

Ten things I did in 2016 that have significantly enhanced my teaching

The year just gone was one of the best I have ever had in terms of professional development as a teacher, researcher, writer and CPD provider. In this blog I share ten things that I have tried out in 2016 that, in my view, have significantly enhanced teaching and learning in my lessons.

1. Doubled the exposure to receptive processing and delayed production

One major change to my teaching has involved massively increasing my students’ exposure to comprehensible input before engaging them in production. Thus, on introducing a new phoneme, grammar structure, communicative function and/or vocabulary set, I now ensure my students process the target items receptively through as wide as possible a range of listening and reading tasks which recycle them to death (narrow reading being one of my favourite reading/listening tasks).

In order to enable my students to learn from the aural and written input provided, as illustrated in the texts in figure 1, I make sure it contains lots of patterned repetitions, cognates, familiar language and contextual clues which facilitate inference (so that 95% of it is accessible without resorting to guessing or dictionaries). I also usually provide a gloss in the margin and use typographic devices to draw their attention to items I want them to notice.

Figure 1 – narrow reading texts including comprehensible input with lots of patterned repetitions and cognates


I found that massively increasing receptive processing and delaying production – often to the second lesson on a new topic – has greatly benefitted my students, both in terms of confidence and understanding of the target items, especially when the tasks carried out on the aural and written texts involve lots of pattern recognition (see point 2), recycling and, most importantly, modelling. It is important to reiterate the distinction between reading and listening aimed at modelling and reading and listening aimed at quizzing (i.e. the typical listening/reading comprehension). In my approach, listening and reading comprehension tasks are staged only at the end of the whole process.

2. Grammar and pattern recognition through listening

The extensive research I have carried out this year has made me aware of a gap in traditional explicit grammar instruction methodology: grammar is rarely taught regularly and systematically through listening, and very few – if any – published materials purporting to do that exist. In my instructional model (MARS), exemplified in this post, grammar instruction nearly always begins with modelling of target grammar use through an L.A.M. (Listening-As-Modelling) activity such as sentence builders, sentence puzzles or cognitive comparison tasks.

Figure 2. Sentence puzzles


This approach to grammar instruction addresses two important skillsets involved in listening comprehension, i.e. decoding and parsing skills. As I detailed in one of my most widely read posts, ‘Teaching grammar through listening’, the latter skillset is paramount in the Parsing phase of comprehension, when Working Memory attempts to interpret what it hears using the grammar of the language (by fitting the words identified to the surrounding linguistic context).

3. Inductive Grammar teaching

The adoption of the approach touched upon in the previous paragraph has led me to abandon deductive grammar teaching and the traditional PPP sequence (Presentation, Practice, Production). Unless I am pressed for time, I now involve the students in problem-solving activities which require them to figure out the target grammar rule(s) by themselves based on the Listening-as-Modelling activities staged. Example: I may start with a sentence puzzle modelling how the negatives are used in French; after many examples, I would ask the students, working in groups of two or three, to work out the rule and explain it on a Google document or Padlet wall shared with me and the rest of the class.

After this student-led discovery phase, intensive receptive processing practice will ensue through listening and reading tasks (e.g. narrow reading), grammar quizzes and puzzles, and oral interaction (‘Find your match’ or ‘Find someone who’ with cards). Since in my approach automatizing grammar is the main purpose of grammar instruction, this receptive phase is followed by structured oral practice – see next paragraph.

Figure 3 – Find-someone-who with cards. Grid to fill in (above) and Cards (below)


4. Communicative oral drills and closed questions

In the past year I perfected and intensified the use of communicative drills (CDs) and closed questions in an attempt to routinize the target grammar structures and vocabulary.

4.1. Communicative drills (CDs)

Communicative drills, as the figure below shows, are very short and highly structured tasks which ‘force’ the students to deploy the target items as many times as you see fit, over and over again, in the context of real-life-like situations. Unlike audiolingual drills, CDs typically include lexical items (words and chunks) of high surrender value (e.g. high frequency words or formulaic language) which are very useful in real life communication and are contextualised in the topic-at-hand.

In the teaching sequence I typically use in my lessons, M.A.R.S. (Modelling, Awareness-Raising, Receptive processing, Structured production), CDs occur in the end-phase, after lots of receptive processing has occurred which exposed the students frequently to the same language included in the CDs. I usually stage three or four different types of CDs per session.

Figure 4. Two types of Communicative Drills I use in my lessons. Translation into French (above) and Illustrated cards game in the past tense (below)



4.2 Closed questions

I have also massively increased the number of closed questions I ask my students as, contrary to what some language teaching ‘gurus’ advocate, I believe that closed questions – not open questions – are key to the development of spontaneity. By the way, by ‘closed questions’ I do not simply mean ‘true or false’ or ‘Yes or No’ questions, but also questions such as ‘What is your name?’, ‘What sport do you like?’, ‘What did you do last weekend?’ – as opposed to open questions such as ‘Talk to me about yourself’.

But why more closed questions? Firstly, because when open questions are prioritised students are not pushed to diversify their vocabulary. Secondly, because they do not learn much vocabulary from the questions themselves (and questions are powerful modellers of new language). Thirdly, because ‘spontaneous speakers’ are first and foremost ‘spontaneous comprehenders’. The comprehension dimension of ‘oral spontaneity’ is often neglected, although it is by all accounts as important as the production dimension.

Imagine asking the open question ‘What do you do in your free time?’: the student can get away with the usual ‘I play football and go to the cinema’. However, asking students a wider range of closed questions such as ‘What sport do you do?’, ‘What tv shows do you watch?’, ‘What movies do you watch?’, ‘What social media do you use?’, ‘What do you read?’, ‘What music do you listen to?’ allows you to tap into specific areas of their vocabulary and grammar knowledge, as well as developing their Listenership. Students who are asked a lot of closed questions get constant stimulation, thereby learning more language both receptively – as they decode the questions – and productively, as they retrieve the specific vocabulary needed to answer (I usually ask them to pack at least three details into each answer, however closed the question is).

Both the intensive communicative oral drilling and the use of closed questions have greatly enhanced my grammar teaching whilst allowing me to recycle old and new vocabulary.

5. Hyper-questioning and Listenership

Not only have I increased the number of closed questions I ask in class, but I have actually made a conscious and systematic effort in every single lesson to ask more questions overall: open, closed and yes/no or true/false ones. This has entailed:

(1) More modelling of new language in the presentation stage through questioning techniques such as ‘Either / or’ (e.g. pointing at a picture on screen: ‘Is he doing ‘X’ or ‘Y’?’), Substitution, etc.;

(2) Work on increasing speed of student response to questions;

(3) Learning about question formation through modelling (e.g. question puzzles) and explicit grammar study (e.g. studying word order and hyphenation in French questions);

(4) Setting reading comprehension tasks which involve understanding questions rather than statements;

(5) Widening the questions repertoire I expose the students to;

(6) Designing oral tasks and assessments which lay more emphasis on asking questions.

Why? In order to develop the area of oral spontaneity that is usually less focused on by teachers: ‘spontaneous comprehension’ or ‘Listenership’ as it is known amongst Applied Linguists.

A zero-preparation activity that I have staged regularly in lessons has involved asking my students questions to which they respond in writing on mini-boards under time constraints. Another frequent minimal-preparation starter or plenary has consisted of giving my students a statement and asking them to write on mini whiteboards a possible question that statement could be the answer to.

6. My recycling tool

Last year I also created a simple recycling tool (in the figure below) that has allowed me to recycle the core grammar items more systematically over the whole school year. It consists of a spreadsheet on which I keep track of how often I practise a given grammar structure. In selecting the items on the tracking sheet and assigning them a priority I have used another strategy, discussed in the next paragraph.
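For readers who prefer code to spreadsheets, the same idea can be sketched in a few lines of Python. This is not my actual tracking sheet: the class, the 1-to-3 priority scale, the ordering rule and the example grammar items are all hypothetical, chosen only to illustrate logging each practice and surfacing the items most overdue for recycling.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class TrackedItem:
    name: str
    priority: int  # hypothetical scale: 1 = highest priority
    practised_on: list = field(default_factory=list)

    def log(self, day):
        """Record that this structure was practised on the given day."""
        self.practised_on.append(day)


def next_to_recycle(items):
    """Least-practised items first; ties go to the higher-priority item."""
    return sorted(items, key=lambda item: (len(item.practised_on), item.priority))


# Hypothetical tracking data
items = [
    TrackedItem("perfect tense", priority=1),
    TrackedItem("negatives", priority=2),
    TrackedItem("adjective agreement", priority=3),
]
items[0].log(date(2016, 9, 5))
items[0].log(date(2016, 9, 12))
items[1].log(date(2016, 9, 7))

for item in next_to_recycle(items):
    print(item.name, len(item.practised_on))
```

Sorting by practice count first is just one possible rule; a weighting that lets high-priority items come round more often would work equally well.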


7. Error-informed curriculum design and delivery

Irritated by the disproportionate emphasis laid by English schools on very time-consuming and largely ineffective dialogic corrective practices (e.g. students respond to feedback and teachers respond to the students’ response, etc.), last year I decided to tackle learner errors from three different angles.

Firstly, as discussed in paragraph 1, by delaying production in order to pre-empt errors stemming from unfamiliarity with the target structure (as errors are often made by rushing production); secondly, as discussed in paragraph 4, by engaging students in as many structured tasks as possible before venturing into less structured ones; thirdly, by monitoring my students’ mistakes, keeping a tally of their most frequent ones and using my findings to inform my teaching.

So, whenever I went through their books or recordings or listened in as they interacted with one another during oral activities, I noted down, day in, day out, on a Google Doc their most common and serious mistakes and modified my schemes of work accordingly, making sure that I would tackle those issues in the lessons to come. It was not very time-consuming; it reduced the time I spent writing in their books (as I was going to deal with the mistakes in class anyway) and has made me a better observer of, and listener to, my students’ output.
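A tally of this kind is easy to keep in code as well as in a document. The sketch below is only an illustration, using Python's standard `collections.Counter`; the helper name and the error labels are invented examples, not my students' actual data.

```python
from collections import Counter

error_tally = Counter()


def note_errors(tally, observed):
    """Add the errors noticed in one marking or listening session to the tally."""
    tally.update(observed)


# Hypothetical observations from two sessions
note_errors(error_tally, ["adjective agreement", "missing 'pas'", "adjective agreement"])
note_errors(error_tally, ["auxiliary choice", "adjective agreement", "missing 'pas'"])

# The most frequent errors are the ones to build into upcoming lessons.
top_errors = error_tally.most_common(2)
print(top_errors)
```

The two most frequent labels are then the obvious candidates for the next scheme-of-work tweak.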

8. The 4,3,2 technique

This technique consists of getting a student to answer the same open question (e.g. ‘What did you do last weekend?’) three times. At time one you will ask them to answer the question in two minutes; at time two, in one minute and a half (i.e. 3/4 of the time employed at time 1); at time three, in one minute (i.e. 2/4 of it), hence the 4:3:2 ratio in the name. This technique, which I intend to discuss in a forthcoming post, has been shown to significantly enhance L2 learner oral fluency not only within the topic in which the students use the technique but in terms of overall speaking spontaneity and proficiency.

Although I have been using it with my GCSE students only for three months, it has already paid good dividends. The rationale for its success is that it helps the students – after much practice – to automatize sub-routines thereby speeding up Working Memory processing.
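The timings for the three rounds follow from the 4:3:2 ratio, so they can be worked out for any starting time. The snippet below is a small sketch of that arithmetic; the function name is my own label, and the two-minute starting time is the one described above.

```python
def round_times(first_round_seconds, ratio=(4, 3, 2)):
    """Time limit for each round, scaled from the first round by the 4:3:2 ratio."""
    return [first_round_seconds * part / ratio[0] for part in ratio]


# Two minutes for the first round, as described above
for number, seconds in enumerate(round_times(120), start=1):
    print(f"Round {number}: {seconds:.0f} seconds")
```

Starting from 120 seconds this gives 120, 90 and 60 seconds; starting from four minutes would give the 4, 3 and 2 minutes the technique is named after.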

9. Experiments with pronunciation of problematic French endings

As part of my on-going research on decoding-skill instruction, in the last part of 2016 I endeavoured to enhance my year 8 students’ ability to pronounce French word endings; more specifically, I worked on the pronunciation of silent word endings (e.g. ‘s’, ‘t’, ‘e’) in French using the Micro-Listening Enhancers detailed in this PPT (put together with my colleague Dylan Vinales for a conference we delivered recently). After a baseline assessment in which I identified the problematic endings, I carried out instruction as follows: one 10-minute session per lesson, contextualizing the decoding-skill work within the teaching of the vocabulary-at-hand. After a whole term of such instruction, the students’ ability to pronounce the endings which were problematic at pre-test increased by an average of 70%, with 2 of the 17 students in the class making only a couple of mistakes.

10. ‘Spot the intruder’ and ‘Spot the error’ listening tasks

These tasks have been regular features in my lessons for the last nine months or so. They focus students on listening for detail like nothing else, thereby developing their bottom-up processing skills. They require very little preparation: all one has to do is doctor the lyrics of a target language song by inserting a few extra words here and there (usually small ones) or errors; students then listen to the song, tasked with identifying the items planted in the text. Here is an example by Dylan Vinales which cleverly combines a number of my micro-listening enhancers including ‘Spot the intruder’ and ‘Spot the mistake’. Here is a video of Dylan and Ronan Jezequel rehearsing the song the tasks are based on in one of our classrooms at Garden International School.

To find out about my ideas on reading instruction, get hold of ‘The Language Teacher Toolkit’, the book Steve Smith and I co-authored.