10 commonly made mistakes in vocabulary instruction

Please note: this post was written in collaboration with Steve Smith of http://www.frenchteacher.net and Dylan Vinales of Garden International School.


In this post I will concern myself with ten very common pitfalls of vocabulary instruction and with ways in which they can be easily pre-empted.

Mistake 1 – Shallow encoding practices

As already mentioned in many previous posts of mine, a to-be-learnt word lingers in our Working Memory for no longer than two or three seconds immediately after we hear it. Thus, in order to commit it effectively to Long-term Memory, we must perform some form of rehearsal. Rehearsal involves either ‘shallow’ or ‘deep’ processing.

In shallow processing we use repetition or matching a word to a visual cue. In deep processing, on the other hand, the brain performs problem-solving operations which require more attentional investment and higher order thinking (e.g. analysis and evaluation) and are meaning-orientated. Typical vocabulary teaching activities of this kind include:

  • Matching synonyms
  • Matching antonyms
  • Odd one out
  • Matching word and definition
  • Providing the definition of a word
  • Sorting words into semantic categories
  • Creatively finding association between words seemingly semantically unrelated
  • Working out the meaning of a word using the surrounding linguistic context

Deep processing is more likely to result in deeper learning than shallow processing because (1) it requires more cognitive investment on the part of the learner and, more importantly, (2) it creates more and stronger associations between the to-be-learnt word and existing information and words in Long-Term Memory. The latter point is of paramount importance as failure to retrieve a word (forgetting) is usually cue-dependent, i.e. the brain cannot find the required word not because it has vanished from Long-Term Memory, but because it ‘cannot find its way to it’ in the absence of effective contextual cues (physical or psychological elements that were present at the time of learning the word but are absent at the time of recall).

Example: if you taught your students ten words using some of the very entertaining games on www.linguascope.com (e.g. matching words to pictures; word dictation; spelling games), they will have performed lots of fun activities for 10-15 minutes. True. However, you will have engaged your students in 100% shallow encoding; the number of contextual cues you will have provided them with will have been very limited (as all they did was word-recognition work); and the associations with previously learnt L2 vocabulary will be zero, as Linguascope does not present the words in context.

On the other hand, imagine asking your students to: (1) match the target words with their antonyms and synonyms; (2) sort them into different thematic categories or in terms of size or importance; (3) use them to solve a problem (e.g. working out the meaning of a sentence); (4) fulfil a communicative goal (e.g. booking a holiday or simply interviewing a peer); (5) complete gapped sentences meaningfully; (6) create a poem or song in the target language. Your students will be processing the words in terms of meaning and will build hundreds of associations with other L2 words, with other existing information in their brains (e.g. their knowledge of the world) and with many other contextual cues (e.g. their peers, the website used to book the holiday, the things that inspired the song or poem). Last but not least, they will have put serious thought into these activities, not just mindlessly matched words to images and sounds as happens on most online vocabulary learning websites (e.g. Quizlet, Memrise, etc.).

I created my (free) website www.language-gym.com because I needed to involve my students in less fun but more cognitively challenging deep processing activities. It has paid enormous dividends in terms of vocabulary learning.

It goes without saying that with absolute beginner learners it is not always straightforward to create activities that promote deep processing.

Mistake 2 – Limited contextualized practice

You will surely have noticed, whilst doing a Google search, that as you type a sentence Google offers you a range of predictions as to how that sentence is going to end. You will also have noticed how those predictions are gradually narrowed down as you get closer to the end. In other words, based on its users’ behavior, Google has worked out what you are statistically more likely to type next. According to existing Cognitivist models of language production, this is what our brain does, too. Based on the probability that you will utter words Y and Z after word X, your brain automatizes and speeds up language production. So, if you have said ‘Quel âge as-tu?’ (‘How old are you?’) 100 times and ‘Quel âge a-t-il?’ (‘How old is he?’) only 10, it is highly probable that the sentence stem ‘Quel âge’ will automatically retrieve ‘as-tu’ rather than ‘a-t-il’.
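For readers who like to see the arithmetic, here is a minimal Python sketch of the frequency idea in the ‘Quel âge’ example. It is only a toy illustration, not a model of how the brain or Google actually works: the counts (100 and 10) come from the example above, while the function names and the simple ‘share of past uses’ calculation are my own assumptions.

```python
from collections import Counter

# Hypothetical tally of how often a learner has produced each continuation
# after the stem "Quel âge" (the figures are those of the example above).
continuations = Counter({"as-tu": 100, "a-t-il": 10})


def continuation_probabilities(counts):
    """Return each continuation's share of all past uses of the stem."""
    total = sum(counts.values())
    return {phrase: n / total for phrase, n in counts.items()}


def most_likely(counts):
    """Return the continuation the stem is most likely to retrieve."""
    return counts.most_common(1)[0][0]


probabilities = continuation_probabilities(continuations)
for phrase, share in probabilities.items():
    print(f"Quel âge {phrase}: {share:.0%}")
print("Most likely continuation:", most_likely(continuations))
```

With these counts, ‘as-tu’ accounts for roughly 91% of past uses and ‘a-t-il’ for roughly 9%, which is why the stem tends to retrieve the former. Practising the stem across a wider range of continuations is what evens out those shares.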

The implications of this for language teaching and learning in general are enormous, but beyond the scope of this blog. In terms of vocabulary acquisition, the main implication is that vocabulary items MUST NOT be taught as discrete items or in the very limited range of phrases or contexts in which textbooks usually present them. If we do, we are merely teaching the Audiolingual way – i.e. the relentless memorization of the same words/phrases over and over. That is why it is important to:

(1) teach the target words as contextualized in as wide as possible a range of written or aural comprehensible input which models the target vocabulary (e.g. narrow reading and listening). This can be done even with beginners, provided that the texts used are short and accessible. This is the most important part of teaching vocabulary as it models how words relate to and combine with each other in the target language;

(2) integrate grammar/syntax instruction and sentence combining into the teaching of vocabulary so as to increase the generative power of the target lexis;

(3) teach a variety of verbs + noun collocations (not always the same one or two verbs);

(4) involve the students in a lot of structured and semi-structured communicative practice which requires them to use the target vocabulary in as wide as possible a range of linguistic contexts;

(5) try, as much as possible, when teaching new vocabulary, to provide opportunities for students to use it with previously learnt lexis so as to kill two birds with one stone; on the one hand you will recycle old vocabulary, on the other you will provide a further context to ‘anchor’ the new words to.

In much of the vocabulary teaching I have observed in 25 years, target words (mostly nouns!) are taught for most of the lesson as discrete items and/or within the same basic phrases or patterns. Little modelling in context actually occurs and when it does happen it is limited to one or two texts. This has created a generation of students who know – at best – lots of isolated words but often do not know how to interpret/use them in more challenging receptive/productive contexts.

Mistake 3. The ‘so what?’ effect

A lot of vocabulary learning these days is divorced from a real-life communicative purpose due to a tendency towards the gamification of learning and an overreliance on apps. Humans are goal-orientated beings, hence their motivation and cognition are aroused by problem solving and by the attainment of a goal. The most effective way to learn vocabulary is by activating it in order to carry out real-life tasks in the context of interactional activities. The ‘so what?’ effect, when compounded by discrete and out-of-context word teaching, exacerbates the perception by learners that language lessons are just about memorizing words for memorization’s sake and have little relevance to the real world.

Mistake 4. Misunderstanding of what progression means in terms of vocabulary acquisition

Often teachers from various parts of the world approach me on social media asking me to give them ideas or help them prepare for an imminent lesson observation by a line-manager. Their main worry: showing progression. However, progression in vocabulary acquisition as measured within a specific lesson is a construct of questionable validity.

Firstly, because the same students who show evidence of learning will lose, in the absence of reinforcement, 40% of what they have learnt an hour after the lesson (I will return to this point below), 60% 24 hours later and 80% six days later. Vocabulary acquisition does not occur within one lesson; hence, claiming that by the end of the lesson students have learnt the target vocabulary is a flawed assumption.
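To put those percentages into classroom terms, here is a small Python sketch that converts them into a number of words retained. The forgetting figures are the ones quoted above; the lesson size of 20 words and the function names are assumptions made purely for illustration, and real retention will of course vary from learner to learner.

```python
# Forgetting figures quoted in this post: share of the taught words lost
# after each interval when there is no reinforcement.
FORGETTING = [
    ("1 hour", 0.40),
    ("24 hours", 0.60),
    ("6 days", 0.80),
]


def words_retained(words_taught, share_lost):
    """Approximate number of words still recallable after an interval."""
    return round(words_taught * (1 - share_lost))


def retention_table(words_taught):
    """Return (interval, words retained) pairs for a lesson of a given size."""
    return [
        (interval, words_retained(words_taught, lost))
        for interval, lost in FORGETTING
    ]


# Hypothetical lesson in which 20 new words were taught.
for interval, kept in retention_table(20):
    print(f"After {interval}: about {kept} of 20 words retained")
```

For a 20-word lesson this works out at about 12 words after an hour, 8 after a day and 4 after six days, which is why an end-of-lesson recall check says little about acquisition.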


Secondly, as pointed out above, learning words as discrete entities does not mean acquiring them. Students need to be able to understand or use them in context for the attainment of a communicative goal; otherwise the learning has little value.

Thirdly, progression in vocabulary acquisition refers to being able to understand/produce the target lexis successfully across as wide as possible a range of contexts (familiar and unfamiliar), at high speed (fluency) and with a high degree of accuracy. Hence teachers ought to measure all of these dimensions of vocabulary learning before claiming that the target vocabulary has been acquired.

Yet many lesson observers in a typical British secondary school will require from their observee-teachers that most of the students demonstrate by the end of the lesson the ability to recall accurately orally and/or in writing most of the target words (mostly in isolation); they will see it as the ultimate evidence of learning. And most teachers, too, will agree that this indeed is their main preoccupation.

This leads to a neglect of all the other very important dimensions of progression alluded to above, to the detriment of effective language acquisition.

Mistake 5. Homework timelines

From what I said above about how forgetting occurs, it is evident that setting vocabulary learning homework for Thursday when you have just taught new words on Monday is not very smart if you know that the vast majority of your students will do it on Wednesday night – as it means they will have forgotten 60 % of what they learnt by then.

Solution: if you teach in a high-tech school like mine you can split the vocabulary learning homework in two and ask your students to send it to you in two instalments (e.g. via Google Classroom). So, using the Monday/Thursday scenario above, one part of the homework will be due on Monday evening and the other on Wednesday.

Mistake 6. Not planning which level of acquisition you aim at in a lesson

Much ineffective vocabulary teaching stems from not deciding which level/facet of vocabulary acquisition (of the ones mentioned above) one is focusing on. In planning a lesson it is important to decide whether one wants to focus on receptive skills rather than productive ones or on both. Is it just listening for modelling and/or comprehension you want to focus on in lesson one (today) because you want to focus on speaking and pronunciation in lesson 2 (tomorrow)? Is it only the grammatical usage of the target adjectives you are mainly concerned about? Or are you focusing on enhancing speed of retrieval (fluency)?

I also usually decide which 10-15 of the 20-25 words I typically aim to teach in a given lesson will be in my students’ focal awareness and which 10-15 will be in their peripheral awareness. This is another important decision to take in order to pre-empt student cognitive overload.

Mistake 7 – Using audio-tracks to introduce new words

Using audio-tracks to introduce new words has become common practice in many classrooms these days. This can be justified when the teacher does not have a good target language pronunciation; however, when she does, this ought to be avoided. The teacher must clearly show how each new target word is pronounced and get her students to imitate her mouth movements, especially with sounds that are more notoriously challenging, such as the French ‘in’ and the ‘en’ sounds in the words ‘singe’ and ‘serpent’.

This pronunciation-visibility issue is often compounded by the fact that recordings tend to pronounce the target words at native speed. This can be detrimental when dealing with novice learners whose decoding skills are poor and who would benefit from the pronunciation being slowed down in order to render the sounds more intelligible.

Mistake 8 – Using word-lists and mats with students with poor decoding skills

Often students are provided with word lists and talking/writing mats packed with unfamiliar lexical items. I, for one, love using writing mats and have come up with an instructional sequence based on their deployment that I implement quite frequently in my lessons (outlined in a previous blog). If teachers have trained their students extensively in L2 decoding skills there will be no problem as they will be able to convert most of the words into sound fairly accurately. However, in most secondary schools this is not the case.

This can be very harmful since, as I have explained in several blogs, correct or near-correct pronunciation of L2 words is of crucial importance to successful L2 acquisition and performance (Walter, 2008). The main reason is that memory is sound-mediated, so successful recall of L2 words and their meaning requires their accurate phonological encoding.

In many lessons I have observed, teachers pronounced the words and got students with no training in decoding skills to repeat them aloud a few times before using the words on the lists/mats. However, since words linger in Working Memory for only a few seconds, only a few gifted learners could actually pronounce the words correctly in the subsequent oral tasks. The rest experienced cognitive overload. Hence the teacher ended up having to correct the same students on the same mistakes over and over again for the whole duration of the lesson. At the end of the lesson the pronunciation of the new words was still generally quite poor.

A possible solution: when one is using word-lists and writing mats one may want to model those words extensively through lots of listening and micro-listening tasks. As far as listening is concerned, the easiest zero-preparation way to do this is (a) to utter short accessible sentences and ask the students to write their meaning on mini whiteboards (MWBs) or (b) to set micro-dictation/transcription tasks. Narrow listening tasks require more preparation but yield excellent results. As for the micro-listening tasks, please refer to this post: https://gianfrancoconti.wordpress.com/2015/06/16/seven-micro-listening-enhancers-you-may-not-be-using-often-enough-in-your-lessons/

An even better solution: teach decoding skills from the very early stages of instruction so as to avoid these problems when you later provide your students with longer and more complex vocabulary lists for independent learning. An effective L2 decoder is a more effective autonomous learner on many counts. Unsurprisingly, research has evidenced a correlation between good decoding skills and the pursuit of language study at GCSE (i.e. it is the students with more effective decoding skills who usually choose to continue to study MFL after year 9).

Mistake 9. Presenting new vocabulary in its written form before or concurrently with its phonemic form.

As already discussed in my previous blog ‘Nine research facts about pronunciation’, L2 graphemes (letters) automatically activate L1 pronunciation. Hence exposure to L2 words in their written form ought to be avoided as much as possible with beginner learners. When new lexical items are presented, they should first be presented through visual aids or gestures; their written form should be provided only after much exposure to and practice with their phonemic form.

Mistake 10 – Causing cognitive overload

This issue arises in many scenarios I have witnessed. Here are four common ones.

(1) the teacher is overambitious and aims at teaching too much vocabulary – without deciding on the receptive vs productive / core vs peripheral dichotomies. The result is poor overall recall.

(2) (with novice learners) the teacher selects complex words which pose a series of important challenges in terms of pronunciation and/or grammar (e.g. word order and agreement). Given what we said about the importance of pronunciation and the limitation of Working Memory capacity in terms of phonological storage, teachers must select the target words carefully. When faced with polysyllabic words containing challenging phonemes, one must deploy strategies to make them more accessible to learners both receptively and productively (e.g. ‘chunking’). My colleague Dylan Vinales uses humour, body language and a focus on muscle memory as a way to make the pronunciation of such words ‘stick’. Here is a short clip demonstrating the very simple, minimal-preparation way in which he does it: https://www.youtube.com/watch?v=w4wcsjLtckE&feature=em-upload_owner

(3) the teacher selects a lot of cognates in the belief that they are easy to pick up. However, whilst cognates are easy to learn receptively (especially in reading), they can pose serious cognitive challenges in certain aspects of production, especially pronunciation and writing, causing processing inefficiency issues. I experienced this first hand in the early stages of my Spanish learning; Italian and Spanish being so close, I would often misspell words which differed by only one letter. Obviously, this issue does not have a major negative impact on acquisition. This phenomenon is referred to by psycholinguists as ‘cross association’.

(4) two or more near-homophones are taught in the same lesson. This, too, can cause cross association. This happened to me yesterday in my year 7 French class. A week earlier I had just finished a unit on animals in which I had taught ‘grenouille’ (frog) and, as we were asking each other what we thought about different rooms in the house, my student Abi asked me: ‘Tu aimes le grenouille?’ (do you like the frog?) when he actually meant to ask ‘Tu aimes le grenier?’ (do you like the attic?). He had cross-associated ‘grenier’ and ‘grenouille’ due to the common stem the two words share.

Concluding remarks

Some of the above mistakes are more serious than others and may have a more long-lasting detrimental impact on vocabulary acquisition and language learning in general. Vocabulary learning being one of the most important aspects of language acquisition, teachers need to be mindful of the issues discussed in this post. The most important mistakes, in my view, pertain to four areas. Firstly, the bad habit of not contextualizing the teaching of lexis and wasting too much classroom time on discrete-word teaching (which can be flipped). Secondly, the failure to get students to learn the words by using them orally or in interactional writing for real-life communication. Thirdly, the insufficient amount of listening practice devoted to modelling good pronunciation and, fourthly, the very limited focus devoted to decoding skills, one of the most important sets of lifelong learning skills a linguist may ever wish to have.

You can find more on vocabulary teaching and learning in my book ‘The language teacher toolkit’, co-authored with Steve Smith and available for purchase on http://www.amazon.com


18 thoughts on “10 commonly made mistakes in vocabulary instruction”

  1. Thanks a lot for a very interesting and useful article. Your mention of British schools has made me think of the requirement that all our lessons have a plenary. If you have dedicated an hour to memorising 15 items of vocabulary and your plenary is to have students recalling it in some sort of game it is hardly surprising that most of the students recall most of the vocabulary. The objective has been fulfilled and progression has been shown. (And the teacher knows that the students will not remember much in the next lesson)

    Could you give the reference for the Elapsed Time Since Learning Table? (Apologies if it is in the text and I have missed it) I need to bring that data to my school’s attention when amending our long term planning for the new GCSE. I am finding that my students’ major problem is memorisation and lack of automatisation, so we need to teach less and better in KS3. This table will help me make the case.

    Thanks again for your blog!

  2. Be it in language, Maths or other disciplines, there are always two very important aspects that can aid knowledge retention.

    Firstly, the knowledge must connect to prior knowledge – a process that offers relational understanding, a term introduced by Stieg Mellin-Olsen and made popular by Richard Skemp in the teaching of Maths.

    The process involves connecting new knowledge to prior knowledge, which makes it more accessible to our conscious mind. Cognitively, our brain searches for a piece of information by going through a network of neurons connected to neurons. A new piece of knowledge is more accessible if it is connected to the network of prior knowledge. The benefit of relational understanding is not just that the knowledge is applicable but also that it is more easily accessible.

    I theorize that such a connection is also made more permanent, and therefore better retained, by being fired up every time the brain works on something connected to it.

    Interestingly, before a new piece of information is connected to prior knowledge, some unlearning of prior knowledge must usually be done. E.g. a child learns the word cat as meaning a domestic animal, but later learns that a cat can also refer to a huge tiger. In the child’s head, there is a conflict between the tiger as a cat and the kitten that he thought was a cat. The brain has to cognitively reorganize that understanding so as to embrace the word cat as a collective noun for all cats, big and small. The process of resolving conflicting understandings and reaching a broader and deeper understanding must be part of what contributes to what we regard as hard. It takes time. Many students are left behind when teachers move on from one concept to another while learners are still experiencing mental hurdles in connecting knowledge to knowledge. Unlearning and learning must take place before knowledge is chunked. The chunking process leads us to the second part of knowledge retention.

    Knowledge retention is best facilitated if repetition of work is spaced out over the correct time frame. This allows chunking to take place. Those who do not believe in homework should read your article. Chunking is a process that involves linking various pieces of information together in a cohesive manner, especially in learning maths. Chunking is a time-consuming process. It appears that there must be some repeated intervals of learning and rest/sleep before knowledge is chunked.

    The chunk of information should be revisited periodically. The way school curricula are currently organized is based on a spiral progression approach in which students revisit chunks of knowledge they have learned in the space of one year. That space of time is not very helpful for knowledge retention.

    Periodically, there must be purposefully designed exercises in which students need to access their chunks of knowledge again in order to apply them in a more challenging manner, preferably in a less structured, well-simulated real-life situation.

    A lot of time is wasted if students move on to cover too much material without having properly formed relational understanding or properly chunked it.

    In the case of Maths the best way to teach relational understanding is by inductive and deductive reasoning at higher levels and by way of constructivism for younger learners. Once new knowledge has been properly linked to prior knowledge, students should chunk that knowledge. I have not seen any maths programme that is designed and organized (in time) in ways that align with the neuroscientific understanding of how our brain learns. Some are over-chunked with procedural understanding instead of relational understanding. Some use constructivism without strategic focus.

    Just sharing the parallel in learning maths from my perspective. Only one question to ask Mr. NasiKandar (sorry, I could not resist): is there a link to the ‘Elapse time learning curve’?

    • Thanks for your insightful comments. I agree with most of your points as they totally align with my Cognitivist approach to teaching, although in Language Acquisition relational understanding is only one of the cognitive factors which lead to enhanced language proficiency. Skill acquisition follows different paths which include the automatisation of motor sensorial modalities. Also, you are neglecting affective and social elements pivotal in the acquisition of a language which have less to do with relational understanding and more with a student’s Personal Locus of Causality and motivation in general. Is Nasikandar somehow related to Iskandar?

      • Apologies for the joke there. Nasi Kandar is a popular Northern Malaysian dish. It sort of rhymes with the name Iskandar. A joke that one of your students uses on you. Anyway, I am sure language acquisition is more complex than learning maths. But I would be interested to find out more about the Elapse Time learning curve presented above. Can you please provide a link or reference?

      • Sure. The Elapsed Time Learning Curve was elaborated by Ebbinghaus at the end of the 19th century. Although it has been slightly revised in recent years, it has stayed pretty much the same. All you have to do is google ‘Ebbinghaus curve of forgetting’ and you will find lots of results, usually study-skills websites. I learnt about this at Uni a very long time ago – I don’t have a specific reference. Ciao.

  3. I have spent the last 3 hours, without realizing it, reading the information in your blogs. I am encouraged by your combination of cognitive research and linguistics. I have been teaching L2 (Spanish) in the Southern US for many years and hoping for a way to at least change the way I do things, if I can’t influence others to do so. I have had some successes, and have even argued much of what you have stated here, without realizing this was even a “thing”, since my exposure some years ago to the cognitive/memory aspect of education that has all but disappeared due to our Bloom/Gardner kool-aid, with all due respect. At any rate, my question is, do you have videos or steps to teaching “decoding” in your book? I may have missed mention of videos on your site, as I have just been clicking on everything I could read. Thank you again for your work and insight.

  4. Hi Dr. Conti. Interested in mistake 1: shallow-encoding practices. Do you think these are always a mistake? Repetition makes use of phonological short-term memory (the phonological loop), which is an important component of the working memory model.
    Could you recommend any particular literature on shallow/deep encoding practices? thanks!
    (this is my third read of your article – very useful!)
