
Is Wikipedia Too Difficult to Read?

Image from https://commons.wikimedia.org/wiki/File:Afghan_man_reading_Wikipedia_article_in_Kandahar.jpg

The short answer, according to statistical analysis, is yes.
For more information, read Lucassen, T., Dijkstra, R., & Schraagen, J. M. (2012). Readability of Wikipedia. First Monday at http://firstmonday.org/ojs/index.php/fm/article/view/3916/3297

Wikipedians are aware that the open online encyclopedia may be too difficult, and there is a discussion of its reading level at https://meta.wikimedia.org/wiki/Reading_level. Much of this discussion took place over a decade ago, but the gist is that many contributors write at or for the college level. What appeals to me most is at the end of the page, where Wikipedians are discussing accessibility and what it means to be open to all. Here's my screenshot (in case it gets edited later).


What does this mean for English language teachers?

I was interested in seeing how selected Wikipedia articles range according to the CEFR scales. I looked through 20 articles that English for Academic Purposes (EAP) students may use. My selection takes into consideration the local academic culture, so I included articles specific to my city and American culture. Incidentally, the first three I selected were the best examples in terms of range: Carbondale, IL (mid-range); circle (easier); and Wikipedia:About (more difficult). I ran all 20 articles through the Readability Test Tool at https://www.webpagefx.com/tools/read-able/check.php. Most articles were scored at the 9th grade level with a few at the 10th grade or higher.
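For readers curious about what a grade-level score measures, here is a minimal Python sketch of the Flesch-Kincaid grade-level formula, one widely used readability measure. It is only an illustration: this post does not verify which formulas the Readability Test Tool applies or how it combines them, and the function names (`count_syllables`, `flesch_kincaid_grade`) and the vowel-group syllable heuristic are assumptions made for the example.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a dictionary lookup):
    count groups of consecutive vowels and drop a likely silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59"""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    easy = "The cat sat. The dog ran."
    hard = "Wikipedia contributors frequently compose encyclopedic paragraphs."
    print(round(flesch_kincaid_grade(easy), 2))
    print(round(flesch_kincaid_grade(hard), 2))
```

Longer sentences and longer words both push the score up, which is why a score like this says little about whether a language learner actually knows the vocabulary.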

These grade levels are not appropriate for language learners, so I applied the Text Inspector tool from http://www.englishprofile.org/wordlists/text-inspector to the first three articles to find the CEFR levels. To show how Text Inspector reports the levels (for free), I took a screenshot of its analysis of the first 500 words of the Wikipedia:About article. (For a fee, you can analyze up to 10,000 words.)


Looking at this table, I believe that the text is appropriate for C1 readers, which is usually the highest level in most English language programs. In my previous institution, C2 readers were usually ready for university courses. If C1 readers read this article, they may find about 5% of the words new to them with around 1% beyond their level. If I were designing a vocabulary quiz around this information, I would choose about 10 of the new words with none at the C2 range. This is for the more difficult Wikipedia articles.
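The percentages above are simple arithmetic over per-word CEFR tags, and the calculation can be sketched in Python. Everything here is an assumption for illustration: the function names are hypothetical, the tag counts are made up to mirror the "about 5%" and "around 1%" figures rather than taken from the actual Text Inspector output, and treating C1 words as "new" and C2 words as "beyond their level" is only one reading of the paragraph.

```python
from collections import Counter

CEFR_ORDER = ["A1", "A2", "B1", "B2", "C1", "C2"]


def level_shares(word_levels: list[str]) -> dict[str, float]:
    """Percentage of tagged words at each CEFR level."""
    if not word_levels:
        raise ValueError("need at least one tagged word")
    counts = Counter(word_levels)
    return {level: 100 * counts[level] / len(word_levels) for level in CEFR_ORDER}


def share_above(word_levels: list[str], reader_level: str) -> float:
    """Percentage of tagged words above the reader's CEFR level."""
    if not word_levels:
        raise ValueError("need at least one tagged word")
    cutoff = CEFR_ORDER.index(reader_level)
    above = sum(1 for lvl in word_levels if CEFR_ORDER.index(lvl) > cutoff)
    return 100 * above / len(word_levels)


if __name__ == "__main__":
    # Made-up tags for a 500-word sample (NOT real Text Inspector data).
    tags = (["A1"] * 300 + ["A2"] * 100 + ["B1"] * 50
            + ["B2"] * 20 + ["C1"] * 25 + ["C2"] * 5)
    print(level_shares(tags)["C1"])   # share of words at the reader's own level
    print(share_above(tags, "C1"))    # share of words beyond a C1 reader
```

With counts like these, a 500-word sample would contain 25 words at C1, which leaves plenty to draw a 10-word quiz from without touching the C2 words.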

However, even the easier articles for academic purposes don't appear to go below B2. I haven't tried to find the easiest article that is still appropriate for EAP, but I would make a safe bet that Wikipedia should not be introduced until students are reading at the B1 level. And I wouldn't assign independent reading of Wikipedia articles until they are at the B2 level. However, most students may find it easier just to read the same article in their first language if they trust it contains the same information.

Going back to answer my title question, Wikipedia is too difficult to read for all students reading at the A1 and A2 levels and most students at the B1 level. If you're teaching at the B2 level, I would expect a wide range of Wikipedia reading abilities, but by the C1 level, a teacher should expect students to understand most articles. To be more confident about what they may be able to read, I urge teachers to use the text readability tools I mentioned earlier (the Readability Test Tool and Text Inspector). Keep in mind that these measurements are not completely reliable, but they should give you a clearer idea more quickly than skimming the article yourself.

What about Simple English Wikipedia?

Sadly, most of the articles I selected were not much simpler than the original article. They were simpler in that they were usually shorter articles with slightly fewer complex words, but complex words for the most part could not be avoided. My quick analysis helped me come to the conclusion that at best the Simple English article was one CEFR level below the original, but usually it's at the same level, just slightly easier. The Simple English articles give the illusion that they're easier because many of them are shorter than their counterparts on regular English Wikipedia. I didn't do a deep dive on this, but perhaps someone already did? If not, it would make an interesting paper.

If this is the case, then you could possibly introduce Simple English Wikipedia as early as A2, but I wouldn't recommend it. I would, however, feel a little bit more confident introducing it at the B1 level. Perhaps Simple English Wikipedia is best for B1 readers, where B2 is the level to transition to regular Wikipedia, and then C1 is regular Wikipedia. But now I'm thinking like a publisher. Most experienced teachers know that we shouldn't be this rigid with text selection. Anyway, it's Wikipedia! You don't have to buy it and feel like you're stuck with it.

Comments

The Leveller said…
Thanks, Jeremy, for an interesting article with careful use of our Text Inspector EVP (English Vocabulary Profile) tool. However, that is only a snapshot of a text's level. The full suite of tools could give you a really detailed analysis of whether a reading text is at B2, C1 and so on based on over a dozen measures, which is more reliable than using just the one EVP tool that you used. See www.textinspector.com/help for details.

Thanks, Stephen Bax (Text Inspector)
