Paper 032: Extending vocabulary profiling to languages other than English

Laurence ANTHONY (Waseda University), Natalie FINLAYSON, Emma MARSDEN, Rachel HAWKES, and Nick AVERY (National Centre for Excellence for Language Pedagogy, University of York)

Keywords: NCELP, vocabulary, profiling, non-English, tools

Abstract

Vocabulary profiles of corpora are often created as a step towards creating or modifying pedagogic materials for a target learner audience. Two of the most widely used desktop vocabulary profiling tools are Range and its more modern equivalent, AntWordProfiler. For online vocabulary profiling, the Web VP tool, which is part of Compleat Lexical Tutor, is a popular alternative. All these tools can, in theory, be used to profile texts in any language. However, they rely on levelled vocabulary lists in which each item is grouped according to its “word family”, “flemma”, or lemma category. They also rely on each item in a list being a single string of characters (i.e., a single word). These limitations introduce problems when attempting to profile languages such as English and French (and almost all other languages), which are composed of both single- and multi-word units. They also hugely complicate the process of vocabulary profiling for languages with a high degree of declension, such as German and Spanish. In this presentation, we will first discuss the problems of vocabulary profiling in English and in languages other than English. Next, we will explain how an existing desktop profiling tool was adapted for use at the National Centre for Excellence for Language Pedagogy (NCELP), UK, to assist both researchers creating curriculum specifications for the teaching of Spanish, French, and German vocabulary and teachers hoping to implement these specifications. Finally, we will explain the next stage of the project, which is to develop an open-access, online version of the tool.
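To make the single-word limitation concrete, the following is a minimal sketch of a profiler that accepts multi-word units in its levelled lists by trying the longest matching token sequence first. It is not the implementation of Range, AntWordProfiler, Web VP, or the NCELP tool; the list contents, the names LEVELS, build_lookup, and profile, and the greedy longest-match strategy are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical levelled list (illustration only): level -> headword -> member forms.
# Level 2 contains a multi-word unit, which single-string lists cannot represent.
LEVELS = {
    1: {
        "be": ["be", "is", "are", "was"],
        "in": ["in"],
        "the": ["the"],
        "cat": ["cat", "cats"],
    },
    2: {
        "of course": ["of course"],
        "garden": ["garden", "gardens"],
    },
}


def build_lookup(levels):
    """Map each form, stored as a tuple of tokens, to (level, headword)."""
    lookup = {}
    for level, families in levels.items():
        for headword, forms in families.items():
            for form in forms:
                lookup[tuple(form.lower().split())] = (level, headword)
    return lookup


def profile(tokens, lookup):
    """Count matched units per level, preferring the longest match at each position."""
    max_len = max((len(key) for key in lookup), default=1)
    counts = Counter()
    i = 0
    while i < len(tokens):
        # Try the longest candidate first so "of course" wins over "of".
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(token.lower() for token in tokens[i:i + n])
            if key in lookup:
                level, _headword = lookup[key]
                counts[level] += 1
                i += n
                break
        else:
            # No list entry starts here: count one off-list token and move on.
            counts["off-list"] += 1
            i += 1
    return counts


if __name__ == "__main__":
    lookup = build_lookup(LEVELS)
    text = "the cats are in the garden of course today"
    print(profile(text.split(), lookup))
```

With this toy list, the sample sentence yields five level-1 units, two level-2 units ("garden" and the multi-word unit "of course"), and one off-list token ("today"). A profiler restricted to single-string items would instead count "of" and "course" as two separate off-list tokens.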

Q&A live (Zoom) session

No longer available.

5 Comments

  1. anthony

    If you have any questions about our talk, please write them here or direct message us through the conference Slack chat. (Laurence and Natalie)

  2. iskwshin

    Thank you for introducing your new project, which is really fascinating. One small question: I assume you are thinking of continuing to offer AntWordProfiler as desktop software (as you already do) and the MultiLing version as an online system. If so, what do you see as the merits and demerits of distributing corpus tools as online systems? (Shin Ishikawa, Kobe U)

  3. tono

    A very interesting project! Our university (TUFS) has a project to convert all the CEFR-J resources to 27 other languages. One of these resources is the wordlist. We face various problems with Asian languages, such as orthography and complex derivation/conjugation systems, as well as the question of how to match English or Japanese equivalents to them. For about half of the languages, wordlists covering the A1 to B2 levels have been completed. The resource will be made available toward the end of the Super Global University project. I am interested in your web-based vocabulary profiling system, so please keep me informed.
    (Yukio TONO)

  4. anthony

    @ISKWSHIN Thank you for your question. There are many, many advantages to online tools. The biggest is the ease of deployment (across multiple devices and operating systems) and updates. The main weakness is the fact that the user needs to have an Internet connection to access the tool, but this is becoming less and less of an issue.

  5. anthony

    @Tono Thank you for the comment. We have already started discussing how to extend the MultiLingProfiler to work with more languages. If you are interested, we can certainly consider the possibility of adding your lists to the site.
