Browsing by Subject "CALL"

  • China-Kolehmainen, Elena (2021)
    Computer-Assisted Language Learning (CALL) is one of the sub-disciplines within the area of Second Language Acquisition. Clozes, also called fill-in-the-blank exercises, are widely used in language learning applications. A cloze is an exercise where the learner is asked to provide a fragment that has been removed from the text. For language learning purposes, two types of cloze are common: the open-ended cloze, where one or more words are removed and the student must fill the gap, and the multiple-choice cloze, where a fragment is removed from the text and the student must choose the correct answer from several options. Multiple-choice exercises are a common way of practicing and testing grammatical knowledge. The aim of this work is to identify relevant learning constructs for Italian to be applied to automatic exercise creation based on authentic texts in the Revita framework. Learning constructs are units that represent language knowledge. Revita is a free-to-use online platform that was designed to provide language learning tools with the aim of revitalizing endangered languages, including several Finno-Ugric languages such as North Saami. Later, non-endangered languages were added. Italian is the first majority language to be added in a principled way. This work paves the way towards adding new languages in the future. Its purpose is threefold: it contributes to raising Italian from its beta status towards a full development stage; it formulates best practices for defining support for a new language; and it documents what has been done, how it was done, and what remains to be done. Grammars and linguistic resources were consulted to compile an inventory of learning constructs for Italian. 
Analytic and pronominal verbs, verb government with prepositions, and noun phrase agreement were implemented by designing pattern rules that match sequences of tokens with specific parts of speech, surface forms and morphological tags. The rules were tested on test sentences, which allowed them to be further refined and corrected. The 47 rules for analytic and pronominal verbs currently achieve 100% precision and 96.4% recall on 177 test sentences. The 5 noun phrase agreement rules achieve 96.0% for both precision and recall on 34 test sentences. Analytic and pronominal verb patterns, as well as noun phrase agreement patterns, were used to generate open-ended clozes. Verb government pattern rules were implemented as multiple-choice exercises where one of the four presented options is the correct preposition and the other three are prepositions that do not fit the context. The patterns were designed based on colligations, that is, combinations of tokens (collocations) that are also explained by grammatical constraints. Verb government exercises were generated on a specifically collected corpus of 29,074 words. The corpus included three types of text: biography sections from Wikipedia, Italian news articles and Italian language matriculation exams. The last text type generated the most exercises, at a rate of 19 exercises per 10,000 words, suggesting that the semi-authentic text best matched the level of verb government exercises because of its appropriate vocabulary frequency and sentence structure complexity. Four native-speaker language experts, either teachers of Italian as L2 or linguists, evaluated the usability of the generated multiple-choice clozes, resulting in a usability rate of 93.55%. This result suggests that minor adjustments, i.e., the exclusion of target verbs that cause multiple admissibility, are sufficient to consider verb government patterns usable until the possibility of dealing with multiple admissible answers is addressed. 
The implementation of some of the most important learning constructs for Italian proved feasible with current NLP tools, although a quantitative evaluation of the precision and recall of the designed rules is still needed to assess exercise generation on authentic text. This work paves the way towards a full development stage of Italian in Revita and enables further pilot studies with actual learners, which will make it possible to measure learning outcomes in quantitative terms.
  • Ahvenharju, Panu (2024)
    The topic of the thesis is integrating Natural Language Processing (NLP) and Computer-Assisted Language Learning (CALL) into teacher-led Spanish instruction. The aim is to present a development process and a CALL application that can be used to study learning results. The study seeks answers to questions on how an NLP-based CALL application can be used to investigate learning, and how its usage and usage rate affect learning outcomes. The study also focuses on usability, asking how usable the students find the application and what kind of open feedback they give on it. 108 secondary school students and four teachers from the Helsinki Metropolitan region participated in the study, in which a gamified application created a competitive setting between five teaching groups. The students used the application to solve textbook-based cloze exercises generated using a combination of a neural language model and rule-based exercise creation. The vocabulary tests measured learning by selecting test words according to the usage analytics so that they came from outside the cloze fields of the exercise sentences. The students who used the application were divided into two groups: those (N=26) who encountered the test words in the application and those (N=31) who did not. The results were compared with those of the control group (N=8), who did not use the application. The results show that the group encountering the test words performed 11.39 percentage points better than the control group. Interestingly, the students who did not encounter the words performed 25.21 percentage points better in the tests than the control group. Despite the positive results, statistical analysis revealed a significant relationship only between usage rate and encountering the test words, not between encountering the test words and the vocabulary test results. 
This may be explained by the different sizes of the groups, the random way in which the application selected exercises, and the fact that the students did not encounter the words often enough. The method requires many enhancements before it can be used on a larger scale. The students evaluated the application's usability as good, and they left 18 open feedback responses, which were mostly positive.
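
As an illustration of the verb government exercises described in the first abstract above, the following is a minimal sketch of how a pattern rule over part-of-speech-tagged tokens could produce a multiple-choice cloze with one correct preposition and three distractors. It is not the Revita implementation: the `Token` class, the `GOVERNMENT` lexicon, the `PREPOSITIONS` pool and the `make_cloze` function are all hypothetical names introduced here, and the sketch ignores the multiple-admissibility problem that the abstract mentions.

```python
import random
from dataclasses import dataclass


@dataclass
class Token:
    surface: str  # word form as it appears in the text
    pos: str      # coarse part-of-speech tag, e.g. "VERB", "ADP"
    lemma: str    # dictionary form


# Hypothetical lexicon (assumption): verb lemma -> preposition it governs.
GOVERNMENT = {"dipendere": "da", "rinunciare": "a", "contare": "su"}

# Hypothetical pool of simple prepositions used to draw distractors.
PREPOSITIONS = ["a", "da", "di", "in", "su", "con", "per"]


def make_cloze(tokens, rng):
    """Return (gapped_text, options, answer) for the first VERB + ADP
    sequence that matches the lexicon, or None if nothing matches."""
    for i in range(len(tokens) - 1):
        verb, prep = tokens[i], tokens[i + 1]
        expected = GOVERNMENT.get(verb.lemma)
        if (
            verb.pos == "VERB"
            and prep.pos == "ADP"
            and prep.surface.lower() == expected
        ):
            # Three distractors drawn from the other prepositions.
            pool = [p for p in PREPOSITIONS if p != expected]
            options = rng.sample(pool, 3) + [expected]
            rng.shuffle(options)
            # Replace the governed preposition with a gap.
            words = [t.surface for t in tokens]
            words[i + 1] = "____"
            return " ".join(words), options, expected
    return None


if __name__ == "__main__":
    sentence = [
        Token("Tutto", "PRON", "tutto"),
        Token("dipende", "VERB", "dipendere"),
        Token("da", "ADP", "da"),
        Token("te", "PRON", "tu"),
    ]
    print(make_cloze(sentence, random.Random(0)))
```

In this sketch the distractors are chosen at random, so nothing guarantees that they are wrong in context; a usable system would need to filter out verbs or options that admit more than one correct answer.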