GECTurk WEB: An Explainable Online Platform for Turkish Grammatical Error Detection and Correction

Ali Gebeşçe, Gözde Gül Şahin

TL;DR

GECTurk WEB is a light, open-source, and flexible web-based system that can detect and correct the most common forms of Turkish writing errors, such as the misuse of diacritics, compound and foreign words, pronouns, and light verbs, along with spelling mistakes.

Abstract

Sophisticated grammatical error detection/correction tools are available for a small set of languages such as English and Chinese. However, it is not straightforward -- if not impossible -- to adapt them to morphologically rich languages with complex writing rules like Turkish, which has more than 80 million speakers. Even though several tools exist for Turkish, they primarily focus on spelling errors rather than grammatical errors and lack features such as web interfaces, error explanations, and feedback mechanisms. To fill this gap, we introduce GECTurk WEB, a light, open-source, and flexible web-based system that can detect and correct the most common forms of Turkish writing errors, such as the misuse of diacritics, compound and foreign words, pronouns, and light verbs, along with spelling mistakes. Our system provides native speakers and second language learners with an easily accessible tool to detect and correct such mistakes, and to learn from them by showing the explanation for the violated rule(s). The proposed system achieves a system usability score of 88.3 and is shown to help users learn or remember a grammatical rule (confirmed by 80% of the participants). GECTurk WEB is available both as an offline tool at https://github.com/GGLAB-KU/gecturkweb and online at www.gecturk.net.

Paper Structure

This paper contains 21 sections, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Screenshot of the UI after the user enters an input. 1- Girdi (Input): The input area for the user. 2- Yanlışları Bul (Find Errors): The button pressed after entering an input. 3- Çıktı (Output): The output area for the tagged and corrected text. Note that each error is categorized (colored) according to Table 2. 4- Pop-up: Each corrected word is rendered as a button. When it is clicked, the violated rule, i.e., the error type, is shown. 5- Metni Kopyala (Copy Text): A button for copying the corrected text. 6- Bu Hala Hatalı (Still Erroneous): A button for giving feedback in case the user thinks the output still contains errors. When it is clicked, a pop-up is shown and the user is expected to write the corrected version. 7- Geri Bildirim Vermek İster Misin? (Give Feedback): A button for collecting general suggestions.
  • Figure 2: The GECTurk WEB Architecture. 1) User inputs text containing two errors: a spelling error, "yapmk" (shown in red) and a grammatical error, "istiyormusun" (shown in green). 2-3) The view receives the input from the frontend and forwards it to the GEC/D model. 4) The GEC/D model corrects the grammatical error and adds tags for the frontend to display, as shown in the before-and-after figure. 5) The SEC module corrects the spelling error, tags it, and sends it back to the View. 6-7) The model compiles relevant information such as ID, Input, Output, and Date, and records these in the database. 8) The View sends the prepared output back to the frontend for display.
  • Figure 3: CASE - 1: The sentence contains an error and GECTurk successfully detects the error. Example: "Sonuçları herkes gibi bende merakla bekliyorum." In this sentence, GECTurk correctly changes "bende" to "ben de". Therefore, there is no need to click on the "This is still incorrect!" button, as shown in the video below.
  • Figure 4: CASE - 2: There is no error in the sentence and GECTurk does not change the sentence. Example: "Lyon, bir milyonu aşan nüfusuyla Fransa'nın üçüncü büyük kenti." There are no errors in this sentence and GECTurk does not change the sentence. Therefore, there is no need to click on the "This is still incorrect!" button, as shown in the video below.
  • Figure 5: CASE - 3: There is no error in the sentence but GECTurk changes the sentence. Example: "O kadar merhametlidir ki yakın arkadaşları arasında karıncaincitmez olarak anılır." There is no mistake in this sentence, but GECTurk changes the word "karıncaincitmez" to "karınca incitmez". Therefore, you should click on the "This is still incorrect!" button and type the correct version of the sentence, as shown in the video below.
  • ...and 2 more figures
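
The request flow described in the Figure 2 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the lookup tables stand in for the real GEC/D and SEC models, the corrections "istiyor musun" and "yapmak" are assumed, and all names (`gecd_model`, `sec_module`, `view`, `Record`, `DATABASE`) are hypothetical. Only the order of the steps and the recorded fields (ID, Input, Output, Date) come from the caption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional, Tuple

# A token is (text, tag); tag is None when the token was left unchanged.
Token = Tuple[str, Optional[str]]

# Hypothetical lookup tables standing in for the real models (assumed corrections).
GRAMMAR_FIXES: Dict[str, Token] = {"istiyormusun": ("istiyor musun", "GRAMMAR")}
SPELLING_FIXES: Dict[str, Token] = {"yapmk": ("yapmak", "SPELLING")}


@dataclass
class Record:
    """One database row with the fields named in the caption."""
    id: int
    input: str
    output: str
    date: datetime


DATABASE: List[Record] = []  # in-memory stand-in for the real database


def apply_fixes(tokens: List[Token], fixes: Dict[str, Token]) -> List[Token]:
    """Replace untagged tokens found in `fixes` with their tagged correction."""
    return [fixes[text] if tag is None and text in fixes else (text, tag)
            for text, tag in tokens]


def gecd_model(tokens: List[Token]) -> List[Token]:
    """Step 4: correct and tag grammatical errors."""
    return apply_fixes(tokens, GRAMMAR_FIXES)


def sec_module(tokens: List[Token]) -> List[Token]:
    """Step 5: correct and tag spelling errors."""
    return apply_fixes(tokens, SPELLING_FIXES)


def view(user_text: str) -> List[Token]:
    """Steps 2-8: run both modules, record the exchange, return tagged output."""
    tokens: List[Token] = [(word, None) for word in user_text.split()]
    tokens = sec_module(gecd_model(tokens))
    output = " ".join(text for text, _ in tokens)
    DATABASE.append(Record(id=len(DATABASE) + 1, input=user_text,
                           output=output, date=datetime.now(timezone.utc)))
    return tokens  # the frontend would color each token by its tag


tagged = view("yapmk istiyormusun")
```

Running the grammar step before the spelling step mirrors the order given in the caption; the tags returned by `view` are what a frontend would use to color each correction by error type.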