QueST: Querying Smart Text
Project proposal was approved for funding by SADiLaR (1 year, 2019/2020)
Overview
Introduction and background
The availability of digital textbooks is on the rise for many reasons, including the prohibitively high cost of hard-copy textbooks and new options that digital textbooks offer, such as faster concept search. This opens up ways of engaging with content presented as text that are not possible with hard-copy books, and the interaction may be tailored to the level of the learner or student. Broadly, this falls within the area called 'adaptive e-learning', where we focus on trying to make the textbook 'smart'. The most advanced system to date for engagement with digital text is SRI's one-off, tailor-made smart textbook system "Inquire biology". Its text is annotated with an ontology so that it can assist in concept-based navigation and context-aware question generation, and it has some features for automated marking of the exercises, thereby supporting active learning. It is proprietary to SRI and tailored to one university textbook. Consequently, there is no software system that can take any digital textbook together with a relevant ontology or knowledge graph and facilitate active learning through concept navigation and context-sensitive question generation and marking.
The aim of the project is to devise the methods and tools, both theoretical and validated by software implementation, for an intelligent text system for e-learning. Readers can navigate such a system based on the annotation schema with its concepts and relations, and can interact with it by means of context-aware questions and instant feedback from automatically marked answers. One would be able to feed it any text on some topic X and a concept schema on topic X, have the system annotate the text automatically, and have it generate relevant and understandable questions and answers that provide useful feedback to the user.
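The envisioned pipeline (annotate a text with a concept schema, generate questions from it, mark answers) can be illustrated with a minimal sketch. This is not the project's implementation: the toy schema, the exact-match annotation, the single question template, and the names `Triple`, `SCHEMA`, `annotate`, `generate_questions`, and `mark` are all assumptions made for illustration only.

```python
import re
from typing import Dict, List, NamedTuple, Tuple


class Triple(NamedTuple):
    """One relation in a (hypothetical) concept schema."""
    subject: str
    relation: str
    obj: str


# Hypothetical toy schema on the topic "cell biology"; not taken from the project.
SCHEMA = [
    Triple("mitochondrion", "part of", "cell"),
    Triple("ribosome", "part of", "cell"),
    Triple("cell", "part of", "tissue"),
]


def annotate(text: str, schema: List[Triple]) -> Dict[str, List[int]]:
    """Map each schema concept found in the text to its character offsets.

    Uses naive case-insensitive whole-word matching, so plurals and
    synonyms are missed; a real annotator would need more than this.
    """
    concepts = {t.subject for t in schema} | {t.obj for t in schema}
    annotations: Dict[str, List[int]] = {}
    for concept in sorted(concepts):
        pattern = re.compile(r"\b" + re.escape(concept) + r"\b", re.IGNORECASE)
        offsets = [m.start() for m in pattern.finditer(text)]
        if offsets:
            annotations[concept] = offsets
    return annotations


def generate_questions(
    schema: List[Triple], annotations: Dict[str, List[int]]
) -> List[Tuple[str, str]]:
    """Return (question, expected answer) pairs from a single fixed template.

    Only triples whose subject and object both occur in the text are used,
    so every generated question is answerable from that text.
    """
    questions = []
    for t in schema:
        if t.subject in annotations and t.obj in annotations:
            questions.append((f"What is a {t.subject} {t.relation}?", t.obj))
    return questions


def mark(answer: str, expected: str) -> bool:
    """Mark an answer by case-insensitive exact match with the expected concept."""
    return answer.strip().lower() == expected.lower()


text = "A cell contains organelles. Each mitochondrion produces energy for the cell."
annotations = annotate(text, SCHEMA)
for question, expected in generate_questions(SCHEMA, annotations):
    print(question, "->", expected)
```

Restricting questions to concepts that actually occur in the text is one simple way to keep them context-aware; the template and the exact-match marking are deliberately simplistic placeholders.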
Conducting research and development to create a generic system for smart textbooks contributes to various areas. It will promote nationwide resource and capacity building in human language technologies (HLTs) and Digital Humanities, through supporting the research and its researchers and students, and through the expected concrete outputs, such as software and semantically annotated texts. The system may be deployed for experiments in the digital humanities and educational technologies, for evaluation with specific texts to analyse and query, including text in South African languages, South African English (SAE) in particular. It will advance the state of the art in ontologies and knowledge graphs, notably in ontology-based question generation, and improve the interaction between text and ontologies.
- Maria Keet (PI), University of Cape Town (UCT)
- Toky Raboanary (PhD student 2019-, UCT)
- Zola Mahlaza (PhD student 2018-, UCT)
- Jacques de Lange (Masters in IT student 2021-, UCT)
- Alec Badenhorst (BSc honours student 2020, UCT)
- Umar Khan (BSc honours student 2020, UCT)
- Kyle Robbertze (BSc honours student 2019, UCT)
- Steve Wang (BSc honours student 2019, UCT)
- Jarryd Dunn (BSc honours student 2019, UCT)
- Matthew Poulter (BSc honours student 2019, UCT)
- Raboanary, T., Wang, S., Keet, C.M. Generating Answerable Questions from Ontologies for Educational Exercises. 15th Metadata and Semantics Research Conference (MTSR'21). Garoufallou, E., Ovalle-Perandones, M-A., Vlachidis, A. (Eds.). Springer CCIS vol. 1537, 28-40. 29 Nov - 3 Dec 2021, online.
- Mahlaza, Z., Keet, C.M., Dunn, J., Poulter, M. An evaluation of template and ML-based generation of user-readable text from a knowledge graph. Technical Report CoRR abs/2106.14613. 2021.
- Alec Badenhorst: Part of Speech Tagger Efficacy on South African English (honours [4th year] project report)
- Umar Khan: Building a South African English corpus (honours [4th year] project report)
- Kyle Robbertze: Lodestar: Ontology-Based Annotation of Textbooks (honours [4th year] project report)
- Steve Wang: Ontology Specifications to Generate Questions (honours [4th year] project report)
- Jarryd Dunn: A Comparison of Data-Driven and Template-Based Approaches to Natural Language Generation (honours [4th year] project report)
- Matthew Poulter: Comparing the Utterances Generated by Template-based and Data-driven NLG Systems (honours [4th year] project report)
- Github repo for question generation
- A computational analysis of SA English (honours project page)
- Toward smart textbooks (honours project page)
- Comparing end-to-end models and templates for generating text (honours project page)