Knowledge Component-Constrained Diagnostic Prompting for Automated Knowledge Gap Detection

Thursday, July 23, 2026 at 2:00pm to 3:00pm

Dion 311
Dr. Adnan El-Nasan
aelnasan@umassd.edu

Thesis Advisor: Dr. Adnan El-Nasan, Computer and Information Science

Committee Members:

  • Dr. Jiawei Yuan, Computer and Information Science
  • Dr. Long Jiao, Computer and Information Science

Abstract:

Introductory programming courses struggle to provide scalable feedback on students’ understanding of core programming concepts. Automated grading shows whether code passes its test cases, but not which conceptual gaps caused the errors. Knowledge Tracing models target those underlying concepts; however, they demand extensive historical data, machine-learning expertise, and GPU hardware that many instructors lack.
This thesis introduces Knowledge Component-Constrained Diagnostic Prompting (KCDP), a framework that diagnoses programming gaps through prompt design. KCDP directs a generic commercial Large Language Model to the concepts each problem is designed to test, requires it to reason through the code before naming any gap, and maps each root cause to an instructor's predefined Knowledge Components (KCs). Human experts and the model are restricted to the same KC vocabulary, so their diagnoses are directly comparable and their agreement can be measured. Because the implementation relies on prompting alone, KCDP can potentially be deployed across different commercial LLM platforms using standard access and an instructor’s existing course-defined KC taxonomy. This provides an accessible approach to scalable, concept-level diagnosis in introductory programming education without requiring specialized machine learning infrastructure.
KCDP was evaluated against two human experts using Google's Gemini 2.5 Flash. It reached an F1 of 0.839 against a human agreement ceiling of 0.885 (94.8% of human agreement), with a Cohen's κ of 0.557 against a human-human κ of 0.669. The results held up when KCDP was run on a second, unrelated model (DeepSeek), suggesting that the prompt design, rather than the specific model, is doing the work.
The diagnosed gaps also corresponded to genuine, recurring weaknesses. When KCDP flagged a struggling student as weak in a specific concept, that student failed the next problem testing the same concept 77% of the time, against a 26.4% chance rate, while strong students were rarely flagged. This shows that the output carries real information about the concepts students struggle with, which could serve as a basis for triage and other downstream recovery tools.
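As a rough illustration of how agreement over a shared KC vocabulary can be scored, the sketch below computes F1 and Cohen's κ between two sets of diagnoses. It is not the thesis's evaluation code: the abstract does not say how F1 was averaged or how κ was computed, so the micro-averaged F1 and the per-(submission, KC) binary κ used here are assumptions, as are the function names, the KC labels, and the data layout (a mapping from submission id to a set of flagged KCs).

```python
from typing import Dict, List, Set, Tuple

# Hypothetical layout: submission id -> set of KCs flagged as gaps.
Diagnosis = Dict[str, Set[str]]


def to_binary_pairs(rater_a: Diagnosis, rater_b: Diagnosis,
                    kcs: List[str]) -> List[Tuple[int, int]]:
    """One (a, b) decision per (submission, KC): 1 if flagged, else 0."""
    pairs = []
    for sid in sorted(rater_a):
        flagged_b = rater_b.get(sid, set())
        for kc in kcs:
            pairs.append((int(kc in rater_a[sid]), int(kc in flagged_b)))
    return pairs


def micro_f1(pairs: List[Tuple[int, int]]) -> float:
    """Micro-averaged F1, treating rater A as the reference (assumption)."""
    tp = sum(1 for a, b in pairs if a == 1 and b == 1)
    fp = sum(1 for a, b in pairs if a == 0 and b == 1)
    fn = sum(1 for a, b in pairs if a == 1 and b == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def cohen_kappa(pairs: List[Tuple[int, int]]) -> float:
    """Cohen's kappa for two raters making binary decisions."""
    n = len(pairs)
    observed = sum(1 for a, b in pairs if a == b) / n
    rate_a = sum(a for a, _ in pairs) / n
    rate_b = sum(b for _, b in pairs) / n
    # Agreement expected by chance from each rater's marginal flag rate.
    expected = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    return (observed - expected) / (1 - expected) if expected != 1 else 1.0


def share_of_ceiling(model_score: float, human_score: float) -> float:
    """Model agreement as a percentage of human-human agreement."""
    return 100 * model_score / human_score


if __name__ == "__main__":
    kcs = ["loops", "conditionals", "arrays", "functions"]  # made-up KCs
    expert = {"s1": {"loops"}, "s2": {"arrays", "functions"}, "s3": set()}
    model = {"s1": {"loops"}, "s2": {"arrays"}, "s3": {"conditionals"}}
    pairs = to_binary_pairs(expert, model, kcs)
    print(f"F1 = {micro_f1(pairs):.3f}, kappa = {cohen_kappa(pairs):.3f}")
    print(f"{share_of_ceiling(0.839, 0.885):.1f}% of human agreement")
```

Because both raters draw from the same closed KC list, every (submission, KC) pair is a well-defined yes/no decision, which is what makes a chance-corrected statistic such as κ applicable. The last line reproduces the abstract's ratio of the reported F1 (0.839) to the human ceiling (0.885).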
 
For further information please contact Dr. Adnan El-Nasan at aelnasan@umassd.edu.
