Associate Professor
Education
Postdoc, Massachusetts Institute of Technology, 2005 – 2007
 Educational Data Mining
Postdoc, National Center for Supercomputing Applications, 2003 – 2005
 Automated Learning
PhD, University of Illinois at Urbana-Champaign, 2003
 Educational Computing
MS, Seoul National University, 1996
 Earth Science (focusing on astronomy)
BSc, Seoul National University, 1994
 Earth Science
Research Interests
 Learning analytics
 Educational data mining
 Quantitative data analysis
 Information visualization
Contact Links

Discovery Park Building G157

(940) 369-8810
Notable Journal Publications
 Effect of uninterrupted time-on-task on students' success in Massive Open Online Courses (MOOCs) (2018). Published by Elsevier.
Abstract: This study investigated the relationship between uninterrupted time-on-task and academic success of students enrolled in a Massive Open Online Course (MOOC). The variables representing uninterrupted time-on-task, such as number and duration of uninterrupted consecutive learning activities, were mined from the log files capturing how 4286 students tried to learn Newtonian mechanics concepts in a MOOC. These variables were used as predictors in the logistic regression model estimating the likelihood of students getting a course certificate at the end of the semester. The analysis results indicate that the predictive power of the logistic regression model, which was assessed by Area Under the Precision-Recall Curve (AUPRC), depends on the value of off-task threshold time, and the likelihood of students getting a course certificate increases as students were doing more uninterrupted learning activities over a longer period of time. The findings from this study suggest that a simple count of learning activities, which has been used as a proxy for time-on-task in previous studies, may not accurately describe student learning in the computer-based learning environment because it does not reflect the quality, such as uninterrupted durations, of those learning activities.
 Using Self-Organizing Map and Clustering to Investigate Problem-Solving Patterns in the Massive Open Online Course: An Exploratory Study (2018). Published by the Journal of Educational Computing Research.
Abstract: This study investigated whether clustering can identify different groups of students enrolled in a massive open online course (MOOC). This study applied self-organizing map and hierarchical clustering algorithms to the log files of a physics MOOC capturing how students solved weekly homework and quiz problems to identify clusters of students showing similar problem-solving patterns. The usefulness of the identified clusters was verified by examining various characteristics of students such as the number of problems students attempted to solve, weekly and daily problem completion percentages, and whether they earned a course certificate. The findings of this study suggest that the clustering technique utilizing self-organizing map and hierarchical clustering algorithms in tandem can be a useful exploratory data analysis tool that can help MOOC instructors identify similar students based on a large number of variables that examine their characteristics from multiple perspectives.
 Estimating student ability and problem difficulty using item response theory (IRT) and TrueSkill (2019). Published by EmeraldInsight.
Abstract: Purpose – The purpose of this paper is to investigate an efficient means of estimating the ability of students solving problems in the computer-based learning environment. Design/methodology/approach – Item response theory (IRT) and TrueSkill were applied to simulated and real problem-solving data to estimate the ability of students solving homework problems in the massive open online course (MOOC). Based on the estimated ability, data mining models predicting whether students can correctly solve homework and quiz problems in the MOOC were developed. The predictive power of IRT and TrueSkill-based data mining models were compared in terms of Area Under the receiver operating characteristic Curve. Findings – The correlation between students’ ability estimated from IRT and TrueSkill was strong. In addition, IRT and TrueSkill-based data mining models showed a comparable predictive power when the data included a large number of students. While IRT failed to estimate students’ ability and could not predict their problem-solving performance when the data included a small number of students, TrueSkill did not experience such problems. Originality/value – Estimating students’ ability is critical to determine the most appropriate time for providing instructional scaffolding in the computer-based learning environment. The findings of this study suggest that TrueSkill can be an efficient means for estimating the ability of students solving problems in the computer-based learning environment regardless of the number of students.
 Mathematical learning models that depend on prior knowledge and instructional strategies (2008). Published by Physical Review Physics Education Research.
Abstract: We present mathematical learning models (predictions of student’s knowledge vs amount of instruction) that are based on assumptions motivated by various theories of learning: tabula rasa, constructivist, and tutoring. These models predict the improvement (on the posttest) as a function of the pretest score due to intervening instruction and also depend on the type of instruction. We introduce a connectedness model whose connectedness parameter measures the degree to which the rate of learning is proportional to prior knowledge. Over a wide range of pretest scores on standard tests of introductory physics concepts, it fits high-quality data nearly within error. We suggest that data from MIT have low connectedness (indicating memory-based learning) because the test used the same context and representation as the instruction and that more connected data from the University of Minnesota resulted from instruction in a different representation from the test.
 Measuring student learning with item response theory (2008). Published by Physical Review Physics Education Research.
Abstract: We investigate short-term learning from hints and feedback in a Web-based physics tutoring system. Both the skill of students and the difficulty and discrimination of items were determined by applying item response theory (IRT) to the first answers of students who are working on for-credit homework items in an introductory Newtonian physics course. We show that after tutoring a shifted logistic item response function with lower discrimination fits the students’ second responses to an item previously answered incorrectly. Student skill decreased by 1.0 standard deviation when students used no tutoring between their incorrect first and second attempts, which we attribute to “item-wrong bias.” On average, using hints or feedback increased students’ skill by 0.8 standard deviation. A skill increase of 1.9 standard deviation was observed when hints were requested after viewing, but prior to attempting to answer, a particular item. The skill changes measured in this way will enable the use of IRT to assess students based on their second attempt in a tutoring environment.