Several experiments on the effects of pair programming versus solo programming in the context of education have been reported in the research literature. We present a meta-analysis of these studies, covering 18 manuscripts with 49 separate effect sizes in the domains of programming assignments, exams, retention, and affective measures. In total, our sample comprises N=3308 students who either used pair programming as a treatment or used traditional solo programming in the context of a computing course. Our findings indicate effects favoring pair programming across all four domains. We provide a comprehensive review of our results and discuss their implications.
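The abstract does not state which pooling model was used; as a minimal illustration of how a set of effect sizes like these can be combined, the following sketch computes a DerSimonian-Laird random-effects pooled estimate over hypothetical (not the paper's) per-study values:

```python
import numpy as np

# Hypothetical per-study effect sizes (Cohen's d) and variances; the
# actual 49 effect sizes from the meta-analysis are not reproduced here.
d = np.array([0.42, 0.15, 0.61, 0.30, 0.08])
v = np.array([0.020, 0.035, 0.050, 0.015, 0.040])

# Fixed-effect weights and Cochran's Q heterogeneity statistic
w = 1.0 / v
d_fixed = np.sum(w * d) / np.sum(w)
Q = np.sum(w * (d - d_fixed) ** 2)

# DerSimonian-Laird estimate of between-study variance tau^2
k = len(d)
tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w ** 2) / np.sum(w)))

# Random-effects pooled estimate and its standard error
w_re = 1.0 / (v + tau2)
d_pooled = np.sum(w_re * d) / np.sum(w_re)
se = np.sqrt(1.0 / np.sum(w_re))
print(f"pooled d = {d_pooled:.2f}, "
      f"95% CI = [{d_pooled - 1.96 * se:.2f}, {d_pooled + 1.96 * se:.2f}]")
```

The random-effects model is the usual choice when studies span different courses and populations, as here, because it allows the true effect to vary between studies.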
Systematic endeavors to take computer science and computational thinking (CT) to scale in K-12 classrooms are underway, with curricula that emphasize core disciplinary ideas of CS, creativity in computing, and enactment of authentic CT skills, especially in the context of programming in block-based programming environments. There is therefore a growing need to measure students' learning of CT in the context of programming, and to support all learners through this process of learning computational problem solving. The goal of the research presented in this paper is to explore hypothesis-driven approaches that can be combined with data-driven approaches to better interpret student actions and processes in log data captured from block-based programming environments, with the goal of measuring and assessing students' CT skills. Informed by past literature and our initial experiences examining a dataset from the Fairy Assessment in the Alice programming environment, we adopt a more principled approach to assessment design, using the Evidence-Centered Design framework to design tasks that elicit evidence of specific CT skills. We piloted two tasks in two high school Exploring Computer Science classrooms, and conducted an in-depth analysis of student programs as well as video recordings of a small number of students as they built their programs, in order to derive candidate features and a priori patterns that we can then detect in log data, in addition to interpreting patterns found in log data through bottom-up, data-driven approaches. Based on our empirical work and experiences, we present a preliminary framework that formalizes a process in which a hypothesis-driven approach effectively complements data-driven learning analytics in interpreting students' programming processes and assessing CT in block-based programming environments.
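As an illustration of what detecting an a priori pattern in programming log data can look like, here is a minimal sketch; the event vocabulary and the "incremental tester" pattern are hypothetical stand-ins, not the instrumented Alice or ECS tasks:

```python
from collections import Counter

# Hypothetical log events as (event_type, timestamp) pairs; real
# block-based environment log schemas will differ.
log = [
    ("add_block", 3.1), ("add_block", 4.0), ("run", 5.2),
    ("edit_block", 9.7), ("run", 10.1), ("delete_block", 14.0),
    ("add_block", 15.5), ("run", 16.2),
]

# A priori pattern: an "incremental tester" runs the program after
# only one or two edits; a "batch builder" makes many edits per run.
edits_per_run, edits = [], 0
for event, _ in log:
    if event == "run":
        edits_per_run.append(edits)
        edits = 0
    else:
        edits += 1

print(Counter(edits_per_run))
# Consistently small edit counts per run would be evidence for the
# incremental-testing pattern hypothesized from the video analysis.
```

The same feature (edits between consecutive runs) could then also be clustered bottom-up, letting the data-driven groups be checked against the hypothesized pattern.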
In recent years, learning process data have become increasingly easy to collect through computer-based learning environments. This has led to increased interest in the field of learning analytics, which is concerned with leveraging learning process data in order to better understand, and ultimately to improve, teaching and learning. In computing education, the logical place to collect learning process data is the integrated development environment (IDE), where computing students typically spend large amounts of time working on programming assignments. While the primary purpose of an IDE is to support computer programming, the IDE might also be used as a mechanism for delivering learning interventions designed to enhance students' learning processes and outcomes. The possibility of using the IDE both to collect learning process data and to strategically intervene in the learning process suggests an exciting design space for computing education researchers to explore: that of IDE-based learning analytics. In order to facilitate the systematic exploration of this design space, we present an IDE-based learning analytics process model with four primary activities: (1) Collect data, (2) Analyze data, (3) Design intervention, and (4) Deliver intervention. For each activity, we identify key design dimensions and review relevant computing education literature. To provide guidance on designing effective interventions, we then describe four relevant learning theories and consider their implications for design. Based on our review of research and theory, we present a call to action for future research into IDE-based learning analytics.
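The four activities are described at the level of a design space rather than an implementation; purely as an illustrative sketch (the event names, the struggle heuristic, and the threshold below are hypothetical, not from the paper), the loop might be wired together like this:

```python
import time
from dataclasses import dataclass, field

@dataclass
class CompileEvent:
    timestamp: float
    had_error: bool

@dataclass
class Session:
    events: list = field(default_factory=list)

    # (1) Collect: log each compile attempt from an IDE plugin.
    def record(self, had_error: bool) -> None:
        self.events.append(CompileEvent(time.time(), had_error))

    # (2) Analyze: a toy struggle signal -- the length of the current
    # run of consecutive failed compiles (loosely inspired by
    # error-quotient-style metrics from the literature).
    def consecutive_errors(self) -> int:
        n = 0
        for e in reversed(self.events):
            if not e.had_error:
                break
            n += 1
        return n

# (3) + (4) Design and deliver: trigger an in-IDE hint once the
# signal crosses a (hypothetical) threshold.
def maybe_intervene(session: Session, threshold: int = 3) -> None:
    if session.consecutive_errors() >= threshold:
        print("Hint: re-read the first compiler error before editing further.")

s = Session()
for err in [True, True, True]:
    s.record(err)
    maybe_intervene(s)  # fires on the third consecutive error
```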
Computational Thinking describes key principles from computer science that are broadly generalizable. Robotics programs can be engaging learning environments for acquiring core computational thinking competencies. However, few empirical studies have evaluated the effectiveness of a robotics programming curriculum for developing broader computational thinking practices and skills. This study measures pre- to post-test gains on new computational thinking assessments given to middle-school students who participated in a virtual robotics programming curriculum. Overall, participation in the virtual robotics curriculum was associated with significant gains from pre-test to post-test, with larger gains among students who progressed further through the curriculum. The success of this intervention suggests that participation in a scaffolded programming curriculum, within the concrete context of virtual robotics, supports the development of generalizable computational thinking skills that are associated with improved problem-solving performance on non-robotics computing tasks. However, sustaining an adequate rate of progress through such a curriculum remains a challenge for many teachers.
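The abstract does not specify the statistical tests used; as a minimal sketch of how pre/post gains and their relation to curriculum progress are commonly analyzed (all data below are simulated, not the study's), one might write:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulated pre/post CT assessment scores and curriculum progress
# (fraction of units completed); the study's actual data are not shown.
progress = rng.uniform(0.2, 1.0, 60)
pre = rng.normal(10, 2, 60)
post = pre + 3 * progress + rng.normal(0, 1, 60)  # gains scale with progress

# Paired t-test for the overall pre-to-post gain
t, p = stats.ttest_rel(post, pre)
print(f"paired t = {t:.2f}, p = {p:.4f}")

# Association between curriculum progress and gain size
rho, p_rho = stats.spearmanr(progress, post - pre)
print(f"Spearman rho(progress, gain) = {rho:.2f}, p = {p_rho:.4f}")
```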
Most research in primary and secondary computing education has focused on understanding learners within formal classroom communities, leaving aside the growing number of promising informal online programming communities where young learners contribute, comment, and collaborate on programs. In this paper, we examined trends in computational participation in Scratch, an online community with over 1 million registered youth designers, primarily 11-18 years of age. Drawing on a random sample of 5,000 youth programmers and their activities over three months in early 2012, we examined the quantity of programming concepts used in projects in relation to level of participation, gender, and account age of Scratch programmers. Latent class analyses revealed four distinct groups of programmers, with some effects for gender and length of Scratch membership. We found no significant link between level of online participation, ranging from low to high, and level of programming sophistication, with one exception: a small group of highly engaged users was most likely to use more complex programming concepts. In addition, while the types of programmers remained remarkably stable over the three months of study, membership within these groups shifted in different ways. The most elementary group tended to stay very stable, whereas members of the other groups moved between programming groups from month to month, depending on the programs they uploaded in a given month. In the discussion we address the challenges of analyzing young learners' programming in informal online communities and the opportunities for designing more equitable computational participation.
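The paper does not reproduce its latent class model; as a rough sketch, LCA over binary indicators of programming-concept usage is equivalent to fitting a Bernoulli mixture with EM. Everything below (the indicator matrix, the random data) is illustrative only, though the four-class solution mirrors the study's result:

```python
import numpy as np

def lca_em(X, k, n_iter=200, seed=0):
    """EM for a latent class (Bernoulli mixture) model.

    X: (n, d) binary matrix, e.g. whether each of n programmers used
    each of d programming concepts in a month. k: number of classes.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(k, 1.0 / k)                 # class proportions
    theta = rng.uniform(0.25, 0.75, (k, d))  # P(concept used | class)
    for _ in range(n_iter):
        # E-step: class responsibilities from Bernoulli log-likelihoods
        log_lik = (X @ np.log(theta).T
                   + (1 - X) @ np.log(1 - theta).T
                   + np.log(pi))
        log_lik -= log_lik.max(axis=1, keepdims=True)
        resp = np.exp(log_lik)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update proportions and per-class concept probabilities
        pi = resp.mean(axis=0)
        theta = (resp.T @ X) / resp.sum(axis=0)[:, None]
        theta = np.clip(theta, 1e-6, 1 - 1e-6)
    return pi, theta, resp

rng = np.random.default_rng(0)
X = (rng.random((500, 8)) < 0.4).astype(float)  # toy concept-usage matrix
pi, theta, resp = lca_em(X, k=4)  # four classes, as in the study
print(pi.round(2))  # estimated class proportions
```

Assigning each programmer to their highest-responsibility class each month also makes the reported month-to-month group shifts directly observable.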
Automatic identification of students who are struggling to learn programming is beneficial for both the instructor and the institution. Instructors can investigate the working processes of those students to identify misunderstandings and to create targeted assignments, while institutions may channel other types of resources, such as student counseling, to those in need of help. Whilst traditional methods for identifying such students rely mostly on simple regression models between two variables, such as students' high school mathematics grade and programming course outcomes, it is sometimes desirable to investigate dependencies between multiple variables. In this manuscript, we review the application of statistical evaluators to bivariate contingency tables relating students' performance in completing exercises of an introductory programming course to their course outcomes. We demonstrate a way of finding the ideal number of attempts needed to answer an exercise. We report our findings on the associations between students' performance during the semester and their final exam results, and describe how the association varies from one exercise to another.
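As a concrete sketch of this kind of contingency-table evaluation (the counts and the 2x2 framing are invented for illustration), one might test the association between clearing an exercise within k attempts and passing the final exam:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table: rows = solved exercise within k attempts
# (yes/no), columns = passed the final exam (yes/no).
table = np.array([[85, 15],
                  [30, 40]])

chi2, p, dof, expected = chi2_contingency(table)

# Cramer's V as an effect-size measure for the association strength
n = table.sum()
v = np.sqrt(chi2 / (n * (min(table.shape) - 1)))
print(f"chi2 = {chi2:.2f}, p = {p:.4f}, Cramer's V = {v:.2f}")

# Re-building the table for different attempt cutoffs k and comparing
# association strength is one way to locate an "ideal" attempt threshold.
```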
Analyzing the process data of students as they complete programming assignments has the potential to provide computing educators with insights into both their students and the processes by which they learn to program. In prior research, we explored the relationship between (a) students' programming behaviors and course outcomes, and (b) students' participation within an online social learning environment and course outcomes. In both studies, we developed statistical measures, derived from our data, that significantly correlate with students' course grades. Encouraged both by social theories of learning and by a desire to improve the accuracy of our statistical models, we explore here the impact of incorporating our predictive measure derived from social behavior into three separate predictive measures derived from programming behaviors. We find that, by combining the measures, we are able to improve the overall predictive power of each measure. This finding affirms the importance of social interaction in the learning process, and provides evidence that predictive models drawing on multiple sources of learning process data can achieve significantly better predictive power by accounting for multiple factors responsible for student success.
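The paper's specific measures are not reproduced here; as an illustrative sketch of the central idea (comparing single-source predictors with a combined model, with all data simulated), one could compare R^2 values like so:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(7)
n = 200

# Simulated per-student measures; the paper's actual programming-behavior
# and social-participation metrics are not shown.
programming = rng.normal(size=n)
social = rng.normal(size=n)
grade = 0.5 * programming + 0.3 * social + rng.normal(scale=0.8, size=n)

for name, X in [
    ("programming only", programming[:, None]),
    ("social only", social[:, None]),
    ("combined", np.column_stack([programming, social])),
]:
    r2 = LinearRegression().fit(X, grade).score(X, grade)
    print(f"{name:17s} R^2 = {r2:.3f}")

# The combined model's higher R^2 mirrors the finding that merging
# social and programming measures improves predictive power.
```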