Investigating Aptitude in Learning Programming Lan
1. INTRODUCTION
In today’s rapidly evolving technological landscape, computing education continues to grapple with
persistent challenges in teaching programming languages effectively. Despite significant advances in research
and curriculum development [1], students often struggle to acquire the intended programming skills. While
prerequisite courses aim to prepare students for advanced programming studies, a significant gap often
remains between the intended and actual skill acquisition [2], [3], [4].
Recent studies have explored various facets of programming language learning, including the impact
of AI tools on novice learners [5], student perspectives on AI assistants [6], shared neural resources in
programming [7], and the effect of professional development on coding education [8]. Additionally,
researchers have investigated instructional modalities, conceptual transfer between languages, and
correlations with Computational Thinking skills [9].
Despite these advancements, the fundamental question persists: What aptitudes are essential for
successfully learning programming languages? Traditional approaches, such as the IBM Programmer
Aptitude Test and standardized examinations like the SAT, have shown limited efficacy in predicting
programming success [10]. Moreover, while programming logic is rooted in mathematical skills, the role of
linguistic abilities in mastering programming syntax remains underexplored.
This study aims to address this gap by developing a comprehensive methodology to assess students'
programming performance and identify the specific aptitudes influencing their learning outcomes. We
hypothesize that a combination of cognitive, linguistic (Natural Language), and mathematical skills
significantly impacts a student's ability to learn programming languages effectively.
To test this hypothesis, we will employ a mixed-methods approach, utilizing Machine Learning
(ML) and Natural Language Processing (NLP) techniques to analyze student performance data and aptitude
indicators. Our research question is: What combination of aptitudes best predicts success in learning
programming languages?
The significance of this study lies in its potential to inform more effective curriculum design and
student support strategies in computer science education. By identifying key aptitudes, educators can develop
targeted interventions to enhance students' programming skills, potentially improving retention rates and
overall success in computing programs.
2. LITERATURE REVIEW
In the paper "Studying the Effect of AI Code Generators on Supporting Novice Learners in
Introductory Programming," a controlled experiment with 69 novice learners was conducted to investigate
the impact of AI code generators, specifically OpenAI Codex, on introductory programming [5]. The findings
revealed that learners with Codex access demonstrated significantly improved code-authoring performance,
with increased completion rates and higher scores. While learners with access to Codex showed better
performance in post-tests, the difference was not statistically significant. The study also highlighted the
influence of prior programming competency, with learners having higher pre-test scores benefiting more
from AI code generators, such as Codex, during their training. Additionally, qualitative feedback suggested
that Codex usage reduced stress and enhanced learners' enthusiasm for programming, although concerns
about over-reliance were raised.
The study by [6] investigates students' perspectives on using ChatGPT in programming education.
The study involved 41 undergraduate students over eight weeks, where students used ChatGPT for weekly
programming assignments. The findings indicate advantages such as quick and correct answers, improved
thinking skills, and debugging facilitation. However, limitations include potential laziness, incomplete or
incorrect answers, and concerns about professional impact. The research suggests integrating generative AI
tools into programming courses with caution, considering both advantages and limitations.
The study in [7] demonstrated the shared neural resources between computer code comprehension and formal
logical inference, shedding light on the cognitive processes underlying programming language learning.
However, a potential limitation lies in the study's focus on expert programmers, possibly neglecting the
nuances of aptitude development in novice learners. Further research may be needed to explore how these
neural patterns evolve with varying levels of programming experience, contributing valuable insights to
investigations into the aptitude required for learning programming languages.
The study in [8] assessed the impact of continuous professional development on elementary teachers'
self-efficacy in teaching coding and computational thinking. The research emphasizes the importance of
hands-on methods and student success in enhancing teachers' confidence. The findings highlight the
effectiveness of year-long professional development in improving teacher self-efficacy, with a focus on
various coding concepts. The study provides insights into the professional growth of elementary teachers in
coding and computational thinking.
The literature reveals a noteworthy association between programming experience and enhanced
neural efficiency in figural reasoning tasks, as evidenced by the study conducted by [9]. This aligns with the
broader understanding of programming skills influencing cognitive processes and figural reasoning abilities.
However, a potential gap in the existing research is the lack of direct assessment of cognitive aptitude, such
as critical thinking, problem-solving, and logical reasoning, which are crucial components in the context of
learning programming languages. Addressing this gap is essential for a comprehensive understanding of the
cognitive aptitude required for effective programming language acquisition.
The study in [11] investigated predictors of success in an introductory programming course,
revealing first-semester GPA and language admission test scores as significant factors. The study recognized
the complexity of predicting success, considering varied factors across engineering specializations. However,
it did not directly address the nuanced aptitude for learning programming languages, leaving a gap
that research on programming language aptitude using machine learning and natural
language processing can address.
A study [12] delves into the impact of technical reading training and spatial skills training on novice
programming ability, revealing that a CS-focused technical reading intervention yields larger programming
gains than a standardized spatial intervention. The research underscores the distinctiveness of cognitive skills,
emphasizing the relevance of technical reading and spatial abilities to programming success. However, a
notable limitation is the potential lack of generalizability due to a small, homogeneous sample with high
spatial abilities. In addition, the study was conducted during the COVID-19 pandemic, which introduces confounding factors and urges
caution when applying findings to diverse learner populations or alternative contexts.
The study about neuroimaging [13] delves into the neural underpinnings of reading, visualization,
and coding in novice programmers, using fNIRS to identify distinct brain activation patterns in occipital,
parietal, and frontal cortices. While shedding light on the cognitive processes involved in coding, the study
faces limitations, including a restricted participant pool from a single university, potentially limiting
generalizability. The reliance on fNIRS introduces the risk of false negatives, and the study's construct
validity is challenged by the multifaceted nature of spatial abilities. Acknowledging these limitations, the
research underscores the need for further exploration into alternative experimental paradigms and a broader
focus on specific programming activities to strengthen the study's findings.
Int. J. of DI & IC, Vol. 3, No. 4, December 2024: 40-61 41
ISSN: 2583-6250 Prisma Publications
for predicting real-world programming expertise requires additional validation. Lastly, the study's reliance on
self-paced online learning may limit its generalizability to more structured educational settings.
The study in [21] employed functional magnetic resonance imaging (fMRI) to
investigate neural representations of computer programs, revealing insights into how specific brain regions
encode static and dynamic code properties. The study's mapping of brain representations to machine learning
models enhances our understanding of the intricate connection between the human brain and code
representations, showcasing the potential application of neuroscience in deciphering programming cognition.
However, limitations include a focus on a dataset with simple Python programs, potentially restricting
generalizability, an emphasis on comprehension tasks, and the need for further exploration of brain-code
mappings across diverse programming languages. Sample size concerns and the evolving nature of
neuroimaging technology also prompt considerations for refining future research.
The authors of [22] underscored the importance of Computational Thinking (CT) education in fostering essential
skills such as problem-solving and logical reasoning. They discuss the current emphasis on CT in education,
citing initiatives like "CT for All" and various standards. The authors present four studies highlighting the
effectiveness of metaphors in teaching CT concepts, particularly in programming education. Identified
challenges encompass the need for clear CT competencies, efficient use of metaphors, exploration of
pedagogical strategies and technologies, teacher professional development, and the assessment of CT
competencies. However, limitations include a primary focus on primary education, potential oversights in
discussing challenges, and a perceived emphasis on the need for further research without comprehensive
solutions.
The study by [23] introduces the Programming-oriented Computational Thinking Skills (P-CTS)
scale, assessing conceptual knowledge, algorithmic thinking, and evaluation in the context of programming
education. The findings underscore the importance of programming experience, emphasizing the critical
threshold of over one year for skill enhancement. Notably, the research highlights gender differences in
evaluation skills, contributing valuable insights to inclusive programming education. However, limitations
include a focus on university students, raising questions about generalizability to other educational levels,
and the need for further exploration regarding the scale's applicability in interdisciplinary contexts or STEM
education. Additionally, the study suggests avenues for future research to delve into specific factors
influencing gender differences in evaluation skills, enhancing the scale's robustness and applicability.
The study by [24] delves into the cognitive benefits of learning to code, asserting that coding skills
share thinking processes with mathematical modelling and creative problem-solving. The research presents
empirical evidence supporting the potential transfer effects of coding skills to other cognitive domains,
underscoring the necessity for explicit training to facilitate such effects. Despite the promising evidence, the
study acknowledges limitations, including the absence of a definitive causal relationship and potential
confounding factors. The meta-analytical approach's limitations are recognized, emphasizing the need for
more precise assessment tools and further research to unravel the intricate mechanisms and conditions
influencing the transferability of coding skills.
Researchers [25] investigated the neurocognitive processes in code comprehension, using
electrophysiological measures to reveal N400 effects for semantic congruencies and P600 effects for
syntactic anomalies in Python code. The study suggests that expert programmers prioritize structural aspects
over semantics, emphasizing the incremental nature of code comprehension. However, limitations include the
cross-sectional design hindering causal inferences, a focus on Python programmers limiting generalizability,
and the exclusive emphasis on certain code violations overlooking broader aspects of comprehension.
Addressing these limitations in future research is crucial for a more comprehensive understanding of the
cognitive processes involved in programming language learning.
Study [1] tackled challenges in teaching introductory programming through the HTP programming
application, designed for early problem identification and intervention. Utilizing action research, the study
focuses on students at the Polytechnic of Guarda, yielding positive outcomes in monitoring and a predictive
neural network model for early failure detection. However, limitations include context-specific effectiveness,
potential bias in student engagement, and uncertainties introduced by external factors like the COVID-19
pandemic, emphasizing the need for cautious interpretation and further validation.
Computer programming now plays a significant role in solving real-life problems in every
domain, and the field is usually classified among the science, technology, engineering, and mathematics
(STEM) areas. Programming languages also resemble natural languages. Empirically, little research has
examined the cognitive aspects of programming, even though such work can improve our understanding of
the human mind and has the power to change educational practices [26]. The framework proposed in [26]
explains that both computer programming and natural language are built from a set of building blocks:
words and phrases in natural language, and variables in computer languages. It also describes the criteria
by which these building blocks combine to create new meanings.
A recent study [27] investigated the relationship between a second natural language (L2) and a
programming language. Its purpose was to explore how first-year students (College of Engineering
and Applied Sciences at Cincinnati University) grasp computing concepts within a first-year engineering
curriculum. Students' course performance was evaluated in light of their previous and ongoing experience
with second language acquisition (SLA) and their previous knowledge of programming languages. The
criterion for second natural language experience was having taken at least one foreign language course (e.g.,
Spanish, German, French, or Italian). For computing language experience, a single course must have been studied
previously at the university level or in high school. The authors used non-parametric
approaches to test for significant differences between those who had experienced a second natural language
and those who had not. Their statistical findings show no conclusive relationship
between the two, and the authors recommend further study.
Additionally, a study was conducted to measure the programming skills of high school (K-12)
students [28]. This research was carried out on a group of students at a large university in the Pacific
Northwest of the United States. The work aims to measure the effects of previous programming experience in
introductory programming courses. Secondly, they evaluated the previous programming knowledge with the
introductory programming course midterm and final term assessments, as well as the survey and aptitude test.
Thus, their findings explain the mismatch or differences between the results of surveys and aptitude tests.
The results of the survey reveal that previous knowledge, particularly that obtained in high school, has only a
minor impact on midterm performance, while the aptitude performance of the students has a crucial impact
on both the midterm and final term in the introductory programming course.
Another study examined problem-solving as a predictor of programming
performance [29]. Its purpose was to determine the correlation between students' problem-solving ability
and their academic performance in first-year programming courses. The study used five variables:
a student's achievement in a programming course was the dependent variable, and four aptitude tests
served as independent variables (logical reasoning, non-verbal reasoning, numerical reasoning, and verbal
reasoning). The participating group consisted of 379 students. The findings indicate a correlation between
students' logical reasoning, numerical reasoning, and verbal reasoning and their performance in computer
programming modules, but no correlation between students' non-verbal reasoning and
performance in computer programming modules.
Many instructors and educators report that students face difficulty in
first-year university computing science courses, and various initiatives have been assessed to support
students' academic success. University instructors provide several forms of academic support, such as
discussing learning strategies with students. One such initiative, the
"Academic Enhancement Program" (AEP), was established and is run by the School of Computing
Science and the Learning Commons at Simon Fraser University. It has provided learning
strategies for first-year CS university courses since late 2006. In parallel with these learning strategies, the
authors of [30] proposed another strategy to enhance the learning experience of students, which
focused on peer instruction and active learning with audience response systems known as "i-clickers".
They defined three variables (MID, WIC, and FIN) for use in
ordinary multiple regression analysis to assess the potential effect of these activities on course
success. The study is limited to an introductory CS course offered in the Fall 2013
semester with 363 students. The findings indicate that the weighted i-clicker score is not a very
suitable predictor of the final exam score.
A recent study [31] linked natural language aptitude to individual differences in
learning programming languages. The authors argue that natural language aptitude is itself a strong predictor
of learning programming languages, which suggests that learning a new programming language
may parallel learning a new natural language. They used behavioural and neural (resting-state EEG)
indicators to measure language aptitude, while numeracy and fluid cognitive measures, i.e., fluid reasoning, working
memory, and inhibitory control, were defined as the predictors. On the other hand, other researchers have suggested
that programming language learning can be predicted from mathematical aptitude [32]. Chomsky's
theory of formal languages, which describes accurate sentence building in natural languages, has also been
described as a foundational tool in mathematical theory and programming languages [33].
A limitation of the existing approaches, as identified in this literature review, is that
no single educational case study has investigated the aptitude predictors (cognitive, natural language,
and mathematics) collectively. The novelty of the present work lies in
filling this gap with a comprehensive study that covers all of these aspects of aptitude for
learning programming languages. Firstly, a large dataset of undergraduate academic records and course
objectives from Pakistan's universities is employed in this study. Secondly, the study evaluates the
developed dataset using several ML algorithms and a state-of-the-art pre-trained text-based NLP
algorithm. Thirdly, it evaluates all the models based on error metrics. Lastly, it compares all the employed
models to select the best-performing one.
3. METHODOLOGY
The complexity of understanding programming language aptitude necessitates a rigorous and
systematic research approach.
Using a carefully structured research design, our study aimed to capture the multifaceted nature of programming
language learning through two primary data sources, as detailed in Table 1.
The methodological architecture, visually represented in Figure 1, provides a holistic framework for
our investigative process. This approach allows for a nuanced examination of the various skills and factors
that potentially contribute to successful programming language acquisition.
Each course has tailored objectives, which may overlap with other institutions but are specific to this
university's curriculum.
3.3. Implementation
3.3.1. Dataset Overview
The implementation phase utilized a student dataset comprising 18 features, including identification,
prior education marks, and various course marks. This dataset focused on students who completed the
programming fundamentals course alongside elective and core subjects. Due to variations in course timing,
some missing values were present in the dataset. However, comprehensive academic records (SSC and
HSSC) were available for all students, minimizing the impact of missing data.
metrics depending on the data type. Figure 2 illustrates the workflow for filling in missing values in a
student's record.
3.4. Modeling Using Machine Learning (ML) and Natural Language Processing (NLP)
In this section, we outline a comprehensive framework for predicting programming fundamental
marks through machine learning (ML) and natural language processing (NLP) techniques. This approach
leverages state-of-the-art regression models and zero-shot text classification methods to map aptitude skills
and predict student performance.
We divided the data into training (80%) and testing (20%) sets using scikit-learn's train_test_split
function, with random_state=45 for consistency. Four state-of-the-art ML models were employed to predict
programming fundamentals marks.
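The split described above can be sketched as follows. This is a minimal illustration, not the authors' script: the feature matrix and target are synthetic stand-ins (the student records are not available here), and the array shapes are assumptions. Only the 80/20 ratio, the use of train_test_split, and random_state=45 come from the text.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the student dataset (assumption, not the real data):
# 100 students, 5 hypothetical predictor columns, one target column.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(100, 5))  # e.g. normalized prior marks
y = rng.uniform(0.0, 1.0, size=100)       # e.g. normalized programming fundamentals marks

# 80% training / 20% testing, with the seed reported in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=45
)

print(X_train.shape, X_test.shape)
```

Fixing random_state means every model is trained and evaluated on the same rows, which keeps the later model comparison like-for-like.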
Selecting an appropriate K value is crucial to balance between overfitting (small K) and underfitting
(large K). While rare exceptions exist, higher K values are generally recommended to avoid overfitting.
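One common way to choose K is cross-validated grid search. The sketch below assumes scikit-learn's KNeighborsRegressor and GridSearchCV, with synthetic data and a hypothetical candidate grid. The text does not state how K was selected in the study, so this shows one possible procedure rather than the authors' own.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

# Synthetic data (assumption): a noisy linear relation between three
# hypothetical predictors and the target mark.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 3))
y = X @ np.array([0.5, 0.3, 0.2]) + rng.normal(0.0, 0.05, size=200)

# Candidate K values are illustrative only.
candidate_k = [1, 3, 5, 7, 9, 15, 25]

# 5-fold cross-validation scores each K on held-out folds, so a K that
# overfits (too small) or underfits (too large) is penalized.
search = GridSearchCV(
    KNeighborsRegressor(),
    param_grid={"n_neighbors": candidate_k},
    scoring="r2",
    cv=5,
)
search.fit(X, y)

best_k = search.best_params_["n_neighbors"]
print("best K:", best_k, "mean CV R^2:", round(search.best_score_, 3))
```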
The RFR model is constructed by drawing input variables X from the training set and growing a
specified number $N$ of regression trees $\{T_n(i)\}_{n=1}^{N}$, then averaging the predictions
from all decision trees. The regression predictor for an input $i$ after $N$ trees is given in Equation 2.

$$f_{rf}^{N}(i) = \frac{1}{N} \sum_{n=1}^{N} T_n(i) \qquad (2)$$
RFR promotes tree diversity through bagging, which involves randomly sampling training data with
replacement. When constructing trees, RFR randomly selects a subset of features and chooses the best feature
and split point from this subset. This approach reduces the correlation between trees and lowers
generalization error [36].
RFR trees are grown without pruning, which keeps each tree relatively lightweight. Out-of-bag (OOB) subsets,
formed by the data points not drawn into a given tree's bootstrap sample, can be used to estimate individual tree performance [37].
As the number of trees increases, the generalization error converges, which limits the risk of overfitting.
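The averaging in Equation 2 and the out-of-bag estimate can be illustrated with scikit-learn's RandomForestRegressor. The data and hyperparameters below are assumptions chosen for the example, not the settings used in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data (assumption): four hypothetical predictors, noisy linear target.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 4))
y = X @ np.array([0.4, 0.3, 0.2, 0.1]) + rng.normal(0.0, 0.05, size=300)

# bootstrap=True gives bagging (sampling with replacement);
# max_features="sqrt" considers a random feature subset at each split;
# oob_score=True scores each sample using only trees that did not see it.
forest = RandomForestRegressor(
    n_estimators=100,
    max_features="sqrt",
    bootstrap=True,
    oob_score=True,
    random_state=45,
)
forest.fit(X, y)

# Equation 2: the forest prediction is the mean of the N tree predictions.
tree_preds = np.stack([tree.predict(X) for tree in forest.estimators_])
manual_mean = tree_preds.mean(axis=0)
matches_forest = np.allclose(manual_mean, forest.predict(X))

print("manual average matches forest.predict:", matches_forest)
print("OOB R^2:", round(forest.oob_score_, 3))
```

The OOB score is useful here because it gives a generalization estimate without setting aside additional validation rows.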
identify the specific aptitude skills. Our methodology encompasses ML and NLP models that are used to
address a specific problem. The results demonstrate that we can predict students' scores in the fundamental
programming course based on their previous courses and academic scores. Second, course objective
predictors represent the required skills, namely cognitive, natural language, and mathematics. These
predictors should be applied to the predicted scores of PF to determine the skill set a student possesses in
order to pursue their journey in the programming language domain. Additionally, previous academic courses
also define the abilities attained by students, making it easier to identify their specific ability traits. Therefore,
when designing curricula, instructors need to focus on students' abilities based on their previous courses, which
will make it easier for students to approach further learning. Moreover, by employing this
methodology, instructors can also predict performance in programming language courses, leading to more
effective learning paradigms. Figure 4 illustrates the essential aptitude skills required for learning
programming languages.
Nine features exhibit the most significant amount of missing data, with some courses showing up to
95% missing entries, as highlighted in Table 4. These courses are considered for removal due to the
infeasibility of imputation.
Table 4. Features with Maximum Missing Data
Index  Features                                       Maximum Present Values
1      Data Structures and Algorithms Marks           1236
2      3D Modeling Marks                              1236
3      Communication and Presentations Skills Marks   1237
4      Linear Algebra Marks                           1236
5      Pre-Calculus – I Marks                         1187
6      Pre-Calculus – II Marks                        1237
7      Probability and Statistics Marks               1236
8      Life and Living Marks                          1237
9      Digital and Logic Design Marks                 1236
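Removing features that are too sparse to impute can be sketched with pandas. The small data frame and the 50% cut-off below are hypothetical. The text reports only that some courses had up to 95% missing entries and does not state a threshold.

```python
import numpy as np
import pandas as pd

# Toy records (assumption): column names echo the paper, values are invented.
df = pd.DataFrame({
    "SSC Obtained Marks": [850, 900, 780, 910, 870, 820, 760, 940, 800, 890],
    "Linear Algebra Marks": [np.nan] * 9 + [71],  # 90% missing
    "Applied Physics Marks": [65, np.nan, 70, 58, 81, 77, np.nan, 69, 74, 80],  # 20% missing
})

# Hypothetical cut-off: drop a feature when more than half its entries are missing.
MAX_MISSING_FRACTION = 0.5

# isna().mean() gives the fraction of missing entries per column.
missing_fraction = df.isna().mean()
to_drop = missing_fraction[missing_fraction > MAX_MISSING_FRACTION].index.tolist()
cleaned = df.drop(columns=to_drop)

print("dropped:", to_drop)
print("kept:", list(cleaned.columns))
```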
Criteria for the "SSC Obtained Marks" feature values were established based on the determined ranges.
Values that met the criteria (true) were retained, and those that did not (false) were removed from the
feature index. The resulting symmetric data distribution is illustrated in Figure 7, with the box plot
quartile values (Minimum: 570, Q1: 685, Q2 (Median): 773, Q3: 841, Maximum: 967).
This approach effectively addressed the outliers, resulting in a more evenly distributed dataset
suitable for further analysis.
Through quantitative analysis, the lower bound (LB) was set at the 10th percentile (0.1) and the
upper bound (UB) at the 95th percentile (0.95). This approach removes extreme data points, retaining values
within the 10-95% range of the distribution, as shown in Table 6.
After filtering based on these bounds, outliers were removed, and the data exhibited a more balanced
distribution. This is reflected in the box plot (Figure 8), with quartile values as follows: minimum = 566, Q1
= 628, Q2 = 685, Q3 = 755 and maximum = 909. The removal of outliers ensures a more accurate
representation of the data.
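The percentile-based filter can be sketched with NumPy. The marks below are randomly generated for illustration. Only the 10th and 95th percentile bounds come from the text, and the inclusive comparison at the bounds is an assumption.

```python
import numpy as np

# Synthetic marks (assumption): 1000 values centred near 750.
rng = np.random.default_rng(0)
marks = rng.normal(loc=750.0, scale=100.0, size=1000)

# Lower bound (LB) at the 10th percentile, upper bound (UB) at the 95th.
lower, upper = np.quantile(marks, [0.10, 0.95])

# Keep only values inside [LB, UB]; everything outside is treated as an outlier.
mask = (marks >= lower) & (marks <= upper)
filtered = marks[mask]

print("bounds:", round(lower, 1), round(upper, 1))
print("kept", filtered.size, "of", marks.size, "values")
```

Because the bounds are asymmetric, this filter trims more of the low tail than the high tail, which is worth keeping in mind when reading the resulting box plot.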
The model's performance is visualized in Figure 9, showing actual vs. predicted programming
fundamentals marks.
Figure 9. Regression line plot of actual and predicted programming fundamental scores for KNR
All predictors showed positive influences on programming marks, with course-specific marks
having a stronger impact than general academic performance (SSC and HSSC marks).
Furthermore, the actual Y values (actual programming marks) and predicted (Ŷ) values (predicted
programming marks) are plotted along with both regression lines, as shown in Figure 10.
Figure 10. Regression line plot of actual and predicted programming fundamental scores for RFR
The LGBMR model achieved an R² score of 88% on the training set and 85% on the test set, with an
MSE of 0.0067. The specific coefficient values and interpretation are presented in Table 9. The LGBMR
model suggests that general academic performance (SSC and HSSC marks) has a stronger influence on
programming performance than course-specific marks. Furthermore, the R² prediction is visualized using a
regression line that represents both the actual (Y) and predicted (Ŷ) programming fundamentals marks, as
shown in Figure 11.
Figure 11. Regression line plot of actual and predicted programming fundamental scores for LGBMR
Noted findings:
Introduction to Communication and Presentation Skills, Programming Fundamentals, and English
Composition and Comprehension show strong potential for predicting cognitive skills.
English Composition and Comprehension demonstrates the highest potential for predicting natural
language skills.
Applied Physics and Discrete Structures exhibit the strongest potential for predicting mathematics
skills.
5. DISCUSSION
5.1. Key Findings
The K-Nearest Neighbors Regressor (KNR) demonstrated the best overall performance with R²
scores of 98% (train) and 97% (test). However, the Random Forest Regressor (RFR) showed the lowest
Mean Squared Error (0.001), indicating high precision.
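For reference, the two metrics used in this comparison can be computed as below. This is a dependency-free sketch with invented values, not the study's data. It also checks the identity R² = 1 − MSE / Var(y), which holds when both metrics are computed on the same targets.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Invented normalized marks (assumption), for illustration only.
y_true = [0.50, 0.60, 0.70, 0.80]
y_pred = [0.52, 0.58, 0.71, 0.79]

mse = mean_squared_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)

# Population variance of the targets, used to relate the two metrics.
mean_true = sum(y_true) / len(y_true)
variance = sum((t - mean_true) ** 2 for t in y_true) / len(y_true)

print("MSE:", mse)
print("R^2:", r2)
print("1 - MSE/Var(y):", 1.0 - mse / variance)
```

Because of this identity, the two metrics rank models the same way when they are evaluated on one shared test set. MSE also depends on the scale of the target, so values are only comparable across models when the marks are normalized in the same way.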
Our analysis revealed that cognitive skills were the most crucial for learning programming,
accounting for 62% of the required aptitude. Natural language and mathematics skills contributed 24% and
14%, respectively. This finding challenges the assertion by [31] that mathematics is not an essential predictor
of programming proficiency.
The course predictors for the courses Introduction to Communication and Presentation Skills,
Programming Fundamentals, and English Composition and Comprehension showed strong potential for
predicting cognitive skills. Applied Physics and Discrete Structures demonstrated high potential for
predicting mathematics skills.
5.2. Implications
Our findings suggest that a balanced approach incorporating cognitive, linguistic, and mathematical
skills development is crucial for effective programming education. This challenges traditional curriculum
designs that may overemphasize mathematical skills at the expense of cognitive and linguistic development.
By identifying key aptitudes, educators can develop targeted interventions to enhance students'
programming skills. This could potentially improve retention rates and overall success in computing
programs. The predictive model developed in this study could inform more effective admissions criteria for
computer science programs, helping to identify students with the highest potential for success in
programming courses.
Although this study focused on predicting performance in Programming Fundamentals, our
methodology demonstrated broader applicability across various courses in the computer science curriculum.
Our findings revealed that different courses served as effective predictors for specific aptitudes: Introduction
to Communication and Presentation Skills, Programming Fundamentals, and English Composition and
Comprehension showed strong potential for predicting cognitive skills; English Composition and
Comprehension also showed the highest potential for predicting natural language skills; and Applied Physics and
Discrete Structures demonstrated high potential for predicting mathematics skills. The successful
implementation of our Machine Learning and Natural Language Processing approach suggests that this
framework could be effectively adapted to predict student performance in other courses.
6. CONCLUSION
This study aimed to identify the essential aptitudes for successfully learning programming
languages, addressing a fundamental question in computing education. Our findings reveal that cognitive
skills play the most crucial role (62%), followed by natural language (24%) and mathematics skills (14%),
challenging previous assertions about the relative importance of these aptitudes. The K-Nearest Neighbors
Regressor demonstrated the best overall performance in predicting student success, while specific courses
showed strong potential for predicting various aptitudes. These insights have significant implications for
curriculum design, student support strategies, and admissions criteria in computer science education,
suggesting a need for a balanced approach that incorporates cognitive, linguistic, and mathematical skills
development. While limited by its focus on a single institution and specific courses, this study provides a
foundation for future research, including cross-institutional and longitudinal studies. By employing a novel
approach combining Machine Learning and Natural Language Processing techniques, we have developed a
methodology that could be implemented in other institutions to predict student performance and inform
curriculum design.
ACKNOWLEDGEMENTS
This research was conducted as part of an MS thesis at Riphah International University. The
authors thank the Department of Computer Science and Information Technology, the Faculty,
and the Head of the Department for permitting the use of the university's dataset for this research.
The authors also thank the programme coordinator, who provided the student records and helped
gather and explain their sequence and details.
CONFLICTS OF INTEREST
The authors declare that they have no conflicts of interest related to this work.
REFERENCES
[1] J. Figueiredo and F. Garcia-Penalvo, “Teaching and Learning Tools for Introductory Programming in University
Courses,” SIIE 2021 - 2021 Int. Symp. Comput. Educ., no. September, 2021, doi:
10.1109/SIIE53363.2021.9583623.
[2] L. T. Yong, C. Y. Qi, C. S. Yee, A. Johnson, and N. K. Hoong, “Designing and Developing a PDA Food
Ordering System Using Interaction Design Approach: A Case Study,” in 2009 International Conference on
Computer Technology and Development, 2009, pp. 68–71. doi: 10.1109/ICCTD.2009.18.
[3] I. Milne and G. Rowe, “Difficulties in learning and teaching programming - Views of students and tutors,” Educ.
Inf. Technol., vol. 7, no. 1, pp. 55–66, 2002, doi: 10.1023/A:1015362608943.
[4] M. N. Ismail, N. A. Ngah, and I. N. Umar, “Instructional strategy in the teaching of computer programming: A
need assessment analyses,” Turkish Online J. Educ. Technol., vol. 9, no. 2, pp. 125–131, 2010.
[5] M. Kazemitabaar, J. Chow, C. K. T. Ma, B. J. Ericson, D. Weintrop, and T. Grossman, Studying the effect of AI
Code Generators on Supporting Novice Learners in Introductory Programming, vol. 1, no. 1. Association for
Computing Machinery, 2023. doi: 10.1145/3544548.3580919.
[6] R. Yilmaz and F. G. Karaoglan Yilmaz, “Augmented intelligence in programming learning: Examining student
views on the use of ChatGPT for programming learning,” Comput. Hum. Behav. Artif. Humans, vol. 1, no. 2, p.
100005, 2023, doi: 10.1016/j.chbah.2023.100005.
[7] Y. F. Liu, J. Kim, C. Wilson, and M. Bedny, “Computer code comprehension shares neural resources with
formal logical inference in the fronto-parietal network,” Elife, vol. 9, pp. 1–22, 2020, doi: 10.7554/eLife.59340.
[8] P. J. Rich, S. L. Mason, and J. O’Leary, “Measuring the effect of continuous professional development on
elementary teachers’ self-efficacy to teach coding and computational thinking,” Comput. Educ., vol. 168, no.
March, 2021, doi: 10.1016/j.compedu.2021.104196.
[9] B. Helmlinger, M. Sommer, M. Feldhammer-Kahr, G. Wood, M. E. Arendasy, and S. E. Kober, “Programming
experience associated with neural efficiency during figural reasoning,” Sci. Rep., vol. 10, no. 1, pp. 1–14, 2020,
doi: 10.1038/s41598-020-70360-z.
[10] R. Asif, A. Merceron, S. A. Ali, and N. G. Haider, “Analyzing undergraduate students’ performance using
educational data mining,” Comput. Educ., vol. 113, pp. 177–194, 2017, doi: 10.1016/j.compedu.2017.05.007.
[11] J. Köhler, L. Hidalgo, and J. L. Jara, “Predicting Students’ Outcome in an Introductory Programming Course:
Leveraging the Student Background,” Appl. Sci., vol. 13, no. 21, 2023, doi: 10.3390/app132111994.
[12] M. Endres, M. Fansher, P. Shah, and W. Weimer, “To read or to rotate? comparing the effects of technical
reading training and spatial skills training on novice programming ability,” ESEC/FSE 2021 - Proc. 29th ACM
Jt. Meet. Eur. Softw. Eng. Conf. Symp. Found. Softw. Eng., pp. 754–766, 2021, doi: 10.1145/3468264.3468583.
[13] M. Endres, Z. Karas, X. Hu, I. Kovelman, and W. Weimer, “Relating reading, visualization, and coding for new
programmers: A neuroimaging study,” Proc. - Int. Conf. Softw. Eng., pp. 600–612, 2021, doi:
10.1109/ICSE43902.2021.00062.
[14] A. Zavgorodniaia, A. Hellas, O. Seppälä, and J. Sorva, “Should Explanations of Program Code Use Audio, Text,
or Both? A Replication Study,” ACM Int. Conf. Proceeding Ser., vol. 2020, pp. 1–10, 2020, doi:
10.1145/3428029.3428050.
[15] Y. Kao, B. Matlen, and D. Weintrop, “From One Language to the Next: Applications of Analogical Transfer for
Programming Education,” ACM Trans. Comput. Educ., vol. 22, no. 4, 2022, doi: 10.1145/3487051.
[16] M. Endres, W. Weimer, and A. Kamil, “An Analysis of Iterative and Recursive Problem Performance,” SIGCSE
2021 - Proc. 52nd ACM Tech. Symp. Comput. Sci. Educ., pp. 321–327, 2021, doi: 10.1145/3408877.3432391.
[17] J. Jeuring, R. Groot, and H. Keuning, “What Skills Do You Need When Developing Software Using ChatGPT?
(Discussion Paper),” ACM Int. Conf. Proceeding Ser., pp. 1–11, 2023, doi: 10.1145/3631802.3631807.
[18] S. Rajendran, S. Chamundeswari, and A. A. Sinha, “Predicting the academic performance of middle- and high-
school students using machine learning algorithms,” Soc. Sci. Humanit. Open, vol. 6, no. 1, p. 100357, 2022,
doi: 10.1016/j.ssaho.2022.100357.
[19] S. Srikant, “Understanding Computer Programs: Computational and Cognitive Perspectives,” 2023.
[20] C. H. Kuo, M. Mottarella, T. Haile, and C. S. Prat, “Predicting Programming Success: How Intermittent
Knowledge Assessments, Individual Psychometrics, and Resting-State EEG Predict Python Programming and
Debugging Skills,” 2022 30th Int. Conf. Software, Telecommun. Comput. Networks, SoftCOM 2022, 2022, doi:
10.23919/SoftCOM55329.2022.9911411.
[21] “Representations of Computer Programs in the Human Brain,” pp. 1–30, 2022.
[22] C. Angeli and M. Giannakos, “Computational thinking education: Issues and challenges,” Comput. Human
Behav., vol. 105, p. 106185, Apr. 2020, doi: 10.1016/j.chb.2019.106185.
[23] S. Kılıç, S. Gökoğlu, and M. Öztürk, “A Valid and Reliable Scale for Developing Programming-Oriented
Computational Thinking,” J. Educ. Comput. Res., vol. 59, no. 2, pp. 257–286, 2021, doi:
10.1177/0735633120964402.
[24] R. Scherer, F. Siddiq, and B. Sánchez-Scherer, “Some Evidence on the Cognitive Benefits of Learning to Code,”
Front. Psychol., vol. 12, no. September, pp. 1–5, 2021, doi: 10.3389/fpsyg.2021.559424.
[25] C. H. Kuo and C. S. Prat, “Computer programmers show distinct, expertise-dependent brain responses to
violations in form and meaning when reading code,” Sci. Rep., vol. 14, no. 1, 2024, doi: 10.1038/s41598-024-
56090-6.
[26] E. Fedorenko, A. Ivanova, R. Dhamala, and M. U. Bers, “The Language of Programming: A Cognitive
Perspective,” Trends Cogn. Sci., vol. 23, no. 7, pp. 525–528, 2019, doi: 10.1016/j.tics.2019.04.010.
[27] J. Agarwal, G. W. Bucks, K. A. Ossman, T. J. Murphy, and C. E. Sunny, “Learning a Second Language and
Learning a Programming Language: An Exploration,” ASEE Annu. Conf. Expo. Conf. Proc., 2021, doi:
10.18260/1-2--37423.
[28] D. H. Smith, Q. Hao, F. Jagodzinski, Y. Liu, and V. Gupta, “Quantifying the Effects of Prior Knowledge in
Entry-Level Programming Courses,” in CompEd 2019 - Proceedings of the ACM Conference on Global
Computing Education, 2019. doi: 10.1145/3300115.3309503.
[29] G. Barlow-Jones and D. van der Westhuizen, “Problem solving as a predictor of programming performance,” in
Communications in Computer and Information Science, 2017. doi: 10.1007/978-3-319-69670-6_14.
[30] D. Cukierman, “Predicting success in university first year computing science courses: The role of student
participation in reflective learning activities and in I-clicker activities,” in Annual Conference on Innovation and
Technology in Computer Science Education, ITiCSE, 2015, pp. 248–253. doi: 10.1145/2729094.2742623.
[31] C. S. Prat, T. M. Madhyastha, M. J. Mottarella, and C. H. Kuo, “Relating Natural Language Aptitude to
Individual Differences in Learning Programming Languages,” Sci. Rep., vol. 10, no. 1, pp. 1–10, 2020, doi:
10.1038/s41598-020-60661-8.
[32] B. Shneiderman and R. Mayer, “Syntactic/semantic interactions in programmer behavior: A model and
experimental results,” Int. J. Comput. Inf. Sci., vol. 8, no. 3, pp. 219–238, 1979, doi: 10.1007/BF00977789.
[33] V. J. Shute, “Who is Likely to Acquire Programming Skills?,” J. Educ. Comput. Res., vol. 7, no. 1, pp. 1–24,
1991, doi: 10.2190/vqjd-t1yd-5wvb-rypj.
[34] Y. Ao, H. Li, L. Zhu, S. Ali, and Z. Yang, “The linear random forest algorithm and its advantages in machine
learning assisted logging regression modeling,” 2019. doi: 10.1016/j.petrol.2018.11.067.
[35] V. Rodriguez-Galiano, M. Sanchez-Castillo, M. Chica-Olmo, and M. Chica-Rivas, “Machine learning predictive
models for mineral prospectivity: An evaluation of neural networks, random forest, regression trees and support
vector machines,” Ore Geol. Rev., vol. 71, pp. 804–818, Dec. 2015, doi: 10.1016/j.oregeorev.2015.01.001.
[36] L. Breiman, “Random forests,” Mach. Learn., vol. 45, no. 1, pp. 5–32, Oct. 2001, doi:
10.1023/A:1010933404324.
[37] J. Peters et al., “Random forests as a tool for ecohydrological distribution modelling,” Ecol. Modell., vol. 207,
no. 2–4, pp. 304–318, 2007, doi: 10.1016/j.ecolmodel.2007.05.011.
[38] C. M. Bishop, Pattern Recognition and Machine Learning. Springer, 2006.
[39] A. Keprate and R. M. C. Ratnayake, “Using gradient boosting regressor to predict stress intensity factor of a
crack propagating in small bore piping,” IEEE Int. Conf. Ind. Eng. Eng. Manag., vol. 2017-Decem, no.
December, pp. 1331–1336, 2017, doi: 10.1109/IEEM.2017.8290109.
[40] J. Brownlee, “A gentle introduction to the gradient boosting algorithm for machine learning,” Mach. Learn.
Mastery, vol. 21, 2016.
[41] N. S. Zheng, X. W. Jiang, Y. Ao, and X. Zhao, “Prediction of tariff package model using ROF-LGB algorithm,”
ACM Int. Conf. Proceeding Ser., pp. 54–58, 2019, doi: 10.1145/3352411.3352421.
[42] L. Zhang, T. Xiang, and S. Gong, “Learning a deep embedding model for zero-shot learning,” Proc. - 30th IEEE
Conf. Comput. Vis. Pattern Recognition, CVPR 2017, vol. 2017-Janua, pp. 3010–3019, 2017, doi:
10.1109/CVPR.2017.321.
[43] C. C. Aggarwal, “An Introduction to Outlier Analysis,” Outlier Anal., pp. 1–34, 2017, doi: 10.1007/978-3-319-
47578-3_1.
BIOGRAPHIES OF AUTHORS
Muhammad Faisal Iqbal is a data science professional. He holds a Bachelor's degree in Computer
Science from Sarhad University of Sciences & Information Technology and a Master's degree in
Data Science from Riphah International University, both in Islamabad, Pakistan. His professional
experience spans various domains, including Oracle Financial Applications, Database
Development, Data Science, Machine Learning, Deep Learning, Natural Language Processing, and
Multimodal Learning. He has also provided training on software utilities and version releases to both
private and government organizations. His research interests include Data
Science, Machine Learning, Deep Learning, Natural Language Processing, Multimodal Learning,
Generative AI, and Large Language Models. He can be contacted at email:
muhammadfaisal.softech@gmail.com
Umer Khalil is an enthusiastic and adaptive professional with a passion for enhancing his skills
and contributing to innovative projects. With a strong foundation in remote sensing,
geoinformatics (GIS), and civil engineering, Umer specializes in urban systems, geospatial
mapping, and environmental sustainability. Umer is currently working as a GIS engineer in a tech
company. His expertise spans a range of interdisciplinary fields, including urban and regional
planning, smart city development, geospatial data analysis, machine learning applications, hazard
and risk assessment, and addressing wicked problems in environmental and urban contexts. He can
be contacted at email: umerkhalil745@gmail.com
Afia Ishaq is currently working as a Lecturer at Riphah Institute of System Engineering, Riphah
International University Islamabad, Pakistan. She can be contacted at email:
afiaishaq21@gmail.com