WIP: Beyond Code: Evaluating ChatGPT, Gemini, Claude, and Meta AI as AI Tutors in Computer Science and Engineering Education

Sagnik Nath; So Yoon Yoon

doi:10.1109/FIE61694.2024.10893528

Abstract

This Work-in-Progress (WIP) research paper evaluates the validity of Large Language Models (LLMs) as conversational AI tutors for computer science learning. While the current engineering education literature has predominantly emphasized the rapid evolution of LLMs as conversational AI tutors for programming languages, research on their effectiveness for general STEM topics remains comparatively scarce. This WIP study therefore evaluates the potential of LLMs to facilitate understanding of core hardware design concepts critical to computer science and engineering (CSE) education. By cross-checking the responses of generative AI chatbots to an open-ended CSE-based question, we aimed to uncover how LLMs, such as ChatGPT-3.5, Claude, Gemini, and Meta AI, can contribute to the teaching and learning of general CSE courses rather than specifically coding-based ones. Our method involved simulating a student query on the popular CISC vs. RISC debate in computer architecture and analyzing the chatbots' responses. This initial data collection served as the foundation for a continual comparative analysis aimed at determining the inherent instructional value of each LLM, along with its validity and reliability. To assess the responses systematically, we introduced an evaluation framework focusing on metrics such as response accuracy, persuasiveness, and depth of explanation. We anticipate that this work will not only enrich our understanding of how these advanced LLMs can support general CSE education but also identify areas where further development is needed for a more holistic integration of LLM-based chatbots in supporting student comprehension across engineering education.
