What Exams Has ChatGPT Passed?
Editor & Writer
- ChatGPT performed inconsistently across studies: it excelled at microbiology but barely passed law exams and the USMLE.
- ChatGPT can impressively gather, analyze, and write answers, but it can still get basic ideas wrong, Minnesota Law School professors found.
- Professors at Wharton reported ChatGPT handled problems used extensively in the training and testing of MBA students.
ChatGPT has thrown a wrench in higher education, for better or for worse, as students, academics, administrators, and web denizens test the artificial intelligence (AI) language tool's ability to write essays and computer code and even combat plagiarism.
Now, we're learning that ChatGPT can also pass college-level exams.
Here's a look at how ChatGPT has fared so far in higher education.
ChatGPT Goes to the University of Minnesota Law School
A study from the University of Minnesota Law School let ChatGPT take four final exams and compared its results to those of human students. Researchers found that ChatGPT could pass each exam but wasn't a great student.
Final Exam: Constitutional Law: Federalism and Separation of Powers
ChatGPT Score: B (36th out of 40 students)
Final Exam: Employee Benefits
ChatGPT Score: B- (18th of 19 students)
Final Exam: Taxation
ChatGPT Score: C- (66th of 67 students)
Final Exam: Torts
ChatGPT Score: C- (75th of 75 students)
Overall Score: C+
Researchers concluded that ChatGPT would have been a mediocre law student, barely able to pass at a selective law school. It ranked consistently at the bottom of the class, struggling with multiple-choice questions but doing much better on the essays.
The AI tool had a firm grasp of fundamental law and composed its essays well with no grammatical errors. However, it struggled to focus on the prompt, failed to go into detail, and misunderstood some technical terms.
"In an era where remote exam administration has become the norm, this could hypothetically result in a struggling law student using ChatGPT to earn a JD (Juris Doctor) that does not reflect her abilities or readiness to practice law," researchers concluded in the study.
Wharton Welcomes ChatGPT3
Management professor Christian Terwiesch at the Wharton School of the University of Pennsylvania decided to give ChatGPT3 a small exam and then had the AI tool create exam questions for human students.
Exam: seven-question MBA test given by Terwiesch
"The previous pages should convince even a skeptical reader that Chat GPT 3 has remarkable skills in handling problems as used extensively in the training and testing of our MBA students," Terwiesch said.
ChatGPT3 earned several A+'s on individual questions but struggled heavily with multi-part questions and, at times, even simple ones.
Terwiesch graded ChatGPT3 a B/B- based on its performance and how well it generated questions for future exams. Before concluding the test, Terwiesch asked ChatGPT3 what it thought its implications for MBA education would be in contrast to K-12 education.
"Chat GPT-3 technology has the potential to greatly enhance MBA education by providing students with personalized, interactive learning experiences. It can be used to generate personalized study materials, provide real-time feedback, and facilitate online discussions and group projects. Additionally, Chat GPT-3 can be used to simulate real-world business scenarios, allowing students to practice decision-making and problem-solving in a safe, virtual environment.
"The impact of Chat GPT-3 on MBA education may be different from its impact on K-12 education because MBA students are typically older and more experienced, and are often working professionals. They may have different learning styles and needs than K-12 students, and may require more advanced, specialized content. Additionally, MBA programs often involve a greater emphasis on practical, real-world applications, which Chat GPT-3 may be particularly well-suited to support."
ChatGPT Becomes a Medical Professional
A non-peer-reviewed preprint study posted on medRxiv.org used 305 of the 376 United States Medical Licensing Examination (USMLE) questions from the June 2022 sample exam, excluding questions with images. To reduce memory-retention bias, the study started a new ChatGPT session after each question.
Exam: United States Medical Licensing Examination
Score: >50% to >60%
"The USMLE pass threshold, while varying by year, is approximately 60%," the study said. "Therefore, ChatGPT is now comfortably within the passing range. Being the first experiment to reach this benchmark, we believe this is a surprising and impressive result."
The study said ChatGPT outperformed PubMedGPT, another similar AI tool focused exclusively on biomedical domain literature. It struggled the most on Step 1 of the test, which focuses on subjects humans find more challenging or opaque.
ChatGPT Keeps Its Biology Major After Its First Year
Alex Berezow, Ph.D. microbiologist and executive editor of Big Think, gave ChatGPT a 10-question quiz for an intro-level microbiology college course.
Quiz: Intro-level Microbiology Quiz
Score: 95% A+
Based on its quiz performance, ChatGPT would likely continue its science, technology, engineering, and mathematics (STEM) career. Berezow went into the project skeptical because his questions required specialized knowledge and the ability to synthesize that knowledge, respond concisely, and construct a mathematical solution to a word problem.
"With the exception of Q1, ChatGPT passed with flying colors," Berezow said. "If I were grading the quiz, I would give ChatGPT a 95% — which is far better than what most human students likely would get."
ChatGPT Won't Be Opening a Gastroenterology Center Anytime Soon
The Feinstein Institutes for Medical Research gave ChatGPT two self-assessment tests from the American College of Gastroenterology.
Exam: 2022 and 2021 American College of Gastroenterology (ACG) Self-Assessment Tests
Score: GPT-3: 65.1%; GPT-4: 62.4%
Score Needed to Pass: 70%
The researchers tested GPT-3 and the newest ChatGPT version, GPT-4, with two 300-question multiple-choice exams that excluded image-based questions. Both language models failed, leading the researchers to conclude that ChatGPT has a way to go before it can be used for medical education in gastroenterology.
According to Business Wire, potential explanations for the failure include a lack of access to paid medical journals and reliance on questionable, outdated, or non-medical resources.
"ChatGPT has sparked enthusiasm, but with that enthusiasm comes skepticism around the accuracy and validity of AI’s current role in health care and education," said Andrew C. Yacht, senior vice president, academic affairs and chief academic officer at Northwell Health. "Dr. Trindade’s fascinating study is a reminder that, at least for now, nothing beats hitting time-tested resources like books, journals and traditional studying to pass those all-important medical exams."