ChatGPT ‘passed’ BCS exam, according to Science Bee’s experiment

UNB || BusinessInsider

Published: 20:28, 16 March 2023

Since it became publicly accessible in November last year, ChatGPT, an AI chatbot created by OpenAI, has dominated discussion on the internet and social media. Built on the Generative Pre-trained Transformer 3 (GPT-3) family of language models, ChatGPT can carry on a conversation, respond to inquiries, produce stories, poems, and comics, and resolve challenging programming problems.

ChatGPT has also been put through, and has even passed, numerous challenging examinations across the globe as part of experiments, including the Wharton MBA exam, the United States Medical Licensing Exam, and law school exams.

Although the chatbot recently failed the Indian UPSC (Union Public Service Commission) exam, which is the benchmark test for recruitment to higher civil services of the Government of India, Bangladeshi netizens wondered whether ChatGPT would be able to pass the BCS (Bangladesh Civil Service) exam or not.

Science Bee, one of the largest science-based education platforms for youths in the country, has recently revealed on its social media platforms that ChatGPT has “successfully passed” the BCS preliminary exam, scoring 130 out of 200 marks in total.

Talking about the experiment with UNB, Science Bee Founder Mobin Sikder and Executive Member Metheela Farzana Melody shared how the team tested the chatbot on the BCS exam, following a month of planning and preparation and seven days of frequent testing.

“First of all, we researched how to take the test to get the most realistic results,” Mobin told UNB. “Since ChatGPT is trained on a dataset available till September 2021, we decided to conduct the test on the questions of the latest BCS exam – 44th BCS, held in May 2022.”

“After selecting the exam, we collected the question papers and answers. Since the question paper is allowed to be taken away after the exam, securing it did not require much time. The answer sheet is, however, not published directly. So, we prepared the final answer sheet on our own, after multiple testing from various third-party sources,” team Science Bee explained.

The language barrier proved a headache during the experiment, as the BCS exam is conducted in Bangla and the chatbot is trained mainly in English. The questions had to be translated into English to keep the exam fair.

In the 44th BCS, each question carried 1 mark: a correct answer earned 1 mark, and 0.5 mark was deducted for each wrong answer. Candidates could also skip any question, in which case no marks were added or subtracted. The same marking scheme was applied to ChatGPT. At the beginning, it was told about the MCQ exam and given its instructions through a text prompt, after which it was ready to take the exam.

However, some questions were picture-based, according to team Science Bee. Since the version of ChatGPT tested is not multimodal, it cannot read or understand images, so those questions could not be entered and were excluded. In addition, some questions on Bangla language and literature could not be translated into English without changing their meaning.

“The total number of such rejected questions was 22. As these are weaknesses of ChatGPT, invalid questions were treated as unanswered and no negative marking was done,” according to team Science Bee.

The remaining 178 questions were put to ChatGPT along with their answer options, and it answered 142 of them correctly. It answered 24 questions incorrectly, and for the remaining 12 it stated that the correct answer was not among the options. That means the chatbot earned 142 marks for its correct answers, lost 12 marks for its 24 wrong answers, and neither gained nor lost marks for the 12 questions it left unanswered. So, going by the 44th BCS exam questions, ChatGPT passed with a total of 130 marks.
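The tally above can be checked with a short calculation. The sketch below is only an illustration and is not part of Science Bee's report; the function and variable names are assumptions made for this example, while the figures are the ones reported above.

```python
# Illustrative check of the reported tally; all names here are hypothetical.
TOTAL_QUESTIONS = 200   # questions in the 44th BCS preliminary exam
EXCLUDED = 22           # picture-based or untranslatable questions
CORRECT = 142           # answered correctly
WRONG = 24              # answered incorrectly
NO_ANSWER = 12          # chatbot said no option was correct

MARK_PER_CORRECT = 1.0
PENALTY_PER_WRONG = 0.5


def bcs_score(correct: int, wrong: int) -> float:
    """Apply the marking scheme: +1 per correct, -0.5 per wrong, 0 if skipped."""
    return correct * MARK_PER_CORRECT - wrong * PENALTY_PER_WRONG


# Questions actually put to the chatbot
asked = TOTAL_QUESTIONS - EXCLUDED

# The three outcomes should account for every question asked
accounted_for = CORRECT + WRONG + NO_ANSWER

score = bcs_score(CORRECT, WRONG)

print(f"Questions asked: {asked}")
print(f"Questions accounted for: {accounted_for}")
print(f"Score: {score} out of {TOTAL_QUESTIONS}")
```

Excluded and unanswered questions contribute nothing to the score, so only the correct and wrong counts enter the calculation.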

In the 44th BCS exam, a total of 3,50,716 candidates applied and of them, 2,76,760 candidates participated in the preliminary exam. Only 15,708 candidates passed the preliminary exam, according to reports.

“As there is no specific pass mark for BCS and the cut-off mark is not officially released, we were in touch with several candidates who appeared for the 44th BCS exam. According to the information given by them, the cut-off mark in general cadre was 125±. Since ChatGPT secured 130 marks in our test, it can be said that ChatGPT has successfully passed BCS preliminary exam,” team Science Bee told UNB.

Explaining the chatbot's performance further, Science Bee said ChatGPT answered the questions quite well overall. However, it was weak in the Bangla language and literature category, where it answered only 5 out of 35 questions. It performed well in science, computing, and English language and literature. In the mental skills and math categories, it took a considerable amount of time to answer most of the questions correctly.

“Besides, many times there have been incidents like getting stuck in the middle of answering. In that case, we had to take the help of ‘Regenerate Response’ to proceed and move forward,” team Science Bee said.

The questions for the exam were collected and translated by Metheela. The test was managed overall by Science Bee's Content Production Head Annoy Debnath, and the final report was edited by Mobin and Sadia Binte Chowdhury.

“We did this test as part of an interesting experiment and will conduct further tests with other examinations when ChatGPT-4 will be available. The chatbot is learning consistently and becoming powerful every single day, and through this type of test, we want to convey a message to aspiring learners and students that we need to move one step ahead of ChatGPT with our learnings.”

“That means, we need to stop relying on memorising and copy-paste practices because ChatGPT can do it and will be doing it even better with future versions, and also there are other AI projects in the pipeline such as Google’s Bard. It can be a great assistant and companion to humankind, and it will not replace anyone if we can continue to improve our learning. That is the motto of our research, aligned with our motto and tagline ‘learn like never before’. We want people to understand the importance of learning and be skilled in order to make AI useful,” Mobin and team Science Bee told UNB.