Just when we started getting used to ChatGPT’s spooky features, OpenAI dropped a new version of its AI language model.
According to OpenAI, GPT-4 is much more advanced than GPT-3.5, which powers ChatGPT. And to prove it, they put GPT-4 through a bunch of exams. OpenAI tested GPT-4 on a variety of standardized tests, from high school through graduate and professional level, covering math, science, coding, history, literature, and even the exams you take to become a sommelier. The exams consisted of multiple-choice and free-response questions, and GPT-4 was graded using the standard methodology for each exam.
Put down your pencils, GPT-4, it’s time to check your score.
Is law school difficult?
GPT-4 could not only get into law school but also pass the bar exam. The AI language model scored in the 88th percentile on the LSAT (Law School Admission Test) and did even better on the Uniform Bar Exam, scoring in the 90th percentile. By comparison, GPT-3.5 scored in the 40th percentile on the LSAT and the 10th percentile on the bar exam.
College entrance exams were easy
GPT-4 took both the math and the reading and writing sections of the SAT, along with all three sections of the GRE: quantitative, verbal, and writing. It scored in the 80th or 90th percentiles on every section except GRE writing, where it bombed with a 54th percentile score.
Typical overachiever that it is, GPT-4 also took a full slate of AP (Advanced Placement) high school exams. With the exception of a few outliers, it scored between the 84th and 100th percentiles, beating most of the competition.
GPT-4 scored in just the 44th percentile on AP English Language and only the 22nd percentile on AP English Literature, so it may still be some time before GPT-4 supersedes human writers. GPT-4 also ran less hot on AP Calculus BC, scoring between the 43rd and 59th percentiles, which proves that calculus is not easy, even for a supercomputer. However, GPT-4 would still get a 4, so it might be able to skip calculus in college.
GPT-4 has some coding work to do
GPT-4 still has some work to do on its coding skills, which is interesting given that one of its marketed uses is helping developers. On Codeforces, which hosts competitive programming contests, it earned a rating of 392, which places it in the beginner category (ratings up to 1199).
It did pretty well on LeetCode's easy problems (solving 31 out of 41), but struggled at the medium and hard difficulty levels (21/80 and 3/45, respectively). As we saw in the developer demo livestream, GPT-4 can write Python perfectly well but needed some manual tweaking to set the right parameters, which could explain some of these test scores. Or maybe it just didn't eat breakfast that morning.
Okay, but can GPT-4 be a sommelier?
GPT-4 successfully passed its sommelier exams. Its lowest score (77th percentile) came on the most advanced sommelier exam. But for a non-human entity that has never tasted wine, we'll let that one slide.
OpenAI has released a full breakdown of GPT-4's performance. GPT-4 may not write the next great American novel, but its future as a mathematically brilliant lawyer and wine lover looks pretty bright.