The development of an artificial intelligence (AI) program that can answer SAT geometry questions as well as an average student is being considered a major step towards machines seeming almost human.
The program, called GeoS, scored 49pc on the exam, which was set to see whether it could be presented with a geometry test and answer it as a human would.
Rather than simply processing the test digitally, which would give a perfect score, the program was presented with the physical paper and forced to interact with it as a startled 16-year-old student would.
To do this, GeoS used a combination of computer vision to interpret diagrams and natural language processing to read and understand text, which the team from the Allen Institute for Artificial Intelligence (AI2) has demonstrated in an interactive explainer test.
Publishing its findings online, the team said that GeoS scored as well as an average student in geometry, achieving a grade of 49pc, which, if extrapolated across the entire US SAT math test, would have given it a score of 500 out of 800.
This is much more than the Turing Test
In doing so, it achieved two things that similar systems had never achieved before: understanding implicit relationships and resolving ambiguous references.
So what does GeoS see when it is presented with a geometry question? Put simply, it takes the diagram and text in the question and aligns the two by comparing what the question is asking logically with the multiple-choice answers available, much as a human would.
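As a rough illustration of that last step only, the idea of checking each multiple-choice answer against relations drawn from the text and the diagram might be sketched as below. This is a toy example and not how GeoS itself is implemented: the function `pick_answer`, the sample right-triangle question and the hand-written relations are all hypothetical, and a real system would have to extract such relations from the question automatically.

```python
def pick_answer(constraints, choices, tol=1e-6):
    """Return the label of the first choice that satisfies every constraint.

    Each constraint is a function of the candidate value that returns 0
    when the relation holds (hypothetical representation for this sketch).
    """
    for label, value in choices.items():
        if all(abs(check(value)) < tol for check in constraints):
            return label
    return None


# Hypothetical question: "What is the length x of the hypotenuse?"
# Relation assumed to come from the text and diagram labels:
# a right triangle with legs 3 and 4, so x^2 = 3^2 + 4^2.
text_relations = [lambda x: x**2 - (3**2 + 4**2)]

# Relation assumed to come from the diagram alone:
# the hypotenuse is longer than either leg.
diagram_relations = [lambda x: 0.0 if x > 4 else 1.0]

choices = {"A": 4.0, "B": 5.0, "C": 6.0, "D": 7.0}

answer = pick_answer(text_relations + diagram_relations, choices)
print(answer)
```

In this sketch only one option satisfies both the text relation and the diagram relation, which is the sense in which the two sources are lined up against the available answers.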
According to the CEO of AI2, Oren Etzioni, it is important to note that this test is not the same as the Turing Test.
“Unlike the Turing Test, standardised tests such as the SAT provide us today with a way to measure a machine’s ability to reason and to compare its abilities with that of a human,” Etzioni said.
“Much of what we understand from text and graphics is not explicitly stated, and requires far more knowledge than we appreciate. Creating a system to be able to successfully take these tests is challenging, and we are proud to achieve these unprecedented results.”