Testing Artificial Intelligence: How Low Can You Go?

By Hans Buwalda - July 12, 2018

Artificial intelligence (AI) has been around for a long time, with periods of great interest but also "AI winters" when public interest waned. Today, with the increasing power of hardware and infrastructure and the interest from big players in multiple industries, AI is moving to the forefront once more. With the growing importance of AI comes the question: How do I test it? AI systems do not necessarily behave predictably. This means that traditional test cases of the form "do this, expect that" are not always sufficient.

Kenneth Boulding wrote an article in 1956 on a hierarchical classification of systems that I found very useful in thinking about them. For him, the lowest level of systems were static systems, like a table or a chair. The next level up was a clock that moves. Then came the thermostat that actively responded to input. Next up were living cells that maintained themselves, then plants that were loosely connected colonies of cells, then animals in which the cell colonies form a being with coordinated behavior. Humans have the properties of animals but can also think and are self-conscious. In the words of Boulding, "a human not only knows, but knows that it knows." His final two levels were social organizations and a possible "transcendental" level that supersedes all the others, which he considered unknowable and therefore did not specify further in his article.

Simplified, I believe we can consider most computer systems to be at the "thermostat" level, in that they can perform actions based on input. Traditional test cases are a level lower. When they are executed, they provide input and compare outcomes in a predictable way, like the clock in Boulding's hierarchy. I believe this is a structural property of testing systems: they can be a level lower than their system under test but not lower than that.
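As a rough illustration of the "do this, expect that" form, the sketch below pairs a toy thermostat with a fixed table of test cases. The function names, the 20-degree setpoint, and the "heat_on"/"heat_off" outputs are assumptions made up for this example, not something taken from a real product.

```python
def thermostat(temperature_c: float, setpoint_c: float = 20.0) -> str:
    """Hypothetical 'thermostat level' system: it reacts to its input."""
    return "heat_on" if temperature_c < setpoint_c else "heat_off"


# Traditional test cases: fixed inputs paired with fixed expected outcomes.
# They run the same way every time, like the clock in the hierarchy.
TEST_CASES = [
    (15.0, "heat_on"),
    (20.0, "heat_off"),
    (25.0, "heat_off"),
]


def run_test_cases() -> list:
    """Return (input, expected, actual) for every failing case."""
    failures = []
    for temperature, expected in TEST_CASES:
        actual = thermostat(temperature)
        if actual != expected:
            failures.append((temperature, expected, actual))
    return failures
```

The test table itself never reacts to what the system does; it only replays inputs and compares results, which is why it sits one level below the thermostat.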

Placing AI systems in the Boulding hierarchy, it seems reasonable to assume that they are at least one level above the thermostat, especially if they can learn and improve themselves. Traditional test cases will not suffice to test them. Some other kind of model will be needed to drive the testing system, raising it to the thermostat level. The model can detect where the AI application is going and whether its outcomes stay within what is expected. Take an AI-powered traffic light control system. If an intersection is empty while traffic is waiting, the system is not intelligent enough. If all lights are green, a boundary condition has been overstepped. The model is itself a system, but one comparable to a thermostat, a level lower than the AI.
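One way to picture such a model is as a small monitor that watches the traffic controller and reacts to what it observes, rather than replaying fixed expectations. The sketch below is only an illustration under assumptions: the Snapshot fields, the TrafficOracle class, and the max_idle_steps tolerance are hypothetical names and choices, not part of any real traffic control system.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Snapshot:
    """One observation of a hypothetical intersection."""
    lights: Dict[str, str]   # approach name -> "green" or "red"
    waiting: Dict[str, int]  # approach name -> vehicles waiting
    crossing: int            # vehicles currently inside the intersection


class TrafficOracle:
    """Thermostat-level model: it responds to each observation it is given."""

    def __init__(self, max_idle_steps: int = 3):
        # Assumed tolerance: how many consecutive snapshots the intersection
        # may sit empty while traffic waits before we report a finding.
        self.max_idle_steps = max_idle_steps
        self.idle_steps = 0

    def observe(self, snap: Snapshot) -> List[str]:
        findings = []

        # Boundary condition: all lights green at the same time.
        if snap.lights and all(c == "green" for c in snap.lights.values()):
            findings.append("boundary violated: all lights are green")

        # Quality condition: empty intersection while traffic is waiting.
        if snap.crossing == 0 and any(n > 0 for n in snap.waiting.values()):
            self.idle_steps += 1
        else:
            self.idle_steps = 0
        if self.idle_steps > self.max_idle_steps:
            findings.append("not intelligent enough: empty intersection while traffic waits")

        return findings
```

A test harness would feed the oracle one Snapshot per simulation step and collect the findings. The oracle does not know which decision is best at any moment; it only knows the limits the AI should stay within.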

The notion that a testing system can be a level lower than the system under test does not mean it necessarily has to stay there. Testing systems themselves can also incorporate AI, even if testing non-AI systems. This can, in particular, be promising when testing very complex systems and even in making AI systems self-testing. You can look for examples at aitesting.org.

I realize that I'm simplifying a complex reality with a constant stream of new ideas and technologies, but hopefully this article gives a starting point for a fascinating journey in testing AI.

Tags:

artificial intelligence

kenneth boulding

1 comment

Komal Lopez

Hi Hans,

That’s an interesting title. You are absolutely right, ‘Testing systems themselves can also incorporate AI, even if testing non-AI systems.’

On similar lines you might like to check out this post, ‘AI definitely needs QA monitoring’…we would love to get your views on this post. Here’s the link: https://www.cigniti.com/blog/artificial-intelligence-needs-qa-monitoring/

August 7, 2018 - 10:12am

About the Author

Hans Buwalda

Hans Buwalda is an internationally recognized expert in test development and testing technology management and a pioneer of keyword-driven test automation. He was the first to present this approach, which is now widely used throughout the testing industry. Originally from The Netherlands, Hans now lives and works in California as CTO of LogiGear Corporation, directing the development of what has become the successful Action Based Testing™ methodology for test automation and its supporting TestArchitect™ toolset. Prior to joining LogiGear, Hans served as project director at CMG (now CGI) in the Netherlands. He is co-author of Integrated Test Design and Automation and a frequent speaker at international conferences.
