Toward an Autonomous Software Testing Benchmark

[Figure: Autonomous testing model]

We've been thinking long and hard about software testing and automation here at Katalon. We've noticed a shift in the Software Testing industry from Test Automation to Autonomous Software Testing (AUST). By "autonomous," we mean software testing that is completely driven by Artificial Intelligence (AI) and Machine Learning (ML), without any human intervention. 

We figured the industry shift from Test Automation to AUST could go one of two ways. It could be messy, a sort of Wild West approach to product maturation, or it could be goal-driven, something to which the entire Software Testing community could aspire in a spirit of cooperation. That's where self-driving cars (or autonomous vehicles, as the mobility industry likes to call them) come in.

The measured advances in autonomous vehicles have inspired us to assemble a way to benchmark progress toward autonomy in the software testing domain. We call the benchmark the Autonomous Software Testing Model (ASTM). 

We wanted a way to gauge just where software testing platforms are in their development of "self-driving" software testing — Katalon included. We looked to the Society of Automotive Engineers for guidance on building an ASTM benchmark.

Building a software testing benchmark

[Figure: Levels of autonomy]

The benchmark for the maturation of autonomous vehicles that the Society of Automotive Engineers (SAE) created observes that humans will be involved in the evolution of the technology as a matter of course. In other words, autonomous vehicles will not suddenly pop out of a black box overnight. Instead, humans will be involved with developing, training, and testing the technology for some time to come.

The ASTM, then, sets out six levels of human-machine integration. The lowest level, Zero (0), represents fully manual design and implementation of testing procedures. The highest level, Five (5), sees full AI/ML design, implementation, and decision-making of tests:

Level 0: Manual Testing

The human performs all activities and actions and makes all decisions concerning testing of the System Under Test (SUT).

Level 1: Assisted Test Automation

The human performs most actions for testing the SUT with the support of the computer, e.g., automated software tools. The computer carries out certain testing actions automatically under the human’s control. The human holds the main role and makes all testing decisions. 

Level 2: Partial Test Automation

Both the human and the computer perform testing actions and generate possible decision options, but the human still makes most testing decisions. The computer carries out testing actions based on the decisions made by the human. 

Level 3: Integrated Automated Testing

The computer generates a list of decision options, selects an option, and performs actions based on this option if the human approves. The human can also select another decision option for the computer to carry out. 

Level 4: Intelligent Automated Testing

The computer generates decision options, determines the best option, and performs testing actions according to this option. The human can still intervene if necessary. 

Level 5: Autonomous Testing

The computer has full control of the testing process for the SUT, including making decisions and carrying out all testing actions. The human cannot intervene.
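
As a rough illustration, the six levels above can be written down as a small data structure that records who makes the testing decision at each level. This is only a sketch of the descriptions in this post: the names AutonomyLevel, decision_maker, and needs_human_approval are hypothetical and do not come from the ASTM paper or from any Katalon API.

```python
from enum import IntEnum


class AutonomyLevel(IntEnum):
    """The six ASTM levels; each value matches the level number in the text."""
    MANUAL = 0       # Level 0: Manual Testing
    ASSISTED = 1     # Level 1: Assisted Test Automation
    PARTIAL = 2      # Level 2: Partial Test Automation
    INTEGRATED = 3   # Level 3: Integrated Automated Testing
    INTELLIGENT = 4  # Level 4: Intelligent Automated Testing
    AUTONOMOUS = 5   # Level 5: Autonomous Testing


def decision_maker(level: AutonomyLevel) -> str:
    """Summarize who makes testing decisions at a level (illustrative wording)."""
    if level <= AutonomyLevel.PARTIAL:
        # Levels 0-2: the human makes all or most testing decisions.
        return "human"
    if level == AutonomyLevel.INTEGRATED:
        # Level 3: the computer selects an option but acts only if approved.
        return "computer, with human approval"
    if level == AutonomyLevel.INTELLIGENT:
        # Level 4: the computer decides and acts; the human can still step in.
        return "computer, human may intervene"
    # Level 5: the computer has full control.
    return "computer"


def needs_human_approval(level: AutonomyLevel) -> bool:
    """True when the computer acts only on decisions a human made or approved."""
    return level <= AutonomyLevel.INTEGRATED


if __name__ == "__main__":
    for level in AutonomyLevel:
        print(level.value, level.name, "->", decision_maker(level))
```

Using an integer-backed enumeration keeps the levels ordered, so a comparison such as "Level 3 or below" reads directly in code.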

The software testing lifecycle

We felt we needed to ground the ASTM even more firmly in the software testing lifecycle. Defining each stage of software testing would enable us to apply the benchmark to software testing platforms on the market. We identified the following activity areas that define software testing:

  1. Test Management (TMN)
  2. Test Orchestration (TOR)
  3. Test Maintenance (TMA)
  4. Test Generation (TGR)
  5. Test Optimization (TOP)
  6. Test Monitoring (TMO)
  7. Test Execution (TEX)
  8. Test Evaluation and Reporting (TRE)

Autonomous software testing model self-assessment

We applied the ASTM benchmark to Katalon, scoring the platform's functionality on a spider diagram to gauge how far its autonomous features have evolved.

[Figure: Autonomous software testing self-assessment (spider diagram)]

In particular, Katalon rates at Level 3 (Integrated Automated Testing) in the areas of Test Orchestration (TOR), Test Maintenance (TMA), Test Monitoring (TMO), and Test Execution (TEX). At this level, the Katalon Platform can generate a list of decision options for these activities and select one, but it still requires human approval before acting on it. Alternatively, the human can select another decision option for Katalon to carry out.
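
A self-assessment like this one boils down to a mapping from activity area to level. The sketch below records only the four scores stated in the paragraph above. The other areas are left out because this post gives their scores only in the diagram. The names assessment and areas_at_level are hypothetical and chosen for illustration.

```python
# Scores stated in the text: four activity areas at Level 3.
# The remaining areas are omitted because their levels are not given in prose.
assessment = {
    "TOR": 3,  # Test Orchestration
    "TMA": 3,  # Test Maintenance
    "TMO": 3,  # Test Monitoring
    "TEX": 3,  # Test Execution
}


def areas_at_level(scores: dict[str, int], level: int) -> list[str]:
    """Return the activity-area codes scored at exactly the given level, sorted."""
    return sorted(area for area, score in scores.items() if score == level)


if __name__ == "__main__":
    print("Level 3 areas:", ", ".join(areas_at_level(assessment, 3)))
```

The same mapping could be filled in for any other testing platform and compared area by area.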

If you are a Katalon user, where would you place our platform's levels of autonomy? 

If you would like to learn more about how Katalon's AI-powered capabilities can:

  • speed up software deployment
  • increase the accuracy of your software testing efforts
  • reduce overall software development costs,

then contact us for a demo. 


Download our academic paper entitled "Autonomous Software Testing Model" for a fuller description of the ASTM benchmark. The paper also includes:

  • Full descriptions of the software testing lifecycle activities
  • Charts that cross-reference ASTM stages with standard testing activities
  • A look at how the ASTM benchmark gauges other software testing platforms
