
Toward an Autonomous Software Testing Benchmark

The evolution from automated to autonomous testing calls for a benchmark to assess the maturity of software testing platforms.

Smart Summary

Advancing autonomous software testing necessitates a standardized benchmark to measure platform maturity and guide industry evolution. We introduce the Autonomous Software Testing Model (ASTM), a six-stage framework inspired by autonomous vehicle progression, designed to assess AI/ML-driven testing capabilities from manual control to full AI autonomy. This model provides a cooperative path forward for the entire software testing community.

  • Establish Autonomous Software Testing Maturity Levels: The Autonomous Software Testing Model (ASTM) defines six distinct stages of human-machine integration, from Level 0 (full manual testing) to Level 5 (full AI/ML control and decision-making), outlining a clear progression path for testing platform autonomy.
  • Integrate Autonomous Features Across the Testing Lifecycle: Apply the ASTM benchmark across nine critical software testing activity areas, including Test Management, Orchestration, Maintenance, and Execution, to holistically assess and evolve platform autonomy in specific functional domains.
  • Benchmark and Elevate Platform Autonomy: Employ the ASTM for self-assessment, like our evaluation of Katalon's Level 3 autonomy in areas such as Test Orchestration and Maintenance, to identify current capabilities and strategically develop toward higher levels of AI/ML-driven testing independence.

Autonomous testing model.png

We've been thinking long and hard about software testing and automation here at Katalon. We've noticed a shift in the Software Testing industry from Test Automation to Autonomous Software Testing (AUST). By "autonomous," we mean software testing that is completely driven by Artificial Intelligence (AI) and Machine Learning (ML), without any human intervention. 

We figured the industry shift from Test Automation to AUST could either be messy — a sort of Wild West approach to product maturation — or goal-driven, something to which the entire Software Testing community could aspire in a spirit of cooperation. That's where self-driving cars — or autonomous vehicles, as the mobility industry likes to call them — come in. 

The measured advances in autonomous vehicles inspired us to build a way to benchmark progress toward autonomy in the software testing domain. We call the benchmark the Autonomous Software Testing Model (ASTM). 

We wanted a way to gauge just where software testing platforms are in their development of "self-driving" software testing — Katalon included. We looked to the Society of Automotive Engineers for guidance on building an ASTM benchmark.

Building a software testing benchmark

Building a software testing benchmark - Levels of autonomy.png

The benchmark the Society of Automotive Engineers (SAE) created for the maturation of autonomous vehicles assumes that humans will be involved in the evolution of the technology as a matter of course. In other words, autonomous vehicles will not suddenly pop out of a black box overnight. Instead, humans will be involved with developing, training, and testing the technology for some time to come.

The ASTM, then, sets out six stages of human-machine integration. The lowest level, Zero (0), represents full manual design and implementation of testing procedures. The highest level, Five (5), sees full AI/ML design, implementation, and decision-making of tests:

Level 0: Manual Testing

The human performs all activities and actions and makes all decisions concerning testing of the System Under Test (SUT). 

Level 1: Assisted Test Automation

The human performs most actions for testing the SUT with the support of the computer, e.g., automated software tools. The computer carries out certain testing actions automatically under the human’s control. The human holds the main role and makes all testing decisions. 

Level 2: Partial Test Automation

Both the human and the computer perform testing actions and generate possible decision options, but the human still makes most testing decisions. The computer carries out testing actions based on the decisions made by the human. 

Level 3: Integrated Automated Testing

The computer generates a list of decision options, selects an option, and performs actions based on this option if the human approves. The human can also select another decision option for the computer to carry out. 

Level 4: Intelligent Automated Testing

The computer generates decision options, determines the best option, and performs testing actions according to this option. The human can still intervene if necessary. 

Level 5: Autonomous Testing

The computer has full control of the testing process for the SUT, including making decisions and carrying out all testing actions. The human cannot intervene.
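One way to make the six levels concrete is as an ordered enumeration, so that maturity comparisons between platforms fall out naturally. The sketch below is purely illustrative; the `ASTMLevel` class and its member names are our own shorthand, not part of any tool's API.

```python
from enum import IntEnum

class ASTMLevel(IntEnum):
    """The six ASTM maturity levels, from full manual to full AI/ML control."""
    MANUAL_TESTING = 0                 # human performs all actions and decisions
    ASSISTED_TEST_AUTOMATION = 1       # tools assist, human controls everything
    PARTIAL_TEST_AUTOMATION = 2        # computer acts on human-made decisions
    INTEGRATED_AUTOMATED_TESTING = 3   # computer proposes options, human approves
    INTELLIGENT_AUTOMATED_TESTING = 4  # computer decides, human may intervene
    AUTONOMOUS_TESTING = 5             # full machine control, no human intervention

# Because IntEnum members are ordered integers, maturity comparisons are direct:
print(ASTMLevel.INTEGRATED_AUTOMATED_TESTING > ASTMLevel.PARTIAL_TEST_AUTOMATION)
```

Representing the levels as integers also makes it easy to chart an assessment, for example as the axes of a spider diagram.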

The software testing lifecycle

We felt we needed to ground the ASTM even more soundly in the software testing lifecycle. Defining each stage of software testing would enable us to apply the benchmark to software testing platforms on the market. We identified the activity areas that define software testing as:

  1. Test Management (TMN)
  2. Test Orchestration (TOR)
  3. Test Maintenance (TMA)
  4. Test Generation (TGR)
  5. Test Optimization (TOP)
  6. Test Monitoring (TMO)
  7. Test Execution (TEX)
  8. Test Evaluation and Reporting (TRE)

Autonomous software testing model self-assessment

We applied the ASTM benchmark to Katalon. We scored Katalon's functionality on a spider diagram to gauge the degree to which the platform has evolved its autonomous features.

Autonomous software testing self-assessment.png

In particular, Katalon rates Level 3 autonomy in the areas of Test Orchestration (TOR), Test Maintenance (TMA), Test Monitoring (TMO), and Test Execution (TEX). The benchmark implies that while the Katalon Platform can generate a list of testing decision options for these activities, the system still requires human approval to act on them; alternatively, the human can select another decision option for Katalon to act on. 
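A self-assessment like this one boils down to a mapping from activity-area codes to autonomy levels. The sketch below is hypothetical — it only fills in the four areas the post rates at Level 3, and the `maturity_profile` helper is ours, not an official Katalon scoring function — but it shows how such a profile could be summarized.

```python
# Hypothetical ASTM self-assessment: activity-area code -> autonomy level (0-5).
# Only the four areas rated at Level 3 in the text are filled in here.
assessment = {
    "TOR": 3,  # Test Orchestration
    "TMA": 3,  # Test Maintenance
    "TMO": 3,  # Test Monitoring
    "TEX": 3,  # Test Execution
}

def maturity_profile(scores: dict[str, int]) -> tuple[int, float]:
    """Return the weakest-link level and the mean level across assessed areas."""
    return min(scores.values()), sum(scores.values()) / len(scores)

floor, mean = maturity_profile(assessment)
print(f"Weakest area level: {floor}, average level: {mean:.1f}")
```

Reporting both the minimum and the mean matters: a platform is only as autonomous end-to-end as its least autonomous activity area.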

If you are a Katalon user, where would you place our platform's levels of autonomy? 

If you would like to learn more about how Katalon's AI-powered capabilities can:

  • speed up software deployment
  • increase the accuracy of your software testing efforts
  • reduce overall software development costs,

then contact us for a demo. 


Download our academic paper entitled "Autonomous Software Testing Model" for a fuller description of the ASTM benchmark. The paper also includes:

  • Full descriptions of the software testing lifecycle activities
  • Charts that cross-reference ASTM stages with standard testing activities
  • A look at how the ASTM benchmark gauges other software testing platforms
     

Katalon Team
The Katalon Team is composed of a diverse group of dedicated professionals, including subject matter experts with deep domain knowledge, experienced technical writers, and QA specialists who bring a practical, real-world perspective. Together, they contribute to the Katalon Blog, delivering high-quality, insightful articles that empower users to make the most of Katalon's tools and stay updated on the latest trends in test automation and software quality.