A Complete Guide To AI/ML Software Testing


There is no doubt about it: Artificial Intelligence (AI) and Machine Learning (ML) have changed the way we think about software testing. Since the introduction of the disruptive AI-powered language model ChatGPT, a wide range of AI-augmented technologies has emerged, and the benefits they bring can’t be ignored. In this article, we will show you how to leverage AI/ML in software testing to bring your QA game to the next level.

What is AI and ML in Software Testing?

AI/ML in software testing is the integration of Artificial Intelligence and Machine Learning technologies into many phases and aspects of software testing to enhance it. With the help of AI/ML, software testers can now supercharge their testing by leveraging the human-level decision-making capabilities of these technologies. 


An intelligent system like AI/ML can bring tremendous benefits to QA teams that know how to incorporate it the right way. Although not a truly recent technology, AI/ML has improved considerably in the past few years, and there are many scenarios in which it can be used.

How To Use AI/ML In Software Testing?

According to the State of Software Quality Report 2024, test case generation is the most common application of AI for both manual testing and automation testing, followed by test data generation. You can download the report for the latest insights in the industry.

top QE activities where AI is applied


There are many ways to use AI/ML to power up software testing. The key to unlocking those capabilities is knowing what these technologies can do, then finding creative ways to incorporate them into your day-to-day testing tasks. Note that there are 3 major approaches to choosing an AI/ML system for your software testing:

  • Building your own AI tailored to the needs of the organization, which is a long-term and resource-intensive move
  • Leveraging foundation models via APIs or open models
  • Using an off-the-shelf tool with built-in AI features for testing

The final decision on which approach to use depends on the vision of the entire organization and the team. Whichever one you choose, there are generally 5 areas where an AI/ML system can contribute the most:

1. Automated Smart Test Case Generation

We all know how disruptive ChatGPT has been to the software engineering industry. Basic coding tasks have now been handed to ChatGPT, requiring freshers in the field to immediately upgrade their skills to meet the rising standards. 

Similarly, in the software testing field, we can ask ChatGPT to write test cases for us. Keep in mind that ChatGPT is not a one-size-fits-all solution. It often works, but when the requirements get complex, you need to provide extra context so that it can better execute your prompt.


For example, I asked ChatGPT to “write a Selenium unit test to check if clicking on a button on https://exampleweb[dot]com will lead to the correct link https://exampleweb[dot]com/example-destination”, and ChatGPT provided a fairly well-written code snippet with a detailed explanation of each step, as well as a clear assertion verifying that the destination link is indeed the expected URL. It is actually impressive!

Example of Automated Smart Test Case Generation
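
To give a sense of what such a generated test might look like, here is a minimal sketch in Python. It is not ChatGPT's actual output. The URLs, the button id, and the function names are placeholders chosen for illustration, and the Selenium import is deferred so that the core check can also be exercised with a stand-in driver.

```python
import time

# Placeholder values for illustration only; substitute your own site and element id.
START_URL = "https://example.com/"
EXPECTED_URL = "https://example.com/example-destination"
BUTTON_ID = "example-button"  # hypothetical id of the button under test


def button_leads_to(driver, start_url, button_id, expected_url, timeout=5.0):
    """Open start_url, click the button, and report whether the browser lands on expected_url."""
    driver.get(start_url)
    # "id" is the locator strategy string that Selenium exposes as By.ID.
    driver.find_element("id", button_id).click()
    # Navigation may take a moment, so poll the URL until it matches or time runs out.
    # In real Selenium code, WebDriverWait with expected_conditions.url_to_be does the same job.
    deadline = time.monotonic() + timeout
    while driver.current_url != expected_url and time.monotonic() < deadline:
        time.sleep(0.1)
    return driver.current_url == expected_url


def run_in_chrome():
    """Run the check in a real browser; needs the selenium package and a local Chrome install."""
    from selenium import webdriver  # imported lazily so the helper above has no hard dependency

    driver = webdriver.Chrome()
    try:
        assert button_leads_to(driver, START_URL, BUTTON_ID, EXPECTED_URL)
    finally:
        driver.quit()  # always close the browser, even when the assertion fails
```

Whatever the AI produces, it is worth reading the generated locator and assertion as carefully as you would review a colleague's script before adding it to a suite.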

Traditionally, these automated test scripts had to be written by skilled testers with coding skills using test automation frameworks. Not just that, ongoing maintenance is necessary whenever the source code changes; otherwise the scripts no longer match the updated code and produce wrong test results. This is a common challenge among automation testers, since they don’t have enough resources to constantly update their test scripts in the ever-changing Agile testing environment.


Thankfully, when incorporating AI/ML in software testing, you can now use simple language prompts to guide the AI in crafting tests for specific scenarios and speed up your test maintenance work.

2. Test Case Recommendation

The power of Machine Learning lies in its ability to learn from data and make predictions on its own without having to be explicitly programmed for each specific scenario. 

This means that as time goes on, the AI learns more about how users use the app that you're testing, and it starts adjusting how it creates tests to match your business needs. It works out your usual testing patterns by looking at the rules, existing tests, and historical data, then suggests the best test cases for your situation.

Imagine you're running an eCommerce website with thousands of products, different user paths, and constant updates to the website. It is extremely challenging for testers to cover all possible scenarios even with certain areas automatically tested. 

This is where Machine Learning comes into play. At first, you need to feed the AI with data about how users interact with your website. This data includes things like what products they view, what actions they take, and when they abandon their carts. 

As the AI gathers and analyzes this data over time, it starts noticing patterns. It recognizes that certain products are frequently viewed together, and that users tend to abandon their carts during specific steps in the checkout process.


With this information, the AI no longer has to follow a predefined set of tests; instead, it tailors the testing to mimic how real users actually interact with the website. In this case, the AI will focus more on testing the checkout process and the interactions between related products.
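
The idea of letting usage data steer test focus can be illustrated with a deliberately simple sketch. It uses plain frequency counts rather than a trained model, and the event format and area names are invented for the example. A real recommendation engine would learn from far richer signals.

```python
from collections import Counter

# Hypothetical session events: which area of the site was visited and how the visit ended.
EVENTS = [
    {"area": "checkout/payment", "outcome": "abandoned"},
    {"area": "checkout/payment", "outcome": "abandoned"},
    {"area": "checkout/payment", "outcome": "completed"},
    {"area": "checkout/payment", "outcome": "abandoned"},
    {"area": "product/related-items", "outcome": "completed"},
    {"area": "product/related-items", "outcome": "abandoned"},
    {"area": "product/related-items", "outcome": "completed"},
    {"area": "home", "outcome": "completed"},
    {"area": "home", "outcome": "completed"},
    {"area": "home", "outcome": "completed"},
    {"area": "home", "outcome": "completed"},
]


def rank_test_areas(events):
    """Order areas by abandoned sessions first, then by total traffic, highest first."""
    visits = Counter(e["area"] for e in events)
    abandons = Counter(e["area"] for e in events if e["outcome"] == "abandoned")
    return sorted(visits, key=lambda area: (abandons[area], visits[area]), reverse=True)


if __name__ == "__main__":
    for area in rank_test_areas(EVENTS):
        print(area)
```

With the sample data, the payment step comes out on top, which matches the intuition in the text: test effort goes where real users struggle most.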


The great thing about this technology is that the more you use it, the better it becomes. Talk about aging like fine wine! Over a longer time frame, AI/ML can even turn into a full-on predictive analytics engine, capable of detecting potential issues based on the patterns and defects it has learned. Having a leading indicator to guide the direction of QA activities empowers QA managers and team leaders to make more informed decisions.

3. Test Data Generation

In certain cases, testing comprehensively requires a large volume of data, or data covering many combinations of inputs.


Let’s use the eCommerce website example above again. For global eCommerce websites that ship across borders, many factors go into how the shipping fee is calculated, such as package weight, geographic zone, distance, carrier selection, coupons, or even additional charges. Such complexity makes covering all shipping scenarios in the testing project quite difficult.

a spreadsheet of test data

However, you can leverage AI/ML to generate a mock data set for executing the test case without having to collect real-life data. In the case mentioned above, you can ask the AI for a list of addresses in the regions your business operates in. If you want to go into the minute details (shipping weight, time zones, additional fees, etc.), just tell the AI and save yourself hours of manual work.
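
Before handing the job to an AI, it helps to know how large the scenario space is. The sketch below is a plain combinatorial baseline with no AI involved. It enumerates every combination of a few shipping factors and attaches a reproducible random distance. All factor values and field names are assumptions made up for illustration. A list like this can also serve as a checklist for verifying that AI-generated data covers every combination.

```python
import itertools
import random

# Hypothetical factor values; replace them with the ones your shipping rules use.
WEIGHTS_KG = [0.5, 5.0, 30.0]
ZONES = ["domestic", "regional", "international"]
CARRIERS = ["standard", "express"]
COUPONS = [None, "FREESHIP"]


def shipping_scenarios(seed=0):
    """Return one mock shipment per combination of weight, zone, carrier, and coupon."""
    rng = random.Random(seed)  # seeded so the same data set is produced on every run
    rows = []
    for weight, zone, carrier, coupon in itertools.product(WEIGHTS_KG, ZONES, CARRIERS, COUPONS):
        # Domestic shipments get short distances, everything else gets long ones.
        distance = rng.randint(1, 500) if zone == "domestic" else rng.randint(501, 15000)
        rows.append({
            "weight_kg": weight,
            "zone": zone,
            "carrier": carrier,
            "coupon": coupon,
            "distance_km": distance,
        })
    return rows


if __name__ == "__main__":
    data = shipping_scenarios()
    print(len(data), "scenarios, first:", data[0])
```

Even four small factors already yield 36 scenarios, which is why generating the data by hand quickly becomes impractical.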

4. Test Maintenance For Regression Testing

Software and websites often undergo updates, especially in organizations following the Agile testing approach. When these updates happen, test scripts created to test specific parts of the software can stop working correctly. 


For instance, if we wrote a test to click on a button identified by the label "login-button," any change in this label could cause the test to fail. Fixing numerous test cases each time code changes (which happens frequently) can be time-intensive. This is also a common bottleneck testers face during regression testing sessions.

AI can automatically adjust the test scripts whenever there's a code change. For example, if AI can't locate an object using its current method, it will try another method and proceed with the test. This feature is known as the "Self-Healing Mechanism." It reduces the burden on testers and ensures tests continue running smoothly even when there are changes in the code.
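
The fallback behavior described above can be sketched in a few lines. This is a simplified illustration of the general idea, not the implementation of any particular tool. The function name, the locator list, and the in-memory "page" are assumptions for the example. With Selenium you would pass `driver.find_element` as `find` and `NoSuchElementException` as the not-found error.

```python
def find_with_fallback(find, locators, not_found=(LookupError,)):
    """Try each (strategy, value) locator in order and return (element, locator_that_worked)."""
    failures = []
    for locator in locators:
        try:
            return find(*locator), locator
        except not_found as exc:
            failures.append(f"{locator}: {exc!r}")  # remember why this locator failed
    raise LookupError("no locator matched: " + "; ".join(failures))


# A stand-in for a page after a redesign: the old id is gone, but the CSS selector still matches.
PAGE = {("css selector", "button[type='submit']"): "<login button>"}


def fake_find(strategy, value):
    """Mimic a driver lookup; raises KeyError (a LookupError) when nothing matches."""
    return PAGE[(strategy, value)]


LOGIN_LOCATORS = [
    ("id", "login-button"),                     # original locator, now broken
    ("css selector", "button[type='submit']"),  # fallback
    ("xpath", "//button[text()='Log in']"),     # last resort
]

if __name__ == "__main__":
    element, used = find_with_fallback(fake_find, LOGIN_LOCATORS)
    print("found", element, "via", used)
```

Returning the locator that worked matters: a self-healing tool can store it and try it first on the next run, so the repair carries over instead of being rediscovered every time.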

5. Visual Testing

Visual testing is an entirely different arena for software testers. In the past, human testers had to rely on their eyes to find visual differences between how the user interface (UI) looked before a change was released and how it looked afterward. Problems arise when we try to automate visual testing by comparing screenshots of the UI taken before and after the change:

  • The computer flags extremely minor, pixel-level inconsistencies between those screenshots as visual bugs, even though these so-called “bugs” barely affect User Experience.
  • Different computers render images differently, so the images being compared may be technically different without the difference being visually noticeable.
  • Certain UI elements (such as the Cart icon on eCommerce websites) change all the time, and test scripts don’t understand that those changes happen for a good reason. They only understand that visual differences mean visual bugs.

This is a common problem with visual testing tools, and AI goes a long way toward addressing it. The AI can learn whether certain zones should be “ignored” even when they change, and whether the differences it notices between the 2 screenshots are reasonable. In a way, we give the AI the capability to look at visual testing from a more human perspective, instead of rigidly comparing every single pixel.
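
To make the two ideas concrete (tolerating tiny rendering differences and ignoring zones that are expected to change), here is a small hand-written comparison. It uses fixed thresholds rather than a learned model, so it only approximates what an AI-based tool does. The image representation, the parameter names, and the default tolerance are assumptions for illustration.

```python
def visual_diff_ratio(before, after, ignore=(), tolerance=8):
    """Return the fraction of compared pixels that differ by more than `tolerance`.

    before, after: same-sized images as lists of rows of grayscale values (0-255).
    ignore: (top, left, bottom, right) boxes to skip; bottom and right are exclusive.
    """
    if len(before) != len(after) or any(len(b) != len(a) for b, a in zip(before, after)):
        raise ValueError("images must have the same dimensions")
    boxes = list(ignore)
    compared = changed = 0
    for y, (row_before, row_after) in enumerate(zip(before, after)):
        for x, (p_before, p_after) in enumerate(zip(row_before, row_after)):
            if any(top <= y < bottom and left <= x < right for top, left, bottom, right in boxes):
                continue  # dynamic zone, e.g. a cart badge
            compared += 1
            if abs(p_before - p_after) > tolerance:
                changed += 1  # large enough to be visible, not just rendering noise
    return changed / compared if compared else 0.0


if __name__ == "__main__":
    before = [[100] * 4 for _ in range(4)]
    after = [row[:] for row in before]
    after[0][0] = 103   # sub-tolerance rendering noise
    after[0][3] = 255   # cart badge changed, inside the ignored box
    after[2][2] = 0     # a genuine visual change
    print(visual_diff_ratio(before, after, ignore=[(0, 3, 1, 4)]))
```

In the demo, only the genuine change counts toward the result. An AI-based tool aims to make the same kind of judgment without someone having to hand-pick the tolerance and the ignore boxes.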

Benefits Of Using AI/ML in Software Testing

AI-based testing is only in its beginning phase, and yet it has already addressed many of the bottlenecks and challenges commonly found in automation testing, so we can expect more exciting developments to come. For now, AI is the future, but before long it will be the norm, and testers must fully leverage its benefits to keep up with the trend.


Here is how AI can help:

  • As AI evolves, the productivity of developers increases, leading to more code at more speed, and of course, more quality issues. Testers need AI too so that they can boost productivity and move at a corresponding speed.
  • As AI evolves, the quality bar is pushed to a new height. Applications integrated with AI introduce a new host of quality challenges that testers are responsible for, and AI-based testing can help resolve them.
  • AI accelerates test creation and brings better test maintenance.
  • AI helps teams see more clearly and make more informed decisions thanks to its recommendations.
  • AI streamlines the testing process to help everything run more smoothly and efficiently.

It is important to note that AI/ML does not replace testers. It only supercharges them, becoming an “extension” of the tester’s mind that helps solve problems faster and smarter. It is similar to having an assistant that can perform amazing cognitive tasks, all the time.

Interested? Watch a recording of our webinar on how Generative AI can help you level up software testing:

Revolutionizing Testing with Generative AI | Webinar Katalon

Challenges of AI/ML in Software Testing

Having AI/ML in software testing comes with added responsibilities that we should embrace. Some of the challenges that testers should be aware of include:

  • Training Data Quality: AI/ML models require large and diverse datasets for effective training, so during the early days of AI implementation, the recommendations may not yet be tailored to the organization’s specific needs. Over time, however, the models adapt and become more familiar with the patterns in the system, leading to better insights.
  • Unforeseen Test Cases: AI/ML models might miss certain scenarios if they have not been adequately trained, and we wouldn’t even know that those test cases are missing.
  • Overfitting and Underfitting: Overfitting occurs when a model is too specific to the training data and doesn't generalize well, while underfitting occurs when the model is too simple to capture patterns. Be sure to balance these 2 extremes when working with AI by being explicit and detailed about what you want it to accomplish.
  • Model Drift: A model can become less effective over time because the way the software is used changes, a phenomenon known as model drift. Combating it requires continuous data monitoring.
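
As one illustration of the data monitoring mentioned under Model Drift, the sketch below compares how often each user action occurred in a baseline period versus a recent one, using the Population Stability Index (PSI). PSI is just one possible drift signal and is chosen here as an example. The action names, the sample sizes, and the alert threshold are assumptions that you would tune for your own data.

```python
import math
from collections import Counter


def population_stability_index(baseline, recent, floor=1e-6):
    """PSI between two samples of categorical values; 0.0 means identical distributions."""
    if not baseline or not recent:
        raise ValueError("both samples must be non-empty")
    base_counts, recent_counts = Counter(baseline), Counter(recent)
    psi = 0.0
    for category in set(base_counts) | set(recent_counts):
        # Clamp shares to a small floor so unseen categories do not cause log(0) or division by zero.
        expected = max(base_counts[category] / len(baseline), floor)
        actual = max(recent_counts[category] / len(recent), floor)
        psi += (actual - expected) * math.log(actual / expected)
    return psi


# Hypothetical usage logs: the mix of actions has shifted toward checkout.
BASELINE = ["search"] * 50 + ["view"] * 30 + ["checkout"] * 20
RECENT = ["search"] * 20 + ["view"] * 30 + ["checkout"] * 50

DRIFT_THRESHOLD = 0.25  # assumed alert level; adjust to your own tolerance

if __name__ == "__main__":
    psi = population_stability_index(BASELINE, RECENT)
    print(f"PSI = {psi:.3f}, drift suspected: {psi > DRIFT_THRESHOLD}")
```

A check like this can run on a schedule. When the score crosses the threshold, it is a prompt to review whether the model's recommendations still reflect how the software is actually used.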

Best Practices When Using AI/ML in Software Testing

  • Gain a foundational understanding of AI/ML systems as well as your workflows: AI seems like magic (and in some ways it is), but the key to using AI is being practical in your approach. Build a solid understanding of the AI/ML systems you’re working with, understand your own workflows, then find ways to integrate the two. AI should streamline your workflow and simplify tasks that you previously spent a lot of time on, and that is only possible if you understand both correctly.
  • Be patient: AI systems take time to develop and to learn the tasks you assign them. Treat the AI as a blank canvas that you can gradually train to perform complex tasks. It is even better to have a dedicated plan for how you want to integrate AI into your workflow. You don’t have to take big leaps. Small, non-disruptive steps work too: move at your own pace.
  • Learn prompt engineering: When working with AI, and especially generative AI, it is crucial to provide well-structured and precise input prompts so that the models generate accurate and relevant outputs. This gives you a certain sense of control over the probabilistic nature of the system. Prompt engineering is all about bringing context, specifications, and boundaries to the table, and that is a real skill in itself.
  • Remember that it is just a tool: No matter what, AI is only a tool, and it is a truly powerful one when used in concert with testers. Testers will not be replaced by AI; rather, they should and will be empowered by the technology. The more skilled and experienced the tester, the more benefit they can extract from these tools. Bring your own creativity and originality to the table, and let AI propel those ideas to another level.
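
The point about context, specifications, and boundaries can be made tangible with a small prompt template. The section names and the helper function below are one possible layout invented for this example, not a required format for any particular model.

```python
def build_test_prompt(context, task, constraints, output_format):
    """Assemble a structured prompt: context, then the task, its boundaries, and the expected output."""
    lines = ["Context:", context.strip(), "", "Task:", task.strip(), "", "Constraints:"]
    lines += [f"- {item}" for item in constraints]
    lines += ["", "Output format:", output_format.strip()]
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_test_prompt(
        context="An eCommerce checkout page with guest and registered checkout.",
        task="Write manual test cases for applying a coupon at checkout.",
        constraints=[
            "Cover valid, expired, and already-used coupons.",
            "Do not assume any payment provider.",
        ],
        output_format="A numbered list with steps and expected results.",
    )
    print(prompt)
```

Keeping the prompt in a function like this also makes it reviewable and repeatable, so the whole team sends the model the same context instead of each tester improvising.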

Testing With AI vs Testing For AI Systems

Testing with AI and testing for AI systems are 2 completely different topics.

Testing with AI means using AI models to supercharge your testing, while testing for AI systems means checking whether those models themselves perform as expected. However, the non-deterministic nature of AI makes the latter quite challenging.

By non-deterministic, we mean that AI systems can behave differently for the same request. In the context of generative AI, this means they can answer the same question differently each time.

This makes traditional software testing concepts like “test coverage” hard to apply, since it is impractical to measure test coverage when there are theoretically infinite possibilities and scenarios within a large enough AI system.
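
Because the exact output cannot be predicted, one common way to test such systems is to assert properties that every acceptable answer must satisfy, and to sample the system several times. The sketch below uses a stand-in function in place of a real model. The function names, the canned answers, and the checks are all invented for illustration.

```python
import random


def fake_model(question, rng):
    """Stand-in for a generative model: the same question yields differently worded answers."""
    answers = [
        "The refund window is 30 days.",
        "You can request a refund within 30 days of purchase.",
        "Refunds are accepted for 30 days.",
    ]
    return rng.choice(answers)


def pass_rate(generate, checks, samples=20):
    """Call generate() repeatedly and return the fraction of outputs that satisfy every check."""
    passed = 0
    for _ in range(samples):
        output = generate()
        if all(check(output) for check in checks):
            passed += 1
    return passed / samples


# Properties any acceptable answer should have, regardless of wording.
CHECKS = [
    lambda text: "30 days" in text,  # states the correct policy
    lambda text: len(text) < 200,    # stays concise
]

if __name__ == "__main__":
    rng = random.Random(42)
    rate = pass_rate(lambda: fake_model("What is the refund window?", rng), CHECKS)
    print(f"pass rate: {rate:.0%}")
```

In practice the team agrees on an acceptable pass rate instead of expecting a single exact string, which shifts the test from "is the output identical?" to "is the output acceptable often enough?".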

Challenges of Testing AI Systems

  • AI/ML models can be highly complex and involve intricate mathematical algorithms.
  • There are almost infinite possibilities of results generated by AI, and it is impractical to test all of them.
  • AI/ML applications are often considered "black boxes," meaning their internal workings are complex and hard to interpret. This makes it challenging to determine what inputs are meaningful to test.
  • AI/ML models can be susceptible to adversarial inputs—carefully crafted inputs that exploit vulnerabilities and cause unexpected behavior. These inputs might not be captured by traditional coverage criteria.
  • AI/ML models can adapt and "learn" from new data, making their behavior evolve over time, and test cases for them must evolve accordingly.

Start Using AI/ML in Software Testing With Katalon


Katalon is a comprehensive AI-augmented software quality management platform, orchestrating the entire software testing lifecycle from test creation, execution, maintenance, to reporting across web, API, desktop, and mobile applications.

  • StudioAssist: Leverages ChatGPT to autonomously generate test scripts from a plain language input and quickly explains test scripts for all stakeholders to understand.
  • Katalon GPT-powered manual test case generator: Integrates with JIRA, reads the ticket’s description, extracts relevant information about software testing requirements, and outputs a set of comprehensive manual test cases tailored to the described test scenario.
  • SmartWait: Automatically waits until all necessary elements are present on screen before continuing with the test.
  • Self-healing: Automatically fixes broken element locators and uses those new locators in following test runs, reducing maintenance overhead.
  • Visual testing: Indicates if a screenshot will be taken during test execution using Katalon Studio, then assesses the outcomes using Katalon TestOps. AI is used to identify significant alterations in UI layout and text content, minimizing false positive results and focusing on meaningful changes for human users.
  • Test failure analysis: Automatically classifies failed test cases based on the underlying cause and suggests appropriate actions.
  • Test flakiness: Understands the pattern of status changes from a test execution history and calculates the test's flakiness.
  • Image locator for web and mobile app tests (Katalon Studio): Finds UI elements based on their visual appearance instead of relying on object attributes.
  • Web service anomalies detection (TestOps): Identifies APIs with abnormal performance.

A forerunner in the AI testing landscape, Katalon continuously enriches its product portfolio with these innovative AI-infused capabilities, empowering global QA teams with unprecedented accuracy and operational efficiency. Learn more about Katalon AI testing here.

