The Importance of "Test Efficacy" in Visual Testing


Visual Testing Is More Than Meets the Eye

Humans are visual, both on and off-screen. Research shows that 94% of online visitors will form their first opinion of your business based on the design of your website. For e-commerce, 88% of shoppers are less likely to return to a website that gave them an unpleasant experience.

These two simple statistics highlight how detrimental it can be to have an inconsistent product UI/UX out in the market. From a misplaced check-out button to overlapping text, customers can easily become frustrated by simple errors that crop up during regular code changes or when browsers change how they interpret UI frameworks.

Digital businesses are most at risk of losing customers and profits, one visual bug at a time. Automated visual testing is a critical step in improving and maturing your QA processes to avoid these mishaps. Yet traditional visual testing creates a challenge of its own: because traditional “pixel-based” visual comparisons are so sensitive, the number of initial “failed” test results rises sharply. Tracking “test flakiness” alone does not capture this problem, because flakiness is simply a measure of a test's overall pass/fail count. It doesn’t take into account how often a pass should have been a fail and vice versa.

Test Efficacy vs. Test Flakiness 

Efficacy is a term often used in clinical trials to describe test performance and reliability. It is much more than simply the test passing or failing. It is the confidence that the test passes when it should pass and fails only when it truly should. This is not usually an issue in traditional automated testing. However, when you implement traditional pixel-comparison visual testing and start seeing the number of failed tests go up because of minor and inconsequential changes, QA teams and product leads get concerned. When tests become unreliable because their results are not the desired ones, test efficacy is dropping.

A flaky test is a test that sometimes passes and sometimes fails without any change to the code under test. You don’t know if the outcome is good or bad because the inconsistency negates the result. Flaky tests are both annoying and costly since they often require engineers to manually execute tests or, worse, trigger entire test suite runs to recreate the issue. But the actual cost of test flakiness is a lack of confidence in your tests.

Flaky tests will significantly impact your ability to confidently and continuously deliver quality software.


  • Test coverage and efficiency measure how well testing resources are used. 
  • Test flakiness measures how consistently a test passes.
  • Test efficacy measures how consistently a test result (whether pass or fail) is the desired result.
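The difference between the last two measures can be made concrete with a small calculation. The sketch below is only an illustration: the `Run` record, the `should_pass` label (assumed to come from a human reviewer), and the two formulas are assumptions made for this example, not an established standard.

```python
from dataclasses import dataclass


@dataclass
class Run:
    passed: bool       # what the automated test reported
    should_pass: bool  # the desired result (assumed label from a human review)


def pass_rate(runs):
    """Share of runs that passed, whether or not they should have."""
    return sum(r.passed for r in runs) / len(runs)


def efficacy(runs):
    """Share of runs whose reported result matched the desired result."""
    return sum(r.passed == r.should_pass for r in runs) / len(runs)


# Ten hypothetical runs of one visual test.
runs = (
    [Run(passed=True, should_pass=True)] * 4     # correct passes
    + [Run(passed=False, should_pass=True)] * 3  # failed on inconsequential changes
    + [Run(passed=False, should_pass=False)] * 2 # correct failures (real bugs)
    + [Run(passed=True, should_pass=False)] * 1  # missed bug
)

print(f"pass rate: {pass_rate(runs):.0%}")
print(f"efficacy:  {efficacy(runs):.0%}")
```

In this made-up data set the pass rate and the efficacy differ because the pass rate ignores whether each result was the right one. A test that fails every time a real bug is present would have a low pass rate but a high efficacy.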

Why Is Test Efficacy Important in Visual Testing?

As mentioned earlier, traditional pixel-based visual testing often increases the number of failed tests. This, in turn, increases the amount of time spent manually reviewing the failed test results. For example, a manual review will often reveal that the failed test was caused by an inconsequential change, such as a button moving a few pixels. Each time this happens, the test efficacy of the automated test drops.
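To see why a strict pixel comparison reacts to such a small move, consider the toy example below. It is a minimal sketch under stated assumptions: the "screenshots" are tiny grids of 0s and 1s built by a hypothetical `draw_button` helper, and the strict rule (any differing pixel means failure) stands in for a traditional pixel-based comparison rather than reproducing any particular tool.

```python
def draw_button(width, height, left, top, btn_w, btn_h):
    """Return a grid of 0s (background) with a rectangle of 1s (the button)."""
    return [
        [1 if left <= x < left + btn_w and top <= y < top + btn_h else 0
         for x in range(width)]
        for y in range(height)
    ]


def differing_pixels(a, b):
    """Count positions where two equally sized grids disagree."""
    return sum(pa != pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))


# Hypothetical 40x20 "screenshots": the button moves 2 pixels to the right.
baseline = draw_button(40, 20, left=10, top=5, btn_w=12, btn_h=6)
checkpoint = draw_button(40, 20, left=12, top=5, btn_w=12, btn_h=6)

changed = differing_pixels(baseline, checkpoint)
strict_result = "pass" if changed == 0 else "fail"

print(f"{changed} of {40 * 20} pixels differ -> {strict_result}")
```

Only a small share of the pixels change, and a human reviewer would likely call the two images equivalent, yet the strict comparison reports a failure that someone then has to review.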

Having visual testing capabilities that consistently lead to desired results, i.e., high test efficacy, lowers the manual effort required to validate failed visual tests. 

How AI Increases Test Efficacy in Visual Testing

Visual testing involves the look and feel of an application. Many QA teams do not rely on automation for visual testing. Instead, they manually conduct UI regression tests and commit hundreds or thousands of resource hours per month to validate applications visually. Just like functional regression tests, UI regression tests need to be automated to scale quality processes properly. Companies will benefit immensely from automating these visual tests, both in testing capacity and in quality of life for the testers.

But as the challenges discussed earlier imply, how do you implement automated visual testing without sending test flakiness skyrocketing and hurting your overall test efficacy? How can an organization trust tests that often fail when they shouldn’t? This is where the power of AI comes in.

AI offers the ability to add an automated validation layer to visual tests. It improves the visual testing process by replacing the manual review step with an AI-powered automated visual review that filters out false positives and maintains test efficacy. For example, when a visual test goes through the AI validation and still fails, the trust that this failure is valid increases. Testers avoid wasting time reviewing false failures and possibly opening bug tickets for them.
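The idea of a second validation layer can be sketched as a two-stage check. Everything here is an assumption made for illustration: the `Box` geometry, the `final_verdict` function, and especially `layout_shift_is_significant`, which is a fixed tolerance rule standing in for an automated reviewer. It is not an AI model and does not represent how any specific product works.

```python
from typing import Callable, Dict

Box = Dict[str, int]  # hypothetical element geometry: left, top, width, height


def layout_shift_is_significant(old: Box, new: Box, tolerance_px: int = 3) -> bool:
    """Stand-in for the automated reviewer: a fixed rule, NOT an AI model."""
    return any(abs(old[k] - new[k]) > tolerance_px
               for k in ("left", "top", "width", "height"))


def final_verdict(pixels_differ: bool, reviewer: Callable[[], bool]) -> str:
    """Report a failure only if the raw comparison AND the reviewer flag the change."""
    if not pixels_differ:
        return "pass"
    return "fail" if reviewer() else "pass"


baseline = {"left": 10, "top": 5, "width": 12, "height": 6}
nudged = {"left": 12, "top": 5, "width": 12, "height": 6}     # moved 2 px
displaced = {"left": 30, "top": 5, "width": 12, "height": 6}  # moved 20 px

# Both screenshots differ from the baseline at the pixel level.
nudged_verdict = final_verdict(True, lambda: layout_shift_is_significant(baseline, nudged))
displaced_verdict = final_verdict(True, lambda: layout_shift_is_significant(baseline, displaced))

print("nudged:", nudged_verdict)
print("displaced:", displaced_verdict)
```

With this arrangement, a failure that survives the second stage is one the reviewer has already judged to matter, which is why such failures deserve more trust than raw pixel mismatches.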

Are you looking for a tool for visual testing that helps minimize visual test flakiness and improve visual testing efficacy?

Try Katalon’s AI Visual Testing Trial with the following two options: content-based or layout-based visual testing. With AI-enabled analysis, only valid variations of UI are identified, reducing duplicate effort and troubleshooting hours significantly.

