
What We Learned Stress-Testing Katalon Internally (The Good, the Pain, and the Breakthroughs)

Discover what happened when Katalon's team stress-tested its own platform. Uncover insights, improvements, and a stronger solution for testers.

Smart Summary

By committing to an exclusive and rigorous internal testing regimen with Katalon, our team unearthed critical areas for platform enhancement, even under significant pressure. This process led to substantial upgrades in mobile testing, reporting, project flexibility, and AI validation, directly addressing real-world quality engineering challenges and resulting in a more robust solution for all testers.

  • Unified Mobile Testing: Katalon Studio facilitated the reuse of test objects and scripts across both iOS and Android, eliminating the need for separate assets and significantly streamlining mobile regression testing efforts.
  • Evolved Reporting Capabilities: Workarounds for manual test case tracking in earlier TestOps versions highlighted essential missing functionalities, prompting a fundamental re-architecture to support the entire software testing lifecycle and integrate manual test management.
  • Flexible Project Structures: Initial rigid team-based project structures in TestOps, when challenged by organizational changes, underscored the necessity for adaptable portfolio structures and reporting views to accommodate diverse organizational needs.
  • Advanced AI Validation: The inability to validate non-deterministic AI outputs within Studio demonstrated the urgent need for tolerance-based validations, benchmark frameworks, and specialized AI testing modes to effectively test modern LLM-based systems.

By Richie Yu, Senior Solutions Strategist

When we committed to using Katalon for all our internal testing, we didn’t do it because it was perfect.

We did it because it wasn’t.

We wanted to see what would happen if we pushed the platform to its limits—used it not just in ideal conditions, but in the messy, nuanced, high-urgency workflows of a real QE team. No backup tools. No shortcuts. Just us, our product, and the problems we hadn’t yet solved.

And what happened? We broke things. We hacked around limitations. We stumbled into capabilities we hadn’t fully appreciated. And most importantly—we learned.

Here’s what going “all in” on Katalon revealed.

Why We Did It: Pressure Reveals Product Truth

In theory, we could have kept using a mix of best-in-breed tools to plug gaps. But that wouldn’t have taught us much.

So we made a hard rule: Use only Katalon. Even if it meant frustration. Especially if it meant frustration.

We purposely said: even if we know there’s a better tool, we’re not using it. Use Katalon or go bust.
Mush Honda
Chief Quality Architect

The goal wasn’t just to adopt the platform; it was to stress-test it. We wanted to surface the blind spots, force decisions, and make the product better through real-world pressure. Then, when our customers got their hands on the platform, they'd have a much better experience.

Discovery #1: Mobile Testing Was More Unified Than We Thought

Our plan for mobile testing was to build test assets once, then reuse them across iOS and Android.

But as we've seen with every other mobile testing tool on the market, that's easier said than done: most tools ultimately require separate test assets for each OS. So we kept our expectations low, given the technical complexity and technology limitations we knew stood in the way of a unified testing experience across iOS and Android.

We were ecstatic to find that Katalon Studio let us reuse the same test objects and scripts across both platforms, with no duplication and no alternate flows required. Just parameterization and good design.

Traditionally, you'd need separate test objects. Studio let us avoid that entirely.
Mush Honda
Chief Quality Architect

This unlocked faster mobile regression and reduced maintenance overhead, especially for testers rotating between app teams.
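
To make the pattern concrete, here's a minimal sketch of what a shared cross-platform script can look like in Katalon Studio's Groovy scripting mode. The object names and the GlobalVariable.appPath profile variable are our own illustration, not the exact assets we ran:

    import static com.kms.katalon.core.testobject.ObjectRepository.findTestObject
    import com.kms.katalon.core.mobile.keyword.MobileBuiltInKeywords as Mobile
    import internal.GlobalVariable

    // One script serves both platforms: the active execution profile supplies
    // GlobalVariable.appPath (an .apk for Android, an .ipa for iOS), so the
    // same flow runs on both OSes without alternate branches.
    Mobile.startApplication(GlobalVariable.appPath, true)

    // Shared test objects ('Login/txt_Username' and friends are hypothetical
    // names) resolve on both platforms when their locators use attributes
    // both OSes expose, such as accessibility ids, rather than
    // platform-specific XPaths.
    Mobile.setText(findTestObject('Login/txt_Username'), 'qa-user', 10)
    Mobile.tap(findTestObject('Login/btn_SignIn'), 10)
    Mobile.verifyElementExist(findTestObject('Home/lbl_Welcome'), 10)

    Mobile.closeApplication()

The design choice doing the heavy lifting is the locator strategy: build test objects on attributes both platforms share, and the script never needs to branch.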

Discovery #2: Reporting Pain Drove Platform Growth

When earlier versions of TestOps didn’t support manual test case tracking, our teams got creative. We embedded manual tests directly into automation repositories just to get full-cycle reporting.

It was messy. But it worked. And it highlighted exactly what the platform was missing.

That workaround became a catalyst for re-architecting TestOps to support the entire software testing lifecycle. Manual test management was prioritized and built into the next version of TestOps because real users showed it was essential.

We hacked reporting by bundling manual test cases into automation folders. The next version of TestOps fixed that.
Mush Honda
Chief Quality Architect
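
For illustration, the hack had roughly this shape: a placeholder script checked into the automation repository to stand in for a manual test case, so its result lands in the same execution reports. The keyword usage below is a plausible reconstruction, not the exact scripts we wrote:

    import com.kms.katalon.core.util.KeywordUtil

    // Placeholder "manual" test case living in the automation repository.
    // Executing it records an entry in the same reports as automated tests,
    // which gave us full-cycle reporting before TestOps supported manual
    // test management natively.
    KeywordUtil.logInfo('MANUAL: Verify the invoice PDF renders correctly on a phone')
    KeywordUtil.markWarning('Manual step: actual result recorded by a tester outside this run')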

Discovery #3: Org Changes Shouldn’t Break Your Tool

Early in our adoption, we structured TestOps projects around teams. Then engineering restructured—breaking the alignment between our test assets and our product areas.

We rebuilt everything. But that pain led to something better.

The next version of TestOps introduced portfolio structures and more flexible reporting views, so teams could nest by product, feature, or responsibility. That flexibility wouldn’t have emerged without real-world pain and disruption.

Discovery #4: You Can’t Fake AI Testing Readiness

When we tried to test LLM-based systems—like StudioAssist or customer-facing AI tools—we hit a wall. Studio didn’t yet support validation of non-deterministic outputs. We had to fall back on manual judgment and external libraries to evaluate quality.

This wasn’t just an edge case—it’s the future of testing.

We were manually testing AI responses because Studio didn't yet handle the inherent variability of AI systems well.
Mush Honda
Chief Quality Architect

That insight triggered research spikes into building testing capabilities that enable tolerance-based validations, benchmark frameworks, and AI-specific testing modes. More importantly, we want to build those capabilities in a way that software testing teams can embrace and adopt at their current level of maturity, without having to completely reskill themselves to realize the benefits for testing AI systems. That's an enormous challenge in the industry, and we're taking it head-on.
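
To show what tolerance-based validation means in practice, here's a plain-Groovy sketch. The coverage threshold, length envelope, and helper name are illustrative assumptions, not a shipped Katalon API:

    // Tolerance-based check for a non-deterministic LLM answer: rather than
    // an exact-match assertion, require the response to cover a minimum share
    // of expected facts and stay within a length envelope.
    def validateLlmResponse(String response, List<String> expectedFacts,
                            double minCoverage = 0.8, int maxLength = 1200) {
        String normalized = response.toLowerCase()
        int hits = expectedFacts.count { normalized.contains(it.toLowerCase()) }
        double coverage = hits / (double) expectedFacts.size()
        assert coverage >= minCoverage : "only ${hits}/${expectedFacts.size()} expected facts present"
        assert response.length() <= maxLength : "response exceeds the length envelope"
        return coverage
    }

    // The same prompt may be worded differently on every run, but any
    // acceptable answer should still mention these facts.
    validateLlmResponse(
        'Katalon Studio supports iOS and Android with shared test objects.',
        ['ios', 'android', 'test objects'])

The point is the shape of the assertion: instead of demanding one exact string, the test defines an acceptance envelope that any valid, differently worded response can satisfy.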

What We’d Tell Other Teams

If you’re considering an “all-in” approach on a platform—whether Katalon or anything else—here’s what we learned:

  • Constraints drive clarity. Removing your safety nets will expose what’s working and what’s not. Fast.
  • Your hacks are your roadmap. The workarounds your team invents are the best source of future features.
  • Push past what’s documented. Some of the most powerful capabilities only reveal themselves under pressure.
  • Stress + structure = scale. Building under real tension creates a more resilient platform—not just for you, but for everyone who follows.

Final Word: Dogfooding Is Only Real If It Hurts

It’s easy to “dogfood” your product when everything’s stable and your use cases are clean. But that doesn’t teach you anything new.

We chose to break things. And in doing so, we helped build something better—not just for us, but for every team that now relies on Katalon.

Because a platform that can survive your hardest problems is one you can confidently offer to solve someone else’s.


FAQs

Why did Katalon choose to test its own products exclusively with Katalon?


The team wanted to uncover platform limitations, learn from real-world pressure, and strengthen the product by using it under demanding internal workflows.

What did the internal team discover about Katalon’s mobile testing capabilities?


They found that Katalon allowed reuse of the same test objects and scripts across iOS and Android without duplication or alternate flows, improving regression speed and maintenance.

How did reporting limitations lead to improvements in TestOps?


Early gaps forced teams to embed manual tests into automation repositories, revealing missing capabilities and driving a full re-architecture to support comprehensive test lifecycle management.

What organizational challenge influenced TestOps project structure updates?


A team reorganization exposed the rigidity of early project structures, prompting the introduction of more flexible portfolio and reporting views.

What testing challenge emerged when validating AI-driven systems?


Non-deterministic AI outputs highlighted the need for tolerance-based validations, benchmark frameworks, and dedicated AI testing capabilities beyond traditional deterministic checks.

Richie Yu
Richie Yu
Senior Solutions Strategist
Richie is a seasoned technology executive specializing in building and optimizing high-performing Quality Engineering organizations. With two decades leading complex IT transformations, including senior leadership roles managing large-scale QE organizations at major Canadian financial institutions like RBC and CIBC, he brings extensive hands-on experience.