14 test automation rules we wish someone had told us earlier

A test automation suite is a piece of production software. Treat it as a side-project and it will rot inside a year. Treat it as a first-class codebase and it pays back for the lifetime of the product.

These rules come from rebuilding suites for banks, payment gateways, and e-commerce teams in India over the last five years. Most of them are not new. The point is that almost every team breaks at least five of them.

1. Start from the testing pyramid, not from the UI

Mike Cohn drew this in 2009 and it is still the right shape. A suite with thousands of Selenium tests and three unit tests is the single most common anti-pattern we see.

The testing pyramid (Mike Cohn, 2009). A healthy suite has many cheap fast tests and few expensive slow ones.

Push tests as low as they can run. A regex validation for an Aadhaar number does not need a browser. A pricing rule does not need a payment gateway. If the bug can be caught in a unit test, do not write a UI test for it.
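As a rough sketch of what "push it low" looks like, the check below validates only the shape of an Aadhaar number and tests it with no browser involved. The pattern (12 digits, first digit 2 to 9, optional spaces between groups of four) and the names `AADHAAR_RE` and `is_valid_aadhaar_format` are assumptions for illustration; a real validator would also verify the checksum digit.

```python
import re

# Assumed format rule: 12 digits, first digit 2-9, optionally written
# in three space-separated groups of four. No checksum verification here.
AADHAAR_RE = re.compile(r"[2-9]\d{3} ?\d{4} ?\d{4}")


def is_valid_aadhaar_format(value: str) -> bool:
    """Return True if the value has the shape of an Aadhaar number."""
    return AADHAAR_RE.fullmatch(value.strip()) is not None


def test_aadhaar_format():
    assert is_valid_aadhaar_format("2345 6789 0123")
    assert is_valid_aadhaar_format("234567890123")
    assert not is_valid_aadhaar_format("1234 5678 9012")  # starts with 1
    assert not is_valid_aadhaar_format("2345 6789 012")   # only 11 digits
    assert not is_valid_aadhaar_format("2345-6789-0123")  # wrong separator
```

A test like this runs in milliseconds and needs nothing but the function under test.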

2. Automate what is boring and stable, not what is exciting and new

New features change every week. Automating them in week one means you rewrite the test in week three. Wait until the design has settled, then automate.

Boring stable flows are the high-value targets: login, checkout, OTP verification, the password-reset email, the invoice PDF download. These run on every commit forever and need to be green.

3. Every test must run in any order, on any machine, in parallel

If test B depends on test A creating a user, both tests are broken. Each test should set up its own data, do its thing, and clean up. The cost is more setup code. The benefit is that a flaky CI machine running four tests in parallel does not produce a fake red build at 2 a.m.

python
# Bad: depends on order
def test_a_register_user():
    register("user@example.com")

def test_b_login_user():
    login("user@example.com")  # only works if test_a ran first

# Good: each test owns its data
# (db_factory is assumed to be a fixture that creates the user and cleans it up)
def test_login(db_factory):
    user = db_factory.user(email="login-test@example.com")
    response = login(user.email)
    assert response.status_code == 200

4. Locate elements by stable IDs, never by visible text

A copy change from “Continue” to “Next” should not break a test. Add a stable data-testid attribute on every element your tests touch. Treat that attribute like a public API. Once shipped, you do not rename it without warning the QA team.
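One way to keep locators consistent is to build every selector through a single helper, so tests never hand-write CSS. This is a minimal sketch; the helper name `css_for_testid` and the test id `checkout-continue` are hypothetical.

```python
def css_for_testid(test_id: str) -> str:
    """Build a CSS selector that matches an element by its data-testid."""
    if not test_id or '"' in test_id:
        raise ValueError(f"invalid test id: {test_id!r}")
    return f'[data-testid="{test_id}"]'


# With Selenium this would be used roughly as:
#   driver.find_element(By.CSS_SELECTOR, css_for_testid("checkout-continue")).click()
```

Because the selector depends only on the attribute, changing the button label from "Continue" to "Next" leaves the test untouched.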

5. No `sleep()` calls, ever

Every time.sleep(5) is a future flake. Use explicit waits that poll until a condition is true.

python
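import time

from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# `driver` is an already-started WebDriver instance.

# Bad: fixed pause, too long when the page is fast and too short when it is slow
driver.find_element(By.ID, "submit").click()
time.sleep(5)
assert "Welcome" in driver.page_source

# Good: poll for up to 10 seconds and continue as soon as the text appears
driver.find_element(By.ID, "submit").click()
WebDriverWait(driver, 10).until(
    EC.text_to_be_present_in_element((By.ID, "greeting"), "Welcome")
)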
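The same idea applies outside the browser, for example when a test waits for an order status to change in an API. Below is a minimal sketch of a generic polling helper; the name `wait_until` and its defaults are assumptions, not part of Selenium. It does pause briefly between polls, which is also how `WebDriverWait` works internally, but it returns as soon as the condition holds instead of always waiting the full time.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.25):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)  # short pause between polls, not a fixed wait
```

A call might look like `wait_until(lambda: get_order_status(order_id) == "PAID", timeout=30)`, where `get_order_status` is a hypothetical helper in your own suite.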

6. The three-times rule for flakes

A single failure that passes on retry may just be noise: record it and move on. A test that flakes three times in a week is broken and must be either fixed or quarantined the same day. We have never seen flaky tests fix themselves.
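The rule is easy to automate if your CI records when each test failed and then passed on retry. A minimal sketch, assuming you can get those timestamps as a list; the names `FLAKE_LIMIT`, `WINDOW`, and `needs_quarantine` are hypothetical.

```python
from datetime import datetime, timedelta

FLAKE_LIMIT = 3             # the "three times" in the rule
WINDOW = timedelta(days=7)  # "in a week"


def needs_quarantine(flake_times, now):
    """Return True if at least FLAKE_LIMIT flakes fall within the last week."""
    recent = [t for t in flake_times if timedelta(0) <= now - t <= WINDOW]
    return len(recent) >= FLAKE_LIMIT
```

Running a check like this once a day over the retry log gives you the list of tests to fix or quarantine, instead of relying on someone noticing.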

7. Page objects, with a hard ceiling on size

Page Object Model has been the standard for years. The mistake is letting page object classes grow to 500 lines and 40 methods. Cap each page object at around 200 lines. If it grows beyond that, the page itself probably has too many responsibilities and the page object should split too.
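For scale, a small page object can look like the sketch below: a handful of locators and one method per user intent. The class, the `data-testid` values, and the method name are hypothetical; `driver` is expected to behave like a Selenium WebDriver.

```python
class LoginPage:
    """Page object for a hypothetical login screen."""

    EMAIL = '[data-testid="login-email"]'
    PASSWORD = '[data-testid="login-password"]'
    SUBMIT = '[data-testid="login-submit"]'

    def __init__(self, driver):
        self.driver = driver

    def _find(self, selector):
        # "css selector" is the string value behind Selenium's By.CSS_SELECTOR.
        return self.driver.find_element("css selector", selector)

    def log_in(self, email, password):
        """Fill in the form and submit it; one method per user intent."""
        self._find(self.EMAIL).send_keys(email)
        self._find(self.PASSWORD).send_keys(password)
        self._find(self.SUBMIT).click()
```

Tests call `LoginPage(driver).log_in(...)` and never touch selectors directly, so a markup change is fixed in one place.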

8. Run on CI from day one

A suite that only runs on the QA lead's laptop does not exist. Wire it into GitHub Actions, GitLab CI, Jenkins, whatever you have. Run on every push to a feature branch, every merge to main, and on a nightly schedule.

Use the cheapest runner that fits, with parallel sharding. A well-written pytest suite of 400 UI tests should finish in under 12 minutes on 4 parallel shards. Once a run takes longer than 15 minutes, developers start ignoring it.
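Sharding itself is simple: each CI job needs a stable way to decide which tests are its own. The sketch below assigns tests to shards by hashing the test id; the function names are hypothetical, and in practice an existing plugin or your CI's own test-splitting feature may do this for you.

```python
import zlib


def shard_for(test_id: str, total_shards: int) -> int:
    """Map a test id to a shard index in [0, total_shards), stable across runs."""
    # zlib.crc32 is deterministic, unlike the built-in hash() for strings,
    # which is randomized per process.
    return zlib.crc32(test_id.encode("utf-8")) % total_shards


def select_shard(test_ids, shard_index: int, total_shards: int):
    """Return the subset of tests that this shard should run."""
    return [t for t in test_ids if shard_for(t, total_shards) == shard_index]
```

With 400 tests on 4 shards, each shard runs roughly 100 tests, so a 12-minute budget leaves about 7 seconds per test when a shard runs them one after another.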

9. Make results visible, not buried in logs

Plug an Allure or pytest-html report into your CI and post the link as a PR comment on every failed run. A screenshot of the failing page belongs in the report. Without that, a failure means thirty minutes of debugging before you even know what broke.

10. Stop fighting your environments

Tests that work in staging but fail in dev usually have nothing to do with the tests. They have to do with environment drift. Spin up the same docker-compose stack for both, seed it with the same fixtures, and your environments become disposable. Any test that needs a snowflake environment is a problem regardless of how clever the assertions are.

11. Test data is half the battle

Hard-coded usernames lose. Factories win. Build a small data factory that creates fresh users, orders, accounts on demand, with sensible defaults that you can override per test.

python
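# tests/factories.py
from uuid import uuid4

# `User` is the application's own model; import it from wherever it lives.

def make_user(email=None, **overrides):
    """Create a fresh user with a unique email; any default can be overridden."""
    fields = {"full_name": "QA Bot", "kyc_status": "verified", **overrides}
    return User.create(
        email=email or f"qa+{uuid4().hex[:8]}@yoursite.com",
        **fields,
    )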

12. Treat test secrets like production secrets

Hard-coded passwords in a Git repo are a breach waiting to happen. Store every credential in a secret store (GitHub Secrets, GitLab CI variables, AWS Secrets Manager). Test accounts in production must have the minimum required scope, and their credentials should be rotated quarterly.
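On the test side, this usually means reading credentials from environment variables that the CI injects from its secret store. A minimal sketch, with a hypothetical helper name and variable name:

```python
import os


def require_secret(name: str) -> str:
    """Read a credential from the environment, failing fast if it is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"missing required secret: set the {name} environment variable"
        )
    return value


# Example (hypothetical variable name):
#   password = require_secret("QA_CHECKOUT_PASSWORD")
```

Failing with a clear message at startup is easier to debug than a login test that fails later with an empty password.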

13. API tests catch more bugs than UI tests, faster

A typical Indian e-commerce checkout fires 40+ API calls. The UI is mostly a thin client over those APIs. If a coupon code applies the wrong discount, the bug is in the POST /orders/checkout response, not in the Razorpay screen. An API test finds that bug in about 200 ms. A UI test finds it in about 14 seconds.
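A coupon check at the API level can be as short as the sketch below. The payload, the response fields (`subtotal`, `discount`, `total`), and the coupon `SAVE10` are assumptions for illustration; `session` is expected to behave like a `requests.Session`.

```python
def check_coupon_discount(session, base_url):
    """Post a checkout with a coupon and assert the discount in the response."""
    # Hypothetical payload and response fields; adapt to the real API contract.
    payload = {
        "items": [{"sku": "TSHIRT-M", "qty": 2, "unit_price": 500}],
        "coupon": "SAVE10",  # assumed to mean 10% off the subtotal
    }
    response = session.post(f"{base_url}/orders/checkout", json=payload, timeout=5)
    assert response.status_code == 200
    body = response.json()
    assert body["subtotal"] == 1000
    assert body["discount"] == 100
    assert body["total"] == 900
```

No browser, no payment screen, and the assertion points straight at the field that is wrong.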

14. Every production incident gets a test

When something breaks in prod, the fix is not enough. The retrospective also asks: what test would have caught this? Write that test before you close the ticket. Two well-known incidents of the kind a targeted test is meant to prevent:

The NSE India trading halt of 24 February 2021, sparked by a network-link failover bug that the disaster-recovery drill had not rehearsed under the load profile of a live trading day.
BBC iPlayer's Christmas 2016 outage, caused by a combination of feature flags that each worked individually but failed when enabled together.
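Failures that only appear when two flags are on together are cheap to test for once you enumerate the pairs. The sketch below is a generic illustration, not a reconstruction of either incident; `failing_flag_pairs` and the `works_with` callable are hypothetical names.

```python
from itertools import combinations


def failing_flag_pairs(flags, works_with):
    """Return every pair of flags for which `works_with` reports a failure.

    `works_with` is a callable taking a set of enabled flags and returning
    True if the system under test behaves correctly with them enabled.
    """
    return [
        pair
        for pair in combinations(sorted(flags), 2)
        if not works_with(set(pair))
    ]
```

A regression test then asserts that the returned list is empty, and the incident's specific flag pair becomes a permanent case in the suite.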

Closing

These 14 rules are not glamorous. The teams that follow them ship features faster, get paged less, and end up with QA engineers who actually enjoy the work. If you want to learn these rules with your hands on real apps, our Pro Software Testing & Automation program is 31 phases, 130 lessons, free to start.

References

The Practical Test Pyramid (Martin Fowler) · The canonical modern restatement of Mike Cohn's 2009 pyramid.
PageObject (Martin Fowler) · The original write-up of the pattern, still the clearest.
Selenium WebDriver explicit waits · Why time.sleep is wrong, and what to use instead.
pytest documentation · Fixtures, parametrize, plugins.
SEBI report on NSE India trading halt, 24 February 2021 · A real example of how lack of failover testing under realistic load caused a 4-hour outage.