QA & Testing: Three-Layer Strategy

This page documents how we systematically verify that every page of the Bits webapp renders, every button has a working handler, and no critical user flow is broken. The strategy is layered so each layer catches a different class of bug at a different cost-per-test.

| Layer | What it catches | Effort | Run time | Source of truth |
| --- | --- | --- | --- | --- |
| A: Page-render smoke | Dead routes, missing imports, broken role guards, console errors on mount | Auto-generated | ~20 s for ~70 pages | `webapp/e2e/smoke-pages.spec.ts` |
| B: Click-sweep | Click handlers that throw, dialogs that render blank, broken event wiring, undefined-prop crashes | Auto-generated | ~30 s for ~600 click attempts | `webapp/e2e/click-sweep.spec.ts` |
| C: Critical-path scenarios | Multi-step business flows: onboarding, donations, ticket purchase, moderation decisions | Hand-written | varies | `webapp/e2e/<feature>.spec.ts` (one per feature) |

Layers A + B are safety nets: they are cheap enough to run on every PR, cover every route and testid that can be exercised without fixtures, and fail fast when a page is broken or a button stops working. Layer C verifies that the business logic on top still does the right thing.

1. The feature inventory

Layers A and B are generated mechanically from a single artifact: the feature inventory.

- Path: documentation/feature-inventory.md
- Source: every data-testid literal in webapp/src/**/*.{ts,tsx}
- Coverage: 137 source files and 1,257 unique testids, each traceable to a file:line in the source tree
- Verification: checked in both directions. Every testid that an existing Playwright spec references is in the inventory, and every inventory row traces back to a real source line.
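The two-way check described above boils down to two set differences. The sketch below is illustrative only: the function name and the three input sets are hypothetical, and it does not claim to reproduce how the real verification is implemented.

```python
def bidirectional_gaps(
    inventory_ids: set[str],
    spec_ids: set[str],
    source_ids: set[str],
) -> dict[str, list[str]]:
    """Report testids that break either direction of the inventory check.

    inventory_ids: testids listed in the inventory (hypothetical input)
    spec_ids:      testids referenced by existing Playwright specs
    source_ids:    testids actually found in the source tree
    """
    return {
        # Direction 1: a spec references a testid the inventory does not list.
        "referenced_but_missing": sorted(spec_ids - inventory_ids),
        # Direction 2: an inventory row has no matching source line.
        "inventory_without_source": sorted(inventory_ids - source_ids),
    }
```

Both lists being empty corresponds to the "bidirectionally checked" state; anything else points at a stale inventory or a stale spec.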

The inventory is the authoritative input set for test design. The codebase enforces (per webapp/CLAUDE.md) that every interactive element gets a stable data-testid, so the inventory is by construction a complete catalog of user-actionable features.

Patterns the inventory captures

| Kind | Source pattern | Example |
| --- | --- | --- |
| `static` | `data-testid="literal"` | `<Button data-testid="feed-create-post">` |
| `template` | `` data-testid={`prefix-${var}`} `` | `` <Card data-testid={`feed-post-card-${postId}`}> `` (rendered as `feed-post-card-{postId}`) |
| `slotprop` | MUI v9 slot props | `<TextField slotProps={{ htmlInput: { 'data-testid': 'login-identifier' } }}>` |
| `drilled` | A `testId` prop drilled into a child wrapper | `<DataGridRow testId="user-row-123">` |
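To make the four kinds concrete, here is a minimal regex-based sketch that recognises each pattern in a source string. It is not the contents of scripts/extract_testids.py; the regexes, the `PATTERNS` table, and the `extract_testids` function are assumptions written only to match the example rows above, and a real extractor would also need to record file and line.

```python
import re

# One hypothetical regex per inventory kind, matching the table's examples.
PATTERNS = {
    "static": re.compile(r'data-testid="([^"]+)"'),
    "template": re.compile(r"data-testid=\{`([^`]+)`\}"),
    "slotprop": re.compile(r"'data-testid'\s*:\s*'([^']+)'"),
    "drilled": re.compile(r'\btestId="([^"]+)"'),
}


def extract_testids(source: str) -> list[tuple[str, str]]:
    """Return (kind, testid) pairs found in a chunk of TSX source."""
    found = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(source):
            testid = match.group(1)
            if kind == "template":
                # Record `${postId}` as `{postId}`, as in the inventory example.
                testid = re.sub(r"\$\{(\w+)\}", r"{\1}", testid)
            found.append((kind, testid))
    return found
```

Results are grouped by kind rather than by position in the file, which is enough for illustration but would need sorting for a deterministic inventory.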

Regenerating the inventory

When you add new pages or testids, refresh the inventory:

python3 scripts/extract_testids.py > documentation/feature-inventory.md

The extractor is a deterministic Python script — same input, same output. Re-run it before opening a PR if the diff would otherwise leave the inventory stale.

2. Layer A: page-render smoke

File: webapp/e2e/smoke-pages.spec.ts (auto-generated, do not edit by hand)

For each route in webapp/src/App.tsx the spec emits one test that:

1. Resolves a user with the right role. Reads the route’s <AuthGuard allowRoles={…}> declaration to pick admin / alumni / student / faculty / staff / parent / public.
2. Logs in via the UI (/auth/login → login-identifier → login-password → login-submit).
3. Navigates to the route.
4. Asserts the page-root testid is visible (e.g. feed-root, admin-users-root).
5. Asserts no fatal console errors fired during the mount. Benign React dev warnings, 404 / 401 probes, and Failed to load resource are tolerated; everything else fails the test.
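The last step is essentially an allow-list filter over captured console messages. The sketch below shows that idea in Python for illustration; the generated spec itself is TypeScript, and the `TOLERATED` patterns and `fatal_console_errors` name are assumptions, not the list that ships in smoke-pages.spec.ts.

```python
import re

# Assumed tolerated patterns, one per category named in the step above.
TOLERATED = [
    re.compile(r"^Warning: "),              # React dev warnings (assumed prefix)
    re.compile(r"Failed to load resource"),  # browser resource-load noise
    re.compile(r"\b(401|404)\b"),            # auth / not-found probes
]


def fatal_console_errors(messages: list[str]) -> list[str]:
    """Return the console messages that match no tolerated pattern."""
    return [
        message
        for message in messages
        if not any(pattern.search(message) for pattern in TOLERATED)
    ]
```

A smoke test built this way would fail whenever the returned list is non-empty, and tolerating a new benign message means adding one pattern to the list.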

Skipped routes

Some routes can’t be smoked against the dev preview, either because they need fixtures it doesn’t provide (live IDs, OAuth callbacks) or because they are public marketing pages with no page-root testid to assert on. They’re listed in a comment block at the bottom of smoke-pages.spec.ts with a one-line reason each. Examples:

/auth/onboarding              — multi-step wizard; needs Layer C scenario
/me/credentials/callback      — OAuth callback; needs upstream provider token
/admin/events/:id/scan        — needs a live event id
/donate/:slug                 — needs a live campaign slug

These are addressed by hand-written Layer C specs.

3. Layer B: click-sweep

File: webapp/e2e/click-sweep.spec.ts (auto-generated)

For each route, the sweep clicks every “safe” testid on the page one at a time. After every click it:

1. Presses Escape to dismiss any dialog/menu the click opened.
2. Checks the URL; if it changed (CTA buttons, links to detail pages), navigates back to the original route.
3. Re-asserts the page-root testid is still visible.
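The click-then-recover cycle can be sketched as a single function. This is a Python illustration, not the generated TypeScript: `click_and_recover` and its parameters are hypothetical, and `page` is duck-typed against the Playwright-style calls `get_by_test_id`, `keyboard.press`, `url`, and `goto`, so nothing is imported.

```python
def click_and_recover(page, testid: str, root_testid: str, route_url: str) -> bool:
    """Click one testid, restore the page, and report whether the root survived.

    `page` is any object exposing the Playwright-style members used below.
    Returns True when the page-root testid is still visible afterwards.
    """
    page.get_by_test_id(testid).click()

    # Step 1: dismiss any dialog or menu the click opened.
    page.keyboard.press("Escape")

    # Step 2: if the click navigated away, return to the route under test.
    if page.url != route_url:
        page.goto(route_url)

    # Step 3: the page root must still be there.
    return page.get_by_test_id(root_testid).is_visible()
```

A sweep would call this once per safe testid and fail the route's test on the first False.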

This catches:

- Click handlers that throw: uncaught exceptions kill the page and the root testid disappears.
- Dialogs that render blank: opening a dialog whose content errors out.
- Broken event wiring: onClick pointing at a deleted prop or an undefined function.
- Missing imports: lazy-loaded sub-components that no longer exist.

What the sweep deliberately skips

Some testids match patterns the sweep considers unsafe to click without fixtures. They’re filtered by name:

| Pattern | Why skipped |
| --- | --- |
| `*-submit`, `*-confirm`, `*-finalize`, `*-finalise` | Form submits: need real form data |
| `*-delete`, `*-remove`, `*-revoke`, `*-suspend`, `*-block` | Destructive actions |
| `*-publish`, `*-cancel`, `*-leave` | State transitions on real entities |
| `nav-*`, `*-link`, `*-back` | Navigate-away: would need full back-stack handling |
| `*-input`, `*-textbox`, `*-search`, `*-q` | Inputs need typed text, not clicks |
| `*-pay`, `*-charge`, `*-purchase`, `*-checkout`, `*-redeem` | Real-money / inventory actions |
| `*-send` (message send) | Mutates real conversations |
| Templated testids (`feed-post-card-{postId}`) | Need a real id we don’t have |
| `chatbot-launcher` | Hijacks the page with a drawer |

These are exactly the testids that need a Layer C scenario — they’re left out of the sweep because clicking them blindly is meaningless without a real workflow.
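A name-based filter of this kind can be sketched in a few lines. The version below assumes the patterns are shell-style globs matched against the whole testid (suffixes such as -submit, the nav- prefix) and that a templated testid is recognisable by its `{placeholder}`; `SKIP_GLOBS` and `is_safe_to_click` are hypothetical names, and the generated spec may match differently.

```python
from fnmatch import fnmatchcase

# Assumed glob forms of the skip patterns listed in the table above.
SKIP_GLOBS = [
    "*-submit", "*-confirm", "*-finalize", "*-finalise",
    "*-delete", "*-remove", "*-revoke", "*-suspend", "*-block",
    "*-publish", "*-cancel", "*-leave",
    "nav-*", "*-link", "*-back",
    "*-input", "*-textbox", "*-search", "*-q",
    "*-pay", "*-charge", "*-purchase", "*-checkout", "*-redeem",
    "*-send",
    "chatbot-launcher",
]


def is_safe_to_click(testid: str) -> bool:
    """Return True when the sweep may click this testid without fixtures."""
    # Templated ids such as feed-post-card-{postId} need a real id.
    if "{" in testid:
        return False
    return not any(fnmatchcase(testid, glob) for glob in SKIP_GLOBS)
```

Everything this filter rejects is a candidate for a hand-written Layer C scenario.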

4. Layer C: critical-path scenarios

Files: every webapp/e2e/<feature>.spec.ts that wasn’t auto-generated.

These are hand-written specs that verify the multi-step business flows. Examples already in the codebase:

- auth-new-user.spec.ts: onboarding wizard end-to-end, including OTP delivery via Mailpit and the privacy step.
- chatbot.spec.ts: open the drawer, submit a prompt, click a citation card to navigate.
- donations.spec.ts: create a campaign, donate, verify 80G receipt.
- moderation-dashboards.spec.ts: load a moderation queue, record a decision with policy citation, check audit row.
- ai-assistants.spec.ts: admissions citation flow, analytics justification gate, moderation triage Apply.

When a feature lands, the team adds a Layer C spec for its happy path and at least one sad path. Layers A and B need no hand edits: re-running the generator picks up the new page’s testids.

5. Running the suite

Prerequisites

- Dev backend on :16222 (make preview)
- Dev webapp on :13222 (served by the same make preview)
- Demo data seeded (make seed-dev), which provides admin@bits-pilani.ac.in / demo123 and friends used by the auto-generated layers
- Playwright browsers installed (npx playwright install chromium)

Run Layer A + B together against the dev preview

# PLAYWRIGHT_BROWSERS_PATH is machine-specific; point it at your own browser cache.
cd webapp && \
  PLAYWRIGHT_BROWSERS_PATH="$HOME/.cache/ms-playwright" \
  BITS_WEBAPP_URL=http://127.0.0.1:13222 \
  VITE_API_BASE=http://127.0.0.1:16222 \
  BITS_E2E_USE_DEMO=1 \
  npx playwright test smoke-pages.spec.ts click-sweep.spec.ts \
    --project=chromium --reporter=list

Expected: 138 tests, all passing, ~50 seconds wall-clock.

The BITS_E2E_USE_DEMO=1 flag tells the auto-generated specs to use the demo accounts seeded by make seed-dev instead of seeding fresh users. This is the right mode for the dev preview, which doesn’t run Mailpit.

Run against the test infrastructure

The full integration pipeline (used by CI) uses make test-e2e, which brings up its own ephemeral Postgres + Mailpit + backend on test ports and seeds fresh users via the onboarding API. Omit BITS_E2E_USE_DEMO=1 and the specs fall back to the seed-fresh-user path automatically.

make test-e2e

Run a single layer

# Just smoke
npx playwright test smoke-pages.spec.ts --project=chromium

# Just sweep
npx playwright test click-sweep.spec.ts --project=chromium

# A specific page from either
npx playwright test --grep "/admin/users renders"

6. Regenerating the auto-generated specs

When a page’s testids change (you add a button, rename a tab, etc.), the generated specs need to be refreshed:

python3 scripts/generate_smoke_specs.py

This rewrites both smoke-pages.spec.ts and click-sweep.spec.ts from the current state of webapp/src/App.tsx and documentation/feature-inventory.md. Commit the regenerated files alongside the source change so the tests stay in sync.

The generator is idempotent — running it twice produces the same output.

7. Maintenance + signal triage

When a smoke test fails

The page failed to render at all, or it rendered with a console error. Read the failure context (Playwright writes test-results/<test>/error-context.md with a DOM snapshot). The usual causes are:

- A missing role guard / wrong role mapping in App.tsx
- A backend RPC the page calls hard-failed (check make preview logs)
- A new console error: either fix the source or add the pattern to the tolerated list in smoke-pages.spec.ts

When a sweep test fails

The sweep crashed mid-page. The error context shows which testid was being clicked. The usual causes are:

- A button whose handler now throws: fix the handler.
- A dialog that opens on click but whose content throws: check the dialog’s child component.
- A new CTA button that navigates to a broken destination. The URL-recovery logic handles the navigation itself, but if the destination route fails to load, the sweep fails there.

When a new page is added

1. Add the page + its testids in webapp/src/.
2. Regenerate the inventory: python3 scripts/extract_testids.py > documentation/feature-inventory.md
3. Regenerate the specs: python3 scripts/generate_smoke_specs.py
4. Run the new tests locally to confirm they pass.
5. Commit all four (source, inventory, both specs).

8. Coverage reference

Numbers from the most recent run against make preview with seeded demo data.

| Metric | Count |
| --- | --- |
| Routes in `App.tsx` | 99 |
| Routes covered by Layer A | 72 |
| Routes covered by Layer B | 66 |
| Routes deferred to Layer C | 27 (live-id / OAuth / public marketing pages) |
| Total Layer A + B tests | 138 |
| Wall-clock time, both layers | ~47 s |
| Unique testids in inventory | 1,257 |
| Source files exercised | 137 |

If a future PR pushes any of the coverage counts down without explanation, the generator missed something; investigate before merging.
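The counts in the table are internally consistent: 72 routes in Layer A plus 27 deferred equals the 99 routes in App.tsx, and 72 + 66 equals the 138 tests. The second identity assumes each layer emits one test per covered route, which the totals suggest but this page does not state outright. A small sketch of that sanity check, with hypothetical key names:

```python
def coverage_inconsistencies(metrics: dict[str, int]) -> list[str]:
    """Return human-readable problems found in a set of coverage counts."""
    problems = []
    # Every route is either smoked by Layer A or deferred to Layer C.
    if metrics["layer_a_routes"] + metrics["deferred_routes"] != metrics["routes_total"]:
        problems.append("Layer A routes + deferred routes != routes in App.tsx")
    # Assumption: one generated test per covered route in each layer.
    if metrics["layer_a_routes"] + metrics["layer_b_routes"] != metrics["total_tests"]:
        problems.append("Layer A routes + Layer B routes != total A + B tests")
    return problems


# Counts copied from the coverage table above.
current = {
    "routes_total": 99,
    "layer_a_routes": 72,
    "layer_b_routes": 66,
    "deferred_routes": 27,
    "total_tests": 138,
}
```

Calling `coverage_inconsistencies(current)` returns an empty list for the numbers above; a non-empty result after regeneration is a prompt to look at what the generator dropped.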

9. Generated artifacts at a glance

| File | Generator | Run when |
| --- | --- | --- |
| `documentation/feature-inventory.md` | `scripts/extract_testids.py` | Source testids change |
| `webapp/e2e/smoke-pages.spec.ts` | `scripts/generate_smoke_specs.py` | Routes or testids change |
| `webapp/e2e/click-sweep.spec.ts` | `scripts/generate_smoke_specs.py` | Routes or testids change |

All three are checked into the repository so they’re visible in code review and reproducible from a fresh checkout. Edit them by hand only if you intend to fork — the generator will overwrite manual edits on the next run.

10. Where this strategy doesn’t help

- Visual regressions: pixel-level layout drift, font swaps, theme changes. Use the screenshot specs in webapp/e2e/screenshots.spec.ts for that, refreshed manually when the design intentionally changes.
- Cross-browser bugs: Layers A + B run on chromium by default. The full suite (make test-e2e) runs on chromium + firefox; webkit runs only when the webkit Playwright project is selected explicitly.
- Race conditions / flakiness under load: generated tests run a single user. Concurrency bugs surface in the k6 scripts under perf/ and in the full integration suite, not here.
- Accessibility: a separate axe-core sweep is on the roadmap but not yet wired into the layered strategy.