WORK STREAM
Infrastructure
Cross-cutting improvements that support all three work streams. None are user-facing, but every initiative depends on them: provider abstraction, safer deployments, and the observability to know when something breaks.
Stripe placed behind Go interfaces. Direct Stripe API calls replaced with abstracted boundaries. Reduced coupling across the test suite. Foundation for future provider flexibility.
Argo Rollouts for both subs-api and billing-webhooks. Graduated rollout with automatic rollback on error rate spikes. No more all-or-nothing deploys to the payment critical path.
Prometheus metrics, subs_dunning table, Grafana dashboards. Together they answer who entered bad debt, why, and on which products. Account-state drift reduced from 13,000 to 500.
~15 false-positive alerts resolved across both services. Signal-to-noise ratio improved enough that real failures are visible again.
Baseline metrics for the payment lifecycle. Entry/exit counters for bad debt. The instrumentation that makes data-driven decisions possible.
Direct path to self-service debt resolution. The customer receives an email with a link to resolve their bad debt without contacting support.
Automatic subscription cancellation when dunning exhausts retries. Replaces manual support intervention.
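The cancellation rule can be illustrated with a small state transition. This is a sketch under stated assumptions, not the production logic: the Subscription type, the state names, and the retry limit of 4 are all hypothetical, since the source does not say how many retries dunning performs.

```go
package main

import "fmt"

// SubscriptionState and its values are illustrative names only.
type SubscriptionState string

const (
	StateActive    SubscriptionState = "active"
	StatePastDue   SubscriptionState = "past_due"
	StateCancelled SubscriptionState = "cancelled"
)

// Subscription holds just enough data to show the transition.
type Subscription struct {
	ID             string
	State          SubscriptionState
	FailedAttempts int
}

// assumedMaxAttempts is a placeholder; the real retry limit is not
// stated in the source.
const assumedMaxAttempts = 4

// RecordFailedCharge registers one failed retry. Once the number of
// failures reaches maxAttempts the subscription is cancelled
// automatically; a cancelled subscription is returned unchanged.
func RecordFailedCharge(s Subscription, maxAttempts int) Subscription {
	if s.State == StateCancelled {
		return s
	}
	s.FailedAttempts++
	if s.FailedAttempts >= maxAttempts {
		s.State = StateCancelled
	} else {
		s.State = StatePastDue
	}
	return s
}

func main() {
	sub := Subscription{ID: "sub_1", State: StateActive}
	for i := 0; i < assumedMaxAttempts; i++ {
		sub = RecordFailedCharge(sub, assumedMaxAttempts)
		fmt.Println(sub.FailedAttempts, sub.State)
	}
}
```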
31-day advance notification for annual subscribers approaching renewal. Reduces involuntary churn from expired payment methods.
Warning emails before cancellation takes effect. Last-chance window to update payment method.
Infrastructure work is invisible when it's working — and unmissable when it's not. These investments keep the payment critical path safe while the product evolves.