Provider Agnostic Geo Lookup Library
| Status | Effort | Client Cost Saving |
|---|---|---|
| Draft | 20 hours (Phase 1), 16 hours (Phase 2) | TBD |
Summary
Create a Python library that provides a provider-agnostic interface for geographic address lookup services. The library will support failover between providers (manual in Phase 1, automatic in Phase 2), ensuring service continuity during provider outages, and can be deployed across multiple projects and platforms (DRF, Lambda).
Rationale
The Problem
On 2026-02-04, getaddress.io experienced several hours of downtime. This caused a complete failure of the quote service flow because:
- Single Provider Dependency: The geo service currently relies exclusively on getaddress.io with no fallback mechanism
- Cross-Client Impact: This single point of failure affects multiple clients (Adrian Flux, Sterling, Bikesure) and multiple product lines (EPAs, JAF forms), amplifying the business risk
- User Experience Impact: While users could technically enter addresses manually, this option is not immediately obvious in the UI
- Business Impact: No quotes could be processed during the outage across any client or product line, directly affecting revenue
Why This Change is Needed
- Resilience: A single provider outage currently takes down quote functionality across every client and product line. This single point of failure is avoidable
- Existing Code: The codebase is already structured to support multiple providers, but only getaddress.io is implemented and active
- Reusability: Other hut42 projects (Viitata, Forecaster, Katala) require similar geo lookup functionality
- Cost Optimization: Some providers, such as Google, offer free tiers that could cover fallback traffic
Proposed Solution
Python Library
A standalone Python library that:
- Abstracts Provider Implementation: Unified interface regardless of underlying provider
- Supports Multiple Providers:
  - getaddress.io (primary)
  - Google Places API (fallback, potentially free tier)
  - Additional providers as needed
- Manual or Automatic Failover: Configurable provider switching (see phased approach below)
- Framework Agnostic: Works with DRF services and Lambda functions
- Built from Existing Code: Refactored from the current geo service codebase
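The provider abstraction and response normalization described above could look roughly like the following. This is a minimal sketch, not the existing geo service code: the `Address` fields, the `lookup` method name, and the `StaticProvider` stand-in (with its `addr1`/`city`/`pc` keys) are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """Normalized address shape shared by all providers (hypothetical fields)."""
    line_1: str
    town: str
    postcode: str
    line_2: str = ""


class BaseProvider(ABC):
    """Abstract interface that every concrete provider implements."""

    name: str

    @abstractmethod
    def lookup(self, postcode: str) -> list[Address]:
        """Return normalized addresses for the given postcode."""


class StaticProvider(BaseProvider):
    """Illustrative stand-in; a real provider would call an HTTP API here."""

    name = "static"

    def __init__(self, raw_records: list[dict]):
        self._raw = raw_records

    def lookup(self, postcode: str) -> list[Address]:
        # Normalization: map provider-specific keys onto the shared Address shape.
        return [
            Address(line_1=r["addr1"], town=r["city"], postcode=r["pc"])
            for r in self._raw
            if r["pc"] == postcode
        ]
```

Because consumers only ever see `Address` objects, swapping the provider behind `lookup` would not change the shape of the response they receive.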
Phased Approach
Phase 1 - Manual Switchover: Implement multi-provider support with manual provider switching via configuration. Provider health is monitored externally by Uptime Robot, and providers are switched manually during an outage.
Phase 2 - Automatic Failover: Add built-in health checking and automatic failover management to the library, removing the need for manual intervention.
Architecture
Phase 1 - Manual Switchover
┌─────────────────────────────────────────────────────────┐
│ Geo Lookup Library │
├─────────────────────────────────────────────────────────┤
│ GeoLookupClient │
│ ├── Provider Registry │
│ ├── Provider Selector (config-driven) │
│ └── Response Normalizer │
├─────────────────────────────────────────────────────────┤
│ Providers │
│ ├── GetAddressProvider (primary) │
│ ├── GooglePlacesProvider (fallback) │
│ └── BaseProvider (abstract interface) │
└─────────────────────────────────────────────────────────┘
│
▼
┌──────────┐ ┌─────────────────┐
│ Geo Svc │◄───────│ Uptime Robot │
│ │ │ (monitoring) │
└──────────┘ └─────────────────┘
│
▼
┌─────────────────┐
│ Manual config │
│ change to swap │
│ provider │
└─────────────────┘
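The config-driven Provider Selector in the Phase 1 diagram might be sketched as below. The environment variable name `GEO_ACTIVE_PROVIDER`, the registry keys, and the placeholder lookup functions are hypothetical; the real service may hold its configuration elsewhere (for example in Django settings).

```python
import os
from typing import Callable

# Hypothetical registry: config value -> lookup callable.
PROVIDERS: dict[str, Callable[[str], list[str]]] = {}


def register(name: str):
    """Decorator that adds a lookup function to the registry under `name`."""
    def decorator(fn):
        PROVIDERS[name] = fn
        return fn
    return decorator


@register("getaddress")
def getaddress_lookup(postcode: str) -> list[str]:
    # Placeholder body; the real provider would call getaddress.io.
    return [f"getaddress result for {postcode}"]


@register("google_places")
def google_places_lookup(postcode: str) -> list[str]:
    # Placeholder body; the real provider would call the Google Places API.
    return [f"google_places result for {postcode}"]


def select_provider(env=None) -> Callable[[str], list[str]]:
    """Pick the active provider from configuration, defaulting to the primary."""
    env = os.environ if env is None else env
    name = env.get("GEO_ACTIVE_PROVIDER", "getaddress")
    try:
        return PROVIDERS[name]
    except KeyError:
        raise ValueError(
            f"Unknown geo provider {name!r}; expected one of {sorted(PROVIDERS)}"
        ) from None
```

Under this sketch, a manual switchover amounts to setting `GEO_ACTIVE_PROVIDER=google_places` and reloading the service. Failing loudly on an unknown name guards against a typo in the config during an incident.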
Phase 2 - Automatic Failover
┌─────────────────────────────────────────────────────────┐
│ Geo Lookup Library │
├─────────────────────────────────────────────────────────┤
│ GeoLookupClient │
│ ├── Provider Registry │
│ ├── Health Checker ◄── NEW │
│ ├── Failover Manager ◄── NEW │
│ └── Response Normalizer │
├─────────────────────────────────────────────────────────┤
│ Providers │
│ ├── GetAddressProvider (primary) │
│ ├── GooglePlacesProvider (fallback) │
│ └── BaseProvider (abstract interface) │
└─────────────────────────────────────────────────────────┘
│ │
▼ ▼
┌──────────┐ ┌───────────┐
│ Geo Svc │ │ Geo Svc │
│ [Flux] │ │ [Hutsoft] │
└──────────┘ └───────────┘
Key Features
| Feature | Phase | Description |
|---|---|---|
| Provider Abstraction | 1 | Unified API regardless of underlying service |
| Response Normalization | 1 | Consistent response format across all providers |
| Configurable Provider | 1 | Select active provider via configuration |
| External Health Monitoring | 1 | Uptime Robot monitors provider health |
| Manual Switchover | 1 | Change provider via config during outages |
| Metrics & Logging | 1 | Track provider performance and failures |
| Built-in Health Checker | 2 | Periodic health checks on all configured providers |
| Automatic Failover Manager | 2 | Circuit breaker pattern for automatic provider switching |
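To illustrate the circuit breaker pattern named for the Phase 2 Failover Manager, here is a minimal sketch. The class names, the threshold of three failures, and the 30-second cool-down are assumptions chosen for the example, not agreed values.

```python
import time


class CircuitBreaker:
    """Opens after repeated failures, then permits a trial call after a cool-down."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def allow(self) -> bool:
        if self._opened_at is None:
            return True
        # Half-open: allow a trial call once the cool-down has elapsed.
        return self._clock() - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


class FailoverClient:
    """Tries providers in order, skipping any whose breaker is open."""

    def __init__(self, providers, breaker_factory=CircuitBreaker):
        # providers: ordered list of (name, lookup_callable), primary first.
        self._providers = [(name, fn, breaker_factory()) for name, fn in providers]

    def lookup(self, postcode):
        last_error = None
        for name, fn, breaker in self._providers:
            if not breaker.allow():
                continue
            try:
                result = fn(postcode)
            except Exception as exc:
                breaker.record_failure()
                last_error = exc
                continue
            breaker.record_success()
            return name, result
        raise RuntimeError("All geo providers unavailable") from last_error
```

Once the primary's breaker is open, requests go straight to the fallback without waiting on a failing call. Timeouts on the underlying HTTP requests would still need to be configured separately.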
Affected Applications
Primary Integration
- Universal geo service (immediate integration)
Main Consumers (highest traffic)
- Adrian Flux, Sterling and Bikesure EPAs
- Adrian Flux and Sterling JAF forms
Other Potential Consumers
- Viitata
- Forecaster
- Katala
- Any future projects requiring geo lookup
Impact
API Compatibility
This proposal maintains 100% API compatibility with the existing geo service. No consumer application changes will be required. The library will be integrated behind the existing service interface, making this change completely transparent to Adrian Flux, Sterling, Bikesure EPAs, JAF forms, and all other consumers.
Services Affected
- Universal geo service (internal code changes for library integration)
- Quote service (indirect - improved reliability)
Technical Considerations
- API compatibility will be maintained for existing geo service consumers
- Response format normalization between different providers
- Rate limiting and quota management per provider
- Caching strategy to reduce API calls and costs
- Configuration management for API keys across environments
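As one possible shape for the caching strategy listed above, the sketch below wraps any lookup callable in an in-memory cache with a time-to-live. The `TTLCache` name and the 24-hour default are assumptions. A shared deployment would more likely use an external cache, and each provider's terms of service may restrict caching, so that would need checking.

```python
import time


class TTLCache:
    """Caches lookup results per postcode for a fixed time-to-live."""

    def __init__(self, lookup, ttl_seconds=86400.0, clock=time.monotonic):
        self._lookup = lookup
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # cache key -> (stored_at, result)

    def get(self, postcode: str):
        # Key on a compact upper-case form so "nr1 1aa" and "NR11AA" share an entry.
        key = postcode.replace(" ", "").upper()
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        result = self._lookup(postcode)
        self._entries[key] = (now, result)
        return result
```

Repeated lookups for the same postcode within the TTL would then cost no provider API calls, which also helps a free-tier fallback stay within its quota.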
Data Considerations
- Different providers may return slightly different address formats
- UK-specific address formatting (getaddress.io strength)
- Postcode validation consistency across providers
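One way to keep postcode handling consistent across providers is to normalize input before it reaches any of them. The sketch below is an assumption about how that could be done: it relies on the inward part of a UK postcode always being three characters, and the regular expression is a loose structural check, not validation against the official postcode list (special cases such as `GIR 0AA` are rejected).

```python
import re

# Outward code: 1-2 letters, a digit, then an optional digit or letter.
# Inward code: a digit followed by two letters.
_POSTCODE_RE = re.compile(r"^[A-Z]{1,2}[0-9][A-Z0-9]? [0-9][A-Z]{2}$")


def normalize_postcode(raw: str) -> str:
    """Return the postcode upper-cased with a single space before the inward code."""
    compact = re.sub(r"\s+", "", raw).upper()
    formatted = f"{compact[:-3]} {compact[-3:]}"
    if not _POSTCODE_RE.match(formatted):
        raise ValueError(f"Not a recognisable UK postcode: {raw!r}")
    return formatted
```

Applying the same function ahead of every provider means differences in how each API tolerates spacing or case never reach the consumer.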
Provider Analysis
getaddress.io (Current/Primary)
- Pros: UK-focused, excellent postcode lookup, current integration exists
- Cons: Single point of failure, outage on 2026-02-04
- Cost: Paid service
Google Places API (Proposed Fallback)
- Pros: Highly reliable, generous free tier, global coverage
- Cons: Less UK-specific, may require address normalization
- Cost: Free tier available (suitable for fallback volumes)
Other Options to Evaluate
- Postcodes.io (UK specific, open data)
- Ideal Postcodes
- OS Places API
Implementation Plan
Phase 1: Multi-Provider Support with Manual Switchover
1.1 Library Development
- Extract and refactor existing geo code into standalone library
- Define abstract provider interface (BaseProvider)
- Implement GetAddressProvider (port existing code)
- Implement GooglePlacesProvider
- Create response normalization layer
- Implement configurable provider selection
- Write comprehensive test suite
- Package as installable Python library
1.2 Geo Service Integration
- Install library in universal-geo-service
- Configure primary and fallback providers
- Update service to use library interface
- Integration testing
- Deploy to staging
1.3 Monitoring & Rollout
- Configure Uptime Robot to monitor provider endpoints
- Set up alerting for provider outages
- Document manual switchover procedure
- Deploy to production
- Test manual switchover process
Phase 2: Automatic Health Checking & Failover
2.1 Library Enhancement
- Implement Health Checker component
- Implement Failover Manager with circuit breaker pattern
- Add automatic provider switching logic
- Configure failover thresholds and retry policies
- Update test suite for failover scenarios
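The Health Checker component could be approached along these lines: probe each configured provider with a known postcode and record whether it answered and how quickly. This is a sketch under assumptions; `HealthStatus`, `check_providers`, and the canary postcode are hypothetical, and scheduling the check periodically is left out.

```python
import time
from dataclasses import dataclass


@dataclass
class HealthStatus:
    provider: str
    healthy: bool
    latency_ms: float
    error: str = ""


def check_providers(providers, canary_postcode="NR1 1AA", clock=time.perf_counter):
    """Probe each provider once with a canary postcode and report the outcome.

    providers: mapping of provider name -> lookup callable.
    """
    statuses = []
    for name, lookup in providers.items():
        started = clock()
        try:
            results = lookup(canary_postcode)
        except Exception as exc:
            healthy, error = False, f"{type(exc).__name__}: {exc}"
        else:
            # An empty response for a known-good postcode is treated as unhealthy.
            healthy = bool(results)
            error = "" if healthy else "empty response"
        latency_ms = (clock() - started) * 1000
        statuses.append(HealthStatus(name, healthy, latency_ms, error))
    return statuses
```

The resulting statuses could feed both the failover decision and the metrics and logging feature, so that a degraded provider is visible before it fails outright.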
2.2 Integration & Rollout
- Update geo service to use automatic failover
- Deploy to staging
- Performance and failover testing
- Deploy to production
- Monitor automatic failover events
Wider Adoption (Post Phase 1 or 2)
- Provide library documentation and examples
- Support Viitata integration
- Support Forecaster integration
- Support Katala integration
Risks & Mitigation
| Risk | Impact | Phase | Mitigation |
|---|---|---|---|
| Provider response format differences | Medium | 1 & 2 | Robust response normalization and testing |
| Increased complexity | Low | 1 & 2 | Clean abstraction layer, good documentation |
| Google free tier limits exceeded | Low | 1 & 2 | Monitor usage, upgrade plan if needed |
| Manual switchover delay | Medium | 1 | Uptime Robot alerting, documented runbook |
| Failover latency | Low | 2 | Health checks, circuit breaker pattern |
| Address accuracy differences | Medium | 1 & 2 | Thorough testing with UK addresses, consider primary-only for critical paths |
Success Metrics
Phase 1
- Ability to switch providers within minutes of a detected outage
- Zero extended quote service outages due to geo provider failures
- Library adoption in at least 2 additional projects within 6 months
- Reduced operational impact from geo lookup alerts
Phase 2
- < 500ms automatic failover time when primary provider fails
- Zero manual intervention required for provider outages
- Reduced operational alerts related to geo lookups
Estimations
Phase 1: 3 days (20 hours)
| Task | Effort |
|---|---|
| Project setup, current implementation investigation, alternative provider investigation | 4 hours |
| Development build | 8 hours |
| Testing and integration | 8 hours |
Phase 2: 2 days (16 hours)
| Task | Effort |
|---|---|
| Development build | 8 hours |
| Testing and integration | 8 hours |
Status
Current Status: Draft
History:
- 2026-02-05: Proposal created following getaddress.io outage