ADR-0013: S3-Compatible Storage as SPIKE’s Backing Store

Status:
- accepted
- Supersedes ADR-0011: PostgreSQL as SPIKE’s Backing Store
Date: 2024-11-07
Tags: Storage, Authorization, Policy, S3, MinIO

Context

SPIKE needs a reliable, secure, and performant backing store to maintain encrypted data including:

Root keys (encrypted with admin password)
Admin tokens (encrypted with root key)
Secrets (encrypted with root key)

The system requires:

Secure storage of encrypted blobs
Path-based access control
Audit logging capabilities
Flexible deployment options (cloud and on-premises)
Integration with existing identity providers

After further analysis, we recognized that our secrets storage model closely resembles object storage patterns, where:

Secrets are essentially encrypted blobs
Access is path-based
Authorization decisions are made at the path level
Storage and retrieval operations are simple CRUD operations

Decision

We will use S3-compatible storage systems (AWS S3, MinIO) as the backing store for SPIKE, leveraging their native policy engines for access control.

Rationale

Authorization Model

S3’s IAM/policy engine is battle-tested and well-understood
Path-based policies align perfectly with SPIKE’s access patterns
Eliminates the need to build and maintain a custom policy framework
Policies can be managed through existing tools and processes

Storage Capabilities

Excellent for blob storage (our encrypted secrets)
Strong consistency guarantees (especially with newer S3 versions)
Built-in versioning support
Cross-region replication options
Excellent scalability characteristics

Operational Benefits

Multiple implementation options:
- AWS S3 for cloud deployments
- MinIO for on-premises deployments
- Other S3-compatible systems for special cases
Rich ecosystem of tools and utilities
Robust backup and lifecycle management
Built-in metrics and monitoring
Cost-effective for our access patterns

Security Features

Native encryption at rest
SSL/TLS support
Integration with various identity providers
Built-in audit logging
Object versioning for recovery

Consequences

Positive

Simplified architecture by using the storage system’s native policy engine
Reduced code complexity in SPIKE
Better separation of concerns (storage/policy vs. application logic)
Flexibility in deployment options (cloud or on-prem)
Future-proof: Can adopt better policy engines (e.g., OPA) without changing the storage layer
Built-in versioning and audit capabilities

Negative

Dependent on S3 API compatibility
May need to implement additional caching layer for performance
Limited by S3’s eventual consistency model for some operations
Need to ensure policy engine capabilities are consistent across different S3 implementations

Mitigations

Implement abstraction layer to handle S3 implementation differences
Document consistency requirements and guarantees
Regular testing with different S3-compatible systems

Implementation Notes

Storage Pattern

Memory is the primary storage medium
S3 serves dual purposes:
- Authorization source (via IAM/policies)
- Persistent backup store
Write pattern:
- Check S3 policy authorization
- If authorized, write to memory
- Asynchronously write to S3 for persistence
Read pattern:
- Check S3 policy authorization
- If authorized, serve from memory
- Only read from S3 during cold starts or recovery
  - for non-HA deployments
  - for HA deployments, the design will need to be adjusted
Delete pattern:
- Check S3 policy authorization
- If authorized, remove from memory
- Mark as deleted in S3 (using versioning)

Storage Layer

Use AWS SDK for S3 operations
Implement the storage interface that can work with any S3-compatible system
Encrypt all data before storage
Use versioning for secret history

Caching Strategy

Implement in-memory cache for performance
Cache only after confirming S3 permissions
Clear cache on policy changes
Implement TTL for cached items

Policy Management

Use a native S3 policy format
Document common policy patterns
Provide helper utilities for policy creation
Test policies across different S3 implementations

Future Considerations

If more complex policy requirements emerge, we can:
1. Continue using S3 for storage
2. Integrate OPA or similar for advanced policy evaluation
3. Keep existing S3 policies as coarse-grained control

References

Notes

This approach keeps SPIKE lean and focused while leveraging battle-tested components for storage and authorization. By using S3’s native policy engine initially, we avoid premature optimization while maintaining the flexibility to adopt more sophisticated policy engines like OPA if needed in the future.