ADR-0017: Synchronous Persistence for SPIKE Secrets Store
- Status: accepted
- Date: 2025-01-25
- Tags: Security, Persistence, Database, Backing-Store, Performance
Context
SPIKE is a secrets store that can use a SQLite backing store (among other backing store options) to persist secrets. However, the source of truth for the secrets is held in memory. SQLite is primarily used as a backup to rehydrate secrets in case the secrets store crashes or needs to be recovered.
Persistence operations were initially designed to be asynchronous, using methods like AsyncSaveSecret(), to minimize blocking and improve performance. However, this design has introduced unnecessary complexity, race conditions, and edge cases, with no significant benefit to the overall system. SQLite, being fast and lightweight, already offers sufficient performance without the need for additional asynchronous operations.
Problem
The asynchronous approach to persistence introduces the following issues:
- Increased complexity: Asynchronous operations, while designed to improve performance, add complexity to the system, making it harder to reason about and troubleshoot.
- Race conditions and edge cases: The asynchronous operations have led to potential race conditions, which compromise the system’s reliability.
- Debugging difficulty: To avoid the race conditions above, we could have used synchronization abstractions such as Go channels. However, combining Go channels with asynchronous operations makes debugging harder, because tracking state transitions becomes non-trivial.
Given that SQLite is already fast enough for our needs, the performance benefit of using asynchronous operations is minimal. As a result, we no longer see a significant justification for using asynchronous persistence operations in this context.
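The ordering problem described above can be illustrated with a minimal sketch. It is not SPIKE's actual code: `fakeBackend`, `asyncSave`, and `persistTwice` are hypothetical names, and the in-memory map merely stands in for the SQLite backing store. Two updates to the same path are issued back to back, but since each write runs in its own goroutine, the backing store may end up holding either value.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeBackend is a hypothetical stand-in for the SQLite backing store.
type fakeBackend struct {
	mu   sync.Mutex
	rows map[string]string
}

// Store writes a value; the mutex prevents memory corruption but does
// nothing to guarantee the order in which competing writes arrive.
func (b *fakeBackend) Store(path, value string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.rows[path] = value
}

// Load returns the value currently persisted for path.
func (b *fakeBackend) Load(path string) string {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.rows[path]
}

// asyncSave mimics a fire-and-forget persistence call: it returns
// immediately and the write lands whenever the goroutine is scheduled.
func asyncSave(b *fakeBackend, wg *sync.WaitGroup, path, value string) {
	wg.Add(1)
	go func() {
		defer wg.Done()
		b.Store(path, value)
	}()
}

// persistTwice issues two updates to the same path and returns what the
// backing store holds once both writes have completed.
func persistTwice() string {
	b := &fakeBackend{rows: make(map[string]string)}
	var wg sync.WaitGroup
	asyncSave(b, &wg, "secrets/db", "v1")
	asyncSave(b, &wg, "secrets/db", "v2")
	wg.Wait()
	return b.Load("secrets/db")
}

func main() {
	// The in-memory store would hold "v2"; the backing store may hold
	// "v1" or "v2", depending on goroutine scheduling.
	fmt.Println("persisted:", persistTwice())
}
```

If the backing store ends up with the older value, a later rehydration would restore stale data. Preventing that requires extra ordering machinery, which is the added complexity this section refers to.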
Decision
- Synchronous Persistence: All database persistence operations will now be synchronous.
  - Justification: Since SQLite is sufficiently fast, and we are not seeing performance bottlenecks at the database level, the simplicity of synchronous operations outweighs the potential complexity of maintaining asynchronous ones.
  - Expected Outcome: This decision reduces the complexity of the codebase, eliminates the potential for race conditions, and makes the system easier to debug and maintain. We will continue to monitor for any performance impact that might arise due to this decision.
- Fallback to Async if Performance Issues Arise: In the unlikely event that we observe significant performance issues with synchronous operations, we will consider optimizing specific areas locally.
  - Optimization Strategy: If performance degradation is observed, we will explore optimization options such as local caching, batching of persistence operations, or fine-tuning SQLite settings. Asynchronous operations may be reintroduced selectively in these cases.
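A synchronous write path along the lines of this decision might look like the sketch below. All names here (`Backend`, `memBackend`, `Store`, `NewStore`, `UpsertSecret`) are hypothetical and chosen for illustration; the real SPIKE interfaces may differ. The sketch also assumes that a failed backend write is reported to the caller while the in-memory value is kept, since memory is the source of truth.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Backend is a hypothetical interface for a backing store such as SQLite.
type Backend interface {
	StoreSecret(path, value string) error
}

// memBackend is an in-memory fake used only to keep the sketch runnable
// without a database driver.
type memBackend struct {
	rows map[string]string
	fail bool // when true, StoreSecret simulates a backend failure
}

func (b *memBackend) StoreSecret(path, value string) error {
	if b.fail {
		return errors.New("backend unavailable")
	}
	b.rows[path] = value
	return nil
}

// Store keeps secrets in memory (the source of truth) and mirrors every
// write to the backing store before returning.
type Store struct {
	mu      sync.Mutex
	secrets map[string]string
	backend Backend
}

// NewStore returns an empty Store that persists to b.
func NewStore(b Backend) *Store {
	return &Store{secrets: make(map[string]string), backend: b}
}

// UpsertSecret updates memory and then persists synchronously. Holding
// the lock across both steps means backend writes happen in the same
// order as in-memory writes.
func (s *Store) UpsertSecret(path, value string) error {
	s.mu.Lock()
	defer s.mu.Unlock()

	s.secrets[path] = value
	if err := s.backend.StoreSecret(path, value); err != nil {
		// Assumption: memory keeps the new value; the caller sees the error.
		return fmt.Errorf("persist %q: %w", path, err)
	}
	return nil
}

func main() {
	backend := &memBackend{rows: make(map[string]string)}
	store := NewStore(backend)

	if err := store.UpsertSecret("secrets/db", "v1"); err != nil {
		fmt.Println("error:", err)
	}
	if err := store.UpsertSecret("secrets/db", "v2"); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println("persisted:", backend.rows["secrets/db"])
}
```

Because each call returns only after the backend write has finished, there is no window in which a later update can be overtaken by an earlier one, and persistence failures surface at the call site instead of in a background goroutine.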
Consequences
- Reduced Complexity: By removing asynchronous operations, the system will be simpler and easier to maintain, with fewer edge cases and race conditions to handle.
- Performance Tradeoff: Synchronous operations may result in slight performance degradation if there is a heavy load on the persistence layer. However, this is unlikely given the current design and SQLite’s speed.
- Easier Debugging: The synchronous model simplifies debugging, as there are no concurrent persistence operations to track.
Alternatives Considered
- Async Persistence: We initially considered keeping asynchronous operations to prevent blocking and improve performance. However, this would introduce complexity that isn’t justified by the system’s current requirements and SQLite’s speed.
- Go Channels for Sync Operations: Using Go channels to handle synchronization in asynchronous operations was also considered, but it would increase debugging complexity and not address the core issue effectively.
This ADR will be revisited if performance issues arise, but for now, the shift to synchronous persistence aligns with the goal of simplifying the codebase and improving system stability.