ADR-0028: Use Human-Readable Error Messages in CLI Tools
- Status: accepted
- Date: 2025-11-17
- Tags: CLI, Error Handling, User Experience, Go
Context
The SPIKE Pilot CLI tool (spike) is primarily consumed by human users
(developers and operators) rather than programmatic consumers. Error handling
in Go typically uses sentinel errors and errors.Is() for programmatic error
checking, which works well for libraries and SDKs. However, CLI tools have
different requirements because their primary consumer is a human reading
terminal output.
We need to determine the appropriate error handling strategy for the SPIKE Pilot CLI that balances Go best practices with user experience requirements.
Decision
We will use human-readable, contextual error messages in the SPIKE Pilot CLI rather than exposing raw sentinel errors to users.
Specifically:
- Return formatted error messages with context using fmt.Errorf()
- Include actionable information (what failed, why, suggested next steps)
- Use plain English descriptions rather than error codes
- Provide helpful suggestions when appropriate
- Reserve sentinel errors for internal library code and SDK usage
Rationale
CLI Tools vs Libraries
Different types of software have different error handling needs:
| Software Type | Consumer | Error Strategy |
|---|---|---|
| Library/SDK | Other code | Sentinel errors, errors.Is() |
| CLI Tool | Human user | Formatted, contextual messages |
| API Service | HTTP client | Structured error responses |
Industry Best Practices
Popular CLI tools follow this pattern:
Git:
fatal: not a git repository (or any of the parent directories): .git
Docker:
Error: No such container: mycontainer
Error response from daemon: manifest for nginx:invalid not found
kubectl:
Error from server (NotFound): pods "myapp" not found
All provide human-readable context, not raw error types.
User Needs for CLI Tools
When a CLI command fails, users need:
- What went wrong: Clear description of the failure
- Why it failed: Context about the cause
- What to do next: Actionable suggestions when possible
Example comparison:
Sentinel error approach (bad for CLI):
return apiErr.ErrSecretNotFound
// Output: "secret not found"
Formatted error approach (good for CLI):
return fmt.Errorf(
    "secret not found at path '%s'. "+
        "Use 'spike secret list' to see available secrets",
    path,
)
// Output: "secret not found at path 'secrets/db/password'.
// Use 'spike secret list' to see available secrets"
When to Use Each Approach
Use sentinel errors when:
- Writing library code consumed by other Go code
- Other code needs to make programmatic decisions based on the error type
- Building SDKs or packages
Use formatted errors when:
- Building CLI tools for human users
- Error messages are displayed in terminal output
- Context and suggestions improve user experience
Examples from SPIKE Pilot
Good: Human-Friendly Errors
// Provides context and path
if !validSecretPath(path) {
    return fmt.Errorf("invalid secret path: %s", path)
}

// Includes actionable information
if cmd.NotReadyError(err) {
    stdout.PrintNotReady()
    return fmt.Errorf("server not ready")
}

// Suggests next steps
return fmt.Errorf(
    "unauthorized: your SPIFFE ID '%s' does not have permission. "+
        "Check policies with 'spike policy list'",
    spiffeID,
)
Internal: Sentinel Errors Still Used
The SDK and internal packages still use sentinel errors appropriately:
// internal/net/response.go - Server responses
reqres.FallbackResponse{Err: data.ErrNotReady}
// spike-sdk-go - SDK for programmatic use
return apiErr.ErrUnauthorized
The CLI layer translates these into human-friendly messages.
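The translation step could look roughly like the sketch below. It is an illustration under assumptions, not the actual SPIKE Pilot code: the sentinel errors are local stand-ins for the SDK's, and `humanize` and `fetchSecret` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for sentinel errors exported by the SDK.
var (
	ErrSecretNotFound = errors.New("secret not found")
	ErrUnauthorized   = errors.New("unauthorized")
)

// humanize (hypothetical) maps a sentinel error to a message meant for
// terminal output. The path gives the message its context.
func humanize(err error, path string) error {
	switch {
	case err == nil:
		return nil
	case errors.Is(err, ErrSecretNotFound):
		return fmt.Errorf(
			"secret not found at path '%s'. "+
				"Use 'spike secret list' to see available secrets",
			path,
		)
	case errors.Is(err, ErrUnauthorized):
		return fmt.Errorf(
			"unauthorized: no permission to read '%s'. "+
				"Check policies with 'spike policy list'",
			path,
		)
	default:
		// Unrecognized errors keep their text and gain the attempted operation.
		return fmt.Errorf("failed to read secret at '%s': %w", path, err)
	}
}

// fetchSecret (hypothetical) simulates an SDK call that fails with a
// wrapped sentinel error.
func fetchSecret(path string) error {
	return fmt.Errorf("get %s: %w", path, ErrSecretNotFound)
}

func main() {
	path := "secrets/db/password"
	if err := humanize(fetchSecret(path), path); err != nil {
		fmt.Println("Error:", err)
	}
}
```

Matching with errors.Is() keeps the sentinel errors useful inside the program, while only the formatted message reaches the terminal.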
Consequences
Positive
- Improved user experience: Users get clear, actionable error messages
- Faster problem resolution: Context helps users fix issues without consulting documentation
- Reduced support burden: Self-explanatory errors reduce support requests
- Aligned with CLI best practices: Matches user expectations from other tools
- Appropriate for audience: Developers and operators are human users, not machines
Negative
- Harder to parse programmatically: If scripts wrap the CLI, they cannot use errors.Is()
- Less structured: Error messages may vary in format
- Translation complexity: Internationalization would be more challenging (though not currently required)
Mitigations
For programmatic consumers (if needed in the future):
- Consistent exit codes (0 = success, 1 = error)
- Optional --json flag for structured output
- Documented error message patterns
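If these mitigations are ever needed, the exit codes and the JSON flag might be combined along the lines of the sketch below. Everything here is an assumption for illustration: the flag is not implemented today, the JSON shape is invented, and `cliError`, `render`, and `checkPath` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"flag"
	"fmt"
	"os"
	"strings"
)

// cliError is a hypothetical JSON shape for a CLI failure.
type cliError struct {
	Error string `json:"error"`
}

// render (hypothetical) returns the text to print and the process exit
// code: 0 for success, 1 for any error.
func render(err error, asJSON bool) (string, int) {
	if err == nil {
		return "", 0
	}
	if asJSON {
		b, mErr := json.Marshal(cliError{Error: err.Error()})
		if mErr == nil {
			return string(b), 1
		}
		// Fall through to plain text if encoding fails.
	}
	return "Error: " + err.Error(), 1
}

// checkPath (hypothetical) rejects empty paths and empty path segments.
func checkPath(path string) error {
	if path == "" || strings.Contains(path, "//") {
		return fmt.Errorf("invalid secret path: %s", path)
	}
	return nil
}

func main() {
	asJSON := flag.Bool("json", false, "print errors as JSON")
	flag.Parse()

	msg, code := render(checkPath("secrets//db"), *asJSON)
	if msg != "" {
		fmt.Fprintln(os.Stderr, msg)
	}
	// A real CLI would end with os.Exit(code); the sketch prints it instead.
	fmt.Println("exit code:", code)
}
```

The human-readable message stays the default, and scripts that opt in to JSON get a stable field to read instead of parsing prose.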
Implementation Guidelines
When writing CLI error messages:
- Be specific: Include relevant details (paths, IDs, names)
- Provide context: Explain what operation was attempted
- Suggest actions: Point users toward solutions when possible
- Use plain English: Avoid jargon and error codes
- Be concise: Don’t overwhelm with excessive detail
Good example:
return fmt.Errorf(
    "failed to decrypt file '%s': file does not exist",
    inFile,
)
Bad example:
return ErrFileNotFound // Unhelpful for CLI users
References
- Go Error Handling: https://go.dev/blog/error-handling-and-go
- CLI Design Guidelines: https://clig.dev/
- Comparison with popular CLI tools (git, docker, kubectl)
Related ADRs
This decision applies specifically to CLI tools. Other components follow different patterns:
- SPIKE SDK uses sentinel errors for programmatic consumers
- SPIKE Nexus API returns structured error responses
- Internal packages use sentinel errors for type checking