#
Janitor Process
#
Overview
The Janitor is a Docker container that manages archived persistence files generated by the Nekuti Matching Engine. Once the Snapshotter has moved command files and snapshots to the archive directory, the Janitor compresses them, transfers them to deep storage, and applies the configured snapshot retention policy.
#
Purpose
The Janitor serves three primary functions:
- Deep Storage Archival: Compresses and transfers archived files to long-term storage
- Snapshot Pruning: Retains only complete, necessary snapshots for long-term storage efficiency
- Storage Optimization: Manages disk usage across persistence and deep storage volumes
#
Architecture
#
File Flow
flowchart TD
    A[Engine] --> B[/Engine Persistence Directory/]
    B --> C[Snapshotter]
    C --> D[/Engine Persistence Archive Directory/]
    D --> E[Janitor]
    E --> F[/Deep Storage/]
    style B fill:#cfe,stroke:#333
    style D fill:#cfe,stroke:#333
    style F fill:#cfe,stroke:#333
#
Volume Requirements
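The Janitor requires access to two mounted volumes:
- Engine Persistence: Contains the archive directory the Janitor reads from (mounted at /app/persistence in the deployment example)
- Deep Storage: Long-term destination for compressed archives (mounted at /app/deep_storage in the deployment example)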
#
Core Operations
#
Deep Storage Archival
#
Command Files
- Trigger: Once the file has been completely written to the archive directory
- Process: Compressed to ZIP format and written to Deep Storage
- Retention: All command files are preserved indefinitely in Deep Storage
- Cleanup: Original file removed after successful archival to Deep Storage
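The command file steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Janitor's actual implementation: the function name `archive_command_file`, the `.zip` and `.partial` naming, and the write-then-rename step are all hypothetical.

```python
import os
import zipfile
from pathlib import Path


def archive_command_file(source: Path, deep_storage: Path) -> Path:
    """Compress one archived command file into deep storage, then remove the original.

    Hypothetical helper: the names and on-disk layout are assumptions.
    """
    deep_storage.mkdir(parents=True, exist_ok=True)
    target = deep_storage / (source.name + ".zip")
    partial = deep_storage / (source.name + ".zip.partial")

    # Write under a temporary name first so an interrupted run never leaves
    # a truncated archive under the final name.
    with zipfile.ZipFile(partial, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(source, arcname=source.name)

    os.replace(partial, target)  # rename within the same directory
    source.unlink()              # cleanup only after the archive is in place
    return target
```

Deleting the original as the last step means a failure at any earlier point leaves the source file in the archive directory, where it can be retried.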
#
Snapshots
- Trigger: Once the snapshot has been completely written to the archive directory
- Process: Compressed to ZIP format and written to Deep Storage
- Retention: Snapshots in Deep Storage are subject to the pruning policy
- Cleanup: Original directory removed after successful archival to Deep Storage
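Because a snapshot is a directory rather than a single file, the archival step has to walk its contents. The sketch below shows one way to do that; `archive_snapshot_dir` and the archive naming are hypothetical and not taken from the Janitor itself.

```python
import shutil
import zipfile
from pathlib import Path


def archive_snapshot_dir(snapshot_dir: Path, deep_storage: Path) -> Path:
    """Zip a snapshot directory into deep storage, then remove the directory.

    Hypothetical helper: the names and archive layout are assumptions.
    """
    deep_storage.mkdir(parents=True, exist_ok=True)
    target = deep_storage / (snapshot_dir.name + ".zip")

    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(snapshot_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the snapshot's parent so the
                # snapshot directory name is preserved inside the archive.
                arcname = path.relative_to(snapshot_dir.parent).as_posix()
                zf.write(path, arcname=arcname)

    # Remove the original directory only after the archive has been closed.
    shutil.rmtree(snapshot_dir)
    return target
```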
#
Snapshot Pruning Strategy
The Janitor implements a tiered retention policy for snapshots:
#
Command File Retention Policy
- Command files are retained indefinitely
#
Snapshot Retention Policies
- Do Not Prune (default for Docker image deployments): All snapshots are retained indefinitely
- Command Span: Retains at least 1 snapshot per 1 million commands
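One way to picture the Command Span policy is to group snapshots into buckets of 1 million commands and keep one snapshot per bucket. The sketch below assumes each snapshot is identified by the sequence number of the last command it covers, and it keeps the newest snapshot in each bucket. Both the identifier scheme and the choice of which snapshot survives are assumptions; the policy description does not specify them.

```python
from typing import Iterable

COMMAND_SPAN = 1_000_000  # span size taken from the policy description


def snapshots_to_retain(snapshot_seqs: Iterable[int], span: int = COMMAND_SPAN) -> list[int]:
    """Return the snapshot sequence numbers to keep, one per command span.

    Hypothetical selection rule: keep the newest snapshot in each span.
    """
    newest_per_bucket: dict[int, int] = {}
    for seq in snapshot_seqs:
        bucket = seq // span  # which 1-million-command span this snapshot falls in
        if seq > newest_per_bucket.get(bucket, -1):
            newest_per_bucket[bucket] = seq
    return sorted(newest_per_bucket.values())


kept = snapshots_to_retain([100, 500_000, 999_999, 1_200_000, 2_500_000, 2_900_000])
print(kept)  # [999999, 1200000, 2900000]
```

Any snapshot not in the returned list would be a candidate for pruning under this assumed rule.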
#
Configuration
#
Docker Deployment
janitor:
  user: $PUID:$PGID
  image: janitor:current
  restart: unless-stopped
  volumes:
    - $ENGINE_PERSISTENCE_DIRECTORY:/app/persistence
    - $DEEP_STORAGE_DIR:/app/deep_storage
  container_name: janitor
#
Storage Planning
#
Deep Storage Requirements
#
Compression Ratios
- Command Files: Typically compress to 60-70% of original size
- Snapshots: Compression varies by content; typically 70-90% of original size
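The ratios above can be turned into a rough sizing range. The function below is an illustrative calculation only; the function name and the input volumes in the usage line are hypothetical, and real results depend on engine usage.

```python
def estimate_deep_storage(
    command_size: float,
    snapshot_size: float,
    command_ratio: tuple[float, float] = (0.60, 0.70),   # from the ratios above
    snapshot_ratio: tuple[float, float] = (0.70, 0.90),  # from the ratios above
) -> tuple[float, float]:
    """Return a (low, high) estimate of compressed size, in the same unit as the inputs."""
    low = command_size * command_ratio[0] + snapshot_size * snapshot_ratio[0]
    high = command_size * command_ratio[1] + snapshot_size * snapshot_ratio[1]
    return low, high


# Hypothetical inputs: 100 GiB of command files and 20 GiB of snapshots.
low, high = estimate_deep_storage(100.0, 20.0)
print(f"Deep Storage estimate: {low:.0f} to {high:.0f} GiB")
```

With these example inputs the estimate is roughly 74 to 88 GiB before any snapshot pruning is applied.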
#
Capacity Planning
To assist with capacity planning, the following table describes typical capacity requirements for Engine Persistence and Deep Storage volumes. The requirements serve as an indication only and will vary depending on engine usage.
#
Operational Considerations
#
Monitoring
#
Health Checks
- Engine Persistence Archive directory usage: Monitor usage of the archive directory on the Engine Persistence volume. If files accumulate there, check the health of the Janitor
- Deep Storage utilization: Ensure Deep Storage has sufficient capacity. If the Janitor process fails, the Engine Persistence volume can run out of storage space, which risks causing engine failure
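Both checks can be automated with a small script run by whatever scheduler or monitoring agent is already in place. The sketch below is one possible shape; the function name and both thresholds are hypothetical and should be tuned per deployment.

```python
import shutil
from pathlib import Path


def check_janitor_health(
    archive_dir: Path,
    deep_storage: Path,
    max_pending_entries: int = 100,    # hypothetical threshold
    max_usage_fraction: float = 0.80,  # hypothetical threshold
) -> list[str]:
    """Return warnings; an empty list means no threshold was exceeded."""
    warnings = []

    # Entries piling up in the archive directory suggest the Janitor has stalled.
    pending = sum(1 for _ in archive_dir.iterdir())
    if pending > max_pending_entries:
        warnings.append(f"archive directory holds {pending} entries; check the Janitor")

    # Check how full the volume behind each path is.
    for label, path in (("Engine Persistence", archive_dir), ("Deep Storage", deep_storage)):
        usage = shutil.disk_usage(path)
        if usage.total and usage.used / usage.total > max_usage_fraction:
            warnings.append(f"{label} volume is {usage.used / usage.total:.0%} full")

    return warnings
```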
#
Recovery Scenarios
#
Partial Archive Loss
- Command Files: No recovery possible; command files are the authoritative record
- Snapshots: Can be regenerated from command files
- Compressed Archives: Verify integrity before deletion of originals
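An integrity check before deleting an original could look like the sketch below. It assumes a single-file ZIP archive whose member is named after the original file; `archive_is_intact` is a hypothetical helper, not part of the Janitor.

```python
import zipfile
from pathlib import Path


def archive_is_intact(archive: Path, original: Path) -> bool:
    """Return True only if the archive passes its CRC checks and matches the original's size."""
    try:
        with zipfile.ZipFile(archive) as zf:
            # testzip() returns the first corrupt member name, or None if all pass.
            if zf.testzip() is not None:
                return False
            # getinfo() raises KeyError when the expected member is missing.
            info = zf.getinfo(original.name)
            return info.file_size == original.stat().st_size
    except (zipfile.BadZipFile, KeyError, OSError):
        return False
```

The original would be deleted only when this returns True; any other outcome leaves the original in place for a retry.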
#
Deep Storage Connectivity Issues
- The Janitor pauses until connectivity to the Deep Storage volume is restored
- Files may accumulate in the Engine Persistence archive directory during outages
- Monitor disk usage on both the Engine Persistence and Deep Storage volumes
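The pause-and-resume behaviour described above can be illustrated with a simple polling loop. This is a sketch of the general pattern, not the Janitor's actual logic: the probe file name, the polling interval, and both function names are hypothetical.

```python
import time
from pathlib import Path
from typing import Callable, Optional


def deep_storage_writable(deep_storage: Path) -> bool:
    """Probe the volume by creating and removing a small marker file."""
    probe = deep_storage / ".janitor_probe"  # hypothetical marker name
    try:
        probe.write_bytes(b"")
        probe.unlink()
        return True
    except OSError:
        return False


def wait_for_deep_storage(
    deep_storage: Path,
    poll_seconds: float = 30.0,          # hypothetical polling interval
    max_attempts: Optional[int] = None,  # None means wait indefinitely
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Block until the volume is writable; return False if max_attempts is exhausted."""
    attempts = 0
    while not deep_storage_writable(deep_storage):
        attempts += 1
        if max_attempts is not None and attempts >= max_attempts:
            return False
        sleep(poll_seconds)
    return True
```

While such a loop is waiting, nothing is removed from the archive directory, which is why the archive directory grows during an outage.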
#
Backup Strategies
#
Deep Storage Backup
Data persisted to the Deep Storage volume is essential for recovering engine state, so plan for potential disaster scenarios:
- Consider backing up Deep Storage volumes
- Consider distributing Deep Storage backups geographically
#
Capacity Management
- Monitoring: Set utilization threshold alerts
- Growth Planning: Project storage needs based on trading volume
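A simple linear projection connects the two points above: given current usage and an observed daily growth rate, it estimates how long until an alert threshold is reached. The function name, the 80% default, and the example figures are hypothetical; real growth follows trading volume and is unlikely to be perfectly linear.

```python
def days_until_threshold(
    capacity_bytes: float,
    used_bytes: float,
    daily_growth_bytes: float,
    alert_fraction: float = 0.80,  # hypothetical alert threshold
) -> float:
    """Estimate the days remaining before usage crosses the alert threshold."""
    threshold = capacity_bytes * alert_fraction
    if used_bytes >= threshold:
        return 0.0           # already at or past the threshold
    if daily_growth_bytes <= 0:
        return float("inf")  # no growth, so the threshold is never reached
    return (threshold - used_bytes) / daily_growth_bytes


# Hypothetical volume: 1000 GB capacity, 300 GB used, growing 50 GB per day.
GB = 10**9
print(days_until_threshold(1000 * GB, 300 * GB, 50 * GB))
```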