Dynamic Compute Adaptation
Chloros 1.1.0 introduces intelligent hardware detection and automatic processing strategy selection. The processing engine adapts to your hardware — from a Jetson Nano to a multi-GPU workstation — without any manual configuration.
How It Works
When Chloros starts, it automatically profiles your system:
Detects the operating system — Windows or Linux
Identifies CPU cores and total RAM
Detects GPU presence — NVIDIA CUDA capability, VRAM, model
Identifies Jetson model (if applicable) via /proc/device-tree/model
Checks thermal sensors (Jetson) for temperature-aware processing
Selects the optimal compute strategy — based on all detected hardware
Configures worker count, pipeline type, and memory allocation automatically
The result is cached so subsequent runs start faster. If hardware changes (e.g., a GPU is added), Chloros re-profiles on the next launch.
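The first profiling steps can be sketched with the Python standard library. This is a minimal illustration, not Chloros's actual code: `read_jetson_model` and `profile_system` are hypothetical names, and RAM, GPU, and thermal detection are left out because they need platform-specific tooling.

```python
import os
import platform
from pathlib import Path

# Path named in the steps above; it exists only on device-tree Linux systems.
JETSON_MODEL_PATH = Path("/proc/device-tree/model")


def read_jetson_model(path: Path = JETSON_MODEL_PATH):
    """Return the board model string, or None when the file cannot be read."""
    try:
        raw = path.read_bytes()
    except OSError:
        return None
    # Device-tree strings are NUL-terminated, so strip the trailing byte.
    model = raw.decode("utf-8", errors="replace").rstrip("\x00").strip()
    return model or None


def profile_system() -> dict:
    """Collect OS, CPU, and board facts (hypothetical helper)."""
    return {
        "os": platform.system(),           # e.g. "Windows" or "Linux"
        "cpu_cores": os.cpu_count() or 1,  # os.cpu_count() may return None
        "device_model": read_jetson_model(),
    }


if __name__ == "__main__":
    print(profile_system())
```

On a desktop system `device_model` is simply `None`, so later strategy selection can treat the Jetson checks as optional.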
Compute Strategies
Chloros selects one of three compute strategies based on your hardware:
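| Strategy | NVIDIA GPU | Workers | Pipeline | Typical hardware |
| --- | --- | --- | --- | --- |
| GPU_PARALLEL | Yes (12GB+ VRAM or 16GB+ shared) | 3-4 | fused_gpu | Desktop GPUs with 12GB+, Jetson Orin NX 16GB, AGX Orin |
| GPU_SINGLE | Yes (< 12GB VRAM) | 1-3 | tiled_gpu | Entry-level GPUs, Jetson Nano, Orin Nano |
| CPU_PARALLEL | No | cores - 1 | cpu_fallback | Systems without NVIDIA GPU |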
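The strategy thresholds above can be read as a small decision function. The sketch below is an illustration under stated assumptions, not the engine's real logic: `Strategy`, `select_strategy`, and `cpu_worker_count` are hypothetical names, `memory_gb` is taken to be dedicated VRAM or the nominal capacity of a unified-memory device, and the floor of one CPU worker is an assumption. How Chloros picks a value inside the 3-4 or 1-3 worker ranges is not documented here, so the sketch leaves it out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    name: str
    pipeline: str


def select_strategy(has_nvidia_gpu: bool, memory_gb: float = 0.0,
                    shared_memory: bool = False) -> Strategy:
    """Pick a strategy from the documented memory thresholds.

    memory_gb is dedicated VRAM, or the nominal shared capacity on
    unified-memory devices such as Jetson (assumption).
    """
    if not has_nvidia_gpu:
        return Strategy("CPU_PARALLEL", "cpu_fallback")
    # 12GB+ dedicated VRAM or 16GB+ shared memory qualifies for the fused path.
    threshold_gb = 16.0 if shared_memory else 12.0
    if memory_gb >= threshold_gb:
        return Strategy("GPU_PARALLEL", "fused_gpu")
    return Strategy("GPU_SINGLE", "tiled_gpu")


def cpu_worker_count(cores: int) -> int:
    """cores - 1, never below one worker (the floor is an assumption)."""
    return max(1, cores - 1)


if __name__ == "__main__":
    print(select_strategy(True, 8.0))                       # desktop with an 8GB GPU
    print(select_strategy(True, 16.0, shared_memory=True))  # 16GB unified-memory device
    print(select_strategy(False), cpu_worker_count(8))      # CPU-only system
```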
Pipeline Types
fused_gpu: Full GPU processing path. All debayer, correction, and index operations run on the GPU in a single fused pass. Highest throughput but requires more VRAM.
tiled_gpu: Memory-efficient GPU path. Processes images in tiles to fit within limited GPU memory. Lower throughput but works on memory-constrained devices.
cpu_fallback: CPU-only processing using multi-threaded parallelism. Used when no NVIDIA GPU is available.
Platform-Specific Behavior
| Platform | Strategy | Workers | Pipeline | Notes |
| --- | --- | --- | --- | --- |
| Jetson Nano 8GB | GPU_SINGLE | 1 | tiled_gpu (serialized) | Memory-efficient mode, processes one image at a time |
| Jetson Orin NX 16GB | GPU_PARALLEL | 3 | fused_gpu (concurrent) | Recommended edge device; real parallel GPU processing |
| Jetson AGX Orin 64GB | GPU_PARALLEL | 4 | fused_gpu (concurrent) | Maximum edge performance |
| Desktop with 8GB GPU | GPU_SINGLE | 3 | tiled_gpu | Good desktop performance with memory-efficient tiles |
| Desktop with 12GB+ GPU | GPU_PARALLEL | 3-4 | fused_gpu | Optimal desktop performance |
| CPU-only system | CPU_PARALLEL | cores - 1 | cpu_fallback | No GPU required, uses ThreadPool |
Jetson unified memory: Jetson devices share GPU and CPU memory. A Jetson Orin NX 16GB reports ~15.3GB of VRAM, but this is the same physical RAM used by the OS and CPU processes. Chloros accounts for this when setting memory allocation thresholds.
Dynamic GPU Memory Allocation
Chloros uses a 4-thread processing pipeline:
Thread 1 (Detection) — Image loading, EXIF parsing, target detection
Thread 2 (Calibration) — Reflectance calibration computation
Thread 3 (Processing) — GPU debayer, vignette correction, index calculation
Thread 4 (Export) — File writing, metadata embedding
As earlier pipeline threads complete their work (e.g., all images have been detected), their GPU memory allocation is released and redistributed to the remaining active threads. This means Thread 3 (the GPU-intensive stage) gets progressively more memory as the pipeline advances, improving throughput for the most compute-intensive work.
Allocation Stages
| Stage | Active threads | Memory allocation |
| --- | --- | --- |
| Early | 1, 2, 3, 4 | Split across all threads |
| Mid-Early | 2, 3, 4 | Thread 1 memory redistributed |
| Mid-Late | 3, 4 | Threads 1+2 memory goes to 3+4 |
| Late | 3 or 4 | Maximum memory for remaining thread |
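The redistribution across stages can be illustrated with a simple even split among the threads that are still active. This is only a sketch: the even split, the `split_gpu_budget` name, and the 8000 MB budget are assumptions, and the real allocator may weight threads differently.

```python
def split_gpu_budget(total_mb: int, active_threads: set) -> dict:
    """Divide total_mb evenly among the pipeline threads that are still active."""
    if not active_threads:
        return {}
    share = total_mb // len(active_threads)
    return {thread_id: share for thread_id in sorted(active_threads)}


# Walk through the four allocation stages with a hypothetical 8000 MB budget.
stages = {
    "Early": {1, 2, 3, 4},
    "Mid-Early": {2, 3, 4},
    "Mid-Late": {3, 4},
    "Late": {3},
}

if __name__ == "__main__":
    for stage, active in stages.items():
        print(stage, split_gpu_budget(8000, active))
```

Under this model, Thread 3's share grows from a quarter of the budget in the Early stage to the whole budget once it is the only active thread.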
Texture Aware Processing
The Texture Aware debayer method (Chloros+ only) uses significantly more GPU memory than the Standard method due to the AI/ML denoising model:
Systems with < 7GB VRAM are forced into a synchronous processing loop for Texture Aware mode (one image at a time)
Systems with 7GB+ VRAM can process Texture Aware concurrently, though at reduced worker count compared to Standard
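As a small sketch of the 7GB rule above (the function and constant names are hypothetical, and the reduced worker count for concurrent mode is not modeled because its value is not documented here):

```python
TEXTURE_AWARE_MIN_CONCURRENT_VRAM_GB = 7.0  # threshold quoted in the text above


def texture_aware_mode(vram_gb: float) -> str:
    """Return "synchronous" below 7 GB of VRAM, otherwise "concurrent"."""
    if vram_gb < TEXTURE_AWARE_MIN_CONCURRENT_VRAM_GB:
        return "synchronous"
    return "concurrent"
```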
Thermal Management (Jetson)
Jetson devices have thermal constraints, especially in enclosed or airborne deployments. Chloros monitors GPU and CPU temperatures and automatically adjusts processing:
| Temperature | Action |
| --- | --- |
| < 70°C | Normal operation, full speed |
| 70°C (Warning) | Reduce batch size |
| 80°C (Critical) | Aggressive throttling: lower concurrency and worker count |
| 90°C (Shutdown) | Stop GPU processing entirely |
Temperature monitoring uses tegrastats on Jetson platforms. On desktop systems with adequate cooling, thermal throttling is rarely triggered.
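The temperature thresholds map naturally onto a lookup function. The sketch below assumes each threshold is inclusive (70°C exactly counts as Warning) and uses hypothetical names and labels. It does not read sensors, so the caller supplies the temperature.

```python
def thermal_action(temp_c: float) -> str:
    """Map a temperature in degrees Celsius to the documented response level."""
    if temp_c >= 90:
        return "shutdown"   # stop GPU processing entirely
    if temp_c >= 80:
        return "critical"   # lower concurrency and worker count
    if temp_c >= 70:
        return "warning"    # reduce batch size
    return "normal"         # full speed


if __name__ == "__main__":
    for reading in (55.0, 72.5, 84.0, 91.0):
        print(reading, thermal_action(reading))
```

Because both GPU and CPU temperatures are monitored, one reasonable choice is to pass the hotter of the two readings, though the text does not say how Chloros combines them.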
Memory Pressure Handling
Chloros monitors system memory pressure during processing:
Memory threshold: 85% utilization triggers conservative behavior
OOM reduction: If an out-of-memory event occurs, allocation is reduced by 25% (0.75x multiplier)
Pipeline fallback: Under severe memory pressure, the pipeline falls back from fused_gpu to tiled_gpu automatically
Swap recommendations: On Jetson, Chloros warns you if swap space is insufficient for your dataset size
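The numeric rules above can be sketched as three small helpers. All names are hypothetical, treating 85% as an inclusive threshold is an assumption, and what counts as "severe" pressure is left to the caller because the text does not define it.

```python
MEMORY_PRESSURE_THRESHOLD = 0.85  # 85% utilization
OOM_MULTIPLIER = 0.75             # 25% reduction after an out-of-memory event


def under_memory_pressure(used_bytes: int, total_bytes: int) -> bool:
    """True once utilization reaches the 85% threshold."""
    return total_bytes > 0 and used_bytes / total_bytes >= MEMORY_PRESSURE_THRESHOLD


def allocation_after_oom(current_mb: int) -> int:
    """Shrink the current allocation by 25% after an OOM event."""
    return int(current_mb * OOM_MULTIPLIER)


def pipeline_after_pressure(pipeline: str, severe: bool) -> str:
    """Fall back from fused_gpu to tiled_gpu under severe pressure."""
    if severe and pipeline == "fused_gpu":
        return "tiled_gpu"
    return pipeline
```

Repeated OOM events compound under this model: a 4000 MB allocation becomes 3000 MB after one event and 2250 MB after two.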
Monitoring Compute Adaptation
CLI Status Output
When processing starts, the CLI displays the detected hardware profile.
System Diagnostics
Run chloros-cli selftest to see a full hardware profile and verify compute capabilities. This checks CUDA availability, GPU memory, denoiser models, and backend connectivity.
Next Steps
Processing Pipeline — Understanding the 4-thread pipeline architecture
NVIDIA Jetson Guide — Jetson-specific deployment and optimization
CLI : Command Line — Full CLI reference