When Flash Isn’t Just Flash: Real-World Lessons Using FATFS, LittleFS & Other Filesystems in IoT Devices
Choosing a filesystem in embedded firmware looks simple — until real devices start writing logs, staging firmware updates, caching sensor data, or recovering from unexpected resets. What starts as a neat API turns into challenges involving corruption, wear, mount-time failures, memory stalls, directory explosions, and behavior that rarely shows up in simulation.
This article breaks down the actual problems we faced with FATFS, LittleFS, and SPIFFS, why they occurred, and the fixes and architectural patterns that finally stabilized our systems.
Problem: Filesystems Behave Differently in Theory vs. the Real World
Each filesystem promises something on paper:
- FATFS → compatibility, simplicity, widely used
- LittleFS → power-loss safety, wear-leveling, small footprint
- SPIFFS → minimal metadata, flash-friendly for a few files
But the moment firmware starts doing real work — writing logs every few seconds, saving snapshots, updating configs, storing intermediate buffers — these abstractions collide with:
- sudden power loss
- wear patterns
- metadata growth
- fragmentation
- I/O latency spikes
- unrelated bugs surfacing as file corruption
The result: subtle, long-tail failures that take weeks or months to diagnose.
FATFS: Where It Broke and How We Fixed It
Challenge 1: Power-Loss Corruption of FAT Tables
Even when writing small files, unexpected resets consistently produced:
- broken FAT chains
- orphaned clusters
- files that appeared normal but contained corrupted data
- partial writes that silently truncated
Why it happens:
Directory entries, FAT tables, and allocation blocks are updated in separate, non-atomic writes. Any reset in between leaves the filesystem in a half-updated state.
Fix
- Reduce write frequency by batching writes in RAM (see the sketch after this list)
- Never update FATFS inside timing-sensitive code paths
- Introduce write barriers and delayed commits
- Add periodic integrity checks
- Move frequently changing data out of FATFS entirely
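As a concrete illustration, here is a minimal sketch of the batch-and-commit pattern against the standard FatFs API. The buffer size, file name, and drop-on-overflow policy are placeholders, not our production values:

```c
/* Minimal sketch: accumulate log records in RAM, then commit them in
 * one batch from a low-priority context. Buffer size, file name, and
 * overflow policy are illustrative placeholders. */
#include <string.h>
#include "ff.h"

#define LOG_BUF_SIZE 4096

static char   log_buf[LOG_BUF_SIZE];
static size_t log_fill;

/* Cheap, RAM-only; safe to call from application logic. */
void log_append(const void *rec, size_t len)
{
    if (log_fill + len > LOG_BUF_SIZE)
        return;                     /* placeholder policy: drop on overflow */
    memcpy(&log_buf[log_fill], rec, len);
    log_fill += len;
}

/* One open/write/sync/close per batch instead of per record.
 * Called from a background task, never from timing-sensitive code. */
FRESULT log_flush(void)
{
    FIL f;
    UINT written;
    FRESULT res = f_open(&f, "log.bin", FA_WRITE | FA_OPEN_APPEND);
    if (res != FR_OK)
        return res;
    res = f_write(&f, log_buf, (UINT)log_fill, &written);
    if (res == FR_OK)
        res = f_sync(&f);           /* the delayed commit point */
    f_close(&f);
    if (res == FR_OK)
        log_fill = 0;
    return res;
}
```

This does not make FAT updates atomic; it narrows the window. A reset now costs at most the RAM-buffered tail, and FAT/directory writes happen at one predictable point instead of being scattered through the control flow.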
Challenge 2: Severe Fragmentation Over Time
As files were appended, deleted, or replaced, FATFS fragmented rapidly.
Symptoms included:
- multi-second read times
- random slowdowns in data access
- pipelines missing timing deadlines
- inconsistent load times across devices
Fix
- Preallocate large files to avoid cluster scatter
- Replace append-heavy workloads with circular buffers
- Consolidate many small files into a single larger file
- Periodically defragment by rewriting the entire dataset
These alone reduced read-time variance dramatically.
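For preallocation specifically, FatFs offers f_expand (available when FF_USE_EXPAND is enabled in ffconf.h) to reserve a contiguous cluster run up front. The file name and size below are illustrative:

```c
/* Sketch: reserve one large contiguous file up front so later writes
 * never scatter clusters. Requires FF_USE_EXPAND == 1 in ffconf.h.
 * File name and size are illustrative. */
#include "ff.h"

FRESULT preallocate_dataset(void)
{
    FIL f;
    FRESULT res = f_open(&f, "data.bin", FA_WRITE | FA_CREATE_ALWAYS);
    if (res != FR_OK)
        return res;

    /* opt = 1 requests an immediate contiguous allocation and fails
     * cleanly (FR_DENIED) if no contiguous run is available. */
    res = f_expand(&f, 1024 * 1024, 1);

    f_close(&f);
    return res;
}
```

Subsequent writers then seek within the reserved region (circular-buffer style), so the cluster chain never changes after creation.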
Challenge 3: No Wear Awareness Means Accelerated Flash Aging
FATFS writes frequently to:
- FAT tables
- directory structures
- the first few erase blocks of flash
This created hotspots that experienced disproportionate wear.
Fix
- Minimize directory rewrites
- Rotate storage regions via custom offsets
- Reduce small file churn
- Shift volatile, frequently-updated data to a flash-friendly filesystem
These changes extended flash lifetime significantly.
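The region-rotation idea is easiest to show at the raw-flash level. All names and sizes below are hypothetical; the point is that a persisted, monotonically increasing epoch selects which group of erase blocks currently holds the hot data:

```c
/* Hypothetical sketch of rotating storage regions: instead of always
 * writing at a fixed base address, spread hot data across several
 * erase-block regions selected by a persisted epoch counter. */
#include <stdint.h>

#define REGION_SIZE  (64u * 1024u)   /* one rotation unit (erase-aligned) */
#define NUM_REGIONS  8u              /* size of the rotation window       */
#define AREA_BASE    0x00100000u     /* start of the rotated area         */

uint32_t region_base(uint32_t epoch)
{
    return AREA_BASE + (epoch % NUM_REGIONS) * REGION_SIZE;
}
```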
LittleFS: Amazing Under Power Loss, But Surprisingly Tricky
LittleFS is often marketed as “the filesystem that just works under power loss.”
It genuinely is — but it has its own set of real-world pain points.
Challenge 1: Metadata Map Growth
LittleFS tracks every file as its own metadata object, so thousands of tiny files mean thousands of metadata nodes.
This caused:
- directory traversals to slow
- mount times to spike
- metadata blocks to balloon
- free-space reports to become misleading
Fix
- Stop creating thousands of tiny files
- Switch to:
- journaling logs
- consolidated structured files
- daily or hourly rollovers
- Introduce automatic log compaction
After this, mount time dropped from unpredictable to consistent.
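A sketch of the consolidated approach using the LittleFS file API; the record format, hourly rollover scheme, and file naming are illustrative:

```c
/* Sketch: append records to one consolidated log file per hour instead
 * of creating thousands of tiny files. Assumes an already-mounted lfs_t
 * and an hour counter supplied by the application; naming is illustrative. */
#include <stdint.h>
#include <stdio.h>
#include "lfs.h"

int log_record(lfs_t *lfs, uint32_t hour, const void *rec, lfs_size_t len)
{
    char path[32];
    snprintf(path, sizeof path, "log-%02lu.bin", (unsigned long)(hour % 24));

    lfs_file_t f;
    int err = lfs_file_open(lfs, &f, path,
                            LFS_O_WRONLY | LFS_O_CREAT | LFS_O_APPEND);
    if (err < 0)
        return err;

    lfs_ssize_t n = lfs_file_write(lfs, &f, rec, len);
    int cerr = lfs_file_close(lfs, &f);   /* close commits the append */
    return (n < 0) ? (int)n : cerr;
}
```

Compaction then operates on a handful of rollover files in one pass, instead of walking thousands of metadata entries.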
Challenge 2: Write Amplification
A tiny config write could trigger:
- metadata updates
- block relocations
- garbage-collection runs
We measured up to 5× more flash writes than anticipated.
Fix
- Cache frequently accessed state in RAM
- Batch updates
- Migrate config formats to compact, single-write structures
- Reduce rewrite frequency of stable values
This dramatically reduced flash wear.
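A sketch of the compact, single-write config format; the struct layout is illustrative, and crc32 is an assumed helper, not a LittleFS function:

```c
/* Sketch: the whole configuration is one fixed-size, CRC-protected
 * record, rewritten in a single write. Layout is illustrative; crc32
 * is an assumed helper, not part of LittleFS. */
#include <stddef.h>
#include <stdint.h>
#include "lfs.h"

typedef struct {
    uint32_t version;
    uint32_t sample_period_ms;
    uint8_t  server_addr[4];
    uint32_t crc;                 /* CRC over all preceding bytes */
} config_t;

extern uint32_t crc32(const void *data, size_t len);   /* assumed helper */

int config_save(lfs_t *lfs, config_t *cfg)
{
    cfg->crc = crc32(cfg, offsetof(config_t, crc));

    lfs_file_t f;
    int err = lfs_file_open(lfs, &f, "config.bin",
                            LFS_O_WRONLY | LFS_O_CREAT | LFS_O_TRUNC);
    if (err < 0)
        return err;

    lfs_ssize_t n = lfs_file_write(lfs, &f, cfg, sizeof *cfg);
    int cerr = lfs_file_close(lfs, &f);
    return (n < 0) ? (int)n : cerr;
}
```

One field change costs exactly one file commit, instead of fanning out into many small rewrites of separate config files.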
Challenge 3: Random Latency Spikes From GC
LittleFS performs garbage collection opportunistically, inline with writes (lookahead scans and metadata compaction).
Under load, that meant:
- random multi-millisecond stalls
- blocking writes during critical logic
- interrupts missing their windows
These were the hardest to debug because they looked like timing bugs in unrelated code.
Fix
- Absolutely no filesystem writes in ISRs or time-critical paths
- Introduce a dedicated filesystem worker thread
- Buffer everything
- Schedule GC during idle periods
The system became far more deterministic.
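A sketch of the worker-thread pattern, assuming FreeRTOS; the queue depth, record shape, and write_record_to_fs helper are placeholders:

```c
/* Sketch: producers (including ISRs) only enqueue; a single
 * low-priority task owns every filesystem call, so GC stalls land
 * where they cannot break timing. Assumes FreeRTOS; queue depth,
 * record shape, and write_record_to_fs are placeholders. */
#include <stdint.h>
#include "FreeRTOS.h"
#include "queue.h"
#include "task.h"

typedef struct { uint8_t data[64]; uint16_t len; } fs_req_t;

static QueueHandle_t fs_queue;

extern void write_record_to_fs(const fs_req_t *req);   /* assumed helper */

/* Safe from ISRs: nothing here touches flash. */
BaseType_t fs_submit_from_isr(const fs_req_t *req)
{
    BaseType_t woken = pdFALSE;
    BaseType_t ok = xQueueSendFromISR(fs_queue, req, &woken);
    portYIELD_FROM_ISR(woken);
    return ok;
}

/* The only context allowed to call into the filesystem. */
static void fs_worker(void *arg)
{
    (void)arg;
    fs_req_t req;
    for (;;) {
        if (xQueueReceive(fs_queue, &req, portMAX_DELAY) == pdTRUE)
            write_record_to_fs(&req);   /* GC stalls happen here, harmlessly */
    }
}

void fs_worker_start(void)
{
    fs_queue = xQueueCreate(16, sizeof(fs_req_t));
    xTaskCreate(fs_worker, "fs", 2048, NULL, tskIDLE_PRIORITY + 1, NULL);
}
```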
SPIFFS: Simple, Lightweight… Until You Push It
SPIFFS shines when storing a handful of persistent files.
But as soon as the filesystem needed structure, or file churn increased, problems emerged.
Challenge 1: No Directory Support
We quickly hit limitations when trying to categorize or separate data.
Everything lived in a flat namespace, which led to naming collisions and scaling pain.
Fix
- Emulate a directory structure via naming conventions (sketched after this list)
- Or migrate to LittleFS where proper hierarchy was required
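The naming-convention workaround can be as simple as the sketch below; the separator and helper name are hypothetical. Keep SPIFFS's small object-name limit in mind (32 bytes by default):

```c
/* Sketch: fake a hierarchy on SPIFFS's flat namespace by encoding the
 * "directory" into the file name. Separator and helper are hypothetical;
 * SPIFFS sees the whole string as one flat name. */
#include <stdio.h>

/* e.g. make_name(buf, sizeof buf, "cfg", "wifi") -> "cfg~wifi" */
int make_name(char *buf, size_t cap, const char *dir, const char *file)
{
    int n = snprintf(buf, cap, "%s~%s", dir, file);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;   /* reject truncation */
}
```

Listing a "directory" then means iterating all files and matching on the prefix, which is another reason we eventually moved hierarchical data to LittleFS.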
Challenge 2: Slow Reads/Writes Under Increasing Load
SPIFFS is optimized for simple cases.
When storing a mix of large and small files:
- read times increased
- writes slowed due to full-block erasures
- latency became inconsistent
Fix
- Move large or frequently updated content to other filesystems
- Keep SPIFFS only for small, rarely updated system files
Challenge 3: Heavy Full-Block Rewrites
SPIFFS tends to rewrite entire blocks for even small updates.
This greatly accelerated wear.
Fix
- Avoid any append-heavy use case
- Keep files static
- Adopt alternative FS for dynamic workloads
Over time, most SPIFFS usage was migrated out.
Implementation Patterns That Finally Stabilized Everything
Pattern 1: Hybrid Filesystem Architecture
No filesystem does everything well.
We fixed stability by:
- placing frequently changing data in a flash-friendly FS
- placing bulk sequential data in a simple FS
- avoiding one-size-fits-all storage layouts
This reduced corruption and improved longevity.
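In code, the hybrid layout reduced to a small routing table; the data classes, mount points, and filesystem choices below are illustrative:

```c
/* Sketch: route each data class to the filesystem/partition suited to
 * its write pattern. Classes and mount points are illustrative. */
typedef enum { DATA_CONFIG, DATA_LOGS, DATA_ASSETS, DATA_CLASS_COUNT } data_class_t;

typedef struct {
    const char *mount;    /* where the data lives       */
    const char *fs;       /* which filesystem backs it  */
} storage_route_t;

static const storage_route_t routes[DATA_CLASS_COUNT] = {
    [DATA_CONFIG] = { "/cfg",   "littlefs" },   /* small, power-loss-critical   */
    [DATA_LOGS]   = { "/log",   "littlefs" },   /* high churn, wear-sensitive   */
    [DATA_ASSETS] = { "/media", "fatfs"    },   /* bulk sequential, PC-readable */
};
```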
Pattern 2: Never Allow Real-Time Logic to Write to Flash
At any point, the filesystem can stall.
Writes must be buffered, deferred, or batch-processed.
This alone eliminated:
- timing jitter
- DMA starvation
- ISR instability
Pattern 3: Avoid Small File Explosions
Tiny files bloat metadata, worsen fragmentation, slow mounts, and accelerate wear.
Consolidation was the universal solution.
Pattern 4: Validate Filesystem State After OTA or Crashes
We caught issues early by scanning and rebuilding metadata structures after updates.
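For the LittleFS volume, the validation step boils down to the mount-or-rebuild pattern; whether to reformat (discarding data) or drop into a recovery mode instead is a product decision:

```c
/* Sketch: validate the volume at boot and after OTA. If the metadata
 * did not survive, rebuild rather than running on a broken volume.
 * Formatting discards data; a recovery mode may be preferable. */
#include "lfs.h"

int fs_mount_checked(lfs_t *lfs, const struct lfs_config *cfg)
{
    int err = lfs_mount(lfs, cfg);
    if (err < 0) {
        err = lfs_format(lfs, cfg);   /* metadata unreadable: rebuild */
        if (err < 0)
            return err;
        err = lfs_mount(lfs, cfg);
    }
    return err;
}
```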
Takeaways: Filesystems Are Not Just APIs; They're Architecture
1. Decide based on workload, not marketing tags
Different workflows break different filesystems.
2. Flash writes must be treated as expensive and unpredictable
Never write synchronously from critical paths.
3. Hybrid layouts solve more than they complicate
No single FS is universally reliable.
4. Flash health monitoring is essential
Wear tells you where architecture is wrong.
5. Always assume the device will lose power at the worst possible moment
LittleFS shines here; FATFS does not.