ToF Calibration Persistence: Dual-Bank Flash Redundancy
A time-of-flight proximity sensor's accuracy depends entirely on three calibrated
numbers: an offset correction, a crosstalk coefficient, and a reference SPAD
count. Without them, the sensor's distance readings carry systematic errors of
tens of millimetres — enough to misfire a capture trigger or miss a detection
event entirely. At Hoomanely, the challenge was not calibrating the sensor. The
challenge was keeping that calibration alive across firmware updates. Our camera
node uses dual-bank flash for live over-the-air firmware upgrades. After a
bank swap, the logical address space flips. A naive calibration layout — one slot
at a fixed address — gets silently invalidated when the active bank changes. We
needed a storage architecture that would survive bank swaps, power failures
mid-write, and layout migrations from older firmware versions. Here is exactly
how we built it.
The Problem: Three Numbers That Must Survive Everything
A time-of-flight proximity sensor measures distance by emitting infrared pulses
and timing the round trip. Raw timing gives you a distance, but three systematic
error sources corrupt that reading in every real deployment.
Offset error arises from the sensor's mechanical mounting: the physical
distance between the emitter aperture and the mounting surface is never exactly
zero. On our camera node, where the sensor operates with a narrow 4×4-zone field
of view, even a 2 mm mounting offset translates to trigger misfires at the
boundary of the capture window.
Crosstalk is infrared energy that bounces off the sensor's cover glass and
returns to the detector without ever reaching the target. Every transparent or
translucent cover introduces crosstalk. Uncorrected, it makes every measurement
report shorter than the actual distance — the glass reflection looks like a
partial hit at a range you did not intend to detect.
Reference SPAD calibration optimises the single-photon avalanche diode array,
selecting the best-performing photodetectors and setting their bias correctly. A
sensor running on factory SPAD defaults has higher noise variance, which
translates directly to jitter in successive distance readings — the kind of jitter
that makes debounce logic unreliable.
All three parameters must be stored in non-volatile memory. The firmware
constraint that complicated this: our camera node's bootloader performs live OTA
updates by swapping which flash bank the MCU executes from. After a firmware
upgrade, the MCU may run from Bank 2 while Bank 1 becomes the inactive bank — or
the reverse, depending on the SWAP_BANK option bit in the flash option bytes.
Any calibration stored at a fixed logical address in Bank 1's tail could end up
in the wrong physical bank after a swap, silently invalidated.
A second failure mode appeared early in development: an OTA sector reservation
conflict. The bootloader reserves the last sector of each bank for OTA
metadata. Our first calibration layout placed the backup slot at bank-end-minus-one
— exactly overlapping the OTA reserved sector. A firmware update would silently
zero the calibration alongside the old metadata. No error code. No log entry.
Just a device that boots with factory defaults and fires at the wrong distances.

The Approach: Four Slots, One Contract
The design settled on four calibration slots: Bank1-Primary, Bank1-Backup,
Bank2-Primary, and Bank2-Backup. Every calibration write updates all four, one
slot at a time. Every boot reads from the first slot that passes validation and
stops. The redundancy ensures that a power loss mid-write, a flash write failure
on one sector, or a full bank swap cannot leave the device without valid
calibration.
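The write side of that contract can be sketched as a loop that attempts every slot and reports which ones succeeded. This is a minimal sketch, not the production driver: `store_all_slots`, `slot_write_fn`, and the fake writer are hypothetical names, and continuing past a failed slot is an assumed policy rather than something the firmware is confirmed to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_COUNT 4U

/* Hypothetical per-slot writer: erases, programs, and verifies one slot.
 * Returns true only when the slot ends up holding the new data. */
typedef bool (*slot_write_fn)(uint32_t addr, const void *data, size_t len);

/* Attempts every slot in order and returns a bitmask of the slots that
 * succeeded (bit i set means addrs[i] was written). A failed slot does not
 * abort the loop, so the remaining copies are still attempted. */
static uint32_t store_all_slots(const uint32_t addrs[SLOT_COUNT],
                                const void *data, size_t len,
                                slot_write_fn write_slot)
{
    assert(addrs != NULL && data != NULL && write_slot != NULL);
    uint32_t ok_mask = 0;
    for (size_t i = 0; i < SLOT_COUNT; i++) {
        if (write_slot(addrs[i], data, len)) {
            ok_mask |= ((uint32_t)1 << i);
        }
    }
    return ok_mask;
}

/* Stand-in writer for host-side testing: pretends one address is a bad sector. */
static uint32_t g_fake_bad_addr = 0;

static bool fake_write_slot(uint32_t addr, const void *data, size_t len)
{
    (void)data;
    (void)len;
    return addr != g_fake_bad_addr;
}
```

Because each slot is finished before the next one starts, a power loss can damage at most the slot being written at that moment; the others hold either the old or the new calibration, both of which validate.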
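The address calculation is where correctness lives. With 1 MB per bank and 8 KB
sectors, the bootloader's OTA metadata occupies the last sector of each bank.
Calibration slots therefore sit in the third-to-last and second-to-last sectors
(offsets −3 and −2 from the bank end), never in the last one (−1):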
/* Flash calibration layout — non-overlapping with OTA at -1 sector */
#define CAL_SECTOR_SIZE 8192UL /* 8 KB per sector */
#define CAL_BANK1_PRIMARY_ADDR (BANK1_BASE + BANK_SIZE - (3 * CAL_SECTOR_SIZE))
#define CAL_BANK1_BACKUP_ADDR (BANK1_BASE + BANK_SIZE - (2 * CAL_SECTOR_SIZE))
#define CAL_BANK2_PRIMARY_ADDR (BANK2_BASE + BANK_SIZE - (3 * CAL_SECTOR_SIZE))
#define CAL_BANK2_BACKUP_ADDR (BANK2_BASE + BANK_SIZE - (2 * CAL_SECTOR_SIZE))
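To make the arithmetic concrete, the sketch below repeats the macros with an assumed memory map and checks at compile time that the backup slot ends exactly where the OTA sector begins. The base address `0x08000000` and the `OTA_BANK*_ADDR` names are illustrative assumptions; substitute the values from your own linker script.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED memory map: the base address is illustrative, not from the article. */
#define BANK1_BASE      0x08000000UL
#define BANK_SIZE       0x00100000UL   /* 1 MB per bank, as stated in the text */
#define BANK2_BASE      (BANK1_BASE + BANK_SIZE)
#define CAL_SECTOR_SIZE 8192UL         /* 8 KB per sector */

#define CAL_BANK1_PRIMARY_ADDR (BANK1_BASE + BANK_SIZE - (3 * CAL_SECTOR_SIZE))
#define CAL_BANK1_BACKUP_ADDR  (BANK1_BASE + BANK_SIZE - (2 * CAL_SECTOR_SIZE))
#define CAL_BANK2_PRIMARY_ADDR (BANK2_BASE + BANK_SIZE - (3 * CAL_SECTOR_SIZE))
#define CAL_BANK2_BACKUP_ADDR  (BANK2_BASE + BANK_SIZE - (2 * CAL_SECTOR_SIZE))

/* Hypothetical names for the bootloader's reserved last sector in each bank. */
#define OTA_BANK1_ADDR (BANK1_BASE + BANK_SIZE - CAL_SECTOR_SIZE)
#define OTA_BANK2_ADDR (BANK2_BASE + BANK_SIZE - CAL_SECTOR_SIZE)

/* Compile-time guards: slots are sector-aligned and stop before the OTA sector. */
_Static_assert(CAL_BANK1_PRIMARY_ADDR % CAL_SECTOR_SIZE == 0, "slot must be sector-aligned");
_Static_assert(CAL_BANK1_BACKUP_ADDR + CAL_SECTOR_SIZE == OTA_BANK1_ADDR, "Bank 1 backup must end at OTA sector");
_Static_assert(CAL_BANK2_BACKUP_ADDR + CAL_SECTOR_SIZE == OTA_BANK2_ADDR, "Bank 2 backup must end at OTA sector");

/* Sector index of an address within its own bank (0 = first sector of the bank). */
static uint32_t sector_index_in_bank(uint32_t addr)
{
    assert(addr >= BANK1_BASE && addr < BANK2_BASE + BANK_SIZE);
    return (uint32_t)(((addr - BANK1_BASE) % BANK_SIZE) / CAL_SECTOR_SIZE);
}
```

With 128 sectors per bank, the primary slot lands in sector 125, the backup in sector 126, and the OTA metadata keeps sector 127. Had the original overlapping layout been guarded by assertions like these, the build would have failed instead of the field devices.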
The erase function must also handle bank swapping. Reading FLASH->OPTSR_CUR at
erase time tells the driver whether Bank 1 or Bank 2 is the physical primary. A
logical Bank 1 address maps to FLASH_BANK_2 when SWAP_BANK is set — and
erasing the wrong physical bank silently preserves stale calibration data:
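/* Read the live option bytes: SWAP_BANK reports whether the banks are swapped */
uint32_t optsr_cur = FLASH->OPTSR_CUR;
bool swap_bank = (optsr_cur & FLASH_OPTSR_SWAP_BANK_Msk) != 0;
/* The logical bank is decided purely by the address range */
bool is_logic_b1 = (sector_addr < BANK2_BASE);
/* With SWAP_BANK set, the logical-to-physical mapping is inverted */
uint32_t physical_bank = swap_bank
    ? (is_logic_b1 ? FLASH_BANK_2 : FLASH_BANK_1)
    : (is_logic_b1 ? FLASH_BANK_1 : FLASH_BANK_2);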
Without this mapping, an erase operation that believes it is targeting the logical
Bank 1 calibration slot — after a bank swap has occurred — would silently erase
the wrong physical sector. The calibration data it intended to update would remain
as the stale pre-swap version, and the next boot would load it as though it were
fresh.
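One way to keep this mapping honest is to pull it into a pure function that can be unit-tested on a host, with the register read done by the caller. The sketch below assumes an illustrative memory map and uses a local `phys_bank_t` enum as a stand-in for the vendor HAL's bank constants; `resolve_physical_bank` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMED memory map: the base address is illustrative, not from the article. */
#define BANK1_BASE 0x08000000UL
#define BANK_SIZE  0x00100000UL
#define BANK2_BASE (BANK1_BASE + BANK_SIZE)

/* Illustrative stand-ins for the vendor HAL's FLASH_BANK_1 / FLASH_BANK_2. */
typedef enum { PHYS_BANK_1 = 1, PHYS_BANK_2 = 2 } phys_bank_t;

/* Maps a logical sector address to the physical bank that must be erased.
 * swap_bank is the SWAP_BANK bit as read from the current option bytes. */
static phys_bank_t resolve_physical_bank(uint32_t sector_addr, bool swap_bank)
{
    assert(sector_addr >= BANK1_BASE && sector_addr < BANK2_BASE + BANK_SIZE);
    bool is_logic_b1 = (sector_addr < BANK2_BASE);
    /* Swapping inverts the mapping, so the result is an exclusive-or. */
    return (is_logic_b1 != swap_bank) ? PHYS_BANK_1 : PHYS_BANK_2;
}
```

All four combinations of logical bank and swap state fit in a four-line truth-table test, which is far cheaper than discovering a wrong-bank erase on hardware.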
The flash programming interface on this MCU requires 16-byte (128-bit) aligned
writes — a quad-word programming granularity. Calibration data smaller than 16
bytes must be padded to the next 16-byte boundary, with unused bytes initialised
to 0xFF to match the erased flash state, before any write call is made.
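The padding rule can be sketched as a small helper that builds the write buffer before it is handed to the flash driver. `padded_len` and `pad_for_flash` are hypothetical names used for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FLASH_WRITE_ALIGN 16U   /* quad-word (128-bit) programming granularity */

/* Rounds len up to the next multiple of 16 bytes. */
static size_t padded_len(size_t len)
{
    return (len + FLASH_WRITE_ALIGN - 1U) & ~(size_t)(FLASH_WRITE_ALIGN - 1U);
}

/* Copies src into dst and fills the tail with 0xFF so the padding matches
 * the erased flash state. Returns the number of bytes to program, or 0 if
 * dst is too small to hold the padded payload. */
static size_t pad_for_flash(const void *src, size_t len,
                            uint8_t *dst, size_t dst_cap)
{
    assert(src != NULL && dst != NULL);
    size_t total = padded_len(len);
    if (len == 0 || total > dst_cap) {
        return 0;
    }
    memset(dst, 0xFF, total);   /* pre-fill with the erased pattern */
    memcpy(dst, src, len);      /* then overlay the real payload */
    return total;
}
```

Filling with 0xFF rather than zero means the padding bytes leave those flash cells in their erased state.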

The Process: Load, Validate, Migrate, Protect
Loading at boot iterates six candidates in fixed priority order: the four
dual-bank slots plus the legacy slots used by older firmware layouts. For each
address, the code first checks whether the region is fully erased (all 0xFF
bytes) and, if so, skips it. Otherwise it reads the raw bytes into a temporary
struct and validates two fields: a format version constant and a 16-bit checksum
computed over every field except the checksum itself. The first candidate that
passes both checks is applied immediately, and no further candidates are checked:
for (size_t i = 0; i < ARRAY_SIZE(candidates); i++) {
    if (!is_valid_flash_address(candidates[i].addr)) continue;
    if (is_flash_erased(candidates[i].addr, sizeof(CalibData))) continue;
    memcpy(&calib, (const void *)candidates[i].addr, sizeof(CalibData));
    if (calib.version != CALIB_FORMAT_VERSION) continue;
    if (compute_checksum(&calib) != calib.checksum) continue;
    /* Valid — apply and return */
    apply_calibration(&calib);
    return CALIB_OK;
}
/* No valid candidate found */
return CALIB_NOT_FOUND;
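The loop relies on a record layout and a checksum routine that the snippet does not show. The sketch below is one plausible shape for them, under loudly stated assumptions: the `CalibData` field layout, the value of `CALIB_FORMAT_VERSION`, and the byte-wise additive checksum are all illustrative choices, not the production format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CALIB_FORMAT_VERSION 2U   /* ASSUMED value */

/* ASSUMED layout: 16 bytes, so one record fills exactly one quad-word. */
typedef struct {
    uint16_t version;         /* format version constant */
    uint16_t checksum;        /* covers every field except itself */
    int16_t  offset_mm;       /* offset correction */
    uint16_t ref_spad_count;  /* reference SPAD count */
    uint32_t xtalk_kcps;      /* crosstalk coefficient */
    uint32_t reserved;        /* keeps the struct at 16 bytes */
} CalibData;

_Static_assert(sizeof(CalibData) == 16, "CalibData must fill one quad-word");

/* Byte-wise 16-bit additive checksum that skips the checksum field itself.
 * A CRC-16 would detect more error patterns; this is the simplest variant. */
static uint16_t compute_checksum(const CalibData *c)
{
    assert(c != NULL);
    const uint8_t *bytes = (const uint8_t *)c;
    const size_t skip = offsetof(CalibData, checksum);
    uint16_t sum = 0;
    for (size_t i = 0; i < sizeof(*c); i++) {
        if (i >= skip && i < skip + sizeof(c->checksum)) continue;
        sum = (uint16_t)(sum + bytes[i]);
    }
    return sum;
}

/* True when every byte still holds the erased pattern 0xFF. */
static bool is_region_erased(const void *p, size_t len)
{
    const uint8_t *b = (const uint8_t *)p;
    for (size_t i = 0; i < len; i++) {
        if (b[i] != 0xFFU) return false;
    }
    return true;
}

/* Applies the same three checks as the boot loop to an in-memory copy. */
static bool calib_is_valid(const CalibData *c)
{
    if (is_region_erased(c, sizeof(*c))) return false;
    if (c->version != CALIB_FORMAT_VERSION) return false;
    return compute_checksum(c) == c->checksum;
}
```

Checking for the erased pattern first keeps a blank slot from being misread as a record with version 0xFFFF, and skipping the checksum field by offset keeps the routine correct if fields are reordered later.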
After loading, a migration check runs automatically. If the calibration came
from a legacy slot (old firmware layout, single bank only), or if not all four
new dual-bank slots contain valid data, the firmware promotes the loaded
calibration to all four current slots. This happens once, silently, on the first
boot that encounters the new firmware — no manual re-calibration is required in
the field.
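The migration trigger reduces to a small predicate over where the calibration came from and which dual-bank slots currently validate. This is a sketch of that decision only; `needs_migration` and its parameters are hypothetical names, and the actual promotion write is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DUAL_BANK_SLOT_COUNT 4U

/* Returns true when the loaded calibration should be promoted to all four
 * dual-bank slots: either it came from a legacy slot, or at least one of
 * the current slots does not hold valid data. */
static bool needs_migration(bool loaded_from_legacy,
                            const bool slot_valid[DUAL_BANK_SLOT_COUNT])
{
    assert(slot_valid != NULL);
    if (loaded_from_legacy) {
        return true;
    }
    for (size_t i = 0; i < DUAL_BANK_SLOT_COUNT; i++) {
        if (!slot_valid[i]) {
            return true;
        }
    }
    return false;
}
```

The same predicate also repairs a partially written set: if a power loss left only some slots valid, the next boot sees an invalid slot and rewrites all four from the copy that loaded.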
Sanity bounds are enforced on load: offset values outside ±100 mm and crosstalk
values above 20,000 kcps are rejected as physically implausible and the device
falls back to factory defaults with a warning log entry. A corrupt flash write
that passes the checksum but produces out-of-range values is caught before it
can move the capture window to an impossible distance.
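The bounds check itself is only a few comparisons. The sketch below uses the limits quoted above; the function and macro names are hypothetical, and treating the limits as inclusive (exactly ±100 mm and exactly 20,000 kcps are accepted) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CAL_OFFSET_LIMIT_MM 100      /* offsets outside +/-100 mm are rejected */
#define CAL_XTALK_MAX_KCPS  20000U   /* crosstalk above 20,000 kcps is rejected */

static_assert(CAL_OFFSET_LIMIT_MM > 0, "offset limit must be positive");

/* Returns true when both values are physically plausible. */
static bool calib_within_bounds(int32_t offset_mm, uint32_t xtalk_kcps)
{
    if (offset_mm < -CAL_OFFSET_LIMIT_MM || offset_mm > CAL_OFFSET_LIMIT_MM) {
        return false;
    }
    if (xtalk_kcps > CAL_XTALK_MAX_KCPS) {
        return false;
    }
    return true;
}
```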
Storing calibration requires an explicit software unlock. A static boolean
flag, calibration_write_enabled, defaults to false and acts as a write-protect
gate throughout the calibration subsystem. The flag must be set explicitly by
calling enable_calibration_write() before any store attempt. After a successful
four-slot write, the flag is automatically cleared — so an accidental
subsequent call to the store function cannot overwrite good data even if the
calling code forgets to re-enable protection. This is a belt-and-suspenders
design: the intent to write must be declared locally, and the window closes the
moment the write succeeds.
Why It Matters at Hoomanely
Hoomanely builds Physical Intelligence — a continuous, AI-powered health
monitoring platform that gives pet owners clinical-grade insight into their
animals' wellbeing before symptoms become visible. The camera node at the centre
of that system does not record video. It fires a burst capture only when the
proximity sensor confirms a pet is within the detection window — close enough that
the image is sharp, correctly framed, and diagnostically useful to the Biosense
AI Engine.
That detection window spans 30 mm to 400 mm. The calibration offset correction we
apply is on the order of tens of millimetres. An uncalibrated sensor running on
factory defaults fires at slightly wrong distances — triggering captures on
reflections at the wrong range, or missing the pet's approach because their face
sits just outside the corrected boundary. Neither failure produces a loud error.
Both produce a slow, silent degradation in capture quality.
Every image the Biosense AI Engine uses to build a pet's personalised health
baseline began with the proximity sensor firing at the right distance. A
calibration silently lost to a firmware update is not a visible fault — it is
weeks of subtly wrong capture statistics eroding the accuracy of the health model
that acts as the animal's long-term clinical record. The four-slot redundancy,
the bank-swap-aware erase logic, and the automatic migration path exist
specifically to ensure that a routine firmware upgrade cannot produce that class
of silent degradation.

Results: What the Architecture Delivered
The calibration system has handled every OTA update shipped to production hardware
without a calibration loss event. The three-step manual calibration procedure
— reference SPAD alignment with the sensor uncovered pointing to open space,
crosstalk measurement against a white reflective target at 400 mm, offset
measurement against the same target at 140 mm — produces a complete CalibData
struct that is written to all four flash slots with independent erase-verify-write
cycles per slot.
A boot after calibration finds Bank1-Primary valid and applies it in under 1 ms.
A boot after a bank-swapping OTA update finds Bank2-Primary valid — same
calibration values, different physical storage location — and applies them without
any re-calibration required. A boot on a brand-new device finds all six candidates
empty and applies factory defaults with a prominent log warning to trigger
a calibration run before deployment.
The automatic legacy migration path promoted all pre-existing single-slot
calibrations from earlier firmware versions into the full four-slot layout across
two hardware revisions — with zero field re-calibration events and zero slot
corruption reports since the architecture was deployed.
Key Takeaways
- Calibration storage must be OTA-aware. A fixed address in flash is not
safe when dual-bank firmware updates can swap the logical memory map. Compute
calibration addresses relative to bank geometry, not as absolute constants.
- Four copies cost almost nothing and recover everything. Four 8 KB sectors
of calibration storage across two banks (32 KB in total) are negligible in a
2 MB flash budget. The cost of losing calibration, weeks of degraded detection
quality, is not.
- Bank-swap-aware erase logic is non-optional. Reading OPTSR_CUR before
every erase is the only way to guarantee you are erasing the correct physical
sector after a bank swap has occurred.
- Software write protection must auto-close. A write gate that remains open
after a successful store is a bug waiting to corrupt the next valid calibration.
The enable call opens the gate; the successful write closes it, not the caller.
- Silent degradation is worse than a crash. Corrupted calibration does not
produce an error. It produces subtly wrong triggers for weeks. The redundancy
architecture exists specifically to eliminate this class of silent, undetectable
failure.
The proximity sensor is the first gate in Hoomanely's Physical Intelligence
pipeline. It decides whether an image gets captured, whether the Biosense AI
Engine receives a new data point for that pet, and whether the animal's health
baseline advances for that day. We built the four-slot calibration architecture
because we understood what a silent calibration loss would mean downstream — not
a crash, not a log entry, just health insights built on data that was slightly
wrong, slightly off-axis, slightly less trustworthy than it should have been.
Robust calibration persistence is not a firmware detail. At Hoomanely, it is
part of the clinical foundation.