Progress Update #3
Two steps forward and three steps back. A major change to the foundation of the HTH firmware.
Progress Update #3: Resetting the Firmware Foundation
These last couple of weeks haven't shown much forward progress. Instead, I've been digging into a handful of issues that I didn't want plaguing the project forever, and they have ended up being more of a rabbit hole than I expected.
I’ve spent most of the week splitting time between Christmas festivities and, in the (very little) downtime, pulling on a thread in the HTH firmware that eventually led me back to revisiting some early software design choices.
After sitting with it for a while, I’ve made the call to pivot away from the vendor-supplied RUI3 software development kit and rebuild the firmware on top of the community-driven WisBlock nRF52 API instead. This has meant throwing out a lot of working code and re-implementing things I already had working, which is always a bit disheartening. But it was becoming increasingly clear that some of the underlying issues I had been running into weren’t going to get easier with time, and I was likely going to need to make the switch in the future anyway, so I decided to bite the bullet and jump now.
Why I Decided to Reset
The vendor-supplied stack I had been using previously got me surprisingly far, and for the bulk of the things I tried it worked great. But as I moved closer to something I’d actually hand to another person, a couple of problems kept resurfacing, and I didn't have a great path forward on either of them.
Power management was the biggest one. I could get the system to sleep reliably, but not deeply enough to feel confident in long-term battery deployments. On my simplest test setups I was regularly seeing 500 microamps drawn from the battery during the deep-sleep cycle between sensor updates. By my rough math, an 1100 mAh battery gives about 13 weeks of runtime at that draw, which isn't terrible, but this doesn't account for the device periodically waking up to transmit data. This is likely workable for outdoor nodes with solar panel augmentation, but I don't love that being a hard requirement for every deployment, and I was pretty sure we could do better based on a handful of forum posts and my previous experience optimizing for power on my own hardware designs with other microcontrollers.
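For anyone who wants to check the rough math, here is a minimal sketch of the sleep-only estimate. It assumes the full rated capacity is usable and ignores self-discharge, regulator losses, and every wake-up, so treat the result as a ceiling rather than a prediction. The function name is just for illustration.

```python
HOURS_PER_WEEK = 24 * 7


def sleep_only_runtime_weeks(capacity_mah: float, sleep_current_ua: float) -> float:
    """Ideal runtime in weeks if the node did nothing but sleep.

    Assumes 100% of the rated capacity is usable (an optimistic simplification).
    """
    if capacity_mah <= 0 or sleep_current_ua <= 0:
        raise ValueError("capacity and current must be positive")
    sleep_current_ma = sleep_current_ua / 1000.0  # microamps -> milliamps
    hours = capacity_mah / sleep_current_ma       # mAh / mA = hours
    return hours / HOURS_PER_WEEK


# 1100 mAh battery at the 500 uA sleep current measured on the old firmware
print(f"{sleep_only_runtime_weeks(1100, 500):.1f} weeks")
```

With the numbers above this comes out to about 13 weeks before any transmit activity is counted.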
The other area where I've been having issues is the BLE (Bluetooth) stack. I’m pretty sure this one is self-inflicted and stems from my choice to support easy field upgrades of firmware. For a lot of technical reasons that I won't go into here, the vendor-provided firmware stack doesn't “just work” with the Meshtastic node hardware without a complete firmware update, and this does not fit with the simple field-update procedure I have in mind for this project. To get things working I had to make a handful of workarounds and compromises, one of which was to completely disable BLE. While technically speaking I don't need BLE for any of the node functions right now, I would like the option to leverage it for future firmware features if it makes sense. One in particular that would be interesting is a wireless firmware update that can be done if you are within BLE range.
None of these problems are catastrophic on their own. However, considering them in concert and looking at where I want to take this project in the future, I felt it was probably best to just take some time now and address the underlying cause.
My initial investigation indicated that setups using the WisBlock framework stack were showing much lower power consumption than I was seeing, so I figured I would start there. After confirming that I could reproduce a simple low-power demo, I pressed ahead full-steam to migrate core HTH features into the new framework. It took the bulk of the week to get most everything moved over, and for the majority of that time the firmware was completely broken as I was removing and replacing key functional blocks. At this point I've got both node and gateway firmware back up and running, and am actively working through initial regression tests to check for feature parity and stability compared to the old firmware. So far, it seems things are working better, but there are a few rough edges that need some more scrutiny before I am comfortable calling them done.
What Changed for the Better
One of the more encouraging things about the switch was how quickly my major long-standing issues disappeared.
Sleep current dropped immediately with very little time invested (this was my main goal). Where the previous firmware sat around half a milliamp, the new stack consistently lands closer to 90 microamps in deep sleep, and I think I can probably get it down close to 50 µA with some very minor design tweaks. At 90 µA the projected maximum runtime on the same 1100 mAh battery works out to roughly 1.4 years, and reaching 50 µA would push it past 2 years; either way, runtime is now largely a function of how often the sensor is sending updates. That's better than a 5x improvement and not bad for a week's worth of refactoring. On its own this change probably would have been worth the effort, but wait, there's more!
BLE also stopped feeling like a ticking time bomb that was going to need to be defused eventually. After the firmware migration BLE actually behaves the way I expect: stable connections, predictable behavior, and overall it just does what it's supposed to. That opens the door to much cleaner close-proximity provisioning and debugging workflows down the road.
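To put some shape on "a function of how often the sensor is sending updates", here is a rough duty-cycle sketch. The 90 µA sleep current and 1100 mAh capacity come from the numbers above; the 20 mA awake current and 2 second awake time are placeholder assumptions I picked for illustration, not measurements from the HTH hardware.

```python
HOURS_PER_YEAR = 24 * 365


def average_current_ua(sleep_ua: float, active_ma: float,
                       active_s: float, interval_s: float) -> float:
    """Time-weighted average current over one sleep/wake cycle, in microamps."""
    if not 0 <= active_s <= interval_s:
        raise ValueError("active time must fit inside the update interval")
    active_ua = active_ma * 1000.0            # milliamps -> microamps
    sleep_s = interval_s - active_s
    return (sleep_ua * sleep_s + active_ua * active_s) / interval_s


def runtime_years(capacity_mah: float, avg_current_ua: float) -> float:
    """Ideal runtime in years, assuming the full rated capacity is usable."""
    hours = capacity_mah / (avg_current_ua / 1000.0)
    return hours / HOURS_PER_YEAR


# Compare a few update intervals using the ASSUMED 20 mA / 2 s wake profile.
for interval_min in (5, 10, 60):
    avg = average_current_ua(sleep_ua=90, active_ma=20, active_s=2,
                             interval_s=interval_min * 60)
    print(f"every {interval_min:>2} min: {avg:6.1f} uA average, "
          f"{runtime_years(1100, avg):.2f} years")
```

Even with made-up wake numbers the pattern is the useful part: at short intervals the wake-ups dominate the budget, and at long intervals the sleep current does.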
What Got Rougher Instead
The framework reset hasn’t been a total win.
The radio stack works, but it feels less polished out of the box than what I had before. There is a lot more magical global state tracking going on and it has taken some careful tuning to get behavior back to where I wanted it, particularly around state transitions and edge cases. Overall the framework has felt less refined than the RUI3 SDK, but I'm willing to live with it so long as it technically works.
The WisBlock APIs also make some strong assumptions about how a project is supposed to work. In a few places, I had to work around hard-coded behavior to get the flexibility I needed. It’s manageable, but noticeable.
The AT command implementation is another area that’s functional but not particularly elegant. It does what it needs to do, but it’s rougher than the vendor-supplied RUI3 version and something I’ll want to spend some time cleaning up once the demo is stable.
I also ran into a minor quirk where the battery ADC seems to report nonsense values when no battery is connected. I don’t remember seeing that behavior before, so I’ll need to circle back at some point to investigate, but it seems to work fine when a battery is connected, and as of now that covers all of the planned use cases.
Where Things Stand Now
As of this morning, the system is fully up and running on the new firmware.
I have both a node and a gateway running the new code. The SHTC3 environmental sensor, the DS18B20 temperature probe, and the internal nRF52 die temperature sensor are all behaving as expected. I’ve been running a 3-day longevity test against the dashboard while I've been off celebrating Christmas with the family. When I checked in this morning, all the dashboard data was still coming in and looked as expected. Downlinks are working, including remote changes to the send-interval, which allows changing the uplink rate dynamically. I haven’t yet fully validated the end-to-end remote OTA provisioning flow, and this is likely to remain on the to-do list until after the demo is out for people to start experimenting with.
What I’m Focusing on Next
This coming week is about tightening the demo experience rather than building new infrastructure.
I need to validate the provisioning flow so users can reliably set a site password on a node, and I need to improve how the toolkit distinguishes between gateways and nodes when devices enumerate on different serial ports.
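One way the gateway/node distinction could work is to have the toolkit ask every serial device to identify itself and then sort the ports by the reply. The sketch below covers only the sorting step. The `ROLE=GATEWAY` / `ROLE=NODE` reply format is a made-up assumption for illustration, not the actual HTH AT command output, and the serial I/O that would collect the replies is left out.

```python
from typing import Dict, List


def classify_device(reply: str) -> str:
    """Map an identification reply to 'gateway', 'node', or 'unknown'.

    The ROLE=... tokens are hypothetical; swap in whatever the firmware reports.
    """
    text = reply.strip().upper()
    if "ROLE=GATEWAY" in text:
        return "gateway"
    if "ROLE=NODE" in text:
        return "node"
    return "unknown"


def group_ports_by_role(replies: Dict[str, str]) -> Dict[str, List[str]]:
    """Group serial port names by the role each device reported."""
    groups: Dict[str, List[str]] = {"gateway": [], "node": [], "unknown": []}
    for port in sorted(replies):  # sorted so the output order is stable
        groups[classify_device(replies[port])].append(port)
    return groups


# Example replies keyed by port name (all values are illustrative).
replies = {
    "/dev/ttyACM1": "ROLE=NODE\r\nOK",
    "/dev/ttyACM0": "role=gateway\r\nOK",
    "COM7": "",  # device that did not answer
}
print(group_ports_by_role(replies))
```

Keeping an explicit "unknown" bucket means a silent or unrelated serial device is never mistaken for a node just because it happened to enumerate first.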
What This Means for the Schedule
I’m still at least a week away from releasing the demo, possibly a bit more. But the important shift is that the remaining work feels concrete and bounded.
I’m no longer blocked by fundamental platform limitations and have what feels like a much firmer foundation on which I can continue building. At this point, it’s about finishing the demo features, polishing the walkthrough, and making sure the first user experience is solid. Thanks for reading along, and stay tuned: we are getting very close!
If you want to be notified as soon as the demo is ready, please consider subscribing. All subs get access to a totally free copy of the demo software, the demo firmware, and a detailed walk-through for building your own off-grid, low-power, temperature-sensing node with no vendor lock-in or cloud-backed data services.
