Morse_pageset_write -32768 / Command 44:f2 timeout on STA enable — FGH100M-H + ESP32-S3 SPI, reproducible across multiple boards

We’re hitting `morse_pageset_write failed to write page: -32768` repeatedly on STA enable. This is identical to the symptom in [Scan demo error messages](https://community.morsemicro.com/t/scan-demo-error-messages/) (Sep 2025). On that thread the staff response asked for a logic analyzer capture — we don’t have one available, so posting here in case (a) someone has since identified a software cause, (b) it was fixed in a later release, or (c) there’s a known config we’re missing.

---

### Hardware

- **Host:** XIAO ESP32-S3, ESP-IDF v5.1.1

- **HaLow module:** Seeed Wio-WM6108 via EXT01 B2B carrier

- **Chip:** MM6108, reports chip ID `0x306`

- **Module variant:** Quectel FGH100M-H (using `bcf_fgh100mhaamd.mbin`)

- **Tested across multiple HaLow boards — every one shows the identical failure**, so this isn’t a single-unit hardware fault.

### Software

- `mm-iot-esp32` v2.8.2, pure upstream from the official GitHub tag `2.8.2`

- morselib v2.8.2-esp32, FW 1.15.3 (`mm6108.mbin`, 368996 bytes — MD5 matches upstream byte-for-byte)

- Linking `libmorse_nocrypto.a` with the mbedtls shim from `mm_shims/crypto_mbedtls_mm.c`

- SPI mode, 40 MHz, manual CS (`spics_io_num = -1`)

- Pin map (per the EXT01 schematic):

| Signal | GPIO |
|---|---|
| `MM_RESET_N` | 1 |
| `MM_WAKE` | 2 |
| `MM_SPI_IRQ` | 3 |
| `MM_SPI_CS` | 4 |
| `MM_BUSY` | 5 |
| `MM_SPI_SCK` | 7 |
| `MM_SPI_MISO` | 8 |
| `MM_SPI_MOSI` | 9 |

- `CONFIG_MM_BCF_FGH100MHAAMD=y`

- `CONFIG_MBEDTLS_NIST_KW_C=y`

- `CONFIG_SPIRAM_MALLOC_ALWAYSINTERNAL=16384`

- Country code `"US"`, regulatory domain set via `mmwlan_set_channel_list()` before boot

- AP uses **SAE**, passphrase ~17 chars

### What works

```

HaLow DBG: step 4 — mmwlan_boot()

Actual SPI CLK 40000kHz

HaLow DBG: step 4a — disabling 802.11 power save

HaLow DBG: power save disabled OK

HaLow DBG: step 4b — mmipal_init(DHCP)

Morse LwIP interface initialised. MAC address 3c:22:7f:37:4e:fd

HaLow DBG: mmipal initialized with DHCP, link callback registered

HaLow: chip ID = 0x306, fw = 1.15.3, lib = 2.8.2-esp32

HaLow: SSID='' len=15 pass_len=17 security=SAE

HaLow: connection attempt 1/3 (50 s timeout)

```

The FW binary uploads successfully (369 KB transferred with no SPI errors), the chip ID and version read back correctly, mmipal initializes the LwIP netif, and the chip’s MAC address is read. Every synchronous host->chip transaction works.

### What fails

The instant `mmwlan_sta_enable()` triggers the internal scan task:

```

E 5074 morse_pageset_write failed to write page: -32768

E 5074 morse_pageset_tx could not write 1 pkts - rc=-32768 items=1 pages=1

E 5674 morse_pageset_write failed to write page: -32768

E 5674 morse_pageset_tx could not write 1 pkts - rc=-32768 items=1 pages=1

E 6274 Command 44:f2 timed out

E 6274 Failed to execute START HW_SCAN command

… repeats with Command 800c:102, then Command 19:112 …

E 16082 Command 19:112 timed out

E 16082 Health check failed (errno=-116)

… MMOSAL Assert → reset → boot loop

```

`spi_device_transmit()` returns `ESP_OK` on every transfer, so the SPI transactions complete at the hardware level on the host side. The chip appears to receive the data but does not acknowledge any pageset write.
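As a reading aid for the two numeric codes in the log above, here is a minimal sketch that can be compiled off-target. It is not taken from morselib. It only shows that -32768 is the minimum signed 16-bit value (0x8000), which is not an errno number, and that 116 is `ETIMEDOUT` in newlib's errno numbering. Treating -32768 as a generic failure sentinel inside the driver is an assumption, and the helper names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* newlib defines ETIMEDOUT as 116. It is hard-coded here because a host libc
 * such as glibc uses a different number for the same name. */
#define NEWLIB_ETIMEDOUT 116

/* Hypothetical helper: describe a negative return code seen in the log. */
static const char *describe_rc(int rc)
{
    if (rc == INT16_MIN) {
        /* 0x8000 read as a signed 16-bit value. Assumption: a generic
         * failure sentinel rather than a specific error cause. */
        return "INT16_MIN (0x8000), not an errno value";
    }
    if (rc == -NEWLIB_ETIMEDOUT) {
        return "-ETIMEDOUT (newlib numbering)";
    }
    return "unrecognised";
}

/* Hypothetical helper: print an interpretation of both codes from the log. */
static void print_log_codes(void)
{
    printf("rc=-32768  -> %s\n", describe_rc(-32768));
    printf("errno=-116 -> %s\n", describe_rc(-116));
}
```

If that reading holds, the health-check failure is a plain timeout that follows from the earlier page-write failures, not a separate fault.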

### Already ruled out

- **PSRAM DMA buffers.** Overrode `mmosal_malloc_` to use `heap_caps_malloc(MALLOC_CAP_INTERNAL | MALLOC_CAP_DMA)`. Separately added bounce-buffering in `mmhal_wlan_spi_write_buf` to catch any non-DMA-capable address. No change.

- **Multi-transaction CS atomicity.** Wrapped `cs_assert`/`cs_deassert` with `spi_device_acquire_bus`/`release_bus`. No change.

- **SPI clock.** 40 MHz is required; dropping to 20 MHz breaks transport init (`mmwlan_boot()` returns `err=1`). 40 MHz is well within the 50 MHz max.

- **Power save timing.** `mmwlan_set_power_save_mode(MMWLAN_PS_DISABLED)` called immediately after `mmwlan_boot()` returns `MMWLAN_SUCCESS`. Calling it *before* boot corrupts internal state and breaks transport init.

- **BUSY pin.** `CONFIG_MM_BUSY=5` (separate from `MM_SPI_IRQ=3`) per the EXT01 schematic. Setting both to GPIO 3 causes an interrupt-storm WDT timeout during boot, confirming they must be on different pins. With pulldown enabled BUSY reads LOW (= not busy) at all observed times.

- **SDK modifications.** Reverted our `mm-iot-esp32-v282` tree to byte-identical upstream `2.8.2` and re-applied only a minimal shared-SPI-bus refactor (~4-line diff in `wlan_hal.c` so an SD card can coexist on `SPI2_HOST`). Failure unchanged.

- **Single-board hardware fault.** Multiple HaLow boards tested; all fail identically.
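For anyone who wants to reproduce the PSRAM check from the first item, the bounce-buffering can be sketched roughly as follows. This is a minimal sketch, not our actual `mmhal_wlan_spi_write_buf` patch: `write_with_bounce`, `spi_write_fn` and `capture_write` are hypothetical names, and the `#else` branch contains host stand-ins so the control flow can be exercised off-target. On target it relies on the ESP-IDF calls `esp_ptr_dma_capable()`, `heap_caps_malloc()` and `heap_caps_free()`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#ifdef ESP_PLATFORM
#include "esp_heap_caps.h"
#include "esp_memory_utils.h"
static bool buf_is_dma_capable(const void *p) { return esp_ptr_dma_capable(p); }
static void *dma_alloc(size_t n)
{
    return heap_caps_malloc(n, MALLOC_CAP_INTERNAL | MALLOC_CAP_DMA);
}
static void dma_free(void *p) { heap_caps_free(p); }
#else
/* Host stand-ins (hypothetical): a flag decides whether a buffer "is" DMA-capable. */
static bool g_host_dma_capable = false;
static bool buf_is_dma_capable(const void *p) { (void)p; return g_host_dma_capable; }
static void *dma_alloc(size_t n) { return malloc(n); }
static void dma_free(void *p) { free(p); }
#endif

/* Hypothetical callback standing in for the real SPI write routine. */
typedef int (*spi_write_fn)(const uint8_t *buf, size_t len, void *ctx);

/* Write len bytes, copying through an internal DMA-capable buffer only when
 * the caller's buffer is not DMA-capable. Returns the callback's result, or
 * -1 if the bounce buffer cannot be allocated. */
static int write_with_bounce(const uint8_t *buf, size_t len,
                             spi_write_fn write_fn, void *ctx)
{
    if (len == 0 || buf_is_dma_capable(buf)) {
        return write_fn(buf, len, ctx);      /* fast path: no copy */
    }
    uint8_t *bounce = dma_alloc(len);
    if (bounce == NULL) {
        return -1;
    }
    memcpy(bounce, buf, len);
    int rc = write_fn(bounce, len, ctx);
    dma_free(bounce);
    return rc;
}

/* Example callback: records whether it was handed the original buffer. */
struct capture { const uint8_t *orig; bool bounced; size_t len; uint8_t first; };
static int capture_write(const uint8_t *buf, size_t len, void *ctx)
{
    struct capture *c = ctx;
    c->bounced = (buf != c->orig);
    c->len = len;
    c->first = len ? buf[0] : 0;
    return 0;
}
```

Allocating per transfer keeps the sketch short. A real HAL would more likely keep one preallocated bounce buffer, since allocating on every burst adds latency on a timing-sensitive path.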

### Asks

1. Is `morse_pageset_write -32768` documented internally as a known failure mode for FGH100M-H + SAE on morselib 2.8.2? Has it been fixed in 2.9.x / 2.10.x?

2. Was a root cause ever identified privately on the Sep 2025 thread above?

3. Is there a required mmwlan API call between `mmwlan_boot()` and `mmwlan_sta_enable()` for FGH100M-H specifically — `mmwlan_set_subbands_enabled`, an explicit BCF override, anything else?

4. Any chip-side debug we can enable (verbose log level, debug morselib build, config flag) to see *why* the chip rejects the writes?

Happy to provide additional logs or config dumps. Thanks!

I used to have these issues persistently as well. They stopped when I migrated my code to use the flow control implementation for TX rather than what was shown in the examples. Good luck.

I’m not sure if it will fix your issue, but there has definitely been some work in the scan state machine to improve its robustness between v2.8 and v2.11.
I would strongly recommend you try an updated version of the SDK to see if it fixes your issue, or at least rule out “buggy software” as the issue.

Thanks @dmckeown and @Idennis. Quick update:

Migration to mm-iot-esp32 2.10.4: I followed Idennis’s advice and upgraded from 2.8.2. The build is clean — lib=2.10.4-esp32 and fw=1.17.6 reported by the chip. However, my known-good camera-skipped baseline (which worked on 2.8.2) now also fails on 2.10.4 with the same pageset_tx rc=-32768 signature. The chip reaches CONNECTED (2), mmipal reports L2 Link Up at RSSI -35 dBm, then every page write fails starting with the first data frame. Eventually the system panics in mmosal_mutex_get with a spinlock assert (recursive lock detected) — secondary fault from the retry recursion.

I targeted 2.10.4 because I couldn’t find a publicly-released 2.11.x for ESP32. The MorseMicro/mm-iot-esp32 repo’s latest tag is 2.10.4 (last push 2026-03-13). MorseMicro/mm-iot-sdk is at 2.11.2 but only ships ARM Cortex-M morselib archives (arm-cortex-m33f, m4f, m7f) with STM32-based BSPs — no Xtensa/ESP32 build that I could see. More on this in question 1 below.

My integration context (in case it suggests anything):

  • Hardware: Seeed XIAO ESP32-S3 Sense + Quectel FGH100M-H (Wio-WM6108) on the EXT01 B2B carrier. Chip 0x306. BCF: bcf_fgh100mhaamd.mbin, reports BCF API 8.0.0 on 2.8.2.

  • Shared SPI bus architecture: SD card and HaLow share SPI2_HOST. The bus is initialised once at boot with SPICOMMON_BUSFLAG_MASTER + max_transfer_sz=0 (matching upstream sta_connect example). Switching active device only reroutes MISO via the GPIO matrix (esp_rom_gpio_connect_in_signal(gpio, FSPIQ_IN_IDX, false)) — the bus is never torn down. Our patched mmhal_wlan.c calls our spi_bus_manager_init() in wlan_hal_spi_init() (replacing spi_bus_initialize), uses SHARED_SPI_HOST in spi_bus_add_device, and removes spi_bus_free from deinit. This worked on 2.8.2 for the camera-skipped path.

  • Bounce buffers in spi_master_rw: defensive bounce of any non-DMA-capable buffer (PSRAM) to internal DMA-capable memory before the transfer. Camera capture fragments PSRAM, so this is needed when camera is active.

  • WiFi/BLE coex is disabled (# CONFIG_ESP_WIFI_SW_COEXIST_ENABLE is not set) — in 2.8.2 the coex firmware loading at boot was disrupting morselib’s burst TX timing.

  • The standalone upstream sta_connect example on 2.8.2 connected fine on the same hardware in ~7 s, so the hardware/AP/RF side is known-good.

Specific questions:

  1. SDK version for ESP32: Is there a 2.11.x build for Xtensa/ESP32 available (perhaps through the customer portal) that hasn’t been pushed to MorseMicro/mm-iot-esp32 on GitHub? If ESP32 has been dropped from the 2.11.x line, is mm-iot-esp32 2.10.4 effectively the final ESP32 release, or will updates continue on a separate cadence?

  2. BCF compatibility: Our bcf_fgh100mhaamd.mbin reports BCF API 8.0.0. Is this compatible with the chip firmware shipped in mm-iot-esp32 2.10.4 (1.17.6)? The 2.10.4 SDK no longer ships board-specific BCFs in framework/morsefirmware/ (only mm6108.mbin and mm8108b2-rl.mbin), so I’m carrying our old BCF forward. Is there a newer BCF for Quectel FGH100M-H available, and if not, what’s the expected procedure for projects using customer-provided BCFs to verify forward compatibility?

  3. Shared SPI bus: Has anyone successfully integrated an external SPI-attached peripheral (in our case SD card) on the same SPI host as the HaLow chip, using GPIO matrix MISO rerouting? Are there gotchas with shared-bus arbitration we should be aware of in 2.10.4 specifically?

  4. Flow-control hook: I see mmhal_wlan_pktmem_tx_flow_control_state as an outward HAL function in libmorse_nocrypto.a (both 2.8.2 and 2.10.4). Is there documentation on what our HAL implementation should return, and when it’s queried? If our HAL is reporting “ready” when the chip isn’t, that could explain the persistent -32768 retries.

  5. Camera + HaLow coexistence: Has anyone integrated an ESP32 camera (esp_camera driver, DVP interface + GDMA + LCD_CAM peripheral) in the same project as HaLow? esp_camera_deinit() leaves LCD_CAM peripheral clock, LEDC channel, I2C1 peripheral, and the camera GPIO matrix routing in place — could any of those residual states plausibly affect HaLow’s burst SPI TX timing? Our 2.8.2 symptom was: HaLow fully works without camera, but with esp_camera_init()/deinit() earlier in the same boot, the chip rejects every burst write after reaching CONNECTED.

  6. Our specific symptom on 2.10.4: morse_pageset_tx ... rc=-32768 on every page write after the L2 link comes up — even with camera fully disabled. Does this match a known signature in the scan-state-machine work between 2.8 and 2.11? Any debug knob in 2.10.4 morselib I should turn on to see the actual chip-side error code?

We will be archiving mm-iot-esp32 in favour of the morsemicro/halow component (v2.10.4-esp32-1) on the ESP Component Registry.

This will lag the mm-iot-sdk releases by a couple of weeks. We will update it soon.

For the FGH100MH, a version 8 BCF won’t cause you any issues.

Yes, have a look at the WLAN Datapath API section of the Morse Micro IoT SDK documentation, though this function won’t be helpful to you.

We normally don’t recommend sharing the bus, as our chip follows the SDIO-over-SPI spec, which requires some special handling of the CS line. Have you tested without the SD card attached, and verified that the FGH100MH works on its own?

The cause of this specific error can vary. Most recently I saw it caused by a starved health check thread.

Can you try the workaround described towards the end of post #3 by Roy in the “ARM-CORTEX-M33 with no DSP support” thread to determine whether a starved health check is the issue?

If so, you may need to balance any thread priorities for software you have added.

Yes — we verified the FGH100MH works on its own. A standalone sta_connect example (no SD, just XIAO ESP32-S3 + EXT01 + WM6108) connects in ~7 s on our hardware. The -32768 failures and SDIO timeouts only appear when SD shares SPI2. Given your note about SDIO-over-SPI CS handling, what’s the recommended pattern for sharing the bus with SDSPI in ESP-IDF, or do we need to move SD to a separate SPI host / SDMMC peripheral?

Thanks @ajudge — significant progress to report, and a question for you
once you’ve seen the data.

Migrated to morsemicro/halow^2.10.4-esp32-1 via the ESP Component
Registry on ESP-IDF v5.4.2. Build is clean.

  • Bare-board validation succeeds in ~3.6 s (XIAO ESP32-S3 + EXT01 +
    Wio-WM6108, no SD, no other peripherals).
  • Same package on the assembled DAQ board fails the same way as the
    v282 SDK did: chip boots fine (BCF API 8.0.0, mf16858, Morselib 2.10.4,
    FW 1.17.6, chip 0x0306, MAC reads correctly), mmhalow_connect() issues
    Attempting to connect to: halowlink2-6bc7, then the chip refuses page
    reads/writes during the SAE auth exchange:
    morse_pageset_read[600] Failed to read page: -32768.
  • Roy’s workaround (forcing every morselib task to
    configMAX_PRIORITIES - 1 in mmosal_task_create) does not help on
    this combination. It actually makes things worse: the chip’s digital
    reset fails before connect is even attempted.
  • We’re stuck at the SD-coexistence question. Need your guidance.

Roy’s priority workaround (negative result)

Applied per the “ARM-CORTEX-M33 with no DSP support” thread post #3 — every
morselib task in mmosal_task_create forced to configMAX_PRIORITIES - 1.
Did not help on v282 + IDF 5.1.1 (already reported). On this new stack
(morsemicro/halow 2.10.4-esp32-1 + IDF v5.4.2), it actively breaks the chip
boot: mm610x_digital_reset[116] Failed × 2, garbage chip ID 0x8200f4xx
then 0x0000, no recovery.

Questions

  1. SD coexistence on the same SPI host: any supported pattern? The SD
    card is physically on the bus even when its driver is detached and its
    CS is HIGH. DAQ’s design holds SD CS HIGH for the entire HaLow window.
    Does the MM6108 require electrical isolation of other devices on the
    bus (i.e. nothing else physically wired even with CS HIGH), or is there
    a software-only setup (transaction queue, CS hold time, specific
    spi_bus_initialize flags) that can make SD’s presence tolerable? Is the
    onboard SD card reader of the ESP32-S3 board unusable by default when
    paired with the HaLow board?

  2. spi_bus_initialize: SPI bus already initialized on the new package.
    We pre-initialize SPI2 in our spi_bus_manager for the shared-bus
    design. mmhalow_init then calls spi_bus_initialize again, gets
    ESP_ERR_INVALID_STATE, and proceeds. Empirically the chip then boots
    on DAQ (when SD CS is HIGH). On the bare board where the bus
    isn’t pre-initialized, the chip also boots fine. But if we skip our
    pre-init on the DAQ board, the chip’s digital reset fails. So the
    “already initialized” path appears to be functionally important for
    us — is this intended, or a side-effect we shouldn’t rely on?

  3. SPI3 (SPI3_HOST) for SD instead of SPI2: software-only relocation.
    Are there known issues running SD on a separate ESP32-S3 SPI host while
    HaLow has SPI2 exclusively, given the chip’s CS handling requirements?
    We may have pin freedom for this if you confirm it’s a clean path.

  4. SDMMC instead of SPI for SD: PCB rework required (no pin freedom on
    current hardware), but worth knowing if it would be the most reliable
    long-term option per your experience.

Bonus: four Windows package bugs we patched locally

While bringing up the validation project on Windows + IDF 5.4.2, we hit
four issues in the morsemicro/halow 2.10.4-esp32-1 package CMake.
Sending these along since you might want them upstream:

  1. components/shims/CMakeLists.txt lines 25 and 37 — references
    idf::firmware but the managed-component target name is
    idf::morsemicro__firmware. Fails at the Generate step with “dependency
    target idf::firmware does not exist”. Affects all platforms.

  2. components/morselib/CMakeLists.txt line 116 — regex
    objcopy$ doesn’t match the Windows binary name objcopy.exe. Needs
    objcopy(\.exe)?$. Without the fix, the libmorse mangling step ends up
    calling …objcopy.exear and …objcopy.exeranlib (no separator).
    Windows-only.

  3. Same file, AR/RANLIB paths: ${AR} derived from ${CMAKE_OBJCOPY}
    keeps forward slashes that confuse cmd.exe when used as a pipe target
    (| C:/.../ar.exe -M is parsed with C: as the drive-change builtin).
    Worked around with file(TO_NATIVE_PATH "${AR}" AR) plus explicit
    .exe suffixes on Windows. Windows-only.

  4. AR MRI archive script (libmorse.ar) generated in the same file:
    the CREATE/ADDLIB commands embed absolute paths unquoted, and AR’s
    MRI parser splits on whitespace. Any project path containing a space
    (C:\Users\...\DAQ\...) causes No such file or directory at
    mangle time. We worked around by relocating the project to a space-free
    path; the upstream fix would be to either change the working directory
    for AR so the script can use relative paths, or to copy the inputs to a
    temp path with no spaces before running AR. Affects any platform with
    spaces anywhere in the build tree.
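The regex change in item 2 can be checked off-target. CMake uses its own regex engine, but for these two patterns the `$` anchor, the `( )?` optional group and the escaped dot behave the same way in POSIX extended regular expressions, so a small C check is enough to illustrate the difference. This is a minimal sketch: `ere_matches` is a hypothetical helper, and the tool file names are only examples of the names involved.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if pattern (POSIX extended syntax) matches anywhere in text. */
static bool ere_matches(const char *pattern, const char *text)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
        return false;   /* treat an invalid pattern as "no match" */
    }
    bool ok = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}
```

With this helper, `objcopy$` matches `xtensa-esp32s3-elf-objcopy` but not `xtensa-esp32s3-elf-objcopy.exe`, while `objcopy(\.exe)?$` (written `"objcopy(\\.exe)?$"` in a C string literal) matches both.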

Thanks again — looking forward to your guidance on the SD path. Once we
have that locked in we’re good to go.

Our official recommendation is to not share the bus. We have some tight timing requirements that are often impacted by other devices on a shared bus - likely what you are seeing here.

Thanks for sharing the patches - we will address them in a bugfix release shortly.