Quectel FGH100M-H throughput and SPI Test Mode Results

I am not getting the throughput I expect from the Quectel FGH100M-H.

My questions are:
Is the throughput I am seeing expected for an SPI clock of 40 MHz?
What should I check to make sure the module is configured correctly?

What I have done so far:
I initially had the SPI clock at 25 MHz and was seeing throughput of about 8 Mbps. Raising the SPI clock to 40 MHz is a 1.6x increase, so I expected throughput to rise by a similar factor, but I only saw a 25% increase.

I also tested with the TX rate fixed and saw no improvement.

These are the morse kernel module parameters I am using:

    boot.kernelModules = lib.mkAfter [ "morse" "dot11ah" ];
    boot.extraModprobeConfig = ''
      options morse country=US spi_clock_speed=40000000 test_mode=6
    '';

Here is the derivation I use to compile the morse driver modules.
The relevant part should just be the makeFlags.

drv = stdenv.mkDerivation rec {
    pname = "morse_driver";
    inherit version;

    outputs = [ "out" "modules" ];

    # 6d1b281ae79ff51b6e03b0f34f95f4f7c701aac3 version 1.15.3


    # 33f0092d5c859d8589b99c6e9a77abc58c093648 version 1.14.1
    # sha256-VNw5OdxkP8GltYU2kDYdBUGKWIlPhr3tkzqtW+3MigE=

    src = fetchFromGitHub {
      owner = "MorseMicro";
      repo = "morse_driver";
      rev = "6d1b281ae79ff51b6e03b0f34f95f4f7c701aac3";
      sha256 = "sha256-dKs4CRpzJxzUVls1ovqgm9oBEwQxTS1vLz5x2xpbiSw=";
      fetchSubmodules = true;
    };

    nativeBuildInputs = [ gnumake ];
    buildInputs = [ kernel.dev ];

    # Disable -Werror to ignore type mismatch warning
    postPatch = ''
      substituteInPlace Makefile \
        --replace "-Werror" ""
    '';

    makeFlags = [
      "V=1"
      "KBUILD_VERBOSE=1"
      "quiet=" # Disable quiet mode
      "Q=" # Disable @ prefix
      "MORSE_TRACE_PATH=."
      "ARCH=arm64"
      "KERNEL_SRC=${kernel.dev}/lib/modules/${kernel.modDirVersion}/build"
      "CONFIG_WLAN_VENDOR_MORSE=m"
      "CONFIG_MORSE_SPI=y"
      "CONFIG_MORSE_USER_ACCESS=y"
      "CONFIG_MORSE_VENDOR_COMMAND=y"
      "CONFIG_MORSE_MONITOR=y"
      "CONFIG_MORSE_ENABLE_TEST_MODES=y"
      "CONFIG_ANDROID=n"
    ] ++ lib.optionals shouldCross [
      # nix-repl> outputs.legacyPackages.aarch64-linux.stdenv.hostPlatform.config
      # "aarch64-unknown-linux-gnu"
      "CROSS_COMPILE=${stdenv.hostPlatform.config}-"
    ];

    installPhase = ''
      # Install kernel modules into the "modules" output
      mkdir -p "$modules/lib/modules/${kernel.modDirVersion}/extra"
      cp morse.ko "$modules/lib/modules/${kernel.modDirVersion}/extra/"
      cp dot11ah/dot11ah.ko "$modules/lib/modules/${kernel.modDirVersion}/extra/"

      mkdir -p "$out"
      echo "morse_driver ${version}" > $out/README
    '';
  };

My morse firmware blobs are packaged here:

let
  inherit version;
  fw_bins = fetchurl {
    # url = "https://github.com/MorseMicro/firmware_binaries/releases/download/v${version}/morsemicro-fw-rel_1_14_1_2024_Dec_05.tar";
    url = "https://github.com/MorseMicro/firmware_binaries/releases/download/v1.15.3/morsemicro-fw-rel_1_15_3_2025_Apr_16.tar";
    # hash = "sha256-VufNPsal8TIbm+PTVRyWW3di1PGA+kFaOKNl3bL0PNE=";
    hash = "sha256-3n4+nxt7l6/51jaHxNPpn+JUT8/H5VZ98asNoZRlZGk=";
  };

  quectel_bcf = fetchFromGitHub {
    owner = "MorseMicro";
    repo = "morse-firmware";
    # v14.1
    # rev = "b5f4c765eb90524e8b1d378ac9ef155a54f73ffa";
    # sha256 = "sha256-LZj7pPKtEt27470JVN/dG4bKBkuw8yLqcGs8DemzrR8=";

    # v15.3
    rev = "b5a18499b605cfb3e783eccbb64a210d88d1ee69";
    sha256 = "sha256-m1Z40oIAj1IKW5KhCYVRBqtqBiWdEkVrmpyHzkOe+po=";
  };
in
stdenv.mkDerivation {
  pname = "morse_fw_blobs";
  inherit version mm_module;

  unpackPhase = "true";
  patchPhase = "true";
  configurePhase = "true";
  buildPhase = "true";
  checkPhase = "true";

  installPhase = ''
    mkdir -p $out/lib/firmware/morse
    tar -xf ${fw_bins} -C $out/lib/firmware/morse --strip-components=3 --wildcards '*/${mm_module}.bin'
    cp ${quectel_bcf}/bcf/quectel/bcf_fgh100mhaamd.bin $out/lib/firmware/morse/bcf_quectel_fgh100mhaamd.bin
  '';
}

The wpa_supplicant I use has the following config:

stdenv.mkDerivation rec {
  pname = "wpa_supplicant_s1g";
  inherit version;

  src = fetchFromGitHub {
    owner = "MorseMicro";
    repo = "hostap";
    # v14.1
    # rev = "415d1757c25357e0d5423fe5f025e4384be7cb1b";
    # sha256 = "sha256-NusuRWG99v6AJUOpfsKY7xFSikfrt8ke8eSmH/eQ0sI=";
    # v15.3
    rev = "e8b2e339ac11fdf8861930f2a7b0f1f67d9a82f2";
    sha256 = "sha256-IOJore8wkMGcNFZ+87QuEZLJOmf2yo33jE2zhKTCaKE=";
  };


  nativeBuildInputs = [ gnumake pkg-config ];
  buildInputs = [ openssl libnl ];

  preBuild = ''
    cd wpa_supplicant
    cat > .config <<EOF
    CONFIG_DRIVER_NL80211=y
    CONFIG_CTRL_IFACE=y
    CONFIG_SME=y
    CONFIG_AP=y
    CONFIG_P2P=y
    CONFIG_IEEE80211AH=y
    CONFIG_SAE=y
    EOF
  '';

  installPhase = ''
    mkdir -p "$(dirname "$out/wpa_supplicant_s1g")"
    cp wpa_supplicant_s1g $out/wpa_supplicant_s1g
    
    mkdir -p "$(dirname "$out/wpa_supplicant_s1g.conf")"
    cp ${./wpa_supplicant_s1g.conf} $out/wpa_supplicant_s1g.conf
  '';
}

I ran the SPI test mode (test_mode=6) as suggested here: what-is-maximum-speed-of-mm6108

[   22.402469] Morse Micro Dot11ah driver registration. Version 0-rel_1_15_3_2025_Apr_16
[   22.416465] morse micro driver registration. Version 0-rel_1_15_3_2025_Apr_16
[   22.416717] morse_spi spi0.0: morse_of_probe: Reading gpio pins configuration from device tree
[   22.416789] uaccess char driver major number is 234
[   22.416930] morse_io: Device node '/dev/morse_io' created successfully
[   22.425562] morse_spi spi0.0: Loaded firmware from mm6108.bin, size 444304, crc32 0x1c6a0f92
[   22.426193] morse_spi spi0.0: Loaded BCF from bcf_default.bin, size 1251, crc32 0x941b2a82
[   22.836275] morse_spi spi0.0: Bus IO write estimator
[   22.836289] morse_spi spi0.0:     packet size (bytes): 1460
[   22.836296] morse_spi spi0.0:     overhead (bytes):    102
[   22.836301] morse_spi spi0.0:     padding (bytes):     2
[   22.836307] morse_spi spi0.0:     batch(es):           16
[   22.836312] morse_spi spi0.0:     rounds:              10
[   22.954776] morse_spi spi0.0:     Wrote 233600 bytes in 118 ms
[   22.954821] morse_spi spi0.0:     Estimated IO upper bound: 15832 kbps
[   22.954860] morse_spi spi0.0: Bus timing profiler
[   22.954881] morse_spi spi0.0:     packet size (bytes): 1460
[   22.954900] morse_spi spi0.0:     overhead (bytes):    102
[   22.954919] morse_spi spi0.0:     padding (bytes):     2
[   22.954938] morse_spi spi0.0:     rounds:              16
[   22.988964] morse_spi spi0.0:     timing (us)
[   22.989032] morse_spi spi0.0:     bus claim  :    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
[   22.989125] morse_spi spi0.0:     bus release:    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
[   22.989166] morse_spi spi0.0:     read 32    :   56   47   44   43   43   43   43   43   44   44   43   45   45   51   50   50
[   22.989204] morse_spi spi0.0:     read bulk  : 1086 1069 1071 1070 1057 1067 1142 1223 1062  936 1070 1064 1044 1115 1057 1085
[  OK  ] Finished Load Kernel Modules.
[   22.989242] morse_spi spi0.0:     write 32   :   85   78   76   76   76   79   76  217   85  110   73   63   63   62   62   61
[   22.989279] morse_spi spi0.0:     write bulk :  699  659  844  916  910  908 1137  918  910  900  906  894 1136  914 1064  920
         Starting Apply Kernel Variables...
[   22.989468] morse_spi spi0.0: SKB allocation profiler (100 skbs w/ 1562 bytes)
[   22.989492] morse_spi spi0.0:     alloc: 72 us
[   22.989510] morse_spi spi0.0:     free:  79 us

How should I interpret these results? The line that stands out to me is:
[   22.954821] morse_spi spi0.0: Estimated IO upper bound: 15832 kbps
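To make the estimator line easier to read, here is a small sketch that recomputes it from the other values in the same log. It assumes the run used spi_clock_speed=40000000 from the modprobe config and that the bus is single-lane SPI moving one bit per clock cycle; neither is stated in the log itself.

```python
# Values copied from the "Bus IO write estimator" lines in the log above.
PACKET_SIZE = 1460      # packet size (bytes)
BATCHES = 16            # batch(es)
ROUNDS = 10             # rounds
ELAPSED_MS = 118        # "Wrote 233600 bytes in 118 ms"
LOGGED_KBPS = 15832     # "Estimated IO upper bound"

# Assumption: the requested 40 MHz clock was in effect and the bus is
# single-lane SPI, so the raw ceiling is one bit per clock cycle.
SPI_CLOCK_HZ = 40_000_000

total_bytes = PACKET_SIZE * BATCHES * ROUNDS
estimated_kbps = total_bytes * 8 / ELAPSED_MS        # bits per ms equals kbit/s
bus_efficiency = estimated_kbps * 1000 / SPI_CLOCK_HZ

print(f"bytes written:      {total_bytes}")
print(f"recomputed bound:   {estimated_kbps:.0f} kbps (log says {LOGGED_KBPS})")
print(f"share of raw clock: {bus_efficiency:.1%}")
```

The byte count is just packet size times batches times rounds, so the logged figure appears to count payload only, and the small gap to the logged value is consistent with the elapsed time being rounded to whole milliseconds. Under the assumptions above, the estimated bound is well under half of the nominal bus rate.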

I also saw a note about the mmrc_table, so I checked it:

sudo cat /sys/kernel/debug/ieee80211/phy0/morse/mmrc_table

Morse Micro S1G RC Algorithm Statistics
Peer: 94:bb:43:dc:f8:6a
                     --------------Rate--------------  Throughput Probability ---------Last--------- ---------Total--------- ----MPDU-----
 BW   Guard Evidence Selection MCS   SS Index Airtime  Max   Avg      Average Retry Success  Attempt     Success     Attempt Success  Fail 
 1MHz   LGI        0           MCS0   1     0   34880  0.00  0.00           0     0       0        0           0           0       0     0
 1MHz   SGI        0           MCS0   1     1   31371  0.00  0.00           0     0       0        0           0           0       0     0
 2MHz   LGI        0           MCS0   1     2   16000  0.00  0.00           0     0       0        0           0           0       0     0
 2MHz   SGI        0           MCS0   1     3   14390  0.00  0.00           0     0       0        0           0           0       0     0
 4MHz   LGI        0           MCS0   1     4    7680  0.00  0.00           0     0       0        0           0           0       0     0
 4MHz   SGI        0           MCS0   1     5    6907  0.00  0.00           0     0       0        0           0           0       0     0
 8MHz   LGI        0           MCS0   1     6    3520  0.00  0.00           0     0       0        0           0           0       0     0
 8MHz   SGI        0       C   MCS0   1     7    3165  0.00  0.00           0     0       0        0           0           0       0     0
 1MHz   LGI        0           MCS1   1     8   24000  0.00  0.00           0     0       0        0           0           0       0     0
 1MHz   SGI        0           MCS1   1     9   21585  0.00  0.00           0     0       0        0           0           0       0     0
 2MHz   LGI        0           MCS1   1    10   10640  0.00  0.00           0     0       0        0           0           0       0     0
 2MHz   SGI        0           MCS1   1    11    9569  0.00  0.00           0     0       0        0           0           0       0     0
 4MHz   LGI        0           MCS1   1    12    5120  0.00  0.00           0     0       0        0           0           0       0     0
 4MHz   SGI        0           MCS1   1    13    4605  0.00  0.00           0     0       0        0           0           0       0     0
 8MHz   LGI        0           MCS1   1    14    2360  0.00  0.00           0     0       0        0           0           0       0     0
 8MHz   SGI        0           MCS1   1    15    2122  0.00  0.00           0     0       0        0           0           0       0     0
 1MHz   LGI        0           MCS2   1    16   17440  0.00  0.00           0     0       0        0           0           0       0     0
 1MHz   SGI        0           MCS2   1    17   15685  0.00  0.00           0     0       0        0           0           0       0     0
 2MHz   LGI        0           MCS2   1    18    8000  0.00  0.00           0     0       0        0           0           0       0     0
 2MHz   SGI        0           MCS2   1    19    7195  0.00  0.00           0     0       0        0           0           0       0     0
 4MHz   LGI        0           MCS2   1    20    3840  0.00  0.00           0     0       0        0           0           0       0     0
 4MHz   SGI        0           MCS2   1    21    3453  0.00  0.00           0     0       0        0           0           0       0     0
 8MHz   LGI        0           MCS2   1    22    1760  0.00  0.00           0     0       0        0           0           0       0     0
 8MHz   SGI        0           MCS2   1    23    1582  0.00  0.00           0     0       0        0           0           0       0     0
 1MHz   LGI        0           MCS3   1    24   11600  0.00  0.00           0     0       0        0           0           0       0     0
 1MHz   SGI        0           MCS3   1    25   10433  0.00  0.00           0     0       0        0           0           0       0     0
 2MHz   LGI        0           MCS3   1    26    5320  0.00  0.00           0     0       0        0           0           0       0     0
 2MHz   SGI        0           MCS3   1    27    4784  0.00  0.00           0     0       0        0           0           0       0     0
 4MHz   LGI        0           MCS3   1    28    2520  0.00  0.00           0     0       0        0           0           0       0     0
 4MHz   SGI        0           MCS3   1    29    2266  0.00  0.00           0     0       0        0           0           0       0     0
 8MHz   LGI        0           MCS3   1    30    1160  0.00  0.00           0     0       0        0           0           0       0     0
 8MHz   SGI        0           MCS3   1    31    1043  0.00  0.00           0     0       0        0           0           0       0     0
 1MHz   LGI        0           MCS4   1    32    8720  0.00  0.00           0     0       0        0           0           0       0     0
 1MHz   SGI        0           MCS4   1    33    7842  0.00  0.00           0     0       0        0           0           0       0     0
 2MHz   LGI        0           MCS4   1    34    4000  0.00  0.00           0     0       0        0           0           0       0     0
 2MHz   SGI        0           MCS4   1    35    3597  0.00  0.00           0     0       0        0           0           0       0     0
 4MHz   LGI        0           MCS4   1    36    1880  0.00  0.00           0     0       0        0           0           0       0     0
 4MHz   SGI        0           MCS4   1    37    1690  0.00  0.00           0     0       0        0           0           0       0     0
 8MHz   LGI        0           MCS4   1    38     880  0.00  0.00           0     0       0        0           0           0       0     0
 8MHz   SGI        0        P  MCS4   1    39     791 19.50 19.49         100     0       0        0          23          24       0     0
 1MHz   LGI        0           MCS5   1    40    5800  0.00  0.00           0     0       0        0           0           0       0     0
 1MHz   SGI        0           MCS5   1    41    5216  0.00  0.00           0     0       0        0           0           0       0     0
 2MHz   LGI        0           MCS5   1    42    2640  0.00  0.00           0     0       0        0           0           0       0     0
 2MHz   SGI        0           MCS5   1    43    2374  0.00  0.00           0     0       0        0           0           0       0     0
 4MHz   LGI        0           MCS5   1    44    1240  0.00  0.00           0     0       0        0           0           0       0     0
 4MHz   SGI        0           MCS5   1    45    1115  0.00  0.00           0     0       0        0           0           0       0     0
 8MHz   LGI        0           MCS5   1    46     560 23.40 22.69          97     0       0        0         464         492       0     0
 8MHz   SGI        0      B    MCS5   1    47     503 27.56 24.43          94     0       0        0       26367       27598       0     0
 1MHz   LGI        0           MCS6   1    48    4360  0.00  0.00           0     0       0        0           0           0       0     0
 1MHz   SGI        0           MCS6   1    49    3921  0.00  0.00           0     0       0        0           0           0       0     0
 2MHz   LGI        0           MCS6   1    50    2000  0.00  0.00           0     0       0        0           0           0       0     0
 2MHz   SGI        0           MCS6   1    51    1798  0.00  0.00           0     0       0        0           0           0       0     0
 4MHz   LGI        0           MCS6   1    52     920  0.00  0.00           0     0       0        0           0           0       0     0
 4MHz   SGI        0           MCS6   1    53     827  0.00  0.00           0     0       0        0           0           0       0     0
 8MHz   LGI        0         L MCS6   1    54     440 26.32 26.32         100     0       0        0        2006        2566       0     0
 8MHz   SGI        0     A     MCS6   1    55     395 30.71 30.12         100     0       0        0        7008        8525       0     0
 1MHz   LGI        0           MCS7   1    56    3840  0.00  0.00           0     0       0        0           0           0       0     0
 1MHz   SGI        0           MCS7   1    57    3453  0.00  0.00           0     0       0        0           0           0       0     0
 2MHz   LGI        0           MCS7   1    58    1760  0.00  0.00           0     0       0        0           0           0       0     0
 2MHz   SGI        0           MCS7   1    59    1582  0.00  0.00           0     0       0        0           0           0       0     0
 4MHz   LGI        0           MCS7   1    60     840  0.00  0.00           0     0       0        0           0           0       0     0
 4MHz   SGI        0           MCS7   1    61     755  0.00  0.00           0     0       0        0           0           0       0     0
 8MHz   LGI        0           MCS7   1    62     360 29.25  0.00           0     0       0        0         367         697       0     0
 8MHz   SGI        0           MCS7   1    63     323 32.50 32.49         100     0       0        0        1609        3864       0     0
 1MHz   LGI        0           MCS10  1    64   64000  0.00  0.00           0     0       0        0           0           0       0     0
 1MHz   SGI        0           MCS10  1    65   57562  0.00  0.00           0     0       0        0           0           0       0     0

 Amount of packets sent: 43596 including: 170 look-around packets
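A quick way to summarise this table is to look only at the rows that were actually used. The sketch below copies the Total Success and Total Attempt columns for every row with a non-zero attempt count and works out where the rate controller spent its attempts. The labels are my own shorthand for the BW, Guard, and MCS columns.

```python
# (index, rate label, total success, total attempt) for every row with a
# non-zero "Total Attempt" in the mmrc_table dump above.
rows = [
    (39, "8MHz SGI MCS4", 23, 24),
    (46, "8MHz LGI MCS5", 464, 492),
    (47, "8MHz SGI MCS5", 26367, 27598),
    (54, "8MHz LGI MCS6", 2006, 2566),
    (55, "8MHz SGI MCS6", 7008, 8525),
    (62, "8MHz LGI MCS7", 367, 697),
    (63, "8MHz SGI MCS7", 1609, 3864),
]

total_attempts = sum(attempt for _, _, _, attempt in rows)
total_success = sum(success for _, _, success, _ in rows)

for index, label, success, attempt in rows:
    share = attempt / total_attempts      # fraction of all attempts at this rate
    ratio = success / attempt             # per-rate success ratio
    print(f"{index:>3} {label:<14} share={share:6.1%} success={ratio:6.1%}")

busiest = max(rows, key=lambda row: row[3])
print(f"overall success ratio: {total_success / total_attempts:.1%}")
print(f"busiest rate: index {busiest[0]} ({busiest[1]})")
```

All of the traffic went out at 8 MHz, mostly at MCS5 and MCS6 with short guard interval, where the table's throughput columns read roughly 24 to 31. If those columns are in Mbps (the dump does not say), the radio link rate is well above what iperf3 reports.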

My device tree is configured as follows:

diff --git a/arch/arm64/boot/dts/rockchip/overlay/rk3588-mm6108-spi0.dts b/arch/arm64/boot/dts/rockchip/overlay/rk3588-mm6108-spi0.dts
new file mode 100644
index 000000000000..fbc22e91eb8f
--- /dev/null
+++ b/arch/arm64/boot/dts/rockchip/overlay/rk3588-mm6108-spi0.dts
@@ -0,0 +1,27 @@
+/dts-v1/;
+/plugin/;
+
+/ {
+    fragment@0 {
+        target = <&spi0>;
+
+        __overlay__ {
+            status = "okay";
+            #address-cells = <1>;
+            #size-cells = <0>;
+            pinctrl-names = "default";
+            pinctrl-0 = <&spi0m2_cs0 &spi0m2_pins>;
+            max-freq = <50000000>;  // Max supported bus frequency (controller limit)
+
+            morse_wifi@0 {
+                compatible = "morse,mm610x-spi";
+                reg = <0>;                    // Chip Select 0
+                spi-max-frequency = <50000000>; // Morse Wi-Fi device max freq = 31.25 MHz
+                reset-gpios = <&gpio1 7 0>;
+                power-gpios = <&gpio3 17 0>, <&gpio1 15 0>;
+                spi-irq-gpios = <&gpio1 4 0>;
+                status = "okay";
+            };
+        };
+    };
+};

Here are the results of some iperf3 runs:

[nix-shell:~]$ iperf3 -c 192.168.13.226 -u -b 20M -t 10
Connecting to host 192.168.13.226, port 5201
[  5] local 192.168.13.175 port 42900 connected to 192.168.13.226 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec  1.14 MBytes  9.60 Mbits/sec  830  
[  5]   1.00-2.00   sec  1.18 MBytes  9.85 Mbits/sec  851  
[  5]   2.00-3.00   sec  1.07 MBytes  8.97 Mbits/sec  774  
[  5]   3.00-4.00   sec  1.09 MBytes  9.11 Mbits/sec  786  
[  5]   4.00-5.00   sec  1.09 MBytes  9.11 Mbits/sec  786  
[  5]   5.00-6.00   sec  1.17 MBytes  9.81 Mbits/sec  847  
[  5]   6.00-7.00   sec  1.09 MBytes  9.11 Mbits/sec  787  
[  5]   7.00-8.00   sec  1.16 MBytes  9.71 Mbits/sec  838  
[  5]   8.00-9.00   sec  1.09 MBytes  9.16 Mbits/sec  791  
[  5]   9.00-10.01  sec  1.17 MBytes  9.73 Mbits/sec  845  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.01  sec  11.2 MBytes  9.42 Mbits/sec  0.000 ms  0/8135 (0%)  sender
[  5]   0.00-10.06  sec  11.2 MBytes  9.34 Mbits/sec  1.086 ms  0/8110 (0%)  receiver

iperf Done.

[nix-shell:~]$ iperf3 -c 192.168.13.226 -b 20M -t 10
Connecting to host 192.168.13.226, port 5201
[  5] local 192.168.13.175 port 33346 connected to 192.168.13.226 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  2.50 MBytes  20.9 Mbits/sec    3    272 KBytes       
[  5]   1.00-2.00   sec  2.38 MBytes  19.9 Mbits/sec    0    263 KBytes       
[  5]   2.00-3.00   sec  2.38 MBytes  19.9 Mbits/sec    0    277 KBytes       
[  5]   3.00-4.00   sec   896 KBytes  7.34 Mbits/sec    0    280 KBytes       
[  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    0    280 KBytes       
[  5]   5.00-6.00   sec  1.75 MBytes  14.7 Mbits/sec    0    266 KBytes       
[  5]   6.00-7.00   sec  1.62 MBytes  13.6 Mbits/sec    0    263 KBytes       
[  5]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    0    260 KBytes       
[  5]   8.00-9.00   sec  1.62 MBytes  13.6 Mbits/sec    0    288 KBytes       
[  5]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    0    255 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  13.1 MBytes  11.0 Mbits/sec    3            sender
[  5]   0.00-10.09  sec  9.50 MBytes  7.89 Mbits/sec                  receiver

iperf Done.

[nix-shell:~]$ iperf3 -c 192.168.13.226 -b 15M -t 10
Connecting to host 192.168.13.226, port 5201
[  5] local 192.168.13.175 port 48426 connected to 192.168.13.226 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.88 MBytes  15.7 Mbits/sec   47    209 KBytes       
[  5]   1.00-2.00   sec  1.75 MBytes  14.7 Mbits/sec   32    252 KBytes       
[  5]   2.00-3.00   sec  1.75 MBytes  14.7 Mbits/sec    0    263 KBytes       
[  5]   3.00-4.00   sec  1.88 MBytes  15.7 Mbits/sec    0    257 KBytes       
[  5]   4.00-5.00   sec  1.75 MBytes  14.7 Mbits/sec    0    255 KBytes       
[  5]   5.00-6.00   sec  1.25 MBytes  10.5 Mbits/sec    0    255 KBytes       
[  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    0    269 KBytes       
[  5]   7.00-8.00   sec  1.75 MBytes  14.7 Mbits/sec    0    269 KBytes       
[  5]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    0    263 KBytes       
[  5]   9.00-10.00  sec  1.62 MBytes  13.6 Mbits/sec    0   5.66 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  13.6 MBytes  11.4 Mbits/sec   79            sender
[  5]   0.00-10.06  sec  9.50 MBytes  7.92 Mbits/sec                  receiver

iperf Done.

[nix-shell:~]$ iperf3 -c 192.168.13.226 -u -b 15M -t 10
Connecting to host 192.168.13.226, port 5201
[  5] local 192.168.13.175 port 49385 connected to 192.168.13.226 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec  1.15 MBytes  9.62 Mbits/sec  832  
[  5]   1.00-2.00   sec  1.17 MBytes  9.82 Mbits/sec  848  
[  5]   2.00-3.00   sec  1.08 MBytes  9.04 Mbits/sec  780  
[  5]   3.00-4.00   sec  1.26 MBytes  10.5 Mbits/sec  910  
[  5]   4.00-5.00   sec  1.16 MBytes  9.76 Mbits/sec  842  
[  5]   5.00-6.00   sec  1.16 MBytes  9.73 Mbits/sec  840  
[  5]   6.00-7.00   sec  1.09 MBytes  9.10 Mbits/sec  786  
[  5]   7.00-8.00   sec  1.15 MBytes  9.66 Mbits/sec  833  
[  5]   8.00-9.00   sec  1.08 MBytes  9.08 Mbits/sec  784  
[  5]   9.00-10.00  sec  1018 KBytes  8.32 Mbits/sec  720  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec  11.3 MBytes  9.47 Mbits/sec  0.000 ms  0/8175 (0%)  sender
[  5]   0.00-10.06  sec  11.3 MBytes  9.42 Mbits/sec  0.965 ms  0/8175 (0%)  receiver

iperf Done.
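The two UDP runs can be turned into a per-datagram time budget and compared with the "write bulk" row of the bus timing profiler. This is only a rough sketch: it assumes each "write bulk" value is the time to push one packet-sized buffer over SPI, and that the iperf3 runs and the profiler log were taken with the same 40 MHz setting.

```python
from statistics import mean

# Sender summary lines from the two UDP runs above: (datagrams, seconds).
udp_runs = {
    "-u -b 20M": (8135, 10.01),
    "-u -b 15M": (8175, 10.00),
}

# "write bulk" row (us) from the bus timing profiler log above.
# Assumption: each value is the time for one packet-sized SPI write.
write_bulk_us = [699, 659, 844, 916, 910, 908, 1137, 918,
                 910, 900, 906, 894, 1136, 914, 1064, 920]
mean_write_us = mean(write_bulk_us)

per_datagram_us = {}
for label, (datagrams, seconds) in udp_runs.items():
    rate = datagrams / seconds             # datagrams per second
    per_datagram_us[label] = 1e6 / rate    # average time budget per datagram
    share = mean_write_us / per_datagram_us[label]
    print(f"{label}: {rate:.0f} datagrams/s, "
          f"{per_datagram_us[label]:.0f} us each, bus write share ~{share:.0%}")
```

Both UDP runs land at roughly the same datagram rate regardless of the requested bitrate. If the assumption about "write bulk" holds, a large part of each datagram's time budget would be spent on the SPI write alone.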

Here are the parameters I had before I loaded the test_mode=6 version:

[nix-shell:~]$ dmesg | grep morse
[   22.489356] morse micro driver registration. Version 0-rel_1_15_3_2025_Apr_16
[   22.490272] morse_spi spi0.0: morse_of_probe: Reading gpio pins configuration from device tree
[   22.490721] morse_io: Device node '/dev/morse_io' created successfully
[   22.509987] morse_spi spi0.0: Loaded firmware from mm6108.bin, size 444304, crc32 0x1c6a0f92
[   22.513107] morse_spi spi0.0: Loaded BCF from bcf_default.bin, size 1251, crc32 0x941b2a82
[   23.222189] morse_spi spi0.0: Driver loaded with kernel module parameters
[   23.222209] morse_spi spi0.0:     slow_clock_mode                         : 0
[   23.222216] morse_spi spi0.0:     enable_1mhz_probes                      : Y
[   23.222222] morse_spi spi0.0:     enable_sched_scan                       : Y
[   23.222228] morse_spi spi0.0:     enable_hw_scan                          : Y
[   23.222234] morse_spi spi0.0:     enable_pv1                              : N
[   23.222239] morse_spi spi0.0:     enable_page_slicing                     : N
[   23.222245] morse_spi spi0.0:     log_modparams_on_boot                   : Y
[   23.222251] morse_spi spi0.0:     enable_mcast_rate_control               : N
[   23.222257] morse_spi spi0.0:     enable_mcast_whitelist                  : Y
[   23.222262] morse_spi spi0.0:     ocs_type                                : 1
[   23.222268] morse_spi spi0.0:     enable_wiphy                            : 0
[   23.222274] morse_spi spi0.0:     enable_auto_mpsw                        : Y
[   23.222280] morse_spi spi0.0:     duty_cycle_probe_retry_threshold        : 2500
[   23.222286] morse_spi spi0.0:     duty_cycle_mode                         : 0
[   23.222292] morse_spi spi0.0:     enable_auto_duty_cycle                  : Y
[   23.222298] morse_spi spi0.0:     dhcpc_lease_update_script               : /morse/scripts/dhcpc_update.sh
[   23.222305] morse_spi spi0.0:     enable_ibss_probe_filtering             : Y
[   23.222311] morse_spi spi0.0:     enable_dhcpc_offload                    : N
[   23.222316] morse_spi spi0.0:     enable_arp_offload                      : N
[   23.222322] morse_spi spi0.0:     enable_bcn_change_seq_monitor           : 0
[   23.222327] morse_spi spi0.0:     enable_cac                              : 0
[   23.222333] morse_spi spi0.0:     max_mc_frames                           : 10
[   23.222340] morse_spi spi0.0:     tx_max_power_mbm                        : 2200
[   23.222346] morse_spi spi0.0:     enable_twt                              : Y
[   23.222351] morse_spi spi0.0:     enable_mac80211_connection_monitor      : N
[   23.222357] morse_spi spi0.0:     enable_airtime_fairness                 : N
[   23.222362] morse_spi spi0.0:     enable_raw                              : Y
[   23.222368] morse_spi spi0.0:     max_aggregation_count                   : 0
[   23.222373] morse_spi spi0.0:     max_rate_tries                          : 1
[   23.222379] morse_spi spi0.0:     max_rates                               : 4
[   23.222384] morse_spi spi0.0:     enable_watchdog_reset                   : N
[   23.222390] morse_spi spi0.0:     watchdog_interval_secs                  : 30
[   23.222396] morse_spi spi0.0:     enable_watchdog                         : Y
[   23.222402] morse_spi spi0.0:     country                                 : US
[   23.222408] morse_spi spi0.0:     enable_cts_to_self                      : N
[   23.222413] morse_spi spi0.0:     enable_rts_8mhz                         : N
[   23.222419] morse_spi spi0.0:     enable_trav_pilot                       : Y
[   23.222424] morse_spi spi0.0:     enable_sgi_rc                           : Y
[   23.222429] morse_spi spi0.0:     enable_mbssid_ie                        : N
[   23.222435] morse_spi spi0.0:     virtual_sta_max                         : 0
[   23.222441] morse_spi spi0.0:     thin_lmac                               : 0
[   23.222446] morse_spi spi0.0:     enable_dynamic_ps_offload               : Y
[   23.222452] morse_spi spi0.0:     enable_ps                               : 2
[   23.222457] morse_spi spi0.0:     enable_subbands                         : 2
[   23.222463] morse_spi spi0.0:     enable_survey                           : Y
[   23.222468] morse_spi spi0.0:     mcs10_mode                              : 0
[   23.222474] morse_spi spi0.0:     mcs_mask                                : 1023
[   23.222480] morse_spi spi0.0:     no_hwcrypt                              : 0
[   23.222485] morse_spi spi0.0:     enable_ext_xtal_init                    : N
[   23.222492] morse_spi spi0.0:     enable_otp_check                        : 1
[   23.222497] morse_spi spi0.0:     bcf                                     : 
[   23.222503] morse_spi spi0.0:     serial                                  : default
[   23.222509] morse_spi spi0.0:     debug_mask                              : 8
[   23.222515] morse_spi spi0.0:     tx_status_lifetime_ms                   : 15000
[   23.222521] morse_spi spi0.0:     tx_queued_lifetime_ms                   : 1000
[   23.222527] morse_spi spi0.0:     max_txq_len                             : 32
[   23.222534] morse_spi spi0.0:     default_cmd_timeout_ms                  : 600
[   23.222540] morse_spi spi0.0:     hw_reload_after_stop                    : 5
[   23.222546] morse_spi spi0.0:     enable_short_bcn_as_dtim_override       : -1
[   23.222552] morse_spi spi0.0:     fw_bin_file                             : 
[   23.222557] morse_spi spi0.0:     sdio_reset_time                         : 400
[   23.222564] morse_spi spi0.0:     macaddr_suffix                          : 00:00:00
[   23.222570] morse_spi spi0.0:     macaddr_octet                           : 255
[   23.222576] morse_spi spi0.0:     max_total_vendor_ie_bytes               : 514
[   23.222583] morse_spi spi0.0:     coredump_include                        : 1
[   23.222588] morse_spi spi0.0:     coredump_method                         : 1
[   23.222594] morse_spi spi0.0:     enable_coredump                         : Y
[   23.222599] morse_spi spi0.0:     spi_use_edge_irq                        : N
[   23.222605] morse_spi spi0.0:     spi_clock_speed                         : 40000000
[   23.222611] morse_spi spi0.0:     enable_mm_vendor_ie                     : Y
[   23.222617] morse_spi spi0.0:     fixed_guard                             : 0
[   23.222622] morse_spi spi0.0:     fixed_ss                                : 1
[   23.222628] morse_spi spi0.0:     fixed_bw                                : 2
[   23.222633] morse_spi spi0.0:     fixed_mcs                               : 4
[   23.222639] morse_spi spi0.0:     enable_fixed_rate                       : N

I also tested with 20 MHz and got these results:

[   23.410909] Morse Micro Dot11ah driver registration. Version 0-rel_1_15_3_2025_Apr_16
[   23.424820] morse micro driver registration. Version 0-rel_1_15_3_2025_Apr_16
[   23.425085] morse_spi spi0.0: morse_of_probe: Reading gpio pins configuration from device tree
[   23.425163] uaccess char driver major number is 234
[   23.425305] morse_io: Device node '/dev/morse_io' created successfully
[   23.433873] morse_spi spi0.0: Loaded firmware from mm6108.bin, size 444304, crc32 0x1c6a0f92
[   23.434477] morse_spi spi0.0: Loaded BCF from bcf_default.bin, size 1251, crc32 0x941b2a82
[   23.864914] morse_spi spi0.0: Bus IO write estimator
[   23.864929] morse_spi spi0.0:     packet size (bytes): 1460
[   23.864935] morse_spi spi0.0:     overhead (bytes):    102
[   23.864942] morse_spi spi0.0:     padding (bytes):     2
[   23.864948] morse_spi spi0.0:     batch(es):           16
[   23.864953] morse_spi spi0.0:     rounds:              10
[   24.023660] morse_spi spi0.0:     Wrote 233600 bytes in 158 ms
[   24.023695] morse_spi spi0.0:     Estimated IO upper bound: 11824 kbps
[   24.023714] morse_spi spi0.0: Bus timing profiler
[   24.023726] morse_spi spi0.0:     packet size (bytes): 1460
[   24.023738] morse_spi spi0.0:     overhead (bytes):    102
[   24.023756] morse_spi spi0.0:     padding (bytes):     2
[   24.023767] morse_spi spi0.0:     rounds:              16
[   24.054471] morse_spi spi0.0:     timing (us)
[   24.054513] morse_spi spi0.0:     bus claim  :    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
[   24.054535] morse_spi spi0.0:     bus release:    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
[   24.054558] morse_spi spi0.0:     read 32    :   58   67   48   48   47   47   49   81   50   48   48   68   49   39   81   50
[  OK  ] Finished Load Kernel Modules.
[   24.054581] morse_spi spi0.0:     read bulk  :  893  888  915  915 1004  909  877  891  886  870  889  895  875  882  886  901
[   24.054602] morse_spi spi0.0:     write 32   :   47   46   46   46   49   46   45   46   46   46   46   47   46   52   46   46
[   24.054624] morse_spi spi0.0:     write bulk :  927  899  905  914  883  944  928  916  920  917  921  912  914  917  912  917
[   24.054741] morse_spi spi0.0: SKB allocation profiler (100 skbs w/ 1562 bytes)
[   24.054756] morse_spi spi0.0:     alloc: 49 us
         Starting Apply Kernel Variables...
[   24.054767] morse_spi spi0.0:     free:  46 us
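
A quick sanity check on the estimator figures in the 20 MHz log above. This is only a sketch: it assumes the driver's estimate is payload bits divided by elapsed time, and that the overhead and padding bytes are also clocked over the bus. Neither is stated in the log, and the variable names are mine.

```python
# Figures copied from the 20 MHz "Bus IO write estimator" log above.
packet_bytes = 1460
overhead_bytes = 102
padding_bytes = 2
batches = 16
rounds = 10
elapsed_ms = 158            # the log rounds this to whole milliseconds
spi_clock_hz = 20_000_000

packets = batches * rounds
payload_bytes = packets * packet_bytes            # matches "Wrote 233600 bytes"
wire_bytes = packets * (packet_bytes + overhead_bytes + padding_bytes)

# Assumption: estimate = payload bits / elapsed time (bits per ms == kbps).
payload_kbps = payload_bytes * 8 / elapsed_ms

# Time the same bytes would need if the clock never paused between transfers.
ideal_ms = wire_bytes * 8 / spi_clock_hz * 1000
bus_utilisation = ideal_ms / elapsed_ms

print(f"payload bytes   : {payload_bytes}")
print(f"payload rate    : {payload_kbps:.0f} kbps")
print(f"ideal wire time : {ideal_ms:.1f} ms of {elapsed_ms} ms")
print(f"bus utilisation : {bus_utilisation:.0%}")
```

Under these assumptions the payload rate lands within a few kbps of the logged 11824 kbps, and only about two thirds of the elapsed time is spent actually clocking bytes. If that holds, the remainder is per-transfer overhead that does not shrink when the clock goes up, which would be consistent with throughput not scaling linearly with SPI clock.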

I figured out the SPI clock issues and am now able to run at 50 MHz, but I am still not able to get more than 10 Mbps.

Is there something wrong with my config? I’ve tried different hardcoded MCS settings as well.

here are my morse_cli stats:

[nix-shell:~]$ sudo morse_cli stats
System uptime (usec)                            : 130853192
Retry table
    Retry    Count    Avg Time
    =====    =====    ========
    0        24539    1935
    1        3519     3988
    2        595      9674
    3        95       17841
    4        18       32713
    5        6        57824
    6        1        65916
    7        0        0
    8        0        0
    9        0        0
    10       0        0
    11       0        0
    12       0        0
Commands received                               : 50
Command responses                               : 46
Commands repeated                               : 3
Commands failed                                 : 0
Command responses failed                        : 0
Commands pending                                : 0
Commands late                                   : 0
Last host ID                                    : 752
Total From Host page IRQs                       : 28788
Total From Host pages                           : 28788
Total To Host page IRQs                         : 1649
Total To Host pages                             : 1679
Aon clock long calibration                      : 0
Aon clock frequency                             : 27567
Aon sleep duration adjusted for clock tolerance : 0
PS sleeps                                       : 0
Woke from PS (timer)                            : 0
Woke from PS (pin)                              : 0
PS cumulative sleep time (usec)                 : 0
Pager From Host reaped                          : 614
Pager To Host reaped                            : 73
To Host pages dropped (host in Standby)         : 0
Invalid pages received                          : 0
Events to host sent                             : 4
Events to host failed                           : 0
No page for offload response                    : 0
ARP requests detected                           : 0
ARP requests answered                           : 0
ARP requests dropped                            : 0
ARP requests let through                        : 0
ARP table updated                               : 0
ARP gratuitous generated                        : 0
Periodic ARP frames sent                        : 0
TX Status none to send                          : 442
TX Status no page                               : 0
TX Status dropped                               : 0
TX Status buffer flushed                        : 1468
TX Status transmission corrupted                : 0
TX Status pages reaped                          : 0
Wake action frames handled                      : 0
Host main task unused stack (words)             : 204
Host timer task unused stack (words)            : 64
RTOS heap free bytes                            : 5112
IPv4 packets received                           : 1
TCP packets received                            : 0
UDP packets received                            : 1
Whitelist packets ignored                       : 0
Whitelist packets dropped                       : 0
Whitelist packets allowed                       : 0
ICMP packets received                           : 0
ICMP requests detected                          : 0
ICMP requests answered                          : 0
ICMP requests dropped                           : 0
DHCP packets received                           : 0
DHCP discoveries sent                           : 0
DHCP requests sent                              : 0
DHCP lease updates sent                         : 0
DHCP leases fully expired                       : 0
DHCP leases obtained                            : 0
DHCP leases renewed/rebound                     : 0
TCP keepalives answered                         : 0
TCP keepalives to host                          : 0
TCP keepalives generated                        : 0
TCP keepalive connection lost                   : 0
Standby wakeup frame RX                         : 0
Standby status frame (standby) TX               : 0
Standby status frame (awake) TX                 : 0
To MCU wake pin toggle                          : 0
Standby SA Query request TX                     : 0
RX corrupt action frames                        : 0
BSS keepalive TX                                : 0
BSS keepalive too many TX failures              : 0
TWT failed to flush TX status before UMAC pause : 0
HW scan started                                 : 1
HW scan completed                               : 1
HW scan aborted                                 : 0
HW scan no page for probe                       : 0
HW scan no page to store config                 : 0
HW scan no channels to scan                     : 0
Apps CPU utilisation (tenths of a percent)      : 20
Apps CPU instructions per 1000 cycles           : 547
Max time to enter suspend and wake (usec)       : 0
Suspend aborted                                 : 0
Fragmented RX frames ignored for offload processing: 0
Pageset stats
    Pageset 0
        Allocated                               : 30
        Total                                   : 30
    Pageset 1
        Allocated                               : 26
        Total                                   : 26
No A-MPDU candidate                             : 0
Commands expired in MAC req                     : 0
Commands expired in MAC resp                    : 0
DCF STF fired                                   : 8812
DCF LTF fired                                   : 5
DCF energy detect fired                         : 26025
DCF aborted                                     : 42065
DCF granted                                     : 33877
DCF medium busy before TX                       : 2
DCF phy blocked before TX                       : 0
Beacon loss                                     : 0
Beacon changed                                  : 0
Beacon RSSI changed                             : 15
AID in TIM                                      : 0
TIM contains AID but STA awake                  : 0
TIM group address                               : 0
TIM nothing                                     : 1112
TIM populated                                   : 0
TSF resync requests                             : 0
DTIM beacons RX                                 : 542
CTS sleep RX                                    : 0
CTS PS frame arrived late                       : 0
AGG A-MPDUs                                     : 28773 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
AGG TX rate is NULL                             : 0
AGG TX param mismatch                           : 0
AGG TX max TXOP exceeded                        : 0
AGG TX length exceeded                          : 0
AGG crosses TBTT                                : 4974
AGG crosses TWT SP end                          : 0
AGG crosses RAW slot                            : 0
AGG crosses scheduled TX ts                     : 0
RX total                                        : 3112
RX pass FCS                                     : 2331
RX signal field error                           : 2665
RX buffer unavailable                           : 0
RX MAC was too slow                             : 0
TX Total                                        : 33875
TX Revoked                                      : 48708
TX non-contending NDP revoked                   : 0
TX lifetime expired                             : 0
TX returned to UMAC due to power save           : 0
TX flushed                                      : 0
TX malformed frame                              : 0
TX QoS NULL                                     : 0
TX ACK valid                                    : 28738
TX ACK timeout                                  : 5102
TX CTS timeout                                  : 0
Longest delayed TX ACK (usec)                   : 129
TX ACK already finished                         : 6
TX ACK invalid (FCS)                            : 6
TX ACK invalid (scrambler)                      : 18
TX ACK lost                                     : 369
TX CTS lost                                     : 0
TX fragment                                     : 0
TX BlockAck                                     : 0
RX BlockAck                                     : 1758
TX NDP ACK                                      : 62
TX round-trip success (%)                       : 84
TX requests                                     : 75938
TX non contend request failed                   : 0
TX average backoff slots                        : 4
TX encryptable pkts                             : 28733
TX reencryptable pkts                           : 52205
TX Unencryptable pkts                           : 0
RX transaction(s) dropped                       : 4365
RX packet(s) dropped                            : 0
RX Decryptable pkts                             : 18
RX Undecryptable pkts                           : 0
RX MPDU delimiters invalid                      : 111737
RX MPDU delimiters                              : 152
Beacons RX                                      : 1124
Beacons TX                                      : 0
Beacons TX delayed                              : 0
Beacons missing from host                       : 0
Beacons late from host delay (average usec)     : 0
Beacons TX lifetime expired                     : 0
Beacons late from host max delay (usec)         : 0
Beacons late from host count (packets)          : 0
Beacons TX failed count                         : 0
TX NDP probe request                            : 0
RX NDP probe request                            : 11
TX RTS/CTS max attempts reached                 : 0
TX RTS                                          : 0
TX CTS                                          : 2
RX RTS                                          : 0
RX CTS                                          : 10
RX CTS for RTS                                  : 0
RX CTS invalid (orphaned)                       : 6
TX RTS/CTS success                              : 0
RX no key found in key inspection               : 57
MPE IRQ count                                   : 44614
RX transaction(s) total                         : 7457
RX NDP total                                    : 32810
RX NDP not for us                               : 4066
RX A-MPDU/S-MPDU                                : 3613
RX Non-A-MPDU/Non-S-MPDU                        : 3844
Total RX MPDUs found by MPE                     : 0
Total RX empty delimiters                       : 256
RX no valid first frame                         : 0
RX MPDUs with FCS fail                          : 781
RX MPDUs with MIC fail                          : 0
RX MPDUs with unknown MAC header                : 0
RX PV1 MPDUs                                    : 0
RX transactions without any response            : 7395
RX MPDUs not for us                             : 358
RX AMPDU bitmap                                 : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
RX multicast packets received                   : 11
RX multicast packets filtered                   : 10
RAW
    RAW Assignments
        Valid                                   : 0 0 0 0 0 0 0 0
        Truncated by TBTT                       : 0
        Invalid                                 : 0
        Already past                            : 0
    Delayed due to RAW
        From ACI queue                          : 0
        From BC/MC queue                        : 0
        From absolute time queue                : 0
        Frame crosses slot                      : 0
Calibration
    Quiet calibration granted                   : 2
    Quiet calibration rejected                  : 0
    Quiet calibration cancelled                 : 2
    Non-quiet calibration granted               : 0
    Calibration complete                        : 2
Duty Cycle
    Duty Cycle Target (%)                       : 100.00
    Duty Cycle TX on (usec)                     : 0
    Duty Cycle TX off (blocked) (usec)          : 0
    Duty Cycle max time off (usec)              : 0
    Duty Cycle early frames                     : 0
Duty Cycle illegal transmission                 : 0
Duty Cycle traffic dropped (NDP)                : 0
Duty Cycle traffic dropped                      : 0
WUP QosNull not acked                           : 0
QosNull resent                                  : 0
QosNull queued behind another                   : 0
QosNull TX backing off                          : 0
QosNull allocation failed                       : 0
MAC state
    RX state                                    : 0
    TX state                                    : 0
    Channel config                              : 0
    Managed calibration state                   : 0
    Powersave enabled                           : 1
    Dynamic powersave offload enabled           : 1
    STA PS state                                : 0
    Waiting on dynamic powersave timeout        : 0
    TX blocked by host cmd                      : 0
    Waiting for medium sync                     : 0
    Packets in QoS queues                       : 0
Stale AID removed                               : 0
MAC main task unused stack (words)              : 288
MAC timer task unused stack (words)             : 81
TWT SP entered                                  : 0
TWT enter SP too early                          : 0
TWT missed SP (already over)                    : 0
TWT missed start of SP (already started)        : 0
TWT SP already active on PS wake                : 0
TWT announcement delay from SP start (avg usec) : 0
TWT operation halted                            : 1
TWT operation resumed                           : 1
TSF for station has been (re)set                : 0
Standby exit from wakeup frame                  : 0
Standby exit for association                    : 0
Standby exit from userspace                     : 0
Standby exit for HW scan not enabled            : 0
Standby exit for HW scan failed to start        : 0
Standby SA Query response RX                    : 0
Standby SA Query response timeouts              : 0
Standby deep sleep enter                        : 0
Standby deep sleep exit (assoc)                 : 0
Standby deep sleep exit (BSS moved)             : 0
Standby deep sleep exit (no assoc)              : 0
Vendor IEs matching OUI filter                  : 0
PS scheduled after beacon miss                  : 0
PS scheduled after pin wake                     : 0
PS scheduled after TIM set with no traffic      : 0
PS max delivery delay after TIM set (usec)      : 0
MAC CPU utilisation (tenths of a percent)       : 135
MAC CPU instructions per 1000 cycles            : 496
Standby exit from external input                : 0
Standby exit from whitelisted packet            : 0
Standby exit TCP connection lost                : 0
Standby beacon loss with qos ack failure        : 0
OCS QoS Null failed to enqueue                  : 0
Unsupported response indication                 : 0
Standby beacons matched                         : 0
Standby probes matched                          : 0
Temperature (C)                                 : 39
Vbat (mV)                                       : 3303
PMU Switched to Vbat setting (mV)               : 3300
RC Calibration frequency (Hz)                   : 0
Vbuck Trim                                      : 5
Current RF frequency (Hz)                       : 908000000
Current Operating BW (MHz)                      : 8
Current Primary Channel BW (MHz)                : 2
DCF medium went busy after go                   : 5
DCF medium went busy during start               : 889
DCF time in past                                : 0
MPE RX overflow                                 : 85
MPE RX underflow                                : 0
RX detected less than guard IFS                 : 0
TX Scrambler initial value                      : 32
RX Scrambler decoded byte                       : 30
Signal field error                              : 5035
Packet Detect fired STF                         : 45344
Packet Detect fired LTF                         : 42751
Packet Detect timestamp (usec)                  : 130835078
Non-NDP PSDU RX                                 : 7457
Coarse frequency offset                         : -83
Final frequency offset                          : -405
Frequency offset (Hz)                           : 0
Detected sub-band                               : 0
AGC gain code                                   : 3
AGC FEM state                                   : 0
Inband power (dB)                               : 0
Received power (dBm)                            : -66
Detected channel BW (Hz)                        : 0
STF status                                      : 0
Detected RX mode                                : 0
Oper BW indicator                               : 0
Corr2S energy                                   : 0
Corr4S energy                                   : 0
Noise metric 2                                  : 0
Signal metric 2                                 : 0
Range select                                    : 0
TX Aborts                                       : 0
TX Stale                                        : 7784
TX Underflows                                   : 0
RX filtered                                     : 0
TX traveling pilots                             : 58857
Max stack used                                  : 848
Min noise estimation                            : 1000
DC locked state I phase path                    : 0
DC locked state Q phase path                    : 0
LTF correction power                            : 0
STF number of peaks                             : 0
Average ring oscillator frequency (mHz)         : 0
Ring oscillator 0 count                         : 0
Ring oscillator 1 count                         : 0
Ring oscillator 2 count                         : 0
Ring oscillator 3 count                         : 0
Ring oscillator 4 count                         : 0
TX subband centre frequency (Hz)                : 0
TX subband operating BW (MHz)                   : 0
TX subband primary channel BW (MHz)             : 0
TX subband primary index                        : 0
TX subband channel changed count                : 0
Total data packets sent in subband              : 0
Signal field fail bad STF                       : 41
Signal field fail bad LTF                       : 2554
Signal field fail CRC error                     : 2481
Servo gain                                      : 0
I phase path trim                               : 0
Q phase path trim                               : 0
Energy meter reading                            : 0
Energy meter threshold                          : 4000
RSSI counter 3                                  : 0
DC auto lock fired                              : 0
Packet boundary index                           : 0
PHY CPU utilisation (tenths of a percent)       : 983
PHY CPU instructions per 1000 cycles            : 281
STF early detetction                            : 1305
PHY noise state reset                           : 5
Energy threshold mode                           : 0
Energy threshold (dBm)                          : -80
Energy threshold correction factor freq. (dB)   : 0
RX Diversity: Listen antenna                    : 1
RX Diversity: Antenna 1 selection count         : 0
RX Diversity: Antenna 2 selection count         : 0
RX Diversity: AGC adjustment triggered          : 0
RX Diversity: Power difference (dB)             : 0
RX Diversity: Bypass count                      : 0
RX Diversity: Timeout count                     : 0
TX Diversity: TX antenna                        : 1
Energy detect triggered                         : 0
AGC LNA status                                  : 0
AGC LNA bypass                                  : 0
AGC interference detected                       : 0
Non-WLAN energy detected                        : 0
Non-WLAN energy reading                         : 0
Narrowband interference detected                : 0
Narrowband interference index                   : 0
Narrowband interference count                   : 118
Narrowband interference power (dBm)             : -74
Narrowband interference SIR (dB)                : 100
Noise (dBm)                                     : -78
Interference overall presence ratio (%)         : 0
Interference index                              :
Interference ratio (%)                          :
Interference power (dBm)                        :
Jammer detected and handled                     : 0
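
To make the dump easier to eyeball, here is a small sketch that pulls a few ratios out of the stats above. The excerpt is copied from the output; how I read the counters (for example treating "TX ACK valid" plus "TX ACK timeout" as all ACK outcomes) is my own guess from the names, not documented semantics.

```python
# Excerpt pasted from the morse_cli stats output above.
stats_excerpt = """\
    0        24539    1935
    1        3519     3988
    2        595      9674
    3        95       17841
    4        18       32713
    5        6        57824
    6        1        65916
TX ACK valid                                    : 28738
TX ACK timeout                                  : 5102
"""

retries = {}    # retry number -> frame count
counters = {}   # counter name -> value
for line in stats_excerpt.splitlines():
    if ":" in line:
        name, value = line.rsplit(":", 1)
        counters[name.strip()] = int(value)
    else:
        retry, count, _avg_time = line.split()
        retries[int(retry)] = int(count)

total_frames = sum(retries.values())
first_try = retries[0] / total_frames
# Assumption: valid + timeout covers (nearly) all ACK outcomes.
ack_ok = counters["TX ACK valid"] / (
    counters["TX ACK valid"] + counters["TX ACK timeout"])

print(f"frames in retry table : {total_frames}")
print(f"delivered first try   : {first_try:.1%}")
print(f"ACK valid ratio       : {ack_ok:.1%}")
```

The retry table sums to 28773 frames, the same number as the first bucket of "AGG A-MPDUs", and the ACK ratio comes out close to the reported "TX round-trip success (%)" of 84.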

Is there anything here that looks abnormal?
@ajudge are you able to comment on this?

Hi @Luisschubert,

Thanks for all the data dumps. What is the estimated IO upper bound in the bus profiler tests now that you are at 50MHz? Hopefully more than 15Mbps as measured at 40.
Note that this estimate is likely to be higher than your iperf results due to protocol overheads in the IP layer, but we’ve tried to get it as close as possible.

Also, I may have missed it, but what are you using for the host processor? Not all host SPI controllers are built the same, and despite being a simple physical bus protocol, we have found some host processors can not achieve full throughput over the SPI bus due to memory bus limitations, DMA quirks or other framing/context delays.

Thankfully, looking at your iperf results my gut tells me this is a higher level throughput limitation. Can you share some results of some udp iperf sessions as well?

Because your throughput is quite bursty with TCP, it’s possible your throughput issues are related to choice of congestion control algorithm or queuing discipline.
Our eval kits typically use BBR for congestion control rather than cubic which is often the Linux default.
For the queueing disciplines, a number of systems will default to fq_codel. This is often a sensible choice for high bandwidth links, but as HaLow is a relatively low bandwidth protocol and packets have a longer transmission time, many of the fq_codel parameters, such as the target queue delay and interval, are set too aggressively. I suggest switching to pfifo or pfifo_fast to test.
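
To put rough numbers on the fq_codel point, here is a minimal sketch comparing the time one packet takes to serialize with the fq_codel default target of 5 ms. The link rates and the 1500-byte packet size are illustrative assumptions, and MAC overhead, aggregation and retries are ignored, so real per-packet times on HaLow will be longer.

```python
FQ_CODEL_TARGET_MS = 5.0   # fq_codel default target queue delay on Linux
PACKET_BYTES = 1500        # assumed full-size packet

def packet_airtime_ms(link_mbps, packet_bytes=PACKET_BYTES):
    """Serialization time of one packet, ignoring MAC overhead and retries."""
    return packet_bytes * 8 / (link_mbps * 1000)

for link_mbps in (1, 5, 10, 15, 1000):
    airtime = packet_airtime_ms(link_mbps)
    packets_in_target = FQ_CODEL_TARGET_MS / airtime
    print(f"{link_mbps:>5} Mbps: {airtime:7.3f} ms/packet, "
          f"{packets_in_target:8.1f} packets fit in the 5 ms target")
```

At around 10 Mbps only a handful of packets fit inside the target delay, so the queue may start dropping long before a bursty TCP sender has filled the link, whereas at gigabit rates hundreds fit.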

If you can get a logic analyzer capture of the SPI lines during an iperf session it can help to identify any issues at the physical bus - though these may not be solvable.

EDIT - I did miss it, device tree mentions you’re using an RK3588.

Hi @ajudge ,

I’m collaborating with Luis on this problem. I’ve run my own tests independently and I’ve attached a zip file with a presentation of results, morse_cli stats and o-scope captures. Unfortunately, I still do not have an answer for you regarding host processor or queuing discipline. @Luisschubert can you please comment on these.

I’ve also encountered a new issue: data throughput is asymmetric when sending UDP traffic between a Halowlink1 AP and a Quectel FGH100M-H + Orange Pi STA over a wired (cabled RF) link. I am hoping that this is a related symptom of the problem above and can help diagnose it. TX round-trip success is 99% at 13 Mbps throughput when transmitting from the Quectel, and 82% (7 Mbps) in the other direction. Looking at the logs, TX AGG Param mismatch seems to be elevated. The MCS rate also seems to vary. This is reproducible on multiple units. I do not see any of these issues when communicating between Halowlinks on the same test bench and in the same geographic area. Please let me know if I can provide any further clarity on the test setup or results.

Quectel Orange Pi Debug 102125 v1.zip (2.5 MB)

I’ve posted about this before, in the topic “Quectel FGH100M-H throughput and SPI Test Mode Results”.

@ajudge I collected extensive data to get an understanding of where my limitations lie.

Short summary: even with good link metrics I am unable to transmit more than about 15 Mbps. Since I think this limitation comes from the SPI bus, that is where I have put my effort into understanding it better.

Topics addressed:

  • Profiling SPI traffic during “idle”, “iperf3 -u -R”, “iperf3 -u”
  • SPI Throughput capability of RK3588 SPI0
  • Understanding RK3288 hacks in morse spi driver

Profiling SPI traffic during “idle”, “iperf3 -u -R”, “iperf3 -u”

Idle Period Transfers.

Generally I see batches of interactions triggered by an IRQ every 100ms with two bursts 25ms apart.

IRQ Activity

                   25ms                     75ms
───────────┐  ┌───────────┐  ┌──────────────────────────────────┐  ┌───────────┐  ┌─────────
           │  │           │  │                                  │  │           │  │         
           │  │           │  │                                  │  │           │  │         
           │  │           │  │                                  │  │           │  │         
           └──┘           └──┘                                  └──┘           └──┘         

There is at least one other type of burst that happens periodically that I have not found a consistent pattern on.

When the IRQ triggers, the resulting SPI interactions usually come in pairs.

IRQ Handling

SPI CLK/MOSI/MISO
                                A1       A2            B1      B2              C1       C2  
                                ┌┐       ┌┐            ┌┐      ┌─┐             ┌┐       ┌┐  
                                ││       ││            ││      │ │             ││       ││  
                                ││       ││            ││      │ │             ││       ││  
                                ││       ││            ││      │ │             ││       ││  
────────────────────────────────┘└───────┘└────────────┘└──────┘ └─────────────┘└───────┘└──
                                                                                            
                                                                                            
SPI IRQ                                                                                     
──────────┐                               ┌─────────────────────────────────────────────────
          │                               │                                                 
          │                               │                                                 
          │                               │                                                 
          └───────────────────────────────┘                                                 
A1 is 24 bytes
A2 is 26 bytes
B1 is 24 bytes
B2 is 113, 114, 161, 233, 245 bytes
C1 is 24 bytes
C2 is 26 bytes

My interpretation is that A1 and A2 are the read and the clear of the INT1_STS register.

I confirmed this by looking at the MOSI data of A1, which is:

name type start_time duration mosi miso
SPI enable 6.177174634 0.000000002
SPI result 6.177177478 0.000000152 0xFF 0xFF
SPI result 6.17717764 0.00000015 0x75 0xFF
SPI result 6.1771778 0.000000152 0x14 0xFF
SPI result 6.177177962 0.000000152 0xC0 0xFF
SPI result 6.177178124 0.000000152 0xA0 0xFF
SPI result 6.177178286 0.000000152 0x04 0xFF
SPI result 6.177178446 0.000000152 0x89 0xFF
SPI result 6.177178608 0.000000152 0xFF 0xFF
SPI result 6.177178772 0.00000015 0xFF 0x00
SPI result 6.177178932 0.000000152 0xFF 0x00
SPI result 6.177179092 0.000000154 0xFF 0xFF
SPI result 6.177179254 0.000000154 0xFF 0xFE
SPI result 6.177179416 0.000000152 0xFF 0x04
SPI result 6.177179578 0.000000152 0xFF 0x00
SPI result 6.17717974 0.000000152 0xFF 0x00
SPI result 6.177179902 0.000000152 0xFF 0x00
SPI result 6.177180062 0.000000154 0xFF 0xCA
SPI result 6.177180224 0.000000152 0xFF 0xF1
SPI result 6.177180386 0.000000152 0xFF 0xFF
SPI result 6.177180548 0.000000152 0xFF 0xFF
SPI result 6.17718071 0.000000152 0xFF 0xFF
SPI result 6.17718087 0.000000152 0xFF 0xFF
SPI result 6.177181032 0.000000152 0xFF 0xFF
SPI result 6.177181194 0.000000152 0xFF 0xFF
SPI disable 6.177223896 0.000000002

The relevant data here is:

mosi miso
0xFF 0xFF
0x75 0xFF
0x14 0xFF
0xC0 0xFF
0xA0 0xFF
0x04 0xFF

0x7514C0A004

Right-shifting this by 9 and masking it with 0x1FFFF gives the address that is being accessed.

0x7514C0A004 >> 9
is: 0x3A8A6050

0x3A8A6050 & 0x1FFFF
is: 0x6050

and 0x6050 is the low 16 bits of the INT1_STS register address (0x100a6050).

applying that same pattern to the second message A2.

mosi miso
0xFF 0xFF
0x75 0xFF
0x94 0xFF
0xC0 0xFF
0xB0 0xFF
0x04 0xFF
0xCD 0xFF

A2 has a “Command” of 0x7594C0B004, which gives the address 0x6058.
0x6058 is the low 16 bits of the address of INT1_CLR.

The other bitfields denote the direction (read or write), the function number, and the byte count (always 4 here).
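
The decoding above can be sketched as a small helper. The field layout is taken from the `morse_spi_put_cmd53` comment quoted further down in this post; the function name `decode_cmd53` and the dictionary keys are mine, and the input is the six MOSI bytes that follow the leading 0xFF.

```python
def decode_cmd53(mosi):
    """Decode command byte, 4-byte big-endian argument and CRC byte."""
    cmd, crc = mosi[0], mosi[5]
    arg = int.from_bytes(mosi[1:5], "big")
    return {
        "index": cmd & 0x3F,            # 53 for CMD53
        "write": (arg >> 31) & 0x1,     # 0 = read, 1 = write
        "fn": (arg >> 28) & 0x7,
        "block": (arg >> 27) & 0x1,     # 0 = byte mode, 1 = block mode
        "opcode": (arg >> 26) & 0x1,    # 1 = incrementing address
        "address": (arg >> 9) & 0x1FFFF,
        "count": arg & 0x1FF,
        "crc": crc,
    }

# MOSI bytes of A1 and A2 from the captures above (leading 0xFF dropped).
a1 = decode_cmd53(bytes([0x75, 0x14, 0xC0, 0xA0, 0x04, 0x89]))
a2 = decode_cmd53(bytes([0x75, 0x94, 0xC0, 0xB0, 0x04, 0xCD]))

for name, fields in (("A1", a1), ("A2", a2)):
    print(name, {key: hex(value) for key, value in fields.items()})
```

With this, A1 decodes as a 4-byte read on function 1 at 0x6050 and A2 as a 4-byte write on function 1 at 0x6058, which matches the read-then-clear interpretation.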

I haven’t really been able to decode what the other common messages that I see mean:

0x1842800
0x1862000
0x17E8000
0x1822000

MM6108 message decoding references

static int morse_spi_mem_read(struct morse_spi *mspi, u32 address, u8 *data, u32 size)
{

// ... omitted ...

    address &= 0xFFFF;	/* remove base and keep offset */

// ... omitted ...

		ret = morse_spi_cmd53_read(mspi, func_to_use, next_addr,
					   data + blks_done * MMC_SPI_BLOCKSIZE, bytes, 0);

// ... omitted ...
}

// ... omitted ...

static int morse_spi_put_cmd53(u8 fn, u32 address, u8 *data, u16 count, u8 write, bool block)
{
	u32 arg = 0;
	u8 cmd = 0;
	u8 opcode = 1;

	/* SDIO_CMD53 format as per PartE1_SDIO_Specification
	 * Start bit - 0
	 * Direction bit- 1
	 * Command Index(6bit) - SD_IO_RW_EXTENDED
	 * rw bit - 0: read, 1: write
	 * Function(3 bits) - func 1 only supported now
	 * Block mode bit - 0 is byte mode, 1 is block mode
	 * OP Code bit - 0 is fixed addr, 1 is incr addr
	 * address - up to 17 bits
	 * Byte/Blockcount - up to 9 bits
	 * CRC- 7bit
	 * stop bit - Always 1
	 */
	cmd |= 0x40;		/* Direction , 1 = towards device, 0 = towards host */
	cmd |= (SD_IO_RW_EXTENDED & 0x3f);

	arg |= (write & 0x1) << 31;
	arg |= (fn & 0x7) << 28;
	arg |= (opcode & 0x1) << 26;
	arg |= (address & 0x1ffff) << 9;	/* 17bit address */
	arg |= (count & 0x1ff);
	arg |= (block & 0x1) << 27;

	data[1] = 0x40 | cmd;
	put_unaligned_be32(arg, data + 2);
	data[6] = crc7_be(0, data + 1, 5) | 0x01;

	return SPI_COMMAND_SIZE;
}
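
As a cross-check going the other way, here is a Python sketch that mirrors the argument packing in `morse_spi_put_cmd53` above and rebuilds the command bytes for the INT1_STS read and the INT1_CLR write. The `crc7` helper is my own bitwise implementation of CRC-7 (polynomial x^7 + x^3 + 1), standing in for the kernel's `crc7_be`; `build_cmd53` is a hypothetical name, and the function number 1 is taken from the captured frames.

```python
SD_IO_RW_EXTENDED = 53  # CMD53 index

def crc7(data):
    """Bitwise CRC-7 (x^7 + x^3 + 1), MSB first, initial value 0."""
    crc = 0
    for byte in data:
        for bit in range(7, -1, -1):
            top = ((crc >> 6) ^ (byte >> bit)) & 1
            crc = (crc << 1) & 0x7F
            if top:
                crc ^= 0x09
    return crc

def build_cmd53(fn, address, count, write, block=0, opcode=1):
    """Pack the same fields as the driver and append the CRC/stop byte."""
    arg = ((write & 0x1) << 31
           | (fn & 0x7) << 28
           | (block & 0x1) << 27
           | (opcode & 0x1) << 26
           | (address & 0x1FFFF) << 9
           | (count & 0x1FF))
    frame = bytes([0x40 | SD_IO_RW_EXTENDED]) + arg.to_bytes(4, "big")
    # The CRC occupies bits 7..1 of the last byte; bit 0 is the stop bit.
    return frame + bytes([(crc7(frame) << 1) | 0x01])

INT_BASE = 0x100A6050  # MM6108_REG_INT_BASE from mm6108.c
read_sts = build_cmd53(fn=1, address=INT_BASE & 0xFFFF, count=4, write=0)
write_clr = build_cmd53(fn=1, address=(INT_BASE + 0x08) & 0xFFFF, count=4, write=1)

print(read_sts.hex())
print(write_clr.hex())
```

The two frames come out as `7514c0a00489` and `7594c0b004cd`, the same bytes (including the CRC bytes 0x89 and 0xCD) seen on MOSI for A1 and A2.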

morse_driver/mm6108.c

#define MM6108_REG_INT_BASE			0x100a6050

// ... omitted ...

static const struct morse_hw_regs mm6108_regs = {
	/* Register address maps */
	.irq_base_address = MM6108_REG_INT_BASE,

// ... omitted ...

};

morse_driver/hw.h

#define MORSE_REG_INT_BASE(mors)	((mors)->cfg->regs->irq_base_address)
#define MORSE_REG_INT1_STS(mors)	(MORSE_REG_INT_BASE(mors) + 0x00)
#define MORSE_REG_INT1_SET(mors)	(MORSE_REG_INT_BASE(mors) + 0x04)
#define MORSE_REG_INT1_CLR(mors)	(MORSE_REG_INT_BASE(mors) + 0x08)

morse_driver/hw.c

int morse_hw_irq_handle(struct morse *mors)
{

// ... omitted ...

	morse_reg32_read(mors, MORSE_REG_INT1_STS(mors), &status1);

// ... omitted ...

	morse_reg32_write(mors, MORSE_REG_INT1_CLR(mors), status1);

// ... omitted ...

}

Morse Receiving Data

Receiving

SPI CLK/MOSI/MISO 
  ┌────────────┐  ┌┐       ┌┐    ┌────────────┐       ┌┐   ┌┐  ┌┐    ┌────────────┐         
  │            │  ││       ││    │            │       ││   ││  ││    │            │         
  │            │  ││       ││    │            │       ││   ││  ││    │            │         
  │            │  ││       ││    │            │       ││   ││  ││    │            │         
──┘            └──┘└───────┘└────┘            └───────┘└───┘└──┘└────┘            └─────────
                                                                                            
  |--------------- A ------------|                                                          
                                                                                            

The interval from the start of one "major transaction" to the start of the next (A in the diagram) is anywhere between 1 and 2.3 milliseconds.

Morse Sending Data

Sending

SPI CLK/MOSI/MISO 
  ┌────────────┐  ┌┐   ┌┐    ┌────────────┐    ┌┐   ┌┐    ┌────────────┐  ┌┐   ┌┐  ┌────────
  │            │  ││   ││    │            │    ││   ││    │            │  ││   ││  │        
  │            │  ││   ││    │            │    ││   ││    │            │  ││   ││  │        
  │            │  ││   ││    │            │    ││   ││    │            │  ││   ││  │        
──┘            └──┘└───┘└────┘            └────┘└───┘└────┘            └──┘└───┘└──┘        
                                                                                            
  |------------ B -----------|                                                          
                                                                                            

The interval from the start of one "major transaction" to the start of the next (B in the diagram) is anywhere between 600 and 800 microseconds.

Morse Transaction Analysis

In terms of throughput, AP → STA (receiving) reaches roughly half of what STA → AP (sending) achieves.
The STA is the Morse Micro module on the rk3588 and the AP is a Halowlink1.
12-15 Mbps sending
6-7 Mbps receiving

testing was performed with

$ iperf3 -c 192.168.13.226 -p 5202 -u -b 20M -t 0 -R
$ iperf3 -c 192.168.13.226 -p 5202 -u -b 20M -t 0

SPI Throughput capability of RK3588 SPI

RK3588 SPI Driver Details

The SPI driver has an upper clock limit of 50 MHz, a max transaction size of 0xffff bytes, and a minimum transaction size for DMA usage.

There are a couple of idiosyncrasies to point out below.
The driver states that the max transaction length is 0xffff (65535); per its source comment, the controller seems to hang when given 0x10000.
Note: In practice I can’t recommend allowing transactions of this size due to performance degradation.

DMA is deliberately skipped for transactions of fewer than FIFO_LENGTH words.
On the RK3588 (FIFO length 64, 8-bit words) that means any transaction of 63 bytes or less will not use DMA.

drivers/spi/spi-rockchip.c

/*
 * SPI_CTRLR1 is 16-bits, so we should support lengths of 0xffff + 1. However,
 * the controller seems to hang when given 0x10000, so stick with this for now.
 */
#define ROCKCHIP_SPI_MAX_TRANLEN		0xffff

// ... omitted ...

static u32 get_fifo_len(struct rockchip_spi *rs)
{
	switch (rs->version) {
	case ROCKCHIP_SPI_VER2_TYPE1:
	case ROCKCHIP_SPI_VER2_TYPE2:
		return 64;
	default:
		return 32;
	}
}

// ... omitted ...

static int rockchip_spi_probe(struct platform_device *pdev)
{

// ... omitted ...

	rs->fifo_len = get_fifo_len(rs);

// ... omitted ...

	if (ctlr->dma_tx && ctlr->dma_rx) {
		rs->dma_addr_tx = mem->start + ROCKCHIP_SPI_TXDR;
		rs->dma_addr_rx = mem->start + ROCKCHIP_SPI_RXDR;
		ctlr->can_dma = rockchip_spi_can_dma;
	}

// ... omitted ...

}

// ... omitted ...

static bool rockchip_spi_can_dma(struct spi_controller *ctlr,
				 struct spi_device *spi,
				 struct spi_transfer *xfer)
{
	struct rockchip_spi *rs = spi_controller_get_devdata(ctlr);
	unsigned int bytes_per_word = xfer->bits_per_word <= 8 ? 1 : 2;

	/* if the numbor of spi words to transfer is less than the fifo
	 * length we can just fill the fifo and wait for a single irq,
	 * so don't bother setting up dma
	 */
	return xfer->len / bytes_per_word >= rs->fifo_len;
}

// ... omitted ...

static int rockchip_spi_transfer_one(
		struct spi_controller *ctlr,
		struct spi_device *spi,
		struct spi_transfer *xfer)
{

// ... omitted ...

	rs->n_bytes = xfer->bits_per_word <= 8 ? 1 : 2;
	rs->xfer = xfer;
	if (rs->poll) {
		xfer_mode = ROCKCHIP_SPI_POLL;
	} else {
		use_dma = ctlr->can_dma ? ctlr->can_dma(ctlr, spi, xfer) : false;
		if (use_dma)
			xfer_mode = ROCKCHIP_SPI_DMA;
		else
			xfer_mode = ROCKCHIP_SPI_IRQ;
	}

// ... omitted ...

}
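
As a quick way to reason about which transfers take the DMA path, the following Python sketch mirrors the comparison in rockchip_spi_can_dma. It is only a model of the excerpt above: the function name can_dma and its defaults (8-bit words, FIFO length 64) are assumptions for illustration, and it ignores the poll mode branch in rockchip_spi_transfer_one.

```python
def can_dma(xfer_len, bits_per_word=8, fifo_len=64):
    """Model of rockchip_spi_can_dma: DMA only when the word count reaches the FIFO length."""
    bytes_per_word = 1 if bits_per_word <= 8 else 2
    return xfer_len // bytes_per_word >= fifo_len


# Transfer sizes taken from the mm6108 traffic discussed in this thread.
for size in (26, 56, 63, 64, 114, 233):
    mode = "DMA" if can_dma(size) else "IRQ"
    print(f"{size:>4} bytes -> {mode}")
```

With 8-bit words the switch from IRQ mode to DMA happens exactly at 64 bytes; with 16-bit words the same threshold corresponds to 128 bytes.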

spidev-test Results

Spidev on the rockchip defaults to a max transaction length of 4096.

$ sudo spidev_test -D /dev/spidev0.0 -s 50000000 -S 4096 -I 10000 
spi mode: 0x0
bits per word: 8
max speed: 50000000 Hz (50000 kHz)
rate: tx 16003.9kbps, rx 16003.9kbps
rate: tx 18946.5kbps, rx 18946.5kbps
rate: tx 18926.8kbps, rx 18926.8kbps
total: tx 40000.0KB, rx 40000.0KB

$ sudo spidev_test -D /dev/spidev0.0 -s 50000000 -S 8192 -I 10 
spi mode: 0x0
bits per word: 8
max speed: 50000000 Hz (50000 kHz)
can't send spi message: Message too long
Aborted

In order to test the limits of the spi peripheral, spidev was built and loaded as a module instead.

config option:

- CONFIG_SPI_SPIDEV=y
+ CONFIG_SPI_SPIDEV=m

$ cat /sys/module/spidev/parameters/bufsiz
4096
$ sudo rmmod spidev
$ sudo modprobe spidev bufsiz=1048576
$ cat /sys/module/spidev/parameters/bufsiz
1048576

This buffer is ungodly large but that’s beside the point. The spi-rockchip driver for the spi peripheral won’t accept anything larger than 0xffff anyway.

Result Summary:

typical mm6108 spi transfer sizes

Transfer Size (bytes)   Average Throughput (kbps)   Average Throughput (Mbps)
26                      1739.7                      1.74
56                      1693.8                      1.69
64                      4399.5                      4.40
114                     4360.9                      4.36
233                     8016.7                      8.02
245                     8049.6                      8.05
2303                    10041.1                     10.04

8 byte aligned transfers

Transfer Size (bytes)   Average Throughput (kbps)   Average Throughput (Mbps)
4096                    18297.7                     18.30
8192                    19698.0                     19.70
16384                   20765.1                     20.77
32768                   21192.8                     21.19
65528                   25165.6                     25.17

non-8 byte aligned transfers

Transfer Size (bytes)   Average Throughput (kbps)   Average Throughput (Mbps)
32769                   12793.0                     12.79
65535                   15663.0                     15.66

This essentially tells us that:

  • anything that is less than the minimum 64 bytes does not use DMA.
  • something in the chain between spi driver, dma and spi peripheral does not like non-8 byte aligned transactions.
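
To put the spidev_test numbers in perspective, this rough Python sketch converts a few rows of the tables above into time per transfer and compares that with the pure wire time at 50 MHz. It assumes that spidev_test's "kbps" means 1000 bits per second and that every transfer is clocked at the full 50 MHz; the function names are made up for illustration.

```python
LINE_RATE_BPS = 50_000_000  # spidev_test was run with -s 50000000

# (transfer size in bytes, measured throughput in kbps), copied from the tables above
MEASURED = [(26, 1739.7), (64, 4399.5), (4096, 18297.7), (65528, 25165.6), (65535, 15663.0)]


def per_transfer_us(size_bytes, kbps):
    """Wall-clock time per transfer implied by the measured throughput."""
    return size_bytes * 8 / (kbps * 1000) * 1e6


def wire_us(size_bytes):
    """Time the bits alone would need on the bus at the configured clock."""
    return size_bytes * 8 / LINE_RATE_BPS * 1e6


for size, kbps in MEASURED:
    total = per_transfer_us(size, kbps)
    wire = wire_us(size)
    print(f"{size:>6} B  total {total:9.1f} us  wire {wire:9.1f} us  efficiency {wire / total:6.1%}")
```

Under these assumptions a 26 byte and a 64 byte transfer both take on the order of 100 µs in total while needing only a few µs on the wire, so small transfers are dominated by fixed per-transfer overhead rather than by the clock rate. Even the best case (65528 bytes) reaches only about half of the 50 MHz line rate.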

Understanding RK3288 and Raspberry Pi hacks in morse spi driver

Transaction sizing

morse_driver/spi.c

#define MMC_SPI_BLOCKSIZE		(512)

#define MM610X_BUF_SIZE			(8 * 1024U)

// ... omitted ...

/* SW-5611:
 *
 * The value of SPI_MAX_TRANSACTION_SIZE was increased from 4096 to 8192
 * This will reduce the overhead of inter transaction delay to increase throughput
 *
 */
/** Maximum number of bytes per RPi SPI transaction */
#define SPI_MAX_TRANSACTION_SIZE	(8192)
/** Maximum number of bytes per SPI read/write */
#define SPI_MAX_TRANSFER_SIZE		(64 * 1024)

// ... omitted ...

#define SPI_DEFAULT_MAX_INTER_BLOCK_DELAY_BYTES	250

Which of these are hard limits imposed by the MM6108 and which of these are “device specific”?

I’ve tried changing some of these values and ran into varying degrees of failures.

I guess i’ll think through these one more time…

MMC_SPI_BLOCKSIZE (512 bytes)

This is used in many different places to compute a whole bunch of things; it is not immediately clear what.
My guess here is that this is a fundamental limit of the SDIO hw block on the MM6108 side, so I shouldn’t attempt to change it.

MM610X_BUF_SIZE (8192 bytes)

Guessing this is the buffer used for interacting with the mm6108, sending or receiving data.

SPI_MAX_TRANSACTION_SIZE (8192 bytes)

Apparently changed from 4096 to 8192 for RPi reasons. That makes sense: the larger your individual transactions, the less inter frame delay. But why not more? Different hosts have different limitations; on the rk3588 this can be up to 65535 (although that is not advisable; 32768 would be better).

SPI_MAX_TRANSFER_SIZE (65536 bytes)

I don’t understand the difference between SPI_MAX_TRANSACTION_SIZE and SPI_MAX_TRANSFER_SIZE.
SPI_MAX_TRANSFER_SIZE seems to be used for “direct memory writes”. But that value exceeds the max limit of the rk3588 spi driver, so I’m not even sure how this works.

SPI_COMMAND_BUF_SIZE (20/30 bytes)

Arbitrarily increased for rk3288. I wonder if this can be further increased to handle the dma issues on the rk3588.

morse_driver/spi.c

/* Size of the buffer required for SPI commands without data blocks (e.g. CMD0, CMD52) */
#ifdef CONFIG_MORSE_SPI_RK3288
/* Some Rockchip devices are sometimes slow to put the response on the MISO line */
/* TODO check if this is still the case with JTAG issues resolved. */
#define SPI_COMMAND_BUF_SIZE		(30)
#else
#define SPI_COMMAND_BUF_SIZE		(20)
#endif

Does this hack relate to my earlier mention about RK3588 spi driver min transaction size and usage of FIFO?
It does mention “put the response on the MISO line” so likely rk was a slave device here and this does not apply.

Hack to shift bits

morse_driver/spi.c

/* Hack to shift bits for problematic SPI controllers */
static void morse_shift_buffer(u8 *data, unsigned int len, u8 right_shift_bits)
{
// ... omitted ...
}

static int morse_spi_xfer(struct morse_spi *mspi, unsigned int len)
{

// ... omitted ...

	if (is_rk3288)
		morse_shift_buffer(mspi->data, len, 1);

	return ret;
}

I’m not entirely certain about this, but I think this could also be solved with appropriate rx sample delay compensation.
I don’t recall where I found this, but the MM6108 specifies its max delay between clk active and MOSI active. I computed the delay needed based on that and my spi clock frequency.

/ {
    fragment@0 {
        target = <&spi0>;

        __overlay__ {
            status = "okay";

// ... omitted ...

			rx-sample-delay-ns = <10>;

// ... omitted ...

            morse_wifi@0 {
                compatible = "morse,mm610x-spi";

// ... omitted ...

                status = "okay";
            };
        };
    };
};

Asks for Morse Micro

  • How can I make better use of the spi driver on the rk3588?
  • Which parameters can I change to increase the minimum transfer size to 64 bytes, so that DMA is always used, by just adding padding?
  • Is it feasible to align all transfers to an 8 byte boundary to ensure no slowdowns occur on the SPI bus?
  • Is it possible to increase the size of the buffers used to transmit data? Is 8192 a hard limit or could this theoretically be larger?
  • What are the other types of interactions between host and morse that are shorter, aside from the IRQ setting and clearing?

Logic Analyzer Traces

  • Whole trace Overview
  • Idle period Overview
  • Idle period Interrupt Pattern
  • Idle period Single Interrupt Overview
  • Idle period Single Interrupt INT1 STS and CLR
  • Idle period Single Interrupt 2nd TXN Overview
  • Idle period Single Interrupt 3rd TXN Overview
  • RX Overview
  • RX Period 1 sec zoom
  • RX Period 200 milli sec zoom
  • RX Period 60 milli sec zoom
  • RX Period 10 milli sec zoom
  • RX Period 6 milli sec zoom
  • RX Period Single TXN
  • RX Period Single TXN CS active

Hello,

We are integrating the Quectel FGH100M-H into a custom PCB, with the Orange Pi 5 Plus and RK3588 as the host processor, communicating over SPI. Our board (Orange Pi) is configured as the STA and we use an 8 MHz wide channel at 916 MHz in a wired setup for transmission. We achieve only 13 Mbps in the uplink (to the Halowlink1 AP) and 7 Mbps in the downlink while requesting 20 Mbps UDP. The commands used are iperf3 -c 192.168.69.1 -u -b 20M -t 0 -p5201 (-R). The MMRC on the AP shows very low success rates across all MCS rates when communicating with our custom hardware. The MAC address of our board is 68:24:99:44:6A:61. The table shows other connected devices as well, which are Halowlink1 STAs and Quectel EVK FGH100-M boards; these do not have any of the above issues.

I’ve also included an MMRC table for our PCB transmitting to the AP, which has high success rates.
Generally seeing high counts of AGG TX param mismatch: what specifically is it, what causes it and how to address it?

The 13 Mbps limit seems to come from the SPI limitations addressed here:

https://community.morsemicro.com/t/mm6108-quectel-fgh100m-h-spi-protocol-level-optimizations/979/3

I’ve tried fixing the MCS rate on the AP as well, and did not see any improvement in downlink throughput either. This is replicable across multiple units.

Quectel Orange Pi Debug 102125 v1.zip (2.5 MB)

I’ve included morse_cli stats on each device as well, reset before iperf test and collected after. I want to note that one log shows a very low received power (-77dBm) , that was because it was collected after the end of active transmission. Received power from the AP is generally -52dBm.

Please advise on how best to debug/proceed.

Will post some more updates when I have them. But the main issue here might really boil down to rockchip’s dma controller stalling. Unless I can find a cleaner fix, I will have to modify all the Morse Micro transactions to align to 8 bytes where possible.

Hi @alexb, @Luisschubert

Thanks for capturing and sharing the volume of data you have collected! I’ve merged your threads into one so it is easier for us to find all the relevant information, and we will try to get an answer to you shortly!

SPI Driver quirks RK3588

Topics Addressed:

  • RK3588 Spi driver:
    • Variable delays after transaction transmit until CS inactive
    • Long delays after transaction transmit until CS inactive
    • Clock delays during spi transaction of non-aligned sizes.
  • Morse Bus Interaction:
    • Delays in communication seemingly triggered by MM6108 interrupt

TLDR

Even with some changes to optimize the morse driver’s usage of the rockchip spi driver, I am still not getting as much throughput as I would like.
I think there are still some delays in the rockchip spi driver that can be burned down, specifically the “Long delays after transaction transmit until CS inactive”. (But I am not sure how much of a problem this truly is.)

The other delays that I cannot explain are noted in “Delays in communication seemingly triggered by MM6108 interrupt”.

Overall my UDP iperf3 throughput has improved from 6-7 Mbps to 10-11 Mbps for RX and from 10-12 Mbps to 15-16 Mbps for TX.

Variable delays after transaction transmit until CS inactive

Use rockchip,autosuspend-delay-ms dt property

The first test I did was enabling the autosuspend delay:

diff --git a/arch/arm64/boot/dts/rockchip/overlay/rk3588-spi0-m2-cs0-spidev.dts b/arch/arm64/boot/dts/rockchip/overlay/rk3588-spi0-m2-cs0-spidev.dts
index d3843ab3407f..0d9e9dadbb7b 100644
--- a/arch/arm64/boot/dts/rockchip/overlay/rk3588-spi0-m2-cs0-spidev.dts
+++ b/arch/arm64/boot/dts/rockchip/overlay/rk3588-spi0-m2-cs0-spidev.dts
@@ -12,6 +12,7 @@ __overlay__ {
                        pinctrl-names = "default";
                        pinctrl-0 = <&spi0m2_cs0 &spi0m2_pins>;
                        max-freq = <50000000>;
+                       rockchip,autosuspend-delay-ms = <100>;
 
                        spidev@0 {
                                compatible = "rockchip,spidev";

This didn’t yield much benefit at all.

Create rockchip,always-on-pm dt property

So I took a more aggressive approach and added a property that makes the driver skip runtime PM for the controller.

diff --git a/drivers/spi/spi-rockchip.c b/drivers/spi/spi-rockchip.c
index b627663667a6..2061542951b5 100644
--- a/drivers/spi/spi-rockchip.c
+++ b/drivers/spi/spi-rockchip.c
@@ -229,6 +229,7 @@ struct rockchip_spi {
 
        /* quirks */
        u32 max_baud_div_in_cpha;
+       bool always_on_pm; /* Keep device always on for max throughput */
 };
 
 static inline void spi_enable_chip(struct rockchip_spi *rs, bool enable)
@@ -287,8 +288,9 @@ static void rockchip_spi_set_cs(struct spi_device *spi, bool enable)
                return;
 
        if (cs_asserted) {
-               /* Keep things powered as long as CS is asserted */
-               pm_runtime_get_sync(rs->dev);
+               /* Skip PM if always-on mode is enabled */
+               if (!rs->always_on_pm)
+                       pm_runtime_get_sync(rs->dev);
 
                if (spi->cs_gpiod)
                        ROCKCHIP_SPI_SET_BITS(rs->regs + ROCKCHIP_SPI_SER, 1);
@@ -300,8 +302,9 @@ static void rockchip_spi_set_cs(struct spi_device *spi, bool enable)
                else
                        ROCKCHIP_SPI_CLR_BITS(rs->regs + ROCKCHIP_SPI_SER, BIT(spi->chip_select));
 
-               /* Drop reference from when we first asserted CS */
-               pm_runtime_put(rs->dev);
+               /* Skip PM if always-on mode is enabled */
+               if (!rs->always_on_pm)
+                       pm_runtime_put(rs->dev);
        }
 
        rs->cs_asserted[spi->chip_select] = cs_asserted;
@@ -980,6 +983,7 @@ static int rockchip_spi_probe(struct platform_device *pdev)
        struct pinctrl *pinctrl = NULL;
        const struct rockchip_spi_quirks *quirks_cfg;
        u32 val;
+       bool disable_auto_runtime_pm = false;
 
        slave_mode = of_property_read_bool(np, "spi-slave");
 
@@ -1107,6 +1111,9 @@ static int rockchip_spi_probe(struct platform_device *pdev)
        if (quirks_cfg)
                rs->max_baud_div_in_cpha = quirks_cfg->max_baud_div_in_cpha;
 
+       /* Check for always-on mode for max throughput */
+       rs->always_on_pm = device_property_read_bool(&pdev->dev, "rockchip,always-on-pm");
+
        if (!device_property_read_u32(&pdev->dev, "rockchip,autosuspend-delay-ms", &val)) {
                if (val > 0) {
                        pm_runtime_set_autosuspend_delay(&pdev->dev, val);
@@ -1114,10 +1121,19 @@ static int rockchip_spi_probe(struct platform_device *pdev)
                }
        }
 
+       if (rs->always_on_pm) {
+               /* Disable framework PM for max throughput */
+               ctlr->auto_runtime_pm = false;
+       } else {
+               if (!device_property_read_bool(&pdev->dev, "rockchip,disable-auto-runtime-pm"))
+                       ctlr->auto_runtime_pm = true;
+               else
+                       ctlr->auto_runtime_pm = false;
+       }
+
        pm_runtime_set_active(&pdev->dev);
        pm_runtime_enable(&pdev->dev);
 
-       ctlr->auto_runtime_pm = true;
        ctlr->bus_num = pdev->id;
        ctlr->mode_bits = SPI_CPOL | SPI_CPHA | SPI_LOOP | SPI_LSB_FIRST;
        if (slave_mode) {
@@ -1238,6 +1254,10 @@ static int rockchip_spi_probe(struct platform_device *pdev)
        dev_info(rs->dev, "probed, poll=%d, rsd=%d, cs-inactive=%d, ready=%d\n",
                 rs->poll, rs->rsd, rs->cs_inactive, rs->ready ? 1 : 0);
 
+       /* Keep device always powered if always-on mode enabled */
+       if (rs->always_on_pm)
+               pm_runtime_get_noresume(rs->dev);
+
        return 0;
 
 err_free_dma_rx:

and the corresponding device tree:

diff --git a/arch/arm64/boot/dts/rockchip/overlay/rk3588-spi0-m2-cs0-spidev.dts b/arch/arm64/boot/dts/rockchip/overlay/rk3588-spi0-m2-cs0-spidev.dts
index 0d9e9dadbb7b..034dd4604426 100644
--- a/arch/arm64/boot/dts/rockchip/overlay/rk3588-spi0-m2-cs0-spidev.dts
+++ b/arch/arm64/boot/dts/rockchip/overlay/rk3588-spi0-m2-cs0-spidev.dts
@@ -12,7 +12,7 @@ __overlay__ {
                        pinctrl-names = "default";
                        pinctrl-0 = <&spi0m2_cs0 &spi0m2_pins>;
                        max-freq = <50000000>;
-                       rockchip,autosuspend-delay-ms = <100>;
+                       rockchip,always-on-pm;
 
                        spidev@0 {
                                compatible = "rockchip,spidev";

I ran some spidev-tests with this setting and found more consistency in the cs active periods.

Results:

Tx size (bytes)   always-on-pm enabled   Mean active period duration (µs)   Stddev active period duration (µs)
26                false                  91.31                              89.86
26                true                   53.24                              6.90
128               false                  147.45                             70.76
128               true                   103.72                             19.18
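
The compute_cs_active_duration.py script that produced the numbers in this section is not included in the thread. The following Python sketch shows one way such statistics could be derived; it assumes the logic analyzer export has already been parsed into (timestamp in seconds, CS level) pairs, that CS is active low, and that the sample standard deviation is wanted. All function names are hypothetical.

```python
import statistics


def cs_active_durations(events):
    """events: iterable of (timestamp_s, cs_level) edges; CS is assumed to be active low."""
    durations = []
    start = None
    for timestamp, level in events:
        if level == 0 and start is None:
            start = timestamp                    # falling edge: CS asserted
        elif level == 1 and start is not None:
            durations.append(timestamp - start)  # rising edge: CS released
            start = None
    return durations


def window_means(values, window=1000):
    """Means over non-overlapping windows; the last window may be partial."""
    return [statistics.fmean(values[i:i + window]) for i in range(0, len(values), window)]


def summarize(durations):
    """Summary statistics in microseconds."""
    return {
        "count": len(durations),
        "min_us": min(durations) * 1e6,
        "max_us": max(durations) * 1e6,
        "mean_us": statistics.fmean(durations) * 1e6,
        "stddev_us": statistics.stdev(durations) * 1e6,
    }


# Tiny made-up example: two CS pulses of 50 µs and 100 µs.
demo = cs_active_durations([(0.0, 0), (50e-6, 1), (100e-6, 0), (200e-6, 1)])
print(summarize(demo))
```

Counting a partial last window is consistent with the outputs in this section, where 10148 durations are reported as 11 windows of 1000 samples.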

Test with 128 byte transfers

with pm enabled:

% python3 compute_cs_active_duration.py ~/rk3588\ spidev\ test\ 128\ bytes.csv
Parsing CSV file: /Users/luisschubert/rk3588 spidev test 128 bytes.csv
Found 20297 SPI events
Calculated 10148 CS active durations

CS Active Duration Statistics:
  Count:     10148
  Min:       0.000053460 seconds
  Max:       0.000727180 seconds
  Mean:      0.000147448 seconds
  Std Dev:   0.000070762 seconds

In microseconds:
  Min:       53.46 µs
  Max:       727.18 µs
  Mean:      147.45 µs
  Std Dev:   70.76 µs

Running Averages (non-overlapping windows of 1000 samples):
  Number of windows: 11
  Window 1: 65.54 µs
  Window 2: 69.04 µs
  Window 3: 87.80 µs
  Window 4: 158.15 µs
  Window 5: 156.56 µs
  Window 6: 196.20 µs
  Window 7: 160.46 µs
  Window 8: 193.95 µs
  Window 9: 187.52 µs
  Window 10: 192.70 µs
  Window 11: 191.75 µs

with pm disabled:

% python3 compute_cs_active_duration.py ~/rk3588\ spidev\ test\ 128\ bytes\ no_pm.csv 
Parsing CSV file: /Users/luisschubert/rk3588 spidev test 128 bytes no_pm.csv
Found 40000 SPI events
Calculated 20000 CS active durations

CS Active Duration Statistics:
  Count:     20000
  Min:       0.000045216 seconds
  Max:       0.000470476 seconds
  Mean:      0.000103716 seconds
  Std Dev:   0.000019179 seconds

In microseconds:
  Min:       45.22 µs
  Max:       470.48 µs
  Mean:      103.72 µs
  Std Dev:   19.18 µs

Running Averages (non-overlapping windows of 1000 samples):
  Number of windows: 20
  Window 1: 121.24 µs
  Window 2: 124.76 µs
  Window 3: 101.62 µs
  Window 4: 101.41 µs
  Window 5: 100.81 µs
  Window 6: 101.55 µs
  Window 7: 101.65 µs
  Window 8: 101.59 µs
  Window 9: 101.56 µs
  Window 10: 101.96 µs
  Window 11: 101.55 µs
  Window 12: 101.81 µs
  Window 13: 101.57 µs
  Window 14: 101.48 µs
  Window 15: 101.78 µs
  Window 16: 101.62 µs
  Window 17: 101.84 µs
  Window 18: 101.45 µs
  Window 19: 101.57 µs
  Window 20: 101.48 µs

Test with 26 byte transfers

with pm enabled

% python3 compute_cs_active_duration.py ~/rk3588\ spidev\ test\ 26\ bytes.csv     
Parsing CSV file: /Users/luisschubert/rk3588 spidev test 26 bytes.csv
Found 200000 SPI events
Calculated 100000 CS active durations

CS Active Duration Statistics:
  Count:     100000
  Min:       0.000022460 seconds
  Max:       0.000790072 seconds
  Mean:      0.000091312 seconds
  Std Dev:   0.000089861 seconds

In microseconds:
  Min:       22.46 µs
  Max:       790.07 µs
  Mean:      91.31 µs
  Std Dev:   89.86 µs

Running Averages (non-overlapping windows of 1000 samples):
  Number of windows: 100
  Window 1: 39.22 µs
  Window 2: 42.68 µs
  Window 3: 46.94 µs
  Window 4: 45.74 µs
  Window 5: 46.01 µs
  Window 6: 46.01 µs
  Window 7: 44.82 µs
  Window 8: 45.67 µs
  Window 9: 44.40 µs
  Window 10: 46.39 µs
  Window 11: 46.00 µs
  Window 12: 46.61 µs
  Window 13: 45.59 µs
  Window 14: 48.88 µs
  Window 15: 45.09 µs
  Window 16: 46.85 µs
  Window 17: 45.01 µs
  Window 18: 46.05 µs
  Window 19: 44.24 µs
  Window 20: 45.96 µs
  Window 21: 46.28 µs
  Window 22: 46.71 µs
  Window 23: 47.45 µs
  Window 24: 45.13 µs
  Window 25: 43.71 µs
  Window 26: 43.61 µs
  Window 27: 43.57 µs
  Window 28: 43.74 µs
  Window 29: 43.64 µs
  Window 30: 43.73 µs
  Window 31: 43.62 µs
  Window 32: 43.77 µs
  Window 33: 43.72 µs
  Window 34: 43.71 µs
  Window 35: 43.91 µs
  Window 36: 43.99 µs
  Window 37: 43.65 µs
  Window 38: 44.96 µs
  Window 39: 44.00 µs
  Window 40: 44.15 µs
  Window 41: 48.02 µs
  Window 42: 44.08 µs
  Window 43: 43.73 µs
  Window 44: 43.73 µs
  Window 45: 43.70 µs
  Window 46: 44.04 µs
  Window 47: 43.69 µs
  Window 48: 43.86 µs
  Window 49: 44.00 µs
  Window 50: 43.69 µs
  Window 51: 45.26 µs
  Window 52: 43.75 µs
  Window 53: 43.70 µs
  Window 54: 43.65 µs
  Window 55: 43.67 µs
  Window 56: 44.00 µs
  Window 57: 43.68 µs
  Window 58: 44.14 µs
  Window 59: 43.86 µs
  Window 60: 43.70 µs
  Window 61: 43.67 µs
  Window 62: 43.69 µs
  Window 63: 43.67 µs
  Window 64: 44.84 µs
  Window 65: 44.65 µs
  Window 66: 40.75 µs
  Window 67: 69.72 µs
  Window 68: 174.65 µs
  Window 69: 251.18 µs
  Window 70: 241.01 µs
  Window 71: 245.50 µs
  Window 72: 227.01 µs
  Window 73: 236.98 µs
  Window 74: 255.29 µs
  Window 75: 244.81 µs
  Window 76: 269.90 µs
  Window 77: 239.31 µs
  Window 78: 241.33 µs
  Window 79: 239.26 µs
  Window 80: 260.47 µs
  Window 81: 241.23 µs
  Window 82: 97.43 µs
  Window 83: 72.54 µs
  Window 84: 129.96 µs
  Window 85: 251.16 µs
  Window 86: 233.18 µs
  Window 87: 243.15 µs
  Window 88: 243.24 µs
  Window 89: 254.56 µs
  Window 90: 256.69 µs
  Window 91: 116.14 µs
  Window 92: 72.59 µs
  Window 93: 75.54 µs
  Window 94: 72.57 µs
  Window 95: 71.62 µs
  Window 96: 73.39 µs
  Window 97: 72.46 µs
  Window 98: 71.56 µs
  Window 99: 75.35 µs
  Window 100: 271.93 µs

with pm disabled

% python3 compute_cs_active_duration.py ~/rk3588\ spidev\ test\ 26\ bytes\ no_pm.csv
Parsing CSV file: /Users/luisschubert/rk3588 spidev test 26 bytes no_pm.csv
Found 200000 SPI events
Calculated 100000 CS active durations

CS Active Duration Statistics:
  Count:     100000
  Min:       0.000027016 seconds
  Max:       0.001440572 seconds
  Mean:      0.000053236 seconds
  Std Dev:   0.000006904 seconds

In microseconds:
  Min:       27.02 µs
  Max:       1440.57 µs
  Mean:      53.24 µs
  Std Dev:   6.90 µs

Running Averages (non-overlapping windows of 1000 samples):
  Number of windows: 100
  Window 1: 48.78 µs
  Window 2: 52.20 µs
  Window 3: 53.25 µs
  Window 4: 53.41 µs
  Window 5: 54.26 µs
  Window 6: 53.24 µs
  Window 7: 53.16 µs
  Window 8: 53.30 µs
  Window 9: 53.32 µs
  Window 10: 53.33 µs
  Window 11: 53.40 µs
  Window 12: 53.16 µs
  Window 13: 53.45 µs
  Window 14: 53.18 µs
  Window 15: 53.86 µs
  Window 16: 53.50 µs
  Window 17: 53.06 µs
  Window 18: 53.19 µs
  Window 19: 53.48 µs
  Window 20: 53.21 µs
  Window 21: 54.29 µs
  Window 22: 53.37 µs
  Window 23: 53.10 µs
  Window 24: 53.02 µs
  Window 25: 53.09 µs
  Window 26: 53.08 µs
  Window 27: 53.03 µs
  Window 28: 53.28 µs
  Window 29: 53.06 µs
  Window 30: 52.87 µs
  Window 31: 53.22 µs
  Window 32: 53.40 µs
  Window 33: 53.10 µs
  Window 34: 53.13 µs
  Window 35: 53.33 µs
  Window 36: 53.10 µs
  Window 37: 53.54 µs
  Window 38: 52.93 µs
  Window 39: 53.15 µs
  Window 40: 53.15 µs
  Window 41: 53.19 µs
  Window 42: 53.31 µs
  Window 43: 52.99 µs
  Window 44: 53.07 µs
  Window 45: 53.37 µs
  Window 46: 53.48 µs
  Window 47: 53.21 µs
  Window 48: 53.46 µs
  Window 49: 53.19 µs
  Window 50: 53.12 µs
  Window 51: 53.41 µs
  Window 52: 53.63 µs
  Window 53: 52.85 µs
  Window 54: 53.23 µs
  Window 55: 53.02 µs
  Window 56: 53.05 µs
  Window 57: 53.42 µs
  Window 58: 53.96 µs
  Window 59: 53.28 µs
  Window 60: 53.28 µs
  Window 61: 53.31 µs
  Window 62: 53.09 µs
  Window 63: 53.63 µs
  Window 64: 53.07 µs
  Window 65: 53.22 µs
  Window 66: 53.00 µs
  Window 67: 53.18 µs
  Window 68: 53.36 µs
  Window 69: 54.03 µs
  Window 70: 53.10 µs
  Window 71: 53.29 µs
  Window 72: 53.69 µs
  Window 73: 53.15 µs
  Window 74: 53.25 µs
  Window 75: 53.04 µs
  Window 76: 53.14 µs
  Window 77: 53.10 µs
  Window 78: 53.19 µs
  Window 79: 53.55 µs
  Window 80: 54.73 µs
  Window 81: 53.21 µs
  Window 82: 53.09 µs
  Window 83: 53.25 µs
  Window 84: 53.44 µs
  Window 85: 53.19 µs
  Window 86: 53.05 µs
  Window 87: 53.04 µs
  Window 88: 53.07 µs
  Window 89: 53.64 µs
  Window 90: 53.72 µs
  Window 91: 53.06 µs
  Window 92: 53.07 µs
  Window 93: 52.90 µs
  Window 94: 53.13 µs
  Window 95: 53.81 µs
  Window 96: 53.02 µs
  Window 97: 53.12 µs
  Window 98: 53.06 µs
  Window 99: 52.96 µs
  Window 100: 54.12 µs

Long delays after transaction transmit until CS inactive

I am seeing this across the board on rockchip transactions. I am not sure where this delay is coming from or how to reduce it, but I will attempt to track it down.

Clock delays during spi transaction of non-aligned sizes

The change I made to the morse spi driver is below. It only gets enabled after the spi probe completes, so as not to interfere with the initialization sequence.

Two changes are in here to address rockchip quirks.

  • force all transactions to a minimum size of 64 bytes
  • ensure all transactions larger than 64 bytes are aligned to 8 bytes.

I don’t think this is foolproof, and I’ve seen issues where the driver completely errors out,
but I have not reliably reproduced them. This might be too aggressive altogether, so I need to rethink it.

morse_driver/spi.c

diff --git a/spi.c b/spi.c
index 19f29ab..df582d6 100644
--- a/spi.c
+++ b/spi.c
@@ -55,6 +55,8 @@ struct morse_spi {
        u16 inter_block_delay_bytes;
        /* Maximum number of blks to write per SPI transaction */
        u8 max_block_count;
+       /* whether to enable spi bus quirks */
+       bool enable_spi_bus_quirks;
 };
 
 #ifdef CONFIG_MORSE_USER_ACCESS
@@ -234,6 +236,31 @@ static int morse_spi_xfer(struct morse_spi *mspi, unsigned int len)
                WARN_ON(1);
                return -EIO;
        }
+       
+       // 
+       // need to enforce 8 byte alignment but need to make sure we don't exceed the buffer size
+       // also need to make sure that txn less that 64 bytes are padded to 64 bytes.
+       if (mspi->enable_spi_bus_quirks)
+       {
+               if (len < 64)
+               {
+                       int extra_bytes = 64 - len;
+                       if (len + extra_bytes > MM610X_BUF_SIZE)
+                       {
+                               len = MM610X_BUF_SIZE;
+                       }
+                       len += extra_bytes;
+               }
+               else if (len % 8)
+               {
+                       int extra_bytes = 8 - (len % 8);
+                       if (len + extra_bytes > MM610X_BUF_SIZE)
+                       {
+                               len = MM610X_BUF_SIZE;
+                       }
+                       len += extra_bytes;
+               }
+       }
 
        mspi->t.len = len;
        ret = spi_sync_locked(mspi->spi, &mspi->m);
@@ -1328,7 +1355,8 @@ static int morse_spi_probe(struct spi_device *spi)
        struct morse_spi *mspi;
        const struct of_device_id *match;
        struct morse_chip_series *mors_chip_series;
-
+       /* disable spi bus quirks by default until we have finished initializing the chip */
+       
        match = of_match_device(of_match_ptr(morse_spi_of_match), &spi->dev);
        if (match)
                mors_chip_series = (struct morse_chip_series *)match->data;
@@ -1353,6 +1381,8 @@ static int morse_spi_probe(struct spi_device *spi)
 
        /* preallocate dma buffers */
        mspi = (struct morse_spi *)mors->drv_priv;
+       MORSE_SPI_INFO(mors, "SPI bus quirks disabled");
+       mspi->enable_spi_bus_quirks = false;
        mspi->data = kmalloc(MM610X_BUF_SIZE, GFP_KERNEL);
        if (!mspi->data) {
                MORSE_SPI_ERR(mors, "%s Failed to allocate DMA buffers (size=%d bytes)\n",
@@ -1549,6 +1579,10 @@ static int morse_spi_probe(struct spi_device *spi)
                goto err_irq;
        }
 
+       // now that we have finished initializing we should enable the spi bus quirks functionality.
+       mspi->enable_spi_bus_quirks = true;
+       MORSE_SPI_INFO(mors, "SPI bus quirks enabled");
+
 #ifdef CONFIG_MORSE_ENABLE_TEST_MODES
        if (test_mode == MORSE_CONFIG_TEST_MODE_BUS)
                ret = morse_bus_test(mors, "SPI");
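
The length arithmetic of the patch above can be checked in isolation. Note that in the patch the clamp branch sets len to MM610X_BUF_SIZE and then still adds extra_bytes, so if that branch were ever taken the resulting length would exceed the buffer. The Python sketch below rounds first and clamps last instead; the name padded_len and the constants MIN_DMA_LEN and ALIGN are made up for illustration, and whether the MM6108 tolerates the extra padding bytes on every transaction type is not something this sketch can answer.

```python
MM610X_BUF_SIZE = 8 * 1024  # from morse_driver/spi.c
MIN_DMA_LEN = 64            # smallest transfer that takes the DMA path on the RK3588
ALIGN = 8                   # alignment that avoided the slowdown in the spidev tests


def padded_len(length, buf_size=MM610X_BUF_SIZE):
    """Round a transfer length up for the RK3588 quirks without exceeding buf_size."""
    if length < MIN_DMA_LEN:
        padded = MIN_DMA_LEN
    else:
        padded = (length + ALIGN - 1) // ALIGN * ALIGN  # next multiple of ALIGN
    return min(padded, buf_size)


# Typical mm6108 transfer sizes from earlier in this thread.
for size in (26, 56, 64, 114, 233, 245, 2303):
    print(f"{size:>5} -> {padded_len(size)}")
```

Because MM610X_BUF_SIZE is itself a multiple of 8, the clamp only matters for lengths that are already larger than the buffer, which is presumably why the issue in the patch stays latent.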

Delays in communication seemingly triggered by MM6108 interrupt

This seems unrelated to my changes because after noticing and going back to some earlier traces I can find the same patterns there as well.


SPI CLK/MOSI/MISO
                                                delay                                       
                                |--------------------------------------|                    
              A1   A2         A3                                       B1    B2       B3    
              ┌┐   ┌┐         ┌┐                                       ┌┐    ┌┐       ┌┐    
              ││   ││         ││                                       ││    ││       ││    
              ││   ││         ││                                       ││    ││       ││    
              ││   ││         ││                                       ││    ││       ││    
──────────────┘└───┘└─────────┘└───────────────────────────────────────┘└────┘└───────┘└────
                                                                                            
                                                                                            
SPI IRQ                                                                                     
──────────┐          ┌────────────────────────────────────────────┐          ┌──────────────
          │          │                                            │          │              
          │          │                                            │          │              
          │          │                                            │          │              
          └──────────┘                                            └──────────┘              

The delay in this specific instance is 2.6ms.
Other handpicked samples (in milliseconds) are:

1.3, 0.8, 4.25, 6.61, 1.8, 2.8, 1.7, 6.7, 6.0, 1, 1.4, 2.2, 6, 4, 6.8, 12.3, 18.9, 17.9, 12.8, 18.9, 8

These seem significant enough to make an impact on throughput.
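To put a rough number on "significant", the handpicked samples above can be summarised with a few lines of Python. This is only a sketch: the samples were picked by hand, so the figures describe this list and not the overall distribution of gaps on the bus.

```python
import statistics

# Handpicked gap durations quoted above, in milliseconds.
delays_ms = [1.3, 0.8, 4.25, 6.61, 1.8, 2.8, 1.7, 6.7, 6.0, 1, 1.4, 2.2,
             6, 4, 6.8, 12.3, 18.9, 17.9, 12.8, 18.9, 8]


def summarize(samples):
    """Return basic statistics for a list of gap durations."""
    return {
        "count": len(samples),
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }


summary = summarize(delays_ms)
print(summary)
```

For this list the median is 6 ms and the mean is a little under 7 ms, so a typical gap here is several times longer than the 2.6 ms instance shown in the diagram.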

A1

mosi miso
0xFF 0xFF
0x75 0xFF
0x14 0xFF
0xC0 0xFF
0xA0 0xFF
0x04 0xFF
0x89 0xFF
0xFF 0xFF
0xFF 0x00
0xFF 0x00
0xFF 0xFF
0xFF 0xFE
0xFF 0x04
0xFF 0x00
0xFF 0x00
0xFF 0x00
0xFF 0xCA
0xFF 0xF1
0xFF 0xFF

all 0xFF for the rest of the txn

A2

mosi miso
0xFF 0xFF
0x75 0xFF
0x94 0xFF
0xC0 0xFF
0xB0 0xFF
0x04 0xFF
0xCD 0xFF
0xFF 0xFF
0xFF 0x00
0xFF 0x00
0xFF 0xFF
0xFF 0xFF
0xFF 0xFF
0xFF 0xFF
0xFF 0xFF
0xFE 0xFF
0x04 0xFF
0x00 0xFF
0x00 0xFF
0x00 0xFF
0xCA 0xFF
0xF1 0xFF
0xFF 0xFF
0xFF 0xE5
0xFF 0x0F
0xFF 0xFF

all 0xFF for the rest of the txn

A3

mosi miso
0xFF 0xFF
0x75 0xFF
0x15 0xFF
0x84 0xFF
0x28 0xFF
0x04 0xFF
0x3F 0xFF
0xFF 0xFF
0xFF 0x00
0xFF 0x00
0xFF 0xFF
0xFF 0xFE
0xFF 0x00
0xFF 0x00
0xFF 0x00
0xFF 0x00
0xFF 0x00
0xFF 0x00
0xFF 0xFF

all 0xFF for the rest of the txn

B1

mosi miso
0xFF 0xFF
0x75 0xFF
0x14 0xFF
0xC0 0xFF
0xA0 0xFF
0x04 0xFF
0x89 0xFF
0xFF 0xFF
0xFF 0x00
0xFF 0x00
0xFF 0xFF
0xFF 0xFE
0xFF 0x04
0xFF 0x00
0xFF 0x00
0xFF 0x00
0xFF 0xCA
0xFF 0xF1
0xFF 0xFF

all 0xFF for the rest of the txn

B2

mosi miso
0xFF 0xFF
0x75 0xFF
0x94 0xFF
0xC0 0xFF
0xB0 0xFF
0x04 0xFF
0xCD 0xFF
0xFF 0xFF
0xFF 0x00
0xFF 0x00
0xFF 0xFF
0xFF 0xFF
0xFF 0xFF
0xFF 0xFF
0xFF 0xFF
0xFE 0xFF
0x04 0xFF
0x00 0xFF
0x00 0xFF
0x00 0xFF
0xCA 0xFF
0xF1 0xFF
0xFF 0xFF
0xFF 0xE5
0xFF 0x1F
0xFF 0xFF

all 0xFF for the rest of the txn

B3

mosi miso
0xFF 0xFF
0x75 0xFF
0x15 0xFF
0x84 0xFF
0x28 0xFF
0x04 0xFF
0x3F 0xFF
0xFF 0xFF
0xFF 0x00
0xFF 0x00
0xFF 0xFF
0xFF 0xFE
0xFF 0x60
0xFF 0x7A
0xFF 0xE1
0xFF 0x61
0xFF 0x19
0xFF 0x1F
0xFF 0xFF

all 0xFF for the rest of the txn
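One way to read the MOSI columns above: the byte 0x75 equals 0x40 | 53, which is how an SD/SDIO host encodes CMD53 (IO_RW_EXTENDED) in SPI mode, followed by a 4-byte argument and a CRC byte. The sketch below decodes such a 6-byte frame, assuming the standard SDIO CMD53 argument layout (R/W flag, function number, block mode, OP code, 17-bit register address, 9-bit count). The helper name is made up for illustration, the CRC is extracted but not verified, and what the decoded addresses mean inside the MM6108 is not something this sketch can tell you.

```python
def decode_sdio_spi_command(frame):
    """Decode a 6-byte SD/SDIO SPI-mode command frame (hypothetical helper).

    frame: command byte, four argument bytes (big-endian), CRC byte.
    """
    if len(frame) != 6:
        raise ValueError("expected exactly 6 bytes")
    cmd_byte = frame[0]
    if cmd_byte & 0xC0 != 0x40:
        raise ValueError("not a host command byte (start bit 0, direction bit 1)")
    index = cmd_byte & 0x3F
    arg = int.from_bytes(frame[1:5], "big")
    decoded = {"cmd": index, "arg": arg, "crc7": frame[5] >> 1}
    if index == 53:
        # Assumed CMD53 argument layout from the SDIO specification.
        decoded.update(
            write=bool((arg >> 31) & 0x1),
            function=(arg >> 28) & 0x7,
            block_mode=bool((arg >> 27) & 0x1),
            incrementing=bool((arg >> 26) & 0x1),
            address=(arg >> 9) & 0x1FFFF,
            count=arg & 0x1FF,
        )
    return decoded


a1 = decode_sdio_spi_command(bytes([0x75, 0x14, 0xC0, 0xA0, 0x04, 0x89]))
a2 = decode_sdio_spi_command(bytes([0x75, 0x94, 0xC0, 0xB0, 0x04, 0xCD]))
a3 = decode_sdio_spi_command(bytes([0x75, 0x15, 0x84, 0x28, 0x04, 0x3F]))
for name, txn in (("A1", a1), ("A2", a2), ("A3", a3)):
    print(name, "write" if txn["write"] else "read",
          "fn", txn["function"], hex(txn["address"]), "count", txn["count"])
```

Under those assumptions, A1 and B1 are 4-byte reads from function 1 at window address 0x6050, A2 and B2 are 4-byte writes to 0x6058, and A3 and B3 are 4-byte reads at 0xC214. The 0xFE bytes in the dumps would then be data start tokens, with the four data bytes and two CRC bytes following them.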

@ajudge any chance you can help me make sense of these transactions, so I can understand what is introducing the multi-millisecond delays in communication?

Does Morse happen to have traces from a Raspberry Pi with an SPI-attached MM6108 that I can compare against, to see how the transaction patterns differ?

Lastly, is it possible to increase the density of the packets sent between the Morse chip and the host? I mean aggregating more packets rather than sending them one at a time, which seems to be what is happening.

@alexb your last slide mentions the following:

Can I confirm this is an RK3588 evaluation kit? Was it also SPI attached here?
Is the host processor equivalent to what you’re using on the OrangePi?

@Luisschubert

Generally I see batches of interactions triggered by an IRQ every 100ms with two bursts 25ms apart.

At a glance, these could be the chip reacting to beacons (transmitted by an AP every 102.4ms). You could put a device in monitor mode, capture traffic on the air with Wireshark or tcpdump, and check whether there are packets that follow the IRQ patterns you’ve identified.
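The 102.4ms figure comes from the 802.11 time unit (1 TU = 1024 µs) and a beacon interval of 100 TU. A small sketch for checking whether IRQ timestamps exported from a logic analyser line up with that period is shown here; the function names, the tolerance, and the sample timestamps are made up for illustration.

```python
TU_US = 1024  # one 802.11 time unit, in microseconds


def beacon_interval_ms(interval_tu=100):
    """Convert a beacon interval in TUs to milliseconds."""
    return interval_tu * TU_US / 1000.0


def looks_beacon_aligned(irq_times_ms, interval_tu=100, tolerance_ms=2.0):
    """True if every gap between consecutive IRQs is close to one beacon period."""
    period = beacon_interval_ms(interval_tu)
    gaps = [b - a for a, b in zip(irq_times_ms, irq_times_ms[1:])]
    return bool(gaps) and all(abs(g - period) <= tolerance_ms for g in gaps)


# Hypothetical IRQ timestamps (ms), not taken from the captures in this thread.
print(beacon_interval_ms())
print(looks_beacon_aligned([0.0, 102.1, 204.9, 307.2]))
```

A sniffer capture remains the more direct check, since it shows the beacons themselves and not just a matching period.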

How can I make better use of the SPI driver on the RK3588?

Be prepared to “not be able to”. There is significant variability in SPI host controllers and we can’t guarantee that every system will be able to achieve full throughput.

Which parameters can I change to increase the minimum transfer size to 64 bytes to always use the DMA by just adding padding?

MMC_SPI_BLOCKSIZE is an SDIO protocol definition and can’t be changed. You can try tweaking SPI_MAX_TRANSACTION_SIZE and SPI_MAX_TRANSFER_SIZE; the latter is just a generic upper bound on the total bytes read or written in a single transfer. A single transfer may consist of multiple transactions.
You could tweak SPI_COMMAND_BUF_SIZE further.

It probably won’t help in this situation, but you could also look at tweaking inter_block_delay_bytes. I have a patch I can share which turns this into a module parameter; I will dig that up shortly. The inter block delay bytes are used to size transaction buffers so that they account for MM6108 processing delays and ensure the SPI response is captured in a full duplex transaction. If you have “gaps” between frames on the wire, you may see some improvement by reducing this value.

Is it feasible to align all transfers to the 8 byte boundary to ensure no slow downs occur on the SPI bus?

Perhaps. If you examine sdio.c, we have some configuration available for .bulk_alignment via CONFIG_MORSE_SDIO_ALIGNMENT. This was to address a similar problem with some SDIO host controllers desiring a larger memory buffer alignment than our default of 2 bytes. Without increasing this field we would see some host controllers utilise a “bounce buffer”, to copy data into an aligned buffer before transfer. Without having dug too deep yet myself, you might be able to carry over some equivalent logic into spi.c, to ensure the allocated skbs are appropriately aligned.
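As a language-neutral illustration of the arithmetic involved (the driver itself is C, and this is not taken from sdio.c or spi.c), rounding a transfer length or a buffer address up to a power-of-two boundary looks like this. The 8-byte default mirrors the question above; the helper name is hypothetical.

```python
def align_up(value, alignment=8):
    """Round value up to the next multiple of a power-of-two alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (value + alignment - 1) & ~(alignment - 1)


# A 1460-byte packet plus 102 bytes of overhead, as in the bus test figures.
length = 1460 + 102
padded = align_up(length, 8)
print(length, "->", padded, "padding", padded - length)
```

The same helper applies to a buffer start address: the difference between the aligned and unaligned value is the number of bytes to skip (or reserve as headroom) so that the data begins on the boundary.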

Is it possible to increase the size of the buffers used to transmit data? Is 8192 a hard limit, or could this theoretically be larger?

You can experiment with increasing this.

What are the other types of interactions between host and morse that are shorter, aside from the IRQ setting and clearing?

General command transactions, e.g. we have a health check which runs periodically. Expect there to always be some level of short and long transactions on the bus.

Long delays after transaction transmit until CS inactive

This seems to be fairly common for other SPI host controllers. When using cs-gpios, you may also see a delay between CS assertion and transaction start. In prior investigations, these seemed to be related to various DMA configuration delays.

Any chance you can help me make sense of these transactions to understand what is introducing the multi-millisecond delays in communication?

These may be normal; we have multi-ms delays in captures from our SPI EKH01 (RPi 4b) kits as well. The frequency of these delays may be lower, however. Will continue looking.

Does Morse happen to have traces from a Raspberry Pi with an SPI-attached MM6108 that I can compare against, to see how the transaction patterns differ?

I’ve attached a Saleae logic analyser capture of an iperf run on an SPI-configured EKH01. These are capable of 20+ Mbps symmetric.
ekh01-spi-iperf.sal (19.9 MB)

See the bus-test results below.

[356206.412126] morse_spi spi0.0: clock=50 MHz, delay bytes=250, max block count=10
[356206.419682] morse_spi spi0.0: Bus IO write estimator
[356206.424733] morse_spi spi0.0:     packet size (bytes): 1460
[356206.430393] morse_spi spi0.0:     overhead (bytes):    102
[356206.435958] morse_spi spi0.0:     padding (bytes):     2
[356206.441353] morse_spi spi0.0:     batch(es):           16
[356206.446832] morse_spi spi0.0:     rounds:              10
[356206.453273] morse_spi spi0.0: morse_spi_set_func_address_base: fn[2] addr:0x80100000 access:4b
[356206.545391] morse_spi spi0.0:     Wrote 233600 bytes in 93 ms
[356206.551225] morse_spi spi0.0:     Estimated IO upper bound: 20088 kbps
[356206.557841] morse_spi spi0.0: Bus timing profiler
[356206.562624] morse_spi spi0.0:     packet size (bytes): 1460
[356206.568280] morse_spi spi0.0:     overhead (bytes):    102
[356206.573844] morse_spi spi0.0:     padding (bytes):     2
[356206.579242] morse_spi spi0.0:     rounds:              16
[356206.601191] morse_spi spi0.0:     timing (us)
[356206.605637] morse_spi spi0.0:     bus claim  :    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
[356206.617117] morse_spi spi0.0:     bus release:    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
[356206.628596] morse_spi spi0.0:     read 32    :   71   59   59   58   58   59   59   59   58   59   58   58   58   58   59   59
[356206.640076] morse_spi spi0.0:     read bulk  :  455  454  454  454  454  454  454  460  454  454  454  454  454  454  453  454
[356206.651554] morse_spi spi0.0:     write 32   :   60   59   59   59   59   59   59   59   59   59   59   59   59   59   59   60
[356206.663032] morse_spi spi0.0:     write bulk :  455  452  452  455  454  454  454  454  454  454  454  454  454  454  454  454
[356206.674569] morse_spi spi0.0: SKB allocation profiler (100 skbs w/ 1562 bytes)
[356206.681878] morse_spi spi0.0:     alloc: 36 us
[356206.686401] morse_spi spi0.0:     free:  24 us
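The figures in this log can be cross-checked with simple arithmetic: 1460 bytes × 16 batches × 10 rounds gives the 233600 bytes written, and dividing by the elapsed time gives the upper bound. The sketch below redoes that calculation from the printed values. It lands slightly above the logged 20088 kbps, presumably because the driver times the run more finely than the rounded 93 ms shown (that part is my assumption).

```python
def kbps(num_bytes, elapsed_s):
    """Throughput in kilobits per second for num_bytes moved in elapsed_s."""
    return num_bytes * 8 / elapsed_s / 1000


packet_bytes, batches, rounds = 1460, 16, 10
total_bytes = packet_bytes * batches * rounds

# Whole-run estimate from the rounded "93 ms" in the log.
run_estimate = kbps(total_bytes, 93e-3)

# Payload rate of a single bulk write, using the typical 454 us timing.
per_bulk_write = kbps(packet_bytes, 454e-6)

print(total_bytes, round(run_estimate), round(per_bulk_write))
```

The gap between the per-write payload rate (about 25.7 Mbps) and the whole-run figure (about 20.1 Mbps) is time spent outside the bulk writes themselves, which is the same kind of inter-transaction overhead discussed earlier in the thread.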

Lastly, is it possible to increase the density of the packets sent between the Morse chip and the host? I mean aggregating more packets rather than sending them one at a time, which seems to be what is happening.

There were some stats shared before, possibly by @alexb, which showed the following:

AGG A-MPDUs : 28773 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

This indicates that no aggregation is occurring! Can you configure a device in monitor mode and get a “sniffer trace” of the association sequence?
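When comparing several stats dumps, it can help to reduce that histogram line to a single number. The sketch below parses a line in the format shown above and reports the share of counts outside the first column. It assumes the first column counts transmissions that were not aggregated, which is how the line is being read here; the function names are made up.

```python
def parse_ampdu_histogram(line):
    """Split 'AGG A-MPDUs : n0 n1 ...' into a list of integer bin counts."""
    _, _, values = line.partition(":")
    return [int(v) for v in values.split()]


def aggregated_fraction(bins):
    """Share of counts that fall outside the first bin (0.0 if empty)."""
    total = sum(bins)
    if total == 0:
        return 0.0
    return sum(bins[1:]) / total


line = "AGG A-MPDUs : 28773 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0"
bins = parse_ampdu_histogram(line)
print(bins[0], aggregated_fraction(bins))
```

A value of 0.0 matches the "no aggregation" case above; captures where aggregation works should show a clearly non-zero share.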

There are other captures where this is fine, and the system is aggregating correctly.

The high AGG TX param mismatch likely indicates an issue. When this occurs - is the MCS rate fixed? If so, go back to allowing MMRC to determine the rate and see if that reduces the param mismatches.
A sniffer trace may also help here.

Hi Arien,

Thanks for getting back to us. I’ll speak on the parts I can, and have Luis fill in the rest.

Regarding the Quectel FGH100M-H on a Rockchip EVK: this is the RK3568-WF EVB (built around the RK3568), and it uses SDIO to communicate with the FGH100M-H module. This data was collected to provide a secondary benchmark, as the RF module matches our design (whereas the Halowlink1 uses the AzureWave module).

The stats that showed no A-MPDU aggregation were taken by Luis a month ago; all the data we currently take shows some form of aggregation.

The high AGG TX param mismatch occurs both with and without a fixed MCS rate. It actually seems to be lower for a fixed MCS rate. See table below from the slides. What specifically causes TX param mismatch, and how can we alleviate it? I’ve attached a sniffer capture taken while the TX param mismatch count is elevated (filename: 103025_AP_HALOW_TX_to_VSN_OPI_7mbps).

I’ve also attached sniffer captures taken during association: Halowlink1 to Halowlink1 as a benchmark, and our custom board to a Halowlink1 AP.

Is there any reasonable explanation for the MMRC table of the AP from the previous data package? Under the same channel conditions, the rate varies between MCS 5 and MCS 7, and at MCS 7 the packets are split roughly in half between short and long guard interval.

I’ve noticed that in the sniffer captures, the 802.11ah radio information header reports a data rate of 67.5mbps, and marks the frame as the last part of an A-MPDU when it is not. So I’ve generally been taking these captures with a grain of salt. Also, ACKs seem to be represented as 33-byte WLAN frames, and control frames are not present. I’m unsure of the cause.

103025_VSN_SPI_Debug_Wireshark.zip (18.4 MB)

Hi @alexb,

Thanks for the Wireshark captures. I was interested in seeing the Block Acknowledgement negotiation, but forgot to mention that this information is encrypted. If possible, can you recapture with the HaLow network configured for No Encryption (open)?

I’ve had a dig through the stats provided in your zip. As you mention, the standout stats are the high AGG TX param mismatches, a significant number of retries, and an increase in ACK timeouts. My suspicions are that the high number of retries and increase in ACK timeouts are more likely to be caused by decode errors.
We would typically see high AGG TX param mismatches when the rates selected for aggregation no longer match, or rates are otherwise out of sync. Still trying to determine what might cause TX param mismatch when the MCS is fixed though, I’m hoping BA session information in the capture can tell me more.

For the test setup you’ve shown me, the mmrc table is a little concerning. It’s showing only about a 40% success rate at 8MHz MCS 7 - perhaps there is some narrow host board noise coupling into the radio or otherwise desensing?
Otherwise, I wonder if rate control is somehow just struggling due to the SPI bus issues.

I would be interested to see what happens when you increase the attenuation - I saw you had a variable attenuator in the setup. Are you able to build a Rate vs Attenuation chart, starting the VA at ~30dB and increasing until the station disassociates?

If possible, please share a capture from an open network so I can inspect the BA sessions.

Hi @ajudge ,

I will work on getting the block ack negotiation Wireshark captures.

In addition, is there any way to clear the mmrc table other than reloading the morse kernel module? And on the Halowlink1, how can I set the module parameters (/sys/module/morse/parameters)?

I was also able to recreate the zero aggregation that you had mentioned previously. With a firmware build using the 15.3 morse driver and the quectel module, if I fixed the MCS rate at 7, there was no aggregation. I would request 20mbps and only receive 5mbps. I’ve included the MMRC table and stats for a 8MHz BW and 4MHz BW case. With a 4MHz BW, I would receive 3mbps. Setting a fixed rate of MCS 6 and below resolved this issue. The files here are prefixed with “Master”.

We noticed this because Luis had optimized the SPI driver to achieve higher throughput yet we saw even stranger behavior running at MCS 7 with that branch. When setting a fixed MCS7 rate, and only requesting 1mbps throughput, we would have no aggregation but also 6 retries before success. I’ve included MMRC table, stats and o-scope screenshots of this behavior. This issue also went away with any fixed rate of MCS6 and below.

Hopefully the block ack negotiation captures will help, but as I understand it, the 8.4.1.14 Block Ack Parameter Set field in ADDBA frames is not MCS dependent?

I will look into desense as well. For the rate vs attenuation chart, is there a specific MCS rate/BW that you would want it set to, or is auto fine?

We noticed this on boot as well:

[ 22.878193] morse_spi spi0.0: AMPDU Minimum start spacing: 7
[ 22.878210] morse_spi spi0.0: Morse Minimum Start Spacing offset: 2
[ 22.878227] morse_spi spi0.0: Beamformee STS Capability: 0
[ 22.878243] morse_spi spi0.0: Number of Sounding Dimensions: 0
[ 22.878260] morse_spi spi0.0: Maximum AMPDU Length Exponent: 3

Are there any irregularities present here? AMPDU minimum start spacing seems to be on the longer side of things, but the length exponent is correct.

103125 Debug.zip (383.0 KB)

On HL1 (and all our OpenWrt based builds) you can set most module parameters via uci.

e.g.

uci set wireless.radio0.somemodparam='1'
uci commit
reload_config

To confirm that the module param is being successfully applied, you can check sysfs or /etc/modules.d/morse. You can also have a look at the MM_MOD_* vars in /lib/netifd/wireless/morse.sh to see exactly what’s supported.
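For the sysfs check, a small script can dump every parameter the loaded module exposes, which makes it easy to diff before and after a uci change. This is a sketch: it relies only on the standard /sys/module/<name>/parameters layout, the function name is made up, and some parameters may not be readable without root.

```python
from pathlib import Path


def read_module_params(module, sysfs_root="/sys/module"):
    """Return {parameter: value} for a loaded kernel module, or {} if absent."""
    params_dir = Path(sysfs_root) / module / "parameters"
    result = {}
    if not params_dir.is_dir():
        return result
    for entry in sorted(params_dir.iterdir()):
        try:
            result[entry.name] = entry.read_text().strip()
        except OSError:
            # Parameter exists but is not readable by this user.
            result[entry.name] = None
    return result


if __name__ == "__main__":
    for name, value in read_module_params("morse").items():
        print(f"{name} = {value}")
```

If the module is not loaded, the function returns an empty dictionary and nothing is printed.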

With some further experimentation, it seems that the AGG going to zero on a fixed MCS7 was not caused by the rate, but by the process of reloading the module. Performing sudo rmmod morse, followed by modprobe morse and passing the module parameters (with or without rmmod dot11ah!) would set the correct MCS rate (as confirmed by the MMRC table) but would sometimes turn off aggregation. This is hard to reproduce consistently.

During a period where I could set a fixed MCS 7 rate (without turning off aggregation), I collected the block ack negotiation logs for our setup with the improved SPI driver. I captured both with a fixed rate and with default rate control, and included MMRC, stats, and dmesg logs for both instances.

Fixed rate was set by:

unprovisioned-sensornode:~$ sudo rmmod morse
unprovisioned-sensornode:~$ sudo rmmod dot11ah
unprovisioned-sensornode:~$ sudo modprobe dot11ah
unprovisioned-sensornode:~$ sudo modprobe morse country=US spi_clock_speed=50000000 enable_ps=0 debug_mask=14 spi_enable_bus_quirks=1 enable_fixed_rate=Y fixed_mcs=7 fixed_ss=1 fixed_bw=3

110325 Block Ack Neg Capture.zip (733.0 KB)

Iperf test requested 20mbps UDP traffic.

Please look over these logs, as we are still unable to achieve more than 18mbps throughput at MCS7.

Aggregation still seems to be on the lower end.

I will keep trying to get the same Wireshark capture while aggregation is unexpectedly turned off.

Solved the previous issue with further optimization of the SPI driver. Still have some lingering questions about MMRC performance that I will post when I have more data.

Interested in knowing what your solutions were!

Correct, I’m looking for any other reason the AGG TX param mismatch may be so elevated. The captures you’ve provided look fine to me at the moment though.

Auto please. We want to see the system making the right decisions in the waterfall curve at the upper end of attenuation, in as clean an environment as you can control.