IntelliBron offers retailers and small businesses a threat-detection option built on small form factor computers. We use Linux as the operating system. During the integration test for the production machine, one of my colleagues installed the Zerotrust component into the OS. The Zerotrust component requires a specific kernel version, so it asked to update the kernel automatically.

After the update and reboot, the wireless access point functionality failed. When I checked, hostapd reported an error saying it could not find the wlan interface, and the iw list command showed nothing. dmesg showed the following messages:

$ sudo dmesg | grep -i mt7921e
[    5.908430] mt7921e 0000:04:00.0: enabling device (0000 -> 0002)
[    5.908744] mt7921e 0000:04:00.0: ASIC revision: 79610010
[    6.112649] mt7921e: probe of 0000:04:00.0 failed with error -110

In other words, the kernel successfully enabled the wireless chipset, but the driver probe then failed with error -110 (ETIMEDOUT on Linux, i.e. a timeout).
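If you want to catch this condition in an automated health check, the dmesg output can be scanned for the failure line. This is only a minimal sketch: the helper name find_probe_failures and the small errno table are my own illustration, and it assumes the kernel keeps the message format shown above.

```python
import re

# Subset of Linux errno values, hard-coded so the result does not depend
# on the platform running this script. 110 is ETIMEDOUT on Linux.
LINUX_ERRNO_NAMES = {5: "EIO", 16: "EBUSY", 19: "ENODEV", 110: "ETIMEDOUT"}

# Matches lines such as:
#   mt7921e: probe of 0000:04:00.0 failed with error -110
PROBE_FAILURE = re.compile(
    r"(?P<driver>[\w-]+): probe of (?P<device>\S+) "
    r"failed with error -(?P<errno>\d+)"
)


def find_probe_failures(dmesg_text, driver="mt7921e"):
    """Return (device, errno, errno_name) for each failed probe of `driver`."""
    failures = []
    for line in dmesg_text.splitlines():
        match = PROBE_FAILURE.search(line)
        if match and match.group("driver") == driver:
            code = int(match.group("errno"))
            name = LINUX_ERRNO_NAMES.get(code, "unknown")
            failures.append((match.group("device"), code, name))
    return failures


sample = """\
[    5.908430] mt7921e 0000:04:00.0: enabling device (0000 -> 0002)
[    5.908744] mt7921e 0000:04:00.0: ASIC revision: 79610010
[    6.112649] mt7921e: probe of 0000:04:00.0 failed with error -110
"""

print(find_probe_failures(sample))
```

In practice you would feed it the real output of sudo dmesg instead of the sample string.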

This issue appears to be related to a fix submitted by Quan Zhou from MediaTek and committed by Jakub Kicinski to the netdev/net repository, which then landed in the mainline Linux kernel in v6.5-rc2. The commit message of the patch reads:

Fix init command fail with enabled device

For some cases as below, we may encounter the unpreditable chip stats
in driver probe()
* The system reboot flow do not work properly, such as kernel oops while
rebooting, and then the driver do not go back to default status at
this moment.
* Similar to the flow above. If the device was enabled in BIOS or UEFI,
the system may switch to Linux without driver fully shutdown.

To avoid the problem, force push the device back to default in probe()
* mt7921e_mcu_fw_pmctrl() : return control privilege to chip side.
* mt7921_wfsys_reset() : cleanup chip config before resource init.

The patch implements exactly that: it returns control privilege to the chip side, and then cleans up the chip config before resource initialisation.

It removes mt7921_wfsys_reset() from dma and mt76_get_field() from mcu, and then adds the following code to drivers/net/wireless/mediatek/mt76/mt7921/pci.c:

...
...
+	ret = mt7921e_mcu_fw_pmctrl(dev);
+	if (ret)
+		goto err_free_dev;
+
...
...
+	ret = mt7921_wfsys_reset(dev);
+	if (ret)
+		goto err_free_dev;
+
...
...

drivers/net/wireless/mediatek/mt76/mt7921/pci.c

Since the sensor machine currently consists of several components, I am worried that upgrading to kernel 6.5 would break some of them, or undo some of our tuning and optimisation. So I prefer to stay with version 15.5 for now.
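When staying on an older kernel, it helps to know which machines run a version that predates the fix. The sketch below is a rough heuristic, not a definitive test: the function names are hypothetical, it only compares the major and minor numbers against 6.5, and it cannot tell whether a distribution or stable kernel carries a backport of the patch.

```python
import platform
import re

# The fix landed in mainline during the v6.5 cycle (v6.5-rc2).
# Assumption: anything reporting 6.5 or later is treated as fixed, so an
# early 6.5 release candidate would be misclassified.
FIX_VERSION = (6, 5)


def kernel_version_tuple(release):
    """Extract (major, minor) from a release string like '6.2.0-39-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def likely_has_fix(release):
    """Return True if the kernel version is at or above the fixed version."""
    return kernel_version_tuple(release) >= FIX_VERSION


if __name__ == "__main__":
    release = platform.release()
    status = "likely has the fix" if likely_has_fix(release) else "predates the fix"
    print(f"{release}: {status}")
```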

From the patch, I initially assumed that powering down the machine (since rebooting just doesn't work and causes oops) would get the wireless chipset recognised by the kernel again. Unfortunately, this isn't the case. It seems that even when the machine is switched off, the hardware is already enabled by the BIOS. So it's not considered a 'reset' of the chipset. We have to manually force the chipset to reset completely.

As with most hardware-related problems, we need to remove power from the machine completely to force a hardware reset of the microcontroller components. In our case, that was the solution. After unplugging the power, waiting a while, plugging it back in and booting the OS, the problem was solved. The OS can now probe and initialise the wireless chipset.

$ sudo iw list
Wiphy phy0
        wiphy index: 0
        max # scan SSIDs: 4
        max scan IEs length: 482 bytes
        max # sched scan SSIDs: 10
        max # match sets: 16
        Retry short limit: 7
        Retry long limit: 4
        Coverage class: 0 (up to 0m)
        Device supports AP-side u-APSD.
        Device supports T-DLS.
        Supported Ciphers:
        ...
        ...
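The same check can be done without iw by looking at sysfs, where the kernel lists registered wireless PHYs under /sys/class/ieee80211. The following is a small sketch under that assumption; list_wiphys is a hypothetical helper name, and an empty result corresponds to iw list printing nothing.

```python
from pathlib import Path


def list_wiphys(sysfs_dir="/sys/class/ieee80211"):
    """Return the names of registered wireless PHYs, e.g. ['phy0'].

    Returns an empty list if the directory is missing or empty, which is
    what we would expect when the driver probe has failed.
    """
    root = Path(sysfs_dir)
    if not root.is_dir():
        return []
    return sorted(entry.name for entry in root.iterdir())


if __name__ == "__main__":
    phys = list_wiphys()
    if phys:
        print("wireless PHYs found:", ", ".join(phys))
    else:
        print("no wireless PHY registered; a full power cycle may be needed")
```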

If you have a similar problem with a MediaTek wireless chipset and need to stay on a Linux kernel older than v6.5-rc2, you can try the above solution. Or, if the hardware has already been delivered to the customer, the helpdesk can suggest the usual solution: shut down, unplug and re-plug 😊
