Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

minimqtt blocking on .loop() method #148

Closed
lee-matthews opened this issue Jan 27, 2023 · 20 comments
Closed

minimqtt blocking on .loop() method #148

lee-matthews opened this issue Jan 27, 2023 · 20 comments
Labels
bug Something isn't working

Comments

@lee-matthews
Copy link

CircuitPython version

W5500-EVB-Pico board 
adafruit-circuitpython-wiznet_w5500_evb_pico-en_GB-8.0.0-beta.6.uf2
adafruit-circuitpython-bundle-8.x-mpy-20230126

Code/REPL

import board
import busio
from adafruit_wiznet5k.adafruit_wiznet5k import WIZNET5K
import adafruit_wiznet5k.adafruit_wiznet5k_socket as socket
import adafruit_minimqtt.adafruit_minimqtt as MQTT
import digitalio
import time

##SPI0
SPI0_SCK = board.GP18
SPI0_TX = board.GP19
SPI0_RX = board.GP16
SPI0_CSn = board.GP17

##reset
W5x00_RSTn = board.GP20

def on_connect(client, userdata, flags, rc):
    print("MQTT connected to broker")

def main():

    # Setup your network configuration below
    # random MAC, later should change this value on your vendor ID
    MY_MAC = (0x00, 0x01, 0x02, 0x03, 0x04, 0x05)
    # Setup your network configuration below
    IP_ADDRESS = (192, 168, 1, 250)
    SUBNET_MASK = (255, 255, 255, 0)
    GATEWAY_ADDRESS = (192, 168, 1, 1)
    DNS_SERVER = (192, 168, 1, 1)

    ethernetRst = digitalio.DigitalInOut(W5x00_RSTn)
    ethernetRst.direction = digitalio.Direction.OUTPUT

    # For Adafruit Ethernet FeatherWing
    cs = digitalio.DigitalInOut(SPI0_CSn)
    # For Particle Ethernet FeatherWing
    # cs = digitalio.DigitalInOut(board.D5)
    spi_bus = busio.SPI(SPI0_SCK, MOSI=SPI0_TX, MISO=SPI0_RX)

    # Reset W5500 first
    ethernetRst.value = False
    time.sleep(1)
    ethernetRst.value = True

    # Initialize ethernet interface with DHCP
    # eth = WIZNET5K(spi_bus, cs)
    # Initialize ethernet interface without DHCP
    # *****  For DHCP to work correctly. Stop the code properly - STOP button, then click on play
    # otherwise WIZNET5K function call hangs and no IP address is recovered
    eth = WIZNET5K(spi_bus, cs, is_dhcp=False)
    eth.ifconfig = (IP_ADDRESS, SUBNET_MASK, GATEWAY_ADDRESS, DNS_SERVER)

    print("Chip Version:", eth.chip)
    print("MAC Address:", [hex(i) for i in eth.mac_address])
    print("My IP address is:", eth.pretty_ip(eth.ip_address))

    socket.set_interface(eth)

    MQTT.set_socket(socket, eth)
    
    g_mqtt_client = MQTT.MQTT(
        broker="broker ip",
        port=1883,
        username="user",
        password="password"
    )

    g_mqtt_client.on_connect = on_connect
    print("connecting")
    
    g_mqtt_client.connect()
    
    g_mqtt_client.subscribe("HELLO/WORLD")
    
    print("connected")
    while True:
        g_mqtt_client.publish("DETECTOR/ZONE1/ALIVE", "alive")
        print("loop")
        g_mqtt_client.loop()

if __name__ == "__main__":
    main()

Behavior

I'm having some problems with MQTT with circuitpython.

I'm using circuitpython on a W5500-EVB-Pico board to read data from GPIO inputs and then publish that data using MQTT.

I notice that my call to g_mqtt_client.loop() is blocking.

If however I publish a message on the topic "HELLO/WORLD", sometimes the loop() method unblocks and continues until it reaches the loop() method again. If I set the timeout parameter on the loop() method to 1, this doesn't change anything.

Description

No response

Additional information

No response

@lee-matthews lee-matthews added the bug Something isn't working label Jan 27, 2023
@dhalbert
Copy link
Contributor

This may be due to library issues in the Wiznet or the MQTT libraries. I'm going to move this to the MQTT library for now. Do you see the same thing if you use WiFi (on another board, obviously)?

@dhalbert dhalbert transferred this issue from adafruit/circuitpython Jan 29, 2023
@LiveSparks
Copy link

Observed the same issue on my board. Additionally, if no messages are published for a while the following error is received.

Traceback (most recent call last):
  File "main.py", line 70, in <module>
  File "adafruit_minimqtt/adafruit_minimqtt.py", line 835, in loop
  File "adafruit_minimqtt/adafruit_minimqtt.py", line 862, in _wait_for_msg
  File "adafruit_minimqtt/adafruit_minimqtt.py", line 974, in _sock_exact_recv
MMQTTException: Unable to receive 1 bytes within 60 seconds.

I found another reference to a similar issue #70
In that thread the issue was fixed but it was only am implementation for the ESP32:

if not self._backwards_compatible_sock:
    # CPython/Socketpool Impl.
    # .... code
else:  # ESP32SPI Impl.
    # This will timeout with socket timeout (not keepalive timeout)
    # ... code

I tried to force the library to use the implementation under else with the following code in my main file:

setattr(sock, "recv_into", False)
sock = MQTT.set_socket(sock, eth)

But setting the attribute to either True or False doesn't seem to affect anything. The if block is always executed.

@chrisgilldc
Copy link

Seeing this same issue, my projects gets to the loop and then stalls. Setting a timeout doesn't help.
Info: Adafruit CircuitPython 7.1.1 on 2022-01-14; Adafruit Metro M4 Airlift Lite with samd51j19
This is a pretty frustrating showstopper.

@claycooper
Copy link

I've come across what I believe is the same issue on PyPortal but only get a MMQTTException error if I specify a timeout. With the default timeout the loop blocks.

@d-Rickyy-b
Copy link

Same issue here on the Metro M4 airlift. loop is blocking and hence no further code is being executed for ~11 seconds while the M4 constantly receives new MQTT messages. Is there some workaround for this issue or is someone actively investigating this?

@astrobokonon
Copy link

I found myself confused about all of this too, and at least to me I think the various timeouts are confusing at best; for both types of sockets in _sock_exact_recv the read_timeout parameter is explicitly set to the keep_alive parameter and that's used to trigger an MMQTTException down in there, e.g.

if not self._backwards_compatible_sock:
# CPython/Socketpool Impl.
rc = bytearray(bufsize)
mv = memoryview(rc)
recv_len = self._sock.recv_into(rc, bufsize)
to_read = bufsize - recv_len
if to_read < 0:
raise MMQTTException(f"negative number of bytes to read: {to_read}")
read_timeout = self.keep_alive

When I set keep_alive to a shorter number, and make sure to add a timeout= to the loop call, it at least is a more workable experience assuming that I catch the triggered MMQTTException appropriately for the times when no messages are received.

Some overall clarity on whether any of this is the intended functionality would help!

@thenoname
Copy link

Also seeing this with the PyPortal Titano

@cskwrd
Copy link

cskwrd commented May 9, 2023

I believe I found the underlying problem for ESP32SPI users (I'm using a PyPortal). It is documented in #165.

My ESP32SPI workaround is documented there as well.

@EisFrei
Copy link

EisFrei commented Jul 24, 2023

I have a similar issue on "Adafruit CircuitPython 8.2.0 on 2023-07-05; Raspberry Pi Pico W with rp2040".

I was able to change the duration of the time blocked by changing socket_timeout.

As socket_timeout has a minimum/default value of 1 second it really messes up the LED fade timings in my project.

This will show "Looped" every 5 seconds.

mqtt_client = MQTT.MQTT(
    broker="broker.local",
    port=1883,
    socket_pool=pool,
    socket_timeout=5,
)

while True:
    mqtt_client.loop()
    print("Looped")

@adsherman09
Copy link

adsherman09 commented Aug 25, 2023

I am running into a similar issue on the Feather M4 Express on 8.2.4 -- having inspected the code I am not 100% sure how this is supposed to engage with the loop(timeout=0) timeout field.

I think that the problem is that the timeout functionality of MQTT.loop(timeout=0) is being caught behind the MQTT._sock_exact_recv() method that doesn't seem to respect this timeout. When calling something like while True: print(myMQTT.loop(timeout=3)), it blocks until the keep_alive timeout is engaged (as per behavior around line 1120), or a message is recieved. This method should return promptly after the timeout is surpassed, but there seems to be no mechanism to pre-emptively end the MQTT._sock_exact_recv() method. The read_timeout in _sock_exact_recv() needs to take the minimum between the keep_alive and "loop_timeout" variables.

I can't actually test a fix because I don't know how to compile the python code into the .pym files, but I would expect that the MQTT._wait_for_msg() and MQTT._sock_exact_recv() methods should be passing through a timeout argument, and using the minimum timeout available for determining when to return.

The workaround that we have is to shorten keep_alive, which isn't really what it is for, or to be subscribing to and sending a clock packet which is also not optimal.

@FoamyGuy
Copy link
Contributor

@adsherman09 it should be possible to test / develop without compiling code to .mpy format.

plain text .py files should also work and execute just fine on the microcontroller. The mpy compilation step is used just to save extra space, but it's not a requirement for code to execute, the plain text python code can run as well.

Make sure that you are using the latest ESP32SPI library (if you are using an ESP32SPI "Airlift" based device), as the latest release of that does contain changes that could be relevant to the blocking loop issue. I was thinking that when I last tested this with the latest version of that library that there was no more blocking issue occurring. But I tested many variations of different things around that time, perhaps I'm mis-remembering the outcome.

If you've tried with latest versions of the libraries and are up for poking around the code and potentially submitting a PR with the fix that you've outlined, that would be much appreciated. Submitting the PR also does not require compiling to .mpy format, only plain text python code is checked in to the repo. We have an automated actions task that does the mpy compilation whenever a new release is made, so it's not something that is typically done manually.

@adsherman09
Copy link

adsherman09 commented Aug 25, 2023

@FoamyGuy thanks, we tested it. Realized after the fact that the mpy files that we were digging in were in the WizNet5k library anyway, and that this was in py format 🤦. I added the timeout argument to the places where I thought it would matter, and while it functioned as expected, we were surprised to find that publishing still took something like 2 seconds even with timeout=0 (aka single pass). The whole time the device was blocked so we couldn't even continue to take readings alongside to send in cached format.

We're using a Feather M4 + Ethernet Featherwing -- I am honestly not sure if that is an ESP32SPI "Airlift" device, but I believe it is not.

I don't want to submit a PR because I am not confident that my changes wont break other things!

Update: I found the source of our additional hidden delays, and in order to utilize this function correctly the socket_timeout parameter of the MQTT object needs to be set very small. 0 doesn't seem to work correctly, but 0.001 works close enough to as expected. So even with loop(timeout=0), it will have delays that are at least 2x the socket_timeout due to the adafruit_wiznet5k_socket's socket.recv() call. Calling socket.setdefaulttimeout(0) had no apparent effect as it seems to be overwritten by the default value of 1 in adafruit_minimqtt.MQTT().

I have attached my adjusted code in lieu of a PR, you can find my adjustments in line 1007, the method definitions of _wait_for_msg and _sock_exact_recv, and throughout the text of _sock_exact_recv between lines 1096 and 1153.

adafruit_minimqtt.txt

@canuckdev
Copy link

I notice the adafruit-circuitpython-bundle-py-20230307 version of the minimqtt library does not have this problem with the loop method.
adafruit_minimqtt.py.txt

@adsherman09
Copy link

That version of the file definitely has some more timeout propagation, it passes the loop timeout to _wait_for_msg and then calls self._sock.settimeout(timeout), but not until after the first packet is received. If it is kosher to call self._sock.settimeout(timeout) it might be prudent to put it in the MQTT.loop() method itself. Also it doesn't seem to actually function correctly with a true timeout=0.

@Kheirlb
Copy link

Kheirlb commented Jan 4, 2024

Just hit this myself

@buelowp
Copy link

buelowp commented Jan 5, 2024

Not to pile on, but seeing this exactly on a Feather 2040 running. Without loop, as long as I send something more often than ever 60 seconds, it works, which is fine, I am sending some sensor data more often than that. But I can't receive anything right now which I would like to do.

None of the timeout= and socket_timeout= attempts listed above helped.

Auto-reload is on. Simply save files over USB to run them or enter REPL to disable.
code.py output:
0;🐍code.py | 8.2.9

bundle 20231220

@Kheirlb
Copy link

Kheirlb commented Jan 13, 2024

Looks like there is a PR fix by @rjauquet! Who can review and approve?

@pibees
Copy link

pibees commented Jan 14, 2024

wow, cant wait, been fighting this for some time.

@pibees
Copy link

pibees commented Jan 15, 2024

Looks like there is a PR fix by @rjauquet! Who can review and approve?

where can you see this?

@buelowp
Copy link

buelowp commented Jan 15, 2024

You can find it at PR184. Waiting on someone with merge rights to merge it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.