Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

InvalidStateError in RoborockFuture using roborock client #222

Open
allenporter opened this issue Aug 20, 2024 · 3 comments
Open

InvalidStateError in RoborockFuture using roborock client #222

allenporter opened this issue Aug 20, 2024 · 3 comments

Comments

@allenporter
Copy link

I don't know the exact steps to reproduce this, but noticed this while changing a device settings in HomeAssistant. It seems like this is an internal consistency issue and not something that that the caller should be able to introduce, but i'm not positive.

2024-08-20 03:49:28.365 ERROR (MainThread) [homeassistant] Error doing job: Exception in callback RoborockFuture._resolve((None, None)) (None):   File "/usr/local/lib/python3.12/asyncio/base_events.py", line 639, in run_forever
    self._run_once()
  File "/usr/local/lib/python3.12/asyncio/base_events.py", line 1977, in _run_once
    handle._run()
  File "/usr/local/lib/python3.12/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "/usr/local/lib/python3.12/asyncio/selector_events.py", line 970, in _read_ready
    self._read_ready_cb()
  File "/usr/local/lib/python3.12/asyncio/selector_events.py", line 1027, in _read_ready__data_received
    self._protocol.data_received(data)
  File "/home/vscode/.local/lib/python3.12/site-packages/roborock/local_api.py", line 39, in data_received
    self.on_message_received(parser_msg)
  File "/home/vscode/.local/lib/python3.12/site-packages/roborock/version_1_apis/roborock_client_v1.py", line 444, in on_message_received
    queue.resolve((data.payload, None))
  File "/home/vscode/.local/lib/python3.12/site-packages/roborock/roborock_future.py", line 22, in resolve
    self.loop.call_soon_threadsafe(self._resolve, item)
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "/home/vscode/.local/lib/python3.12/site-packages/roborock/roborock_future.py", line 19, in _resolve
    self.fut.set_result(item)
asyncio.exceptions.InvalidStateError: invalid state

I believe this happened when i adjusted the same property two times in a row in a short succession (with a short pause between)

@Lash-L
Copy link
Collaborator

Lash-L commented Aug 20, 2024

I have also seen this issue but have not had a good way to recreate or debug. Would be open to any suggestions or even some debug improvements to figure out the cause

@allenporter
Copy link
Author

allenporter commented Sep 29, 2024

My honest take is this code is quite complex and needs to be simplified. I would suggest:

  • making breaking changes to drop all the sync APIs and focus on simpler async APIs.
  • reduce the class hierarhcy so there are not multiple base classes involved, and use composition rather than inheritance.
  • adding logging for each outgoing request seq id and received resopnse seq id could help show the issue happening in more detail.
  • add guards to not send a message with a request id that has already been allocated. One example of a potential bug is that the choosing of a request id can pick an id that already exists since its just picking random numbers.
  • Add an explicit guard here when the future is already completed.
  • Add a guard in _async_response so that it doesn't overwrite an existing future if the request id is already in use
  • There are cases where a response is received and it checks against the queue and pulls out the message, but then if it doesn't already exist it continues anyway. Why would this ever be allowed to happen? seems like a bug or malformed response so it also needs logging rather than being ignored since it could hide concurrency bugs
  • Remove the multiple levels of futures happening between asyncio.ensure_future then a sync method _async_response which calls an async method _wait_response which directly awaits on a future. I don't think all these levels of indirection are needed
  • Don't pass around tuples of (response, exception). Just use the future built in exceptions

@Lash-L
Copy link
Collaborator

Lash-L commented Oct 7, 2024

My honest take is this code is quite complex and needs to be simplified. I would suggest:

  • making breaking changes to drop all the sync APIs and focus on simpler async APIs.
  • reduce the class hierarhcy so there are not multiple base classes involved, and use composition rather than inheritance.
  • adding logging for each outgoing request seq id and received resopnse seq id could help show the issue happening in more detail.
  • add guards to not send a message with a request id that has already been allocated. One example of a potential bug is that the choosing of a request id can pick an id that already exists since its just picking random numbers.
  • Add an explicit guard here when the future is already completed.
  • Add a guard in _async_response so that it doesn't overwrite an existing future if the request id is already in use
  • There are cases where a response is received and it checks against the queue and pulls out the message, but then if it doesn't already exist it continues anyway. Why would this ever be allowed to happen? seems like a bug or malformed response so it also needs logging rather than being ignored since it could hide concurrency bugs
  • Remove the multiple levels of futures happening between asyncio.ensure_future then a sync method _async_response which calls an async method _wait_response which directly awaits on a future. I don't think all these levels of indirection are needed
  • Don't pass around tuples of (response, exception). Just use the future built in exceptions

That's probably all fair. It's just kind of been the nature of constantly reverse engineering and learning about the underlying api and most improvements being iterative opposed to a clean rewrite.

For instance, at first we didn't know how to do local api, then we figured that out. The same thing with the drastically different A01 api vs the V1 api. Things like that where slowly changes were made and the code got more and more complex.

@Lash-L Lash-L mentioned this issue Oct 28, 2024
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants