Unconfigured lifecycle state management #47

Closed
jginesclavero opened this issue Sep 29, 2020 · 5 comments

Comments

@jginesclavero

Hi again @norro!

Yesterday I had a meeting with @chcorbato, and we talked about the case where a lifecycle node transitions to ErrorProcessing.
According to the documentation and the lifecycle node diagrams, a node that hits an error transitions to ErrorProcessing. Depending on the result of that processing, it then goes to either the Finalized or the Unconfigured state. Do you think system_modes should manage the Unconfigured state of lifecycle nodes? Doing so would cover both this error case and start-up, when the nodes are still in the Unconfigured state.
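As a minimal sketch of the branching described above, the outcome of ErrorProcessing can be written as a small function. The `State` enum and both function names are hypothetical illustrations, not part of system_modes or of any ROS 2 client library.

```python
from enum import Enum


class State(Enum):
    """Primary lifecycle states, named after the lifecycle node diagram."""
    UNCONFIGURED = "unconfigured"
    INACTIVE = "inactive"
    ACTIVE = "active"
    FINALIZED = "finalized"


def after_error_processing(recovered: bool) -> State:
    """Return the primary state reached once ErrorProcessing ends.

    A successful error handler leaves the node Unconfigured,
    a failed one leaves it Finalized.
    """
    return State.UNCONFIGURED if recovered else State.FINALIZED


def needs_configuration(state: State) -> bool:
    """True for nodes that still have to be configured.

    This covers both start-up and a node that just recovered from an error.
    """
    return state is State.UNCONFIGURED
```

The same `needs_configuration` check applies to a freshly started node and to one that has just left ErrorProcessing, which is why both situations could be handled by one piece of management logic.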

Thank you!

@jginesclavero changed the title from "Unconfigure lifecycle state management" to "Unconfigured lifecycle state management" Sep 29, 2020
@chcorbato

This is in the context of our example case, the laser_driver error. We want to elaborate on the layered approach we discussed in the last MROS meeting. This is how I interpret our desired design (please comment if something is incorrect or unclear):

  1. First, the laser_driver's own error-handling code tries to recover from the error in the ErrorProcessing transition state.

(from here on it is a related but different issue, #48)

  2. If that does not succeed (I guess that means the node does not transition to Active), the ModeManager tries to recover from the error using the feature/rules. For this, @jginesclavero is adding a rule to the SystemModes file of our system.
  3. If there is no rule, or there is one but applying it does not reach the alternative MODE(s) of the laser_driver either, the ModeManager reports to the Metacontroller that the corresponding (sub)system MODE(s) are not reachable.
    (see issue for the continuation of the handling of errors at the higher layers)
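The layered design listed above could be sketched as a simple escalation chain. Everything here is an assumption made for illustration: `handle_error`, its two callables, and the returned labels are hypothetical and do not correspond to the system_modes API.

```python
from typing import Callable, Optional


def handle_error(node_recovers: Callable[[], bool],
                 rule: Optional[Callable[[], bool]]) -> str:
    """Try each layer in order and report which one resolved the error.

    node_recovers: the node's own handler run during ErrorProcessing.
    rule: an optional mode manager rule; returns True if an alternative
          mode of the node was reached.
    """
    # Layer 1: the node's own error handling.
    if node_recovers():
        return "recovered_by_node"
    # Layer 2: a mode manager rule, if one is defined for this node.
    if rule is not None and rule():
        return "recovered_by_rule"
    # Layer 3: nothing worked, so the higher layer has to take over.
    return "escalate_to_metacontrol"
```

For example, `handle_error(lambda: False, None)` models a laser_driver that cannot recover by itself and has no rule defined, so the error is left to the Metacontroller.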

@norro
Collaborator

norro commented Sep 30, 2020

I agree with 1. and 2.
However, the mode manager will not actively report that a certain mode is not available. With #43, however, it will be possible for the metacontrol to find out which modes are available.

This is also a question of timing: any state or mode transition takes some time (milliseconds to seconds, maybe), even in the normal, non-failure case. So it is not entirely clear when someone (the mode manager? metacontrol?) should decide that a transition or rule didn't work out and that other actions have to be taken. I think this kind of decision, i.e. how long to wait for a node to recover or a rule to take effect, is best placed in the metacontrol, since it is probably task-specific.
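To make the timing question concrete, here is a minimal sketch of a caller that waits for a target state and gives up after a deadline it chooses itself. The function `wait_for_state` and its parameters are hypothetical; `get_state` stands in for whatever mechanism the caller uses to query a node.

```python
import time
from typing import Callable


def wait_for_state(get_state: Callable[[], str], target: str,
                   timeout_s: float, poll_s: float = 0.05) -> bool:
    """Poll get_state() until it returns target or timeout_s has elapsed.

    Returns True if the target state was observed in time, False otherwise.
    The timeout is supplied by the caller, so a task-specific component
    such as the metacontrol can pick it.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_state() == target:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

Because the deadline is an argument rather than a constant, the same helper can be used with a short timeout for a single node and a longer one for a whole subsystem.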

@chcorbato

> This is also a question of timing: any state or mode transition takes some time (milliseconds to seconds, maybe), even in the normal, non-failure case. So it is not entirely clear when someone (the mode manager? metacontrol?) should decide that a transition or rule didn't work out and that other actions have to be taken. I think this kind of decision, i.e. how long to wait for a node to recover or a rule to take effect, is best placed in the metacontrol, since it is probably task-specific.

Very good point indeed, so far we are not accounting for timing issues.
How do we include timing constraints for node management? These could be considered metacontrol requirements for the robotic application:

@rsanz

rsanz commented Oct 6, 2020

This implies the incorporation of some timestamping and temporal [interval] reasoning. We can incorporate some concepts from e.g. UML2 or UML MARTE.
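As a rough illustration of what timestamping plus interval reasoning could look like in code, the sketch below stamps an action with start and end times and compares intervals. The class `StampedInterval` and its methods are hypothetical and are not taken from UML2 or UML MARTE.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StampedInterval:
    """A closed time interval [start_s, end_s] in seconds."""
    start_s: float
    end_s: float

    def __post_init__(self) -> None:
        if self.end_s < self.start_s:
            raise ValueError("interval ends before it starts")

    @property
    def duration_s(self) -> float:
        return self.end_s - self.start_s

    def before(self, other: "StampedInterval") -> bool:
        """True if this interval ends strictly before the other starts."""
        return self.end_s < other.start_s

    def overlaps(self, other: "StampedInterval") -> bool:
        """True if the two intervals share at least one instant."""
        return self.start_s <= other.end_s and other.start_s <= self.end_s
```

With intervals like these, a statement such as "the rule was applied before the node reported Active" becomes a plain comparison between two stamped actions.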

@chcorbato

> However, the mode manager will not actively report that a certain mode is not available. With #43, however, it will be possible for the metacontrol to find out which modes are available.

I agree. So the current design proposal is that the Mode Manager only informs about available and reachable modes, and Metacontrol is responsible for inferring from that whether reconfiguration actions succeeded.
See below for how to model that reasoning.
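One possible shape for that inference, as a sketch only: the Metacontrol compares the mode it requested with the current mode and with the set of available modes. The function `infer_outcome`, its three outcome labels, and the mode names in the example are assumptions, and the exact information exposed through #43 may differ.

```python
from typing import Set


def infer_outcome(requested: str, current: str, available: Set[str]) -> str:
    """Classify a reconfiguration from mode information alone.

    requested: the mode the Metacontrol asked for.
    current: the mode the (sub)system is in now.
    available: modes reported as available for that (sub)system.
    """
    if current == requested:
        return "succeeded"
    if requested in available:
        # Still reachable, so the transition may simply not be done yet.
        return "pending"
    return "unreachable"
```

For instance, `infer_outcome("NORMAL", "DEGRADED", {"DEGRADED"})` yields `"unreachable"`, which is the case where the Metacontrol would have to pick another reconfiguration. How long a `"pending"` result is tolerated remains a timing decision.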

> This is also a question of timing: any state or mode transition takes some time (milliseconds to seconds, maybe), even in the normal, non-failure case. So it is not entirely clear when someone (the mode manager? metacontrol?) should decide that a transition or rule didn't work out and that other actions have to be taken. I think this kind of decision, i.e. how long to wait for a node to recover or a rule to take effect, is best placed in the metacontrol, since it is probably task-specific.

Very good point indeed, so far we are not accounting for timing issues.

How do we include timing constraints for node management? These could be considered metacontrol requirements for the robotic application:

> This implies the incorporation of some timestamping and temporal [interval] reasoning. We can incorporate some concepts from e.g. UML2 or UML MARTE.

@rsanz can you point to the specific concepts?
I think we need to specify some modelling requirements (see below) to evaluate which concepts we need.

Modelling reqs for reconfiguration actions and timing

  • The Metacontroller needs info on how long a reconfiguration action can take in order to decide whether it succeeded or failed. This depends on:
    • the type of reconfiguration action: mode change, re-mapping, deploy node (we decided all nodes would be deployed, a req for the mode manager)
    • the node/subsystem being reconfigured

We could provide this information in the MROS model of the system (Darko's metamodel), as we are doing with the QAs, but I think it is more related to the specific software components than to the application logic.

We could define default values in the MROS metacontroller to assume when no info is provided.
E.g. assume a node mode change takes up to 2 s and a subsystem mode change up to 5 s.
@jginesclavero @lbajo @marioney @fmrico what numbers are reasonable for navigation2 nodes?
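The default-plus-override idea could be sketched as a lookup keyed by action type and target. The 2 s and 5 s defaults are the example values from this comment; the names `DEFAULT_TIMEOUT_S` and `timeout_for`, the action labels, and the override value in the usage note are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Example defaults from the discussion, in seconds, keyed by action type.
DEFAULT_TIMEOUT_S: Dict[str, float] = {
    "node_mode_change": 2.0,
    "subsystem_mode_change": 5.0,
}


def timeout_for(action: str, target: str,
                overrides: Optional[Dict[Tuple[str, str], float]] = None
                ) -> float:
    """Return how long a reconfiguration action may take before it fails.

    overrides maps (action, target) pairs to timeouts provided by the
    system model; the per-action default is used when no entry exists.
    """
    if overrides and (action, target) in overrides:
        return overrides[(action, target)]
    return DEFAULT_TIMEOUT_S[action]
```

A model could then supply something like `{("node_mode_change", "laser_driver"): 3.5}` for a slow component, where 3.5 is a made-up number, while every other node falls back to the defaults.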

@norro norro closed this as completed Apr 9, 2021

4 participants