Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

race condition in actionlib:action-client #8

Open
airballking opened this issue Dec 5, 2013 · 8 comments
Open

race condition in actionlib:action-client #8

airballking opened this issue Dec 5, 2013 · 8 comments

Comments

@airballking
Copy link
Contributor

I discovered a race condition in the action-client implementation.

What's happening:

  • I am sending out a goal which is cancelled from the client after some feedback messages have been received.
  • Directly afterwards, I'm sending out another goal to that server which is again cancelled after some feedback messages.
  • So far, everything works fine.
  • If I repeat this procedure, the server responds fine. However, after finishing the 2nd goal, the client signals warning that the first goal-handle has been lost.
  • From now-on, the first of the two goal-handles is repeatedly lost. However, the server execution is fine.

I created a minimalistic client provoking this behaviour here:
https://github.com/airballking/actionlib_client_tests

Additionally, I added some printouts to some action-client-internal functions to see what's going on. Here's my printout:

** START PRINTOUT **

*** FIRST CALL WITH NEWLY-STARTED SERVER CLIENT GOES FINE ***
ACTIONLIB-CLIENT-TESTS> (provoke-race-condition)

send-goal: goald-id=/fibonacci-lisp-client_GOAL_13862289470430000000

send-goal: goald-id=/fibonacci-lisp-client_GOAL_13862289480790000000

action-client-transition: goal-id=/fibonacci-lisp-client_GOAL_13862289470430000000, server-state=PREEMPTED, client-state:DONE

client-result-callback: goal-id=/fibonacci-lisp-client_GOAL_13862289470430000000, server-state=PREEMPTED, client-state:DONE
maybe-finish-goal: removing-goal goal-id=/fibonacci-lisp-client_GOAL_13862289470430000000

action-client-transition: goal-id=/fibonacci-lisp-client_GOAL_13862289480790000000, server-state=PREEMPTED, client-state:DONE

client-result-callback: goal-id=/fibonacci-lisp-client_GOAL_13862289480790000000, server-state=PREEMPTED, client-state:DONE
maybe-finish-goal: removing-goal goal-id=/fibonacci-lisp-client_GOAL_13862289480790000000
NIL
:ABORTED

*** SECOND CALL LOSES GOAL-HANDLE ***
ACTIONLIB-CLIENT-TESTS> (provoke-race-condition)

send-goal: goald-id=/fibonacci-lisp-client_GOAL_13862289611860000000

send-goal: goald-id=/fibonacci-lisp-client_GOAL_13862289621880000000

action-client-transition: goal-id=/fibonacci-lisp-client_GOAL_13862289611860000000, server-state=PREEMPTED, client-state:DONE

action-client-transition: goal-id=/fibonacci-lisp-client_GOAL_13862289621880000000, server-state=PREEMPTED, client-state:DONE

client-result-callback: goal-id=/fibonacci-lisp-client_GOAL_13862289621880000000, server-state=PREEMPTED, client-state:DONE
maybe-finish-goal: removing-goal goal-id=/fibonacci-lisp-client_GOAL_13862289621880000000
[(ACTION-CLIENT STATE) WARN] 1386228968.586: Goal `/fibonacci-lisp-client_GOAL_13862289611860000000' is lost. Terminating goal.

action-client-transition: goal-id=/fibonacci-lisp-client_GOAL_13862289611860000000, server-state=LOST, client-state:DONE
maybe-finish-goal: removing-goal goal-id=/fibonacci-lisp-client_GOAL_13862289611860000000
NIL
:ABORTED
ACTIONLIB-CLIENT-TESTS>

** END PRINTOUT **

It seems the result for the first goal-handle (in the second run) is never received. As a result, the goal-handle is never removed and hence eventually lost. Echoing the result-topic, I verified that the server did send result messages out.

Anybody has an idea why the result subscriber of action-client never processes the second result message?

Additionally, when aborting the action-client seems to return to the caller before it received status-updates or result. Is this desirable?

@airballking
Copy link
Contributor Author

After further digging: The result-callback of the action-client does receive the result-message. However, the call to (has-goal ... ) inside the callback fails. Seems, the goal has not been or is no longer registered when the result-message comes in. I'll keep looking.

@moesenle
Copy link
Contributor

moesenle commented Dec 5, 2013

In case of cancellation, does your server send a result message? If yes, is
that correct behavior in terms of the action server protocol?

On Thu, Dec 5, 2013 at 10:05 AM, Georg Bartels [email protected]:

After further digging: The result-callback of the action-client does
receive the result-message. However, the call to (has-goal ... ) inside the
callback fails. Seems, the goal has not been or is no longer registered
when the result-message comes in. I'll keep looking.


Reply to this email directly or view it on GitHubhttps://github.com//issues/8#issuecomment-29881880
.

@airballking
Copy link
Contributor Author

Yes, the server sends a result message after it got into state PREEMPTED. PREEMPTED is a terminal state of the protocol. The protocol requires the server to send a result message once it reached a terminal state. So, all seems fine from the server.

@moesenle
Copy link
Contributor

moesenle commented Dec 5, 2013

And is the client removing the goal from the set of tracked goals on
cancellation? If yes, remove that. Is the cancelled goal still on the
status topic? If no it's a server issue.

On Thu, Dec 5, 2013 at 10:47 AM, Georg Bartels [email protected]:

Yes, the server sends a result message after it got into state PREEMPTED.
PREEMPTED is a terminal state of the protocol. The protocol requires the
server to send a result message once it reached a terminal state. So, all
seems fine from the server.


Reply to this email directly or view it on GitHubhttps://github.com//issues/8#issuecomment-29884344
.

@airballking
Copy link
Contributor Author

  • cancel-goal only sends out the cancel-message. It does not remove the goal-handle.
  • The cancelled goal is still on the status topic from the server. The server stops publishing it after the succeeding goal has been cancelled.

I get the feeling that my problem are multiple instances of action-clients on the lisp side. If you look into my example, I create a new action-client every time I call (provoke-race-condition). I played around in the REPL and lost goal-handles appeared once I created the second/third action-client. With only one action-client, I can send and cancel as many goals as I want.

@airballking
Copy link
Contributor Author

I have used my applications without problem (and quick send/cancel of goals) for a week now. All fine. So, I guess it's paramount to create just one instance of action-client per action.

Can you think of an easy way of enforcing this? Topics keep a register of subscriptions. We could do the same for action-clients and return an error every time a user wants to re-create a specific action- client. It's limiting the behaviour of the lib but clarifying to the user what the intended usage is.

@moesenle
Copy link
Contributor

I don't like singletons and I multiple action clients for the same action
should definitely possible. Just to be sure: was anything bad happening ore
were there just warnings? Could that even be a bug in roslisp?

On Fri, Dec 13, 2013 at 10:09 AM, Georg Bartels [email protected]:

I have used my applications without problem (and quick send/cancel of
goals) for a week now. All fine. So, I guess it's urgent to create just one
instance of action-client per action.

Can you think of an easy way of enforcing this? Topics keep a register of
subscriptions. We could do the same for action-clients and return an error
every time a user wants to re-create a specific action- client. It's
limiting the behaviour of the lib but clarifying to the user what the
intended usage is.


Reply to this email directly or view it on GitHubhttps://github.com//issues/8#issuecomment-30495459
.

@airballking
Copy link
Contributor Author

Nothing bad happened. I just got warnings about dropped goal-handles but the goal was sent out, the server performed correctly, and cancel also correctly stopped the server.

I'm not sure where this bug is coming from, to be honest.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants