Introduce a robot queuing system #214

Open
mxgrey opened this issue Jun 9, 2022 · 7 comments
Labels
enhancement New feature or request

Comments

@mxgrey
Contributor

mxgrey commented Jun 9, 2022

There are cases where multiple mobile robots may be simultaneously blocked waiting for the same resource, such as a lift, workcell, or access to a pickup/dropoff point. In those cases it is useful to have a formal queuing system that allows the robots to smoothly and consistently approach the resource 1-by-1 as it becomes available to each of them.

We should develop a queuing system by first developing the reservation system and then integrating the reservation capabilities with the traffic planner. We should also add a feature to the traffic editor that allows system integrators to graphically define what the queuing behavior should be.

@itsiashu

I have been looking into some of the core designs, features, and current issues and challenges.
How to handle traffic negotiation is one such important aspect.

Based on my experience, I would suggest a simpler design and approach that can handle this well.

Idea: Rather than looking at this as a problem of path planning and resolution, I would look at it from the perspective of behavior coordination.

Design: I would introduce Coordinator control software (essentially a ROS 2 node package; think of it as a traffic light agent or traffic police).

It would keep an internal state for each robot/AGV. The state would be a simple state machine for each participant, created as and when they appear or approach.

States could be as below:
enum class State { UNREGISTER, REGISTER, ACTIVE, FAULTED };

Along with this, I would introduce an "Active write" flag.

Assume a designated path/track (with a bit of float inside it as well), as shown in my rough sketch. When bot "A" approaches the Zone/Cafe to pick up a Coke, it will be registered with the Traffic-Coordinator as "Registered".
It will request "active" write permission, which the Traffic-Coordinator grants by setting the active flag for the bot.

The other bot, bot "B", approaches in the same fashion (perhaps from another entry or similar), gets registered by the Traffic-Coordinator, and maintains its own internal states.
Bot "B"'s request for active participation will be denied, and hence bot "B" will stop moving. Its cycle state could be "SPOTTING_TO_PICKUP", but movement cannot happen as long as it doesn't have the active participation flag.

Bot "A" will pick up, move out, and release the "active" flag; after exiting, it will be unregistered by the Traffic-Coordinator.
The Traffic-Coordinator and the bots communicate through heartbeat (HB) messages. Bot "B"'s HB message will still have "Active request" set, so the Traffic-Coordinator will now grant the "Active" flag to bot "B".

Bot "B", upon receiving the active flag, will happily start moving to the pickup and will not have any more conflicts.
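The handover described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the state names come from the proposal, while `TrafficCoordinator`, `register_bot`, `request_active`, and `release_and_unregister` are hypothetical names and not part of Open-RMF or any existing API.

```cpp
#include <map>
#include <optional>
#include <string>

// State names follow the proposal; everything else in this sketch is hypothetical.
enum class BotState { UNREGISTER, REGISTER, ACTIVE, FAULTED };

class TrafficCoordinator {
public:
  // A bot approaching the zone is registered but not yet allowed to move in.
  void register_bot(const std::string& id) { states_[id] = BotState::REGISTER; }

  // Called for every heartbeat that carries an "active request".
  // Returns true if the bot holds the active flag after this call.
  bool request_active(const std::string& id) {
    auto it = states_.find(id);
    if (it == states_.end() || it->second == BotState::FAULTED) return false;
    if (!active_) {
      active_ = id;
      it->second = BotState::ACTIVE;
    }
    return active_ == id;
  }

  // The active bot releases the flag when it leaves and is then unregistered.
  void release_and_unregister(const std::string& id) {
    if (active_ == id) active_.reset();
    states_.erase(id);
  }

  // Unknown bots are reported as UNREGISTER.
  BotState state(const std::string& id) const {
    auto it = states_.find(id);
    return it == states_.end() ? BotState::UNREGISTER : it->second;
  }

private:
  std::map<std::string, BotState> states_;
  std::optional<std::string> active_;  // holder of the single active flag, if any
};
```

Note that this sketch grants the flag to whichever waiting bot's request arrives first after a release; it does not enforce any ordering among waiting bots.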

Scale up:
The approach can be scaled up when there are multiple PICKUP points in the Cafe, so that there may be more than one active bot at the same time.
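One possible way to picture the scaled-up variant is to keep one active slot per pickup point rather than one per zone. The sketch below assumes exactly that; `PickupSlots` and its methods are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical sketch: one active slot per pickup point instead of one per zone.
class PickupSlots {
public:
  explicit PickupSlots(const std::vector<std::string>& pickup_ids) {
    for (const auto& p : pickup_ids) holder_[p] = std::nullopt;
  }

  // Grants the slot if the pickup point exists and is free,
  // or if it is already held by the requesting bot.
  bool request_active(const std::string& bot, const std::string& pickup) {
    auto it = holder_.find(pickup);
    if (it == holder_.end()) return false;
    if (!it->second) it->second = bot;
    return it->second == bot;
  }

  // Only the current holder can release a slot.
  void release(const std::string& bot, const std::string& pickup) {
    auto it = holder_.find(pickup);
    if (it != holder_.end() && it->second == bot) it->second.reset();
  }

  // Number of bots that currently hold an active slot.
  std::size_t active_count() const {
    std::size_t n = 0;
    for (const auto& [pickup, bot] : holder_) {
      if (bot) ++n;
    }
    return n;
  }

private:
  std::map<std::string, std::optional<std::string>> holder_;
};
```

With two pickup points, two bots can be active at once while a third bot asking for an occupied point is still denied.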

Advantages: no lock contention, simpler resource management, a per-bot internal state design, and better performance.

Happy to drill down and discuss my design, approach, HLD, etc. in more detail if required @mxgrey @Yadunund

@itsiashu

[Image: Traffic_Negotiation_rmf]

@arjo129
Member

arjo129 commented Dec 24, 2024

@itsiashu my understanding is that you are describing some sort of centralized traffic manager. I'll let @mxgrey and @Yadunund respond to that. In the current generation of open-rmf we made certain design decisions based on our needs to deploy in our specific settings. With future generations we would hope to add more flexibility and allow schemes like what you are suggesting.

Reading your proposal, one question that jumps to mind is: what if a path is permanently blocked by a robot? Does that mean other robots don't re-route?

On a side note there is now a reservation system in open-rmf: #325

I'm not sure if we want to tackle this queuing behavior in the current generation. It should not be too hard if we extend the reservation node a little. In any case, the reservation node does manage queues.

@itsiashu

@arjo129
Thank you for the quick evaluation.

On your question:

> Reading your proposal one question that jumps to mind is what if a path is permanently blocked by a robot, does that mean other robots don't re-route?

That comes under health monitoring and maintenance via heartbeat. Think of it as a practical scenario:
when can a bot block a path/zone? Only when there is some issue with that bot, that is, the bot breaking down, losing connectivity, and/or reporting an error/fault code in its periodic updates to the Traffic Monitor.

In such cases it's not advisable, either way, to allow other traffic on the designated path or in the zone. In the typical use cases and business logic I have dealt with, that Zone/Path is put into a failed condition by reporting "error".

Yes, Zones would publish their own status via the Traffic Coordinator service, which the other bots/AGVs subscribe to.

Next, there are two ways to deal with this and recover from it:

  1. When the bot is broken, fix/service that bot
  2. a.) Let the operator clear the error through UI/backend-CLI triggers after that
    b.) Or, when the broken bot is repaired, it starts sending healthy heartbeats again, so upon receiving them the Zone/Path can self-heal and traffic will start moving.

Having said that, I have implemented both designs, and I would say 1) and 2.a) are safer and give operators more control via a monitoring service. Implementation and design wouldn't be too difficult; a Qt/C++ simulation and design can be provided for operator control.
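To make the two recovery options concrete, here is a minimal sketch of a per-zone health monitor. It assumes a heartbeat timeout and a `self_heal` switch that selects between operator clearing (2.a) and automatic recovery (2.b); `ZoneHealth`, its methods, and the timeout value are all hypothetical and only illustrate the idea.

```cpp
#include <chrono>
#include <map>
#include <string>

using Clock = std::chrono::steady_clock;

// Hypothetical zone health monitor; the timeout and recovery switch are assumptions.
class ZoneHealth {
public:
  ZoneHealth(std::chrono::milliseconds timeout, bool self_heal)
  : timeout_(timeout), self_heal_(self_heal) {}

  // Record a heartbeat. `healthy` is false when the bot reports a fault code.
  void on_heartbeat(const std::string& bot, bool healthy, Clock::time_point now) {
    last_seen_[bot] = now;
    if (!healthy) {
      faulted_ = true;
      culprit_ = bot;
    } else if (self_heal_ && faulted_ && bot == culprit_) {
      faulted_ = false;  // option 2.b: healthy heartbeats from the repaired bot
    }
  }

  // Put the zone into the failed condition if any known bot has gone silent.
  void check(Clock::time_point now) {
    for (const auto& [bot, seen] : last_seen_) {
      if (now - seen > timeout_) {
        faulted_ = true;
        culprit_ = bot;
      }
    }
  }

  // Option 2.a: an operator clears the error explicitly. The stale entry is
  // dropped so the next check() does not immediately fault the zone again.
  void operator_clear() {
    last_seen_.erase(culprit_);
    faulted_ = false;
  }

  bool faulted() const { return faulted_; }

private:
  std::chrono::milliseconds timeout_;
  bool self_heal_;
  bool faulted_ = false;
  std::string culprit_;
  std::map<std::string, Clock::time_point> last_seen_;
};
```

The current time is passed in explicitly so the behavior can be exercised without real delays. With `self_heal` set to false, the zone stays in the failed condition even after healthy heartbeats resume, until `operator_clear()` is called.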

The design itself will be modular and comprehensive enough to deal with all such cases:

  1. Build a Traffic Coordinator service
  2. Bots register, and their internal states are monitored per bot by the Traffic Coordinator
  3. The Traffic Coordinator will be per Zone (it can be deployed per Zone with different settings, such as Zone ID, and with different bots to handle)
  4. Deploy the Traffic Coordinator service on the same node/PC or on a different node to distribute task handling

@Yadunund
Member

The behavior described is certainly fine for managing a given queue, and if we design the APIs to be modular, it can be one type of behavior users can rely on (i.e. allow queuing policies/behaviors to be customizable).
However, having a robot reach a zone/queue only to learn that the queue is full, or that it needs to wait in line for a long time, is sub-optimal: the robot might have been able to perform another task instead. Incorporating queue policies, sizes, and states into plans and updating task allocations dynamically could lead to more optimal operations.

The bigger problems to tackle imo are

  • Describing conventions and policies for the queues (eg. robots can enter and leave the queuing area any time vs FIFO and robots have to remain in the queue) and API designs to support these behaviors.
  • Factoring in waiting times in queues when allocating tasks and re-allocating tasks to accommodate delays or errors.
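As a rough illustration of the second point, a task allocator could fold an estimated queue wait into the cost of each candidate. The sketch below is not how Open-RMF allocates tasks; `Candidate`, `estimated_cost`, and `pick_cheapest` are hypothetical names, and the model assumes each queued robot occupies the resource for a fixed average service time and that no other robot joins the queue ahead of this one.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of a candidate task destination.
struct Candidate {
  std::string task_id;
  double travel_s;           // estimated travel time to the queue, in seconds
  std::size_t queue_length;  // robots already waiting there
  double service_s;          // assumed average time each queued robot needs
};

// Earliest time (in seconds from now) at which the robot could start using
// the resource. The queue drains while the robot travels, so the two overlap.
double estimated_cost(const Candidate& c) {
  const double queue_clear_s = static_cast<double>(c.queue_length) * c.service_s;
  return std::max(c.travel_s, queue_clear_s);
}

// Returns the candidate with the lowest estimated cost, or nullptr if empty.
const Candidate* pick_cheapest(const std::vector<Candidate>& candidates) {
  const Candidate* best = nullptr;
  for (const auto& c : candidates) {
    if (best == nullptr || estimated_cost(c) < estimated_cost(*best)) {
      best = &c;
    }
  }
  return best;
}
```

Under this model, a nearby destination with a long queue can lose to a farther destination with no queue, which is the kind of trade-off that ignoring waiting times would miss.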

@itsiashu

Agreed. We can take a mix of approaches here for our business logic.

  • Policy and rules will decide when a bot enters the queue/zone or exits it.
  • Some of the rules can be based on size/shape/average speed

The running queue length can be important too.

  • If there are already sufficient bots in the queue, it's advisable to re-purpose the current bot for another task/Zone area
  • Queue size can be set based on use cases and can be configured based on dispatch/tasks

If we allow exiting, perhaps we can do so first for the most recently added bots, to allow better movement.
Otherwise, a designated exit track might be needed.

If we keep the queue size small at any one time (e.g. 3), waiting and exiting will be easier to handle. This also helps keep the Zone from getting too congested during pickups.
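A bounded queue along these lines might look like the sketch below. It assumes FIFO service, a fixed capacity, and early exit offered to the most recently added bot first; `ZoneQueue` and its methods are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <deque>
#include <optional>
#include <string>

// Hypothetical bounded queue for one zone.
class ZoneQueue {
public:
  explicit ZoneQueue(std::size_t capacity) : capacity_(capacity) {}

  // Returns false when the queue is full, signalling the caller to
  // re-purpose the bot for another task or zone.
  bool try_enter(const std::string& bot) {
    if (bots_.size() >= capacity_) return false;
    bots_.push_back(bot);
    return true;
  }

  // The bot at the front of the queue is served next (FIFO).
  std::optional<std::string> serve_next() {
    if (bots_.empty()) return std::nullopt;
    std::string bot = bots_.front();
    bots_.pop_front();
    return bot;
  }

  // Early exit is offered to the most recently added bot first.
  std::optional<std::string> exit_newest() {
    if (bots_.empty()) return std::nullopt;
    std::string bot = bots_.back();
    bots_.pop_back();
    return bot;
  }

  std::size_t size() const { return bots_.size(); }

private:
  std::size_t capacity_;
  std::deque<std::string> bots_;
};
```

With a capacity of 3, a fourth bot is turned away at `try_enter` and can be re-purposed, while the newest queued bot can leave without disturbing the bots ahead of it.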

If we allow bots to enter the zone while one or more bots may already be inside, there are a few more things to handle:

  1. Allow safer navigation around each other [e.g. A is broken and B moves around it, given a policy that says it can do so safely]

  2. When a bot stops, it can release the "Active" flag
    2 a. A first, simple version can have only one "Moving" bot at a time; that bot will be active for "SPOT_FOR_PICKING" and then "PICKING". The others will be "PAUSED" and only "REGISTERED".
    This will allow safe and easier movement for coordination.

    2 b. The next version could have multiple pick points, so there may be more than one "ACTIVE" bot, each in "SPOT_FOR_PICKING" for a different PICK_UP point (uuid).
    This also means ordering is not important, so, based on average speed and pickup points, the bots can exit independently. If they catch up with each other or come within observable distance of each other in the zone, the one farther from the exit point can yield to the other.

Depending on practical considerations, 2 b) could be moderately to quite tricky, but deployment scenarios and business logic will guide.

@itsiashu

itsiashu commented Dec 26, 2024

@arjo129

> On a side note there is now a reservation system in open-rmf: #325

Thank you for providing the pointer. I reviewed the changes. Good level of detail there!

Yes, depending on requirements, handling queuing spots will be an extension as well, either in the current Reservation Node or in a separate Queuing node for Zone conflict resolution.
Queuing spot reservation handling, requests, tickets, and claims can be nearly identical as well. It is a good idea to separate this concern into its own package, since waypoint processing and maintenance of the related code will ideally be different.

> I'm not sure if we want to tackle this queuing behavior in the current generation

Sure, timeline and priorities will drive this. Based on that, it can be a two-part scheme:

  1. A queuing reservation system for Zone conflict resolution
  2. Within the Zone, a behavior coordination policy

The nice thing about it is that it can be abstracted out as a separate fleet-adapter API for customers, which can be enabled via config --- bool _zone_queing_enable = true/false
