PPEs are sometimes not populated - solvable changed to 0 #96

Affie · 2022-04-06T19:06:51Z

I have a graph with one of the variables not producing a PPE and the result is skewed.

vars[5] = Dict{String, Any}(
"label" => "x1", 
"ppes" => Any[], 
"variableType" => "RoME.Pose2", 
"tags" => Any["VARIABLE"], 
"timestamp" => "2022-04-04T08:28:01.159Z", 
"_version" => "0.18.1")

My theory is: solvable is changed to 0 and therefore it's not included in the solve, hence no ppe.

Reference sessions:
robotId = "SDKjl_f97d"
sessionId = "Tutorial2_b376"
and
robotId = SDKjl_cdfe
sessionId = Tutorial1_5bfa

The text was updated successfully, but these errors were encountered:

jim-hill-r · 2022-04-06T21:09:06Z

I have seen this once before. @dehann if a solve crashed mid-way would it be possible solve was set to zero and never reverted? This is my first guess at the cause.

dehann · 2022-04-07T06:52:34Z

if the solve crashed then all bets are off. We currently have no safety net for this. I think we should do two fixes for these cases:

Status to user the solve failed and a new one is restarting -- results might be skewed message,
Something in the solver to overcome potential skewness -- this is a little tricky, as the easiest is to always start the solve from an init backup solveKey. But if it crashed, it may well just crash again at the same place.

(not for today) This will naturally lead to the need for the cloud system to compile a crash report with the session name and the call stack error message.

GearsAD · 2022-04-07T10:59:08Z

Do we set solvable=0 inside the solver during solving though? If so, why do we do this?

Was interesting, because I can't think of a reason we would do that

jim-hill-r · 2022-04-07T13:14:07Z

Do we set solvable=0 inside the solver during solving though? If so, why do we do this?

Was interesting, because I can't think of a reason we would do that

I have no idea if we are. It was just a theory and likely a red herring, hence the question.

if the solve crashed then all bets are off.

Solvers writing to the production db need to be transactional/atomic in the cloud. Let's do an engineering session on how to make that possible.

But back to the actual issue at hand. Does the solver write solvable=0 at any point or is that a red herring?

Affie · 2022-04-07T18:52:55Z

But back to the actual issue at hand. Does the solver write solvable=0 at any point or is that a red herring?

I checked and our theory in the discussion looks correct.

If a variable is orphaned, ie with no connected factors, it is marked as solvable=0.

So I think it is as we suspect that a solve is triggered with the factor pushed to the back of the queue.

Affie · 2022-04-07T19:00:07Z

It is also possible to cause an error from the solver. I'll open an issue with IIF.

jim-hill-r · 2022-04-07T19:01:52Z

Nice find Johan. So as long as we fail on waitForCompletion, we should be good for ICRA. I think handling this in Caesar would be the long term fix yes? Not sure the fix but something like ignore the node but leave it solvable...

Affie · 2022-04-07T19:14:34Z

Luckily it wasn't too hard to test, IIF emits a warning:
Warning: solveTree! dissallows solvable variables without any connected solvable factors -- forcing solvable=0 on [:x1]

So as long as we fail on waitForCompletion, we should be good for ICRA.

I think so, yes.

I think handling this in Caesar would be the long term fix yes?

I'll think about it more in the morning. We should also catch islands with no priors on them JuliaRobotics/IncrementalInference.jl#1512.

jim-hill-r · 2022-04-07T21:12:19Z

If there is an issue on orphans I can move the conversation there, but I like the idea of solvable being, "the user wants this to be included" and maybe a new field called "orphaned" that is the solver engine saying, "I can't use this, sorry, maybe next time". As we think through this, we should consider how these fields might be used for "ghost" variables in the context of out of order factor injection. In the end maybe we end up with three fields? Solvable, Orphaned, and Ghost (ie placeholder).

dehann · 2022-04-10T21:33:51Z

maybe a new field called "orphaned" that is the solver engine saying, "I can't use this, sorry, maybe next time"

I'd avoid a dedicated orphan flag and rather use the field i previously developed (not in the new GQL versions yet) which back then was called backendselect. orphaned status is already stored in the graph structure and don't want to duplicate state.

The solve should only touch variables for which the intersect exists, solvable=1 (i.e. what the user wants), and backendset=1 which is atomically flashed to 1 in DB just before solve on all nodes in that solver scope. When we recently started talking about this again, it was a bit more more complicated with the addition of multiple solver on solveKeys. In terms of MVP, we can assume there will be only one solver per session, so the backendset logic will work. When we have two solvers working the graph, more will need to be done, which will pull solveKeys into the conversation.

BTW, being able to solve orphaned graph fragments is a requirement. The issue is if a graph has no priors to gauge/ground the solution. There is no problem with two disconnected graph fragments, each with their own priors. The CSM logic in the tree caters for that case.

Check this out DFG v0.2: https://github.com/JuliaRobotics/DistributedFactorGraphs.jl/blob/9f0834d71e43c9db9ef20adaaa1adc103f76ed29/src/entities/DFGVariable.jl#L93
Also, `ready` later became `solvable`

So as long as we fail on waitForCompletion, we should be good for ICRA.

I think so, yes.

Yes, that would be the best way to work for ICRA.

Do we set solvable=0 inside the solver during solving though? If so, why do we do this?
Was interesting, because I can't think of a reason we would do that
I have no idea if we are. It was just a theory and likely a red herring, hence the question.

Strictly speaking, solvable should only be under the user's control. If the solver is bumping it to 0, then it's a corner case or workaround. However, the solver should NEVER automatically bump solvable=1.

The idea here is the user should be able to peacefully modify the graph with variables and factors, while the solver is working hard on parts of the graph the user has deemed solvable.

if the solve crashed then all bets are off.

Perhaps I should clarify, if there is a corner case where the solver did bump down a solvable=0 (and I'm not immediately remembering if detected priorless graph fragments are bumped), and the solve crashed because there was a priorless graph fragment failed numerically. It is possible (bets off part) that when calling the solve a second time on the graph would then find that some of these bumped nodes are now solvable=0. I'd have to really dig to remember if this is true or not.

However, calling solveGraph on the same graph over and over will either break at the same place, or for the same reason if solvable=0 needs to happen in many places. The idea is that if the user pushed in a bad graph with solvable=1, don't be surprised if the solver is struggling. I agree, good error messages will help. Attempt 1:

Luckily it wasn't too hard to test, IIF emits a warning:
Warning: solveTree! dissallows solvable variables without any connected solvable factors -- forcing solvable=0 on [:x1]

Affie added this to the v0.6.0 milestone Apr 19, 2023

Affie linked a pull request Apr 19, 2023 that will close this issue

SDK v0.6 #221

Merged

12 tasks

Affie added bug Something isn't working downstream labels Apr 19, 2023

Affie self-assigned this Apr 19, 2023

Affie closed this as completed in #221 Apr 19, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PPEs are sometimes not populated - solvable changed to 0 #96

PPEs are sometimes not populated - solvable changed to 0 #96

Affie commented Apr 6, 2022

jim-hill-r commented Apr 6, 2022

dehann commented Apr 7, 2022 •

edited

Loading

GearsAD commented Apr 7, 2022

jim-hill-r commented Apr 7, 2022

Affie commented Apr 7, 2022

Affie commented Apr 7, 2022 •

edited

Loading

jim-hill-r commented Apr 7, 2022

Affie commented Apr 7, 2022 •

edited

Loading

jim-hill-r commented Apr 7, 2022

dehann commented Apr 10, 2022 •

edited

Loading

PPEs are sometimes not populated - solvable changed to 0 #96

PPEs are sometimes not populated - solvable changed to 0 #96

Comments

Affie commented Apr 6, 2022

jim-hill-r commented Apr 6, 2022

dehann commented Apr 7, 2022 • edited Loading

GearsAD commented Apr 7, 2022

jim-hill-r commented Apr 7, 2022

Affie commented Apr 7, 2022

Affie commented Apr 7, 2022 • edited Loading

jim-hill-r commented Apr 7, 2022

Affie commented Apr 7, 2022 • edited Loading

jim-hill-r commented Apr 7, 2022

dehann commented Apr 10, 2022 • edited Loading

dehann commented Apr 7, 2022 •

edited

Loading

Affie commented Apr 7, 2022 •

edited

Loading

Affie commented Apr 7, 2022 •

edited

Loading

dehann commented Apr 10, 2022 •

edited

Loading