Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent NA return from failed subproblems #175

Closed
josherrickson opened this issue Jun 28, 2019 · 5 comments
Closed

Inconsistent NA return from failed subproblems #175

josherrickson opened this issue Jun 28, 2019 · 5 comments

Comments

@josherrickson
Copy link
Collaborator

When there are multiple subproblems, sometimes failed matches return NA and sometimes return 1.NA or some other prefix, which R doesn't recognize as NA. This is not consistent and it breaks matchfailed via subproblemSuccess.

First, things work fine with no subproblems.

> (f1 <- fullmatch(pr ~ cost, data = nuclearplants, min = 5, max = 5))
   H    I    A    J    B    K    L    M    C    N    O    P 
<NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
   Q    R    S    T    U    D    V    E    W    F    X    G 
<NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
   Y    Z    d    e    f    a    b    c 
<NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
> matchfailed(f1)
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[12] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[23] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

However, with subproblems, things get weird.

> (f2 <- fullmatch(pr ~ cost, data = nuclearplants, min = 5, max = 5, 
+                 within = exactMatch(pr ~ pt, data = nuclearplants)))
   H    I    A    J    B    K    L    M    C    N    O    P 
<NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
   Q    R    S    T    U    D    V    E    W    F    X    G 
<NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
   Y    Z    d    e    f    a    b    c 
<NA> <NA> 1.NA 1.NA 1.NA 1.NA 1.NA 1.NA 
> matchfailed(f2)
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[10]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[19]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE
[28] FALSE FALSE FALSE FALSE FALSE
> 
> (f3 <- fullmatch(pr ~ cost, data = nuclearplants, min = 60, max = 60, 
+                 within = exactMatch(pr ~ pt, data = nuclearplants)))
   H    I    A    J    B    K    L    M    C    N    O    P 
0.NA 0.NA 0.NA 0.NA 0.NA 0.NA 0.NA 0.NA 0.NA 0.NA 0.NA 0.NA 
   Q    R    S    T    U    D    V    E    W    F    X    G 
0.NA 0.NA 0.NA 0.NA 0.NA 0.NA 0.NA 0.NA 0.NA 0.NA 0.NA 0.NA 
   Y    Z    d    e    f    a    b    c 
0.NA 0.NA 1.NA 1.NA 1.NA 1.NA 1.NA 1.NA 
> matchfailed(f3)
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[10] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[19] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[28] FALSE FALSE FALSE FALSE FALSE
@josherrickson
Copy link
Collaborator Author

One way to address this downstream would be to modify subproblemSuccess and replace

 all(is.na(x))

with

all(is.na(x) | grepl("NA$", x))

Given the inconsistency of NA vs #.NA, I'm not sure which is intended. If #.NA is proper, this fix is necessary. If #.NA is a bug, then we should address that upstream.

@benthestatistician
Copy link
Collaborator

benthestatistician commented Jun 28, 2019 via email

@benthestatistician
Copy link
Collaborator

Reviewing this code again now, I'd support the modification to subproblemSuccess that was proposed in Josh's last comment here, namely replacing all(is.na(x)) with all(is.na(x) | grepl("NA$", x)). In other words, move the material aeaa6a2 inserted near the end of fullmatch.matrix into subproblemSuccess.

@josherrickson
Copy link
Collaborator Author

I made these modifications by adjusting subproblemSuccess's logical check and simplifying the bit in fullmatch appropriately.

I also expanded the tests for subproblemSuccess and added tests for matchfailed to hopefully get ahead of any further similar issues.

Leaving the issue open for @markmfredrickson to chime in as the NA/#.NA discrepancy is still concerning.

@benthestatistician
Copy link
Collaborator

Thanks for this follow-up, Josh!

Rather than inferring subproblem failure or success from whether all the matches are NA, or #.NA, it would be better to more directly pass the solver's answer to this question up the call stack and have subproblemSuccess just look at it. The issue54-hinting branch does some things that should enable this: in particular, passing up a table of subproblem information, one column of which notes whether the subproblem was found to be feasible or not. I've opened a new issue (#178) for that purpose, and am going to close out this one (but Mark don't let that stop you from sharing anything you might like to add).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants