Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix checks #316

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Fix checks #316

wants to merge 1 commit into from

Conversation

raffaem
Copy link

@raffaem raffaem commented Feb 4, 2025

Fixes #314

@raffaem
Copy link
Author

raffaem commented Feb 4, 2025

I don't know why this test fail:

test_that("different paging methods yield the same result", {
  skip_on_cran()

  w0 <- oa_fetch(
    entity = "works",
    title.search = c("bibliometric analysis", "science mapping"),
    cited_by_count = ">50",
    options = list(select = "id"),
    from_publication_date = "2021-01-01",
    to_publication_date = "2021-12-31",
    verbose = TRUE
  )

  w24 <- oa_fetch(
    entity = "works",
    title.search = c("bibliometric analysis", "science mapping"),
    cited_by_count = ">50",
    options = list(select = "id"),
    from_publication_date = "2021-01-01",
    to_publication_date = "2021-12-31",
    verbose = TRUE,
    pages = c(2, 4:5),
    per_page = 10
  )
  
  expect_equal(
    w0[c(11:20, 31:min(50, nrow(w0))), ],
    w24
  )

})

I cannot reproduce it in a clean environment

data-raw/gen-data.R Outdated Show resolved Hide resolved
Comment on lines +419 to +426
options = list(select = "id"),
from_publication_date = "2021-01-01",
to_publication_date = "2021-12-31",
options = list(select = "id"),
verbose = TRUE,
pages = c(2, 4:5),
per_page = 10,
verbose = TRUE
per_page = 10
)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsure whether this changes behavior or if it just shuffle arguments around (if latter, could we roll it back?)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just shuffle arguments around to have the same argument order of the previous call.

Still it doesn't fix the failure that I cannnot reproduce in a clean environment

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will need to look into the failing test separately...

DESCRIPTION Show resolved Hide resolved
Copy link
Collaborator

@yjunechoe yjunechoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! I see just a few changes which don't seem related to the failing checks: if they're not relevant, could you roll them back? (or let us know why you feel they're also necessary?)

@yjunechoe
Copy link
Collaborator

I cannot reproduce it in a clean environment

For what it's worth I also can't seem to replicate this...

@raffaem
Copy link
Author

raffaem commented Feb 4, 2025

I cannot reproduce it in a clean environment

For what it's worth I also can't seem to replicate this...

But the test fails also for you?

@yjunechoe
Copy link
Collaborator

As in, the test passes for me, so I can't replicate the failure on github actions!

@raffaem
Copy link
Author

raffaem commented Feb 4, 2025

As in, the test passes for me, so I can't replicate the failure on github actions!

The test fails for me.

But when I copy paste the code into a clean environment, it passes.

@yjunechoe
Copy link
Collaborator

Could you copy paste the exact message you get? Note that I also pushed a commit to your branch to resolve the warning so you'd want to pull that first

@raffaem
Copy link
Author

raffaem commented Feb 4, 2025

Could you copy paste the exact message you get? Note that I also pushed a commit to your branch to resolve the warning so you'd want to pull that first

It's written in #314

── Failure ('test-oa_fetch.R:426:3'): different paging methods yield the same result ──
w0[c(11:20, 31:min(50, nrow(w0))), ] (`actual`) not equal to `w24` (`expected`).

actual vs expected
                                               id
  actual[18, ]   https://openalex.org/W3155479378
  actual[19, ]   https://openalex.org/W3154112332
  actual[20, ]   https://openalex.org/W3185236960
- actual[21, ]   https://openalex.org/W3153953002
+ expected[21, ] https://openalex.org/W3158468941
- actual[22, ]   https://openalex.org/W3158468941
+ expected[22, ] https://openalex.org/W3153953002
  actual[23, ]   https://openalex.org/W3157160312
  actual[24, ]   https://openalex.org/W3210837814
  actual[25, ]   https://openalex.org/W3193436972

actual vs expected
                                               id
  actual[26, ]   https://openalex.org/W4200087920
  actual[27, ]   https://openalex.org/W3181733312
  actual[28, ]   https://openalex.org/W3176309448
- actual[29, ]   https://openalex.org/W3134011764
+ expected[29, ] https://openalex.org/W3120901632
- actual[30, ]   https://openalex.org/W3120901632
+ expected[30, ] https://openalex.org/W3135300045

actual$id[18:25] vs expected$id[18:25]
  "https://openalex.org/W3155479378"
  "https://openalex.org/W3154112332"
  "https://openalex.org/W3185236960"
- "https://openalex.org/W3153953002"
+ "https://openalex.org/W3158468941"
- "https://openalex.org/W3158468941"
+ "https://openalex.org/W3153953002"
  "https://openalex.org/W3157160312"
  "https://openalex.org/W3210837814"
  "https://openalex.org/W3193436972"

actual$id[26:30] vs expected$id[26:30]
  "https://openalex.org/W4200087920"
  "https://openalex.org/W3181733312"
  "https://openalex.org/W3176309448"
- "https://openalex.org/W3134011764"
+ "https://openalex.org/W3120901632"
- "https://openalex.org/W3120901632"
+ "https://openalex.org/W3135300045"

@yjunechoe
Copy link
Collaborator

Could you also paste in the message you get from verbose? When I run the same test and it passes, I see:

Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Getting 1 page of results with a total of 191 records...
Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Using basic paging...
Getting 3 pages of results with a total of 30 records...

@raffaem
Copy link
Author

raffaem commented Feb 4, 2025

Could you also paste in the message you get from verbose? When I run the same test and it passes, I see:

Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Getting 1 page of results with a total of 191 records...
Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Using basic paging...
Getting 3 pages of results with a total of 30 records...

What do you mean verbose?

That was the message I get from ðevtools::check()

@yjunechoe
Copy link
Collaborator

yjunechoe commented Feb 4, 2025

What do you mean verbose?

As in, because the oa_fetch() code in the test runs with verbose = TRUE, you should be seeing URLs to https://api.openalex.org. That seemed like a reasonable place to start looking at why the tests fail for one of us but not the other (maybe the same oa_fetch() call is constructing different queries between us)

So for example, the URLs that I got on my setup are such that:

w0_url <- "https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id"
w24_url <- "https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id"

identical(
  w0_url %>% oa_request() %>% works2df(),
  w24_url %>% oa_request() %>% works2df()
)
#> [1] TRUE

@raffaem
Copy link
Author

raffaem commented Feb 4, 2025

What do you mean verbose?

As in, because the oa_fetch() code in the test runs with verbose = TRUE, you should be seeing URLs to https://api.openalex.org. That seemed like a reasonable place to start looking at why the tests fail for one of us but not the other (maybe the same oa_fetch() call is constructing different queries between us)

So for example, the URLs that I got on my setup are such that:

w0_url <- "https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id"
w24_url <- "https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id"

identical(
  w0_url %>% oa_request() %>% works2df(),
  w24_url %>% oa_request() %>% works2df()
)
#> [1] TRUE
── Test failures ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── testthat ────

> library(testthat)
> library(openalexR)
openalexR v2.0.0 introduces breaking changes.
See NEWS.md for details.

To suppress this message, add `openalexR.message = suppressed` to your .Renviron file.
> 
> test_check("openalexR")
Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Getting 1 page of results with a total of 190 records...
Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Using basic paging...
Getting 3 pages of results with a total of 30 records...
Requesting url: https://api.openalex.org/works?search=transformative%20change&select=id%2Cdisplay_name%2Cpublication_date
Using basic paging...
Getting 1 page of results with a total of 20 records...
Requesting url: https://api.openalex.org/works?search=transformative%20change&select=id%2Cdisplay_name%2Cpublication_date
Using basic paging...
Getting 1 page of results with a total of 10 records...
Identifier is missing, please specify filter or search argument.
No citations and no references for W2033458484
[ FAIL 1 | WARN 0 | SKIP 2 | PASS 126 ]

══ Skipped tests (2) ═══════════════════════════════════════════════════════════
• Skipping (1): 'test-oa_ngrams.R:2:3'
• {coro} is not installed (1): 'test-coro.R:2:3'

══ Failed tests ════════════════════════════════════════════════════════════════
── Failure ('test-oa_fetch.R:427:3'): different paging methods yield the same result ──
w0[c(11:20, 31:min(50, nrow(w0))), ] (`actual`) not equal to `w24` (`expected`).

actual vs expected
                                               id
  actual[18, ]   https://openalex.org/W3155479378
  actual[19, ]   https://openalex.org/W3154112332
  actual[20, ]   https://openalex.org/W3185236960
- actual[21, ]   https://openalex.org/W3153953002
+ expected[21, ] https://openalex.org/W3157160312
- actual[22, ]   https://openalex.org/W3158468941
+ expected[22, ] https://openalex.org/W3185236960
- actual[23, ]   https://openalex.org/W3157160312
+ expected[23, ] https://openalex.org/W3153953002
  actual[24, ]   https://openalex.org/W3210837814
- actual[25, ]   https://openalex.org/W3193436972
+ expected[25, ] https://openalex.org/W4200087920
- actual[26, ]   https://openalex.org/W4200087920
+ expected[26, ] https://openalex.org/W3193436972
  actual[27, ]   https://openalex.org/W3181733312
and 3 more ...

actual$id[18:30] vs expected$id[18:30]
  "https://openalex.org/W3155479378"
  "https://openalex.org/W3154112332"
  "https://openalex.org/W3185236960"
- "https://openalex.org/W3153953002"
+ "https://openalex.org/W3157160312"
- "https://openalex.org/W3158468941"
+ "https://openalex.org/W3185236960"
- "https://openalex.org/W3157160312"
+ "https://openalex.org/W3153953002"
  "https://openalex.org/W3210837814"
- "https://openalex.org/W3193436972"
+ "https://openalex.org/W4200087920"
- "https://openalex.org/W4200087920"
+ "https://openalex.org/W3193436972"
  "https://openalex.org/W3181733312"
and 3 more ...

[ FAIL 1 | WARN 0 | SKIP 2 | PASS 126 ]
Error: Test failures
Execution halted

1 error ✖ | 0 warnings ✔ | 2 notes ✖
Error: R CMD check found ERRORs
Execution halted
[ble: exit 1][ble: elapsed 74.562s (CPU 26.9%)] R -e "devtools::check()"

@yjunechoe
Copy link
Collaborator

Ok so I'm not 100% sure but it seems that these lines from check() are from the problematic oa_fetch() test for you:

Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Getting 1 page of results with a total of 190 records...
Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Using basic paging...
Getting 3 pages of results with a total of 30 records...

But those are the same query URLs that I see too! Weirdly, the first URL only fetched 190 records for you, whereas for me that fetches 191 records.

Can you confirm that this is the case by running this code? do you get a different length?

w0_url <- "https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id"
w0_url %>% oa_request() %>% length()
#> [1] 191

@raffaem
Copy link
Author

raffaem commented Feb 4, 2025

Ok so I'm not 100% sure but it seems that these lines from check() are from the problematic oa_fetch() test for you:

Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Getting 1 page of results with a total of 190 records...
Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Using basic paging...
Getting 3 pages of results with a total of 30 records...

But those are the same query URLs that I see too! Weirdly, the first URL only fetched 190 records for you, whereas for me that fetches 191 records.

Can you confirm that this is the case by running this code? do you get a different length?

w0_url <- "https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id"
w0_url %>% oa_request() %>% length()
#> [1] 191
❯ R
Using 18 cores for parallelisation
> devtools::load_all()
ℹ Loading openalexR
openalexR v2.0.0 introduces breaking changes.
See NEWS.md for details.

To suppress this message, add `openalexR.message = suppressed` to your .Renviron file.
> w0_url <- "https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id"
w0_url %>% oa_request() %>% length()
[1] 190

@raffaem
Copy link
Author

raffaem commented Feb 4, 2025

I get a meta->count of 191 if I connect from the webbrowser

image

@yjunechoe
Copy link
Collaborator

yjunechoe commented Feb 4, 2025

Thanks! That's... really weird and I don't have a good intuition. We should probably pluck this out as a separate issue (now #317 - lets catch up there!)

Do you have anything you'd like to add to this PR?

@raffaem
Copy link
Author

raffaem commented Feb 4, 2025

no

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tests fail
2 participants