Fix checks #316

raffaem · 2025-02-04T11:49:40Z

Fixes #314

raffaem · 2025-02-04T11:50:08Z

I don't know why this test fail:

test_that("different paging methods yield the same result", {
  skip_on_cran()

  w0 <- oa_fetch(
    entity = "works",
    title.search = c("bibliometric analysis", "science mapping"),
    cited_by_count = ">50",
    options = list(select = "id"),
    from_publication_date = "2021-01-01",
    to_publication_date = "2021-12-31",
    verbose = TRUE
  )

  w24 <- oa_fetch(
    entity = "works",
    title.search = c("bibliometric analysis", "science mapping"),
    cited_by_count = ">50",
    options = list(select = "id"),
    from_publication_date = "2021-01-01",
    to_publication_date = "2021-12-31",
    verbose = TRUE,
    pages = c(2, 4:5),
    per_page = 10
  )
  
  expect_equal(
    w0[c(11:20, 31:min(50, nrow(w0))), ],
    w24
  )

})

I cannot reproduce it in a clean environment

data-raw/coverage-works.csv

data-raw/gen-data.R

yjunechoe · 2025-02-04T16:18:40Z

tests/testthat/test-oa_fetch.R

+    options = list(select = "id"),
    from_publication_date = "2021-01-01",
    to_publication_date = "2021-12-31",
-    options = list(select = "id"),
+    verbose = TRUE,
    pages = c(2, 4:5),
-    per_page = 10,
-    verbose = TRUE
+    per_page = 10
  )
+


Unsure whether this changes behavior or if it just shuffle arguments around (if latter, could we roll it back?)

It just shuffle arguments around to have the same argument order of the previous call.

Still it doesn't fix the failure that I cannnot reproduce in a clean environment

Yes, will need to look into the failing test separately...

DESCRIPTION

yjunechoe

Thanks for the PR! I see just a few changes which don't seem related to the failing checks: if they're not relevant, could you roll them back? (or let us know why you feel they're also necessary?)

yjunechoe · 2025-02-04T16:22:04Z

I cannot reproduce it in a clean environment

For what it's worth I also can't seem to replicate this...

raffaem · 2025-02-04T16:25:40Z

I cannot reproduce it in a clean environment

For what it's worth I also can't seem to replicate this...

But the test fails also for you?

yjunechoe · 2025-02-04T16:26:15Z

As in, the test passes for me, so I can't replicate the failure on github actions!

raffaem · 2025-02-04T16:27:40Z

As in, the test passes for me, so I can't replicate the failure on github actions!

The test fails for me.

But when I copy paste the code into a clean environment, it passes.

yjunechoe · 2025-02-04T16:29:24Z

Could you copy paste the exact message you get? Note that I also pushed a commit to your branch to resolve the warning so you'd want to pull that first

raffaem · 2025-02-04T16:33:56Z

Could you copy paste the exact message you get? Note that I also pushed a commit to your branch to resolve the warning so you'd want to pull that first

It's written in #314

── Failure ('test-oa_fetch.R:426:3'): different paging methods yield the same result ──
w0[c(11:20, 31:min(50, nrow(w0))), ] (`actual`) not equal to `w24` (`expected`).

actual vs expected
                                               id
  actual[18, ]   https://openalex.org/W3155479378
  actual[19, ]   https://openalex.org/W3154112332
  actual[20, ]   https://openalex.org/W3185236960
- actual[21, ]   https://openalex.org/W3153953002
+ expected[21, ] https://openalex.org/W3158468941
- actual[22, ]   https://openalex.org/W3158468941
+ expected[22, ] https://openalex.org/W3153953002
  actual[23, ]   https://openalex.org/W3157160312
  actual[24, ]   https://openalex.org/W3210837814
  actual[25, ]   https://openalex.org/W3193436972

actual vs expected
                                               id
  actual[26, ]   https://openalex.org/W4200087920
  actual[27, ]   https://openalex.org/W3181733312
  actual[28, ]   https://openalex.org/W3176309448
- actual[29, ]   https://openalex.org/W3134011764
+ expected[29, ] https://openalex.org/W3120901632
- actual[30, ]   https://openalex.org/W3120901632
+ expected[30, ] https://openalex.org/W3135300045

actual$id[18:25] vs expected$id[18:25]
  "https://openalex.org/W3155479378"
  "https://openalex.org/W3154112332"
  "https://openalex.org/W3185236960"
- "https://openalex.org/W3153953002"
+ "https://openalex.org/W3158468941"
- "https://openalex.org/W3158468941"
+ "https://openalex.org/W3153953002"
  "https://openalex.org/W3157160312"
  "https://openalex.org/W3210837814"
  "https://openalex.org/W3193436972"

actual$id[26:30] vs expected$id[26:30]
  "https://openalex.org/W4200087920"
  "https://openalex.org/W3181733312"
  "https://openalex.org/W3176309448"
- "https://openalex.org/W3134011764"
+ "https://openalex.org/W3120901632"
- "https://openalex.org/W3120901632"
+ "https://openalex.org/W3135300045"

yjunechoe · 2025-02-04T16:39:35Z

Could you also paste in the message you get from verbose? When I run the same test and it passes, I see:

Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Getting 1 page of results with a total of 191 records...
Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Using basic paging...
Getting 3 pages of results with a total of 30 records...

Fixes ropensci#314

raffaem · 2025-02-04T16:52:53Z

Could you also paste in the message you get from verbose? When I run the same test and it passes, I see:

Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Getting 1 page of results with a total of 191 records...
Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Using basic paging...
Getting 3 pages of results with a total of 30 records...

What do you mean verbose?

That was the message I get from ðevtools::check()

yjunechoe · 2025-02-04T16:56:20Z

What do you mean verbose?

As in, because the oa_fetch() code in the test runs with verbose = TRUE, you should be seeing URLs to https://api.openalex.org. That seemed like a reasonable place to start looking at why the tests fail for one of us but not the other (maybe the same oa_fetch() call is constructing different queries between us)

So for example, the URLs that I got on my setup are such that:

w0_url <- "https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id"
w24_url <- "https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id"

identical(
  w0_url %>% oa_request() %>% works2df(),
  w24_url %>% oa_request() %>% works2df()
)
#> [1] TRUE

raffaem · 2025-02-04T16:57:54Z

What do you mean verbose?

As in, because the oa_fetch() code in the test runs with verbose = TRUE, you should be seeing URLs to https://api.openalex.org. That seemed like a reasonable place to start looking at why the tests fail for one of us but not the other (maybe the same oa_fetch() call is constructing different queries between us)

So for example, the URLs that I got on my setup are such that:
w0_url <- "https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id"
w24_url <- "https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id"

identical(
  w0_url %>% oa_request() %>% works2df(),
  w24_url %>% oa_request() %>% works2df()
)
#> [1] TRUE

── Test failures ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── testthat ────

> library(testthat)
> library(openalexR)
openalexR v2.0.0 introduces breaking changes.
See NEWS.md for details.

To suppress this message, add `openalexR.message = suppressed` to your .Renviron file.
> 
> test_check("openalexR")
Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Getting 1 page of results with a total of 190 records...
Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Using basic paging...
Getting 3 pages of results with a total of 30 records...
Requesting url: https://api.openalex.org/works?search=transformative%20change&select=id%2Cdisplay_name%2Cpublication_date
Using basic paging...
Getting 1 page of results with a total of 20 records...
Requesting url: https://api.openalex.org/works?search=transformative%20change&select=id%2Cdisplay_name%2Cpublication_date
Using basic paging...
Getting 1 page of results with a total of 10 records...
Identifier is missing, please specify filter or search argument.
No citations and no references for W2033458484
[ FAIL 1 | WARN 0 | SKIP 2 | PASS 126 ]

══ Skipped tests (2) ═══════════════════════════════════════════════════════════
• Skipping (1): 'test-oa_ngrams.R:2:3'
• {coro} is not installed (1): 'test-coro.R:2:3'

══ Failed tests ════════════════════════════════════════════════════════════════
── Failure ('test-oa_fetch.R:427:3'): different paging methods yield the same result ──
w0[c(11:20, 31:min(50, nrow(w0))), ] (`actual`) not equal to `w24` (`expected`).

actual vs expected
                                               id
  actual[18, ]   https://openalex.org/W3155479378
  actual[19, ]   https://openalex.org/W3154112332
  actual[20, ]   https://openalex.org/W3185236960
- actual[21, ]   https://openalex.org/W3153953002
+ expected[21, ] https://openalex.org/W3157160312
- actual[22, ]   https://openalex.org/W3158468941
+ expected[22, ] https://openalex.org/W3185236960
- actual[23, ]   https://openalex.org/W3157160312
+ expected[23, ] https://openalex.org/W3153953002
  actual[24, ]   https://openalex.org/W3210837814
- actual[25, ]   https://openalex.org/W3193436972
+ expected[25, ] https://openalex.org/W4200087920
- actual[26, ]   https://openalex.org/W4200087920
+ expected[26, ] https://openalex.org/W3193436972
  actual[27, ]   https://openalex.org/W3181733312
and 3 more ...

actual$id[18:30] vs expected$id[18:30]
  "https://openalex.org/W3155479378"
  "https://openalex.org/W3154112332"
  "https://openalex.org/W3185236960"
- "https://openalex.org/W3153953002"
+ "https://openalex.org/W3157160312"
- "https://openalex.org/W3158468941"
+ "https://openalex.org/W3185236960"
- "https://openalex.org/W3157160312"
+ "https://openalex.org/W3153953002"
  "https://openalex.org/W3210837814"
- "https://openalex.org/W3193436972"
+ "https://openalex.org/W4200087920"
- "https://openalex.org/W4200087920"
+ "https://openalex.org/W3193436972"
  "https://openalex.org/W3181733312"
and 3 more ...

[ FAIL 1 | WARN 0 | SKIP 2 | PASS 126 ]
Error: Test failures
Execution halted

1 error ✖ | 0 warnings ✔ | 2 notes ✖
Error: R CMD check found ERRORs
Execution halted
[ble: exit 1][ble: elapsed 74.562s (CPU 26.9%)] R -e "devtools::check()"

yjunechoe · 2025-02-04T17:02:33Z

Ok so I'm not 100% sure but it seems that these lines from check() are from the problematic oa_fetch() test for you:

Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Getting 1 page of results with a total of 190 records...
Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Using basic paging...
Getting 3 pages of results with a total of 30 records...

But those are the same query URLs that I see too! Weirdly, the first URL only fetched 190 records for you, whereas for me that fetches 191 records.

Can you confirm that this is the case by running this code? do you get a different length?

w0_url <- "https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id"
w0_url %>% oa_request() %>% length()
#> [1] 191

raffaem · 2025-02-04T17:05:31Z

Ok so I'm not 100% sure but it seems that these lines from check() are from the problematic oa_fetch() test for you:

Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Getting 1 page of results with a total of 190 records...
Requesting url: https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id
Using basic paging...
Getting 3 pages of results with a total of 30 records...

But those are the same query URLs that I see too! Weirdly, the first URL only fetched 190 records for you, whereas for me that fetches 191 records.

Can you confirm that this is the case by running this code? do you get a different length?

w0_url <- "https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id"
w0_url %>% oa_request() %>% length()
#> [1] 191

❯ R
Using 18 cores for parallelisation
> devtools::load_all()
ℹ Loading openalexR
openalexR v2.0.0 introduces breaking changes.
See NEWS.md for details.

To suppress this message, add `openalexR.message = suppressed` to your .Renviron file.
> w0_url <- "https://api.openalex.org/works?filter=title.search%3Abibliometric%20analysis%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2021-01-01%2Cto_publication_date%3A2021-12-31&select=id"
w0_url %>% oa_request() %>% length()
[1] 190

raffaem · 2025-02-04T17:11:19Z

I get a meta->count of 191 if I connect from the webbrowser

yjunechoe · 2025-02-04T17:11:26Z

Thanks! That's... really weird and I don't have a good intuition. We should probably pluck this out as a separate issue (now #317 - lets catch up there!)

Do you have anything you'd like to add to this PR?

raffaem · 2025-02-04T17:13:42Z

no

yjunechoe reviewed Feb 4, 2025

View reviewed changes

data-raw/coverage-works.csv Outdated Show resolved Hide resolved

yjunechoe reviewed Feb 4, 2025

View reviewed changes

data-raw/gen-data.R Outdated Show resolved Hide resolved

yjunechoe reviewed Feb 4, 2025

View reviewed changes

DESCRIPTION Show resolved Hide resolved

yjunechoe requested changes Feb 4, 2025

View reviewed changes

raffaem force-pushed the fix_checks branch from 02c18ae to 8d7f551 Compare February 4, 2025 16:24

Fix checks

fba896a

Fixes ropensci#314

raffaem force-pushed the fix_checks branch from 8d7f551 to fba896a Compare February 4, 2025 16:51

yjunechoe mentioned this pull request Feb 4, 2025

oa_request() results length discrepancy between systems #317

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix checks #316

Fix checks #316

raffaem commented Feb 4, 2025

raffaem commented Feb 4, 2025

yjunechoe Feb 4, 2025

raffaem Feb 4, 2025

yjunechoe Feb 4, 2025

yjunechoe left a comment

yjunechoe commented Feb 4, 2025

raffaem commented Feb 4, 2025

yjunechoe commented Feb 4, 2025

raffaem commented Feb 4, 2025

yjunechoe commented Feb 4, 2025

raffaem commented Feb 4, 2025

yjunechoe commented Feb 4, 2025

raffaem commented Feb 4, 2025

yjunechoe commented Feb 4, 2025 •

edited

Loading

raffaem commented Feb 4, 2025

yjunechoe commented Feb 4, 2025

raffaem commented Feb 4, 2025 •

edited

Loading

raffaem commented Feb 4, 2025

yjunechoe commented Feb 4, 2025 •

edited

Loading

raffaem commented Feb 4, 2025

Fix checks #316

Are you sure you want to change the base?

Fix checks #316

Conversation

raffaem commented Feb 4, 2025

raffaem commented Feb 4, 2025

yjunechoe Feb 4, 2025

Choose a reason for hiding this comment

raffaem Feb 4, 2025

Choose a reason for hiding this comment

yjunechoe Feb 4, 2025

Choose a reason for hiding this comment

yjunechoe left a comment

Choose a reason for hiding this comment

yjunechoe commented Feb 4, 2025

raffaem commented Feb 4, 2025

yjunechoe commented Feb 4, 2025

raffaem commented Feb 4, 2025

yjunechoe commented Feb 4, 2025

raffaem commented Feb 4, 2025

yjunechoe commented Feb 4, 2025

raffaem commented Feb 4, 2025

yjunechoe commented Feb 4, 2025 • edited Loading

raffaem commented Feb 4, 2025

yjunechoe commented Feb 4, 2025

raffaem commented Feb 4, 2025 • edited Loading

raffaem commented Feb 4, 2025

yjunechoe commented Feb 4, 2025 • edited Loading

raffaem commented Feb 4, 2025

yjunechoe commented Feb 4, 2025 •

edited

Loading

raffaem commented Feb 4, 2025 •

edited

Loading

yjunechoe commented Feb 4, 2025 •

edited

Loading