AresIndexer::augmentConceptFiles fails with "! object 'CONCEPT_ID' not found" #30

Open
lav-patel opened this issue Mar 2, 2023 · 3 comments

Comments

@lav-patel

Code, which is inspired by the Ares docs:

# run achilles
Achilles::achilles(cdmVersion = cdmVersion,
    connectionDetails = connectionDetails,
    cdmDatabaseSchema = cdmDatabaseSchema,
    resultsDatabaseSchema = resultsDatabaseSchema
    # numThreads = numThreads,
    # sqlOnly = sqlOnly,
    # createIndices = createIndices
)
# obtain the data source release key (naming convention for folder structures)
releaseKey <- AresIndexer::getSourceReleaseKey(connectionDetails, cdmDatabaseSchema)
datasourceReleaseOutputFolder <- file.path(aresDataRoot, releaseKey)

# run data quality dashboard and output results to data source release folder in ares data folder
dqResults <- DataQualityDashboard::executeDqChecks(
    connectionDetails = connectionDetails,
    cdmDatabaseSchema = cdmDatabaseSchema,
    resultsDatabaseSchema = resultsDatabaseSchema,
    vocabDatabaseSchema = cdmDatabaseSchema,
    cdmVersion = cdmVersion,
    cdmSourceName = cdmSourceName,
    outputFile = "dq-result.json",
    outputFolder = datasourceReleaseOutputFolder
    #numThreads = numThreads,
    #sqlOnly = sqlOnly,
    # verboseMode = verboseMode,
    # writeToTable = writeToTable,
    # checkLevels = checkLevels,
    # checkNames = checkNames
)

# inspect logs
#ParallelLogger::launchLogViewer(logFileName = file.path(outputFolder, 
#                                                      sprintf("log_DqDashboard_%s.txt", cdmSourceName)))

# export the achilles results to the ares folder
Achilles::exportAO(
    connectionDetails = connectionDetails,
    cdmDatabaseSchema = cdmDatabaseSchema,
    resultsDatabaseSchema = resultsDatabaseSchema,
    vocabDatabaseSchema = cdmDatabaseSchema,
    outputPath = aresDataRoot
)

# perform temporal characterization
outputFile <- file.path(datasourceReleaseOutputFolder, "temporal-characterization.csv")
Achilles::performTemporalCharacterization(
    connectionDetails = connectionDetails,
    cdmDatabaseSchema = cdmDatabaseSchema,
    resultsDatabaseSchema = resultsDatabaseSchema,
    outputFile = outputFile
)

# augment concept files with temporal characterization data
AresIndexer::augmentConceptFiles(releaseFolder = file.path(aresDataRoot, releaseKey))

The error does not mention which cdm_table_name it is referring to:

Error in `count()`:
ℹ In argument: `CONCEPT_ID`.
Caused by error:
! object 'CONCEPT_ID' not found
Run `rlang::last_error()` to see where the error occurred.
> rlang::last_error()
<error/dplyr:::mutate_error>
Error in `count()`:
ℹ In argument: `CONCEPT_ID`.
Caused by error:
! object 'CONCEPT_ID' not found
---
Backtrace:
  1. AresIndexer::augmentConceptFiles(...)
  4. dplyr:::count.data.frame(., CONCEPT_ID, tolower(CDM_TABLE_NAME))
  6. dplyr:::group_by.data.frame(x, ..., .add = TRUE, .drop = .drop)
  7. dplyr::group_by_prepare(.data, ..., .add = .add, error_call = current_env())
  8. dplyr:::add_computed_columns(.data, new_groups, error_call = error_call)
  9. dplyr:::mutate_cols(...)
 11. dplyr:::mutate_col(dots[[i]], data, mask, new_columns)
 12. mask$eval_all_mutate(quo)
 13. dplyr (local) eval()
Run `rlang::last_trace()` to see the full context.

Can someone help? What is the reason for this failure?

@alondhe

alondhe commented Mar 12, 2023

Looks like the DQD JSON file uses camelCase for the column names, while AresIndexer is set up to rely on upper-case snake case (e.g. `CONCEPT_ID`, `CDM_TABLE_NAME`), so `count()` cannot find the columns it expects.
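To illustrate the mismatch, here is a minimal base R sketch. It assumes the check results have already been loaded into a data frame whose columns are camelCase. The helper `camel_to_upper_snake` and the toy data frame are hypothetical and are not part of AresIndexer or DataQualityDashboard; only the two expected column names come from the backtrace above.

```r
# Hypothetical helper (not part of AresIndexer or DataQualityDashboard):
# convert camelCase names to UPPER_SNAKE_CASE.
camel_to_upper_snake <- function(x) {
  # Insert "_" between a lowercase letter or digit and the capital that
  # follows it, then upper-case the whole name.
  toupper(gsub("([a-z0-9])([A-Z])", "\\1_\\2", x))
}

# Toy stand-in for check results with camelCase columns (made-up values).
check_results <- data.frame(
  conceptId = c(1L, 1L, 2L),
  cdmTableName = c("PERSON", "PERSON", "VISIT_OCCURRENCE"),
  stringsAsFactors = FALSE
)

# Columns referenced by the failing count() call in the backtrace.
expected <- c("CONCEPT_ID", "CDM_TABLE_NAME")

# Expected columns that are absent before renaming.
missing_before <- setdiff(expected, names(check_results))

# Rename the columns and check again.
names(check_results) <- camel_to_upper_snake(names(check_results))
missing_after <- setdiff(expected, names(check_results))

print(missing_before)
print(missing_after)
```

Checking `setdiff(expected, names(...))` on the real results before calling `augmentConceptFiles` is a quick way to confirm whether a naming mismatch is the cause.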

xitology added a commit to TuftsCTSI/AresIndexer that referenced this issue Apr 26, 2023
@alondhe

alondhe commented Jul 7, 2023

DataQualityDashboard's new convertJsonResultsFileCase function can help convert results files from camel to snake (and vice versa):

DataQualityDashboard::convertJsonResultsFileCase(jsonFilePath = "~/Downloads/dq-result.json",
                                                 writeToFile = TRUE,
                                                 outputFolder = "~/Downloads",
                                                 targetCase = "snake")

@fdefalco
Contributor

I'll be pushing an update to AresIndexer that switches to the new DQD naming conventions before the symposium, along with a matching update to ARES at the same time.
