[BUG] datahub validation fails for v1.0-pre2 #228

Open
Brilator opened this issue Nov 20, 2023 · 14 comments
Labels
Type: Bug Something is not working, and it is confirmed by maintainers to be a bug.

Comments

@Brilator
Member

Describe the bug

The DataHUB validation pipeline fails for ARCs created with https://github.com/nfdi4plants/ARCCommander/releases/download/v1.0.0-preview.2/arc_osx-x64

This is before running "metadata tests".

[Image: Screenshot 2023-11-20 at 10 31 30]

[Image: Screenshot 2023-11-20 at 10 29 42]

To Reproduce

arc_osx-x64 init
arc_osx-x64 assay add -s v1pre2-Study -a v1pre2-Assay
arc_osx-x64 i person register --lastname LastName --firstname FirstName --email [email protected] --affiliation DataPLANT
arc_osx-x64 investigation update -i v1pre2 --description "Description v1pre2" --title "Title v1pre2"
arc_osx-x64 sync -f -r https://git.nfdi4plants.org/<userName>/v1pre2 -m "v1pre2 test"
@HLWeil
Member

HLWeil commented Nov 20, 2023

Was this different when pushing with other tools/tool-versions?

The error message does not really say anything about the output that ARCCommander produced.

Maybe @omaus or @j-bauer have an idea about this?

@Brilator
Member Author

This does not happen with ARC commander v0.0.5 using the same commands above.

@omaus
Collaborator

omaus commented Nov 20, 2023

> Maybe @omaus or @j-bauer have an idea about this?

I'd need the whole console output to investigate this. It's most certainly due to the deprecated validation pipeline (i.e., the arc-validate project).

@HLWeil
Member

HLWeil commented Nov 20, 2023

> This does not happen with ARC commander v0.0.5 using the same commands above.

Ah okay I see. Then it is probably a mismatch between ARCCommander and validation pipeline version, as @omaus suggested.

The tests failed in such a way that no XML file was created. Maybe we could still get some mechanism for retrieving the reason in future cases, e.g. by wrapping the complete pipeline call in a try .. with?
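
As a rough illustration of the "wrap the complete pipeline call" idea, the sketch below runs a pipeline function and, if it throws, writes a JUnit-style XML file that records the error, so later steps still find a results file. It is written in Python rather than F# purely for illustration. The names `run_with_fallback_report`, the suite name, and the test case name are hypothetical and not taken from arc-validate.

```python
import traceback
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Callable


def run_with_fallback_report(pipeline: Callable[[], None], xml_path: Path) -> bool:
    """Run `pipeline`; if it raises, write a JUnit-style file recording the error.

    Returns True when the pipeline finished without raising, False otherwise.
    (Hypothetical helper; not part of arc-validate.)
    """
    try:
        pipeline()
        return True
    except Exception as exc:  # deliberately broad: any crash should be reported
        suite = ET.Element(
            "testsuite", name="arc-validate", tests="1", errors="1", failures="0"
        )
        case = ET.SubElement(suite, "testcase", name="internal error")
        error = ET.SubElement(case, "error", message=str(exc))
        error.text = traceback.format_exc()  # keep the full trace for debugging
        ET.ElementTree(suite).write(
            str(xml_path), encoding="utf-8", xml_declaration=True
        )
        return False
```

With something like this, the reason for an internal failure would end up in the results file instead of only in the job log.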

@j-bauer

j-bauer commented Nov 20, 2023

Here is the relevant output of the arc-validate command:

$ bash /opt/arc-validate/arc-validate.sh; ret=$?
+ arc-validate
Internal Error:                         
Cannot modify readonly container        
"   at System.IO.Packaging.Package.ThrowIfReadOnly()
   at System.IO.Packaging.Package.CreatePart(Uri partUri, String contentType, CompressionOption compressionOption)
   at DocumentFormat.OpenXml.Packaging.OpenXmlPackage.CreateMetroPart(Uri partUri, String contentType)
   at DocumentFormat.OpenXml.Packaging.OpenXmlPart.CreateInternal(OpenXmlPackage openXmlPackage, OpenXmlPart parent, String contentType, String targetExt)
   at DocumentFormat.OpenXml.Packaging.OpenXmlPartContainer.InitPart[T](T newPart, String contentType, String id)
   at DocumentFormat.OpenXml.Packaging.OpenXmlPartContainer.InitPart[T](T newPart, String contentType)
   at DocumentFormat.OpenXml.Packaging.OpenXmlPartContainer.AddNewPartInternal[T]()
   at DocumentFormat.OpenXml.Packaging.OpenXmlPartContainer.AddNewPart[T]()
   at FsSpreadsheet.ExcelIO.Spreadsheet.getOrInitSharedStringTablePart(SpreadsheetDocument spreadsheetDocument)
   at FsSpreadsheet.ExcelIO.Spreadsheet.getCellsBySheet(Sheet sheet, SpreadsheetDocument spreadsheetDocument)
   at FsSpreadsheet.ExcelIO.Spreadsheet.getCellsBySheetID(String sheetID, SpreadsheetDocument spreadsheetDocument)
   at [email protected](Sheet xlsxSheet)
   at Microsoft.FSharp.Collections.Internal.IEnumerator.map@99.DoMoveNext(b& curr) in D:\a\_work\1\s\src\FSharp.Core\seq.fs:line 102
   at Microsoft.FSharp.Collections.Internal.IEnumerator.MapEnumerator`1.System.Collections.IEnumerator.MoveNext() in D:\a\_work\1\s\src\FSharp.Core\seq.fs:line 84
   at Microsoft.FSharp.Collections.SeqModule.Fold[T,TState](FSharpFunc`2 folder, TState state, IEnumerable`1 source) in D:\a\_work\1\s\src\FSharp.Core\seq.fs:line 872
   at FsSpreadsheet.ExcelIO.FsExtensions.FsWorkbook.fromXlsxFile.Static(String filePath)
   at ArcValidation.Configs.ArcConfig.get_InvestigationStudies() in /opt/arc-validate/src/ArcValidation/Configs/ArcConfig.fs:line 27
   at ArcValidation.Configs.ArcConfig.get_StudyPathsAndIds() in /opt/arc-validate/src/ArcValidation/Configs/ArcConfig.fs:line 33
   at ArcValidation.TestGeneration.Critical.Arc.FileSystem.generateArcFileSystemTests(ArcConfig arcConfig) in /opt/arc-validate/src/ArcValidation/TestGeneration/Critical/ArcFileSystem.fs:line 18
   at ARCValidate.main(String[] argv) in /opt/arc-validate/src/arc-validate/Program.fs:line 29"

Resulting in another error later on, since arc-validate did not create the arc-validate-results.xml:

$ /opt/arc-validate/create-badge.py
Traceback (most recent call last):
  File "/opt/arc-validate/create-badge.py", line 9, in <module>
    xml = JUnitXml.fromfile(xml_path)
  File "/usr/local/lib/python3.9/dist-packages/junitparser/junitparser.py", line 751, in fromfile
    tree = etree.parse(filepath)  # nosec
  File "/usr/lib/python3.9/xml/etree/ElementTree.py", line 1229, in parse
    tree.parse(source, parser)
  File "/usr/lib/python3.9/xml/etree/ElementTree.py", line 569, in parse
    source = open(source, "rb")
FileNotFoundError: [Errno 2] No such file or directory: './arc-validate-results.xml'

I thought the arc-validate tool is supposed to always create that XML file in all cases, isn't it?
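
Independently of whether arc-validate always writes the file, the follow-up step could guard against a missing results file. The contents of create-badge.py are not shown in this thread, so the following is only a sketch under that assumption: it uses the standard library instead of junitparser, and `summarize_results` is a hypothetical name.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Optional, Tuple


def summarize_results(xml_path: Path) -> Optional[Tuple[int, int]]:
    """Return (passed, total) from a JUnit XML file, or None if the file is missing.

    Hypothetical helper; a test case counts as not passed when it contains
    a <failure> or <error> child element.
    """
    if not xml_path.is_file():
        return None  # caller can render an "unknown" badge instead of crashing
    root = ET.parse(str(xml_path)).getroot()
    cases = list(root.iter("testcase"))
    not_passed = sum(
        1
        for case in cases
        if case.find("failure") is not None or case.find("error") is not None
    )
    return len(cases) - not_passed, len(cases)
```

A badge script built this way would report the missing file explicitly rather than ending in a `FileNotFoundError`.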

@omaus
Collaborator

omaus commented Nov 20, 2023

Looks to me like the Investigation file is read-only. Could you check this, @Brilator? It might also be any of the Study files.

> I thought the arc-validate tool is supposed to always create that XML file in all cases, isn't it?

Exactly.
If read-only is the cause of this, I'll keep it in mind for arc-validate V2.
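
A quick local check for the read-only hypothesis could look like the sketch below. It only inspects the owner write bit of each `.xlsx` file below a directory; `find_readonly_xlsx` is a hypothetical helper and not part of any ARC tooling.

```python
import stat
from pathlib import Path
from typing import List


def find_readonly_xlsx(arc_root: Path) -> List[Path]:
    """Return .xlsx files under arc_root whose owner write bit is not set."""
    return sorted(
        path
        for path in arc_root.rglob("*.xlsx")
        if not (path.stat().st_mode & stat.S_IWUSR)
    )
```

Running `find_readonly_xlsx(Path("."))` in the root of a freshly created ARC would list any workbook that is marked read-only on disk; an empty list would point away from file permissions as the cause.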

@Brilator
Member Author

@omaus can you do me a favor and check this with the latest ARC Commander on Windows using the commands above?

If it is read-only, that’s still an arc commander bug.

@omaus
Collaborator

omaus commented Nov 21, 2023

> @omaus can you do me a favor and check this with latest arc commander on windows using the commands above?
>
> If it is read-only, that’s still an arc commander bug.

Not read-only on Windows.

@Brilator
Member Author

Then that's not the reason. Or did validation work?

@omaus
Collaborator

omaus commented Nov 21, 2023

I created a test repo in GitLab with the commands from above. None of the XLSX files was read-only, yet the pipeline still failed and printed the same error as above.
@HLWeil Any ideas? It might be related to a newer FsSpreadsheet version and changes in how it reads XLSX files.
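
The stack trace earlier in the thread ends in `getOrInitSharedStringTablePart` calling `AddNewPart`, which then hits `ThrowIfReadOnly`. One reading, not confirmed anywhere in this thread, is that the reader opens the workbook in read-only mode and the workbook has no shared string table part, so the library tries to create one and fails. Under that assumption, the sketch below checks whether an XLSX file contains a part at the conventional location `xl/sharedStrings.xml`. The helper name is hypothetical, and writers are free to store the part under a different name.

```python
import zipfile
from pathlib import Path

# Conventional part name; an assumption, since the actual location is
# defined by the workbook's relationships and may differ.
SHARED_STRINGS_PART = "xl/sharedStrings.xml"


def has_shared_strings_part(xlsx_path: Path) -> bool:
    """Return True if the XLSX archive contains the conventional shared strings part."""
    with zipfile.ZipFile(xlsx_path) as archive:
        return SHARED_STRINGS_PART in archive.namelist()
```

Comparing the result for workbooks written by ARC Commander v0.0.5 and v1.0.0-preview.2 could help confirm or rule out this reading.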

@HLWeil
Member

HLWeil commented Nov 21, 2023

Yeah, might be. Does this error occur for all ARCs, @j-bauer @Brilator?

If so, it should hopefully be easy to reproduce.

@Brilator
Member Author

This is still relevant for ARC Commander v1.

Run the following to create a minimal ARC that should be valid for Invenio.

mkdir arc-v1-test; cd arc-v1-test

arc init
arc assay add -s v1-test-Study -a v1-test-Assay

arc i person register --lastname TestLastName --firstname TestFirstName --email [email protected] --affiliation DataPLANT
arc i update -i v1-test --description "Description v1-test" --title "Title v1-test"

arc export
arc a list
arc s list

arc sync -f -r https://git.nfdi4plants.org/<>/v1-test -m "v1-test"

The pipeline fails during "validate ARC" with:

Running with gitlab-runner 16.2.1 (674e0e29)
  on dataplant-runner-0 iAYwqpK5, system ID: r_RntxNI6dNOlh

Preparing the "docker" executor
00:02
Using Docker executor with image ghcr.io/nfdi4plants/arc-validate:main ...
Pulling docker image ghcr.io/nfdi4plants/arc-validate:main ...
Using docker image sha256:31c612d8a4cbd25d26e1ca5263e9699ecb41495a7b9014d96da9c176136b2f0f for ghcr.io/nfdi4plants/arc-validate:main with digest ghcr.io/nfdi4plants/arc-validate@sha256:56352f8074174962e89e6b6367e74901e705092bbe9322057c5772d6d5fca1bf ...

Preparing environment
00:00
Running on runner-iaywqpk5-project-1044-concurrent-0 via 8764d0667e17...

Getting source from Git repository
00:01
Fetching changes with git depth set to 20...
Reinitialized existing Git repository in /builds/brilator/v1-test/.git/
Checking out b21e357a as detached HEAD (ref is main)...
Removing arc-summary.md
Removing arc.json
Skipping Git submodules setup

Downloading artifacts
00:02
Downloading artifacts for create ARC JSON (3489)...
Downloading artifacts from coordinator... ok        host=s3.bwsfs.uni-freiburg.de id=3489 responseStatus=200 OK token=64_tMrK_

Executing "step_script" stage of the job script
00:00
Using docker image sha256:31c612d8a4cbd25d26e1ca5263e9699ecb41495a7b9014d96da9c176136b2f0f for ghcr.io/nfdi4plants/arc-validate:main with digest ghcr.io/nfdi4plants/arc-validate@sha256:56352f8074174962e89e6b6367e74901e705092bbe9322057c5772d6d5fca1bf ...
$ echo "Running unit tests... "
Running unit tests... 
$ set +e
$ bash /opt/arc-validate/arc-validate.sh; ret=$?
+ arc-validate
arc-validate failed due to an internal error.
This error did likely NOT occur due to user input.
An empty test result file will be created to reflect this and prevent the validation pipeline from failing.
Run arc-validate with --verbose to see the full error message.
[11:30:14 ERR] arc-validate.arc-validate failed in 00:00:00.0050000. 
arc-validate failed due to an internal error
This error did likely NOT occur due to user input.
An empty test result file will be created to reflect this and prevent the subsequent validation pipeline from failing.
. Actual value was true but had expected it to be false.
   at [email protected](Unit _arg1) in /opt/arc-validate/src/arc-validate/Program.fs:line 14
   at [email protected](Unit unitVar)
   at Microsoft.FSharp.Control.AsyncPrimitives.CallThenInvoke[T,TResult](AsyncActivation`1 ctxt, TResult result1, FSharpFunc`2 part2) in D:\a\_work\1\s\src\FSharp.Core\async.fs:line 508
   at Microsoft.FSharp.Control.Trampoline.Execute(FSharpFunc`2 firstAction) in D:\a\_work\1\s\src\FSharp.Core\async.fs:line 112 <Expecto>
$ echo "$ret"
3
$ set -e
$ /opt/arc-validate/create-badge.py
$ exit "$ret"

Uploading artifacts for failed job
00:09
Uploading artifacts...
arc-validate-results.xml: found 1 matching artifact files and directories 
arc-quality.svg: found 1 matching artifact files and directories 
Uploading artifacts as "archive" to coordinator... 201 Created  id=3490 responseStatus=201 Created token=64_tMrK_
Uploading artifacts...
arc-validate-results.xml: found 1 matching artifact files and directories 
Uploading artifacts as "junit" to coordinator... 201 Created  id=3490 responseStatus=201 Created token=64_tMrK_

Cleaning up project directory and file based variables
00:00
ERROR: Job failed: exit code 3
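
The job script in the log above follows a common pattern: run the main command without aborting on failure (`set +e`, `ret=$?`), always run the follow-up step, then exit with the original code (`exit "$ret"`). For readers less familiar with the shell idiom, here is the same control flow as a Python sketch. `run_then_always` is a hypothetical helper, and the commands in the usage note are placeholders.

```python
import subprocess
import sys
from typing import Sequence


def run_then_always(main_cmd: Sequence[str], followup_cmd: Sequence[str]) -> int:
    """Run main_cmd, then followup_cmd regardless of the outcome.

    Returns main_cmd's exit code so the caller can pass it to sys.exit().
    """
    # Equivalent of `set +e; main; ret=$?`: a non-zero exit does not raise here.
    ret = subprocess.run(list(main_cmd)).returncode
    # Equivalent of `set -e; followup`: a failing follow-up raises CalledProcessError.
    subprocess.run(list(followup_cmd), check=True)
    return ret
```

A caller would end with something like `sys.exit(run_then_always(validate_cmd, badge_cmd))`, which is why the job above still reports exit code 3 even though the badge step succeeded.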

@HLWeil
Member

HLWeil commented Jan 25, 2024

Hmm, I'm not sure whether the validation pipeline has already been rolled out for ARC v1.x.x.

@kMutagene @omaus

@kMutagene
Member

kMutagene commented Jan 25, 2024

The new package-based validation pipelines are not rolled out yet. I think it is easiest to just ignore these errors until we can move forward next week.

@omaus omaus added the Type: Bug Something is not working, and it is confirmed by maintainers to be a bug. label Jun 5, 2024