Test cases for bag-info values #8

acdha · 2017-02-28T19:45:48Z

In discussion with @johnscancella and @reeset on reeset/bagit_sharp#1, we came up with some additions to the duplicate metadata test case:

- Interleaved ordering (e.g. contact-email / contact-name / contact-email / contact-name) should be preserved when the bag is saved
- Case-insensitive handling of the keys – e.g. "Contact-Email" and "contact-email". https://tools.ietf.org/html/draft-kunze-bagit-14#page-6 only mandates insensitivity for the reserved names but I think this should be clarified in the spec to mandate case-insensitive access for all values to avoid confusion.

This adds tests to confirm that order is preserved across duplicate keys and that custom keys are handled case-insensitively. This is only half of the test, since a library would still need to confirm the expected output both for key access and after saving.
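
As a rough illustration of the library-side half, here is a minimal Python sketch of the two behaviours. It is not taken from bagit.py or any other implementation; `parse_bag_info`, `get_all`, and the sample data are hypothetical names invented for this example. It keeps entries in an ordered list so that interleaved duplicates survive, stores keys exactly as written, and only folds case at lookup time.

```python
def parse_bag_info(text):
    """Parse bag-info text into an ordered list of (key, value) pairs.

    Hypothetical helper: keys keep their original case and position.
    """
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and entries:
            # Indented line: treat it as a continuation of the previous value.
            key, value = entries[-1]
            entries[-1] = (key, value + " " + line.strip())
            continue
        key, _, value = line.partition(":")
        entries.append((key.strip(), value.strip()))
    return entries


def get_all(entries, key):
    """Return every value whose key matches `key`, ignoring case, in file order."""
    wanted = key.casefold()
    return [v for k, v in entries if k.casefold() == wanted]


# Made-up sample data with interleaved, mixed-case duplicate keys.
SAMPLE = (
    "Contact-Email: a@example.org\n"
    "Contact-Name: Alice\n"
    "contact-email: b@example.org\n"
    "Contact-Name: Bob\n"
)

entries = parse_bag_info(SAMPLE)
emails = get_all(entries, "CONTACT-EMAIL")
```

With this approach a lookup for any spelling of the key returns both e-mail values in file order, while the stored keys still show the case each line originally used.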

johnscancella · 2017-03-06T18:43:56Z

> Case-insensitive handling of the keys – e.g. "Contact-Email" and "contact-email". https://tools.ietf.org/html/draft-kunze-bagit-14#page-6 only mandates insensitivity for the reserved names but I think this should be clarified in the spec to mandate case-insensitive access for all values to avoid confusion.

Is this getting too far into implementation detail?

acdha · 2017-03-07T13:21:25Z

@johnscancella I'm a bit mixed on that but I've been leaning case-insensitive for everything since the info file is intended to be human-managed and humans tend not to care. It seems like a bug if someone has “Crawl-date” in one bag and “Crawl-Date” in another but a program only sees one of them because it used a case-sensitive library.

What do you think - play it conservatively and add a spec update before finishing this ticket? This might either be partially out of scope or otherwise incompatible with this repo currently since we only collect valid/invalid bags and this would be a discrepancy in how a tool processes the bag rather than a question of validity.

johnscancella · 2017-03-07T13:28:49Z

I vote that we make a spec update saying that keys are case insensitive but values are case sensitive, since case in a value might carry some special meaning.

I would also add that implementations should preserve the case of the keys as entered. That way you can do something like:

1. read an existing bag
2. write it out to a different directory
3. compare file to file - there should be 0 differences
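
The read / write / compare steps above could be sketched roughly as follows in Python. This is a simplified illustration, not how bagit.py or bagit-java work: it only round-trips a single bag-info.txt rather than a whole bag, assumes single-line `Key: value` entries with no blank or continuation lines, and `read_entries`, `write_entries`, and `round_trip_is_identical` are hypothetical helpers.

```python
import filecmp
import tempfile
from pathlib import Path


def read_entries(path):
    """Read 'Key: value' lines into an ordered list, keeping key case as written."""
    entries = []
    with open(path, encoding="utf-8", newline="") as f:
        for line in f.read().splitlines():
            key, _, value = line.partition(": ")
            entries.append((key, value))
    return entries


def write_entries(path, entries):
    """Write the entries back out in the same order and with the same key case."""
    with open(path, "w", encoding="utf-8", newline="") as f:
        for key, value in entries:
            f.write(f"{key}: {value}\n")


def round_trip_is_identical(src_text):
    """Save src_text, load it, save it elsewhere, and compare the two files."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "in" / "bag-info.txt"
        dst = Path(tmp) / "out" / "bag-info.txt"
        src.parent.mkdir()
        dst.parent.mkdir()
        # Write bytes so no platform newline translation is applied.
        src.write_bytes(src_text.encode("utf-8"))
        write_entries(dst, read_entries(src))
        # shallow=False forces a byte-for-byte content comparison.
        return filecmp.cmp(src, dst, shallow=False)


# Made-up sample with two keys differing only by case.
same = round_trip_is_identical("Crawl-date: 2017-02-28\nCrawl-Date: 2017-03-01\n")
```

An implementation that normalised key case or merged the two keys on load would produce a different output file, so the comparison would fail.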

acdha · 2017-03-07T13:41:06Z

Definitely, I’m certain the intention was always that only keys were case-insensitive. I wonder how we should recommend a test process: what we have right now is basic compliance where a bag is valid or invalid. Do you think we should specify that testing has to include a load-save cycle or have that as something like compliance levels where a level 1 implementation could just be something which validates but can’t even do anything else?

johnscancella · 2017-11-16T16:06:47Z

Yeah, we can specify that in the README and use bagit.py and bagit-java as source implementations to look at for testing.
