Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add logic to handle missing section breaks in text and volpiano #26

Merged
merged 19 commits into from
Jan 11, 2024

Conversation

dchiller
Copy link
Collaborator

@dchiller dchiller commented Nov 22, 2023

This PR:

  • Introduces _infer_barlines to infer barlines in text or volpiano in cases where there is a mismatch in the number of sections in either string
  • Changes how volpiano is handled during alignment so that it is "syllabified" into the same data structure as the syllabified text is. This makes it easier to compare syllables in the case of missing barlines.
  • Adds a few test cases relevant to missing section breaks in either text or volpiano.
  • Handles cases where "too many" hyphens are added after words in the volpiano (while most of the time errors in the number of hyphens would require more complex logic to automatically fix because of how different numbers of hyphens signal different things, cases where extra hyphens are included between words are relatively straightforward: just treat them as a word break)
  • Adjusts some handling for initial clefs (and removes clefs later in the volpiano string)

Closes #4, Closes #27, Closes #28, Closes #29

@dchiller dchiller changed the title Handle missing barlines Add logic to handle missing section breaks in text and volpiano Nov 22, 2023
@dchiller dchiller marked this pull request as ready for review November 22, 2023 15:30
Copy link
Collaborator

@jacobdgm jacobdgm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left comments on a few topics. the most substantive ones:

  • readability could be improved if you wrote standalone functions for various flattening tasks, rather than doing them imperatively where they come up in your functions
  • I've not tried applying this in any code I've written before, but there are several places where custom type aliases could make your type annotations much clearer/simpler. (I think type aliases is the right term: https://docs.python.org/3/library/typing.html#typing.TypeAlias)
  • I don't fully understand why the _infer_barlines function works the way it does.

volpiano_display_utilities/text_volpiano_alignment.py Outdated Show resolved Hide resolved
volpiano_display_utilities/text_volpiano_alignment.py Outdated Show resolved Hide resolved
volpiano_display_utilities/text_volpiano_alignment.py Outdated Show resolved Hide resolved
volpiano_display_utilities/text_volpiano_alignment.py Outdated Show resolved Hide resolved
volpiano_display_utilities/text_volpiano_alignment.py Outdated Show resolved Hide resolved
volpiano_display_utilities/text_volpiano_alignment.py Outdated Show resolved Hide resolved
@dchiller dchiller requested a review from jacobdgm December 22, 2023 22:04
Harmonizes handling of consecutive missing music sections
in text and volpiano. Reverts a previous change that combined
consecutive texts sections associated with missing music into
a single section, but subsequent review of the volpiano protocols
and currently-existing texts in Cantus DB suggests that they are
encoded as separate sections (ie. with multiple missing music markers)
…ines

Rather than raising an error when text and volpiano have different numbers
of sections but the same number of barlines, we just pad the shorter of the two
(text or volpiano) with additional "empty" sections.
@dchiller
Copy link
Collaborator Author

dchiller commented Jan 2, 2024

The commit 0809ac9 has been tested with every chant that has both volpiano and full text manuscript spelling in Cantus DB. Only 12 chants fail to align, and all contain significant encoding errors. Those chants are:

493596 - Only clef encoded
493617 - Only clef encoded
495427 - Incorrect encoding of missing text within missing music section
495975 - Unmatched close curly brace
497843 - Incorrect encoding of missing word ("#")
254921 - Hyphen typo
670585 - Abnormal text encoding
602605 - Incorrect IPSUM encoding (missing ~)
602629 - Incorrect IPSUM encoding (missing ~)
603511 - Incorrect IPSUM encoding (missing ~)
399268 - Typo (close ")" used instead of "}")
218406 - Unmatched open curly brace

tests/tests.py Outdated Show resolved Hide resolved
tests/tests.py Outdated Show resolved Hide resolved
volpiano_display_utilities/syllabified_section.py Outdated Show resolved Hide resolved
volpiano_display_utilities/volpiano_syllabification.py Outdated Show resolved Hide resolved
volpiano_display_utilities/text_volpiano_alignment.py Outdated Show resolved Hide resolved
volpiano_display_utilities/text_volpiano_alignment.py Outdated Show resolved Hide resolved
volpiano_display_utilities/text_volpiano_alignment.py Outdated Show resolved Hide resolved
volpiano_display_utilities/text_volpiano_alignment.py Outdated Show resolved Hide resolved
@dchiller dchiller requested a review from jacobdgm January 10, 2024 16:33
@dchiller dchiller merged commit 087388f into main Jan 11, 2024
1 check passed
@dchiller dchiller deleted the handle-missing-barlines branch January 11, 2024 14:55
@annamorphism
Copy link

Only 12 chants fail to align, and all contain significant encoding errors. Those chants are:

Just commenting that these 12 problem chants SHOULD all now be fixed (and free of many other proofreading errors besides!).

@dchiller
Copy link
Collaborator Author

Just commenting that these 12 problem chants SHOULD all now be fixed (and free of many other proofreading errors besides!).

Excellent!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants