Sentence Boundary Structure #41

Open
lgessler opened this issue Feb 15, 2024 · 0 comments
Labels
enhancement A development proposal that extends functionality

Comments

@lgessler
Owner

Sentences can in principle be represented by a Span Layer. A couple of ways you could do this:

  1. Sentences are delimited by single spans, each of which indicates the beginning (or end) of a sentence.
  2. Sentences are identified by single spans, each of which contains every token of the sentence.
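
To make the two options concrete, here is a rough Python sketch. The `Span` class and both helper functions are hypothetical, not part of the core model; the sketch assumes tokens are addressed by their index in the document's token list, and that in option 1 each span sits on the token that begins a sentence.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical span: refers to tokens by index in the token list."""
    token_ids: tuple[int, ...]


def sentences_from_boundary_spans(n_tokens: int, spans: list[Span]) -> list[list[int]]:
    """Option 1: each span covers one token, which begins a sentence."""
    starts = sorted(span.token_ids[0] for span in spans)
    # A sentence runs until the next start, or to the end of the document.
    ends = starts[1:] + [n_tokens]
    return [list(range(start, end)) for start, end in zip(starts, ends)]


def sentences_from_covering_spans(spans: list[Span]) -> list[list[int]]:
    """Option 2: each span contains every token of its sentence."""
    ordered = sorted(spans, key=lambda span: min(span.token_ids))
    return [sorted(span.token_ids) for span in ordered]


tokens = ["I", "ran", ".", "She", "slept", "."]
boundary_spans = [Span((0,)), Span((3,))]
covering_spans = [Span((0, 1, 2)), Span((3, 4, 5))]

by_boundary = sentences_from_boundary_spans(len(tokens), boundary_spans)
by_covering = sentences_from_covering_spans(covering_spans)
```

Both encodings recover the same two sentences here. Note that under option 1, as sketched, any tokens before the first boundary span belong to no sentence, whereas option 2 lets a consumer read each sentence without scanning its neighbours.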

With this in mind, we didn't explicitly include a structure for sentences in the core model. However, it might be more ergonomic for UI programmers if we did. One way to do this is with a boolean :token/start-of-sentence: any token that has it set to true is the beginning of a sentence.
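
As a sketch of what consuming such a flag might look like on the client side, the Python below groups a flat token list into sentences. The dict shape and the `group_sentences` helper are assumptions made for illustration; only the `token/start-of-sentence` key mirrors the proposal above.

```python
def group_sentences(tokens: list[dict]) -> list[list[dict]]:
    """Split an ordered token list wherever the start-of-sentence flag is true."""
    sentences: list[list[dict]] = []
    for token in tokens:
        # Open a new sentence on a flagged token, or on the very first token
        # so that unflagged leading tokens are not lost.
        if token.get("token/start-of-sentence") or not sentences:
            sentences.append([])
        sentences[-1].append(token)
    return sentences


# Hypothetical token records; a real document tree may look different.
doc_tokens = [
    {"text": "I", "token/start-of-sentence": True},
    {"text": "ran"},
    {"text": "."},
    {"text": "She", "token/start-of-sentence": True},
    {"text": "slept"},
    {"text": "."},
]

grouped = [[t["text"] for t in sent] for sent in group_sentences(doc_tokens)]
```

A single pass over the tokens is enough, with no span layer lookup, which is the ergonomic gain being suggested.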

The upside is that sentences become a bit easier to work with when you're given a document tree (I'd imagine), and there's no additional configuration overhead. The downside is that in the (unusual, I expect) cases where you want multiple sentence-like groupings of tokens, it may be confusing to have this as an option alongside span layers. A more remote concern is that this would be inconsistent with our goal of providing only the layers that are strictly necessary structurally.

@lgessler added the enhancement label on Feb 15, 2024