containerPath and nested elements (WAS: offset problem...) #486
Comments
The workaround, or fix, depending on your point of view, is to replace this line in the configuration:
containerPath: .//text
with this:
containerPath: ./text
I assume this works because it now does everything relative to the top-level
In versions of BlackLab prior to 3.x, we simply had
Yes, that's a good way to address this issue for now. With Arguably we should concatenate I also tried your example with
Thanks for the reply. I can't reproduce the problem now with
With a current checkout of the dev branch and Java 11:
$ git describe
v4-alpha2-34-g16ef16df
$ java -version
openjdk version "11.0.20.1" 2023-08-24
OpenJDK Runtime Environment Temurin-11.0.20.1+1 (build 11.0.20.1+1)
OpenJDK 64-Bit Server VM Temurin-11.0.20.1+1 (build 11.0.20.1+1, mixed mode)
and attempting to index this tiny TEI-like document boiled down from a much larger real-world example:
with this simple input format configuration file:
I observe the following crash:
If the document doesn't have nested <text> elements, the crash goes away. It also goes away if I don't use the Saxon processor or if I have a document with only one <w> element rather than two.
I don't know anything about how BlackLab uses offsets, but I infer that some of the time the offset gets calculated relative to the nearest ancestor <text> element and sometimes relative to the great-grandparent, and the two don't match.
I'm attaching a zip file containing the reproducer files quoted above.
bloffsetbug.zip
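For illustration only: the sketch below is not taken from the attached reproducer and does not use BlackLab or Saxon. It uses Python's standard xml.etree.ElementTree on a hypothetical TEI-like sample to show how the two containerPath expressions mentioned in the comments, .//text and ./text, select different sets of elements when <text> elements are nested. The sample document and the helper names count_containers and containers_per_word are assumptions made up for this example. Whether BlackLab's offset handling is affected in exactly this way is only a guess.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI-like sample with nested <text> elements.
# This is NOT the document from the attached zip file.
SAMPLE = """
<TEI>
  <text>
    <group>
      <text>
        <body><w>one</w> <w>two</w></body>
      </text>
    </group>
  </text>
</TEI>
"""


def count_containers(xml_source, container_path):
    """Count the elements selected by container_path, evaluated from the root."""
    root = ET.fromstring(xml_source)
    return len(root.findall(container_path))


def containers_per_word(xml_source, container_path):
    """For each <w>, count how many selected containers enclose it."""
    root = ET.fromstring(xml_source)
    containers = root.findall(container_path)
    return [
        sum(1 for c in containers if any(d is w for d in c.iter("w")))
        for w in root.iter("w")
    ]


if __name__ == "__main__":
    for path in (".//text", "./text"):
        print(path, count_containers(SAMPLE, path), containers_per_word(SAMPLE, path))
```

With .//text, both the outer and the nested <text> are selected, so every <w> sits inside two candidate containers. With ./text, only the top-level <text> is selected and each <w> has exactly one container, which would be consistent with the workaround making the mismatch disappear.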