FGS/Solr Config Precludes "Islandora Collection Search" from Working #213

Open
McFateM opened this issue Mar 28, 2019 · 5 comments
Labels
Bug A bug in one of the images or containers. ISLE-v.1.1.3

Comments

@McFateM
Member

McFateM commented Mar 28, 2019

Issue description

I believe that our current FGS/Solr config in ISLE v1.1 does not work properly with the optional "Islandora Collection Search" module. This module introduces a Solr field typically named "ancestors_ms", and this field is used to identify which collection(s) an object belongs to. In a pristine ISLE v1.1 instance there are warnings in the Fedora logs that I believe indicate the system's inability to generate this Solr field in newly ingested or re-indexed objects.

For an issue, describe steps to reproduce the issue

Sorry for the length...

Solr RELS-EXT Warnings

Steps to repeat this error are as follows:

0) Made a fork and local clone

First I forked ISLE to https://github.com/McFateM/ISLE-1, then cloned the ISLE-v1.1 branch of that fork to my staging server, DGDockerX.grinnell.edu.

1) Upgraded my staging server to Docker version 18.09.3, build 774a1f4

2) Cleaned up Docker and confirmed

docker-compose down
docker stop $(docker ps -q)
docker system prune --volumes

Confirmed. No traces of previous Docker elements left.

3) Did a docker-compose up -d on an UN-MODIFIED clone of ISLE v1.1

The Fedora logs (I viewed them using Portainer) showed this warning, but none others:

Containers > isle-fedora-ld > Logs

Warning: Stylesheet module
  file:/usr/local/tomcat/webapps/fedoragsearch/WEB-INF/classes/fgsconfigFinal/index/FgsIndex/islandora_transforms/RELS-EXT_to_solr.xslt is included or imported more than once. This is permitted, but may lead to errors or unexpected behavior

4) Complete the site build

time docker exec -it isle-apache-ld bash /utility-scripts/isle_drupal_build_tools/isle_islandora_installer.sh

5) Checked the Fedora logs again

Having done NOTHING but spin up the new site, the Fedora logs reported one set of warnings (like the group shown below) for each of the 13 "standard" objects now in the repository.

Containers > isle-fedora-ld > Logs

WARN 2019-03-27 23:51:25,689 (OperationsImpl) IndexDocument islandora:personCModel does not contain any IndexFields!!! RepositoryName=FgsRepos IndexName=FgsIndex
WARN 2019-03-27 23:51:25,730 (OperationsImpl) IndexDocument islandora:sp-audioCModel does not contain any IndexFields!!! RepositoryName=FgsRepos IndexName=FgsIndex
WARN 2019-03-27 23:51:25,770 (OperationsImpl) IndexDocument fedora-system:ServiceDefinition-3.0 does not contain any IndexFields!!! RepositoryName=FgsRepos IndexName=FgsIndex
WARN 2019-03-27 23:51:25,808 (JodaAdapter) Failed to transform "info:fedora/islandora:collectionCModel" to something Solr understands. PID: "islandora:newspaper_collection" DSID: "RELS-EXT"
WARN 2019-03-27 23:51:25,808 (JodaAdapter) Failed to transform "info:fedora/islandora:collectionCModel" to something Solr understands. PID: "islandora:newspaper_collection" DSID: "RELS-EXT"
WARN 2019-03-27 23:51:25,808 (JodaAdapter) Failed to transform "info:fedora/islandora:root" to something Solr understands. PID: "islandora:newspaper_collection" DSID: "RELS-EXT"
WARN 2019-03-27 23:51:25,808 (JodaAdapter) Failed to transform "info:fedora/islandora:root" to something Solr understands. PID: "islandora:newspaper_collection" DSID: "RELS-EXT"
WARN 2019-03-27 23:51:25,904 (OperationsImpl) IndexDocument islandora:sp%5Fweb%5Farchive does not contain any IndexFields!!! RepositoryName=FgsRepos IndexName=FgsIndex

This is the same set of warnings I've been seeing for the past week.
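With this many near-identical lines, a per-logger count makes it easier to see what is actually in the log. The following is a minimal sketch, assuming the log lines keep the `WARN <date> <time> (<logger>) ...` layout shown above; the function name `summarize_fgs_warnings` is hypothetical.

```shell
# Count WARN lines per logger class (the name in parentheses),
# e.g. OperationsImpl or JodaAdapter.
# Reads log text on stdin; prints "<count> <logger>", highest count first.
summarize_fgs_warnings() {
  grep -E '^WARN ' \
    | sed -E 's/^WARN [0-9-]+ [0-9:,]+ \(([^)]+)\).*/\1/' \
    | sort | uniq -c | sort -rn \
    | awk '{print $1, $2}'
}
```

It could be fed from the container log instead of Portainer, for example `docker logs isle-fedora-ld 2>&1 | summarize_fgs_warnings`.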

6) Added a random object to the repository

I selected a .png image and added it to the islandora:sp_basic_image collection. I entered Basic Image Test as the object's title. A new group of warnings was generated:

Containers > isle-fedora-ld > Logs

WARN 2019-03-28 02:09:55,652 (JodaAdapter) Failed to transform "info:fedora/islandora:sp_basic_image_collection" to something Solr understands. PID: "islandora:1" DSID: "RELS-EXT"
WARN 2019-03-28 02:09:55,652 (JodaAdapter) Failed to transform "info:fedora/islandora:sp_basic_image_collection" to something Solr understands. PID: "islandora:1" DSID: "RELS-EXT"
WARN 2019-03-28 02:09:55,663 (JodaAdapter) Failed to transform "info:fedora/islandora:sp_basic_image" to something Solr understands. PID: "islandora:1" DSID: "RELS-EXT"
WARN 2019-03-28 02:09:55,663 (JodaAdapter) Failed to transform "info:fedora/islandora:sp_basic_image" to something Solr understands. PID: "islandora:1" DSID: "RELS-EXT"
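Each of these warnings is logged twice, so it can help to collapse them into distinct PID/value pairs. This is a sketch only, assuming the message wording stays exactly as shown in the log above; `failed_transforms` is a hypothetical helper name.

```shell
# Print each distinct "<PID> <value>" pair taken from JodaAdapter
# "Failed to transform" lines. Reads log text on stdin.
failed_transforms() {
  sed -n 's/.*Failed to transform "\([^"]*\)" to something Solr understands\. PID: "\([^"]*\)".*/\2 \1/p' \
    | sort -u
}
```

Piping the container log through it (`docker logs isle-fedora-ld 2>&1 | failed_transforms`) should reduce the four lines above to two pairs for islandora:1.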

The new object appears to behave as it should, and in spite of the warnings, it appears to have a full complement of data in Solr. However, my concern stems from the introduction of the Islandora_Collection_Search module, as it relies on a simple modification to the FGS/Solr config in order to transform and subsequently index 'parent' data gleaned from RELS-EXT. My hypothesis is that these warnings become significant only if/when the aforementioned module is engaged.

To test this hypothesis I...

7) Added the Islandora_Collection_Search module to my site

[islandora@dgdockerx ISLE-1]$ docker exec -it isle-apache-ld bash
root@ed30a92c1834:/# cd /var/www/html/sites/all/modules/islandora/
root@ed30a92c1834:/var/www/html/sites/all/modules/islandora# git clone https://github.com/discoverygarden/islandora_collection_search.git
Cloning into 'islandora_collection_search'...
remote: Enumerating objects: 195, done.
remote: Total 195 (delta 0), reused 0 (delta 0), pack-reused 195
Receiving objects: 100% (195/195), 44.24 KiB | 1.34 MiB/s, done.
Resolving deltas: 100% (100/100), done.
root@ed30a92c1834:/var/www/html/sites/all/modules/islandora# drush en islandora_collection_search
The following extensions will be enabled: islandora_collection_search
Do you really want to continue? (y/n): y
islandora_collection_search was enabled successfully.                                                                                                                        [ok]
root@ed30a92c1834:/var/www/html/sites/all/modules/islandora#

8) Returned to the site as User 1 (Administrator) and added the Islandora Collection Search block

The path for making this change was: https://dgdockerx.grinnell.edu/#overlay=admin/structure/block
I saved the aforementioned block into Sidebar first and returned home to https://dgdockerx.grinnell.edu
The Islandora Collection Search block appeared in the left-hand sidebar.

9) Used the new search block to look for "test"

This is a Solr search through all collections. It returned one object, my image from Step 6.

10) Configure the Islandora Collection Search

At https://dgdockerx.grinnell.edu/islandora/search/test?type=edismax#overlay=admin/islandora/tools/collection_search I accepted the default Ancestor field: ancestors_ms, set GSearch Endpoint: http://dgdockerx.grinnell.edu:8081/fedoragsearch/rest, set GSearch User: fgsAdmin, set GSearch Password: ild_fgs_admin_2018, and checked ALL of the remaining boxes before concluding by clicking the Configure button to submit the form.

11) Cleared the Drupal cache and searched again

[islandora@dgdockerx ISLE-1]$ docker exec -it isle-apache-ld bash
root@ed30a92c1834:/# cd /var/www/html/sites/default/
root@ed30a92c1834:/var/www/html/sites/default# drush cc all
'all' cache was cleared.   

Then I returned to the site and repeated Step 9 for All collections. Results were the same... my one item was returned.

12) Test again with the object's parent collection selected

I repeated Step 9, but with Basic Image Collection selected. This returned 0 objects! Methinks I smell a fish.
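To tell whether the field is missing from the index or the search block is misbehaving, one option is to ask Solr directly which ancestors_ms values it holds for the object. The sketch below is hedged: the helper names are hypothetical, the Solr base URL is an assumption that must be adjusted to the stack, and it assumes the Solr unique key field is named PID.

```shell
# Build the Solr query string that selects a single object by PID.
solr_pid_query() {
  printf 'PID:"%s"' "$1"
}

# Ask Solr for the PID and ancestors_ms fields of one object.
# $1 = Solr core base URL (assumption, not taken from this issue)
# $2 = object PID, e.g. islandora:1
show_ancestors() {
  curl -s -G "$1/select" \
    --data-urlencode "q=$(solr_pid_query "$2")" \
    --data-urlencode "fl=PID,ancestors_ms" \
    --data-urlencode "wt=json"
}
```

A call would look like `show_ancestors <solr-core-base-url> islandora:1`. If the returned document has no ancestors_ms entry, the collection-restricted search has nothing to match on.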

13) Turn on generation of ancestors_ms in foxmlToSolr.xslt and restart the stack

nano docker-compose.override.yml

Remove the comment markers from the three lines that now read:

fedora:
  volumes:
    - ./config/gsearch/foxmlToSolr.xslt:/usr/local/tomcat/webapps/fedoragsearch/WEB-INF/classes/fgsconfigFinal/index/FgsIndex/foxmlToSolr.xslt

Then...

docker-compose down
docker-compose up -d

14) Check that foxmlToSolr.xslt changes are present

[islandora@dgdockerx ISLE-1]$ docker exec -it isle-fedora-ld grep "true()" /usr/local/tomcat/webapps/fedoragsearch/WEB-INF/classes/fgsconfigFinal/index/FgsIndex/foxmlToSolr.xslt
  <xsl:param name="index_ancestors" select="true()"/>
  <xsl:param name="index_ancestors_models" select="true()"/>

Confirmed.
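The same check can be run against the local copy before restarting the stack, so that a missed edit is caught early. This is a minimal sketch, assuming both parameters appear on lines formatted exactly like the grep output above; `ancestors_enabled` is a hypothetical name.

```shell
# Succeed only if both ancestor parameters are set to true()
# in the stylesheet given as $1.
ancestors_enabled() {
  grep -qF '<xsl:param name="index_ancestors" select="true()"/>' "$1" &&
  grep -qF '<xsl:param name="index_ancestors_models" select="true()"/>' "$1"
}
```

For example: `ancestors_enabled ./config/gsearch/foxmlToSolr.xslt && echo "ancestor indexing is on"`.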

15) Re-index all content

Visit http://dgdockerx.grinnell.edu:8081/fedoragsearch/rest?operation=updateIndex and click on updateIndex fromFoxmlFiles. The screen should show Updated number of index documents: 14.
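The re-index can probably also be triggered without a browser. The sketch below assumes the GSearch REST interface accepts `action=fromFoxmlFiles` as a query parameter alongside `operation=updateIndex`, which is what the link on that page appears to do; the function names are hypothetical.

```shell
# Build the GSearch REST URL for a full re-index from FOXML files.
# $1 = GSearch REST endpoint
gsearch_reindex_url() {
  printf '%s?operation=updateIndex&action=fromFoxmlFiles' "$1"
}

# Request the re-index. $2 = GSearch user; curl prompts for the password.
reindex_all() {
  curl -s -u "$2" "$(gsearch_reindex_url "$1")"
}
```

For example: `reindex_all http://dgdockerx.grinnell.edu:8081/fedoragsearch/rest fgsAdmin`.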

16) Check the Fedora logs in Portainer again

Visit https://portainerx.grinnell.edu/#/containers, click on 7 Containers, and the "logs" icon for isle-fedora-ld.
Bad news... more warnings, including:

file:///usr/local/tomcat/webapps/fedoragsearch/WEB-INF/classes/fgsconfigFinal/index/FgsIndex/islandora_transforms/library/traverse-graph.xslt; Line #102; Column #51; Can not load requested doc: The element type "img" must be terminated by the matching end-tag "</img>".
WARN 2019-03-28 02:49:12,447 (OperationsImpl) IndexDocument islandora:personCModel does not contain any IndexFields!!! RepositoryName=FgsRepos IndexName=FgsIndex
WARN 2019-03-28 02:49:12,510 (OperationsImpl) IndexDocument islandora:sp-audioCModel does not contain any IndexFields!!! RepositoryName=FgsRepos IndexName=FgsIndex
WARN 2019-03-28 02:49:12,543 (OperationsImpl) IndexDocument fedora-system:ServiceDefinition-3.0 does not contain any IndexFields!!! RepositoryName=FgsRepos IndexName=FgsIndex
WARN 2019-03-28 02:49:12,574 (JodaAdapter) Failed to transform "info:fedora/islandora:collectionCModel" to something Solr understands. PID: "islandora:newspaper_collection" DSID: "RELS-EXT"
WARN 2019-03-28 02:49:12,575 (JodaAdapter) Failed to transform "info:fedora/islandora:collectionCModel" to something Solr understands. PID: "islandora:newspaper_collection" DSID: "RELS-EXT"
WARN 2019-03-28 02:49:12,575 (JodaAdapter) Failed to transform "info:fedora/islandora:root" to something Solr understands. PID: "islandora:newspaper_collection" DSID: "RELS-EXT"
WARN 2019-03-28 02:49:12,575 (JodaAdapter) Failed to transform "info:fedora/islandora:root" to something Solr understands. PID: "islandora:newspaper_collection" DSID: "RELS-EXT"
file:///usr/local/tomcat/webapps/fedoragsearch/WEB-INF/classes/fgsconfigFinal/index/FgsIndex/islandora_transforms/library/traverse-graph.xslt; Line #102; Column #51; Can not load requested doc: The element type "img" must be terminated by the matching end-tag "</img>".
WARN 2019-03-28 02:49:12,828 (OperationsImpl) IndexDocument islandora:sp%5Fweb%5Farchive does not contain any IndexFields!!! RepositoryName=FgsRepos IndexName=FgsIndex

While all of these warnings are troublesome, it's my belief that the Failed to transform... warnings are what's keeping the system from generating the ancestors_ms field that Solr needs to make Islandora Collection Search work.

I'll continue this story as I try to find the root cause of these warnings and find a fix for this condition. I'll start by trying to identify and eliminate the original "duplicate" include warning (RELS-EXT_to_solr.xslt is included or imported more than once), which is still present in the logs.

What's the expected result?

Newly ingested objects, or those re-indexed using FGS/Solr, should have a populated ancestors_ms field so that they show up in Solr searches that are restricted to look only within an appropriate parent collection. This is the primary purpose of the Islandora Collection Search module.

What's the actual result?

The warnings included in the description above, and no ancestors_ms field for the module to key on.

Additional details / screenshots

None at this time.

@noahwsmith
Member

ancestors_ms requires specific activation, per https://github.com/discoverygarden/islandora_collection_search#installation

@McFateM
Member Author

McFateM commented Mar 28, 2019

Yes, that's true @noahwsmith. However, I believe that in ISLE v1.1 all of the described edits are already in place, but within ./islandora_transforms/research_data_versions.xslt the RELS-EXT_to_solr.xslt is included a 2nd time...it was already included "properly" in the unmodified foxmlToSolr.xslt.

The Fedora logs warn that /usr/local/tomcat/webapps/fedoragsearch/WEB-INF/classes/fgsconfigFinal/index/FgsIndex/islandora_transforms/RELS-EXT_to_solr.xslt is included or imported more than once. This is permitted, but may lead to errors or unexpected behavior

I hate it when that happens. I wish this were an ERROR rather than a WARNing. It really should not be tolerated.

I'm going to reset my staging server, re-introduce my "fix", and see what happens. There is still one small bit of mystery here to unravel.
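One way to confirm which stylesheets pull in RELS-EXT_to_solr.xslt is to search the transform directory for xsl:include or xsl:import elements that reference it. This is a sketch under the assumption that each such element sits on a single line with an href attribute ending in the file name; `find_includers` is a hypothetical helper.

```shell
# List files under directory $1 that include or import a stylesheet
# whose href ends in the file name $2.
find_includers() {
  grep -rlE "<xsl:(include|import)[^>]*href=\"[^\"]*$2\"" "$1" | sort
}
```

Run against a local copy of the FgsIndex directory, `find_includers <path-to-FgsIndex> RELS-EXT_to_solr.xslt` should list more than one file if the duplicate include is real.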

@noahwsmith
Member

Ok, terrific, so it sounds like the issue might be that this line is extraneous
https://github.com/discoverygarden/islandora_transforms/blob/master/research_data_versions.xslt#L9

If this is verifiable we can probably modify that file as it is pulled into the system as we do for these config variables:
https://github.com/Islandora-Collaboration-Group/isle-fedora/blob/f7aa6f9eabcfd586c68278e98b6003a36722d0e3/Dockerfile#L103

Another option would be to submit a patch to DGI and see if they'll update their repo. Cloning an ICG version would just put us on the hook to continue to maintain it.

@noahwsmith added the Bug and ISLE-v.1.1.2 labels on Mar 28, 2019
@McFateM
Member Author

McFateM commented Mar 28, 2019

OK, here's the rest of the story, so far...

17) Eureka! Fixing the "...imported more than once..." appears to have fixed this problem

I found that in a pristine ISLE v1.1 stack RELS-EXT_to_solr.xslt is indeed imported TWICE, once in foxmlToSolr.xslt and again in islandora_transforms/research_data_versions.xslt. So, I commented out the import statement within research_data_versions.xslt, placing my updated copy in ./ISLE/tmp/, and introduced the following bind-mount into my docker-compose.override.yml file:

fedora:
  volumes:
     - ./tmp/research_data_versions.xslt:/usr/local/tomcat/webapps/fedoragsearch/WEB-INF/classes/fgsconfigFinal/index/FgsIndex/islandora_transforms/research_data_versions.xslt
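The manual edit to research_data_versions.xslt could be scripted so it is repeatable. The following is a sketch only: it assumes the offending element is a self-closing xsl:include or xsl:import on one line whose href ends in RELS-EXT_to_solr.xslt, and the function name is hypothetical.

```shell
# Wrap any xsl:include/xsl:import of RELS-EXT_to_solr.xslt in an XML comment.
# Reads the stylesheet on stdin and writes the edited copy to stdout.
comment_out_rels_ext_include() {
  sed -E 's#(<xsl:(include|import)[^>]*RELS-EXT_to_solr\.xslt"[^>]*/>)#<!-- \1 -->#'
}
```

For example: `comment_out_rels_ext_include < research_data_versions.xslt > ./tmp/research_data_versions.xslt`, followed by a diff of the two files to confirm that only the one line changed.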

Then I repeated Steps 2 through 16, with abbreviated, updated results posted below.

  2. Same as before.
  3. There were NO warnings, zero, nada, zilch, none, this time!
  4. Same as before.
  5. The pesky Failed to transform... warnings are still present! Maybe those are actually NOT part of this issue?
  6. Added the same Basic Image Test object as before. The same four new Failed to transform... warnings are back.
  7. Same as before, plus a drush cc all just for good measure.
  8. Same as before.
  9. Same as before, my one new object was returned.
  10. Same as before.
  11. Same as before, my one new object still returned.
  12. Same as before, NO objects returned when searching the object's parent collection only.
  13. Same as before.
  14. Same as before.
  15. Same as before.
  16. All of the Failed to transform... warnings are still present, but the Line 102, Column 51... warnings are gone.

Now, let's see how our Islandora Collection Search behaves...

  • Searched for "test" in the Basic Image Collection and it works, Solr found my object, and it shows that the object now has TWO values in the ancestors_ms field: islandora:sp_basic_image_collection, islandora:root.

Conclusions

  • Commenting out the import "RELS-EXT_to_solr.xslt" line in research_data_versions.xslt AND changing <xsl:param name="index_ancestors" select="false()"/> to <xsl:param name="index_ancestors" select="true()"/> in foxmlToSolr.xslt has the desired effect. I believe these changes should be introduced into the next release of ISLE and reported to DiscoveryGarden.

  • The Failed to transform... warnings in FGS are still troubling. Who knows what effect they might be having now, or in the future? At the very least, there are so many of them that you have to wonder what else they might obscure, as was the case in my tests. The key turned out to be the ONE warning buried among dozens of others.

  • Somewhere in the config there's bound to be a logLevel variable that controls how much output we see from FGS (and other services). I recommend identifying that setting and turning it DOWN from INFO to WARN, so that we won't have to sift through all the INFO messages to discover that there might be some WARN or ERROR messages present!
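Until that setting is located, a stopgap is to filter the log on the way out. This sketch assumes INFO and DEBUG lines begin with their level followed by a space, in the same way the WARN lines above do; the function name is hypothetical. It keeps unprefixed lines such as the traverse-graph.xslt message, since those do not begin with any level.

```shell
# Drop INFO and DEBUG lines; keep WARN, ERROR, and unprefixed messages.
# Reads log text on stdin.
hide_info_lines() {
  grep -vE '^(INFO|DEBUG) '
}
```

For example: `docker logs isle-fedora-ld 2>&1 | hide_info_lines`.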

@bseeger
Contributor

bseeger commented Apr 22, 2019

I think this might be hard to fix, but it's worth filing a ticket on their repo for it: https://github.com/discoverygarden/islandora_transforms

@McFateM - do you mind summarizing it there so that it gets on their radar? They might have a suggestion or workaround, as you're probably not the only one who will run into this.

Thanks!

3 participants