Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented the file use model with access control for file downloads. #129

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

lsitu
Copy link
Member

@lsitu lsitu commented Jul 31, 2017

Fixes #111 ; refs #111

Implemented the file use model with access control for file downloads.

Changes proposed in this pull request:

  • Implemented the file use model to allow ingesting files with other uses into the same fileset as the original file.
  • Added support to index technical metadata as JSON format for all files in a FileSet for UI display.
  • Implemented access control for to restrict the PreservationMasterFile and the TIFF format files for curator-only access.
  • Display additional files like PreservationMasterFile in the FileSet viewer and enabled file download for curators.

@ucsdlib/developers - please review

@lsitu lsitu mentioned this pull request Jul 31, 2017
Copy link
Member

@mcritchlow mcritchlow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some comments/questions. I think my only overall concern is the number of upstream classes that we seem to be overriding. So my questions are largely targeted around trying to understand the large set of overrides.

Perhaps there isn't an alternative, but I admittedly worry a bit about the sustainability/upgrade path. Hoping @VivianChu and @hweng can take a look as well.

This is not to discount the work on this, really great job @lsitu 👍

@@ -0,0 +1,33 @@
# Actions are decoupled from controller logic so that they may be called from a controller or a background job.
class FileSetAttachFilesActor < Hyrax::Actors::FileSetActor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What change was needed here that required us overriding the (presumably) upstream FileSetAttachFilesActor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we just extend/override the Hyrax::Actors::FileSetActor to accept a relation/file use parameter.

module Hyrax
module Actors
# Creates a work and attaches files to the work
class CreateWithFilesActor < Hyrax::Actors::AbstractActor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What change was needed here that required us overriding the (presumably) upstream CreateWithFilesActor ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to override it to extract file uses attributes from the object metadata and extend method attach_files(files, env, file_uses) to accept the file uses parameter for ingest.

module Hyrax
module Actors
# Attaches remote files to the work
class CreateWithRemoteFilesActor < Hyrax::Actors::AbstractActor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What change was needed here that required us overriding the (presumably) upstream CreateWithRemoteFilesActor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same as above to extract file uses from object metadata and extend it to accept the file uses properties for ingest.

# return [binary]
def relation_content(relation)
case relation
when 'preservation_master_file'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could preservation_master_file be a constant instead of a plain string? Perhaps available via the new ExtendedContainedFiles

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in commit e8696d3. Thanks.

when 'preservation_master_file'
preservation_master_file = asset.attached_files.base.preservation_master_file
return preservation_master_file if preservation_master_file
when 'transcript'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could transcript be a constant instead of a plain string? Perhaps available via the new ExtendedContainedFiles

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also done in commit e8696d3. Thanks.

@@ -0,0 +1,75 @@
# Converts UploadedFiles into FileSets and attaches them to works.
class AttachFilesToFileSetJob < AttachFilesToWorkJob
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What change was needed here that required us overriding the (presumably) upstream AttachFilesToFileSetJob?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to extend the functionality to accept the file use parameter and perform ingest files with those files use properties provided as well.


def download_file_permission
can :read, ActiveFedora::File do |file|
file.mime_type == 'image/tiff' ? false : true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this check specifically targeting image/tiff? What are we trying to prevent here? The download of source/master files?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is implemented because Ryan's comment for not allowing public users to download the TIFF files, see #111 (comment),.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. So is this the primarily place where we're ensuring original_file's aren't downloaded by non-curators?

Because if so, it seems like we have far more file types than tiff which would fall under this check. My reading of Ryan's comment is tiff was just a (common) example. Not the sole use case or instance where we have this use case. And, it also seems like it would be nice if we could check the file_use instead of the mime_type to determine download-ability? Should we clarify with Ryan?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mcritchlow Yes, this is one and there could be the videos/audios as well. But it will require deeper and tricker overriding in the derivative creative and technical metadata extraction layer if we change the strategic to ingest a primary file with File Use other than the OriginalFile, which will make it harder for us to maintain our codes over time. So it seems like checking for mime_type la cleaner solution for us at the moment.

@@ -2,4 +2,54 @@

class FileSetIndexer < Hyrax::FileSetIndexer
self.thumbnail_path_service = ::ThumbnailPathService

def generate_solr_document
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was there previously no Solr document content generated for FileSets? Or are we overriding what was there and generating our own here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think hyrax does have solr document generate for FileSets but unfortunately it won't support ingest/store multiples files in one fileset at this time. And there are no strategic to store technical metadata for other files other than the original file in the same fileset. So we need to extend it to include technical metadata for all other files in Solr document so that we can expose them for UI displaying.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that makes sense. Thanks for clarifying 👍

# download that file and put it into Fedora
# Called by AttachFilesToWorkJob (when files are uploaded to s3)
# and CreateWithRemoteFilesActor when files are located in some other service.
class ImportUrlWithFileUseJob < ::ImportUrlJob
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What change was needed here that required us overriding the (presumably) upstream ImportUrlWithFileUseJob?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the main purpose for overriding here is also to extend the functionality to accept the file_use/relation parameter and ingest the file with the file use specified

@lsitu lsitu force-pushed the feature/file_uses branch from a7ccba9 to e8696d3 Compare August 1, 2017 17:45
@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 92.58% when pulling e8696d3 on feature/file_uses into 4989221 on develop.

@VivianChu
Copy link
Member

👍

@mcritchlow
Copy link
Member

Noting @lsitu 's post on samvera google group which finally got some discussion going. I'm going to attend the tech call tomorrow to get more feedback from folks. Then we can circle back and discuss next steps with this. So let's hold the merge for now, please.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants