Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

7.x islandora 2046: Allow Islandora Batch to be extended for more fun #105

Open
wants to merge 3 commits into
base: 7.x
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
49 changes: 31 additions & 18 deletions includes/ingest.batch.inc
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ define('ISLANDORA_BATCH_EMPTY_SET', 2);
* Function to get the average.
*
* @param array $context
* The context
* The context.
*/
function islandora_batch_get_average($context) {
if ($context['results']['count'] > 0) {
Expand All @@ -39,7 +39,7 @@ function islandora_batch_ingest_preprocess($preprocessor, &$context) {
/**
* Get the name of the lock to acquire/release.
*
* @param string|int|NULL $ingest_set
* @param string|int|null $ingest_set
* A string or integer identifying an ingest set to process. NULL indicates
* that it is a general/set-independent batch, processing everything in the
* queue.
Expand Down Expand Up @@ -91,7 +91,7 @@ function islandora_batch_get_lock_timeout($timeout) {
* The general/set-independent batch should not be able to run at the same time
* as any other batch.
*
* @param string|int|NULL $ingest_set
* @param string|int|null $ingest_set
* A string or integer identifying an ingest set to process. NULL indicates
* that it is a general/set-independent batch, processing everything in the
* queue.
Expand Down Expand Up @@ -373,7 +373,6 @@ function test_islandora_batch_process_results($results, $timeout, $lock_timeout,
return TRUE;
}


/**
* Process set of result from the islandora_batch_queue table.
*
Expand All @@ -397,11 +396,11 @@ function islandora_batch_process_results($results, $timeout, $lock_timeout, &$co
) {
$start = timer_read(ISLANDORA_BATCH_TIMER_NAME);
// Process a single object.
$ingest_object = unserialize($object['data']);
$islandora_batch_object = unserialize($object['data']);

if ($object['state'] !== ISLANDORA_BATCH_STATE__DONE) {
// Both a simple state and state with message return are accepted.
$process_results = $ingest_object->batchProcess();
$process_results = $islandora_batch_object->batchProcess();
$object['state'] = is_array($process_results) ? $process_results['state'] : $process_results;
}

Expand All @@ -410,31 +409,45 @@ function islandora_batch_process_results($results, $timeout, $lock_timeout, &$co
// XXX: Due to how the things currently work, the user name and
// password hash gets serialized, so later changes to the password
// can break things... Let's set the password again.
$username = $ingest_object->repository->api->connection->username;
$username = $islandora_batch_object->repository->api->connection->username;
$user = user_load_by_name($username);
if ($ingest_object->repository->api->connection->password != $user->pass) {
$ingest_object->repository->api->connection->password = $user->pass;
if ($islandora_batch_object->repository->api->connection->password != $user->pass) {
$islandora_batch_object->repository->api->connection->password = $user->pass;
}

// Push to backend.
$ingested_object = islandora_add_object($ingest_object);
if ($ingested_object) {
$context['message'] = t('Ingested %pid.', array('%pid' => $ingested_object->id));
$actionresult_object = $islandora_batch_object->batchRepositoryAction();

if ($actionresult_object) {
$context['message'] = t('Successfully %action %pid.',
array(
'%pid' => $actionresult_object->id,
'%action' => $islandora_batch_object->getRepositoryActionDescription(),
));
}
else {
// Failed to ingest... Flag an error.
// Failed to process on backend... Flag an error.
$object['state'] = ISLANDORA_BATCH_STATE__ERROR;
$context['message'] = t('Unknown error: Failed to ingest %pid.', array('%pid' => $ingest_object->id));
$context['message'] = t('Unknown error: Failed to %action %pid.',
array(
'%pid' => $islandora_batch_object->id,
'%action' => $islandora_batch_object->getRepositoryActionDescription(),
));
}
}
catch (Exception $e) {
// Failed to ingest... Flag an error.
// Backend exception happened... Flag an error.
$object['state'] = ISLANDORA_BATCH_STATE__ERROR;
$context['message'] = t('Exception occured: Failed to ingest %pid.', array('%pid' => $ingest_object->id));
$context['message'] = t('Exception occured: Failed to %action %pid because of %message.',
array(
'%pid' => $islandora_batch_object->id,
'%action' => $islandora_batch_object->getRepositoryActionDescription(),
'%message' => $e->getMessage(),
));
}
}
else {
$context['message'] = t('%pid not ready for ingest.', array('%pid' => $ingest_object->id));
$context['message'] = t('%pid not ready for ingest.', array('%pid' => $islandora_batch_object->id));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, but should ingest not be replaced by the selected action (if so, perhaps re-wording to '%pid cannot be %action: object not ready.' could help with the grammar...)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, totally. Only issue i had there is verb conjugation/my-lack of wording. I don't know how to convert that phrase to "simple past", since "ingested" is what i was using/suggesting. I could, instead of returning a single "action" description, maybe return an array, with present/past verbs. Ideas? Or other way, change wording of other past-dependant sentences to present?
Also, there are many other mentions (like function names) to ingest, i tried to go for a simple solution for now that allowed the use case to avoid too much refactoring. Reason: every-time i have done big-time refactoring everyone ends using my branch and patch, but i don't get merged into core, too much of a commitment!

Totally open to suggestions

Copy link
Author

@DiegoPino DiegoPino Sep 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe..
'Successfully %action %pid'which leads to "successfully ingested islandora:monster"
could be reworded as
'We managed to successfully ingest islandora:monster' ? or Islandora managed to, or Islandora Batch, etc..
or "Your Ingest of islandora:monster was successful"
What do you think @d-r-p ? That would a single present tense verb to be used to describe the action itself

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I think the more compact the message is, the better. On the other hand, leaving past tenses lying around is somehow awkward (by the way, I guess it would be a good idea to wrap whatever we decide on with t()) - let alone present + past tenses. If you do not like my earlier suggestion of reformulating that one message into past tense, how about returning t("Ingest") and alerting the user with '%action action for %pid succeeded.', '%action action for %pid failed.', etc... (or similar)?

Regarding the naming of functions (and files?), perhaps a little bit of refactoring would be good. I think, otherwise, this might cause confusion if picked up after some time. But feel free (you, or anybody) to disagree...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I missed your earlier suggestion! (sorry), yeah, sounds fine to me and i will go for that, thanks. t() and watchdog don't play well... since we are kinda accumulating the messages. I remember having quite some issues with that in the past so will stay away from that now. Also, i always forget our classes are really not generic php classes, they are used inside the weirdness of drupal, so t() can be used...

About naming of functions and files. Refactor can bring fear and riots as said before. Will wait a bit and if someone else feels it is needed right now can do of course. I would need to deprecate some functions and maybe too much. Lets discuss this. This was meant to actually be eventually merged!

}

// Update the info in the database.
Expand All @@ -450,7 +463,7 @@ function islandora_batch_process_results($results, $timeout, $lock_timeout, &$co
);

// Pass hook object and a pass/fail state for the object.
module_invoke_all(ISLANDORA_BATCH_OBJECT_PROCESSED_HOOK, $ingest_object, ($object['state'] === ISLANDORA_BATCH_STATE__DONE ? 1 : 0));
module_invoke_all(ISLANDORA_BATCH_OBJECT_PROCESSED_HOOK, $islandora_batch_object, ($object['state'] === ISLANDORA_BATCH_STATE__DONE ? 1 : 0));

$end = timer_read(ISLANDORA_BATCH_TIMER_NAME);
if (!isset($context['results']['count'])) {
Expand Down
35 changes: 34 additions & 1 deletion includes/islandora_batch_object_base.inc
Original file line number Diff line number Diff line change
Expand Up @@ -8,9 +8,10 @@
/**
* Batch interface.
*
* Implementing classes should subclass some version of FedoraObject, so
* Implementing classes should subclass some version of FedoraObject, so.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this was there previously, but... this isn't quite a full sentence, here... Really... not sure the intent of this comment... just seems to reinforce the extends below... could just remove the comment?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally. I was more like waiting for someone to complete the sentence! but yeah, can remove

*/
abstract class IslandoraBatchObject extends IslandoraNewFedoraObject {

/**
* The initial batch state.
*
Expand Down Expand Up @@ -71,6 +72,34 @@ abstract class IslandoraBatchObject extends IslandoraNewFedoraObject {
*/
abstract public function addRelationships();

/**
* The actual back-end action that is run after the batchProcess.
*
* Classes not overriding this will get objects pushed to the back-end.
* (ingested)
*
* @return AbstractObject|false
* Either a newly ingested Abstract Object or false if failed.
*/
public function batchRepositoryAction() {
if (!islandora_object_load($this->id)) {
return islandora_add_object($this);
}
else {
return FALSE;
}
}

/**
* A human readable short description of the repository action.
*
* @return string
* Recomended value is a verb in past tense.
*/
public function getRepositoryActionDescription() {
return "ingested";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The manner in which this is implemented/used will lead to messages which are not quite appropriately translatable... as is, may lead to incorrect grammar in some languages.

... Would have to allow each implementation to provide their own complete messages (passing the arguments to receive a full response)? Or just the individual template strings?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would that mean generating full messages for success, exceptions and errors? Issue i see with that is that adding then afterwards arguments coming from the actual batch would make things quite complicated. What do you propose

Copy link

@d-r-p d-r-p Sep 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adam-vessey @DiegoPino Perhaps something along the lines of what I suggested earlier ('%action action for %pid succeeded.', etc...) might be easier to translate, while allowing us to keep batch arguments? [EDIT: Then the above function would return "Ingest".]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As indicated... allow implementations to either provide complete messages or template strings.

Or a particular implementation approach? Kind of leaning towards the template things... Possibly one of:

  • a method passed a constant indicating the type of message, returning the template string, into which we then substitute our few things using format_string(); or,
  • a method returning a mapping keyed by such a constant we then select the message from and similarly throw it at format_string()

... there are a number of ways it could be accomplished.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will work on the template strings idea, probably 3 ones would suffice. Wonder in that case if the full generation t() et all // - hate having t()in a class...- + argument expansion/replacement should happen an implementing class or should stay on batch itself as it is today? ideas @d-r-p @adam-vessey @bryjbrown ? The simpler the better i would say because the more i add the more i will be open other wormholes .. the implementation would be aware that batch itself only provides the %pid really...and some Drupal exception message...

}

/**
* Get the ID of this object in the queue.
*
Expand All @@ -92,6 +121,10 @@ abstract class IslandoraBatchObject extends IslandoraNewFedoraObject {
)));
}
}

}

/**
* An Islandora Batch Exception Class.
*/
class IslandoraBatchNoIdException extends Exception {}
10 changes: 9 additions & 1 deletion islandora_batch.install
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ function islandora_batch_schema() {
'not null' => FALSE,
),
),
'primary key' => array('id'),
'primary key' => array('id', 'sid'),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not... sure this is valid, given all the joins which happen... everything thing that refers to this as a foreign key would be broken, such as the state, resource and message tracking (the islandora_batch_state, islandora_batch_resources and islandora_batch_queue_messages, respectively) are all associated with the unique entry in the queue... it would now be sharing things across sets, which probably isn't desirable... state especially... and might be desirable to reword some of the things: instead of "per-object" messages, "per-queue entry", kind of thing...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"broken" might be the wrong word... more like... consistent in unexpected ways, when navigating the GUI?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adam-vessey look at
https://gist.github.com/DiegoPino/e7819c6f67e50b7a4193e43b2744e1f3 and let me know what do you think there. That also implies changing/adding a few db methods to deal with the fact that we wont be able to trust that result of operations will give is always a single queue element for a given PID

In any case Foreign keys will/won't be broken at all and without any extra stuff functionality stays the same with this change. Drupal makes no use of Foreign keys (see them as annotation properties for future work that won't happen anymore). Best example is that our foreign keys are already wrongly defined (see how the gist changes that), we were not using 'columns' key and still nothing ever broke.

The way i see this is that this primary key is more an internal optimization to A) disregard the integrity table conflict/error and B) make search in the table faster.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a glance, the gist looks fine (a diff/something highlighting the changes would be more readable), but it's not so much the schema itself I'm concerned with, 'cause yeah, foreign keys are just descriptive as you point out; however, they're nice to keep up-to-date for to facilitate these types of discussions...

What I'm more concerned with is more the things like the various state checks:

... The checks on parents as well would likely have to change, since they can no longer be the single columns check... probably safe to constrain to the same set:

->where('c.parent = p.id')

... There would be some views adjustments also in order, to constrain things to items in the given set:

  • 'islandora_batch_queue' => array(
    'left_field' => 'id',
    'field' => 'parent',
    ),
  • protected function getLinks($values) {
    return array(
    array(
    'title' => t('Set item state'),
    'href' => format_string('!base/!item_id/set_state', array(
    '!base' => $this->actionBasePath(),
    '!item_id' => $this->getItemId($values),
    )),
    'query' => drupal_get_destination(),
    ),
    array(
    'title' => t('Delete item'),
    'href' => format_string('!base/!item_id/delete', array(
    '!base' => $this->actionBasePath(),
    '!item_id' => $this->getItemId($values),
    )),
    'query' => drupal_get_destination(),
    ),
    );
    }

... along with adjusting the menu routes used by views to accept the set:

  • $items['islandora_batch/item/%/delete'] = array(
    'title' => 'Delete "@item" from queue',
    'title arguments' => array('@item' => 2),
    'page callback' => 'islandora_batch_delete_item_page_callback',
    'page arguments' => array(2),
    'access callback' => 'islandora_batch_item_access',
    'access arguments' => array(2),
    'file' => 'includes/menu.inc',
    );
    $items['islandora_batch/item/%/set_state'] = array(
    'title' => 'Set "@item" state',
    'title arguments' => array('@item' => 2),
    'page callback' => 'islandora_batch_set_item_state_page_callback',
    'page arguments' => array(2),
    'access callback' => 'islandora_batch_item_access',
    'access arguments' => array(2),
    'file' => 'includes/menu.inc',
    );

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally. I agree with you @adam-vessey . Those changes are the ones i was speaking about to make this pull not only something that does not break existing functionality and allows other cases, but be fully robust on those future cases. If you agree with the general structure of that gist, i can take that and move forward with changing every place there the assumption is a PID is the only key to rule them all to PID, SET based display and functionality. Will see what my agenda allows me this week, but should be pretty trivial, have already annotated all my findings + yours = feel all based would be covered. Thanks!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, coming back to this, sorry for the long wait. Will have an update this week.

'indexes' => array(
'parent_index' => array('parent'),
'sid' => array('sid'),
Expand Down Expand Up @@ -358,3 +358,11 @@ function islandora_batch_update_7104() {
'not null' => FALSE,
));
}

/**
* Make PID and Set ID combined primary Key.
*/
function islandora_batch_update_7105() {
db_drop_primary_key('islandora_batch_queue');
db_add_primary_key('islandora_batch_queue', array('id', 'sid'));
}