feat: add object tagging
matthewhilton committed Aug 2, 2024
1 parent 4e04360 commit 9456a19
Showing 31 changed files with 1,777 additions and 11 deletions.
86 changes: 86 additions & 0 deletions TAGGING.md
@@ -0,0 +1,86 @@
# Tagging
Tagging allows extra metadata about your files to be sent to the external object store. The sources of this metadata (tag sources) are defined in code, and currently cannot be turned on or off from the UI.

Currently, this is only implemented for the S3 file system client.

**Tagging vs metadata**

Note that object tags are different from object metadata.

Object metadata is immutable and is attached to the object on upload. To update metadata (for example during a migration, or because the sources changed), you have to copy the object with the new metadata and delete the old object. This is problematic, since deletion is optional in objectfs.

Object tags are more suitable, since their permissions can be managed separately (e.g. a client can be allowed to modify tags, but not delete objects).

## File system setup
### S3
[See the S3 docs for more information about tagging](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-tagging.html).

You must grant the objectfs client the `s3:GetObjectTagging` and `s3:PutObjectTagging` permissions.
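
As an illustration only, the two permissions could be granted with an IAM policy statement like the one built below. This is a minimal sketch: the bucket name `example-objectfs-bucket` is a placeholder, and your own policy will likely need additional statements for the other operations objectfs performs.

```php
<?php
// Hypothetical bucket name (an assumption for this sketch); replace with your own.
$bucket = 'example-objectfs-bucket';

// IAM policy document granting only the two tagging permissions named above.
$policy = [
    'Version' => '2012-10-17',
    'Statement' => [
        [
            'Effect' => 'Allow',
            'Action' => ['s3:GetObjectTagging', 's3:PutObjectTagging'],
            // Object-level actions apply to the objects inside the bucket.
            'Resource' => "arn:aws:s3:::{$bucket}/*",
        ],
    ],
];

// Print the policy as JSON, ready to paste into the IAM console or a template.
$json = json_encode($policy, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
echo $json, PHP_EOL;
```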

## Sources
The following sources are implemented currently:
### Environment
The environment the file was uploaded from. Configure the environment name using `$CFG->objectfs_environment_name`.
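
For example, the setting might be placed in `config.php` as sketched below. The value `'staging'` is purely illustrative, and the `$CFG` object is created here only so that the fragment runs on its own; a real `config.php` already defines it.

```php
<?php
// A real config.php already creates $CFG near the top; repeated here so the fragment is self-contained.
$CFG = new stdClass();

// Illustrative value: name this environment so that uploaded objects are tagged with it.
$CFG->objectfs_environment_name = 'staging';
```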

### Mimetype
The mimetype recorded for the file in the `mdl_files` table.

## Multiple environments pointing to single bucket
You may be using objectfs with multiple environments (e.g. prod, staging) that all point to the same bucket. Since files are referenced by contenthash, it generally does not matter where they come from, so this isn't a problem. However, to ensure the tags remain accurate, you should turn off `overwriteobjecttags` in the plugin settings for every environment except production.

This means that staging cannot overwrite tags on files uploaded elsewhere, but can still set tags on files that have only been uploaded from staging. Files uploaded from production will always have the correct tags, because production overwrites any existing tags.

```mermaid
graph LR
subgraph S3
Object("`**Object**
contenthash: xyz
tags: env=prod`")
end
subgraph Prod
UploadObjectProd["`**Upload object**
contenthash: xyz
tags: env=prod`"] --> Object
end
subgraph Staging
UploadObjectStaging["`**Upload object**
contenthash: xyz
tags: env=staging`"]
end
Blocked["Blocked - does not have permissions\nto overwrite existing object tags"]
UploadObjectStaging --- Blocked
Blocked -.-> Object
style Object fill:#ffffff00,stroke:#ffa812
style S3 fill:#ffffff00,stroke:#ffa812
style Prod fill:#ffffff00,stroke:#26ff4a
style UploadObjectProd fill:#ffffff00,stroke:#26ff4a
style Staging fill:#ffffff00,stroke:#978aff
style UploadObjectStaging fill:#ffffff00,stroke:#978aff
style Blocked fill:#ffffff00,stroke:#ff0000
```
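
The rule illustrated above can be sketched as a small predicate. This is a simplified illustration of the check made in `push_object_tags`; the function name and its parameters are hypothetical and not part of the plugin.

```php
<?php
/**
 * Hypothetical helper (not part of objectfs): decides whether this environment may set tags.
 *
 * @param bool $objectexists whether the object exists in the external store
 * @param bool $canoverwrite whether overwriteobjecttags is enabled for this environment
 * @param array $existingtags tags currently on the external object (key => value)
 * @return bool true if tags may be written
 */
function example_can_set_tags(bool $objectexists, bool $canoverwrite, array $existingtags): bool {
    // Either we are allowed to overwrite, or there is nothing to overwrite.
    return $objectexists && ($canoverwrite || empty($existingtags));
}

// Production (overwrite on) replaces tags left by another environment.
var_dump(example_can_set_tags(true, true, ['env' => 'staging']));
// Staging (overwrite off) is blocked when tags already exist.
var_dump(example_can_set_tags(true, false, ['env' => 'prod']));
// Staging may tag an object that has no tags yet.
var_dump(example_can_set_tags(true, false, []));
```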

## Migration
You must resync tags if the way a tag is calculated has changed, tags have been added or removed, or this feature has been turned on (either for the first time or after being off). To do so, either:
- Manually run the `trigger_update_object_tags` scheduled task from the UI, which queues an `update_object_tags` adhoc task that will process all objects marked as needing sync (which is the default), or
- Call the CLI to execute an `update_object_tags` adhoc task manually.

## Reporting
There is an additional graph added to the object summary report showing the tag value combinations and counts of each.

Note that this only covers files that have been uploaded from this environment, and may not be consistent for environments where `overwriteobjecttags` is disabled (because the site does not know whether a file's tags were overwritten in the external store by another client).

## For developers

### Adding a new source
Note the rules about sources:
- Identifier must be < 32 chars long.
- Value must be < 128 chars long.

While external providers allow longer keys and values, we intentionally limit them to reserve space for future use. These limits may change in the future as the feature matures.
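
A minimal sketch of checking these limits is shown below. The function and constant names are hypothetical, and lengths are measured in bytes with `strlen`, which is an assumption about how "chars" is counted.

```php
<?php
// Hypothetical constants mirroring the limits listed above (exclusive upper bounds).
const EXAMPLE_MAX_IDENTIFIER_LENGTH = 32;
const EXAMPLE_MAX_VALUE_LENGTH = 128;

/**
 * Hypothetical helper (not part of objectfs): checks a tag against the source rules.
 * Assumes lengths are counted in bytes.
 */
function example_is_valid_tag(string $identifier, string $value): bool {
    return strlen($identifier) < EXAMPLE_MAX_IDENTIFIER_LENGTH
        && strlen($value) < EXAMPLE_MAX_VALUE_LENGTH;
}

var_dump(example_is_valid_tag('environment', 'prod'));
var_dump(example_is_valid_tag(str_repeat('a', 32), 'prod'));
```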

To add a new source:
- Implement `tag_source`
- Add to the `tag_manager` class
- As part of an upgrade step, set `tagsyncstatus` on all objects to needing sync (using the `tag_manager` class, or manually in the DB)
- As part of an upgrade step, queue an `update_object_tags` adhoc task to process the tag migration.
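
The first step might look roughly like the sketch below. The real `tag_source` interface is not shown in this document, so `example_tag_source` and its method names are hypothetical stand-ins; check the actual interface before implementing a source.

```php
<?php
// Hypothetical stand-in for the plugin's tag_source interface (method names are assumptions).
interface example_tag_source {
    /** Tag key; must be shorter than 32 chars. */
    public function get_identifier(): string;

    /** Tag value for the given file, or null if there is none; must be shorter than 128 chars. */
    public function get_value_for_contenthash(string $contenthash): ?string;
}

// Toy source that tags each object with the first two characters of its contenthash.
class example_hash_prefix_source implements example_tag_source {
    public function get_identifier(): string {
        return 'hashprefix';
    }

    public function get_value_for_contenthash(string $contenthash): ?string {
        if ($contenthash === '') {
            return null;
        }
        return substr($contenthash, 0, 2);
    }
}

// Gather the tag for one object as a key => value pair.
$source = new example_hash_prefix_source();
$tags = [$source->get_identifier() => $source->get_value_for_contenthash('abcdef0123')];
var_dump($tags);
```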
62 changes: 62 additions & 0 deletions classes/check/tagging_status.php
@@ -0,0 +1,62 @@
<?php
// This file is part of Moodle - http://moodle.org/
//
// Moodle is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// Moodle is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with Moodle. If not, see <http://www.gnu.org/licenses/>.

namespace tool_objectfs\check;

use core\check\check;
use core\check\result;
use tool_objectfs\local\tag\tag_manager;

/**
 * Tagging status check
 *
 * @package tool_objectfs
 * @author Matthew Hilton <[email protected]>
 * @copyright Catalyst IT
 * @license http://www.gnu.org/copyleft/gpl.html GNU GPL v3 or later
 */
class tagging_status extends check {
    /**
     * Link to ObjectFS settings page.
     *
     * @return \action_link|null
     */
    public function get_action_link(): ?\action_link {
        $url = new \moodle_url('/admin/category.php', ['category' => 'tool_objectfs']);
        return new \action_link($url, get_string('pluginname', 'tool_objectfs'));
    }

    /**
     * Get result
     * @return result
     */
    public function get_result(): result {
        if (!tag_manager::is_tagging_enabled_and_supported()) {
            return new result(result::NA, get_string('check:tagging:na', 'tool_objectfs'));
        }

        // Do a tag set test.
        $config = \tool_objectfs\local\manager::get_objectfs_config();
        $client = \tool_objectfs\local\manager::get_client($config);
        $result = $client->test_set_object_tag();

        if ($result->success) {
            return new result(result::OK, get_string('check:tagging:ok', 'tool_objectfs'), $result->details);
        } else {
            return new result(result::ERROR, get_string('check:tagging:error', 'tool_objectfs'), $result->details);
        }
    }
}
5 changes: 5 additions & 0 deletions classes/local/manager.php
@@ -64,6 +64,7 @@ public static function get_objectfs_config() {
        $config->batchsize = 10000;
        $config->useproxy = 0;
        $config->deleteexternal = 0;
        $config->enabletagging = false;

        $config->filesystem = '';
        $config->enablepresignedurls = 0;
@@ -329,6 +330,10 @@ public static function get_available_fs_list() {
     * @return string
     */
    public static function get_client_classname_from_fs($filesystem) {
        // Unit tests need to return the test client.
        if ($filesystem == '\tool_objectfs\tests\test_file_system') {
            return '\tool_objectfs\tests\test_client';
        }
        $clientclass = str_replace('_file_system', '', $filesystem);
        return str_replace('tool_objectfs\\', 'tool_objectfs\\local\\store\\', $clientclass.'\\client');
    }
1 change: 1 addition & 0 deletions classes/local/report/objectfs_report.php
@@ -166,6 +166,7 @@ public static function get_report_types() {
            'location',
            'log_size',
            'mime_type',
            'tag_count',
        ];
    }
51 changes: 51 additions & 0 deletions classes/local/report/tag_count_report_builder.php
@@ -0,0 +1,51 @@
<?php
// This file is part of Moodle - http://moodle.org/
//
// Moodle is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// Moodle is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with Moodle. If not, see <http://www.gnu.org/licenses/>.

namespace tool_objectfs\local\report;

/**
 * Tag count report builder.
 *
 * @package tool_objectfs
 * @author Matthew Hilton <[email protected]>
 * @copyright Catalyst IT
 * @license http://www.gnu.org/copyleft/gpl.html GNU GPL v3 or later
 */
class tag_count_report_builder extends objectfs_report_builder {
    /**
     * Builds report
     * @param int $reportid
     * @return objectfs_report
     */
    public function build_report($reportid) {
        global $DB;
        $report = new objectfs_report('tag_count', $reportid);

        // Returns counts + sizes of key:value.
        $sql = "
            SELECT CONCAT(COALESCE(object_tags.tagkey, '(untagged)'), ': ', COALESCE(object_tags.tagvalue, '')) as datakey,
                   COUNT(objects.id) as objectcount,
                   SUM(objects.filesize) as objectsum
              FROM {tool_objectfs_objects} objects
         LEFT JOIN {tool_objectfs_object_tags} object_tags
                ON objects.contenthash = object_tags.contenthash
          GROUP BY object_tags.tagkey, object_tags.tagvalue
        ";
        $result = $DB->get_records_sql($sql);
        $report->add_rows($result);
        return $report;
    }
}
27 changes: 27 additions & 0 deletions classes/local/store/object_client.php
@@ -25,6 +25,8 @@

namespace tool_objectfs\local\store;

use stdClass;

interface object_client {

    /**
@@ -137,6 +139,31 @@ public function proxy_range_request(\stored_file $file, $ranges);
     */
    public function test_range_request($filesystem);

    /**
     * Tests setting an object's tag.
     * @return stdClass containing 'success' and 'details' properties
     */
    public function test_set_object_tag(): stdClass;

    /**
     * Set the given object's tags in the external store.
     * @param string $contenthash file content hash
     * @param array $tags array of key=>value pairs to set as tags.
     */
    public function set_object_tags(string $contenthash, array $tags);

    /**
     * Returns the given object's tags queried from the external store. External object must exist.
     * @param string $contenthash file content hash
     * @return array array of key=>value tag pairs
     */
    public function get_object_tags(string $contenthash): array;

    /**
     * Whether the client supports the object tagging feature.
     * @return bool true if supported, else false
     */
    public function supports_object_tagging(): bool;
}


36 changes: 36 additions & 0 deletions classes/local/store/object_client_base.php
@@ -25,6 +25,8 @@

namespace tool_objectfs\local\store;

use stdClass;

/**
 * [Description object_client_base]
 */
@@ -187,4 +189,38 @@ public function test_connection() {
    public function test_permissions($testdelete) {
        return (object)['success' => false, 'details' => ''];
    }

    /**
     * Tests setting an object's tag.
     * @return stdClass containing 'success' and 'details' properties
     */
    public function test_set_object_tag(): stdClass {
        return (object)['success' => false, 'details' => ''];
    }

    /**
     * Set the given object's tags in the external store.
     * @param string $contenthash file content hash
     * @param array $tags array of key=>value pairs to set as tags.
     */
    public function set_object_tags(string $contenthash, array $tags) {
        return [];
    }

    /**
     * Returns the given object's tags queried from the external store. External object must exist.
     * @param string $contenthash file content hash
     * @return array array of key=>value tag pairs
     */
    public function get_object_tags(string $contenthash): array {
        return [];
    }

    /**
     * Whether the client supports the object tagging feature.
     * @return bool true if supported, else false
     */
    public function supports_object_tagging(): bool {
        return false;
    }
}
56 changes: 56 additions & 0 deletions classes/local/store/object_file_system.php
@@ -36,7 +36,10 @@
use stored_file;
use file_storage;
use BlobRestProxy;
use coding_exception;
use Throwable;
use tool_objectfs\local\manager;
use tool_objectfs\local\tag\tag_manager;

defined('MOODLE_INTERNAL') || die();

@@ -360,6 +363,12 @@ public function copy_object_from_local_to_external_by_hash($contenthash, $object
            }
        }

        // If tagging is enabled, ensure tags are synced regardless of whether the object is local, duplicated, etc.
        // The file may exist in the external store because it was uploaded by another site, but we may want to put our tags onto it.
        if (tag_manager::is_tagging_enabled_and_supported()) {
            $this->push_object_tags($contenthash);
        }

        $this->logger->log_object_move('copy_object_from_local_to_external',
            $initiallocation,
            $finallocation,
@@ -1154,4 +1163,51 @@ private function update_object(array $result): array {

        return $result;
    }

    /**
     * Pushes tags to the external store (post upload) for a given hash.
     * External client must support tagging.
     *
     * @param string $contenthash file to sync tags for
     */
    public function push_object_tags(string $contenthash) {
        if (!$this->get_external_client()->supports_object_tagging()) {
            throw new coding_exception("Cannot sync tags, external client does not support tagging.");
        }

        // Get a lock before syncing, to ensure other parts of objectfs are not moving/interacting with this object.
        $lock = $this->acquire_object_lock($contenthash, 10);

        // No lock - the object cannot be synced safely, so fail rather than continue.
        if (!$lock) {
            throw new coding_exception("Could not get object lock");
        }

        try {
            $objectexists = $this->is_file_readable_externally_by_hash($contenthash);

            // Tags can be set if the object exists and either we can overwrite (and do not care about existing tags),
            // or we cannot overwrite but the existing tags are empty.
            // The overwrite check comes first to avoid fetching tags unnecessarily, since that is an extra API call.
            $canset = $objectexists && (tag_manager::can_overwrite_object_tags() ||
                empty($this->get_external_client()->get_object_tags($contenthash)));

            if ($canset) {
                $tags = tag_manager::gather_object_tags_for_upload($contenthash);
                $this->get_external_client()->set_object_tags($contenthash, $tags);
                tag_manager::store_tags_locally($contenthash, $tags);
            }

            // Regardless, it has synced.
            tag_manager::mark_object_tag_sync_status($contenthash, tag_manager::SYNC_STATUS_SYNC_NOT_REQUIRED);
        } catch (Throwable $e) {
            $lock->release();

            // Mark object as tag sync error, this should stop it re-trying until fixed manually.
            tag_manager::mark_object_tag_sync_status($contenthash, tag_manager::SYNC_STATUS_ERROR);

            throw $e;
        }
        $lock->release();
    }
}