-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add cache for the discovery loading #89
base: main
Are you sure you want to change the base?
Conversation
dd904e2
to
f8fa3c1
Compare
* @return string | ||
* The full contents of discovery.xml. | ||
* | ||
* @throws \Drupal\collabora_online\Exception\CollaboraNotAvailableException | ||
* The client url cannot be retrieved. | ||
*/ | ||
public function getDiscoveryXml(): string; | ||
public function getDiscoveryXml(RefinableCacheableDependencyInterface $cacheability): string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are forced to add this because we are separating the cache into a different class. But it feels unneeded.
We should just put the caching inside the fetcher itself. See e.g. \Drupal\media\OEmbed\ResourceFetcher. The caching is coded inside.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw I don't see ResourceFetcher adding any cache tags to the cache entry.
It makes sense here because the data is coming from an external source, and does not depend on any Drupal config or content. The $url
parameter might be, but that already goes into the $cache_id
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now all this stuff is in one class, CollaboraDiscoveryFetcher
. No more passing around of cache metadata.
* {@inheritdoc} | ||
*/ | ||
public function getDiscoveryXml(RefinableCacheableDependencyInterface $cacheability): string { | ||
$cached = $this->cache->get($this->cid); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add a configuration value that determines for how long the xml is cached. We should put a standard value of 5 minutes, which sounds quite sensible.
See https://git.drupalcode.org/project/media_avportal/-/blob/8.x-1.x/src/AvPortalClient.php?ref_type=heads#L109 for a similar approach .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Microsoft recommends 12-24 hours TTL for this cache.
(I have not found a recommendation from Collabora Online)
https://learn.microsoft.com/en-us/microsoft-365/cloud-storage-partner-program/online/discovery
But they also mention other options.
12-24 hours is a good cadence to refresh, although in practice it's updated much less frequently.
A more dynamic option is to re-run discovery when proof key validation fails, or when it succeeds using the old key. That implies that the keys have been rotated, so discovery should definitely be re-run to get the new public key.
Lastly, another option is to run discovery whenever one of your machines restart. All these approaches, as well as combinations of them, have been used by hosts in the past. Whichever approach makes the most sense depends on your infrastructure.
I don't know what is the typical circulation period for the proof keys in Collabora Online.
I would be careful with the "when proof key validation fails".
This would also apply to maliciously crafted requests from untrusted sources.
Not sure if we care.
We should put a standard value of 5 minutes, which sounds quite sensible.
In a low traffic scenario, this would mean cold cache on almost every request.
For now I chose the 12h as default, but we can discuss this.
use Drupal\Component\Datetime\TimeInterface; | ||
use Drupal\Core\Cache\Cache; | ||
use Drupal\Core\Cache\CacheableMetadata; | ||
use Drupal\Core\Cache\MemoryCache\MemoryCacheInterface; | ||
use Symfony\Component\ErrorHandler\ErrorHandler; | ||
|
||
/** | ||
* Service to get values from the discovery.xml. | ||
*/ | ||
class CollaboraDiscovery implements CollaboraDiscoveryInterface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the end this service is a value object. Probably a factory would help in retrieving either a fresh or a cached instance of it, otherwise it gets cached in the service container and doesn't get really invalidated.
That's why I would not implement any static caching at the moment. We don't want to have to know here the cache metadata regarding how the xml was retrieved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now the "factory" is the *Fetcher->getDiscovery().
The value object is no longer cached (as an object) but the raw xml is cached.
f8fa3c1
to
ead3915
Compare
… in an ExistingSite test.
…ViewerControllerTest.
7b8ec0a
to
3054303
Compare
43eea19
to
8341d85
Compare
…iewerControllerTest.
…urtherLogMessages().
Also change the log message in this case, to make it distinguishable.
8341d85
to
b0328ad
Compare
b0328ad
to
bf87951
Compare
/** | ||
* Creates a discovery value object. | ||
*/ | ||
class CollaboraDiscoveryFetcher implements CollaboraDiscoveryFetcherInterface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing TBD here is the names of the classes.
E.g. instead of CollaboraDiscoveryFetcher
we could just have DiscoveryFetcher
. Or we could rely on the namespace and name it Fetcher
.
To me, the shortname of a class should be sufficiently unique and distinguishable in most places where it is meant to be used, but not necessarily in all.
// This is known to happen when $xml is an empty string. | ||
// Instead we could check for $xml === '' earlier, but we don't know for | ||
// sure if this is, and always will be, the only such case. | ||
throw new CollaboraNotAvailableException('The discovery.xml file is empty.'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about the text, we can't assure that is empty. At least given the comment above.
* Most of this code is copied from openeuropa/oe_showcase, which is published | ||
* under the EUPL license. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed?
|
||
/** @var \Drupal\Core\Cache\CacheTagsInvalidatorInterface $invalidator */ | ||
$invalidator = \Drupal::service(CacheTagsInvalidatorInterface::class); | ||
$invalidator->invalidateTags(['config:collabora_online.settings']); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we check after cache TTL period has expired?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens with value 0?
fn () => simplexml_load_string($xml), | ||
); | ||
$this->assertNotFalse($parsed_xml, "XML: '$xml'"); | ||
return new CollaboraDiscovery($parsed_xml, $cache_tags, $cache_max_age); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3 params?
$this->assertTrue($this->httpClientGetCalls[1][1]['verify']); | ||
$this->assertTrue($this->httpClientGetCalls[1][1]['verify']); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it checking the same array index twice?
try { | ||
$keys = $this->getKeys(); | ||
} | ||
catch (CollaboraNotAvailableException $e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a check for this case in the tests?
@@ -81,6 +83,13 @@ public function editor(MediaInterface $media, Request $request, $edit = FALSE): | |||
['content-type' => 'text/plain'], | |||
); | |||
} | |||
if ($wopi_client_url === NULL) { | |||
return new Response( | |||
(string) $this->t('The Collabora Online editor/viewer is not available for this file type.'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for this. Add a check in the tests?
@@ -1,6 +1,8 @@ | |||
cool: | |||
# The address of the COOL server. | |||
server: https://localhost:9980/ | |||
# Microsoft recommends 12-24 hours for this cache. | |||
discovery_cache_ttl: 43200 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see more about this. Can you post the source, please?
), | ||
'#field_suffix' => $this->t('seconds'), | ||
'#default_value' => $cool_settings['discovery_cache_ttl'] ?? 43200, | ||
'#step' => 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why step?
Sibling MR: https://git.drupalcode.org/project/collabora_online/-/merge_requests/6
Changes in this PR
Still missing: Additional tests to cover the caching.
Features:
Optional additional changes not in this PR:
So.. probably we don't need these optional changes.
I had these changes locally before I simplified, and pushed them instead of discarding.