-
-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New API for downloading LDML files from projects that allow sharing WS data #1309
base: develop
Are you sure you want to change the base?
Conversation
Unfortunately it looks as though we won't be able to use ZipArchive to add files to the output stream asynchronously as they come in: we'd probably run into this issue where ZipArchive tries to write the end-of-zipfile directory synchronously when it's disposed, but Kestrel doesn't allow synchronous writes to its HTTP output stream. So we might have to extract everything, then send the zip file separately. Which means that the HTTP connection might time out, so it's possible we'll have to make two API endpoints, one to prepare the zip file and one to download it. Testing required. |
Will return 403 Forbidden if project does not allow sharing ws data with SLDR. Will also return same 403 Forbidden error code if project does not exist, to avoid possibly leaking project codes. If project exists and allows data sharing, command will return a zipfile containing CachedSettings/WritingSystems/*.ldml from the tip revision.
8888014
to
ae60963
Compare
Rebased on top of current develop since it's been a month since this was started, and a lot has changed on the develop branch in that time. |
Alas, this fails (when a ProjectController method is added to call PrepareLdmlZip) with "System.InvalidOperationException: Synchronous operations are disallowed. Call WriteAsync or set AllowSynchronousIO to true instead." We'll have to change this to prepare the entire zip file first, then send it.
This is less efficient, but the only method that actually works given the current state of the ZipFile / ZipArchive code in .NET.
C# Unit Tests104 tests ±0 101 ✅ - 3 5s ⏱️ ±0s For more details on these failures, see this check. Results for commit f8b0dba. ± Comparison against base commit bb51178. ♻️ This comment has been updated with latest results. |
Now working; marking as ready for review. Not ready to merge yet, as there are several discussion items (marked with TODO comments) in this PR, such as what the permission structure should be. But it's tested and working. To test locally, you'll need to get a project with
If all went well, the zipfile should contain a folder named for a project ID at the root. If you get a 22-byte zip file that's empty, then something went wrong. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left a bunch of feedback, let me know if I missed any questions you still have.
One thought is to return 404 if there's no projects found, that should make it clear what's going on rather than an empty zip being an expected result.
data[nameof(JobTriggerTraceId)] = Activity.Current?.Context.TraceId.ToHexString() ?? string.Empty; | ||
data[nameof(JobTriggerSpanParentId)] = Activity.Current?.Context.SpanId.ToHexString() ?? string.Empty; | ||
var trigger = TriggerBuilder.Create() | ||
// TODO: Is there a simpler way of telling Quartz "Hey, enqueue this job after X delay"? Picking a unique trigger name each time seems unnecessarily complicated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, actually if you look at the source, the method that we're using in LexJob scheduler.TriggerJob
internally creates a random identity for the trigger
var trigger = TriggerBuilder.Create() | ||
// TODO: Is there a simpler way of telling Quartz "Hey, enqueue this job after X delay"? Picking a unique trigger name each time seems unnecessarily complicated. | ||
.WithIdentity(key.Name + "_Trigger_" + now.Ticks.ToString(), key.Group) | ||
.StartAt(now.Add(delay)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should probably validate that delay is positive
|
||
namespace LexBoxApi.Jobs; | ||
|
||
public abstract class DelayedLexJob() : LexJob |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this class isn't really doing anything, can we just rename the QueueJob
method and move it into LexJob then remove this class, maybe call it schedule job and or DelayedQueueJob
something like that?
public class DeleteTempDirectoryJob() : DelayedLexJob | ||
{ | ||
public static async Task Queue(ISchedulerFactory schedulerFactory, | ||
string path, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure where, but this should be heavily validated, probably before queing the job, and right before the job actually runs to avoid deleting anything we care about. For example this should never be used to delete repo data. I would suggest only allowing paths prefixed with Path.GetTempPath()
though I would validate that the temp path is not empy otherwise that'll match anything.
@@ -500,6 +508,16 @@ private async Task<HttpContent> ExecuteHgCommandServerCommand(ProjectCode code, | |||
return response.Content; | |||
} | |||
|
|||
private async Task<HttpContent?> ExecuteHgCommandServerCommand_ErrorsOk(ProjectCode code, string command, IEnumerable<HttpStatusCode> okErrors, CancellationToken token) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the name of this method is non standard C#. Also okErrors
could just be a parameter of the original ExecuteHgCommandServerCommand
, with a default value of null or empty and it would have the same result
public async Task<DirectoryInfo?> ExtractLdmlZip(Project project, string destRoot, CancellationToken token = default) | ||
{ | ||
if (project.Type != ProjectType.FLEx) return null; | ||
var zip = await hgService.GetLdmlZip(project.Code, token); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be a using statement to dispose of the zip once this method returns
protected override Task ExecuteJob(IJobExecutionContext context) | ||
{ | ||
ArgumentException.ThrowIfNullOrEmpty(Path); | ||
return Task.Run(() => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't need to be wrapped in a task since it's not running as part of a web request
Response.Headers.ContentDisposition = "attachment;filename=\"ldml.zip\""; // TODO: Put timestamp in filename, or use the filename that PrepareLdmlZip returns once it has a timestamp in it | ||
Response.ContentType = "application/zip"; | ||
Response.StatusCode = 200; | ||
await Response.SendFileAsync(path, token); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you return File
then you don't need to handle status codes or content disposition manually. You'll have to change the return type of this method
@@ -232,6 +232,17 @@ private async Task StreamHttpResponse(HttpContent hgResult) | |||
await hgResult.CopyToAsync(writer.AsStream()); | |||
} | |||
|
|||
[HttpGet("getLdmlZip")] // TODO: Discuss endpoint name, and whether it should be GET or POST, at next opportunity | |||
[AdminRequired] // TODO: Decide on permissions, because we don't want everyone triggering this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
admin only is a good choice, it won't be exposed in the UI so I'm not worried about our admins executing it
@@ -232,6 +232,17 @@ private async Task StreamHttpResponse(HttpContent hgResult) | |||
await hgResult.CopyToAsync(writer.AsStream()); | |||
} | |||
|
|||
[HttpGet("getLdmlZip")] // TODO: Discuss endpoint name, and whether it should be GET or POST, at next opportunity |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a GET is fine since it's not actually modifying data. maybe sldr-export
?
Fixes #1306.
To test locally, you'll need to get a project with
addToSldr="true"
in its writing system configuration. None of our default test projects have that, but you can add that with the following method:kubectl --context=docker-desktop exec -it deploy/hg -- bash
cd /var/hg/repos/s/sena-3
or/e/elawa-dev-flex
hg co tip
to populate the repo's working directoryCachedSettings/SharedSettings/LexiconSettings.plsx
file<WritingSystems>
line to<WritingSystems addToSldr="true">
hg commit -m "Add to SLDR"
hg config --edit
to set your usernamehg co null
to empty the working directory now that the commit is in the repoIf all went well, the zipfile should contain a folder named for a project ID at the root. If you get a 22-byte zip file that's empty, then something went wrong.
Currently the directory structure for each project looks like this:
{project ID}/CachedSettings/WritingSystemStore/lots of .ldml files
Do we want to change that to just this?
{project ID}/lots of .ldml files
Discussion items
CachedSettings/WritingSystemStore/
?