Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvs-watch: support RAW option? #6527

Open
chu11 opened this issue Dec 19, 2024 · 3 comments
Open

kvs-watch: support RAW option? #6527

chu11 opened this issue Dec 19, 2024 · 3 comments

Comments

@chu11
Copy link
Member

chu11 commented Dec 19, 2024

Per comments in #6523 (comment), the treeobj_create_val() can be quite costly for large amounts of data, especially due to the base64 encoding that goes on internally.

In some cases, return the raw value instead of treeobject might be warranted for performance and convenient. FLUX_KVS_STREAM might be such a case, since it is looking up individual blob refs rather than the true "combined" value from the KVS.

@chu11
Copy link
Member Author

chu11 commented Jan 6, 2025

In some cases, return the raw value instead of treeobject might be warranted for performance and convenient. FLUX_KVS_STREAM might be such a case, since it is looking up individual blob refs rather than the true "combined" value from the KVS.

Was thinking about this / brainstorming ideas:

  • b/c JSON doesn't support binary data, the base64 of the data probably can't be removed / skipped in the general sense
  • per RFC24 stdio eventlogs can be encoded w/ UTF-8 (by default, or on a per-eventlog entry basis)
  • perhaps when we happen to know an eventlog has data encoded w/ UTF-8 there could be a flag indicating to send this data over raw / not in a treeobj, allowing us to improve performance
  • but we are in kvs-watch land, and there is no knowledge of the data encoding (not to mention if the data is even an eventlog entry)
  • idea: perhaps job-info could send flags indicating if kvs-watch could try and send the data "raw" when it knows the default encoding for the stdio stream is utf-8.
    • this would hopefully be the common case and in the rare case some eventlog entry line is non-utf-8, that can be sent out base64-encoded via a treeobj like before
    • the problem of course is how to differentiate this "raw" output response from a normal response

@chu11
Copy link
Member Author

chu11 commented Jan 6, 2025

hmmm, alternately, it might be far easier if "raw" could be supported through a tweak in the treeobject format. i.e. could treeobj_create_val() take a "encoding" option (and similarly store how it was encoded). By default it could still always be base64, but perhaps under special circumstances don't do encoding.

This way protocol wise, almost nothing has to change. Only internals of parsing the treeobject.

@chu11
Copy link
Member Author

chu11 commented Jan 8, 2025

did some prototypes

prototype 1 - I tried to hack non-base64 into a treeobj value, effectively through tricks like this:

+    if (data && len) {
+        if (!(obj = json_pack ("{s:i s:s s:s# s:b}",
+                               "ver", treeobj_version,
+                               "type", "val",
+                               "data", data, len,
+                               "base64", false))) {
+            goto try_again;
+        }
+    }
+    else {
+    try_again:
+        xlen = base64_encoded_length (len) + 1; /* +1 for NUL */
+        if (!(xdata = malloc (xlen)))
+            goto done;
+        if (base64_encode (xdata, xlen, (char *)data, len) < 0)
+            goto done;
+        if (!(obj = json_pack ("{s:i s:s s:s s:b}",
+                               "ver", treeobj_version,
+                               "type", "val",
+                               "data", xdata,
+                               "base64", true))) {
+            errno = ENOMEM;
+            goto done;
+        }
     }

ended up giving up on the prototype, too many subtle gotchas throughout the code. I'm sure I could solve them if I put in more effort, but I think it reached a point of "not meant to be"

prototype 2 - when using FLUX_KVS_STREAM did a hack to send the data "raw" on the RPC response instead of creating a treeobj object

--- a/src/modules/kvs-watch/kvs-watch.c
+++ b/src/modules/kvs-watch/kvs-watch.c
@@ -265,7 +265,18 @@ static void handle_load_response (flux_future_t *f, struct watcher *w)
     }
 
     if (!w->mute) {
-        json_t *val = treeobj_create_val (data, size);
+        json_t *val;
+        if (data && size && (w->flags & FLUX_KVS_STREAM)) {
+            if (flux_respond_pack (h, w->request, "{ s:s# }", "raw", data, size) < 0) {
+                flux_log_error (h,
+                                "%s: failed to respond to kvs-watch.lookup",
+                                __FUNCTION__);
+                goto try_again;
+            }
+            goto out;
+        }
+try_again:
+        val = treeobj_create_val (data, size);

and some hackery had to be done in the libkvs/kvs_lookup side as well, but this worked

doing some

flux run flux lptest 2000000 80
time flux job attach `flux job last` > /dev/null

tests, the average time on corona went from 10.6 seconds to 9.6 seconds, so about a 9% improvement. So not bad.

The above is slightly flawed in that it assumes that the data being sent "raw" is a safe transmittable utf-8 string and the other side assumes whatever it is reading is an normal string. In some circumstances, such as eventlogs, I think this is a safe assumption?

I think this may require a tad more thought, but I think this has some promise

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant