Allow a wider range of characters #27
Comments
Good idea! We could use Also, maybe we should limit key length?
PR #28 covers 2 specific parts of this issue:
Just as an FYI: if #59 is eventually going to be merged, the set of safe characters for NATS is more limited: https://github.com/nats-io/nats.go/blob/9e5e70676ec02a86975d09193c6dd0a95450800c/kv.go#L331
Summary of a discussion on this topic with @ahouene:
An application SHOULD limit key names to a small set of characters that are expected to be safe for any backend. This is probably the set
The safe set is not enforced on the simpleblob level.
An application MAY use any character that a specific backend actually allows, but this use may not be portable to other backends. Particular care should be taken with
A backend SHOULD support the basic safe set. It MAY support any characters. It MUST make sure that the use of a wider range of characters does not lead to security or operational issues.
Simpleblob does not enforce a universal limit on the length of a key. A backend MAY limit the allowed length of names.
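To make the "safe set" idea concrete, here is a minimal sketch of an application-side check. The exact set agreed in the discussion is not shown in this thread, so the set used here (ASCII letters and digits plus ".", "-" and "_", i.e. the characters simpleblob currently allows) is an assumption, and `IsSafeName` is a hypothetical helper, not part of simpleblob. In line with the summary, the check lives in the application rather than in the library.

```go
package main

import "fmt"

// isSafeChar reports whether c belongs to the ASSUMED safe set:
// ASCII letters, digits, '.', '-' and '_'.
func isSafeChar(c byte) bool {
	switch {
	case c >= 'a' && c <= 'z', c >= 'A' && c <= 'Z', c >= '0' && c <= '9':
		return true
	case c == '.', c == '-', c == '_':
		return true
	}
	return false
}

// IsSafeName is a hypothetical helper that reports whether name is
// non-empty and consists only of characters from the assumed safe set.
// Bytes of multi-byte UTF-8 sequences are never in the set, so any
// non-ASCII name is reported as not portable.
func IsSafeName(name string) bool {
	if name == "" {
		return false
	}
	for i := 0; i < len(name); i++ {
		if !isSafeChar(name[i]) {
			return false
		}
	}
	return true
}

func main() {
	for _, name := range []string{"config-v5.json", "v5/config.json", "naïve.txt"} {
		fmt.Printf("%q portable: %v\n", name, IsSafeName(name))
	}
}
```

A name that fails this check may still be accepted by a specific backend; it is just not expected to be portable.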
Currently simpleblob is quite restrictive about the characters that are allowed in names, constraining users to alphanumeric names with a few special chars (".", "-", "_"). The constraint primarily follows from using unescaped filenames in the `fs` backend.
We have a use case where we want to use a version-specific prefix within a bucket (e.g. "v5/"), and also to be able to write a program that can discover all versions in use. This requires listing blobs with "/" in the name.
Amazon writes the following about S3 limitations: https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html
Basically they allow any UTF-8 character in the name, with a few caveats and exceptions.
Ideally we would like to allow as wide a range of names as possible, while also constraining them to names that any current and future backend can safely support. These are conflicting goals. Perhaps we should use the same approach as the S3 documentation: define and recommend 'safe characters' that must work with any backend, while allowing almost all characters.
Potential solution
- Reject non-canonical names such as `../../foo///bar`.
- Add escaping to the `fs` backend to make the allowed names safe.
Escaping and validation algorithm for `fs`:
- Reject any `\` in the name.
- Use `path.Clean(name)` to check for non-canonical paths. Reject if the output differs from the input.
- Reject names that start with `..`, because it could be something like `../../../etc/passwd` (`path.Clean` will not touch this).
- Use `url.QueryEscape(name)` to escape unsafe characters.
- If the name starts with `.`, replace that character by `%2E` to avoid hidden files on UNIX.
The validation and escaping functions can be exposed for other backends to reuse. The validation function should be called by every backend; the escaping function is optional.
There is another issue with special reserved device names on Windows. Go 1.20 introduces a new `IsLocal` function to check for these, but I don't think we want to depend on this, and it's only available in `filepath`. Perhaps always prepend `_` to the filename to avoid this? This would also solve the UNIX hidden files issue, but be a breaking change, and it could be useful for the fs backend to produce unescaped files when restricting oneself to safe characters.
Cc @ahouene @nvaatstra