Skip to content
This repository has been archived by the owner on Aug 14, 2020. It is now read-only.

spec: Define image format string #586

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

sgotti
Copy link
Member

@sgotti sgotti commented Mar 29, 2016

This patch tries to define image format string (see #479). It tries to handle label values being any kind of string. To do this it requires it to be URL escaped (instead of inventing another escape format). This will break some implementation (see below).

If image tags #584 lands the required change will be to replace the :versionvalue with :tag (URL encoded).

I'm open to any suggestion for other ways to escape the label value.

spec: Define image format string

Add an "extra" spec section for image format string

This format is actually used by the appc spec golang reference
implementation and other tools (like rkt) but it has not been defined.

There're also various unclear cases in the actual implementationt that
this definition tries to clarify, but this is going to change some
behaviors.

The main problem is related to the label values. Since labels are
strings, they can contain any character (also special character like
carriage return, tabs etc...).

With the current implementation there are two main issues related to
label value being strings:

  • label value containing , : and = cannot be used inside the image format
    string. There's the need to find a way to escape them.
  • Many special character (like non printable characters) are difficult
    to represent in the actual format.

Instead name and label name are of type ACIdentifier, so they can
contain only a specific set of character and in this set ,, : and
= aren't included.

Instead of adding another ad hoc escape rule this spec requires that the
labels value must be URL escaped (percent encoding) with UTF-8 encoding.

This will break some current implementations:

For example rkt stage1 has a version like 1.2.1+gite568957-dirty and
with the current image format parsing it works. Now, with the need to URL
escape the labels values it should be expressed as
1.2.1%2Bgite568957-dirty (since passing 1.2.1+gite568957-dirty will
be interpreted as 1.2.1 gite568957-dirty).

Add an "extra" spec section for image format string

This format is actually used by the appc spec golang reference
implementation and other tools (like rkt) but it has not been defined.

There're also various unclear cases in the actual implementationt that
this definition tries to clarify, but this is going to change some
behaviors.

The main problem is related to the label values. Since labels are
strings, they can contain any character (also special character like
carriage return, tabs etc...).

With the current implementation there are two main issues relatd to
label value being strings:
* label value containing `,` `:` and `=` cannot be used inside the image format
string. There's the need to find a way to escape them.

* Many special character (like non printable characters) are difficult
to represent in the actual format.

Instead name and label name are of type ACIIdentifier, so they can
contain only a specific set of character and in this set `,`, `:` and
`=` aren't included.

Instead of adding another ad hoc escape rule this spec requires that the
labels value must be URL escaped (percent encoding) with UTF-8 encoding.

This will break some current implementations:

For example rkt stage1 has a version like `1.2.1+gite568957-dirty` and
with the current image format parsing it works. Now, with the need to URL
escape the labels values it should be expressed as
`1.2.1%2Bgite568957-dirty` (since passing `1.2.1+gite568957-dirty` will
be interpreted as `1.2.1 gite568957-dirty`).
@sgotti
Copy link
Member Author

sgotti commented Mar 29, 2016

/cc @jonboulle @krnowak @iaguis since they worked on image string parsing.

@jonboulle
Copy link
Contributor

I like this clean-up a lot, but the breakage is quite unpalatable :/. Seems like we dug ourselves a hole by allowing values to be arbitrary. Hmm.

@sgotti
Copy link
Member Author

sgotti commented Mar 30, 2016

@jonboulle I agree.

If it'll possible to change the label value format (I don't know how many ACIs will break) we could:

  • Define only a set of accepted chars (only printable? how to specify all the possible chars in the code to check them?) excluding the single quote '.
  • Accept single quoted label values.
  • Require the use of single quotes around label values when they'll contain commas, colons or equals.

additionally, in the meantime I'll also limit the label value size to max N chars (or bytes?). This will be useful for labels printing and to avoid exceeding max argument sizes and max URL sizes for fetching.

@sgotti
Copy link
Member Author

sgotti commented Apr 1, 2016

@jonboulle with ACString in #590 one idea will be to just force ,, : and = to be escaped with \ (\\ to get a \) without quoting since it will give some problems when used inside a shell (inside a shell an user can just single quote the whole app string and use \ to escape the reserved app string chars).

@sgotti
Copy link
Member Author

sgotti commented Apr 12, 2016

If you have some thoughts on #590 I will update this to match the ACString type and the escaping rules described in the above comment.

@lucab
Copy link
Contributor

lucab commented Aug 23, 2016

I think it may be also worth picking a schema prefix (appc://?) and specifying that "schema+label" is the canonical way to represent an appc discovery endpoint wherever disambiguation is needed.

@sgotti
Copy link
Member Author

sgotti commented Aug 23, 2016

@lucab Thanks for respawning this.

I think the scheme will be a really good thing (to distinguish between similar discovery strings like the possible future OCI image spec one). There's the question of keeping backward compatibility with the current schemeless and undefined format since it's used.

I'll stay with an URI instead of an URL (or both appc: and appc://) since the discovery string format is between a location identifier (but I don't consider it a primary access mechanism because it's not a transport) and a name identifier (for the same reasons explained in rkt/rkt#2953).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants