Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[minor] Add String Type, Format String Placeholder #45

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

seldridge
Copy link
Member

@seldridge seldridge commented Sep 19, 2022

Add a string type to the FIRRTL spec. This is intended to clarify the
type of an external module's string parameter as well as the type of the
argument to certain operations that expect a literal
string. (Previously, this was effectively unspecified.)

Signed-off-by: Schuyler Eldridge [email protected]

spec.md Outdated Show resolved Hide resolved
@dtzSiFive
Copy link
Member

Awesome! Few comments/observations:

Should this type be mentioned in type equivalence section(s)?

What happens if a String wire is invalidated?

Can we have a register of type String, or a memory? (probably not?)

Wires and other constructs of type String seem useful for plumbing strings to operations that consume them-- like an argument to printf or maybe used in an assert (which seems missing from the grammar?). What does it mean for a circuit if there's a wire of type String (assuming not optimized/constant-prop'd away)?

Verifying printf format strings is important for security/bugs in the C/software world, maybe our example shouldn't encourage this pattern for same reasons?
Making printf require a string literal is maybe too restrictive (?) but if you don't know the format string you probably don't know what arguments to pass in either so beyond possibly problematic doesn't seem especially useful. Example of using this "String" could instead be an assert or using the String as an argument to a format string containing the "%s"--two birds with one stone.
Also, the grammar mentions printf takes a "string" (by the way, not specified like "int" is) for the format string, not a "String" (or other "type"), which I think makes the example as-is invalid. This is the same "string" parameters currently take, so resolving this isn't a side issue.

Should the syntax for a string literal also be specified? What characters are supported (regardless of encoding used)-- presumably at least printable ASCII is required to be supported? Maybe it's best to not grapple with too much of this (UTF-8 all the things) to accommodate as many use cases as possible.

https://github.com/chipsalliance/firrtl-spec/blob/main/spec.md#unsigned-integers-from-literal-bits mentions using a string to specify a UInt, which we probably don't want to support (may require runtime handling of the string). More generally the spec+grammar and other operations mention "string" not just the parameter type this is introduced for. I guess this is same as earlier comment-- let's be clear about where "string" (which seems to mean quoted sequence of characters, aka a simple conventional string literal) is allowed and where "String" (this new ground type) is allowed.

@ekiwi
Copy link
Contributor

ekiwi commented Sep 19, 2022

It additionally enables all existing FIRRTL
constructs (wires, ports, etc.) and aggregate constructions of them to
include String types.

If you can have a wire of type string, does that mean that you could have conditional assignments to these String type wires? How would that translate to verilog if the format string is different depending on a when condition?

@seldridge
Copy link
Member Author

I'm intending to roll back the scope of this to narrowly define the String type, define a String literal (which already implicitly exists), and make it clear that this is the type of the existing parameter foo = "hello" usage in the current external module definition. I really only want to lay the groundwork for #46. I then want this to be sufficient to start talking about more complicated parameters that are used for modules and not just extmodules.

Thanks for the great comments.

(I got a little out over my skis here.)

Answering questions:

Should this type be mentioned in type equivalence section(s)?

If it is defined for connections, then yes.

What happens if a String wire is invalidated?

This is a great question. Given existing FIRRTL semantics, this is likely an indeterminate value, i.e., the compiler can choose any value. 😬 This sounds idiotic, so it's likely that we don't do this. A prerequisite to disallowing this is to change is invalidate to be a value of invalid type. We can then disallow any usage of an invalid type where a String is expected.

Can we have a register of type String, or a memory? (probably not?)

Megan was asking about this, too. I originally thought no. An analog type fits into this category (and intended usage) of allowable in a wire because it's only passed around, but never put into a register. However, I think this just requires some more thinking. I can see a weird utility in being able to register a string. That would, however, require defining both an encoding an a length for a String type which is pretty far outside of the intended usage.

The real question here is does a String behave like any other FIRRTL type that lowers to some hardware construct or is it a non-hardware construct that behaves differently? Is this only used to customize format strings and for encoding parameterization or is it more broadly used?

Since I don't have a broad use case, I should keep this narrow to just sufficient for parameters.

Wires and other constructs of type String seem useful for plumbing strings to operations that consume them-- like an argument to printf or maybe used in an assert (which seems missing from the grammar?). What does it mean for a circuit if there's a wire of type String (assuming not optimized/constant-prop'd away)?

We need some mechanism to pass these things around. I pulled the printf case out of thin air. This really matters for parameters. The simpler case is to allow String type to show up in nodes and to disallow String type in aggregates. This is sufficient for passing String type parameters to instances and leaves the door open for operations involving (e.g., cat(String, String)) or allowing conversion of String to other FIRRTL types (e.g., asUInt(String, Encoding)).

It is not initially necessary to allow for String to show up in any construct as long as we can propagate them along by value. It doesn't show up in #46, but I envision something like:

extmodule Foo<parameter a: UInt<32>>:

module Bar<parameter a: UInt<32>>:

  inst foo of Foo<a>

This follows from #46.

@darthscsi has brought up the distinction between types which are hardware (he called them "synth") vs. those which are metaprogramming. The intent of this proposal is that strings are only used for the latter. However, I can see the eventual need to wrestle with this. UInt will need to have the same distinction. We may need explicit Param<String> boxing or something similar to keep the different kinds of types isolated.

Verifying printf format strings is important for security/bugs in the C/software world, maybe our example shouldn't encourage this pattern for same reasons?

Again: I'm out over my skis. The point is totally valid and I agree that we don't want to be doing this right now. I don think it's totally reasonable to add the %s format string. (We're in agreement with the latter point.)

Also, the grammar mentions printf takes a "string" (by the way, not specified like "int" is) for the format string, not a "String" (or other "type"), which I think makes the example as-is invalid. This is the same "string" parameters currently take, so resolving this isn't a side issue.

Yeah, we should clarify places where it can take a String literal versus a String. All current places where the spec says String it means String literal. After we have parameters in modules, that's the first time that we could do something with an actual String type as opposed to just a String literal, e.g.:

module Baz<parameter name: String>:
  input clock: Clock
  input en: UInt<1>

  printf(clock, en, "My name is: %s", name)

Should the syntax for a string literal also be specified? What characters are supported (regardless of encoding used)-- presumably at least printable ASCII is required to be supported? Maybe it's best to not grapple with too much of this (UTF-8 all the things) to accommodate as many use cases as possible.

Yes, we should define what a string literal is.

Supported characters seems tough. I think it's fine to leave this as the existing language I added around "the encoding of a string is implementation defined". It is perhaps better to say, "An implementation must accepts ASCII-encoded strings. An implementation may accept any other string encoding." I'm not totally sure how to handle this in a good way.

https://github.com/chipsalliance/firrtl-spec/blob/main/spec.md#unsigned-integers-from-literal-bits mentions using a string to specify a UInt, which we probably don't want to support (may require runtime handling of the string). More generally the spec+grammar and other operations mention "string" not just the parameter type this is introduced for. I guess this is same as earlier comment-- let's be clear about where "string" (which seems to mean quoted sequence of characters, aka a simple conventional string literal) is allowed and where "String" (this new ground type) is allowed.

Yes. This should only be a String literal.

If you can have a wire of type string, does that mean that you could have conditional assignments to these String type wires? How would that translate to verilog if the format string is different depending on a when condition?

I'll rollback this PR to not include this. However, it's reasonable to think through some consequences of this. (It's interesting!)

Because this is implicitly now a hardware string, then it would be equivalent to a vector of some size that is a function of the length and encoding. It would then inherit the same properties of any other wire. Actually using this as the printf string would be bad (for reasons that @dtzSiFive brought up around you don't know how many arguments it takes). This motivates clearly separating what is a String literal vs. any String type. Using any String type for a format string value seems totally okay, e.g.:

wire foo: String
when cond :
  foo <= "hello"
else :
  foo <= "world"

printf(clock, en, "%s", foo)

Allowing this to be used as the actual printf string could lead to runtime errors (or invalid prints) and should be avoided.

(And reiterating that I don't want to push on this right now!)

@seldridge seldridge force-pushed the dev/seldridge/string-type branch 2 times, most recently from 85a228b to 26db283 Compare September 20, 2022 03:47
Add a string type to the FIRRTL spec.  This is intended to clarify the
type of an external module's string parameter as well as the type of the
argument to certain operations that expect a literal
string.  (Previously, this was effectively unspecified.)

Signed-off-by: Schuyler Eldridge <[email protected]>
@seldridge
Copy link
Member Author

This is now relaxed to only add a String type and describe a String literal. This is narrowly a patch change as opposed to actually defining any new behavior.

Thanks again for all the feedback.

spec.md Show resolved Hide resolved
@ekiwi
Copy link
Contributor

ekiwi commented Sep 20, 2022

has brought up the distinction between types which are hardware (he called them "synth") vs. those which are metaprogramming.

How do you plan to ensure that all expressions involving metaprogramming types can be resolved at compile time? Are you planning to introduce something along the lines of constexpr in C++?

@@ -2867,7 +2901,8 @@ width = "<" , int , ">" ;
binarypoint = "<<" , int , ">>" ;
type_ground = "Clock" | "Reset" | "AsyncReset"
| ( "UInt" | "SInt" | "Analog" ) , [ width ]
| "Fixed" , [ width ] , [ binarypoint ] ;
| "Fixed" , [ width ] , [ binarypoint ]
| "String" ;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we remove this from type_ground so that we aren't confusing whether a Reg can hold a String and whatnot? it seems like we don't need it except for parameter and printf in this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants