Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds support for qualified refs to $ion, $ion_encoding #830

Merged
merged 7 commits into from
Sep 7, 2024
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 11 additions & 5 deletions src/lazy/any_encoding.rs
Original file line number Diff line number Diff line change
Expand Up @@ -277,6 +277,15 @@ pub enum AnyEExpArgGroupKind<'top> {
Binary_1_1(BinaryEExpArgGroup<'top>),
}

impl<'top> AnyEExpArgGroupKind<'top> {
fn encoding(&self) -> &ParameterEncoding {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Adding this method and then calling it below was a workaround for a lifetime constraint.

match self {
AnyEExpArgGroupKind::Text_1_1(g) => g.encoding(),
AnyEExpArgGroupKind::Binary_1_1(g) => g.encoding(),
}
}
}

impl<'top> HasRange for AnyEExpArgGroup<'top> {
fn range(&self) -> Range<usize> {
match self.kind {
Expand Down Expand Up @@ -353,11 +362,8 @@ impl<'top> Iterator for AnyEExpArgGroupIterator<'top> {
impl<'top> EExpressionArgGroup<'top, AnyEncoding> for AnyEExpArgGroup<'top> {
type Iterator = AnyEExpArgGroupIterator<'top>;

fn encoding(&self) -> ParameterEncoding {
match self.kind {
AnyEExpArgGroupKind::Text_1_1(g) => g.encoding(),
AnyEExpArgGroupKind::Binary_1_1(g) => g.encoding(),
}
fn encoding(&self) -> &ParameterEncoding {
self.kind.encoding()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ The callsite mentioned above.

}

fn resolve(self, context: EncodingContextRef<'top>) -> ArgGroup<'top, AnyEncoding> {
Expand Down
20 changes: 13 additions & 7 deletions src/lazy/binary/encoded_value.rs
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
use crate::lazy::binary::raw::type_descriptor::Header;
use crate::lazy::binary::raw::v1_1::immutable_buffer::AnnotationsEncoding;
use crate::lazy::expanded::template::ParameterEncoding;
use crate::lazy::binary::raw::v1_1::value::BinaryValueEncoding;
use crate::IonType;
use std::ops::Range;

Expand Down Expand Up @@ -41,7 +41,7 @@ impl EncodedHeader for Header {
/// without re-parsing its header information each time.
#[derive(Clone, Copy, Debug, PartialEq)]
pub(crate) struct EncodedValue<HeaderType: EncodedHeader> {
pub(crate) encoding: ParameterEncoding,
pub(crate) encoding: BinaryValueEncoding,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Previously, ParameterEncoding was a Copy type:

#[derive(Debug, Copy, Clone, PartialEq)]
pub enum ParameterEncoding {
/// A 'tagged' type is one whose binary encoding begins with an opcode (sometimes called a 'tag'.)
Tagged,
FlexUInt,
// TODO: tagless types, including fixed-width types and macros
}

Starting with this PR (which stubs out macro-shaped parameter support), parameters needed the ability to hold a reference to a Macro that would last as long as necessary. To achieve this, a new variant was added: MacroShaped(Rc<Macro>).

This meant that ParameterEncoding was no longer Copy. However, the only place it was used that required it to be Copy was the EncodedValue struct, which was using it as a proxy for "encoding backing this binary value". Since binary values can only be backed by value encodings (not macros), it made sense to introduce a new type for that use case.

// If the compiler decides that a value is too large to be moved/copied with inline code,
// it will relocate the value using memcpy instead. This can be quite slow by comparison.
//
Expand Down Expand Up @@ -90,8 +90,6 @@ pub(crate) struct EncodedValue<HeaderType: EncodedHeader> {
pub annotations_encoding: AnnotationsEncoding,
// The offset of the type descriptor byte within the overall input stream.
pub header_offset: usize,
// If this value was written with a tagless encoding, this will be 0. Otherwise, it's 1.
pub opcode_length: u8,
// The number of bytes used to encode the optional length VarUInt following the header byte.
pub length_length: u8,
// The number of bytes used to encode the value itself, not including the header byte
Expand Down Expand Up @@ -130,6 +128,15 @@ impl<HeaderType: EncodedHeader> EncodedValue<HeaderType> {
start..end
}

/// Returns the number of bytes used to encode this value's opcode. If this value was serialized
/// using a tagless encoding, returns `0`.
pub fn opcode_length(&self) -> usize {
match self.encoding {
BinaryValueEncoding::Tagged => 1,
_ => 0,
}
}

Comment on lines +131 to +139
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather that setting/storing a number of opcode bytes on every value, we can derive whether an opcode byte was present. Fixes #805.

/// Returns the number of bytes used to encode this value's data.
/// If the value can fit in the type descriptor byte (e.g. `true`, `false`, `null`, `0`),
/// this function will return 0.
Expand Down Expand Up @@ -262,14 +269,14 @@ mod tests {
use crate::lazy::binary::encoded_value::EncodedValue;
use crate::lazy::binary::raw::type_descriptor::Header;
use crate::lazy::binary::raw::v1_1::immutable_buffer::AnnotationsEncoding;
use crate::lazy::expanded::template::ParameterEncoding;
use crate::lazy::binary::raw::v1_1::value::BinaryValueEncoding;
use crate::{IonResult, IonType};

#[test]
fn accessors() -> IonResult<()> {
// 3-byte String with 1-byte annotation
let value = EncodedValue {
encoding: ParameterEncoding::Tagged,
encoding: BinaryValueEncoding::Tagged,
header: Header {
ion_type: IonType::String,
ion_type_code: IonTypeCode::String,
Expand All @@ -279,7 +286,6 @@ mod tests {
annotations_sequence_length: 1,
annotations_encoding: AnnotationsEncoding::SymbolAddress,
header_offset: 200,
opcode_length: 1,
length_length: 0,
value_body_length: 3,
total_length: 7,
Expand Down
5 changes: 2 additions & 3 deletions src/lazy/binary/immutable_buffer.rs
Original file line number Diff line number Diff line change
Expand Up @@ -11,12 +11,12 @@ use crate::lazy::binary::encoded_value::EncodedValue;
use crate::lazy::binary::raw::r#struct::LazyRawBinaryFieldName_1_0;
use crate::lazy::binary::raw::type_descriptor::{Header, TypeDescriptor, ION_1_0_TYPE_DESCRIPTORS};
use crate::lazy::binary::raw::v1_1::immutable_buffer::AnnotationsEncoding;
use crate::lazy::binary::raw::v1_1::value::BinaryValueEncoding;
use crate::lazy::binary::raw::value::{LazyRawBinaryValue_1_0, LazyRawBinaryVersionMarker_1_0};
use crate::lazy::decoder::LazyRawFieldExpr;
use crate::lazy::encoder::binary::v1_1::flex_int::FlexInt;
use crate::lazy::encoder::binary::v1_1::flex_uint::FlexUInt;
use crate::lazy::encoding::BinaryEncoding_1_0;
use crate::lazy::expanded::template::ParameterEncoding;
use crate::result::IonFailure;
use crate::{Int, IonError, IonResult, IonType};

Expand Down Expand Up @@ -704,14 +704,13 @@ impl<'a> ImmutableBuffer<'a> {
}

let encoded_value = EncodedValue {
encoding: ParameterEncoding::Tagged,
encoding: BinaryValueEncoding::Tagged,
header,
// If applicable, these are populated by the caller: `read_annotated_value()`
annotations_header_length: 0,
annotations_sequence_length: 0,
annotations_encoding: AnnotationsEncoding::SymbolAddress,
header_offset,
opcode_length: 1,
length_length,
value_body_length: value_length,
total_length,
Expand Down
5 changes: 4 additions & 1 deletion src/lazy/binary/raw/v1_1/e_expression.rs
Original file line number Diff line number Diff line change
Expand Up @@ -306,6 +306,9 @@ impl<'top> Iterator for BinaryEExpArgsInputIter<'top> {
EExpArg::new(parameter, EExpArgExpr::ValueLiteral(value_ref)),
remaining,
)
}
ParameterEncoding::MacroShaped(_macro_ref) => {
todo!("macro-shaped parameter encoding")
} // TODO: The other tagless encodings
},
// If it's an argument group...
Expand Down Expand Up @@ -448,7 +451,7 @@ impl<'top> IntoIterator for BinaryEExpArgGroup<'top> {
impl<'top> EExpressionArgGroup<'top, BinaryEncoding_1_1> for BinaryEExpArgGroup<'top> {
type Iterator = BinaryEExpArgGroupIterator<'top>;

fn encoding(&self) -> ParameterEncoding {
fn encoding(&self) -> &ParameterEncoding {
Comment on lines -451 to +454
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ ParameterEncoding is no longer Copy, so accessors hand out a reference instead of a copy.

self.parameter.encoding()
}

Expand Down
10 changes: 4 additions & 6 deletions src/lazy/binary/raw/v1_1/immutable_buffer.rs
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ use crate::lazy::binary::raw::v1_1::e_expression::{
};
use crate::lazy::binary::raw::v1_1::r#struct::LazyRawBinaryFieldName_1_1;
use crate::lazy::binary::raw::v1_1::value::{
DelimitedContents, LazyRawBinaryValue_1_1, LazyRawBinaryVersionMarker_1_1,
BinaryValueEncoding, DelimitedContents, LazyRawBinaryValue_1_1, LazyRawBinaryVersionMarker_1_1,
};
use crate::lazy::binary::raw::v1_1::{Header, LengthType, Opcode, OpcodeType, ION_1_1_OPCODES};
use crate::lazy::decoder::{LazyRawFieldExpr, LazyRawValueExpr, RawValueExpr};
Expand All @@ -21,7 +21,6 @@ use crate::lazy::encoder::binary::v1_1::flex_int::FlexInt;
use crate::lazy::encoder::binary::v1_1::flex_sym::{FlexSym, FlexSymValue};
use crate::lazy::encoder::binary::v1_1::flex_uint::FlexUInt;
use crate::lazy::expanded::macro_table::MacroRef;
use crate::lazy::expanded::template::ParameterEncoding;
use crate::lazy::expanded::EncodingContextRef;
use crate::lazy::text::raw::v1_1::arg_group::EExpArgExpr;
use crate::lazy::text::raw::v1_1::reader::MacroAddress;
Expand Down Expand Up @@ -663,15 +662,13 @@ impl<'a> ImmutableBuffer<'a> {
}

let encoded_value = EncodedValue {
encoding: ParameterEncoding::Tagged,
encoding: BinaryValueEncoding::Tagged,
header,
// If applicable, these are populated by the caller: `read_annotated_value()`
annotations_header_length: 0,
annotations_sequence_length: 0,
annotations_encoding: AnnotationsEncoding::SymbolAddress,
header_offset,
// This is a tagged value, so its opcode length is always 1
opcode_length: 1,
length_length,
value_body_length,
total_length,
Expand Down Expand Up @@ -1544,7 +1541,8 @@ mod tests {

// Construct an encoding directive that defines this number of macros. Each macro will expand
// to its own address.
let mut macro_definitions = String::from("$ion_encoding::(\n (macro_table\n");
let mut macro_definitions =
String::from("$ion_encoding::(\n (macro_table $ion_encoding\n");
Comment on lines -1547 to +1545
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ This PR modified the encoding directives to create a new macro table each time while also supporting re-exports of $ion and $ion_encoding. This meant that several unit tests that used multiple encoding directives saw macro definitions disappear. I've modified those tests to re-export $ion_encoding (that is: do a macro table append) as needed.

for address in MacroTable::FIRST_USER_MACRO_ID..MAX_TEST_MACRO_ADDRESS {
writeln!(macro_definitions, " (macro m{address} () {address})")?;
}
Expand Down
1 change: 1 addition & 0 deletions src/lazy/binary/raw/v1_1/struct.rs
Original file line number Diff line number Diff line change
Expand Up @@ -337,6 +337,7 @@ mod tests {
$ion_encoding::(
(symbol_table $ion_encoding)
(macro_table
$ion_encoding
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Macro table append.

(macro greet (name) (make_string "hello, " name))
)
)
Expand Down
22 changes: 15 additions & 7 deletions src/lazy/binary/raw/v1_1/value.rs
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,6 @@ use crate::lazy::binary::raw::v1_1::sequence::{LazyRawBinaryList_1_1, LazyRawBin
use crate::lazy::binary::raw::value::EncodedBinaryValue;
use crate::lazy::bytes_ref::BytesRef;
use crate::lazy::decoder::{HasRange, HasSpan, RawVersionMarker};
use crate::lazy::expanded::template::ParameterEncoding;
use crate::lazy::expanded::EncodingContextRef;
use crate::lazy::span::Span;
use crate::lazy::str_ref::StrRef;
Expand Down Expand Up @@ -95,6 +94,16 @@ impl<'top> RawVersionMarker<'top> for LazyRawBinaryVersionMarker_1_1<'top> {
}
}

/// Encodings that can back a lazy binary value. Binary 1.0 values are always backed by
/// the `Tagged` variant.
///
/// This is a subset of the encodings in the [`ParameterEncoding`] enum.
#[derive(Debug, Copy, Clone, PartialEq)]
pub enum BinaryValueEncoding {
Tagged,
FlexUInt,
}

#[derive(Debug, Copy, Clone)]
pub struct LazyRawBinaryValue_1_1<'top> {
pub(crate) encoded_value: EncodedValue<Header>,
Expand Down Expand Up @@ -144,7 +153,7 @@ impl<'top> LazyRawValue<'top, BinaryEncoding_1_1> for &'top LazyRawBinaryValue_1
}

fn read(&self) -> IonResult<RawValueRef<'top, BinaryEncoding_1_1>> {
if self.encoded_value.encoding == ParameterEncoding::FlexUInt {
if self.encoded_value.encoding == BinaryValueEncoding::FlexUInt {
let flex_uint = FlexUInt::read(self.input.bytes(), self.input.offset())?;
let int: Int = flex_uint.value().into();
return Ok(RawValueRef::Int(int));
Expand Down Expand Up @@ -188,7 +197,7 @@ impl<'top> LazyRawValue<'top, BinaryEncoding_1_1> for &'top LazyRawBinaryValue_1
&self,
context: EncodingContextRef<'top>,
) -> IonResult<ValueRef<'top, BinaryEncoding_1_1>> {
if self.encoded_value.encoding == ParameterEncoding::FlexUInt {
if self.encoded_value.encoding == BinaryValueEncoding::FlexUInt {
let flex_uint = FlexUInt::read(self.input.bytes(), self.input.offset())?;
let int: Int = flex_uint.value().into();
return Ok(ValueRef::Int(int));
Expand All @@ -211,7 +220,7 @@ impl<'top> LazyRawValue<'top, BinaryEncoding_1_1> for &'top LazyRawBinaryValue_1
value: &'a LazyRawBinaryValue_1_1<'a>,
context: EncodingContextRef<'a>,
) -> IonResult<ValueRef<'a, BinaryEncoding_1_1>> {
if value.encoded_value.encoding == ParameterEncoding::FlexUInt {
if value.encoded_value.encoding == BinaryValueEncoding::FlexUInt {
let flex_uint = FlexUInt::read(value.input.bytes(), value.input.offset())?;
let int: Int = flex_uint.value().into();
return Ok(ValueRef::Int(int));
Expand Down Expand Up @@ -286,7 +295,7 @@ impl<'top> LazyRawBinaryValue_1_1<'top> {
/// a complete `FlexUInt`.
pub(crate) fn for_flex_uint(input: ImmutableBuffer<'top>) -> Self {
let encoded_value = EncodedValue {
encoding: ParameterEncoding::FlexUInt,
encoding: BinaryValueEncoding::FlexUInt,
header: Header {
// It is an int, that's true.
ion_type: IonType::Int,
Expand All @@ -304,7 +313,6 @@ impl<'top> LazyRawBinaryValue_1_1<'top> {
annotations_encoding: AnnotationsEncoding::SymbolAddress,

header_offset: input.offset(),
opcode_length: 0,
length_length: 0,
value_body_length: input.len(),
total_length: input.len(),
Expand Down Expand Up @@ -847,7 +855,7 @@ impl<'top> LazyRawBinaryValue_1_1<'top> {

impl<'top> EncodedBinaryValue<'top, BinaryEncoding_1_1> for &'top LazyRawBinaryValue_1_1<'top> {
fn opcode_length(&self) -> usize {
self.encoded_value.opcode_length as usize
self.encoded_value.opcode_length()
}

fn length_length(&self) -> usize {
Expand Down
2 changes: 1 addition & 1 deletion src/lazy/binary/raw/value.rs
Original file line number Diff line number Diff line change
Expand Up @@ -225,7 +225,7 @@ pub trait EncodedBinaryValue<'top, D: Decoder>: LazyRawValue<'top, D> {

impl<'top> EncodedBinaryValue<'top, BinaryEncoding_1_0> for LazyRawBinaryValue_1_0<'top> {
fn opcode_length(&self) -> usize {
self.encoded_value.opcode_length as usize
self.encoded_value.opcode_length()
}

fn length_length(&self) -> usize {
Expand Down
2 changes: 1 addition & 1 deletion src/lazy/encoder/writer.rs
Original file line number Diff line number Diff line change
Expand Up @@ -69,7 +69,7 @@ impl<E: Encoding, Output: Write> Writer<E, Output> {
// TODO: LazyEncoder should define a method to construct a new symtab and/or macro table
let ion_version = E::ion_version();
let symbol_table = SymbolTable::new(ion_version);
let macro_table = MacroTable::new();
let macro_table = MacroTable::with_system_macros();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ MacroTable now has a with_system_macros() method (that pre-populates it with system macro aliases) and an empty() method that constructs a new instance with no macros at all. The latter is used for processing encoding directives.

let context = WriterContext::new(symbol_table, macro_table);
let mut writer = Writer {
context,
Expand Down
Loading
Loading