Feature/expr parser #21

ArnaudBger · 2024-05-08T13:43:02Z

Add an expression parser in the substreams repo
Add a new pb file for index keys

maoueh · 2024-05-08T14:06:29Z

Cargo.toml

@@ -18,6 +18,9 @@ rust-version = "1.60"

 [workspace.dependencies]
 substreams-macro = { version = "0.5.13", path = "./substreams-macro" }
+pest= "2.7.10"
+pest_derive = "2.7.10"
+rstest = "0.19.0" 


Is this just for testing? If so, it should go in the dev-dependencies bucket.

maoueh · 2024-05-08T14:06:43Z

Cargo.toml

@@ -18,6 +18,9 @@ rust-version = "1.60"

 [workspace.dependencies]
 substreams-macro = { version = "0.5.13", path = "./substreams-macro" }
+pest= "2.7.10"


Suggested change

-pest= "2.7.10"
+pest = "2.7.10"

maoueh · 2024-05-08T14:07:34Z

substreams/Cargo.toml

+pest= "2.7.10"
+pest_derive = "2.7.10"
+rstest = "0.19.0" 


Pull from the workspace dependencies set

Suggested change

-pest= "2.7.10"
-pest_derive = "2.7.10"
-rstest = "0.19.0"
+pest = { workspace = true }
+pest_derive = { workspace = true }
+rstest = { workspace = true }

maoueh · 2024-05-08T14:09:36Z

substreams/tests/parser_test.rs

+#[case(TEST_KEYS, "test1    && test2 ", true)]
+#[case(TEST_KEYS, "test1    &&     test6", false)]


How do you validate that this is parsed correctly? It parses ok, but the resulting expression should be checked.

In the Go version, I created a string representation of the tree structure, so it can be verified too.
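As a rough illustration of that idea, here is a minimal sketch of rendering an expression tree as a string so a test can compare it against an expected shape. The `Expr` enum and the bracketed output format are assumptions made for this example; the PR works on pest pairs, and the Go version's actual format is not shown in this thread.

```rust
use std::fmt;

// Hypothetical expression tree, used only to illustrate the idea.
#[derive(Debug, PartialEq)]
enum Expr {
    Key(String),
    And(Vec<Expr>),
    Or(Vec<Expr>),
}

impl fmt::Display for Expr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // A key term prints as itself.
            Expr::Key(k) => write!(f, "{k}"),
            // Composite nodes print their children inside brackets.
            Expr::And(items) => write_joined(f, items, " && "),
            Expr::Or(items) => write_joined(f, items, " || "),
        }
    }
}

// Writes `[a <sep> b <sep> c]` for the given children.
fn write_joined(f: &mut fmt::Formatter<'_>, items: &[Expr], sep: &str) -> fmt::Result {
    write!(f, "[")?;
    for (i, item) in items.iter().enumerate() {
        if i > 0 {
            write!(f, "{sep}")?;
        }
        write!(f, "{item}")?;
    }
    write!(f, "]")
}
```

A test case could then assert on both the rendered tree and the boolean result, which catches precedence or grouping mistakes that a bare `true`/`false` check would miss.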

maoueh · 2024-05-08T14:12:30Z

substreams/src/parser.rs

+use pest::{iterators::Pair, Parser};
+use pest_derive::Parser;
+
+#[derive(Parser)]


Can you check whether it's possible to generate the code and commit it? That would be preferable: users would compile faster, since less time would be spent running the derive macro.

maoueh · 2024-05-08T14:14:00Z

substreams/src/lib.rs

@@ -125,6 +125,7 @@ mod state;

 pub mod key;
 pub mod store;
+pub mod parser;


I don't like that it's called parser; the name doesn't say anything about what is being parsed.

Second, let's not publicly export the module: remove pub. We should only publicly expose the minimum we need for the library to be useful.
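One way to follow this, sketched under assumptions: keep the module private and re-export only the single entry point. The module name `key_expression` and the function `matches_key` are hypothetical placeholders, not names from the PR.

```rust
// Private module: nothing in it is reachable from outside the crate
// unless it is re-exported below.
mod key_expression {
    /// The only item meant for callers (hypothetical name and behavior:
    /// checks a single key term against the given keys).
    pub fn matches_key<K: AsRef<str>>(keys: &[K], term: &str) -> bool {
        let term = normalize(term);
        keys.iter().any(|k| k.as_ref() == term)
    }

    // Private helper: not re-exported, so it stays an implementation detail.
    fn normalize(term: &str) -> &str {
        term.trim()
    }
}

// Re-export the one function that forms the public surface.
pub use key_expression::matches_key;
```

With this layout the module can later be renamed or split without breaking downstream users, because they only ever see the re-exported function.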

maoueh · 2024-05-08T14:15:35Z

substreams/src/parser.rs

+pub fn evaluate_expression(keys: Vec<String>, input: &str) -> bool {
+    let successful_parse = EParser::parse(Rule::expression, input).unwrap();
+    return evaluate_rule(successful_parse.clone().into_iter().next().unwrap(), keys);
+}
+
+fn evaluate_rule(pair: Pair<Rule>,  keys: Vec<String>) -> bool {
+    match pair.as_rule() {
+        Rule::expression => {
+            let inner_pair = pair.into_inner().next().unwrap();
+            return evaluate_rule(inner_pair, keys);
+        }
+        Rule::or => {
+            let mut result = false;
+            for inner_pair in pair.into_inner() {
+                result = result || evaluate_rule(inner_pair, keys.clone());
+            }
+            return result;
+        },
+        Rule::and => {
+            let mut result = true;
+            for inner_pair in pair.into_inner() {
+                result = result && evaluate_rule(inner_pair, keys.clone());
+            }
+            return result;
+        },
+        Rule::value => {
+            let inner_pair = pair.into_inner().next().unwrap();
+            return evaluate_rule(inner_pair, keys);
+        }
+        Rule::keyterm => {
+            return keys.contains(&pair.as_str().to_string());
+        }
+        Rule::singleQuoteKeyTerm => {
+            return keys.contains(&pair.as_str().to_string().replace("'", ""));
+        }
+        Rule::doubleQuoteKeyTerm => {
+            return keys.contains(&pair.as_str().to_string().replace("\"", ""));
+        }
+        _ => {panic!("Unexpected rule encountered")}
+    } 
+}


You need to remove all the unwrap calls here, as well as the panic. Transform the method to return a Result<bool, anyhow::Error>; every error code path should have a .context attached to the error so we know where it came from.
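The shape of that change can be sketched with the standard library only. This is an assumption-laden illustration, not the PR's code: `Node` is a hypothetical stand-in for the pest parse tree, and `Result<bool, String>` with `map_err` stands in for `anyhow::Result<bool>` with `.context(...)`, so the example compiles without extra crates.

```rust
// Hypothetical, simplified stand-in for the pest parse tree.
enum Node {
    Key(String),
    And(Vec<Node>),
    Or(Vec<Node>),
    // Models a rule the evaluator does not know how to handle.
    Unknown(String),
}

// Every failure path returns an error that says where it happened,
// instead of calling unwrap() or panic!().
fn evaluate(node: &Node, keys: &[&str]) -> Result<bool, String> {
    match node {
        Node::Key(k) => Ok(keys.contains(&k.as_str())),
        Node::And(children) => {
            for child in children {
                // With anyhow this would be `.context("evaluating 'and'")?`.
                if !evaluate(child, keys).map_err(|e| format!("evaluating 'and': {e}"))? {
                    return Ok(false);
                }
            }
            Ok(true)
        }
        Node::Or(children) => {
            for child in children {
                if evaluate(child, keys).map_err(|e| format!("evaluating 'or': {e}"))? {
                    return Ok(true);
                }
            }
            Ok(false)
        }
        // Replaces the catch-all panic with a descriptive error.
        Node::Unknown(name) => Err(format!("unexpected rule '{name}'")),
    }
}
```

The prefixes accumulate as the error travels up the recursion, which is the same effect chained `.context` calls give with anyhow.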

maoueh · 2024-05-08T14:23:03Z

substreams/src/parser.rs

+#[grammar = "parser_rule.pest"]
+pub struct EParser;
+
+pub fn evaluate_expression(keys: Vec<String>, input: &str) -> bool {


No need to receive an owned copy of a Vec<String>. We should also accept variants of String, like &str. I would change it to:

Suggested change

-pub fn evaluate_expression(keys: Vec<String>, input: &str) -> bool {
+pub fn evaluate_expression<K: AsRef<str>, I: AsRef<str>>(keys: &[K], input: I) -> bool {

And this can be used now with different variations of String, &String and &str:

evaluate_expression(&vec!["a", "b", "c"], "d");
evaluate_expression(
    &vec!["a".to_string(), "a".to_string(), "a".to_string()],
    "d",
);
let keys = vec!["a".to_string(), "a".to_string(), "a".to_string()];
let keys_ref = keys.iter().map(|x| x).collect::<Vec<&String>>();
evaluate_expression(&keys_ref, "d");

Let's also find a more meaningful name for this function. It will be the only one exposed, IMO, so it had better have a very clear name.
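To show why the `AsRef<str>` bounds make the signature flexible, here is a small self-contained sketch. The helper `contains_key` is a hypothetical name for this example and only does the key lookup, not the expression parsing.

```rust
// Hypothetical helper: checks whether `term` is one of `keys`.
// `K: AsRef<str>` lets the slice hold &str, String, or &String, and
// `T: AsRef<str>` does the same for the term, all without copying.
fn contains_key<K: AsRef<str>, T: AsRef<str>>(keys: &[K], term: T) -> bool {
    let term = term.as_ref();
    keys.iter().any(|k| k.as_ref() == term)
}
```

Because the keys are borrowed as a slice, the caller keeps ownership and can reuse the same collection across many calls.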

maoueh · 2024-05-08T14:24:50Z

substreams/src/parser.rs

+
+pub fn evaluate_expression(keys: Vec<String>, input: &str) -> bool {
+    let successful_parse = EParser::parse(Rule::expression, input).unwrap();
+    return evaluate_rule(successful_parse.clone().into_iter().next().unwrap(), keys);


Let's avoid all .clone() in your code, there is a good chance that we can remove all of them.
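A sketch of how the recursion can run without any `.clone()`: pass the keys down by reference and let iterator adapters do the short-circuiting. The `Expr` enum is a hypothetical stand-in for the pest pairs, and the key values in the usage are assumptions, since the contents of `TEST_KEYS` are not shown here.

```rust
// Hypothetical expression tree standing in for the pest parse result.
enum Expr {
    Key(String),
    And(Vec<Expr>),
    Or(Vec<Expr>),
}

// `keys` is borrowed once and the same reference is handed to every
// recursive call, so nothing needs to be cloned.
fn expr_matches<K: AsRef<str>>(expr: &Expr, keys: &[K]) -> bool {
    match expr {
        Expr::Key(term) => keys.iter().any(|k| k.as_ref() == term.as_str()),
        // `all` stops at the first false child, `any` at the first true one.
        Expr::And(children) => children.iter().all(|c| expr_matches(c, keys)),
        Expr::Or(children) => children.iter().any(|c| expr_matches(c, keys)),
    }
}
```

`all` returns true and `any` returns false on an empty list, which matches the initial values of `result` in the original loops.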

ArnaudBger added 3 commits May 7, 2024 15:04

Add substreams parser

467e089

Add index keys within pb

f4f960c

Remove empty folder

adaa79e

ArnaudBger requested a review from maoueh May 8, 2024 13:43

maoueh requested changes May 8, 2024

ArnaudBger added 2 commits May 8, 2024 16:37

Take review into account

8b38ba9

Add unit test

0a0d944

ArnaudBger merged commit 3ee1204 into develop May 8, 2024
1 check passed
