Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: unquote flow_name in create flow expr #5483

Merged
merged 3 commits into from
Feb 7, 2025

Conversation

discord9
Copy link
Contributor

@discord9 discord9 commented Feb 7, 2025

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

What's changed and what's your intention?

fix and tested, flow name with quote will now be sanitize as normal

PR Checklist

Please convert it to a draft if some of the following conditions are not met.

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.
  • API changes are backward compatible.
  • Schema or data changes are backward compatible.

Summary by CodeRabbit

  • New Features
    • Enhanced flow name validation now supports legacy formats while providing clearer, user-friendly error feedback.
    • Flow creation has been improved through consistent sanitization of names (removing unwanted quotes and ensuring proper quoting) and case-insensitive keyword handling in SQL statements.
  • Tests
    • Introduced new tests to verify robust flow name validation, sanitization, and error reporting.

@discord9 discord9 requested a review from a team as a code owner February 7, 2025 04:02
Copy link
Contributor

coderabbitai bot commented Feb 7, 2025

Walkthrough

This pull request updates the handling of flow names across multiple modules. A new static regex for compatibility is added to allow older, less strict naming conventions, and the behavior of the flow name parser is modified to log warnings when falling back to the compatibility pattern. Additionally, a new error variant is introduced for invalid flow names and a sanitization function is implemented to validate and clean flow names before creating flow expressions. The SQL parser is also updated to consistently handle quoted flow names and case-insensitive matching for the SINK keyword.

Changes

File(s) Change Summary
src/common/meta/.../key/flow/flow_name.rs Added static variable COMPAT_FLOW_NAME_KEY_PATTERN and modified FlowNameKeyInner::from_bytes to try the compatibility regex after the standard regex fails, logging a warning in the process; added a test for compatibility deserialization.
src/operator/src/error.rs Introduced a new enum variant InvalidFlowName { name: String, location: Location } and updated the status_code method in the ErrorExt trait to return StatusCode::InvalidArguments for this error.
src/operator/src/expr_factory.rs Added function sanitize_flow_name to validate and clean flow names before creating a flow task expression; updated to_create_flow_task_expr to use this function and included a new test case to verify flow name sanitization and error handling.
src/sql/src/parsers/create_parser.rs Updated parse_create_flow to enforce quoted identifiers for flow names and modified SINK keyword matching to be case-insensitive, along with improved error messaging using additional context if the SINK keyword is not found.

Sequence Diagram(s)

sequenceDiagram
  participant F as FlowNameKeyInner
  participant SR as StandardRegex
  participant CR as CompatRegex
  participant W as Telemetry (warn)
  participant C as Caller

  C->>F: Call from_bytes(input)
  F->>SR: Try matching input with standard pattern
  alt Standard Pattern Matched
      SR-->>F: Match result
      F-->>C: Return parsed flow name
  else Standard Pattern Failed
      F->>W: Log warning message
      F->>CR: Try matching input with compatibility pattern
      alt Compatibility Pattern Matched
          CR-->>F: Match result
          F-->>C: Return parsed flow name
      else
          CR-->>F: No match
          F-->>C: Return error
      end
  end
Loading
sequenceDiagram
  participant U as User Query
  participant P as Parser (parse_create_flow)
  participant E as ExprFactory (to_create_flow_task_expr)
  participant S as sanitize_flow_name
  participant T as Test

  U->>P: Submit CREATE FLOW query with flow name
  P->>E: Invoke to_create_flow_task_expr with flow name
  E->>S: Call sanitize_flow_name(flow name)
  alt Flow name valid and sanitized
      S-->>E: Return sanitized name
      E-->>P: Return CreateFlowExpr with sanitized name
      P-->>U: Successfully parse and process flow
  else Flow name invalid
      S-->>E: Return error
      E-->>P: Propagate error
      P-->>U: Return error response
  end
  T-->>E: Validate behavior with test case
Loading

Possibly related PRs

Suggested labels

A-flow

Suggested reviewers

  • waynexia
  • zhongzc
  • evenyag
  • killme2008

Poem

I hopped through the code with glee,
Making flow names as friendly as can be.
With regex tricks both new and old,
My warnings sing, my tests unfold.
A joyful dance in code so bright,
Rabbit-approved changes taking flight! 🐇✨


🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions github-actions bot added the docs-not-required This change does not impact docs. label Feb 7, 2025
@discord9 discord9 requested a review from WenyXu February 7, 2025 06:19
Copy link
Collaborator

@fengjiachun fengjiachun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

src/operator/src/expr_factory.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@v0y4g3r v0y4g3r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM

src/operator/src/expr_factory.rs Outdated Show resolved Hide resolved
@v0y4g3r
Copy link
Contributor

v0y4g3r commented Feb 7, 2025

What if users already have invalid flow keys like a/`b`, causing persistent failure when reading information_schema.flows table? Can we tolerate that format when decoding flow keys?

@discord9
Copy link
Contributor Author

discord9 commented Feb 7, 2025

a/b

added compat key pattern for older flow_name, and tests, should be ok now

static ref COMPAT_FLOW_NAME_KEY_PATTERN: Regex = Regex::new(&format!(
        "^{FLOW_NAME_KEY_PREFIX}/({NAME_PATTERN})/(.*)$"
    ))
    .unwrap();

@discord9
Copy link
Contributor Author

discord9 commented Feb 7, 2025

@coderabbitai review

Copy link
Contributor

coderabbitai bot commented Feb 7, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/common/meta/src/key/flow/flow_name.rs (1)

124-135: Consider enhancing the warning message.

The warning message could be more informative by including details about the expected pattern.

-                    "FlowNameKeyInner '{}' is not a valid flow name in newer version.",
+                    "FlowNameKeyInner '{}' is not a valid flow name in newer version. Expected pattern: {}/catalog/flow_name",
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0a16998 and 4257013.

📒 Files selected for processing (4)
  • src/common/meta/src/key/flow/flow_name.rs (4 hunks)
  • src/operator/src/error.rs (2 hunks)
  • src/operator/src/expr_factory.rs (5 hunks)
  • src/sql/src/parsers/create_parser.rs (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (18)
  • GitHub Check: Fuzz Test (Distributed, Remote WAL, fuzz_insert_logical_table)
  • GitHub Check: Fuzz Test (Distributed, Remote WAL, fuzz_insert)
  • GitHub Check: Fuzz Test with Chaos (Distributed, Local WAL, fuzz_migrate_metric_regions)
  • GitHub Check: Fuzz Test (Distributed, Remote WAL, fuzz_alter_logical_table)
  • GitHub Check: Fuzz Test with Chaos (Distributed, Local WAL, fuzz_migrate_mito_regions)
  • GitHub Check: Fuzz Test (Distributed, Remote WAL, fuzz_create_logical_table)
  • GitHub Check: Fuzz Test with Chaos (Distributed, Remote WAL, fuzz_failover_metric_regions)
  • GitHub Check: Fuzz Test (Distributed, Remote WAL, fuzz_create_database)
  • GitHub Check: Fuzz Test with Chaos (Distributed, Remote WAL, fuzz_failover_mito_regions)
  • GitHub Check: Fuzz Test (Distributed, Remote WAL, fuzz_alter_table)
  • GitHub Check: Fuzz Test with Chaos (Distributed, Remote WAL, fuzz_migrate_metric_regions)
  • GitHub Check: Fuzz Test (Distributed, Remote WAL, fuzz_create_table)
  • GitHub Check: Fuzz Test with Chaos (Distributed, Remote WAL, fuzz_migrate_mito_regions)
  • GitHub Check: Fuzz Test (fuzz_insert)
  • GitHub Check: Sqlness Test (Pg Kvbackend)
  • GitHub Check: Sqlness Test (Remote WAL)
  • GitHub Check: Sqlness Test (Basic)
  • GitHub Check: test
🔇 Additional comments (7)
src/common/meta/src/key/flow/flow_name.rs (2)

42-46: LGTM! Good addition of backward compatibility.

The new regex pattern COMPAT_FLOW_NAME_KEY_PATTERN allows for less strict name patterns while maintaining compatibility with older flow names. The documentation clearly explains its purpose.


301-305: LGTM! Good test coverage for compatibility.

The test case verifies the backward compatibility with older flow name formats, specifically testing the a/\b`` pattern mentioned in the PR comments.

src/operator/src/error.rs (2)

707-712: LGTM! Well-structured error variant.

The new InvalidFlowName error variant follows the established pattern and includes appropriate fields for error reporting.


831-831: LGTM! Consistent error code mapping.

The InvalidFlowName error is correctly mapped to StatusCode::InvalidArguments, consistent with similar validation errors.

src/operator/src/expr_factory.rs (2)

748-757: LGTM! Clean and efficient flow name sanitization.

The sanitize_flow_name function effectively validates and sanitizes flow names. The use of swap_remove is efficient as it avoids unnecessary cloning.


772-825: LGTM! Comprehensive test coverage.

The test cases cover both successful and error scenarios, including the specific case of invalid flow names with dots (e.g., abc.task_2).

src/sql/src/parsers/create_parser.rs (1)

1309-1325: LGTM! Updated test case for quoted flow names.

The test case now correctly verifies the handling of quoted flow names, ensuring consistency with the new flow name requirements.

@discord9 discord9 added this pull request to the merge queue Feb 7, 2025
Merged via the queue into GreptimeTeam:main with commit f29a1c5 Feb 7, 2025
39 checks passed
@discord9 discord9 deleted the fix_flow_quote branch February 7, 2025 08:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs-not-required This change does not impact docs.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants