-
Notifications
You must be signed in to change notification settings - Fork 0
Linear expressions - simple and fast pattern matching
jbee/lex
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
▓ ▓▓▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓▓▓ ▓ ▓ ▓ ▓ ▓ ▓▓▓ ▓▓▓ ▓ ▓ Linear expressions First match pattern matching with a tiny bytecode instructed matching machine. Spec http://jbee.github.io/lex/ Feedback http://github.com/jbee/lex/issues Copyright (c) 2017 Jan Bernitt ________________________________________________________________________________ IMPLEMENTAIONS Java: basic ~100 LOC, optimized ~150 LOC ________________________________________________________________________________ SETS {...} {abc} a set of bytes "a","b" and "c" {^abc} a set of any byte but "a","b" and "c" {a-c} a set of "a", "b" and "c" given as a range {?} a set of *all* non ASCII bytes SPECIAL SETS # = {0-9} any ASCII digit @ = {a-zA-Z} any ASCII letter $ any ASCII new line (\n or \r) _ any ASCII whitespace character ^ any byte that is not an ASCII whitespace character ? any single byte REPETITION x+ + try previous set, group or literal again GROUPS (...), [...], `...` (abc) a group with sequence "abc" that *must* occur [abc] a group with sequence "abc" that *can* occur ` exit group, unless first in group (used to embed) SCANNING ~x ~ skip until following set, group or literal matches ESCAPING \ escape following byte to a literal (also in set) Sets can also be used to match most of the instruction symbols literally. For example {~} is similar to \~ or {\~}. Any other byte (not {}()[]#@^_$+~?`\) is matched literally. Escaping can be applied to any byte even if it is not needed. ________________________________________________________________________________ EXAMPLES ####/##/## a date of format yyyy/mm/dd ##:##[:##] a time of format hh:mm or hh:mm:ss #+[.#+] a simple floating point number with optional decimals "{^"}+" a quoted string using a set to find the end "~" a quoted string using a scan to find the end "~({^\\}") a quoted string using a scan with escaping support \$@}a-zA-Z0-9_{+ a php style identifier [\+]#+[{ -}#+]+ international phone numbers ~(Foo) searching for "Foo" (e.g. in a file) ~(<h#>) searching for "<h0>" to "<h9>" ~(<h{1-6}>) searching for "<h1>" to "<h6>" ~(Foo~(Bar)) searching for "Foo"s followed by "Bar"s
About
Linear expressions - simple and fast pattern matching
Topics
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published