git clone https://github.com/TristanRhodes/TextProcessing.git
cd TextProcessing
dotnet test
This is a small research project that accompanies a series of blog posts in which I build a parser from scratch. It covers regex, two-phase parsing (lexing/tokenising and expression parsing), combinatorial parsers, scannerless parsing and Sprache. It is written in C# on .NET 5.0, and the end result is a simple two-phase, multi-language expression parser.
It is now a playground for lexing, parsing and text processing in general.
Using regex to match different string formats of DayOfWeek and ClockTime.
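As a rough illustration of the regex step, here is a minimal sketch. The patterns, the helper names `ParseDay` and `ParseTime`, and the use of `TimeSpan` for the clock time are assumptions made for this example; the repository may use different patterns and types.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical patterns: short or full English day names, and 24-hour HH:mm times.
var dayPattern = new Regex(
    @"^(mon(day)?|tue(sday)?|wed(nesday)?|thu(rsday)?|fri(day)?|sat(urday)?|sun(day)?)$",
    RegexOptions.IgnoreCase);
var timePattern = new Regex(@"^(?<hour>[01]?\d|2[0-3]):(?<minute>[0-5]\d)$");

// Map a matched day string onto System.DayOfWeek using its first three letters.
DayOfWeek? ParseDay(string text)
{
    if (!dayPattern.IsMatch(text)) return null;
    foreach (DayOfWeek day in Enum.GetValues(typeof(DayOfWeek)))
    {
        if (day.ToString().StartsWith(text.Substring(0, 3), StringComparison.OrdinalIgnoreCase))
            return day;
    }
    return null;
}

// Read the named groups out of a matched time string.
TimeSpan? ParseTime(string text)
{
    var match = timePattern.Match(text);
    if (!match.Success) return null;
    return new TimeSpan(
        int.Parse(match.Groups["hour"].Value),
        int.Parse(match.Groups["minute"].Value),
        0);
}

Console.WriteLine(ParseDay("wed"));
Console.WriteLine(ParseTime("17:00"));
Console.WriteLine(ParseTime("25:00") is null);
```

Anchoring both patterns with `^` and `$` means each one has to match a whole word, which only works once the input has been split into parts.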
Breaking a longer DayTime string into recognized Day and LocalTime parts.
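The tokenising step might look something like the sketch below. The `(Kind, Value)` tuple shape, the helper names `Tokenise` and `TryDay`, and the use of `TimeSpan` in place of a dedicated local time type are all assumptions for illustration, not the repository's actual types.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Accepts full day names or their three-letter abbreviations, ignoring case.
bool TryDay(string text, out DayOfWeek day)
{
    foreach (DayOfWeek candidate in Enum.GetValues(typeof(DayOfWeek)))
    {
        var name = candidate.ToString();
        if (text.Equals(name, StringComparison.OrdinalIgnoreCase) ||
            text.Equals(name.Substring(0, 3), StringComparison.OrdinalIgnoreCase))
        {
            day = candidate;
            return true;
        }
    }
    day = default;
    return false;
}

// Hypothetical token shape: a kind tag plus the boxed value.
List<(string Kind, object Value)> Tokenise(string input)
{
    var tokens = new List<(string Kind, object Value)>();
    foreach (var part in input.Split(' ', StringSplitOptions.RemoveEmptyEntries))
    {
        if (TryDay(part, out var day))
            tokens.Add(("Day", day));
        else if (TimeSpan.TryParseExact(part, @"hh\:mm", CultureInfo.InvariantCulture, out var time))
            tokens.Add(("Time", time));
        else
            tokens.Add(("Unknown", part)); // keep unrecognised words for the parser to inspect
    }
    return tokens;
}

foreach (var (kind, value) in Tokenise("Pickup Mon 08:00 dropoff wed 17:00"))
    Console.WriteLine($"{kind}: {value}");
```

Keeping unrecognised words as `Unknown` tokens rather than failing lets a later parsing phase decide whether words such as `Pickup` are meaningful.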
Use a basic suite of object-oriented IParser implementations to parse a DayTime string and a number of other simple expressions:
// Separate two part element context => DayTime range
"Pickup Mon 08:00 dropoff wed 17:00"
// Range elements with different separators => Open Days range and Hours Range
"Open Mon to Fri 08:00 - 18:00"
// Repeating tokens => List of tour times
"Tours 10:00 12:00 14:00 17:00 20:00"
// Repeating complex elements => List of event day times
"Events Tuesday 18:00 Wednesday 15:00 Friday 12:00"
Replace all IParser interfaces with delegates and go all in on functional combinators.
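To show what swapping interfaces for delegates can look like, here is a minimal sketch. The `Parser<T>` delegate signature and the `Is`, `Then` and `Many` combinators are assumptions made for this example and are not taken from the repository.

```csharp
using System;
using System.Collections.Generic;

// Usage: parse a repeating list of day/time pairs from pre-parsed tokens.
var tokens = new object[]
{
    DayOfWeek.Tuesday, new TimeSpan(18, 0, 0),
    DayOfWeek.Wednesday, new TimeSpan(15, 0, 0),
};
var events = Combinators.Is<DayOfWeek>().Then(Combinators.Is<TimeSpan>()).Many();
var result = events(tokens, 0);
foreach (var (day, time) in result.Value)
    Console.WriteLine($"{day} {time}");

// Hypothetical delegate: a parser is just a function from (tokens, position) to a result.
public delegate (bool Success, T Value, int Next) Parser<T>(object[] tokens, int position);

public static class Combinators
{
    // Succeeds when the current token has type T.
    public static Parser<T> Is<T>() => (tokens, position) =>
        position < tokens.Length && tokens[position] is T value
            ? (true, value, position + 1)
            : (false, default(T), position);

    // Sequence: run first, then second, and pair the results.
    public static Parser<(T1, T2)> Then<T1, T2>(this Parser<T1> first, Parser<T2> second) =>
        (tokens, position) =>
        {
            var a = first(tokens, position);
            if (!a.Success) return (false, default, position);
            var b = second(tokens, a.Next);
            if (!b.Success) return (false, default, position);
            return (true, (a.Value, b.Value), b.Next);
        };

    // Repetition: apply the parser until it fails, collecting every result.
    // Assumes the inner parser consumes at least one token on success.
    public static Parser<List<T>> Many<T>(this Parser<T> parser) =>
        (tokens, position) =>
        {
            var items = new List<T>();
            var item = parser(tokens, position);
            while (item.Success)
            {
                items.Add(item.Value);
                position = item.Next;
                item = parser(tokens, position);
            }
            return (true, items, position);
        };
}
```

Because a parser is only a function here, combinators such as `Then` and `Many` become extension methods that take functions and return new ones, with no class hierarchy involved.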
Instead of using an array of pre-parsed tokens, we're going to work on the string / char[] directly and implement our parser in Sprache.
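A scannerless version of the tour times example could be sketched in Sprache roughly as follows. The grammar is an assumption made for illustration, not the repository's; it needs the Sprache NuGet package, and it does not validate hour or minute ranges.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Sprache;

// Hypothetical grammar: two digits, a colon, two digits.
Parser<TimeSpan> time =
    from hour in Parse.Digit.Repeat(2).Text()
    from colon in Parse.Char(':')
    from minute in Parse.Digit.Repeat(2).Text()
    select new TimeSpan(int.Parse(hour), int.Parse(minute), 0);

// The keyword followed by any number of whitespace-separated times.
Parser<IEnumerable<TimeSpan>> tours =
    from keyword in Parse.IgnoreCase("tours").Token()
    from times in time.Token().Many()
    select times;

// Parse throws a ParseException when the input does not match.
foreach (var t in tours.Parse("Tours 10:00 12:00 14:00 17:00 20:00"))
    Console.WriteLine(t);
```

`Token()` skips surrounding whitespace, so no separate tokenising pass is needed and the parsers run directly over the characters of the input.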
A simple tokeniser and parser system implemented using interfaces in an object-oriented style.
A simple tokeniser and parser system implemented using delegates and monads in a functional style.
An implementation of the demo parsers written in Sprache, the scannerless C# functional parsing library.