token_parser 1.7.0
An intuitive Token Parser that includes grammar definition, tokenization, parsing, syntax errors, and debugging. Implementation based on Lexical Analysis for Dart.
1.7.0 #
Added:
- Lexeme extension `lexeme.copy()`, creates a copy of the lexeme, with the same properties. This is useful to avoid having duplicate lexeme variable references.
- Top-level range regex pattern, `range(a, b)` matches any number or character between `a` and `b`.
- Example of a simple RGB Color Parser, using the `range` pattern.
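To illustrate the idea behind a range pattern, here is a minimal sketch in plain Dart using only `RegExp` from `dart:core`. It is not the package's implementation or its bundled example: `charRange` and `parseHexColor` are hypothetical names, and the `#rrggbb` color format is an assumption made for this illustration.

```dart
/// Hypothetical helper (not the token_parser API): a regex character class
/// matching any single character between [a] and [b], inclusive.
String charRange(String a, String b) =>
    '[${RegExp.escape(a)}-${RegExp.escape(b)}]';

/// Parses an assumed '#rrggbb' color into its three channel values,
/// or returns null when the input does not match.
List<int>? parseHexColor(String input) {
  final hexDigit = '(?:${charRange('0', '9')}|${charRange('a', 'f')})';
  final color =
      RegExp('^#(${hexDigit}{2})(${hexDigit}{2})(${hexDigit}{2})\$');

  final match = color.firstMatch(input);
  if (match == null) return null;

  // Groups 1 to 3 hold the red, green and blue channels as hex text.
  return [
    for (var i = 1; i <= 3; i++) int.parse(match.group(i)!, radix: 16),
  ];
}

void main() {
  print(parseHexColor('#1e90ff')); // [30, 144, 255]
  print(parseHexColor('#zzzzzz')); // null
}
```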
1.6.0 #
Added:
- GlobalLexeme matches the target pattern on any part of the input.
- Grammar `(remover)` rule, removes the target pattern from the input. Useful for removing comments before tokenizing.
Fixed:
- PatternLexeme's equality operator was not comparing simple patterns correctly.
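The effect of removing a pattern from the input before tokenizing can be sketched in plain Dart. This does not use the package's grammar API: `removeAll` is a hypothetical stand-in, and the `//` line-comment syntax is an assumption chosen for the example.

```dart
/// Hypothetical stand-in for a remover rule: deletes every match of
/// [pattern] from [input] before any tokenizing takes place.
String removeAll(String input, Pattern pattern) =>
    input.replaceAll(pattern, '');

void main() {
  // Assumed comment syntax: '//' up to the end of the line.
  // By default '.' does not match line terminators in Dart's RegExp.
  final lineComment = RegExp(r'//.*');

  const source = 'a = 1 // first\nb = 2 // second';
  print(removeAll(source, lineComment));
}
```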
1.5.2 #
Changed:
- Lexeme modifier `.pad()` is no longer multiple, because that made it impossible to make it optional
Fixed:
- Lexeme operator `~` was not working properly; `.optional.multiple` is impossible, as multiple requires at least one match
1.5.1 #
Fixed:
- Infinite loop when combining `.optional.multiple` lexeme modifiers, instead of `.multiple.optional`. The multiple lexeme modifier now skips if the pattern did not consume any characters
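Why repeating an optional pattern can loop forever, and how a "did it consume anything" check prevents it, can be sketched in plain Dart. This is an illustration under assumptions, not the package's code: `matchRepeatedly` is a hypothetical function built on `RegExp.matchAsPrefix`.

```dart
/// Counts how many times [pattern] matches back to back from the start
/// of [input]. Hypothetical sketch, not token_parser's implementation.
int matchRepeatedly(RegExp pattern, String input) {
  var position = 0;
  var count = 0;

  while (position <= input.length) {
    final match = pattern.matchAsPrefix(input, position);
    if (match == null) break;

    // An empty match consumes nothing, so looping again would retry the
    // same position forever. Stop instead of repeating.
    if (match.end == position) break;

    position = match.end;
    count++;
  }
  return count;
}

void main() {
  final optionalA = RegExp('a?'); // can match the empty string
  print(matchRepeatedly(optionalA, 'aaab')); // 3
}
```

Without the `match.end == position` check, the empty match at `b` would succeed on every iteration and the loop would never end.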
1.5.0 #
Changed:
- Lexeme modifier `pattern.pad()` is no longer optional; to have an optional pattern, apply `pattern.optional` before passing it in
- Lexeme modifier `pattern.spaced` is no longer optional, it wouldn't be called spaced otherwise. To have optional spacing, use `pattern.optionalSpaced`
Fixed:
- Debugging tokenization, no character error
1.4.0 #
BREAKING CHANGES:
- Added, changed and removed a few lexeme operators, to simplify the API and make it more consistent.
Added:
- Lexeme operator `*` to combine 2 lexemes with a space in between, multiple and optional
Changed:
- Lexeme operator `+` to combine 2 lexemes with a space in between, multiple
Removed:
- Lexeme operators `>` and `>=`, they are more likely to cause errors than to be useful, due to how Dart handles them
- Lexeme operator `>>`, does the same thing as `+`
1.3.1 #
Fixed:
- Chaining multiple `>` or `>=` operators would prompt an error: "A comparison expression can't be an operand of another comparison expression." Added `>>` and `*` to fix this issue, respectively.
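The error comes from Dart itself: relational operators such as `>` and `>=` cannot be chained without parentheses, even when they are user-defined, while the shift operator `>>` is left-associative and chains freely. The sketch below shows this with a hypothetical `Piece` class that merely stands in for a lexeme; it is not the package's type.

```dart
/// Hypothetical stand-in for a lexeme; it only records how it was combined.
class Piece {
  final String text;
  const Piece(this.text);

  // A user-defined relational operator: legal on its own, but not chainable.
  Piece operator >(Piece other) => Piece('$text ${other.text}');

  // A user-defined shift operator: chains left to right.
  Piece operator >>(Piece other) => Piece('$text ${other.text}');
}

void main() {
  const a = Piece('a'), b = Piece('b'), c = Piece('c');

  // final broken = a > b > c;
  // Uncommenting the line above fails to compile with: "A comparison
  // expression can't be an operand of another comparison expression."

  final viaParentheses = (a > b) > c; // allowed, but noisy
  final viaShift = a >> b >> c; // allowed without parentheses

  print(viaParentheses.text); // a b c
  print(viaShift.text); // a b c
}
```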
1.3.0 #
Added:
- Lexeme operator extension `>` and `>=`, add spacing in between, and optionally
- Shields.io badges to README.md
1.2.0 #
Added:
- New `spacing` top-level lexeme, matches any conventional spacing, multiple space
- Lazy lexeme extension modifiers, `pattern.plus`, `pattern.star` and `pattern.question`. They do what you would expect from a regular expression, and can be easier for translation and lazy debugging, but using them is not recommended because they are harder to read
- Lexeme extension `iterable.and` and `iterable.or`, transform a list of patterns into an and/or lexeme
- Lexeme extension `iterable.spaced` and `iterable.optionalSpaced`, transform a list of lexemes into an and lexeme with spaces in between, optionally
Changed:
- `spacing` moved to `space`, single space
1.1.0 #
TODO:
Added:
- `anyUntil(pattern)` top-level lexeme, matches any character until the pattern is matched
- `pattern.until(pattern)` lexeme extension, matches the current pattern until the target pattern is matched
- `pattern.repeat(min, [ max ])` lexeme extension, matches the pattern between `min` and `max` times
- `start`, `end` top-level lexemes, match the start and end of the input
- `startLine`, `endLine` top-level lexemes, match the start and end of the line
Fixed:
- Lexeme's Regex string, was using `'$pattern'` within itself instead of `'${ pattern.regexString }'`
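The difference between the two interpolations is a general Dart behavior: `'$pattern'` inserts whatever `toString()` returns, while `'${ pattern.regexString }'` inserts the getter's value. The sketch below uses a hypothetical `Rule` class, not the package's lexeme type, and assumes a `toString()` that returns a display name rather than the regex source.

```dart
/// Hypothetical stand-in for a lexeme with a regex source and a display name.
class Rule {
  final String regexString;
  final String displayName;
  const Rule(this.regexString, this.displayName);

  // Assumption for this sketch: toString() is meant for display only.
  @override
  String toString() => displayName;
}

/// Builds a "one or more" regex by interpolating toString().
String repeatedViaToString(Rule rule) => '($rule)+';

/// Builds a "one or more" regex by interpolating the regex source.
String repeatedViaRegexString(Rule rule) => '(${rule.regexString})+';

void main() {
  const digit = Rule(r'\d', 'digit');

  print(repeatedViaToString(digit)); // (digit)+
  print(repeatedViaRegexString(digit)); // (\d)+
}
```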
Changed:
- Shorter error messages, to be less descriptive
1.0.0 #
BREAKING CHANGES:
- Token Parser was refactored to be able to throw lexical syntax errors, using `LexicalSyntaxError`. This change means that tokenization must now return a token. If there's no match, it will throw an error suggesting where it went wrong. If you must have optional tokenization, use the `grammar.optionalParse()` and `lexeme.optionalTokenize()` methods instead.
- Every lexeme type was reworked, so they might have different behavior than before.
- Grammar debugging is available by instantiating the grammar with the `DebugGrammar` class. It will show you the tokenization process, and the path it took to get to the token. It's recommended to use it when you're debugging your grammar, and remove it when you're done.
Added:
- Tokenization error `LexicalSyntaxError`, is thrown when a token is not matched
- `CharacterLexeme` to match single characters
- Grammar debugging, using the grammar class `DebugGrammar`
- Grammar debugging methods, `grammar.tokenizing()`, called before lexemes start tokenizing
- `.length` property to token, which returns the length of value matched
- `empty()` top-level lexeme extension, same as `Lexeme.empty()`
- `spacing` top-level lexeme, matches any conventional spacing
- `pattern.pad()` method surrounds the lexeme with another lexeme, optionally
- `pattern.spaced` lexeme extension, same as `pattern.pad(spacing)`
- `-` operator to exclude patterns, same as `pattern.not.character`
- `~` operator to pad lexeme with spacing around it, same as `pattern.spaced`
Fixed:
- `Token` and `Lexeme` comparison hash code, now it has much better performance
Changed:
- Reorganized the documentation
- Moved the utils directory inside the source directory
- Renamed `grammar.lemexes` to `grammar.rules`
- Moved `toString()` into `regexString` getter, `toString()` now displays similar to `displayName`
0.0.1 #
Initial release: Token Parser