token_parser 1.7.0 token_parser: ^1.7.0 copied to clipboard
An intuitive Token Parser that includes grammar definition, tokenization, parsing, syntax error and debugging. Implementation based on Lexical Analysis for Dart.
1.7.0 #
Added:
- Lexeme extension
lexeme.copy()
, creates a copy of the lexeme, with the same properties. This is useful to avoid having duplicate lexeme variable references. - Top-level range regex pattern,
range(a, b)
matches any number or character betweena
andb
. - Example of a simple RGB Color Parser, using the
range
pattern.
1.6.0 #
Added:
- GlobalLexeme matches the target pattern on any part of the input.
- Grammar
(remover)
rule, removes the target pattern from the input. Useful for removing comments before tokenizing.
Fixed:
- PatternLexeme's equality operator was not comparing simple patterns correctly.
1.5.2 #
Changed:
- Lexeme modifier
.pad()
is not multiple anymore, it was impossible to make it optional
Fixed:
- Lexeme operator
~
was not working properly,.optional.multiple
is impossible as multiple requires at least one match
1.5.1 #
Fixed:
- Infinite loop when combining
.optional.multiple
lexeme modifiers, instead of.multiple.optional
. Multiple lexeme modifier now skips if the pattern did not consume any characters
1.5.0 #
Changed:
- Lexeme modifier
pattern.pad()
is now not optional, to have an optional pattern dopattern.optional
before passing in - Lexeme modifier
pattern.spaced
is now not optional, it wouldn't be called spaced otherwise. To have optional spacing, usepattern.optionalSpaced
Fixed:
- Debugging tokenization, no character error
1.4.0 #
BREAKING CHANGES:
- Added, changed and removed a few lexeme operators, to simplify the API and make it more consistent.
Added:
- Lexeme operator
*
to combine 2 lexemes with a space in between, multiple and optional
Changed:
- Lexeme operator
+
to combine 2 lexemes with a space in between, multiple
Remove:
- Lexeme operators
>
and>=
, are more probable to cause errors than to be useful, due to how Dart handles them - Lexeme operator
>>
, does the same thing as+
1.3.1 #
Fixed:
- Multiple
>
and>=
would prompt an error: "A comparison expression can't be an operand of another comparison expression.". Added>>
and*
to fix this issue, respectively.
1.3.0 #
Added:
- Lexeme operator extension
>
and>=
, add spacing in between, and optionally - Shield.io badges to README.md
1.2.0 #
Added:
- New
spacing
top-level lexeme, matches any conventional spacing, multiple space - Lazy lexeme extension modifiers,
pattern.plus
,pattern.star
andpattern.question
. They do what you're expecting from a regex expression, and can be easier for translation and lazy debugging, but it's not recommended to use them because they are harder to read - Lexeme extension
iterable.and
anditerable.or
, transform a list of patterns into an and/or lexeme - Lexeme extension
iterable.spaced
anditerable.optionalSpaced
, transform a list of lexemes into an and lexeme with spaces in between, optionally
Changed:
spacing
moved tospace
, single space
1.1.0 #
TODO:
Added:
anyUntil(pattern)
top-level lexeme, matches any character until the pattern is matchedpattern.until(pattern)
lexeme extension, matches the current pattern until the target pattern is matchedpattern.repeat(min, [ max ])
lexeme extension, matches the pattern betweenmin
andmax
timesstart
,end
top-level lexemes, matches the start and end of the inputstartLine
,endLine
top-level lexemes, matches the start and end of the line
Fixed:
- Lexeme's Regex string, was using
'$pattern'
within itself instead of'${ pattern.regexString }'
Changed:
- Shorter error messages, to be less descriptive
1.0.0 #
BREAKING CHANGES:
- Token Parser was refactored to be able to throw lexical syntax errors, using
LexicalSyntaxError
. This change means that tokenization is mandatory to return a token. If there's no match, it will throw an error suggesting where it went wrong. If you must have optional tokenization, use thegrammar.optionalParse()
andlexeme.optionalTokenize()
methods instead. - Every lexeme type was reworked, so they might have different behavior than before.
- Grammar debugging is available, instantiating grammar using the
DebugGrammar
class. It will show you the tokenization process, and the path it took to get to the token. It's recommended to use it when you're debugging your grammar, and remove it when you're done.
Added:
- Tokenization error
LexicalSyntaxError
, is thrown when a token is not matched CharacterLexeme
to match single characters- Grammar debugging, using the grammar class
DebugGrammar
- Grammar debugging methods,
grammar.tokenizing()
, called before lexemes start tokenizing .length
property to token, which returns the length of value matchedempty()
top-level lexeme extension, same asLexeme.empty()
spacing
top-level lexeme, matches any conventional spacingpattern.pad()
method surrounds the lexeme with another lexeme, optionallypattern.spaced
lexeme extension, same aspattern.pad(spacing)
-
operator to exclude patterns, same aspattern.not.character
~
operator to pad lexeme with spacing around it, same aspattern.spaced
Fixed:
Token
andLexeme
comparison hash code, now it has a much better performance
Changed:
- Reorganized the documentation
- Moved the utils directory inside the source directory
- Renamed
grammar.lemexes
togrammar.rules
- Moved
toString()
intoregexString
getter,toString()
now displays similar todisplayName
0.0.1 #
Initial release: Token Parser