token_parser 1.2.0
An intuitive Token Parser that includes grammar definition, tokenization, parsing, syntax errors and debugging. Implementation based on Lexical Analysis.
1.2.0 #
Added:
- New `spacing` top-level lexeme, matches any conventional spacing (multiple spaces)
- Lazy lexeme extension modifiers `pattern.plus`, `pattern.star` and `pattern.question`. They behave like their regex counterparts and can make translation and quick debugging easier, but they are not recommended because they are harder to read
- Lexeme extensions `iterable.and` and `iterable.or`, which transform a list of patterns into an and/or lexeme
- Lexeme extensions `iterable.spaced` and `iterable.optionalSpaced`, which transform a list of lexemes into an and lexeme with spaces in between, optionally
Changed:
- `spacing` moved to `space`, single space
1.1.0 #
Added:
- `anyUntil(pattern)` top-level lexeme, matches any character until the pattern is matched
- `pattern.until(pattern)` lexeme extension, matches the current pattern until the target pattern is matched
- `pattern.repeat(min, [ max ])` lexeme extension, matches the pattern between `min` and `max` times
- `start` and `end` top-level lexemes, match the start and end of the input
- `startLine` and `endLine` top-level lexemes, match the start and end of the line
Fixed:
- Lexeme's regex string was using `'$pattern'` within itself instead of `'${ pattern.regexString }'`
Changed:
- Shorter error messages, to be less descriptive
1.0.0 #
BREAKING CHANGES:
- Token Parser was refactored to be able to throw lexical syntax errors, using `LexicalSyntaxError`. This change means that tokenization must return a token; if there's no match, it will throw an error suggesting where it went wrong. If you wish to have optional tokenization, use the `grammar.optionalParse()` and `lexeme.optionalTokenize()` methods instead.
- Every lexeme type was reworked, so they might behave differently than before.
Added:
- Tokenization error `LexicalSyntaxError`, thrown when a token is not matched
- `CharacterLexeme` to match single characters
- Grammar debugging, using the grammar class `DebugGrammar`
- Grammar debugging methods, `grammar.tokenizing()`, called before lexemes start tokenizing
- `.length` property on tokens, which returns the length of the matched value
- `empty()` top-level lexeme extension, same as `Lexeme.empty()`
- `spacing` top-level lexeme, matches any conventional spacing
- `pattern.pad()` method, surrounds the lexeme with another lexeme, optionally
- `pattern.spaced` lexeme extension, same as `pattern.pad(spacing)`
- `-` operator to exclude patterns, same as `pattern.not.character`
- `~` operator to pad a lexeme with spacing around it, same as `pattern.spaced`
Fixed:
- `Token` and `Lexeme` comparison hash code, now with much better performance
Changed:
- Reorganized the documentation
- Moved the utils directory inside the source directory
- Renamed `grammar.lemexes` to `grammar.rules`
- Moved `toString()` into the `regexString` getter; `toString()` now displays similar to `displayName`
0.0.1 #
Initial release: Token Parser