Basic Features
The features needed to make the parser basically useful are:
- Parser subclasses can define their own grammar using Ruby syntax as
much as possible. E.g.,
grammar { def_rule :foo, "production" do |match| do_something_with(match) end }
- Parsers which inherit from a subclass of OOParser inherit their grammars with rule polymorphism. (E.g., HTMLParser < XMLParser < SGMLParser)
- Parse failures should produce human-readable errors that describe the failure in detail, including the line number, what the parser expected to find, what it actually found, etc.
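To make the rule-polymorphism requirement concrete, here is a minimal sketch of one way a class hierarchy could inherit and override rules. It is not OOParser's actual implementation: the Toy* classes, own_rules, rules, and this two-argument def_rule are all hypothetical stand-ins, and productions are kept as plain strings.

```ruby
# Hypothetical stand-in for a parser base class; not OOParser itself.
class ToyParser
  # Rules defined directly on this class (excluding inherited ones).
  def self.own_rules
    @own_rules ||= {}
  end

  # Register a rule under a name for this class only.
  def self.def_rule(name, production)
    own_rules[name] = production
  end

  # Collect rules from the top of the hierarchy downward, so that a
  # subclass's definition of a rule replaces its ancestor's.
  def self.rules
    inherited = superclass.respond_to?(:rules) ? superclass.rules : {}
    inherited.merge(own_rules)
  end
end

class ToySGMLParser < ToyParser
  def_rule :tag, "<name>"
  def_rule :comment, "<!-- -->"
end

class ToyXMLParser < ToySGMLParser
  def_rule :tag, "<name/>" # overrides the rule inherited from ToySGMLParser
end

class ToyHTMLParser < ToyXMLParser
  def_rule :entity, "&name;" # adds a rule; :tag and :comment are inherited
end
```

With this arrangement, ToyHTMLParser.rules holds the :tag rule as redefined in ToyXMLParser, the :comment rule from ToySGMLParser, and its own :entity rule, while ToySGMLParser.rules is unaffected by its subclasses.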
Initial Production Items
Productions (currently) can contain three basic kinds of items:
- Literals/Terminals
- Matches exactly whatever is specified.
- Example: /literal/xi or "literal" (which are equivalent)
- Directives
- Match using some other more-complex construct.
- Only a subset of the planned production directives will be handled for the first release. Directives are of the form <identifier>. The currently implemented ones are:
- Subrules
- Matches using another rule in the grammar.
- <subrule>
- Pre-defined Subrules
- Subrules that are pre-defined by OOParser
- <WHITESPACE>, <CRLF>, <LT>
- [Perhaps pre-define all the HTML entities? Grab most of Perl6::Rules's predefined named rules at the very least.]
_function in the parser class.
- <&set_skip(//)>: Turns off whitespace-skipping.
- <&debug("Got here.")>: Logs a message at DEBUG level.
- <&foo("bar")>: Calls the #foo_function method in the current parser, passing it the current ParseState object and the string "bar", and treats a non-nil, non-false return value as a successful match.
<foo>+ (one or more), <foo>? (zero or one), <foo>* (zero or more)
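The function-call directive described above, <&foo("bar")>, could be dispatched roughly as in the sketch below. Everything here is an assumption made for illustration: ToyParseState is a bare stand-in for the real ParseState, ToyDirectiveParser and call_function_directive are hypothetical names, and only a single double-quoted string argument is handled.

```ruby
# Bare stand-in for the parser's state object (hypothetical fields).
ToyParseState = Struct.new(:input, :pos)

class ToyDirectiveParser
  # Matches directives of the form <&name(args)>.
  FUNCTION_DIRECTIVE = /\A<&(\w+)\((.*)\)>\z/

  # Example function: succeeds when the input at the current position
  # starts with the given string.
  def foo_function(state, arg)
    state.input[state.pos..-1].start_with?(arg)
  end

  # Calls #<name>_function with the parse state and the string argument.
  # Any return value other than nil or false counts as a successful match.
  def call_function_directive(directive, state)
    match = FUNCTION_DIRECTIVE.match(directive)
    raise ArgumentError, "not a function directive: #{directive}" unless match

    name = match[1]
    arg = match[2].delete_prefix('"').delete_suffix('"')
    result = public_send("#{name}_function", state, arg)
    result ? true : false
  end
end

parser = ToyDirectiveParser.new
state = ToyParseState.new("barbaz", 0)
matched = parser.call_function_directive('<&foo("bar")>', state)
```

Here matched is true because the input starts with "bar" at position 0. A real implementation would also need to parse argument lists properly and report failures through the parser's own error mechanism.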