=encoding utf8

=head1 API documentation: The AST API from the Macro Perspective

Macros in Perl have access to the raw, just-parsed representation of the
program through their parameters. This representation is called the C<AST>.
The AST is documented in full elsewhere, but this document is written from
the perspective of the macro itself, and demonstrates the major components
of the AST that must be in place for typical macros to function.

=head2 Macro definition

A macro definition looks like this:

    macro fizzle(...) { ... }

Within the parameter list, any parameters can be defined, but they will
always be of type C, some sub-type of C, or plain scalars.

An invocant may be declared. If so, it will be the C<$/> object that
contains the macro invocation itself, and thus contains all state relevant
to the macro.

A macro can have an operator type:

    macro quote: (...) { ... }

If it does, the operator type will constrain the parameters (if any) that
are available to the macro, and a mismatch between the operator type and
the parameter count will generate a compile-time error.

=head2 Macro types

Macros can be defined for each of the following operator types:

=over

=item list

List operators are typical function-like macros that take arguments as a
list of named or positional parameters. Each parameter is an C node.

=item term

Term macros are like list macros, but never take any arguments.

=item quote

[ Note: notice camel-casing of LiteralString. Is that what was intended?
-ajs ]

Quote macros take a single C<LiteralString> node. Backslash escaping is
performed only for the balancing quote terminator. Should some other sort
of parsing be required, use the "is parsed" trait.

=item prefix

Prefix macros take only a single C parameter.

=item infix

Infix macros take two C parameters.

=item postfix

Postfix macros can never be "is parsed" because their only parameter (an C
node) is found before the macro name in the program text.

=item circumfix

Circumfix macros take a single C as a parameter.

=item postcircumfix

Postcircumfix macros take two parameters. The first is a C. The second is
an C that comes inside of the circumfix delimiters.

=item regex_metachar

=item regex_backslash

=item regex_mod_internal

=item regex_assertion

=item regex_mod_external

TBD

=item trait_verb

=item trait_auxiliary

[ Ok, here's a scary kind of question... can a macro be multi, dispatched
at compile-time based on AST sub-types? If so, that answers the question of
how "class Foo is Bar" is distinguished from "my Foo $x is Bar" which is
similar, but certainly has very different types of parameters (the LHS is
an AST node that is either a class name or a variable declaration). -ajs ]

=item scope_declarator

Scope_declarator macros take a C node which contains the information about
the variable being declared.

=item statement_control

Statement_control macros are like C<if> or C<while>, and take an C and a C.

[ Question: How is elsif/else handled? Are they named as part of the macro
somehow, or must any elsif block ever be called "elsif"? -ajs ]

=item statement_modifier

Statement_modifier macros take two parameters, a C and a C.

=item infix_prefix_meta_operator

=item infix_postfix_meta_operator

=item prefix_circumfix_meta_operator

All of these macro types take three parameters: a C for the LHS, an C node
for the operator that it is modifying, and an C for the RHS.

[ Question: infix_postfix_meta_operator is designed for defining '=' which
will take an LValue for its first parameter, but not all such operators
will be for assignment, potentially. Again, this brings up the question of
multi dispatch on AST node types. Is that what's intended, or should macro
operator types be richer? -ajs ]

=item postfix_prefix_meta_operator

Postfix_prefix_meta_operator macros take an C and a C as parameters.

=item prefix_postfix_meta_operator

Prefix_postfix_meta_operator macros take a C and an C as parameters.

=item infix_circumfix_meta_operator

Infix_circumfix_meta_operator macros take an C and an C as parameters.

=back

[ Question: What about sub, is that a statement_control? What about use? Is
that just a list op that does its thing at run-time? -ajs ]

=head2 Accessing AST Internals

C<AST>s can be treated much the same as any rule state object. They can be
indexed like a hash to extract their match terms (in this case, subrule
names that match AST nodes). However, they also carry state information
about the file being parsed.

[ Question: how is that state information extracted? Is there a method that
can be called to get line number for example? -ajs ]

Expressions are the most often-seen element of a macro's parameter list.

=head3 Expressions

An Expression is either a Literal, an LValue (Variable/Apply/Call), or one
of the special forms (Binding/Assignment). It can be tested like so:

    macro debug(*@exprs) {
        for @exprs -> $expr {
            if exists $expr<Literal> {
                ...
            } elsif exists $expr<LValue> {
                ...
            } elsif exists $expr<Binding> or exists $expr<Assignment> {
                ...
            } else {
                die "Unknown Expression type '$expr'\n";
            }
        }
    }

But in many cases, such a test is not required, as Expressions are so
universally useful:

    use AST::Tools :all;
    macro debug(*@exprs) {
        q:code{ say {{{ astlist(@exprs) }}} };
    }

[ Question: we need some tools for constructing AST nodes from other AST
nodes. One of the most obvious to me is the above, but my name for it
("astlist") is just a suggestion. It's probably exported by something like
AST::Tools -ajs ]

[ Question: Another way to do that would be to have a generic AST
initializer so that this worked:

    q:code{ say {{{ AST.new('List',@exprs) }}} };

which might make more sense, as you could do arbitrarily complex things
with the combination of high-level tools and initializers like:

    macro curry($subroutine, *@args) {
        my $body = call_as_block($subroutine,\(=@args));
        return AST.new('Closure', :$body);
    }

-ajs ]

... More to come once we work out the questions above ...

=cut