UglifyJS: Abstract Syntax Tree (AST)
You can get a crude description of the AST with uglifyjs --ast-help. Each definition starts with the node name (e.g. AST_Node), followed by a list of its own properties in parentheses (if it has any), a description string, and any subclasses (if there are any). Nodes inherit properties from their base classes; for example, since the start and end properties are defined in the base class AST_Node, every node has those properties.
The parser will instantiate the most specific subclass; for example you will never find an object of type AST_Node in the AST; that's just the base class. You won't find an AST_Statement either, since every kind of statement has its own dedicated subclass.
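The inheritance rule can be illustrated with a small self-contained sketch. The classes below are hypothetical stand-ins written for this example, not UglifyJS's own definitions; only the names AST_Node, AST_Statement and AST_Debugger and the start/end properties come from the text above. The helper classChain and the token-like literals are assumptions made for illustration.

```javascript
// Hypothetical stand-ins for illustration only; UglifyJS defines its real
// node classes internally, with many more properties.
class AST_Node {
  constructor(props = {}) {
    // start and end live on the base class, so every node has them.
    this.start = props.start || null;
    this.end = props.end || null;
  }
}
class AST_Statement extends AST_Node {}
class AST_Debugger extends AST_Statement {}

// A parser would create the most specific subclass, never the bare base.
// The start/end values mimic tokens for the source text "debugger;".
const node = new AST_Debugger({
  start: { line: 1, col: 0 },
  end: { line: 1, col: 8 },
});

// Walk up the prototype chain to list the classes a node belongs to.
function classChain(obj) {
  const names = [];
  let proto = Object.getPrototypeOf(obj);
  while (proto && proto !== Object.prototype) {
    names.push(proto.constructor.name);
    proto = Object.getPrototypeOf(proto);
  }
  return names;
}

console.log(classChain(node).join(" -> "));
// AST_Debugger -> AST_Statement -> AST_Node
console.log(node instanceof AST_Statement);          // true
console.log(node.constructor === AST_Statement);     // false
```

The last two lines show the practical consequence: an instanceof test against a base class succeeds for every subclass, while the node's actual constructor is always the most specific class.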
The AST nodes
The following hierarchy is generated by introspection from the UglifyJS objects. See below for some information on AST_Token, and take a look at the scope analyzer for more information about SymbolDef and the scope-related properties (shown in red on the original page).
AST_Node {
  AST_Statement {
    AST_Debugger
    AST_Directive
    AST_SimpleStatement
    AST_Block {
      AST_BlockStatement
      AST_Scope {
        AST_Toplevel
        AST_Lambda {
          AST_Accessor
          AST_Function
          AST_Defun
        }
      }
      AST_Switch
      AST_SwitchBranch {
        AST_Default
        AST_Case
      }
      AST_Try
      AST_Catch
      AST_Finally
    }
    AST_EmptyStatement
    AST_StatementWithBody {
      AST_LabeledStatement
      AST_IterationStatement {
        AST_DWLoop {
          AST_Do
          AST_While
        }
        AST_For
        AST_ForIn
      }
      AST_With
      AST_If
    }
    AST_Jump {
      AST_Exit {
        AST_Return
        AST_Throw
      }
      AST_LoopControl {
        AST_Break
        AST_Continue
      }
    }
    AST_Definitions {
      AST_Var
      AST_Const
    }
  }
  AST_VarDef
  AST_Call {
    AST_New
  }
  AST_Seq
  AST_PropAccess {
    AST_Dot
    AST_Sub
  }
  AST_Unary {
    AST_UnaryPrefix
    AST_UnaryPostfix
  }
  AST_Binary {
    AST_Assign
  }
  AST_Conditional
  AST_Array
  AST_Object
  AST_ObjectProperty {
    AST_ObjectKeyVal
    AST_ObjectSetter
    AST_ObjectGetter
  }
  AST_Symbol {
    AST_SymbolAccessor
    AST_SymbolDeclaration {
      AST_SymbolVar {
        AST_SymbolFunarg
      }
      AST_SymbolConst
      AST_SymbolDefun
      AST_SymbolLambda
      AST_SymbolCatch
    }
    AST_Label
    AST_SymbolRef
    AST_LabelRef
    AST_This
  }
  AST_Constant {
    AST_String
    AST_Number
    AST_RegExp
    AST_Atom {
      AST_Null
      AST_NaN
      AST_Undefined
      AST_Hole
      AST_Infinity
      AST_Boolean {
        AST_False
        AST_True
      }
    }
  }
}
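The brace notation used in this listing can be read mechanically: a name followed by "{" opens a group of subclasses, and "}" closes it. As a rough sketch, the hypothetical helpers parseHierarchy and ancestors below (names invented for this example, not part of UglifyJS) turn a fragment of the listing into a child-to-parent map and recover the base classes of a given node type.

```javascript
// Parse the "Name { children }" notation into a child -> parent map.
// Both helpers are illustrative assumptions, not UglifyJS functions.
function parseHierarchy(text) {
  const parentOf = new Map();
  const stack = [];   // currently open classes, innermost last
  let last = null;    // most recently seen class name
  for (const tok of text.split(/\s+/).filter(Boolean)) {
    if (tok === "{") {
      stack.push(last);   // the previous name opens a group of subclasses
    } else if (tok === "}") {
      stack.pop();
    } else {
      parentOf.set(tok, stack.length ? stack[stack.length - 1] : null);
      last = tok;
    }
  }
  return parentOf;
}

// Follow parent links from a class up to the root.
function ancestors(parentOf, name) {
  const chain = [];
  for (let cur = name; cur != null; cur = parentOf.get(cur)) {
    chain.push(cur);
  }
  return chain;
}

// A fragment of the hierarchy listed above.
const fragment = `
AST_Node {
  AST_Statement {
    AST_Block {
      AST_Scope {
        AST_Toplevel
        AST_Lambda { AST_Accessor AST_Function AST_Defun }
      }
    }
  }
  AST_Call { AST_New }
}`;

const parents = parseHierarchy(fragment);
console.log(ancestors(parents, "AST_Defun").join(" -> "));
// AST_Defun -> AST_Lambda -> AST_Scope -> AST_Block -> AST_Statement -> AST_Node
```

Reading the chain from left to right gives every class whose properties an AST_Defun node inherits.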
The tokenizer
To operate at a higher level, the parser works in tandem with a tokenizer. The tokenizer is initialized with the source code text and reads one token at a time, producing an AST_Token object which has the following properties:
- type: the type of this token; one of "num", "string", "regexp", "operator", "punc", "atom", "name", "keyword", "comment1" or "comment2". "comment1" and "comment2" denote single-line and multi-line comments, respectively.
- file: the name of the file this token originated from. Useful when compressing multiple files at once to generate the proper source map.
- value: the "value" of the token; this is additional information that depends on the token type. For "num", "string" and "regexp" tokens you get their literal value; for "operator" you get the operator; for "punc" it's the punctuation sign (parens, comma, semicolon etc.); for "atom", "name" and "keyword" it's the name of the identifier; and for comments it's the body of the comment (excluding the initial "//" and "/*").
- line and col: the location of this token in the original code. The line is a 1-based index and the column is a 0-based index.
- pos and endpos: the zero-based start and end positions of this token in the original text.
- nlb: short for "newline before"; a boolean that tells us whether there was a newline before this token in the original source. It helps with automatic semicolon insertion. For multi-line comments in particular, it is set to true if there was a newline before the comment or if the comment itself contains a newline.
- comments_before: this doesn't apply to comment tokens; for all other token types it is an array of the comment tokens found before the token.
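To make these properties concrete, here is a deliberately tiny tokenizer written only for illustration. It is not UglifyJS's tokenizer: the function tinyTokenize, the keyword and atom lists, and the file name "demo.js" are all assumptions made for this sketch. It recognizes only integers, names, single-character operators and punctuation, and "//" comments, and it assumes endpos points just past the token's last character.

```javascript
// Illustrative only: a toy tokenizer that fills in the properties described
// above. It does not handle strings, regexps or multi-line comments.
const KEYWORDS = new Set(["var", "if", "else", "return", "function"]);
const ATOMS = new Set(["true", "false", "null"]);

function tinyTokenize(source, file) {
  const tokens = [];
  let pos = 0, line = 1, col = 0;
  let nlb = false;            // was there a newline since the previous token?
  let pendingComments = [];   // comments waiting for the next real token

  while (pos < source.length) {
    const ch = source[pos];
    if (ch === "\n") { pos++; line++; col = 0; nlb = true; continue; }
    if (/\s/.test(ch)) { pos++; col++; continue; }

    const rest = source.slice(pos);
    let m, type, value;
    if ((m = /^\/\/(.*)/.exec(rest))) {
      type = "comment1"; value = m[1];          // body without the "//"
    } else if ((m = /^\d+/.exec(rest))) {
      type = "num"; value = Number(m[0]);
    } else if ((m = /^[A-Za-z_$][\w$]*/.exec(rest))) {
      value = m[0];
      type = KEYWORDS.has(value) ? "keyword" : ATOMS.has(value) ? "atom" : "name";
    } else if ((m = /^[(){}\[\];,.]/.exec(rest))) {
      type = "punc"; value = m[0];
    } else if ((m = /^[-+*\/=<>!]/.exec(rest))) {
      type = "operator"; value = m[0];
    } else {
      throw new SyntaxError(`Unexpected character ${ch} at ${line}:${col}`);
    }

    const tok = { type, file, value, line, col, pos, endpos: pos + m[0].length, nlb };
    if (type === "comment1") {
      pendingComments.push(tok);   // comments attach to the next real token
    } else {
      tok.comments_before = pendingComments;
      pendingComments = [];
      tokens.push(tok);
    }
    pos = tok.endpos;
    col += m[0].length;
    nlb = false;
  }
  return tokens;   // comments after the last real token are dropped here
}

const toks = tinyTokenize("var a = 1;\n// answer\nb = 42;", "demo.js");
for (const t of toks) {
  console.log(t.type, JSON.stringify(t.value), `${t.line}:${t.col}`, t.pos, t.endpos, t.nlb);
}
```

In this run the name token b sits on line 3, column 0, has nlb set to true, and carries the "// answer" comment in its comments_before array.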
The start and end properties of AST nodes are AST_Token objects and tell you where that node begins and ends. The AST_Toplevel is the single node that might start in one file and end in another (when parsing multiple files); the parser will properly update its end property.
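Because start and end are tokens, a node's extent in the source can be recovered from start.pos and end.endpos. The sketch below uses a hand-built object in place of a real parsed node; the helper nodeSource is a hypothetical name, and the code assumes a single input file and that endpos points just past the token's last character.

```javascript
// Return the slice of the original source covered by a node, using the
// start token's pos and the end token's endpos (assumed to be exclusive).
function nodeSource(node, source) {
  return source.slice(node.start.pos, node.end.endpos);
}

const source = "var x = 1; foo(x, 2);";

// Hand-built stand-in for the AST_Call node of `foo(x, 2)`; a real node
// would come from the parser and carry full AST_Token objects.
const callNode = {
  start: { type: "name", value: "foo", line: 1, col: 11, pos: 11, endpos: 14 },
  end:   { type: "punc", value: ")",   line: 1, col: 19, pos: 19, endpos: 20 },
};

console.log(nodeSource(callNode, source)); // foo(x, 2)
```

This approach does not carry over directly to an AST_Toplevel that spans several files, since its start and end tokens then refer to positions in different source texts.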