UglifyJS — Abstract Syntax Tree (AST)

You can get a crude description of the AST with uglifyjs --ast-help. Each definition starts with the node name (e.g. AST_Node), followed by a list of its own properties in parentheses (if it has any), a string description, and any subclasses (if there are any). Nodes inherit properties from their base classes; for example, since the start and end properties are defined in the base class AST_Node, every node contains those properties.

The parser will instantiate the most specific subclass; for example you will never find an object of type AST_Node in the AST; that's just the base class. You won't find an AST_Statement either, since every kind of statement has its own dedicated subclass.
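
The inheritance rule can be illustrated with a small self-contained sketch. The classes below are hypothetical stand-ins that only borrow the names AST_Node, AST_Statement and AST_Debugger; they are not UglifyJS's actual definitions, and the start and end values are plain placeholder objects rather than real tokens.

```javascript
// Hypothetical stand-ins that mirror three names from the hierarchy.
// They are NOT UglifyJS's real classes; they only model property inheritance.
class AST_Node {
  constructor(props) {
    // start and end are defined on the base class, so every node gets them
    this.start = props.start;
    this.end = props.end;
  }
}
class AST_Statement extends AST_Node {}
class AST_Debugger extends AST_Statement {}

// A parser would create the most specific subclass, never a bare AST_Node.
const node = new AST_Debugger({
  start: { line: 1, col: 0 }, // placeholder objects standing in for tokens
  end: { line: 1, col: 8 },
});

const isNode = node instanceof AST_Node;
const isStatement = node instanceof AST_Statement;
const mostSpecific = node.constructor.name;
console.log(isNode, isStatement, mostSpecific, node.start.line);
```

Because the object is still an instance of every base class, code that only cares about "any statement" can test against the base class, while code that needs the exact kind can test against the most specific one.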

The AST nodes

The following hierarchy is generated using introspection from the UglifyJS objects. See below for some information on AST_Token, and take a look at the scope analyzer for more information about scope-related properties and SymbolDef.

AST_Node {
  AST_Statement {
    AST_Debugger
    AST_Directive
    AST_SimpleStatement
    AST_Block {
      AST_BlockStatement
      AST_Scope {
        AST_Toplevel
        AST_Lambda {
          AST_Accessor
          AST_Function
          AST_Defun
        }
      }
      AST_Switch
      AST_SwitchBranch {
        AST_Default
        AST_Case
      }
      AST_Try
      AST_Catch
      AST_Finally
    }
    AST_EmptyStatement
    AST_StatementWithBody {
      AST_LabeledStatement
      AST_IterationStatement {
        AST_DWLoop {
          AST_Do
          AST_While
        }
        AST_For
        AST_ForIn
      }
      AST_With
      AST_If
    }
    AST_Jump {
      AST_Exit {
        AST_Return
        AST_Throw
      }
      AST_LoopControl {
        AST_Break
        AST_Continue
      }
    }
    AST_Definitions {
      AST_Var
      AST_Const
    }
  }
  AST_VarDef
  AST_Call {
    AST_New
  }
  AST_Seq
  AST_PropAccess {
    AST_Dot
    AST_Sub
  }
  AST_Unary {
    AST_UnaryPrefix
    AST_UnaryPostfix
  }
  AST_Binary {
    AST_Assign
  }
  AST_Conditional
  AST_Array
  AST_Object
  AST_ObjectProperty {
    AST_ObjectKeyVal
    AST_ObjectSetter
    AST_ObjectGetter
  }
  AST_Symbol {
    AST_SymbolAccessor
    AST_SymbolDeclaration {
      AST_SymbolVar {
        AST_SymbolFunarg
      }
      AST_SymbolConst
      AST_SymbolDefun
      AST_SymbolLambda
      AST_SymbolCatch
    }
    AST_Label
    AST_SymbolRef
    AST_LabelRef
    AST_This
  }
  AST_Constant {
    AST_String
    AST_Number
    AST_RegExp
    AST_Atom {
      AST_Null
      AST_NaN
      AST_Undefined
      AST_Hole
      AST_Infinity
      AST_Boolean {
        AST_False
        AST_True
      }
    }
  }
}
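
To find out which base classes a given node type inherits from, the brace notation above can be read mechanically. The following is a minimal sketch that assumes the listing is available as plain text; the helper names parentMap and ancestors are hypothetical and not part of UglifyJS, and only an excerpt of the hierarchy is embedded to keep the example short.

```javascript
// Excerpt of the hierarchy listing, in the same brace notation as above.
const listing = `
AST_Node {
  AST_Statement {
    AST_Block {
      AST_Scope {
        AST_Toplevel
        AST_Lambda {
          AST_Accessor
          AST_Function
          AST_Defun
        }
      }
    }
  }
  AST_Call {
    AST_New
  }
}`;

// Build a child -> parent map by tracking the stack of open "{" blocks.
// (Hypothetical helper, not a UglifyJS function.)
function parentMap(text) {
  const parents = new Map();
  const stack = [];
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line === "") continue;
    if (line === "}") { stack.pop(); continue; }
    const name = line.replace(/\s*\{$/, "");
    parents.set(name, stack.length ? stack[stack.length - 1] : null);
    if (line.endsWith("{")) stack.push(name);
  }
  return parents;
}

// Walk up the map to list every base class of a node type, nearest first.
function ancestors(parents, name) {
  const chain = [];
  for (let p = parents.get(name); p; p = parents.get(p)) chain.push(p);
  return chain;
}

const parents = parentMap(listing);
const defunChain = ancestors(parents, "AST_Defun");
console.log(defunChain.join(" < "));
```

For AST_Defun the chain runs through AST_Lambda, AST_Scope, AST_Block and AST_Statement up to AST_Node, which is why a function declaration carries the properties of all of those classes.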

The tokenizer

The parser works in tandem with a lower-level tokenizer. The tokenizer is initialized with the source text and reads one token at a time, producing an AST_Token object with the following properties:

  • type — the type of this token; can be "num", "string", "regexp", "operator", "punc", "atom", "name", "keyword", "comment1" or "comment2". "comment1" and "comment2" are for single-line and multi-line comments, respectively.

  • file — the name of the file where this token originated from. Useful when compressing multiple files at once to generate the proper source map.

  • value — the "value" of the token; this is additional information that depends on the token type: for "num", "string" and "regexp" tokens you get their literal value; for "operator" you get the operator; for "punc" it's the punctuation sign (parens, comma, semicolon etc.); for "atom", "name" and "keyword" it's the name of the identifier; and for comments it's the body of the comment (excluding the initial "//" or "/*").

  • line and col — the location of this token in the original code. The line is 1-based and the column is 0-based.

  • pos and endpos — the zero-based start and end positions of this token in the original text.

  • nlb — short for "newline before"; a boolean that tells us whether there was a newline before this token in the original source. It helps with automatic semicolon insertion. For multi-line comments in particular, this will be set to true if there was a newline before the comment or if the comment itself contains a newline.

  • comments_before — this doesn't apply to comment tokens, but for all other token types it will be an array of the comment tokens that were found before it.
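
The position-related properties listed above can be made concrete with a toy tokenizer. This is only a sketch under stated assumptions and not UglifyJS's tokenizer: the function name tokenize is hypothetical, it recognizes only numbers, names and single-character operators or punctuation, it reports keywords and atoms as plain names, it ignores strings, regexps and comments, and it assumes endpos is the offset just past the last character of the token.

```javascript
// A toy tokenizer, NOT UglifyJS's implementation. It only recognizes
// numbers, names, and single-character operators or punctuation, and it
// reports keywords and atoms as plain names.
function* tokenize(text) {
  let pos = 0, line = 1, col = 0, nlb = false;
  while (pos < text.length) {
    const ch = text[pos];
    // A newline starts a new line and flags the next token with nlb.
    if (ch === "\n") { pos++; line++; col = 0; nlb = true; continue; }
    // Other whitespace is skipped but still advances the column.
    if (/\s/.test(ch)) { pos++; col++; continue; }
    let type, m;
    const rest = text.slice(pos);
    if ((m = /^\d+(\.\d+)?/.exec(rest))) type = "num";
    else if ((m = /^[A-Za-z_$][\w$]*/.exec(rest))) type = "name";
    else { m = [ch]; type = "+-*/=<>!".includes(ch) ? "operator" : "punc"; }
    const raw = m[0];
    yield {
      type,
      value: type === "num" ? Number(raw) : raw,
      line,                     // 1-based line number
      col,                      // 0-based column
      pos,                      // zero-based start offset
      endpos: pos + raw.length, // zero-based end offset (exclusive, by assumption)
      nlb,                      // true if a newline preceded this token
    };
    pos += raw.length; col += raw.length; nlb = false;
  }
}

const tokens = [...tokenize("a = 1\nb")];
console.log(tokens);
```

In this input the last token, b, sits on the second line, so it is the only one with nlb set to true; its line is 2 and its col is 0, while its pos keeps counting from the start of the whole text.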

The start and end properties of AST nodes are AST_Token objects and tell you where that node begins and ends. The AST_Toplevel is the single node that might start in one file and end in another (when parsing multiple files); the parser will properly update its end property.

Read more about the scope analyzer.