Official documentation
https://codeql.github.com/docs/codeql-language-guides/codeql-library-for-javascript/

0x01. Overview

0x02. About the JavaScript library

The library's classes are grouped into the following 10 categories:

  • Textual: classes that represent source code as unstructured text files
  • Lexical: classes that represent source code as a series of tokens and comments
  • Syntactic: classes that represent source code as an abstract syntax tree
  • Name binding: classes that represent scopes and variables
  • Control flow: classes that represent the flow of control during execution
  • Data flow: classes that you can use to reason about data flow in JavaScript source code
  • Type inference: classes that you can use to approximate types for JavaScript expressions and variables
  • Call graph: classes that represent the caller-callee relationship between functions
  • Inter-procedural data flow: classes that you can use to define inter-procedural data flow and taint tracking analyses
  • Frameworks: classes that represent source code entities that have a special meaning to JavaScript tools and frameworks

I. Textual level

The textual level is the most basic level.

1. Files and folders

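Container: a file or folder
File: a file
Folder: a folder

File and Folder are both subclasses of Container. The predicates below are listed as the result type followed by Class::predicate.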

File Location::getFile
File Locatable::getFile
Location Line::getLocation
TopLevel Line::getTopLevel
File Folder::getFile
File Container::getAFile
Folder Container::getAFolder
Container Container::getParentContainer

File path: $HOME/vscode-codeql-starter/ql/javascript/ql/test/query-tests/Security/CWE-079/DomBasedXss/string-manipulations.js
File.getBaseName()
Output: string-manipulations.js
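
To make the result above concrete, here is a minimal JavaScript sketch of the same idea: keep only the last segment of a path. The helper name baseName is hypothetical and is not part of CodeQL; it only mimics the output shown for File.getBaseName().

```javascript
// Hypothetical helper (not a CodeQL API): returns the last segment of a
// "/"-separated path, mirroring the output shown for File.getBaseName().
function baseName(path) {
  const slash = path.lastIndexOf("/");
  return slash === -1 ? path : path.slice(slash + 1);
}

const samplePath =
  "$HOME/vscode-codeql-starter/ql/javascript/ql/test/query-tests/Security/CWE-079/DomBasedXss/string-manipulations.js";

console.log(baseName(samplePath)); // string-manipulations.js
```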

2. Locations

Locations module
Provides the file name, start line, start column, end line and end column of a location.

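3. Lines

Text lines in a file are represented by the Line class (Lines module).
Line.getText() returns the text of the current line, without the line terminator.

Note: by default, the text lines of a program are not included in the CodeQL database. The extractor option that controls this is:
--extract-program-text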

II. Lexical level

The Token and Comment classes
https://developer.mozilla.org/zh-CN/docs/Web/JavaScript/Reference/Lexical_grammar

Tokens

Comments

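Get all comments, skipping files under an externs directory:

    import javascript

    // regexpMatch must match the whole path, so this excludes files under an externs/ directory
    from Comment com
    where not com.getFile().getAbsolutePath().regexpMatch(".+\\/externs\\/.+")
    select com.getFile(), com, com.getText()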
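
The path filter in the query can be tried outside CodeQL. The following is a rough JavaScript sketch, assuming that regexpMatch succeeds only when the whole string matches (hence the added ^ and $ anchors). The file paths are made up for illustration.

```javascript
// Same pattern as in the query. The ^...$ anchors emulate regexpMatch,
// which holds only when the entire string matches the pattern.
const externsPattern = /^.+\/externs\/.+$/;

// Hypothetical file paths, for illustration only.
const paths = [
  "/src/app/main.js",
  "/src/externs/es5.js",
  "/externs/lib.js", // nothing before "/externs/", so the leading .+ cannot match
  "/src/externs/",   // nothing after "/externs/", so the trailing .+ cannot match
];

// Keep the paths the query would NOT exclude.
const kept = paths.filter((p) => !externsPattern.test(p));
console.log(kept); // [ '/src/app/main.js', '/externs/lib.js', '/src/externs/' ]
```

Because of the leading .+, a top-level /externs/ directory is not filtered out; that is harmless for absolute paths inside a project, but worth knowing when adapting the pattern.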

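III. Syntactic level

AST module
https://codeql.github.com/codeql-standard-libraries/javascript/semmle/javascript/AST.qll/module.AST.html
Abstract syntax trees (ASTs)

The ASTNode class contains all entities that represent nodes in the abstract syntax tree, and defines generic tree traversal predicates: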

  • ASTNode.getChild(i): returns the ith child of this AST node.
  • ASTNode.getAChild(): returns any child of this AST node.
  • ASTNode.getParent(): returns the parent node of this AST node, if any.

Top level

TopLevel

  • TopLevel.getNumberOfLines() returns the total number of lines (including code, comments and whitespace) in the top-level.
  • TopLevel.getNumberOfLinesOfCode() returns the number of lines of code, that is, lines that contain at least one token.
  • TopLevel.getNumberOfLinesOfComments() returns the number of lines containing or belonging to a comment.
  • TopLevel.isMinified() determines whether the top-level contains minified code, using a heuristic based on the average number of statements per line.

TopLevel subclasses


Statements

Stmt module

  • Stmt: use Stmt.getContainer() to access the innermost function or top-level in which the statement is contained.
    • ControlStmt: a statement that controls the execution of other statements, that is, a conditional, loop, try or with statement; use ControlStmt.getAControlledStmt() to access the statements that it controls.
      • IfStmt: an if statement; use IfStmt.getCondition(), IfStmt.getThen() and IfStmt.getElse() to access its condition expression, “then” branch and “else” branch, respectively.
      • LoopStmt: a loop; use LoopStmt.getBody() and LoopStmt.getTest() to access its body and its test expression, respectively.
        • WhileStmt, DoWhileStmt: a “while” or “do-while” loop, respectively.
        • ForStmt: a “for” statement; use ForStmt.getInit() and ForStmt.getUpdate() to access the init and update expressions, respectively.
        • EnhancedForLoop: a “for-in” or “for-of” loop; use EnhancedForLoop.getIterator() to access the loop iterator (which may be an expression or variable declaration), and EnhancedForLoop.getIterationDomain() to access the expression being iterated over.
      • WithStmt: a “with” statement; use WithStmt.getExpr() and WithStmt.getBody() to access the controlling expression and the body of the with statement, respectively.
      • SwitchStmt: a switch statement; use SwitchStmt.getExpr() to access the expression on which the statement switches; use SwitchStmt.getCase(int) and SwitchStmt.getACase() to access individual switch cases; each case is modeled by an entity of class Case, whose member predicates Case.getExpr() and Case.getBodyStmt(int) provide access to the expression checked by the switch case (which is undefined for default), and its body.
      • TryStmt: a “try” statement; use TryStmt.getBody(), TryStmt.getCatchClause() and TryStmt.getFinally() to access its body, “catch” clause and “finally” block, respectively.
    • BlockStmt: a block of statements; use BlockStmt.getStmt(int) to access the individual statements in the block.
    • ExprStmt: an expression statement; use ExprStmt.getExpr() to access the expression itself.
    • JumpStmt: a statement that disrupts structured control flow, that is, one of break, continue, return and throw; use predicate JumpStmt.getTarget() to determine the target of the jump, which is either a statement or (for return and uncaught throw statements) the enclosing function.
      • BreakStmt: a “break” statement; use BreakStmt.getLabel() to access its (optional) target label.
      • ContinueStmt: a “continue” statement; use ContinueStmt.getLabel() to access its (optional) target label.
      • ReturnStmt: a “return” statement; use ReturnStmt.getExpr() to access its (optional) result expression.
      • ThrowStmt: a “throw” statement; use ThrowStmt.getExpr() to access its thrown expression.
    • FunctionDeclStmt: a function declaration statement; see below for available member predicates.
    • ClassDeclStmt: a class declaration statement; see below for available member predicates.
    • DeclStmt: a declaration statement containing one or more declarators which can be accessed by predicate DeclStmt.getDeclarator(int).


Expressions

Expr module

  • Expr: use Expr.getEnclosingStmt() to obtain the innermost statement to which this expression belongs; Expr.isPure() determines whether the expression is side-effect-free.
    • Identifier: an identifier; use Identifier.getName() to obtain its name.
    • Literal: a literal value; use Literal.getValue() to obtain a string representation of its value, and Literal.getRawValue() to obtain its raw source text (including surrounding quotes for string literals).
    • ThisExpr: a “this” expression.
    • SuperExpr: a “super” expression.
    • ArrayExpr: an array expression; use ArrayExpr.getElement(i) to obtain the ith element expression, and ArrayExpr.elementIsOmitted(i) to check whether the ith element is omitted.
    • ObjectExpr: an object expression; use ObjectExpr.getProperty(i) to obtain the ith property in the object expression; properties are modeled by class Property, which is described in more detail below.
    • FunctionExpr: a function expression; see below for available member predicates.
    • ArrowFunctionExpr: an ECMAScript 2015-style arrow function expression; see below for available member predicates.
    • ClassExpr: a class expression; see below for available member predicates.
    • ParExpr: a parenthesized expression; use ParExpr.getExpression() to obtain the operand expression; for any expression, Expr.stripParens() can be used to recursively strip off any parentheses.
    • SeqExpr: a sequence of two or more expressions connected by the comma operator; use SeqExpr.getOperand(i) to obtain the ith sub-expression.
    • ConditionalExpr: a ternary conditional expression; member predicates ConditionalExpr.getCondition(), ConditionalExpr.getConsequent() and ConditionalExpr.getAlternate() provide access to the condition expression, the “then” expression and the “else” expression, respectively.
    • InvokeExpr: a function call or a “new” expression; use InvokeExpr.getCallee() to obtain the expression specifying the function to be called, and InvokeExpr.getArgument(i) to obtain the ith argument expression.
      • CallExpr: a function call.
      • NewExpr: a “new” expression.
      • MethodCallExpr: a function call whose callee expression is a property access; use MethodCallExpr.getReceiver() to access the receiver expression of the method call, and MethodCallExpr.getMethodName() to get the method name (if it can be determined statically).
    • PropAccess: a property access, that is, either a “dot” expression of the form e.f or an index expression of the form e[p]; use PropAccess.getBase() to obtain the base expression on which the property is accessed (e in the example), and PropAccess.getPropertyName() to determine the name of the accessed property; if the name cannot be statically determined, getPropertyName() does not return any value.
      • DotExpr: a “dot” expression.
      • IndexExpr: an index expression (also known as computed property access).
    • UnaryExpr: a unary expression; use UnaryExpr.getOperand() to obtain the operand expression.
    • BinaryExpr: a binary expression; use BinaryExpr.getLeftOperand() and BinaryExpr.getRightOperand() to access the operand expressions.
    • Assignment: assignment expressions, either simple or compound; use Assignment.getLhs() and Assignment.getRhs() to access the left- and right-hand side, respectively.
    • UpdateExpr: an increment or decrement expression; use UpdateExpr.getOperand() to obtain the operand expression.
    • YieldExpr: a “yield” expression; use YieldExpr.getOperand() to access the (optional) operand expression; use YieldExpr.isDelegating() to check whether this is a delegating yield*.
    • TemplateLiteral: an ECMAScript 2015 template literal; TemplateLiteral.getElement(i) returns the ith element of the template, which may either be an interpolated expression or a constant template element.
    • TaggedTemplateExpr: an ECMAScript 2015 tagged template literal; use TaggedTemplateExpr.getTag() to access the tagging expression, and TaggedTemplateExpr.getTemplate() to access the template literal being tagged.
    • TemplateElement: a constant template element; as for literals, use TemplateElement.getValue() to obtain the value of the element, and TemplateElement.getRawValue() for its raw value.
    • AwaitExpr: an “await” expression; use AwaitExpr.getOperand() to access the operand expression.
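
The same kind of annotated sketch helps with the expression classes. The variable names below are made up; each comment names the class from the list above that would model the expression on that line.

```javascript
const user = { name: "Ada", tags: ["x", "y"] };  // ObjectExpr containing an ArrayExpr
const label = user.name;                         // DotExpr (a PropAccess); property name "name"
const first = user["tags"][0];                   // IndexExpr (a PropAccess), applied twice
const upper = label.toUpperCase();               // MethodCallExpr; method name "toUpperCase"
const date = new Date(0);                        // NewExpr
const kind = first === "x" ? "known" : "other";  // ConditionalExpr with a BinaryExpr condition
let count = 0;
count++;                                         // UpdateExpr
count += 2;                                      // compound Assignment
const text = `${upper}:${kind}:${count}`;        // TemplateLiteral with three interpolated expressions

console.log(text); // ADA:known:3
```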

Functions

Function
https://developer.mozilla.org/zh-CN/docs/Web/JavaScript/Reference/Functions
https://codeql.github.com/codeql-standard-libraries/javascript/semmle/javascript/Functions.qll/type.Functions$Function.html
Function.getIdentifier() returns the identifier that names the function
Function.getParameter(i) returns the ith parameter
Function.getBody() returns the function body

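1. Function declaration (function statement): FunctionDeclStmt (a subclass of Stmt)

    function name([param[, param[, ... param]]]) {
      [statements]
    }

name
The function name.
param
The name of a parameter passed to the function. Older MDN text says a function can have at most 255 parameters; the ECMAScript specification sets no such limit, so the practical maximum depends on the engine.
statements
The statements that make up the function body.

    function greet() {
      console.log("Hi");
    }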

2. Function expression: FunctionExpr (a subclass of Expr)

    var myFunction = function name([param[, param[, ... param]]]) {
      statements
    }

    var greet = function() {
      console.log("Hi");
    };

When a function is used only once, an IIFE (Immediately Invoked Function Expression) is a common pattern.

    (function() {
      statements
    })();

3. Arrow function expression (=>): ArrowFunctionExpr (also a subclass of Expr)

    ([param][, param]) => {
      statements
    }
    param => expression

param
The parameter name. Zero parameters must be written as (). A single parameter needs no parentheses (for example, foo => 1).
statements or expression
Multiple statements must be enclosed in braces, while a single expression needs none. The expression is also the function's implicit return value.

    var greet2 =
      () => console.log("Hi"); // arrow function expression
Classes and members

ClassDefinition
  ClassDeclStmt: class declaration statement
  ClassExpr: class expression
MemberDefinition
  FieldDefinition: field definition
  MethodDefinition: method definition

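Common ClassDefinition predicates

ClassDefinition.getIdentifier() returns the identifier that names the class
ClassDefinition.getSuperClass() returns the expression that specifies the superclass, if any
ClassDefinition.getMember(n) returns the member named n
ClassDefinition.getMethod(n) returns the method named n
ClassDefinition.getField(n) returns the field named n
ClassDefinition.getConstructor() returns the constructor

Common MemberDefinition predicates

MemberDefinition.getName() returns the member name
MemberDefinition.getInit() returns the member's initializer expression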
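
To connect these predicates to source code, here is a small JavaScript sketch with one class declaration and one class expression. The class names are made up, and the comments indicate which class from this section would model each piece; they are a reading aid, not query output.

```javascript
class Shape {                          // ClassDeclStmt; its identifier is "Shape"
  sides = 0;                           // FieldDefinition named "sides", initializer 0
  constructor(name) {                  // the constructor of the class
    this.name = name;
  }
  describe() {                         // MethodDefinition named "describe"
    return `${this.name} with ${this.sides} sides`;
  }
}

const Square = class extends Shape {   // ClassExpr; its superclass expression is Shape
  sides = 4;                           // FieldDefinition with initializer 4
};

console.log(new Square("square").describe()); // square with 4 sides
```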

Declarations


Properties

The Property class is a subclass of ASTNode, but it is neither an Expr nor a Stmt.
Property.getName() returns the property name
Property.getInit() returns the initializer value
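
As an illustration, each entry of the object literal below corresponds to one Property node. The object (point) is made up, and the comments describe what getName() and getInit() would be expected to refer to for that entry.

```javascript
const point = {
  x: 1,                 // name "x"; the initializer is the literal 1
  y: 1 + 1,             // name "y"; the initializer is a BinaryExpr
  sum: function () {    // name "sum"; the initializer is a FunctionExpr
    return this.x + this.y;
  },
};

console.log(point.sum()); // 3
```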

Modules

IV. Name binding

V. Control flow

VI. Type inference

VII. Data flow

VIII. Call graph

IX. Inter-procedural data flow

X. Frameworks

AngularJS

HTTP framework libraries

Node.js

NPM

React

Databases