Official documentation
https://codeql.github.com/docs/codeql-language-guides/codeql-library-for-javascript/
0x01 Overview
0x02 About the JavaScript library
The library's classes are grouped into the following ten categories:
- Textual: classes that represent source code as unstructured text files
- Lexical: classes that represent source code as a series of tokens and comments
- Syntactic: classes that represent source code as an abstract syntax tree
- Name binding: classes that represent scopes and variables
- Control flow: classes that represent the flow of control during execution
- Data flow: classes that you can use to reason about data flow in JavaScript source code
- Type inference: classes that you can use to approximate types for JavaScript expressions and variables
- Call graph: classes that represent the caller-callee relationship between functions
- Inter-procedural data flow: classes that you can use to define inter-procedural data flow and taint tracking analyses
- Frameworks: classes that represent source code entities that have a special meaning to JavaScript tools and frameworks
I. Textual level
The textual level is the most basic level.
1. Files and folders
- Container: a file or folder
- File: a file
- Folder: a folder
File and Folder both extend Container.
File Location::getFile
File Locatable::getFile
Location Line::getLocation
TopLevel Line::getTopLevel
File Folder::getFile
File Container::getAFile
Folder Container::getAFolder
Container Container::getParentContainer
File name: $HOME/vscode-codeql-starter/ql/javascript/ql/test/query-tests/Security/CWE-079/DomBasedXss/string-manipulations.js
File.getBaseName()
Output: string-manipulations.js
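As a small illustration of the Container, File and Folder predicates, the query sketch below counts JavaScript files per folder. It follows the pattern used in the official guide; filtering on the "js" extension is only an illustrative choice.

```ql
import javascript

// For every folder, report its path relative to the source root and
// the number of files directly inside it whose extension is "js".
from Folder d
select d.getRelativePath(), count(File f | f = d.getAFile() and f.getExtension() = "js")
```

Folders without any matching file still appear in the results, with a count of 0.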
2. Locations
Locations module
A Location gives the file, start line, start column, end line, and end column of a program element.
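A minimal sketch of reading location information: it assumes we are interested in functions, but any element that has a location could be used in the same way.

```ql
import javascript

// Report the file and the start/end position of every function.
from Function f, Location loc
where loc = f.getLocation()
select f, loc.getFile().getBaseName(), loc.getStartLine(), loc.getStartColumn(),
  loc.getEndLine(), loc.getEndColumn()
```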
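3. Lines
Text lines in a file are represented by the Line class (Lines module).
Line.getText() returns the text of the line, excluding the line terminator.
Note: by default, the text lines of a program are not stored in the CodeQL database. The related extractor option is:
--extract-program-text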
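The sketch below shows how Line.getText() might be used to find long lines. The threshold of 120 characters is an arbitrary value chosen for illustration, and the query only returns results when the database actually contains program text.

```ql
import javascript

// List lines longer than 120 characters together with their length.
// Yields nothing if text lines were not extracted into the database.
from Line l
where l.getText().length() > 120
select l, l.getText().length()
```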
II. Lexical level
Token and Comment classes
https://developer.mozilla.org/zh-CN/docs/Web/JavaScript/Reference/Lexical_grammar
Tokens
Comments
Get all comments, skipping files under an externs directory:
import javascript
from Comment com
where not com.getFile().toString().regexpMatch(".+\\/externs\\/.+")
select com.getFile(), com, com.getText()
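Tokens can be queried in a similar way. The sketch below is adapted from the example in the official guide: it defines a helper class CommaToken (a name introduced for this example only) and reports two consecutive commas, which indicate an omitted array element as in [1,,2].

```ql
import javascript

// A punctuator token whose text is a single comma.
class CommaToken extends PunctuatorToken {
  CommaToken() { this.getValue() = "," }
}

// A comma immediately followed by another comma.
from CommaToken comma
where comma.getNextToken() instanceof CommaToken
select comma, "Omitted array elements are bad style."
```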
III. Syntactic level
AST module
https://codeql.github.com/codeql-standard-libraries/javascript/semmle/javascript/AST.qll/module.AST.html
Abstract syntax trees (ASTs)
The ASTNode class contains all entities that represent nodes in the abstract syntax tree and defines generic tree traversal predicates:
- ASTNode.getChild(i): returns the ith child of this AST node.
- ASTNode.getAChild(): returns any child of this AST node.
- ASTNode.getParent(): returns the parent node of this AST node, if any.
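To see the traversal predicates in action, the sketch below lists the direct children of every if statement. Choosing IfStmt is arbitrary; the predicates are inherited by all AST node classes.

```ql
import javascript

// For each `if` statement, list its direct children and their child index.
from IfStmt s, int i
select s, i, s.getChild(i)
```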
Top-levels
- TopLevel.getNumberOfLines(): returns the total number of lines (including code, comments and whitespace) in the top-level.
- TopLevel.getNumberOfLinesOfCode(): returns the number of lines of code, that is, lines that contain at least one token.
- TopLevel.getNumberOfLinesOfComments(): returns the number of lines containing or belonging to a comment.
- TopLevel.isMinified(): determines whether the top-level contains minified code, using a heuristic based on the average number of statements per line.
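A short sketch combining these predicates: it reports size metrics for every top-level that the heuristic does not consider minified.

```ql
import javascript

// Size metrics for all top-levels that are not classified as minified.
from TopLevel tl
where not tl.isMinified()
select tl, tl.getNumberOfLines(), tl.getNumberOfLinesOfCode(), tl.getNumberOfLinesOfComments()
```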
TopLevel subclasses
- TopLevel
  - Script: a stand-alone file or HTML <script> element
    - ExternalScript: a stand-alone JavaScript file
    - InlineScript: code embedded inline in an HTML <script> tag
  - CodeInAttribute: a code block originating from an HTML attribute value
    - EventHandlerCode: code from an event handler attribute such as onload
    - JavaScriptURL: code from a URL with the javascript: scheme
  - Externs: a JavaScript file containing externs definitions
Statements
Stmt module
- Stmt: use Stmt.getContainer() to access the innermost function or top-level in which the statement is contained.
  - ControlStmt: a statement that controls the execution of other statements, that is, a conditional, loop, try or with statement; use ControlStmt.getAControlledStmt() to access the statements that it controls.
    - IfStmt: an if statement; use IfStmt.getCondition(), IfStmt.getThen() and IfStmt.getElse() to access its condition expression, "then" branch and "else" branch, respectively.
    - LoopStmt: a loop; use LoopStmt.getBody() and LoopStmt.getTest() to access its body and its test expression, respectively.
      - WhileStmt, DoWhileStmt: a "while" or "do-while" loop, respectively.
      - ForStmt: a "for" statement; use ForStmt.getInit() and ForStmt.getUpdate() to access the init and update expressions, respectively.
      - EnhancedForLoop: a "for-in" or "for-of" loop; use EnhancedForLoop.getIterator() to access the loop iterator (which may be an expression or variable declaration), and EnhancedForLoop.getIterationDomain() to access the expression being iterated over.
    - WithStmt: a "with" statement; use WithStmt.getExpr() and WithStmt.getBody() to access the controlling expression and the body of the with statement, respectively.
    - SwitchStmt: a switch statement; use SwitchStmt.getExpr() to access the expression on which the statement switches; use SwitchStmt.getCase(int) and SwitchStmt.getACase() to access individual switch cases; each case is modeled by an entity of class Case, whose member predicates Case.getExpr() and Case.getBodyStmt(int) provide access to the expression checked by the switch case (which is undefined for default), and its body.
    - TryStmt: a "try" statement; use TryStmt.getBody(), TryStmt.getCatchClause() and TryStmt.getFinally() to access its body, "catch" clause and "finally" block, respectively.
  - BlockStmt: a block of statements; use BlockStmt.getStmt(int) to access the individual statements in the block.
  - ExprStmt: an expression statement; use ExprStmt.getExpr() to access the expression itself.
  - JumpStmt: a statement that disrupts structured control flow, that is, one of break, continue, return and throw; use predicate JumpStmt.getTarget() to determine the target of the jump, which is either a statement or (for return and uncaught throw statements) the enclosing function.
    - BreakStmt: a "break" statement; use BreakStmt.getLabel() to access its (optional) target label.
    - ContinueStmt: a "continue" statement; use ContinueStmt.getLabel() to access its (optional) target label.
    - ReturnStmt: a "return" statement; use ReturnStmt.getExpr() to access its (optional) result expression.
    - ThrowStmt: a "throw" statement; use ThrowStmt.getExpr() to access its thrown expression.
  - FunctionDeclStmt: a function declaration statement; see below for available member predicates.
  - ClassDeclStmt: a class declaration statement; see below for available member predicates.
  - DeclStmt: a declaration statement containing one or more declarators which can be accessed by predicate DeclStmt.getDeclarator(int).
    - VarDeclStmt, ConstDeclStmt, LetStmt: a var, const or let declaration statement.
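As a hedged example of how the statement classes are typically used, the sketch below finds if statements that have no else branch and reports the function or top-level that contains each of them.

```ql
import javascript

// `if` statements without an `else` branch, with their condition
// and the innermost enclosing function or top-level.
from IfStmt i
where not exists(i.getElse())
select i, i.getCondition(), i.getContainer()
```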
Expressions
Expr module
- Expr: use Expr.getEnclosingStmt() to obtain the innermost statement to which this expression belongs; Expr.isPure() determines whether the expression is side-effect-free.
  - Identifier: an identifier; use Identifier.getName() to obtain its name.
  - Literal: a literal value; use Literal.getValue() to obtain a string representation of its value, and Literal.getRawValue() to obtain its raw source text (including surrounding quotes for string literals).
    - NullLiteral, BooleanLiteral, NumberLiteral, StringLiteral, RegExpLiteral: different kinds of literals.
  - ThisExpr: a "this" expression.
  - SuperExpr: a "super" expression.
  - ArrayExpr: an array expression; use ArrayExpr.getElement(i) to obtain the ith element expression, and ArrayExpr.elementIsOmitted(i) to check whether the ith element is omitted.
  - ObjectExpr: an object expression; use ObjectExpr.getProperty(i) to obtain the ith property in the object expression; properties are modeled by class Property, which is described in more detail below.
  - FunctionExpr: a function expression; see below for available member predicates.
  - ArrowFunctionExpr: an ECMAScript 2015-style arrow function expression; see below for available member predicates.
  - ClassExpr: a class expression; see below for available member predicates.
  - ParExpr: a parenthesized expression; use ParExpr.getExpression() to obtain the operand expression; for any expression, Expr.stripParens() can be used to recursively strip off any parentheses.
  - SeqExpr: a sequence of two or more expressions connected by the comma operator; use SeqExpr.getOperand(i) to obtain the ith sub-expression.
  - ConditionalExpr: a ternary conditional expression; member predicates ConditionalExpr.getCondition(), ConditionalExpr.getConsequent() and ConditionalExpr.getAlternate() provide access to the condition expression, the "then" expression and the "else" expression, respectively.
  - InvokeExpr: a function call or a "new" expression; use InvokeExpr.getCallee() to obtain the expression specifying the function to be called, and InvokeExpr.getArgument(i) to obtain the ith argument expression.
    - CallExpr: a function call.
    - NewExpr: a "new" expression.
    - MethodCallExpr: a function call whose callee expression is a property access; use MethodCallExpr.getReceiver() to access the receiver expression of the method call, and MethodCallExpr.getMethodName() to get the method name (if it can be determined statically).
  - PropAccess: a property access, that is, either a "dot" expression of the form e.f or an index expression of the form e[p]; use PropAccess.getBase() to obtain the base expression on which the property is accessed (e in the example), and PropAccess.getPropertyName() to determine the name of the accessed property; if the name cannot be statically determined, getPropertyName() does not return any value.
  - UnaryExpr: a unary expression; use UnaryExpr.getOperand() to obtain the operand expression.
    - NegExpr ("-"), PlusExpr ("+"), LogNotExpr ("!"), BitNotExpr ("~"), TypeofExpr, VoidExpr, DeleteExpr, SpreadElement ("..."): various types of unary expressions.
  - BinaryExpr: a binary expression; use BinaryExpr.getLeftOperand() and BinaryExpr.getRightOperand() to access the operand expressions.
    - Comparison: any comparison expression.
      - EqualityTest: any equality or inequality test.
        - EqExpr ("=="), NEqExpr ("!="): non-strict equality and inequality tests.
        - StrictEqExpr ("==="), StrictNEqExpr ("!=="): strict equality and inequality tests.
      - LTExpr ("<"), LEExpr ("<="), GTExpr (">"), GEExpr (">="): numeric comparisons.
    - LShiftExpr ("<<"), RShiftExpr (">>"), URShiftExpr (">>>"): shift operators.
    - AddExpr ("+"), SubExpr ("-"), MulExpr ("*"), DivExpr ("/"), ModExpr ("%"), ExpExpr ("**"): arithmetic operators.
    - BitOrExpr ("|"), XOrExpr ("^"), BitAndExpr ("&"): bitwise operators.
    - InExpr: an in test.
    - InstanceofExpr: an instanceof test.
    - LogAndExpr ("&&"), LogOrExpr ("||"): short-circuiting logical operators.
  - Assignment: assignment expressions, either simple or compound; use Assignment.getLhs() and Assignment.getRhs() to access the left- and right-hand side, respectively.
    - AssignExpr: a simple assignment expression.
    - CompoundAssignExpr: a compound assignment expression.
      - AssignAddExpr, AssignSubExpr, AssignMulExpr, AssignDivExpr, AssignModExpr, AssignLShiftExpr, AssignRShiftExpr, AssignURShiftExpr, AssignOrExpr, AssignXOrExpr, AssignAndExpr, AssignExpExpr: different kinds of compound assignment expressions.
  - UpdateExpr: an increment or decrement expression; use UpdateExpr.getOperand() to obtain the operand expression.
    - PreIncExpr, PostIncExpr: an increment expression.
    - PreDecExpr, PostDecExpr: a decrement expression.
  - YieldExpr: a "yield" expression; use YieldExpr.getOperand() to access the (optional) operand expression; use YieldExpr.isDelegating() to check whether this is a delegating yield*.
  - TemplateLiteral: an ECMAScript 2015 template literal; TemplateLiteral.getElement(i) returns the ith element of the template, which may either be an interpolated expression or a constant template element.
  - TaggedTemplateExpr: an ECMAScript 2015 tagged template literal; use TaggedTemplateExpr.getTag() to access the tagging expression, and TaggedTemplateExpr.getTemplate() to access the template literal being tagged.
  - TemplateElement: a constant template element; as for literals, use TemplateElement.getValue() to obtain the value of the element, and TemplateElement.getRawValue() for its raw value.
  - AwaitExpr: an "await" expression; use AwaitExpr.getOperand() to access the operand expression.
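The expression classes are usually combined in a query. The sketch below looks for calls of the form document.write(x); the method and receiver names are purely illustrative and are matched syntactically, without any data flow reasoning.

```ql
import javascript

// Method calls named "write" whose receiver is the identifier `document`.
from MethodCallExpr call, Identifier recv
where
  call.getMethodName() = "write" and
  recv = call.getReceiver().stripParens() and
  recv.getName() = "document"
select call, call.getArgument(0)
```

Because getArgument(0) is selected, calls without any argument produce no result row.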
Functions
Function class
https://developer.mozilla.org/zh-CN/docs/Web/JavaScript/Reference/Functions
https://codeql.github.com/codeql-standard-libraries/javascript/semmle/javascript/Functions.qll/type.Functions$Function.html
- Function.getIdentifier(): returns the identifier that names the function
- Function.getParameter(i): returns the ith parameter
- Function.getBody(): returns the function body
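The following sketch, adapted from the example in the official guide, uses Function.getParameter(i) to find functions in which two different parameters bind the same variable, as in function f(x, x) {}.

```ql
import javascript

// Two parameters at different positions that bind the same variable.
from Function fun, Parameter p, Parameter q, int i, int j
where
  p = fun.getParameter(i) and
  q = fun.getParameter(j) and
  i < j and
  p.getAVariable() = q.getAVariable()
select fun, "This function has two parameters that bind the same variable."
```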
1. Function declaration (function statement): FunctionDeclStmt (a subclass of Stmt)
function name([param,[, param,[..., param]]]) {
[statements]
}
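name
The function name.
param
The name of an argument passed to the function. According to the MDN text quoted here, a function can have at most 255 parameters; the ECMAScript specification itself sets no such limit.
statements
The statements that make up the function body.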
function greet() {
console.log("Hi");
}
2. Function expression: FunctionExpr (a subclass of Expr)
var myFunction = function name([param[, param[, ...param]]]) {
statements
}
var greet = function() {
console.log("Hi");
};
When a function is used only once, an IIFE (Immediately Invoked Function Expression) is commonly used.
(function() {
statements
})();
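3. Arrow function expression (=>): ArrowFunctionExpr (also a subclass of Expr)
([param][, param]) => {
statements
}
param => expression
param
The parameter name. Zero parameters must be written as (). A single parameter needs no parentheses (for example, foo => 1).
statements or expression
Multiple statements must be enclosed in braces; a single expression need not be. The expression is also the function's implicit return value.
var greet2 =
() => console.log("Hi") // arrow function expression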
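To relate the three syntactic forms above to their CodeQL classes, the sketch below labels every function with the form it was written in. The label strings are arbitrary names chosen for this example.

```ql
import javascript

// Classify each function by its syntactic form.
from Function f, string kind
where
  f instanceof FunctionDeclStmt and kind = "declaration"
  or
  f instanceof FunctionExpr and kind = "function expression"
  or
  f instanceof ArrowFunctionExpr and kind = "arrow function"
select f, kind
```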
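Classes and members
ClassDefinition
Class declaration statement: ClassDeclStmt
Class expression: ClassExpr
MemberDefinition
Field definition: FieldDefinition
Method definition: MethodDefinition
Commonly used ClassDefinition predicates
ClassDefinition.getIdentifier() returns the class name
ClassDefinition.getSuperClass() returns the superclass expression
ClassDefinition.getMember(n) returns the class member named n
ClassDefinition.getMethod(n) returns the method named n
ClassDefinition.getField(n) returns the field named n
ClassDefinition.getConstructor() returns the class constructor
Commonly used MemberDefinition predicates
MemberDefinition.getName() returns the member name
MemberDefinition.getInit() returns the member's initializer expression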
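A minimal sketch using the ClassDefinition predicates listed above: it reports every class that has an extends clause, together with its superclass expression and its constructor.

```ql
import javascript

// Classes that extend another class.
from ClassDefinition c
where exists(c.getSuperClass())
select c, c.getSuperClass(), c.getConstructor()
```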
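Declarations
Properties
The Property class is a subclass of ASTNode, but it is neither an Expr nor a Stmt.
Property.getName() returns the property name
Property.getInit() returns the initializer value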
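The sketch below, adapted from the example in the official guide, uses Property.getName() to find object expressions that define the same property twice. Minified code is excluded to reduce noise.

```ql
import javascript

// Object literals in which two properties share the same name.
from ObjectExpr oe, Property p1, Property p2, int i, int j
where
  p1 = oe.getProperty(i) and
  p2 = oe.getProperty(j) and
  i < j and
  p1.getName() = p2.getName() and
  not oe.getTopLevel().isMinified()
select oe, "Property " + p1.getName() + " is defined both $@ and $@.", p1, "here", p2, "here"
```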