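A function whose parameters have default values gets an extra varblock scope created inside it!
Why? What are the details, and how does the lookup strategy change?
- v8/src/parsing/preparser.cc
```cpp
if (!formals.is_simple) {
  BuildParameterInitializationBlock(formals);
}
// ...
if (!formals.is_simple) {
  // A varblock scope is both a declaration scope and a block scope, and ...
  inner_scope = NewVarblockScope();
  inner_scope->set_start_position(position());
}
```
3. v8/src/parsing/parser-base.h
The details are here: default parameters and rest parameters --> varblock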
```cpp
template <typename Impl>
void ParserBase<Impl>::ParseFunctionBody(
StatementListT* body, IdentifierT function_name, int pos,
const FormalParametersT& parameters, FunctionKind kind,
FunctionSyntaxKind function_syntax_kind, FunctionBodyType body_type) {
if (IsResumableFunction(kind)) impl()->PrepareGeneratorVariables();
DeclarationScope* function_scope = parameters.scope;
DeclarationScope* inner_scope = function_scope;
// Building the parameter initialization block declares the parameters.
// TODO(verwaest): Rely on ArrowHeadParsingScope instead.
if (V8_UNLIKELY(!parameters.is_simple)) {
if (has_error()) return;
BlockT init_block = impl()->BuildParameterInitializationBlock(parameters);
if (IsAsyncFunction(kind) && !IsAsyncGeneratorFunction(kind)) {
init_block = impl()->BuildRejectPromiseOnException(init_block);
}
body->Add(init_block);
if (has_error()) return;
inner_scope = NewVarblockScope();
inner_scope->set_start_position(scanner()->location().beg_pos);
}
StatementListT inner_body(pointer_buffer());
{
BlockState block_state(&scope_, inner_scope);
if (body_type == FunctionBodyType::kExpression) {
ExpressionT expression = ParseAssignmentExpression();
if (IsAsyncFunction(kind)) {
BlockT block = factory()->NewBlock(1, true);
impl()->RewriteAsyncFunctionBody(&inner_body, block, expression);
} else {
inner_body.Add(
BuildReturnStatement(expression, expression->position()));
}
} else {
DCHECK(accept_IN_);
DCHECK_EQ(FunctionBodyType::kBlock, body_type);
// If we are parsing the source as if it is wrapped in a function, the
// source ends without a closing brace.
Token::Value closing_token =
function_syntax_kind == FunctionSyntaxKind::kWrapped ? Token::EOS
: Token::RBRACE;
if (IsAsyncGeneratorFunction(kind)) {
impl()->ParseAndRewriteAsyncGeneratorFunctionBody(pos, kind,
&inner_body);
} else if (IsGeneratorFunction(kind)) {
impl()->ParseAndRewriteGeneratorFunctionBody(pos, kind, &inner_body);
} else if (IsAsyncFunction(kind)) {
ParseAsyncFunctionBody(inner_scope, &inner_body);
} else {
ParseStatementList(&inner_body, closing_token);
}
if (IsDerivedConstructor(kind)) {
ExpressionParsingScope expression_scope(impl());
inner_body.Add(factory()->NewReturnStatement(impl()->ThisExpression(),
kNoSourcePosition));
expression_scope.ValidateExpression();
}
Expect(closing_token);
}
}
scope()->set_end_position(end_position());
bool allow_duplicate_parameters = false;
CheckConflictingVarDeclarations(inner_scope);
if (V8_LIKELY(parameters.is_simple)) {
DCHECK_EQ(inner_scope, function_scope);
if (is_sloppy(function_scope->language_mode())) {
impl()->InsertSloppyBlockFunctionVarBindings(function_scope);
}
allow_duplicate_parameters =
is_sloppy(function_scope->language_mode()) && !IsConciseMethod(kind);
} else {
DCHECK_NOT_NULL(inner_scope);
DCHECK_EQ(function_scope, scope());
DCHECK_EQ(function_scope, inner_scope->outer_scope());
impl()->SetLanguageMode(function_scope, inner_scope->language_mode());
if (is_sloppy(inner_scope->language_mode())) {
impl()->InsertSloppyBlockFunctionVarBindings(inner_scope);
}
inner_scope->set_end_position(end_position());
if (inner_scope->FinalizeBlockScope() != nullptr) {
BlockT inner_block = factory()->NewBlock(true, inner_body);
inner_body.Rewind();
inner_body.Add(inner_block);
inner_block->set_scope(inner_scope);
impl()->RecordBlockSourceRange(inner_block, scope()->end_position());
if (!impl()->HasCheckedSyntax()) {
const AstRawString* conflict = inner_scope->FindVariableDeclaredIn(
function_scope, VariableMode::kLastLexicalVariableMode);
if (conflict != nullptr) {
impl()->ReportVarRedeclarationIn(conflict, inner_scope);
}
}
impl()->InsertShadowingVarBindingInitializers(inner_block);
}
}
ValidateFormalParameters(language_mode(), parameters,
allow_duplicate_parameters);
if (!IsArrowFunction(kind)) {
// Declare arguments after parsing the function since lexical 'arguments'
// masks the arguments object. Declare arguments before declaring the
// function var since the arguments object masks 'function arguments'.
function_scope->DeclareArguments(ast_value_factory());
}
impl()->DeclareFunctionNameVar(function_name, function_syntax_kind,
function_scope);
inner_body.MergeInto(body);
}
```
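One way to picture what the extra scope changes is a toy scope chain. The sketch below is not V8 code: `ToyScope`, `Declare`, `Lookup`, and `DemoDefaultParameterScopes` are hypothetical names. It only models the idea that, with non-simple parameters, body-level `var` declarations land in an inner scope whose outer scope is the function scope holding the parameters, so a parameter initializer resolves names without seeing the body's bindings.

```cpp
#include <string>
#include <unordered_map>

// Toy model of a scope chain; every name here is hypothetical, not V8's.
struct ToyScope {
  explicit ToyScope(ToyScope* outer = nullptr) : outer_(outer) {}

  // Declares (or overwrites) a binding in this scope only.
  void Declare(const std::string& name, int value) { vars_[name] = value; }

  // Walks outward from this scope and returns the first binding found,
  // or nullptr when the name is not declared anywhere on the chain.
  const int* Lookup(const std::string& name) const {
    for (const ToyScope* s = this; s != nullptr; s = s->outer_) {
      auto it = s->vars_.find(name);
      if (it != s->vars_.end()) return &it->second;
    }
    return nullptr;
  }

  ToyScope* outer_;
  std::unordered_map<std::string, int> vars_;
};

struct Seen {
  int initializer_x;  // what the default-value expression resolves x to
  int body_x;         // what the function body resolves x to
};

// Models: var x = 1; function f(a = x) { var x = 2; }
Seen DemoDefaultParameterScopes() {
  ToyScope script;
  script.Declare("x", 1);
  ToyScope function_scope(&script);    // holds the parameter a
  ToyScope varblock(&function_scope);  // holds the body's var x
  varblock.Declare("x", 2);
  // The initializer runs in the function scope, so it finds the outer x.
  function_scope.Declare("a", *function_scope.Lookup("x"));
  return Seen{*function_scope.Lookup("x"), *varblock.Lookup("x")};
}
```

Calling `DemoDefaultParameterScopes()` gives an initializer-side `x` of 1 and a body-side `x` of 2, which mirrors why the body needs a scope of its own once parameters carry initializers.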
- v8/src/zone/zone.h
```cpp
// The Zone supports very fast allocation of small chunks of
// memory. The chunks cannot be deallocated individually, but instead
// the Zone supports deallocating all chunks in one fast
// operation. The Zone is used to hold temporary data structures like
// the abstract syntax tree, which is deallocated after compilation.
//
// Note: There is no need to initialize the Zone; the first time an
// allocation is attempted, a segment of memory will be requested
// through the allocator.
//
// Note: The implementation is inherently not thread safe. Do not use
// from multi-threaded code.
class V8_EXPORT_PRIVATE Zone final {…}
// ZoneObject is an abstraction that helps define classes of objects
// allocated in the Zone. Use it as a base class; see ast.h.
class ZoneObject {….}
```
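The allocation strategy described in that comment (bump allocation, no individual frees, one bulk release, lazy first segment) can be sketched in a few lines. This is a minimal illustration under assumptions, not V8's implementation: `ToyZone`, its default segment size, and its use of `malloc` are all made up for the example.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// A minimal arena in the spirit of the Zone comment. ToyZone is a
// hypothetical name; the real Zone differs in segment sizing and more.
class ToyZone {
 public:
  explicit ToyZone(std::size_t segment_size = 4096)
      : segment_size_(segment_size) {}
  ToyZone(const ToyZone&) = delete;
  ToyZone& operator=(const ToyZone&) = delete;
  ~ToyZone() { ReleaseAll(); }

  // Bump-pointer allocation: no per-chunk bookkeeping, no individual free.
  void* Allocate(std::size_t size) {
    size = (size + kAlignment - 1) & ~(kAlignment - 1);
    if (segments_.empty() || used_ + size > capacity_) {
      // The first segment is requested lazily, on the first allocation.
      capacity_ = size > segment_size_ ? size : segment_size_;
      void* segment = std::malloc(capacity_);
      if (segment == nullptr) return nullptr;
      segments_.push_back(static_cast<char*>(segment));
      used_ = 0;
    }
    void* result = segments_.back() + used_;
    used_ += size;
    return result;
  }

  // Frees every chunk at once by dropping whole segments.
  void ReleaseAll() {
    for (char* segment : segments_) std::free(segment);
    segments_.clear();
    used_ = 0;
    capacity_ = 0;
  }

  std::size_t segment_count() const { return segments_.size(); }

 private:
  static constexpr std::size_t kAlignment = alignof(std::max_align_t);
  std::size_t segment_size_;
  std::size_t used_ = 0;
  std::size_t capacity_ = 0;
  std::vector<char*> segments_;
};
```

Because nothing is tracked per chunk, an AST built in such an arena can be thrown away after compilation with a single `ReleaseAll()` call.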
5. v8/src/ast/ast-value-factory.h
```cpp
// Ast(Raw|Cons)String and AstValueFactory are for storing strings and
// values independent of the V8 heap and internalizing them later. During
// parsing, they are created and stored outside the heap, in AstValueFactory.
// After parsing, the strings and values are internalized (moved into the V8
// heap).
class AstRawString final : public ZoneObject {...}
```
- v8/include/v8-isolate.h
```cpp
// Isolate represents an isolated instance of the V8 engine.
// V8 isolates have completely separate states.
// Objects from one isolate must not be used in other isolates.
// The embedder can create multiple isolates and use them in parallel in
// multiple threads.
// An isolate can be entered by at most one thread at any given time.
// The Locker/Unlocker API must be used to synchronize.
```
7. v8/include/v8-context.h
```cpp
/**
* A sandboxed execution context with its own set of built-in objects
* and functions.
*/
class V8_EXPORT Context : public Data {
public:
/**
* Returns the global proxy object.
*
* Global proxy object is a thin wrapper whose prototype points to actual
* context's global object with the properties like Object, etc. This is done
* that way for security reasons (for more details see
* https://wiki.mozilla.org/Gecko:SplitWindow).
*
* Please note that changes to global proxy object prototype most probably
* would break VM---v8 expects only global object as a prototype of global
* proxy object.
*/
Local<Object> Global();
/**
* Detaches the global object from its context before
* the global object can be reused to create a new context.
*/
void DetachGlobal();
/**
* Creates a new context and returns a handle to the newly allocated
* context.
* (Note: Local is a lightweight handle type; <Context> tells you that this
* is a handle to a Context instance.)
*
* \param isolate The isolate in which to create the context.
*
* \param extensions An optional extension configuration containing
* the extensions to be installed in the newly created context.
*
* \param global_template An optional object template from which the
* global object for the newly created context will be created.
*
* \param global_object An optional global object to be reused for
* the newly created context. This global object must have been
* created by a previous call to Context::New with the same global
* template. The state of the global object will be completely reset
* and only object identify will remain.
*/
static Local<Context> New(
Isolate* isolate, ExtensionConfiguration* extensions = nullptr,
MaybeLocal<ObjectTemplate> global_template = MaybeLocal<ObjectTemplate>(),
MaybeLocal<Value> global_object = MaybeLocal<Value>(),
DeserializeInternalFieldsCallback internal_fields_deserializer =
DeserializeInternalFieldsCallback(),
MicrotaskQueue* microtask_queue = nullptr);
......
}
// v8/include/v8-data.h
/**
* The superclass of objects that can reside on V8's heap.
*/
class V8_EXPORT Data {......}
```
v8/src/objects/property-descriptor.cc handles property descriptors of JS objects
bool PropertyDescriptor::ToPropertyDescriptor(Isolate* isolate,
Handle<Object> obj,
PropertyDescriptor* desc){......}
v8/src/objects/js-objects.cc
This file contains the source for many of the static methods on Object.
// static. Implements Object.defineProperty(obj, key, descriptor)
Object JSReceiver::DefineProperty(Isolate* isolate, Handle<Object> object,
Handle<Object> key,
Handle<Object> attributes) {
v8/src/parsing/parser-base.h
ParserBase<Impl>::ParseHoistableDeclaration(
int pos, ParseFunctionFlags flags, ZonePtrList<const AstRawString>* names,
bool default_export) {
CheckStackOverflow();
......
// Look at this part!
// In ES6, a function behaves as a lexical binding, except in
// a script scope, or the initial scope of eval or another function.
VariableMode mode =
(!scope()->is_declaration_scope() || scope()->is_module_scope())
? VariableMode::kLet
: VariableMode::kVar;
// Async functions don't undergo sloppy mode block scoped hoisting, and don't
// allow duplicates in a block. Both are represented by the
// sloppy_block_functions_. Don't add them to the map for async functions.
// Generators are also supposed to be prohibited; currently doing this behind
// a flag and UseCounting violations to assess web compatibility.
VariableKind kind = is_sloppy(language_mode()) &&
!scope()->is_declaration_scope() &&
flags == ParseFunctionFlag::kIsNormal
? SLOPPY_BLOCK_FUNCTION_VARIABLE
: NORMAL_VARIABLE;
return impl()->DeclareFunction(variable_name, function, mode, kind, pos,
end_position(), names);
}
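Local handles are managed by HandleScopes: a HandleScope is created on the C++ stack, and the handles it owns are released when it goes out of scope, so they are not heap objects. Persistent handles are not tied to a HandleScope and stay alive until they are explicitly reset.
The Context, by contrast, is allocated on the heap.
Scope inherits from ZoneObject (a helper base class for objects allocated in a Zone). The Zone relies on an allocator to hand out memory in bulk, which makes allocation and management fast and safe.
Context (via static Local
- A second, careful read of scopes.cc was a revelation; a lot of things finally make sense!
```cpp
// It is all about declaring xxxx and allocating xxxx; reading through all of it paid off.
// Parameters (must be allocated first; also arguments, though I forget when it is
// allocated), variables, and the function itself (must be last): where and in what
// order each one is allocated, and so on.
```
The script scope implicitly allocates a global object.
DeclareFormalParameters says that, for non-simple parameters, a temporary is declared for each parameter and the value is stored into it; finally the variable with the same name is also declared as let.
The home object seems to be the object a method is defined on (the spec's [[HomeObject]], which `super` lookups start from), not the class's base class itself.
An object literal also creates a block scope, but a later step removes it.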
```cpp
//parser-base.h
typename ParserBase<Impl>::ExpressionT ParserBase<Impl>::ParseObjectLiteral() {
......
Variable* home_object = nullptr;
if (block_scope->needs_home_object()) {
home_object = block_scope->DeclareHomeObjectVariable(ast_value_factory());
block_scope->set_end_position(end_position());
} else {
// An ordinary object literal takes this branch; FinalizeBlockScope then removes the newly created block scope.
block_scope = block_scope->FinalizeBlockScope();
DCHECK_NULL(block_scope);
}
}
```
- ast.h
```cpp
// Creates a FunctionLiteral representing a top-level script, the
// result of an eval (top-level or otherwise), or the result of calling
// the Function constructor.
FunctionLiteral* NewScriptOrEvalFunctionLiteral(
    DeclarationScope* scope, const ScopedPtrList<Statement>& body,
    int expected_property_count, int parameter_count) {
  return zone_->New<FunctionLiteral>(
      zone_, ast_value_factory_->empty_cons_string(), ast_value_factory_,
      scope, body, expected_property_count, parameter_count, parameter_count,
      FunctionSyntaxKind::kAnonymousExpression,
      FunctionLiteral::kNoDuplicateParameters,
      FunctionLiteral::kShouldLazyCompile, 0, /* has_braces */ false,
      kFunctionLiteralIdTopLevel);
}

 private:
  // This zone may be deallocated upon returning from parsing a function body
  // which we can guarantee is not going to be compiled or have its AST
  // inspected.
  // See ParseFunctionLiteral in parser.cc for preconditions.
  Zone* zone_;
```
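13. v8/src/interpreter/bytecodes.h: the bytecode definitions.
void BytecodeGenerator::GenerateBytecodeBody(); is the entry point for bytecode generation.
It eventually reaches VisitStatements(literal->body());, where bytecode generation starts. Before generating bytecode, AstNode->XXXtype() is used to find the concrete subclass type.
RUNTIME_FUNCTION(Runtime_DeclareGlobals) is a global function defined through a macro template. It first validates its arguments and then calls **DeclareGlobals()** to do the actual work. In our test sample, DeclareGlobals() obtains the global console object. With that object in hand, control returns to the assembly code, which reads the object's log property (that is, console.log), detects that its argument JsPrint(6) is a function, compiles that function into a bytecode sequence, and starts a new execution unit. Readers can trace this themselves.
Every node in the AST inherits from AstNode.
The AST stores VariableProxy nodes: once a proxy has been resolved, accessing it yields a Variable* var; otherwise it yields an AstRawString* raw_name_. What we call the AST is really the parse tree after pruning.
In V8, the AST is essentially the FunctionLiteral.
ParseStatementList() is where analysis of program statements begins. In the statement while (peek() == Token::STRING), peek returns the type of the next token. Here the token is Token::FUNCTION, so the condition is false and control enters the while (peek() != end_token) loop, which calls ParseStatementListItem(). That method dispatches to the handler for Token::FUNCTION, shown below: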
```cpp
ParserBase<Impl>::ParseHoistableDeclaration(
    ZonePtrList<const AstRawString>* names, bool default_export) {
  Consume(Token::FUNCTION);  // token cache mechanism
  int pos = position();
  ParseFunctionFlags flags = ParseFunctionFlag::kIsNormal;
  if (Check(Token::MUL)) {
    flags |= ParseFunctionFlag::kIsGenerator;
  }
  return ParseHoistableDeclaration(pos, flags, names, default_export);
}
```
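Consume() is the concrete implementation of the "token cache" mechanism: it takes one token from the cache to analyze, and on a cache miss it drives the lexer (Scanner) to do its work. Consume obtains the token by making the Scanner's current member point to the next member, and then uses next_next to decide whether another token has to be scanned. Readers can check the code themselves.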
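The current/next/next_next shuffle can be sketched as follows. This is a hypothetical miniature, not V8's Scanner: `ToyScanner`, `PeekAhead`, and `scan_count` are invented for the example, and tokens are plain strings handed out from a pre-split list instead of being lexed.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of the lookahead described above; not V8's Scanner.
class ToyScanner {
 public:
  explicit ToyScanner(std::vector<std::string> tokens)
      : tokens_(std::move(tokens)) {
    next_ = Scan();  // prime one token of lookahead
  }

  // Returns the upcoming token without consuming it.
  const std::string& peek() const { return next_; }

  // Advances: current <- next. If a second lookahead token was already
  // scanned (cache hit) it becomes next; otherwise the scanner runs.
  const std::string& Consume() {
    current_ = std::move(next_);
    if (has_next_next_) {
      next_ = std::move(next_next_);
      has_next_next_ = false;
    } else {
      next_ = Scan();
    }
    return current_;
  }

  // Looks two tokens ahead, scanning at most once and caching the result.
  const std::string& PeekAhead() {
    if (!has_next_next_) {
      next_next_ = Scan();
      has_next_next_ = true;
    }
    return next_next_;
  }

  int scan_count() const { return scan_count_; }

 private:
  // Stand-in for real lexing: hands out pre-split tokens, then "EOS".
  std::string Scan() {
    ++scan_count_;
    return pos_ < tokens_.size() ? tokens_[pos_++] : std::string("EOS");
  }

  std::vector<std::string> tokens_;
  std::size_t pos_ = 0;
  int scan_count_ = 0;
  std::string current_;
  std::string next_;
  std::string next_next_;
  bool has_next_next_ = false;
};
```

The `scan_count()` accessor exists only so the cache behaviour is observable: a `Consume()` that follows a `PeekAhead()` does not trigger another scan.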
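After the token function (type Token::FUNCTION) has been taken, the next step is to decide which kind of function it is (FunctionKind). The FunctionKind code is as follows: ….. (not worth reading here; it follows the ECMA specification)
ParseFunctionLiteral(): as the name suggests, its main job is to analyze the contents of a function. Once the function name has been parsed, this method analyzes the function body, first checking whether the function qualifies for lazy parsing, for example that it is not an IIFE.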
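P.S.:
Every declaration is appended to the declarations list (a linked list), but the Variable set on a declaration is always the most recent one.
Variables in an ordinary block scope are allocated in the enclosing declaration scope.
Pre-parsing creates a scope first, but never allocates memory!
When a variable lives in the context and when it is a local: I forgot to write that down at the time.
arguments is also initialized at creation time.
Declare first looks the name up in a hash map that lives in the Zone; if the name is not found, it allocates a new Variable and stores it back into that hash map.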
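The lookup-then-insert behaviour in the last note can be sketched like this. It is only an illustration under assumptions: `ToyVariable` and `ToyVariableMap` are hypothetical names, and a vector of `unique_ptr` stands in for Zone-owned storage.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for Variable and the per-scope variable map.
struct ToyVariable {
  std::string name;
};

class ToyVariableMap {
 public:
  // Returns the existing variable for `name`, or creates and records one.
  // `was_added` reports which of the two happened.
  ToyVariable* Declare(const std::string& name, bool* was_added) {
    auto it = map_.find(name);
    *was_added = (it == map_.end());
    if (*was_added) {
      storage_.push_back(std::make_unique<ToyVariable>(ToyVariable{name}));
      it = map_.emplace(name, storage_.back().get()).first;
    }
    return it->second;
  }

  std::size_t size() const { return map_.size(); }

 private:
  // Owns the variables; plays the role the Zone plays in the note above.
  std::vector<std::unique_ptr<ToyVariable>> storage_;
  std::unordered_map<std::string, ToyVariable*> map_;
};
```

Declaring the same name twice returns the same `ToyVariable*`, which matches the earlier note that repeated declarations end up pointing at one Variable.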
Parser entry point: void Parser::ParseProgram(Isolate* isolate, Handle<[Script](https://source.chromium.org/chromium/chromium/src/+/main:v8/src/objects/script.h;drc=5aac2f3910c7ab06931bf741d3878845e929003c;l=33)> script,
ParseInfo* info, MaybeHandle<[ScopeInfo](https://source.chromium.org/chromium/chromium/src/+/main:v8/src/objects/scope-info.h;drc=5aac2f3910c7ab06931bf741d3878845e929003c;l=53)> maybe_outer_scope_info) {
DCHECK_EQ(script->id(), flags().script_id());
Inside, lexical analysis (done by the Scanner) runs first, followed by parsing; the AST ends up in result.
Parsing entry point: FunctionLiteral* Parser::DoParseProgram(Isolate* isolate, ParseInfo* info) {…}
- Introduction to stack frames
// This shows how the stack frame is handled when the numbers of formal and actual arguments differ.
void Builtins::Generate_InterpreterPushArgsThenCallImpl(
MacroAssembler* masm, ConvertReceiverMode receiver_mode,
InterpreterPushArgsMode mode) {
......
}
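Number of arguments plays the same role as above: it records the number of actual arguments the callee needs.
Before calling a function (the callee), V8 pushes the actual arguments onto the stack, and the callee pushes its local variables. A stack frame is a reserved region that holds the actual arguments, the callee's return address, local variables, and saved registers. A stack frame is built in the following steps (much the same as in C or Go, it seems):
(1) Push the callee's actual arguments, if there are any.
(2) Push the callee's return address.
(3) The callee starts executing and pushes EBP.
(4) Set EBP equal to ESP. Note: EBP now marks the base of the callee's stack frame.
(5) If there are local variables, adjust ESP to reserve stack space for them.
(6) If registers need to be saved, push them.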
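- The hidden class (Map) is implemented in: class Map : public HeapObject {……}
Note: the first slot of every JavaScript object's storage is a Map pointer, so every JS object has a Map. The size of a Map does not vary with the JS object: it is always 80 bytes, and its contents are laid out as shown above. A Map describes the shape of a JS object, and different JS objects with the same shape share one Map. "Same shape" means the same type and the same internal member layout.
V8's Heap manages the creation and collection of Maps. The entry function for creating a Map is: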
AllocationResult Heap::AllocateRaw(int size_in_bytes, AllocationType type,
AllocationOrigin origin, AllocationAlignment alignment) { ...... }
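AllocateRaw() allocates the memory and returns to AllocateMap(), which initializes it.
// This initializes the Map, setting each field (bit fields, byte fields, short fields, and so on) according to the Map layout given at the beginning.
// Lines 8, 9, 10, and 13 of the code initialize the InObject data of the JSObject. "InObject" data is stored inside the JSObject itself,
// so it is faster to access. Line 28 returns the Map. At that point the Map is complete, and the storage in Figure 1 is later accessed through it.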
Map Factory::InitializeMap(Map map, InstanceType type, int instance_size,
ElementsKind elements_kind,
int inobject_properties) {......}
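Map initialization: during V8 startup, CreateInitialMaps() builds an empty Map for every JavaScript type. An "empty Map" describes the minimum memory needed to create a value of that JS type. When a developer creates a JavaScript object, V8 first uses the corresponding empty Map to request that minimum space; as the developer adds members to the JS object, its Map changes accordingly.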
bool Heap::CreateInitialMaps() {......}
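The idea that objects with the same shape share one Map, and that adding members moves an object to a different Map, can be sketched as a small transition tree. Everything here is hypothetical (`ToyMap`, `ToyObject`, `Transition`); it models only the sharing, not V8's actual Map layout or its 80-byte size.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical sketch of shape sharing; not V8's Map class.
struct ToyMap {
  std::vector<std::string> keys;  // property names in insertion order
  // Adding property `name` to this shape leads to transitions[name].
  std::map<std::string, std::unique_ptr<ToyMap>> transitions;

  // Returns the successor shape for `name`, creating it on first use.
  ToyMap* Transition(const std::string& name) {
    std::unique_ptr<ToyMap>& next = transitions[name];
    if (!next) {
      next = std::make_unique<ToyMap>();
      next->keys = keys;
      next->keys.push_back(name);
    }
    return next.get();
  }
};

struct ToyObject {
  explicit ToyObject(ToyMap* empty_map) : map(empty_map) {}

  // Adding a property moves the object to the successor shape.
  void Add(const std::string& name, int value) {
    map = map->Transition(name);
    values.push_back(value);
  }

  ToyMap* map;              // first field: pointer to the shape
  std::vector<int> values;  // slot i holds the value of map->keys[i]
};
```

Two objects that add `x` and then `y` end up pointing at the same `ToyMap`, while an object that adds `y` and then `x` gets a different one, since the member layout differs.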
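Do not misread JSFunction: it is not simply a function as written in JS source.
A C program has to be compiled, assembled, and linked before it can run. A loose but vivid analogy: the bytecode stream resembles the output of assembly (V8 calls it SharedFunction; the class is SharedFunctionInfo), and a JSFunction resembles the linked program, so a JSFunction is an instance that can actually be executed.
About builtins
A builtin (built-in function) is a precompiled chunk of code stored in the snapshot_blob.bin file. V8 loads it by deserialization at startup, and it can be called directly at run time. There are more than 600 builtins in total, divided into several subtypes and covering the interpreter, bytecode, execution units, and other core V8 functionality.
Const pool
First, a concept: the constant pool is a string-typed array, generated when JavaScript is compiled, that stores constant data.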
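As a rough picture of how such a pool is used, the sketch below stores each distinct constant once and hands back an index that a bytecode operand could carry. It is an assumption-laden toy: `ToyConstantPool` is a hypothetical name, it holds only strings, and the deduplication shown is a design choice of the example.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical sketch of a constant pool that holds only strings.
class ToyConstantPool {
 public:
  // Returns the index of `value`, appending it on first use so that equal
  // constants share one slot.
  std::size_t Insert(const std::string& value) {
    auto it = index_.find(value);
    if (it != index_.end()) return it->second;
    entries_.push_back(value);
    index_.emplace(value, entries_.size() - 1);
    return entries_.size() - 1;
  }

  // What an operand that refers to slot `i` would resolve to.
  const std::string& At(std::size_t i) const { return entries_.at(i); }

  std::size_t size() const { return entries_.size(); }

 private:
  std::vector<std::string> entries_;
  std::unordered_map<std::string, std::size_t> index_;
};
```

A property name such as `substring` would be inserted once at compile time and then referenced by its index wherever the bytecode needs it.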
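About str.property access
As Figure 3 shows, the call stack is only two levels deep, because DebugPrint is called from Builtin::LdaNamedPropertyNoFeedback and returns to assembly code when it exits.
In Figure 2, args[0] is 'hello world!', the global variable s declared earlier; note that its type is still ONE_BYTE_INTERNALIZED_STRING. args[1] is substring, read from the constant pool, with the same type. args[2] has type JS_FUNCTION: it is the address pointer of the substring() method that was looked up. Note how it differs from args[1].
That completes the conversion. We never saw the type of the string s change inside V8, yet that did not prevent the lookup of the string's substring() method. Internally, V8 subdivides the String type into many distinct string types; each XXX_STRING_TYPE in the code above denotes one of them.
(2) The type of the global string s did not change because of the '.' operation. This does not contradict the principle described in the book; V8's concrete implementation is simply different.
(3) From the point of view of the V8 source, the parent class of String is Name, the parent of Name is HeapObject, and at the root is the Object class, so the String class is itself a heap object. That is a different notion from the book's, which stresses that when '.' is applied to a string it is no longer a string but an object. Note the distinction.
JavaScript source is first converted into a V8 internal string; compiling the internal string produces a SharedFunction; binding the SharedFunction to a Context and other information produces a JSFunction, which is handed to the execution unit.
(1) JavaScript source has to be transcoded when it enters V8;
(2) Inside V8, JavaScript source is represented by the Source class, whose full name is v8::internal::source;
(3) The compilation cache is checked first, and compilation starts on a cache miss;
(4) The parser starts first (result = parser.ParseProgram(isolate, info);) and invokes the lexer when a token is missing.
About Token and AST
ParseVariableStatement() analyzes a statement. A statement can be a variable definition, a function definition, and so on. JS source consists of many statements, so ParseStatementListItem() is called repeatedly until the syntax tree is complete.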
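Key points:
(1) The overlooked key is the finite state machine: parsing is implemented as a finite automaton, written in C++ with switch case. Once the macro templates and the switch case statements are understood, reading V8's compiler takes half the effort;
(2) Syntax trees are built per function: each function gets its own abstract syntax tree;
(3) Because of lazy compilation, a function is compiled only when it is executed;
(4) The abstract syntax tree is stored in the FunctionLiteral class;
(5) The parser drives the lexer; the token definitions used by lexical analysis consist mainly of the following macro templates.
So when exactly are scopes created? Are they built while the AST is being built?
Yes: scopes are built alongside the AST, and a Scope is actually finished one step ahead of its AST, which makes sense on reflection.
- class ScriptContextTable : public FixedArray {…}: when there are multiple scripts, top-level lexical declarations from each script are placed in this table so that they can be accessed across scripts, but a lexical variable cannot be declared twice. Why are var declarations not stored there? Because the top-level var variables of a script are already placed on the global scope, that is, on window.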
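The cross-script behaviour described in that last note can be modelled with a table of per-script binding maps. This is a sketch under assumptions, not V8's FixedArray-based table: `ToyScriptContextTable`, `AddScript`, and `Lookup` are hypothetical, and values are plain ints.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical model: one map per script holding its top-level let/const.
class ToyScriptContextTable {
 public:
  using ScriptContext = std::unordered_map<std::string, int>;

  // Adds a script's top-level lexical bindings. Returns false, adding
  // nothing, if any name was already declared lexically by an earlier script.
  bool AddScript(const ScriptContext& declarations) {
    for (const auto& entry : declarations) {
      if (Lookup(entry.first) != nullptr) return false;
    }
    contexts_.push_back(declarations);
    return true;
  }

  // Searches every script's context, so later scripts see earlier bindings.
  // Returns nullptr when no script declared the name.
  const int* Lookup(const std::string& name) const {
    for (const ScriptContext& context : contexts_) {
      auto it = context.find(name);
      if (it != context.end()) return &it->second;
    }
    return nullptr;
  }

 private:
  std::vector<ScriptContext> contexts_;
};
```

A second script can read a `let` declared by the first through `Lookup`, while an attempt to declare the same lexical name again is rejected by `AddScript`.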