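Although I have written quite a few Node applications of various sizes, plus plenty of small script tools, I had never tried developing a Node C++ Addon, and I always felt I was still just an API engineer programming against NPM. Recently I became interested in how to implement native "blocking" async in Node, and that led me to want to get hands-on with Addon development. The copy of 《Node.js:来一打 C++ 扩展》 that had been gathering dust since I bought it finally came in handy.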

Why C++ Addon?

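Before getting to the main topic, consider one question first: why write a C++ Addon at all? What problems can it solve that JS cannot, and what benefits does it bring?
The key points: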

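  1. Performance: for compute-intensive scenarios, a statically typed, AOT-compiled language like C++ generally runs considerably faster than JS, a dynamic language executed through a JIT. There is a precondition, though: for the same logic, poorly written C++ is not necessarily faster than JS and may even be slower. We will see such an example later;
  2. Productivity: C++ has a much longer history, and a C++ Addon lets you quickly wrap existing C++ algorithm implementations for JS to call instead of re-implementing the same logic in JS. This makes full use of what the open-source community has already built, much like WASM riding on the back of C++/Go/Rust and other languages;
  3. Capability: Node.js already gives the JS runtime very thorough environment support, but an extra layer of encapsulation inevitably means less control over the lower levels. For example, Node.js implements the event loop for JS on top of libuv, but this machinery is transparent to JS, and from the JS level you cannot control how the event loop runs. With the lower-level node/v8 APIs available from C++, you can do things JS cannot, such as making calls that are synchronous in syntax but asynchronous underneath, as fibjs does. This is the question raised at the start, and we will dig into how to do it later.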

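So we should look at C++ Addons objectively, just as with WASM: they have scenarios they suit, but nothing is a silver bullet that works everywhere. Choosing the right technical approach and tools for each scenario is part of a competent engineer's professionalism.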

What is a C++ Addon, really?

Let's first get a feel for C++ Addon packages. A few typical ones:

  • node-sass: Node.js bindings for LibSass, the C/C++ implementation of the Sass compiler;
  • node-canvas: a Node.js extension built on the 2D graphics library Cairo;
  • fsevents: a dependency of chokidar that provides native file system event watching on macOS.

Taking node-sass as an example, let's install it:

  1. dickeylthdev/test/abc» tnpm i node-sass [18:18:48]
  2. Installed 1 packages
  3. Linked 177 latest versions
  4. [1/1] scripts.install node-sass@* run "node scripts/install.js", root: "/Users/dickeylth/dev/test/abc/node_modules/_node-sass@5.0.0@node-sass"
  5. Downloading binary from https://cdn.npm.taobao.org/dist/node-sass/v5.0.0/darwin-x64-72_binding.node
  6. Download complete
  7. Binary saved to /Users/dickeylth/dev/test/abc/node_modules/_node-sass@5.0.0@node-sass/vendor/darwin-x64-72/binding.node
  8. Caching binary to /Users/dickeylth/.npminstall_tarball/node-sass/5.0.0/darwin-x64-72_binding.node
  9. [1/1] scripts.install node-sass@* finished in 5s
  10. [1/1] scripts.postinstall node-sass@* run "node scripts/build.js", root: "/Users/dickeylth/dev/test/abc/node_modules/_node-sass@5.0.0@node-sass"
  11. Binary found at /Users/dickeylth/dev/test/abc/node_modules/_node-sass@5.0.0@node-sass/vendor/darwin-x64-72/binding.node
  12. Testing binary
  13. Binary is fine
  14. [1/1] scripts.postinstall node-sass@* finished in 500ms
  15. Run 1 scripts
  16. deprecate node-sass@5.0.0 request@^2.88.0 request has been deprecated, see https://github.com/request/request/issues/3142
  17. deprecate node-sass@5.0.0 request@2.88.2 har-validator@~5.1.3 this
  18. All packages installed (192 packages installed from npm registry, used 9s(network 3s), speed 108.49KB/s, json 177(327.43KB), tarball 0B)

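Line 5 of the log, Downloading binary from https://cdn.npm.taobao.org/dist/node-sass/v5.0.0/darwin-x64-72_binding.node, shows that a binary precompiled for the mac platform is downloaded from the npm mirror. Let's look at the directory structure after installation: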

  dickeylthdev/test/abc» tree ./node_modules/node-sass -L 3 [18:21:56]
  ./node_modules/node-sass
  ├── bin
  │   ├── emcc
  │   └── node-sass
  ├── binding.gyp
  ├── lib
  │   └── *.js
  ├── node_modules
  │   ├── *
  │   └── true-case-path -> ../../_true-case-path@1.0.3@true-case-path
  ├── package.json
  ├── scripts
  │   ├── build.js
  │   ├── install.js
  │   ├── prepublish.js
  │   └── util
  ├── src
  │   └── *
  ├── test
  │   └── *.js
  └── vendor
      └── darwin-x64-72
          └── binding.node

  61 directories, 73 files

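You can see it was downloaded into the vendor directory. This binding.node is the binary module compiled from the C/C++ sources under src. If you try installing the other packages, you will find that the mainstream practice now is to precompile the Addon per environment during CI and have the user's install fetch the prebuilt binary from a remote server, rather than compiling locally after npm install. Why? Try it and you will see that, at today's network speeds, fetching a prebuilt binary is much faster than compiling locally: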

  dickeylthdev/test/abc» SKIP_SASS_BINARY_DOWNLOAD_FOR_CI=true tnpm i node-sass [14:56:51]
  Installed 1 packages
  Linked 177 latest versions
  [1/1] scripts.install node-sass@* run "node scripts/install.js", root: "/Users/dickeylth/dev/test/abc/node_modules/_node-sass@5.0.0@node-sass"
  Skipping downloading binaries on CI builds
  [1/1] scripts.install node-sass@* finished in 327ms
  [1/1] scripts.postinstall node-sass@* run "node scripts/build.js", root: "/Users/dickeylth/dev/test/abc/node_modules/_node-sass@5.0.0@node-sass"
  Building: /usr/local/bin/node /Users/dickeylth/dev/test/abc/node_modules/_node-gyp@7.1.2@node-gyp/bin/node-gyp.js rebuild --verbose --libsass_ext= --libsass_cflags= --libsass_ldflags= --libsass_library=
  gyp info it worked if it ends with ok
  gyp verb cli [
  gyp verb cli '/usr/local/bin/node',
  gyp verb cli '/Users/dickeylth/dev/test/abc/node_modules/_node-gyp@7.1.2@node-gyp/bin/node-gyp.js',
  gyp verb cli 'rebuild',
  gyp verb cli '--verbose',
  gyp verb cli '--libsass_ext=',
  gyp verb cli '--libsass_cflags=',
  gyp verb cli '--libsass_ldflags=',
  gyp verb cli '--libsass_library='
  gyp verb cli ]
  gyp info using node-gyp@7.1.2
  gyp info using node@12.20.1 | darwin | x64
  gyp verb command rebuild []
  gyp verb command clean []
  gyp verb clean removing "build" directory
  gyp verb command configure []
  gyp verb download using dist-url https://tnpm-hz.oss-cn-hangzhou.aliyuncs.com/dist/node
  gyp verb find Python Python is not set from command line or npm configuration
  gyp verb find Python Python is not set from environment variable PYTHON
  gyp verb find Python checking if "python3" can be used
  gyp verb find Python - executing "python3" to get executable path
  gyp verb find Python - executable path is "/usr/local/opt/python@3.9/bin/python3.9"
  gyp verb find Python - executing "/usr/local/opt/python@3.9/bin/python3.9" to get version
  gyp verb find Python - version is "3.9.1"
  ...
  c++ -o Release/obj.target/binding/src/sass_types/string.o ../src/sass_types/string.cpp '-DNODE_GYP_MODULE_NAME=binding' '-DUSING_UV_SHARED=1' '-DUSING_V8_SHARED=1' '-DV8_DEPRECATION_WARNINGS=1' '-DV8_DEPRECATION_WARNINGS' '-DV8_IMMINENT_DEPRECATION_WARNINGS' '-D_DARWIN_USE_64_BIT_INODE=1' '-D_LARGEFILE_SOURCE' '-D_FILE_OFFSET_BITS=64' '-DOPENSSL_NO_PINSHARED' '-DOPENSSL_THREADS' '-DBUILDING_NODE_EXTENSION' -I/Users/dickeylth/Library/Caches/node-gyp/12.20.1/include/node -I/Users/dickeylth/Library/Caches/node-gyp/12.20.1/src -I/Users/dickeylth/Library/Caches/node-gyp/12.20.1/deps/openssl/config -I/Users/dickeylth/Library/Caches/node-gyp/12.20.1/deps/openssl/openssl/include -I/Users/dickeylth/Library/Caches/node-gyp/12.20.1/deps/uv/include -I/Users/dickeylth/Library/Caches/node-gyp/12.20.1/deps/zlib -I/Users/dickeylth/Library/Caches/node-gyp/12.20.1/deps/v8/include -I../../_nan@2.14.2@nan -I../src/libsass/include -O3 -gdwarf-2 -mmacosx-version-min=10.7 -arch x86_64 -Wall -Wendif-labels -W -Wno-unused-parameter -std=c++11 -stdlib=libc++ -fno-rtti -fno-exceptions -fno-strict-aliasing -MMD -MF ./Release/.deps/Release/obj.target/binding/src/sass_types/string.o.d.raw -c
  c++ -bundle -undefined dynamic_lookup -Wl,-no_pie -Wl,-search_paths_first -mmacosx-version-min=10.7 -arch x86_64 -L./Release -stdlib=libc++ -o Release/binding.node Release/obj.target/binding/src/binding.o Release/obj.target/binding/src/create_string.o Release/obj.target/binding/src/custom_function_bridge.o Release/obj.target/binding/src/custom_importer_bridge.o Release/obj.target/binding/src/sass_context_wrapper.o Release/obj.target/binding/src/sass_types/boolean.o Release/obj.target/binding/src/sass_types/color.o Release/obj.target/binding/src/sass_types/error.o Release/obj.target/binding/src/sass_types/factory.o Release/obj.target/binding/src/sass_types/list.o Release/obj.target/binding/src/sass_types/map.o Release/obj.target/binding/src/sass_types/null.o Release/obj.target/binding/src/sass_types/number.o Release/obj.target/binding/src/sass_types/string.o Release/sass.a
  gyp info ok
  Installed to /Users/dickeylth/dev/test/abc/node_modules/_node-sass@5.0.0@node-sass/vendor/darwin-x64-72/binding.node
  [1/1] scripts.postinstall node-sass@* finished in 3m

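The middle of the compile log is too long to show in full. The whole build took nearly 3 minutes, which is clearly too slow compared with a network fetch that takes seconds. Still, the process gives us a first feel for how a C++ Addon package gets compiled. When the build finishes, the output goes to the same place as before: Installed to /Users/dickeylth/dev/test/abc/node_modules/_node-sass@5.0.0@node-sass/vendor/darwin-x64-72/binding.node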
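The file name in the download URL and the vendor directory name both follow a platform-arch-number pattern (darwin-x64-72). As a rough illustration, the sketch below rebuilds the two names from their parts. The helper names bindingDirName and bindingFileName are hypothetical, and reading 72 as the module (ABI) version of the node@12.20.1 shown in the build log is an assumption about how node-sass names its binaries, not code taken from it.

```cpp
#include <string>

// Hypothetical helper: joins platform, CPU architecture and module (ABI)
// version into the directory name seen in the log, e.g. "darwin-x64-72".
std::string bindingDirName(const std::string& platform,
                           const std::string& arch,
                           int abiVersion) {
    return platform + "-" + arch + "-" + std::to_string(abiVersion);
}

// Hypothetical helper: the file name that appears in the download URL,
// e.g. "darwin-x64-72_binding.node".
std::string bindingFileName(const std::string& platform,
                            const std::string& arch,
                            int abiVersion) {
    return bindingDirName(platform, arch, abiVersion) + "_binding.node";
}
```

Calling bindingFileName("darwin", "x64", 72) gives the name from the log. Keying the name on platform, architecture and ABI version is what lets one prebuilt file per environment be published during CI.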

This binding.node file is the compiled binary. To inspect its contents we can install a VSCode extension, hexdump. Here is roughly what it looks like:

  Offset:   00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
  00000000: CF FA ED FE 07 00 00 01 03 00 00 00 08 00 00 00 Ozm~............
  00000010: 0C 00 00 00 20 07 00 00 85 80 01 00 00 00 00 00 ................
  00000020: 19 00 00 00 C8 02 00 00 5F 5F 54 45 58 54 00 00 ....H...__TEXT..
  00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
  00000040: 00 C0 1C 00 00 00 00 00 00 00 00 00 00 00 00 00 .@..............
  00000050: 00 C0 1C 00 00 00 00 00 05 00 00 00 05 00 00 00 .@..............
  00000060: 08 00 00 00 00 00 00 00 5F 5F 74 65 78 74 00 00 ........__text..
  00000070: 00 00 00 00 00 00 00 00 5F 5F 54 45 58 54 00 00 ........__TEXT..
  00000080: 00 00 00 00 00 00 00 00 D0 2C 00 00 00 00 00 00 ........P,......
  00000090: 57 50 17 00 00 00 00 00 D0 2C 00 00 04 00 00 00 WP......P,......
  000000a0: 00 00 00 00 00 00 00 00 00 04 00 80 00 00 00 00 ................
  000000b0: 00 00 00 00 00 00 00 00 5F 5F 73 74 75 62 73 00 ........__stubs.
  000000c0: 00 00 00 00 00 00 00 00 5F 5F 54 45 58 54 00 00 ........__TEXT..
  000000d0: 00 00 00 00 00 00 00 00 28 7D 17 00 00 00 00 00 ........(}......
  000000e0: 3A 05 00 00 00 00 00 00 28 7D 17 00 01 00 00 00 :.......(}......
  000000f0: 00 00 00 00 00 00 00 00 08 04 00 80 00 00 00 00 ................
  00000100: 06 00 00 00 00 00 00 00 5F 5F 73 74 75 62 5F 68 ........__stub_h
  00000110: 65 6C 70 65 72 00 00 00 5F 5F 54 45 58 54 00 00 elper...__TEXT..
  00000120: 00 00 00 00 00 00 00 00 64 82 17 00 00 00 00 00 ........d.......
  00000130: FE 07 00 00 00 00 00 00 64 82 17 00 02 00 00 00 ~.......d.......
  00000140: 00 00 00 00 00 00 00 00 00 04 00 80 00 00 00 00 ................
  00000150: 00 00 00 00 00 00 00 00 5F 5F 63 6F 6E 73 74 00 ........__const.
  00000160: 00 00 00 00 00 00 00 00 5F 5F 54 45 58 54 00 00 ........__TEXT..
  00000170: 00 00 00 00 00 00 00 00 70 8A 17 00 00 00 00 00 ........p.......
  00000180: 41 36 00 00 00 00 00 00 70 8A 17 00 04 00 00 00 A6......p.......
  00000190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
  000001a0: 00 00 00 00 00 00 00 00 5F 5F 63 73 74 72 69 6E ........__cstrin
  000001b0: 67 00 00 00 00 00 00 00 5F 5F 54 45 58 54 00 00 g.......__TEXT..
  000001c0: 00 00 00 00 00 00 00 00 B1 C0 17 00 00 00 00 00 ........1@......
  000001d0: 44 45 00 00 00 00 00 00 B1 C0 17 00 00 00 00 00 DE......1@......
  000001e0: 00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 ................
  000001f0: 00 00 00 00 00 00 00 00 5F 5F 67 63 63 5F 65 78 ........__gcc_ex
  00000200: 63 65 70 74 5F 74 61 62 5F 5F 54 45 58 54 00 00 cept_tab__TEXT..
  00000210: 00 00 00 00 00 00 00 00 F8 05 18 00 00 00 00 00 ........x.......
  00000220: 34 31 01 00 00 00 00 00 F8 05 18 00 02 00 00 00 41......x.......
  00000230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
  00000240: 00 00 00 00 00 00 00 00 5F 5F 75 6E 77 69 6E 64 ........__unwind

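Note the identifier in the first 4 bytes, "CF FA ED FE", which reads as 0xcffaedfe when taken in file order. Let's look at the definition in the macOS header file: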

  1. /* Constant for the magic field of the mach_header_64 (64-bit architectures) */
  2. #define MH_MAGIC_64 0xfeedfacf /* the 64-bit mach magic number */
  3. #define MH_CIGAM_64 NXSwapInt(MH_MAGIC_64)

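NXSwapInt on line 3 reverses the byte order of the 64-bit Mach-O magic number (big-endian <=> little-endian), turning 0xfeedfacf into 0xcffaedfe. On a little-endian x86_64 machine, MH_MAGIC_64 is written to disk least significant byte first, so the file starts with CF FA ED FE. A file on the mac platform whose first 4 bytes look like this is therefore a 64-bit Mach-O file. We can also check with otool on macOS: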
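The relationship between the magic number and the first bytes of the dump can be checked with a few lines of C++. This is a minimal sketch assuming a little-endian writer such as x86_64. swap32 is a hypothetical stand-in for NXSwapInt, and kMagic64 and littleEndianBytes are illustrative names that do not come from the macOS headers.

```cpp
#include <array>
#include <cstdint>

// Value of MH_MAGIC_64, copied from the header excerpt above.
constexpr std::uint32_t kMagic64 = 0xfeedfacf;

// Stand-in for NXSwapInt: reverses the four bytes of a 32-bit value.
constexpr std::uint32_t swap32(std::uint32_t v) {
    return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

// Bytes of `v` in the order a little-endian machine writes them to disk
// (least significant byte first).
std::array<std::uint8_t, 4> littleEndianBytes(std::uint32_t v) {
    return {{static_cast<std::uint8_t>(v & 0xff),
             static_cast<std::uint8_t>((v >> 8) & 0xff),
             static_cast<std::uint8_t>((v >> 16) & 0xff),
             static_cast<std::uint8_t>((v >> 24) & 0xff)}};
}
```

swap32(kMagic64) evaluates to 0xcffaedfe, and littleEndianBytes(kMagic64) yields CF FA ED FE, the same four bytes that open the hexdump.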

  dickeylthdev/node-playground/addon-test» otool -hv ./build/Release/addon.node [10:11:19]
  ./build/Release/addon.node:
  Mach header
        magic cputype cpusubtype  caps  filetype ncmds sizeofcmds  flags
  MH_MAGIC_64  X86_64        ALL  0x00    BUNDLE    13       1360  NOUNDEFS DYLDLINK TWOLEVEL WEAK_DEFINES BINDS_TO_WEAK

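You can see that on mac the file type of a .node file is BUNDLE, that is, a bundle-type dynamic library that is loaded at runtime and can act as a plugin to extend a program's behavior dynamically. What may be more familiar is that on Windows a .node file is essentially a dll. At this point we have a basic understanding of what a .node file really is. If you are interested in how .node files are loaded, see the links in the references.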
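Since a .node file is an ordinary native library in the platform's own container format, its leading bytes are enough to tell which platform it was built for. The sketch below is a hypothetical helper (sniffBinaryFormat is not part of Node.js or any library) that checks three signatures: the Mach-O bytes from the dump above, the "MZ" prefix of Windows PE files such as dlls, and the ELF signature used by Linux shared objects. The ELF case is an addition that the text does not cover.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Guesses the container format of a native module from its first bytes.
// Only three signatures are checked; anything else is reported as "unknown".
std::string sniffBinaryFormat(const std::uint8_t* data, std::size_t size) {
    // 64-bit Mach-O written by a little-endian machine: CF FA ED FE
    if (size >= 4 && data[0] == 0xCF && data[1] == 0xFA &&
        data[2] == 0xED && data[3] == 0xFE) {
        return "Mach-O 64-bit (little-endian)";
    }
    // ELF: 7F 'E' 'L' 'F'
    if (size >= 4 && data[0] == 0x7F && data[1] == 'E' &&
        data[2] == 'L' && data[3] == 'F') {
        return "ELF";
    }
    // PE files (including DLLs) begin with the DOS stub signature "MZ"
    if (size >= 2 && data[0] == 'M' && data[1] == 'Z') {
        return "PE (Windows DLL/EXE)";
    }
    return "unknown";
}
```

Feeding it the first row of the hexdump returns the Mach-O label. A real loader inspects far more than the magic number, so treat this only as a way to make the "it is just a dynamic library" point concrete.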

Ways to implement a C++ Addon

It is now 2021, and there are three options for implementing a C++ Addon:

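  1. Native C++: implement the logic directly with the objects and APIs exposed by the node and v8 sources, the lowest level available;
  2. NAN (Native Abstractions for Node.js): the low-level Node.js and V8 APIs keep changing, so an addon written directly against them can fail to compile on newer Node.js versions once an API it depends on is deprecated. NAN was introduced in 2013 for this reason. It mainly provides wrapper macros that span Node.js and V8 versions, so developers need not care about API differences between versions;
  3. N-API: NAN solved the problem of unstable low-level APIs, but the same code still has to be recompiled for each Node.js version. N-API, introduced in 2017, goes a step further and abstracts all of Node.js's underlying data structures into opaque interfaces. These interfaces are stable at the ABI (Application Binary Interface) level, so a compiled C++ Addon can be used across Node.js versions without recompiling as long as the ABI version matches, and there is no longer any need to use V8's data types directly.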

Next, let's get to know C++ Addons step by step, starting with native C++ development.

References