Other Tools - magic-string - Front-End Engineering

### Basic usage

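magic-string is a string utility that provides methods for manipulating strings and for generating source maps.
Its author is Rich Harris; a glance at his GitHub profile shows he is also the person behind Svelte and Rollup.

Install it with yarn or npm.
Then pass the string you want to manipulate to the MagicString constructor: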


```javascript
var s = new MagicString('problems = 99');
```

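- Take a substring: `snip` returns a new MagicString object and leaves the object referenced by `s` unchanged.
```javascript
const MagicString = require('magic-string')
const log = console.log

const str = 'yxnne is a real man'
const s = new MagicString(str)

log(s.snip(0, 6).toString()) // yxnne
log(s.toString()) // yxnne is a real man
```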


- Replace characters by index: the `overwrite` method
```javascript
const str = 'yxnne is a real man'
const s = new MagicString(str)
log(s.overwrite(0, 6, 'wx ').toString()) // wx is a real man
log(s.toString()) // wx is a real man
```

This result is easy to guess: `s` itself has changed as well. Now call `overwrite` once more:

```javascript
log(s.overwrite(0, 6, '666 ').toString()) // 666 is a real man
log(s.toString()) // 666 is a real man
```

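So where is the magic? Notice that the indices passed to `overwrite` are always computed against the initial string. To go from 'wx is a real man' to '666 is a real man' we did not use the range [0, 3) of the current string; we still used [0, 6).

(For the other string methods, see the official documentation: https://github.com/Rich-Harris/magic-string#methods)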

- Merging in bulk: `Bundle`

In the code below we use `addSource` to add two strings, `var answer = 42;` and `console.log( answer )`:

```javascript
const MagicString = require('magic-string')
const log = console.log
// initialise the bundle
const bundle = new MagicString.Bundle();
bundle.addSource({
  filename: 'foo.js', // the filename is used for the source map
  content: new MagicString('var answer = 42;') // the content
});
bundle.addSource({
  filename: 'bar.js',
  content: new MagicString('console.log( answer )')
});
```

We want to merge these two pieces of code, wrap the resulting string in a function, and indent the function body:

```javascript
bundle.indent()  // indent every line
  .prepend('(function () {\n') // add to the front
  .append('\n}());'); // add to the end
log(bundle.toString())
```

The final printed result:

```
(function () {
  var answer = 42;
  console.log( answer )
}());
```
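To make the three steps (merge, indent, wrap) concrete, here is a plain-string sketch that produces the same shape of output. It is only an illustration and not how magic-string's `Bundle` is implemented; `bundleSketch` and its `indentStr` parameter are hypothetical names, and joining the sources with a newline is an assumption made for this example.

```javascript
// Plain-string illustration of "merge, indent, wrap".
// bundleSketch and indentStr are made-up names, not part of magic-string.
function bundleSketch(contents, indentStr) {
  const body = contents
    .join('\n') // assumption: sources are separated by a newline
    .split('\n')
    .map((line) => (line ? indentStr + line : line)) // indent non-empty lines
    .join('\n');
  // wrap the indented body, like prepend(...) and append(...) above
  return '(function () {\n' + body + '\n}());';
}

const out = bundleSketch(['var answer = 42;', 'console.log( answer )'], '  ');
console.log(out);
```

Unlike this sketch, the real `Bundle` also keeps track of where every piece came from, which is what makes source map generation possible.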

- Generating a source map
```javascript
var map = s.generateMap({
  source: 'source.js',
  file: 'converted.js.map',
  includeContent: true
}); // generates a v3 sourcemap

// writeFileSync is used because fs.writeFile requires a callback
require('fs').writeFileSync('converted.js', s.toString());
require('fs').writeFileSync('converted.js.map', map.toString());
```

<a name="p1SUY"></a>
### Source code
What I really wanted to see is how the code implements the `overwrite` behaviour shown above.
<a name="C6klD"></a>
#### constructor
Let's start with the constructor, in src/MagicString.js:
```javascript
// the key parameter is the string
constructor(string, options = {}) {
    // create a chunk object
    const chunk = new Chunk(0, string.length, string);
    // define the properties
    Object.defineProperties(this, {
        original:              { writable: true, value: string }, // the original string
        outro:                 { writable: true, value: '' },
        intro:                 { writable: true, value: '' },
        firstChunk:            { writable: true, value: chunk }, // chunk
        lastChunk:             { writable: true, value: chunk }, // chunk
        lastSearchedChunk:     { writable: true, value: chunk }, // chunk
        byStart:               { writable: true, value: {} },
        byEnd:                 { writable: true, value: {} },
        filename:              { writable: true, value: options.filename },
        indentExclusionRanges: { writable: true, value: options.indentExclusionRanges },
        sourcemapLocations:    { writable: true, value: new BitSet() },
        storedNames:           { writable: true, value: {} },
        indentStr:             { writable: true, value: guessIndent(string) }
    });
    if (DEBUG) {
        Object.defineProperty(this, 'stats', { value: new Stats() });
    }
    this.byStart[0] = chunk;
    this.byEnd[string.length] = chunk;
}
```

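The constructor above takes the string as its key parameter and does the following:

It creates a chunk object (we will look at what a chunk is shortly).
It defines a set of properties, among which:
1. `original` is the initial string.
2. Three chunk-related properties, `firstChunk`, `lastChunk` and `lastSearchedChunk`, all start out as the chunk just created.
3. `byStart[0]` and `byEnd[string.length]` are also set to that same chunk.

So, what is a chunk?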

#### Chunk

Here is the constructor in src/Chunk.js:

```javascript
constructor(start, end, content) {
  this.start = start;
  this.end = end;
  this.original = content;
  this.intro = '';
  this.outro = '';
  this.content = content;
  this.storeName = false;
  this.edited = false;
  // we make these non-enumerable, for sanity while debugging
  Object.defineProperties(this, {
    previous: { writable: true, value: null },
    next:     { writable: true, value: null }
  });
}
```

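Together with the `const chunk = new Chunk(0, string.length, string);` call in the MagicString constructor, this shows that a chunk is also a wrapper around a string: `chunk.start` and `chunk.end` record its position, and `chunk.content` holds its text.
There is one more very important detail in the constructor:

```javascript
Object.defineProperties(this, {
  previous: { writable: true, value: null },
  next:     { writable: true, value: null }
});
```

`previous` and `next`: this is a doubly linked list.
At this point we can roughly guess how MagicString works: the chunks form a linked list, each chunk corresponds to one part of the original string, and each chunk acts like a "patch".
To confirm this, let's look at how MagicString produces its final output, which is `toString`.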

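#### The toString method

Every time we read the result, we call this overridden method:

```javascript
toString() {
  let str = this.intro;
  let chunk = this.firstChunk;
  while (chunk) {
    str += chunk.toString();
    chunk = chunk.next;
  }
  return str + this.outro;
}
```

As expected, MagicString's `toString` walks the chunk linked list once, calls `chunk.toString` on each chunk, and concatenates the results.

We now have a basic understanding of magic-string's data structure.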
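The "patch" idea can be sketched in a few lines. This is a simplified illustration rather than the library's code: `MiniChunk`, `link` and `render` are hypothetical names, and the real `Chunk` also carries `intro`, `outro` and other fields.

```javascript
// Hypothetical, simplified stand-in for magic-string's Chunk (illustration only).
class MiniChunk {
  constructor(start, end, content) {
    this.start = start;     // start index in the ORIGINAL string
    this.end = end;         // end index (exclusive) in the ORIGINAL string
    this.original = content;
    this.content = content; // current, possibly edited, text
    this.previous = null;
    this.next = null;
  }
}

// Link two chunks so that b follows a in the doubly linked list.
function link(a, b) {
  a.next = b;
  b.previous = a;
}

// Walk the list from the first chunk and concatenate the current contents.
function render(first) {
  let out = '';
  for (let chunk = first; chunk; chunk = chunk.next) out += chunk.content;
  return out;
}

const source = 'yxnne is a real man';
const head = new MiniChunk(0, 6, source.slice(0, 6));
const tail = new MiniChunk(6, source.length, source.slice(6));
link(head, tail);

// "Patch" the first chunk; its position in the original string stays 0-6.
head.content = 'wx ';

console.log(render(head)); // wx is a real man
```

Because `head.start` and `head.end` never change, a later edit can still address this chunk with the original indices 0 and 6.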

#### The clone method

As the name suggests, `clone` returns a cloned MagicString object, so naturally every data-structure property involved has to be cloned as well:


```javascript
clone() {
  // create a new object
  const cloned = new MagicString(this.original, { filename: this.filename });
  // take the first chunk of the old object
  let originalChunk = this.firstChunk;
  // initialise the new object's chunk references with a clone of that chunk
  let clonedChunk = (cloned.firstChunk = cloned.lastSearchedChunk = originalChunk.clone());
  // walk from the first chunk, cloning each one and wiring up the data structure
  while (originalChunk) {
    cloned.byStart[clonedChunk.start] = clonedChunk;
    cloned.byEnd[clonedChunk.end] = clonedChunk;
    const nextOriginalChunk = originalChunk.next;
    const nextClonedChunk = nextOriginalChunk && nextOriginalChunk.clone();
    if (nextClonedChunk) {
      clonedChunk.next = nextClonedChunk;
      nextClonedChunk.previous = clonedChunk;
      clonedChunk = nextClonedChunk;
    }
    originalChunk = nextOriginalChunk;
  }
  cloned.lastChunk = clonedChunk;
  // ignore for now
  if (this.indentExclusionRanges) {
    cloned.indentExclusionRanges = this.indentExclusionRanges.slice();
  }
  // ignore for now
  cloned.sourcemapLocations = new BitSet(this.sourcemapLocations);
  cloned.intro = this.intro;
  cloned.outro = this.outro;
  return cloned;
}
```

The core of the code above is: create a new object, take the first chunk of the old object, initialise the new object's chunk references, and then build the new object's chunk list by walking from the first chunk, cloning each chunk, and linking the clones together.
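The traversal pattern is easier to see with the unrelated details removed. The sketch below copies a chain of plain objects and rebuilds the `byStart` / `byEnd` lookups along the way; `cloneChain` is a hypothetical helper written for this example, not a function in magic-string.

```javascript
// Hypothetical helper: deep-copy a doubly linked chain of plain chunk objects
// and rebuild the start/end lookup tables for the copy.
function cloneChain(firstChunk) {
  const byStart = {};
  const byEnd = {};
  let clonedFirst = null;
  let clonedLast = null;
  for (let chunk = firstChunk; chunk; chunk = chunk.next) {
    const copy = {
      start: chunk.start,
      end: chunk.end,
      content: chunk.content,
      previous: clonedLast, // link back to the previously cloned chunk
      next: null,
    };
    if (clonedLast) clonedLast.next = copy; // link forward
    else clonedFirst = copy;                // the very first clone
    clonedLast = copy;
    byStart[copy.start] = copy;
    byEnd[copy.end] = copy;
  }
  return { firstChunk: clonedFirst, lastChunk: clonedLast, byStart, byEnd };
}

const a = { start: 0, end: 3, content: 'abc', previous: null, next: null };
const b = { start: 3, end: 7, content: 'defg', previous: a, next: null };
a.next = b;

const copyOfChain = cloneChain(a);
copyOfChain.firstChunk.content = 'XYZ'; // edit the copy only

console.log(a.content);                      // abc
console.log(copyOfChain.firstChunk.content); // XYZ
```

Editing the copy leaves the original chain untouched, which is the property `snip` relies on.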

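#### The snip method

```javascript
snip(start, end) {
  const clone = this.clone();
  clone.remove(0, start);
  clone.remove(end, clone.original.length);
  return clone;
}
```

The logic of `snip` looks simple: clone the object, then remove the ranges from 0 to start and from end to length, which keeps the range [start, end).
The logic of `remove` is actually fairly involved, though. From a quick look it also mostly manipulates chunks, so I won't dig into it here.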
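The half-open range can be checked with a small sketch that removes the two outer ranges character by character. `snipSketch` is a hypothetical helper that works on a plain string; it only mirrors the effect of the two `remove` calls, not their implementation.

```javascript
// Hypothetical helper: blank out [0, start) and [end, length) of the original
// string, mirroring the two remove() calls in snip.
function snipSketch(original, start, end) {
  const slots = original.split(''); // one slot per original index
  for (let i = 0; i < start; i++) slots[i] = '';             // remove [0, start)
  for (let i = end; i < original.length; i++) slots[i] = ''; // remove [end, length)
  return slots.join('');
}

const snipped = snipSketch('yxnne is a real man', 0, 6);
console.log(JSON.stringify(snipped)); // "yxnne "
```

The character at index `start` is kept and the character at index `end` is dropped, the same convention as `String.prototype.slice`.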

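#### The overwrite method

`overwrite` is the method I most wanted to understand. My original question was this. Given a magic string like

```javascript
const str = 'yxnne is a real man'
const s = new MagicString(str)
```

I can understand the result of the first `overwrite`:

```javascript
log(s.overwrite(0, 6, 'wx ').toString()) // wx is a real man
log(s.toString()) // wx is a real man
```

But calling `overwrite` again, still with the indices 0 and 6, also gives the expected result:

```javascript
log(s.overwrite(0, 6, '666 ').toString()) // 666 is a real man
log(s.toString()) // 666 is a real man
```

By now we basically know how this is done, but it is still worth walking through the code as a whole: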

```javascript
overwrite(start, end, content, options) {
  // 1. boundary handling
  if (typeof content !== 'string') throw new TypeError('replacement content must be a string');
  while (start < 0) start += this.original.length;
  while (end < 0) end += this.original.length;
  if (end > this.original.length) throw new Error('end is out of bounds');
  if (start === end)
    throw new Error('Cannot overwrite a zero-length range – use appendLeft or prependRight instead');
  if (DEBUG) this.stats.time('overwrite');
  // 2. splitting
  this._split(start);
  this._split(end);
  if (options === true) { } // options handling, omitted here
  const storeName = options !== undefined ? options.storeName : false;
  const contentOnly = options !== undefined ? options.contentOnly : false;
  if (storeName) {
    const original = this.original.slice(start, end);
    this.storedNames[original] = true;
  }
  // 3. building the chunk chain (the key part)
  const first = this.byStart[start];
  const last = this.byEnd[end];
  if (first) {
    if (end > first.end && first.next !== this.byStart[first.end]) {
      throw new Error('Cannot overwrite across a split point');
    }
    first.edit(content, storeName, contentOnly);
    if (first !== last) {
      let chunk = first.next;
      while (chunk !== last) {
        chunk.edit('', false);
        chunk = chunk.next;
      }
      chunk.edit('', false);
    }
  } else {
    // must be inserting at the end
    const newChunk = new Chunk(start, end, '').edit(content, storeName);
    // TODO last chunk in the array may not be the last chunk, if it's moved...
    last.next = newChunk;
    newChunk.previous = last;
  }
  if (DEBUG) this.stats.timeEnd('overwrite');
  return this;
}
```

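The key parts of the code above are marked with numbered comments:

1. Boundary handling: this normalises the indices and also adds support for negative indices.

```javascript
// negative indices
while (start < 0) start += this.original.length;
while (end < 0) end += this.original.length;
```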
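As a quick worked example of that normalisation, the loop can be pulled out into a standalone function. `normalizeIndex` is a hypothetical name used only here, and the sketch assumes `length` is greater than zero (otherwise the loop would never end).

```javascript
// Hypothetical standalone version of the negative-index loop shown above.
// Assumes length > 0.
function normalizeIndex(index, length) {
  while (index < 0) index += length; // count back from the end of the string
  return index;
}

const text = 'yxnne is a real man'; // 19 characters
const from = normalizeIndex(-3, text.length);

console.log(from);             // 16
console.log(text.slice(from)); // man
```

So, by this logic, a `start` of -3 refers to the same position as a `start` of 16 in the original string.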

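2. Splitting with `_split`

This call is important. I won't paste the source and will only describe what it does: `_split(index)` finds the chunk that contains `index`, turns it into two chunks at that position, and links them head to tail.
Take a fresh string `abcdefg`, whose indices are 0, 1, 2, 3, 4, 5, 6. Initially the magic string has a single chunk covering the whole string, written here as chunk: 0-7 (the end is exclusive, because the constructor passes `string.length`). Calling `_split(3)` produces two chunks, chunk1: 0-3 and chunk2: 3-7, which must form a doubly linked list: `chunk1.next = chunk2` and `chunk2.previous = chunk1`.
The other important job is recording where these chunks start and end. MagicString has the properties `this.byStart` and `this.byEnd`. Both are plain objects keyed by index into the original string, and the value is the chunk that starts (or ends) at that index; indices where no chunk starts or ends have no entry. This is how chunk positions are recorded.
In the example above, with chunk1: 0-3 and chunk2: 3-7, we get:
`this.byStart[0] = chunk1`, `this.byStart[3] = chunk2`, `this.byEnd[3] = chunk1`, `this.byEnd[7] = chunk2`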
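The `abcdefg` example can be replayed with a simplified split. This is a sketch of the idea and not the library's `_split`: `splitAt` and the `state` object are hypothetical, and the code assumes the chunk being split has not been edited, so its content still matches its original range.

```javascript
// Hypothetical, simplified split: ensure a chunk boundary exists at `index`.
// Assumes the chunk being split is unedited (content.length === end - start).
function splitAt(state, index) {
  if (state.byStart[index] || state.byEnd[index]) return; // already a boundary
  // find the chunk that strictly contains index
  let chunk = state.firstChunk;
  while (chunk && !(chunk.start < index && index < chunk.end)) chunk = chunk.next;
  if (!chunk) return; // index is outside the string
  const cut = index - chunk.start;
  const right = {
    start: index,
    end: chunk.end,
    content: chunk.content.slice(cut),
    previous: chunk,
    next: chunk.next,
  };
  if (chunk.next) chunk.next.previous = right;
  // shrink the left chunk and relink
  chunk.content = chunk.content.slice(0, cut);
  chunk.end = index;
  chunk.next = right;
  // record the new boundaries
  state.byEnd[index] = chunk;
  state.byStart[index] = right;
  state.byEnd[right.end] = right;
}

const whole = { start: 0, end: 7, content: 'abcdefg', previous: null, next: null };
const state = { firstChunk: whole, byStart: { 0: whole }, byEnd: { 7: whole } };

splitAt(state, 3);

console.log(state.byStart[0].content); // abc
console.log(state.byStart[3].content); // defg
```

After the call, `state.byStart[3]` and `state.byEnd[3]` give direct access to the two chunks on either side of index 3, so no list traversal is needed to find them later.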

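3. Replacing chunks

With the previous step in place, the third step is important but easy to follow: the chunks that cover the range to be rewritten get their content replaced.
Take `overwrite(start, end, content)` in the simplest case. Before the call there is only one chunk, so after splitting at `start` and `end` with `_split` (assuming both fall strictly inside the string) there are three chunks linked head to tail. The one to replace is the middle chunk, and `content` becomes its content. In the code above that is the in-place edit:

```javascript
first.edit(content, storeName, contentOnly);
```

But how is this chunk found?
As described above, `this.byStart` records the chunks by start index, so it is simply `this.byStart[start]`. The `new Chunk(start, end, '').edit(content, storeName)` line only runs in the else branch, when no chunk starts at `start`.
More complex cases follow from this simple one in much the same way: split the chunks, find the chunks, replace the chunks. There is also the case where the content to replace spans several chunks; the source handles that too (the remaining chunks are edited to empty strings), and I won't go further into it.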
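Putting the three steps together, here is a toy re-implementation that reproduces the two `overwrite` calls from the beginning of this article. It is a sketch under simplifying assumptions, not the library's implementation: `MiniMagicString` is a hypothetical class, chunks are plain objects, options, `intro`/`outro` and source maps are left out, and splitting an already edited chunk is simply rejected.

```javascript
// Toy re-implementation for illustration only (hypothetical names throughout).
class MiniMagicString {
  constructor(string) {
    const chunk = { start: 0, end: string.length, content: string, edited: false, previous: null, next: null };
    this.original = string;
    this.firstChunk = chunk;
    this.byStart = { 0: chunk };
    this.byEnd = { [string.length]: chunk };
  }

  // Make sure a chunk boundary exists at `index` (an index into the original).
  _split(index) {
    if (this.byStart[index] || this.byEnd[index]) return;
    let chunk = this.firstChunk;
    while (chunk && !(chunk.start < index && index < chunk.end)) chunk = chunk.next;
    if (!chunk) throw new Error('index is out of bounds');
    if (chunk.edited) throw new Error('this sketch cannot split an edited chunk');
    const right = {
      start: index, end: chunk.end,
      content: this.original.slice(index, chunk.end),
      edited: false, previous: chunk, next: chunk.next,
    };
    if (chunk.next) chunk.next.previous = right;
    chunk.next = right;
    chunk.end = index;
    chunk.content = this.original.slice(chunk.start, index);
    this.byEnd[index] = chunk;
    this.byStart[index] = right;
    this.byEnd[right.end] = right;
  }

  overwrite(start, end, content) {
    if (start >= end) throw new Error('this sketch needs start < end');
    this._split(start);
    this._split(end);
    const first = this.byStart[start];
    const last = this.byEnd[end];
    if (!first || !last) throw new Error('range is out of bounds');
    first.content = content; // the first chunk carries the new text
    first.edited = true;
    // any further chunks inside the range are blanked out
    for (let chunk = first; chunk !== last; ) {
      chunk = chunk.next;
      chunk.content = '';
      chunk.edited = true;
    }
    return this;
  }

  toString() {
    let str = '';
    for (let chunk = this.firstChunk; chunk; chunk = chunk.next) str += chunk.content;
    return str;
  }
}

const mini = new MiniMagicString('yxnne is a real man');
console.log(mini.overwrite(0, 6, 'wx ').toString());  // wx is a real man
console.log(mini.overwrite(0, 6, '666 ').toString()); // 666 is a real man
```

The second call finds the same chunk through `byStart[0]` and `byEnd[6]`, because the chunk keeps its original `start` and `end` no matter what its content currently is.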

To sum up briefly, the magic-string library does its work with a doubly linked list of chunks, which feels a lot like applying patches.