HuffmanTree 霍夫曼编码树
Huffman编码:为字母分配代码,代码长度取决于字母的相对使用频率或权重
Huffman树定义:Hf树每个叶子结点对应一个字母,叶子结点的权重就是对应字母的出现频率,它是一颗满树
意义:按照最小外部路径权重建立一棵树.一个叶结点的加权路径长度(WPL)定义为权重乘以深度.具有最小外部路径权重的二叉树就是,对于给定叶结点集合,WPL之和最小的二叉树,权重大的叶结点深度小,权重小的叶结点深度大.
实现
建立n个结点的Hf树.首先创建n个初始的Hf树,每个树只包含单一叶结点,然后取出其中权重最小的两棵树合成一个树,两个叶结点作为新结点的子结点,新结点的权重即为子结点权重和
本例使用基于堆的优先队列创建Hf树
类结构: HuffNode->HuffTree->Heap
Heap类(最小堆)
import edu.princeton.cs.algs4.StdOut;public class Heap<Key extends Comparable<Key>> {private int n;//实际大小private int maxsize;//最大private Key[] heap;public Heap(Key[] heap, int n, int maxsize){this.n = n;this.maxsize = maxsize;this.heap = heap;}public int size(){ return n;}private boolean isleaf(int pos){ return pos >= n/2 && pos < n;}private int lc(int pos){ return 2*pos+1;}private int rc(int pos){ return 2*pos+2;}private int parent(int pos){return (pos-1)/2 ;}//构造堆有序public void buildHeap(){for (int i = n/2-1; i >= 0; i--){sink(i);}}private boolean less(int i, int j){ return heap[i].compareTo(heap[j])<0 ;}private void swap(int i, int j){Key temp = heap[i];heap[i] = heap[j];heap[j] = temp;}private void swim(int pos){while (pos > 1 && less(pos, parent(pos))){swap(pos, parent(pos));pos = parent(pos);}}private void sink(int pos){while (!isleaf(pos)){int j = lc(pos), rc = rc(pos);if (less(rc, j) && rc < n) j = rc;if (less(pos, j)) return;swap(j, pos);pos = j;}}public void insert(Key k){if (size() == maxsize){StdOut.println("the heap is full");return;}int curr = n++;heap[curr] = k;swim(curr);}public Key removefirst(){Key first = heap[0];swap(0, --n);if(n != 0){sink(0);heap[n] = null;}//若heap[0] = null,最后的hufftree也被清空return first;}}
- 和algs4中Heap算法不同之处在于此处数组无空头元素,所以在索引和removefirst()等方法有边界变化;且此处buildHeap()方法针对数组直接构造堆有序
 
HuffNode(abstract)类
public abstract class HuffNode<T> {public abstract int weight();public abstract boolean isLeaf();}
LeafNode叶子结点类
public class LeafNode<T> extends HuffNode<T> {private T word;//字符private int weight;//频度public LeafNode(T word, int weight){this.word = word;this.weight = weight;}@Overridepublic int weight(){ return weight;}@Overridepublic boolean isLeaf(){return true;}}
IntlNode内部结点类
public class IntlNode extends HuffNode {private int weight;private HuffNode lc;private HuffNode rc;public IntlNode(HuffNode lc, HuffNode rc){weight = lc.weight()+rc.weight();this.lc = lc;this.rc = rc;}@Overridepublic boolean isLeaf(){ return false;}@Overridepublic int weight(){ return weight;}public HuffNode lc(){ return lc;}public void setLc(HuffNode lc){this.lc = lc; }public HuffNode rc(){ return rc;}public void setRc(HuffNode rc){this.rc = rc;}}
HuffTree类
public class HuffTree<T> implements Comparable<HuffTree<T>>{private HuffNode<T> root;//内部结点Hf树public HuffTree(HuffTree<T> lc, HuffTree<T> rc){root = new IntlNode(lc.root(), rc.root());}//叶结点Hf树public HuffTree(T word, int wgt){root = new LeafNode<T>(word, wgt);}public int weight(){ return root.weight();}public HuffNode root(){ return root;}//实现内部比较方法@Overridepublic int compareTo(HuffTree<T> another){if (root.weight() < another.weight()) return -1;else if (root.weight() > another.weight()) return 1;else return 0;}}
- implements Comparables接口,实现CompareTo()方法,若泛型HuffTree,则compareTo中参数也必须为HuffTree
 - 实际上比较的就是HuffTree的权重和,所以HuffTree类实现接口和compareTo()方法
 - Comparator和Comparable之间的区别
 
test类和静态buildHuff()
import edu.princeton.cs.algs4.StdIn;public class test<T> {public static void main(String[] args){HuffTree treeArray[] = new HuffTree[100];int i = 0;while (! StdIn.isEmpty()){treeArray[i++] = new HuffTree(StdIn.readChar(), StdIn.readInt());StdIn.readChar();}buildHuff(treeArray,i);}public static HuffTree buildHuff(HuffTree [] treeArray, int count){Heap forest = new Heap(treeArray, count, count);forest.buildHeap();HuffTree temp1, temp2, temp3 = null;while (forest.size() > 1){temp1 = (HuffTree) forest.removefirst();temp2 = (HuffTree) forest.removefirst();temp3 = new HuffTree(temp1, temp2);forest.insert(temp3);}return temp3;}}
- 第一个readChar()读入word,第二个用readInt()读入weight则中间的空格会被忽略,需要额外一次readChar()读取两个word组中间的空格或回车
 
