Formula

The minor compaction StoreFile selection logic is size-based: a file is selected for compaction when
size(file) <= sum(smaller_files) * hbase.hstore.compaction.ratio.
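The ratio test can be written as a one-line predicate. This is a minimal sketch, not HBase code: the function and parameter names are hypothetical, and it assumes that smaller_files means the files that follow the candidate in oldest-to-newest order, which is how the examples in this section apply the rule.

```python
def passes_ratio_check(file_size, newer_file_sizes, ratio):
    """Return True when file_size <= sum(newer_file_sizes) * ratio.

    Hypothetical helper for illustration; sizes are in bytes.
    """
    return file_size <= sum(newer_file_sizes) * ratio


# A 40-byte file followed by 30- and 20-byte files.
print(passes_ratio_check(40, [30, 20], 1.0))  # True: 40 <= 50
print(passes_ratio_check(40, [30, 20], 0.5))  # False: 40 > 25
```

A larger ratio makes older, bigger files more likely to be pulled into a compaction; a smaller ratio makes selection stricter.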


Examples

Minor Compaction File Selection - Example #1 (Basic Example)

This example mirrors an example from the unit test TestCompactSelection.
  • hbase.hstore.compaction.ratio = 1.0f
  • hbase.hstore.compaction.min = 3 (files)
  • hbase.hstore.compaction.max = 5 (files)
  • hbase.hstore.compaction.min.size = 10 (bytes)
  • hbase.hstore.compaction.max.size = 1000 (bytes)

The following StoreFiles exist: 100, 50, 23, 12, and 12 bytes apiece (oldest to newest). With the above parameters, the files that would be selected for minor compaction are 23, 12, and 12.
Why?

  • 100 → No, because sum(50, 23, 12, 12) * 1.0 = 97.
  • 50 → No, because sum(23, 12, 12) * 1.0 = 47.
  • 23 → Yes, because sum(12, 12) * 1.0 = 24.
  • 12 → Yes, because the previous file was included and this does not exceed the max-file limit of 5.
  • 12 → Yes, because the previous file was included and this does not exceed the max-file limit of 5.
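The walk-through above can be reproduced with a short function. This is a simplified sketch of the selection rule as this section describes it, not the actual HBase implementation: the function name and keyword arguments are hypothetical stand-ins for the hbase.hstore.compaction.* settings, the min-size handling assumes that files at or below hbase.hstore.compaction.min.size always qualify, and hbase.hstore.compaction.max.size is left out because no file in these examples comes near it.

```python
def select_for_minor_compaction(sizes, ratio=1.0, min_files=3,
                                max_files=5, min_size=10):
    """Return the file sizes selected for compaction.

    `sizes` is ordered oldest to newest, in bytes.
    """
    start = 0
    # Skip older files that are too large compared with the files after them.
    while start < len(sizes):
        threshold = max(min_size, sum(sizes[start + 1:]) * ratio)
        if sizes[start] <= threshold:
            break
        start += 1
    # Once one file qualifies, every newer file is included, up to max_files.
    selected = sizes[start:start + max_files]
    # With fewer than min_files candidates, no compaction is started.
    return selected if len(selected) >= min_files else []


print(select_for_minor_compaction([100, 50, 23, 12, 12]))  # [23, 12, 12]
```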

Minor Compaction File Selection - Example #2 (Not Enough Files To Compact)

This example mirrors an example from the unit test TestCompactSelection.

  • hbase.hstore.compaction.ratio = 1.0f
  • hbase.hstore.compaction.min = 3 (files)
  • hbase.hstore.compaction.max = 5 (files)
  • hbase.hstore.compaction.min.size = 10 (bytes)
  • hbase.hstore.compaction.max.size = 1000 (bytes)

The following StoreFiles exist: 100, 25, 12, and 12 bytes apiece (oldest to newest). With the above parameters, no compaction will be started.
Why?

  • 100 → No, because sum(25, 12, 12) * 1.0 = 49.
  • 25 → No, because sum(12, 12) * 1.0 = 24.
  • 12 → No. It is a candidate because sum(12) * 1.0 = 12, but there are only 2 files to compact, which is less than the threshold of 3.
  • 12 → No. It is a candidate because the previous StoreFile was, but there are not enough files to compact.
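The distinction between being a candidate and actually being selected is easier to see in code. The snippet below is an illustrative sketch with hypothetical variable names, not HBase source; it applies only the ratio test and the minimum-file threshold from this example.

```python
sizes = [100, 25, 12, 12]  # bytes, oldest to newest
ratio, min_files = 1.0, 3

# Index of the oldest file that passes the ratio test
# (len(sizes) if no file passes).
first = next(
    (i for i, size in enumerate(sizes) if size <= sum(sizes[i + 1:]) * ratio),
    len(sizes),
)

# That file and every newer one become candidates.
candidates = sizes[first:]

# Candidates are only compacted when there are at least min_files of them.
selected = candidates if len(candidates) >= min_files else []

print(candidates)  # [12, 12]
print(selected)    # []
```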

Minor Compaction File Selection - Example #3 (Limiting Files To Compact)

This example mirrors an example from the unit test TestCompactSelection.

  • hbase.hstore.compaction.ratio = 1.0f
  • hbase.hstore.compaction.min = 3 (files)
  • hbase.hstore.compaction.max = 5 (files)
  • hbase.hstore.compaction.min.size = 10 (bytes)
  • hbase.hstore.compaction.max.size = 1000 (bytes)

The following StoreFiles exist: 7, 6, 5, 4, 3, 2, and 1 bytes apiece (oldest to newest). With the above parameters, the files that would be selected for minor compaction are 7, 6, 5, 4, 3.
Why?

  • 7 → Yes, because sum(6, 5, 4, 3, 2, 1) * 1.0 = 21. Also, 7 is less than the min-size.
  • 6 → Yes, because sum(5, 4, 3, 2, 1) * 1.0 = 15. Also, 6 is less than the min-size.
  • 5 → Yes, because sum(4, 3, 2, 1) * 1.0 = 10. Also, 5 is less than the min-size.
  • 4 → Yes, because sum(3, 2, 1) * 1.0 = 6. Also, 4 is less than the min-size.
  • 3 → Yes, because sum(2, 1) * 1.0 = 3. Also, 3 is less than the min-size.
  • 2 → No. It is a candidate because the previous file was selected and 2 is less than the min-size, but the maximum number of files to compact has been reached.
  • 1 → No. It is a candidate because the previous file was selected and 1 is less than the min-size, but the maximum number of files to compact has been reached.
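The effect of the file-count cap can be traced step by step. As before, this is a sketch under stated assumptions rather than HBase code: the variable names are hypothetical, files at or below the min-size are assumed to qualify regardless of the ratio test, and the sum is taken over all newer files, as in the bullets above.

```python
sizes = [7, 6, 5, 4, 3, 2, 1]  # bytes, oldest to newest
ratio, max_files, min_size = 1.0, 5, 10

candidate_started = False
selected, dropped_by_cap = [], []

for i, size in enumerate(sizes):
    # A file qualifies if it passes the ratio test or is at most min_size.
    threshold = max(min_size, sum(sizes[i + 1:]) * ratio)
    # After the first qualifying file, all newer files are candidates too.
    candidate_started = candidate_started or size <= threshold
    if not candidate_started:
        continue
    # Candidates beyond the cap are left for a later compaction.
    if len(selected) < max_files:
        selected.append(size)
    else:
        dropped_by_cap.append(size)

print(selected)        # [7, 6, 5, 4, 3]
print(dropped_by_cap)  # [2, 1]
```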