mv timed grooming

Matthew Von-Maszewski edited this page Sep 30, 2015 · 8 revisions

Status

  • merged to master -
  • code complete - August 20, 2015
  • development started - August 20, 2015

History / Context

Basho's changes to leveldb's compaction strategy have previously focused on heavy write loads. This branch is instead focused on light to medium write loads. Its logic does not have an opportunity to activate during heavy write loads. Therefore, this branch is an additional strategy, not a replacement.

The existing strategy creates more efficient compactions by waiting until roughly six overlapping .sst table files exist at level 0, then compacting all of them into one overlapping .sst table file at level 1. Similarly, the strategy waits for six .sst table files at level 1 before compacting into level 2. The strategy is very effective under heavy write loads, both single database (vnode) loads and multiple database loads.

There is a downside to the existing strategy for light and medium write loads. Read performance temporarily drops noticeably if there have been no compactions and then one or more databases (vnodes) suddenly start a six-file compaction. Read performance would be more consistent if light and medium write loads compacted smaller sets of files more often. This branch initiates smaller compaction sets based upon elapsed time.
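The interaction between the two triggers can be sketched as follows. This is a minimal illustration, not the branch's code: `LevelState`, `PickCompaction`, the two-file minimum, and the 300-second interval are all hypothetical. Only the "roughly six files" threshold comes from the description above.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical snapshot of one level; not a type from leveldb.
struct LevelState {
    std::size_t file_count;                      // .sst files currently at this level
    std::chrono::seconds since_last_compaction;  // time since this level last compacted
};

// "Roughly six" files is the existing trigger described above.
constexpr std::size_t kFileCountTrigger = 6;
// Assumed values for illustration only; the real thresholds are not stated here.
constexpr std::size_t kTimedMinFiles = 2;
constexpr std::chrono::seconds kGroomingInterval{300};

enum class CompactionReason { kNone, kFileCount, kTimed };

// The file-count trigger is checked first, so under heavy write loads the
// timed trigger never gets a chance to fire.
inline CompactionReason PickCompaction(const LevelState& s) {
    if (s.file_count >= kFileCountTrigger) {
        return CompactionReason::kFileCount;
    }
    if (s.file_count >= kTimedMinFiles &&
        s.since_last_compaction >= kGroomingInterval) {
        return CompactionReason::kTimed;
    }
    return CompactionReason::kNone;
}
```

With these assumed values, a level holding three files that has been idle for ten minutes would be compacted by the timed trigger, while the same three files after ten seconds would be left alone.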

Branch Description

.gitignore

This change corrects .gitignore to exclude sst_rewrite from git's check-in analysis. It fixes a previous branch and is not directly related to timed compactions.

db/db_impl.cc

Unit testing uncovered a race condition relating to shutdown and background processing of DBImpl::BackgroundCall2() and DBImpl::BackgroundImmCompactCall(). The two routines now contain a test for a shutdown scenario and no longer report an error to the LOG file in that situation.
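The general shape of such a shutdown test might look like the sketch below. It is an assumption-laden illustration rather than the actual DBImpl code: `BackgroundWorker`, `RequestShutdown`, and the in-memory log are hypothetical stand-ins for the real background routines and LOG file.

```cpp
#include <atomic>
#include <string>
#include <vector>

// Hypothetical stand-in for a database object with background work.
class BackgroundWorker {
public:
    // Called by the owner when the database begins closing.
    void RequestShutdown() {
        shutting_down_.store(true, std::memory_order_release);
    }

    // Simulates the tail of a background call: `work_succeeded` is the
    // outcome of the compaction attempt.
    void BackgroundCall(bool work_succeeded) {
        if (work_succeeded) {
            return;
        }
        // A failure caused by shutdown is expected, so it is not logged.
        if (shutting_down_.load(std::memory_order_acquire)) {
            return;
        }
        log_.push_back("background work failed");
    }

    const std::vector<std::string>& log() const { return log_; }

private:
    std::atomic<bool> shutting_down_{false};
    std::vector<std::string> log_;  // stands in for the LOG file
};
```

In this sketch, a failure before `RequestShutdown()` adds a log entry, while the same failure afterwards is silently ignored.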

**DBImpl::MakeRoomForWrite() contains logic for accelerating when write buffers flush. THIS LOGIC WAS CHECKED IN TO FACILITATE TESTING. IT PERFORMED POORLY IN TESTING AND MUST BE REMOVED. All changes within this function will be reverted.**
