expiry_beta1

Matthew Von-Maszewski edited this page Sep 1, 2016 · 16 revisions

Expiry Beta 1 - Fundamental Features

The beta period is over. See the eleveldb and leveldb tags 2.0.27 for the implementation, which includes the iterator bug fix.

Status

  • beta period ended - August 22, 2016
  • announced and available - July 7, 2016
  • preparation - July 5, 2016

Feature Set

This beta release contains only the most basic set of object expiry features. The release is not for production. This new code could unintentionally delete data records that should not be deleted. Basho is making this code available for experimental work and preproduction feedback.

The leveldb expiry beta 1 applies the expiry metadata to all newly written data objects. There are no Riak-level options to enable or disable expiry per object. The C/C++ API does allow specification of non-expiry, write-time expiry, and explicit timestamp expiry during object write operations.
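The three write modes named above can be pictured with a small standalone model. This is only a sketch and not the leveldb API: `ExpiryKind`, `ExpiryMeta`, and `IsExpired` are hypothetical names, times are counted in whole minutes for readability, and treating an object as expired at the exact boundary minute is an assumption.

```cpp
#include <cstdint>

// Hypothetical model, NOT the leveldb API: how an object was tagged when written.
enum class ExpiryKind { kNone, kWriteTime, kExplicit };

struct ExpiryMeta {
    ExpiryKind kind;
    // kWriteTime: minute of the last update; kExplicit: minute at which the
    // object expires; ignored for kNone.
    std::uint64_t timestamp_minutes;
};

// Returns true when the object should be treated as expired at now_minutes.
// expiry_minutes plays the role of the global expiry_minutes setting.
bool IsExpired(const ExpiryMeta& meta, std::uint64_t now_minutes,
               std::uint64_t expiry_minutes)
{
    switch (meta.kind) {
        case ExpiryKind::kNone:
            return false;  // non-expiry objects never age out
        case ExpiryKind::kWriteTime:
            return meta.timestamp_minutes + expiry_minutes <= now_minutes;
        case ExpiryKind::kExplicit:
            return meta.timestamp_minutes <= now_minutes;
    }
    return false;  // unreachable for valid enum values
}
```

With `expiry_minutes` set to 30, an object last written at minute 100 is still live at minute 129 in this model, while an object with an explicit timestamp expires at that timestamp regardless of the global setting.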

Features:

  • global leveldb expiry of key/value objects based upon the last update of the object (write-time expiry)

  • option to erase entire file of expired objects instead of selected removal of expired objects during leveldb compactions (whole file expiry)

  • explicit timestamp expiry that is individualized to each key/value object (explicit-time expiry)

  • leveldb API support for non-Riak applications to set and read both write-time expiry and explicit timestamp expiry information
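The difference between whole file expiry and selected removal during compaction can be sketched as follows. This is an illustrative model under stated assumptions, not the leveldb implementation: `Record`, `RecordExpired`, `WholeFileExpired`, and `RemoveExpired` are hypothetical names, a "file" is simply a vector of records, and an empty file is assumed not to qualify for whole file expiry.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical per-object record; field names are illustrative only.
struct Record {
    bool expires;                      // false = written with non-expiry
    std::uint64_t expires_at_minutes;  // absolute minute at which the object expires
};

bool RecordExpired(const Record& r, std::uint64_t now_minutes)
{
    return r.expires && r.expires_at_minutes <= now_minutes;
}

// Whole file expiry: the file may be erased only if every object in it has expired.
bool WholeFileExpired(const std::vector<Record>& file, std::uint64_t now_minutes)
{
    return !file.empty() &&
           std::all_of(file.begin(), file.end(),
                       [now_minutes](const Record& r) { return RecordExpired(r, now_minutes); });
}

// Selected removal: a compaction keeps only the objects that have not expired.
std::vector<Record> RemoveExpired(const std::vector<Record>& file, std::uint64_t now_minutes)
{
    std::vector<Record> kept;
    std::copy_if(file.begin(), file.end(), std::back_inserter(kept),
                 [now_minutes](const Record& r) { return !RecordExpired(r, now_minutes); });
    return kept;
}
```

In this model a single non-expiring or still-live object keeps the whole file in place, so that file sheds expired objects only through selected removal.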

Detailed design notes are found here.

Future Features

  • Riak specific: selection of expiry options based upon Riak Buckets and Bucket Types

  • Riak specific: Riak client access to write time and explicit timestamp metadata supporting expiry

  • Riak specific: logging and/or archiving of expired objects and/or whole files expired

  • Riak specific: write time and explicit expiry metadata propagated by handoff, replication, and repair operations

Known Bugs / Limitations

WARNING: This code changes the format of keys within the .sst table files and the .sst table file metadata within leveldb's MANIFEST file. Only recent releases of leveldb prior to this beta release are capable of safely reading the new format (specifically eleveldb and leveldb tags 2.0.22). If you use this expiry release and later want to fall back to a prior release, you need 2.0.22 to retain test data written by expiry_beta1.

  • Google's original block cache code does not actively release cached file blocks for closed or deleted .sst table files. This behavior makes leveldb appear to have a memory leak as files quietly expire but memory footprint increases to its configured maximum (configured in leveldb::Options).

  • Riak specific: key/value objects transferred via handoff, multi-datacenter replication, or rebuilt by read repair/AAE have their write-time expiry restarted.

  • Riak specific: the expiry options apply to all leveldb instances in a multi-backend configuration

  • Riak TS specific: the prepared eleveldb.so patch files do NOT work with Riak TS

  • Bug report August 19, 2016: The expiry code is not operational for iterator operations via the DB::NewIterator() interface. The fix is contained in branch mv-expiry-iter-bug and is likely to be on branch develop by end of day August 20th.

Example configurations

  • leveldb C/C++ API users:
    #include "leveldb/options.h"
    #include "leveldb_os/expiry_os.h"

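    leveldb::Options opts;
    leveldb::ExpiryPtr_t expiry;  // reference counted pointer

    // create the expiry module; the reference counted pointer takes ownership
    expiry.assign(new leveldb::ExpiryModuleOS);

    expiry->expiry_enabled=true;     // turn expiry on
    expiry->expiry_minutes=30;       // write-time expiry: 30 minutes after last update
    expiry->whole_file_expiry=true;  // erase entire files of expired objects

    // attach the expiry module to the options
    opts.expiry_module.assign(expiry);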
  • Riak users' advanced.config (note: for the beta, advanced.config contains the expiry options; future releases will have the options within riak.conf like other eleveldb/leveldb options):
    [
        {eleveldb, [
           {expiry_enabled, true},
           {expiry_minutes, 21},
           {whole_file_expiry, true}
        ]}
    ].

Prebuilt patches for Riak eleveldb.so

If a package upgrade is not possible in your environment, the LevelDB shared library can be patched.

Do not apply the eleveldb.so patch to Riak TS. This patch will prevent Riak TS from functioning.

Installation and Removal Instructions

This patch contains natively compiled code. The eleveldb.so file must be installed to the eleveldb priv directory and cannot be added to basho-patches.

By default, this directory is in the following locations per OS:

  • RHEL/CentOS - /usr/lib64/riak/lib/eleveldb-2.1.10-0-g0537ca9/priv/,
  • Debian/Ubuntu - /usr/lib/riak/lib/eleveldb-2.1.10-0-g0537ca9/priv/,
  • Solaris - /opt/riak/lib/eleveldb-2.1.10-0-g0537ca9/priv/,
  • SmartOS - /opt/local/lib/riak/lib/eleveldb-2.1.10-0-g0537ca9/priv/,
  • FreeBSD - /usr/local/lib/riak/lib/eleveldb-2.1.10-0-g0537ca9/priv/,
  • On other platforms it may be /usr/lib/riak/lib/eleveldb-2.1.10-0-g0537ca9/priv/.

To install this patch, on each node in the cluster you must:

  1. Stop the node: riak stop.
  2. Change to the eleveldb priv directory (similar to /usr/lib64/riak/lib/eleveldb-2.1.10-0-g0537ca9/priv/).
  3. Rename the original leveldb library (eleveldb.so) to eleveldb.so.orig.
  4. Copy the provided eleveldb.so to the directory and verify correct permissions.
  5. If possible, verify that the MD5 checksum of the eleveldb.so located in the eleveldb priv directory is correct.
  6. Start the node: riak start.

To back out of this patch, on each node in the cluster you must:

  1. Stop the node: riak stop.
  2. Remove the patched eleveldb.so file from the eleveldb priv directory.
  3. Rename eleveldb.so.orig to eleveldb.so.
  4. Start the node: riak start.

Verifying the Patch Installation

When the patch is installed, the LevelDB LOG files will report that version expiry_beta1 is installed. The LOG files for each running vnode will have a log line similar to the following:

    2016/07/05-18:42:50.544293 7ffaaf3b1700             Version: expiry_beta1
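The check can be automated by scanning a LOG file for that line. The following is a minimal sketch, assuming only the line shape shown above (the text "Version:" followed by whitespace and the version string); `LogReportsVersion` and `LogTextReportsVersion` are hypothetical helper names and are not part of leveldb or Riak.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Returns true if any line of the stream contains "Version:" followed by
// exactly the given tag, e.g. "expiry_beta1".
bool LogReportsVersion(std::istream& log, const std::string& tag)
{
    const std::string marker = "Version:";
    std::string line;
    while (std::getline(log, line)) {
        const std::size_t pos = line.find(marker);
        if (pos == std::string::npos) continue;
        // Skip the whitespace between the marker and the version string.
        const std::size_t start = line.find_first_not_of(" \t", pos + marker.size());
        if (start == std::string::npos) continue;
        // Take the version token up to the next whitespace or end of line.
        const std::size_t end = line.find_first_of(" \t\r", start);
        const std::string token =
            line.substr(start, end == std::string::npos ? std::string::npos : end - start);
        if (token == tag) return true;
    }
    return false;
}

// Convenience wrapper for log text that is already held in memory.
bool LogTextReportsVersion(const std::string& text, const std::string& tag)
{
    std::istringstream stream(text);
    return LogReportsVersion(stream, tag);
}
```

To check a real vnode, open its LOG file with `std::ifstream` and pass the stream to `LogReportsVersion` with the tag `expiry_beta1`.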

Source code

Source code is available from both the basho/eleveldb and basho/leveldb repositories at github.com. The tag expiry_beta1 is set within both repositories to mark the source code related to this beta test. Explicit links for tar.gz download are:

Feedback

Please send all feedback to [email protected] (or [email protected] if necessary).