fix uninitialized variable in xz compressor that causes compress-test to fail memcheck #339

prohaska7 · 2015-11-11T15:12:01Z

compress-test fails with valgrind due to conditional jump depended on an uninitialized value. the uninitialized value occurs because the lzma encoder object is not fully initialized by its constructor. the bug fix is to initialize the variables in the encoder's constructor. see http://prohaska7.blogspot.com/2015/11/uninitialized-data-problem-in-lzma.html for details.

the test is to run 'make check' on the third party xz software.

the alternative fix is to add valgrind suppressions for the errors.

note that the fix has been made to the head of the xz git repo http://git.tukaani.org/?p=xz.git;a=summary to be release in some version post v5.2.2.

tokuft currently includes xz version 4.999.9beta from Aug 2009. perhaps tokuft should either use lzma system library (whatever version) or update to the latest version.

here is the valgrind report:

==2368== Conditional jump or move depends on uninitialised value(s)
==2368== at 0x4F5F21D: lz_encoder_prepare (lz_encoder.c:222)
==2368== by 0x4F5F8DA: lzma_lz_encoder_init (lz_encoder.c:516)
==2368== by 0x4F5F0CE: lzma_raw_coder_init (filter_common.c:212)
==2368== by 0x4F52FF1: block_encode_normal (block_buffer_encoder.c:192)
==2368== by 0x4F52FF1: lzma_block_buffer_encode (block_buffer_encoder.c:258)
==2368== by 0x4F4F63D: lzma_stream_buffer_encode (stream_buffer_encoder.c:93)
==2368== by 0x4F4F4A3: lzma_easy_buffer_encode (easy_buffer_encoder.c:27)
==2368== by 0x4F046E9: toku_compress(toku_compression_method, unsigned char_, unsigned long_, unsigned char const_, unsigned long) (compress.cc:141)
==2368== by 0x4022A7: test_compress_buf_method(unsigned char_, int, toku_compression_method) (compress-test.cc:54)
==2368== by 0x4023B4: test_compress_i(int, toku_compression_method, unsigned long_, unsigned long_) (compress-test.cc:66)
==2368== by 0x4024E8: test_compress(toku_compression_method, unsigned long_, unsigned long_) (compress-test.cc:83)
==2368== by 0x402841: test_compress_methods() (compress-test.cc:123)
==2368== by 0x402A13: test_main(int, char const**) (compress-test.cc:142)
==2368== by 0x4021DB: main (test.h:346)
==2368==
==2368== Conditional jump or move depends on uninitialised value(s)
==2368== at 0x4F5F32D: lz_encoder_prepare (lz_encoder.c:344)
==2368== by 0x4F5F8DA: lzma_lz_encoder_init (lz_encoder.c:516)
==2368== by 0x4F5F0CE: lzma_raw_coder_init (filter_common.c:212)
==2368== by 0x4F52FF1: block_encode_normal (block_buffer_encoder.c:192)
==2368== by 0x4F52FF1: lzma_block_buffer_encode (block_buffer_encoder.c:258)
==2368== by 0x4F4F63D: lzma_stream_buffer_encode (stream_buffer_encoder.c:93)
==2368== by 0x4F4F4A3: lzma_easy_buffer_encode (easy_buffer_encoder.c:27)
==2368== by 0x4F046E9: toku_compress(toku_compression_method, unsigned char_, unsigned long_, unsigned char const_, unsigned long) (compress.cc:141)
==2368== by 0x4022A7: test_compress_buf_method(unsigned char_, int, toku_compression_method) (compress-test.cc:54)
==2368== by 0x4023B4: test_compress_i(int, toku_compression_method, unsigned long_, unsigned long_) (compress-test.cc:66)
==2368== by 0x4024E8: test_compress(toku_compression_method, unsigned long_, unsigned long_) (compress-test.cc:83)
==2368== by 0x402841: test_compress_methods() (compress-test.cc:123)
==2368== by 0x402A13: test_main(int, char const**) (compress-test.cc:142)
==2368== by 0x4021DB: main (test.h:346)
==2368==

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

… an uninitialized value. the uninitialized value occurs because the lzma encoder object is not fully initialized by its constructor. the bug fix is to initialize the variables in the encoder's constructor. the test is to run 'make check' on the third party xz software. the alternative fix is to add valgrind suppressions for the errors. here is the valgrind report: ==2368== Conditional jump or move depends on uninitialised value(s) ==2368== at 0x4F5F21D: lz_encoder_prepare (lz_encoder.c:222) ==2368== by 0x4F5F8DA: lzma_lz_encoder_init (lz_encoder.c:516) ==2368== by 0x4F5F0CE: lzma_raw_coder_init (filter_common.c:212) ==2368== by 0x4F52FF1: block_encode_normal (block_buffer_encoder.c:192) ==2368== by 0x4F52FF1: lzma_block_buffer_encode (block_buffer_encoder.c:258) ==2368== by 0x4F4F63D: lzma_stream_buffer_encode (stream_buffer_encoder.c:93) ==2368== by 0x4F4F4A3: lzma_easy_buffer_encode (easy_buffer_encoder.c:27) ==2368== by 0x4F046E9: toku_compress(toku_compression_method, unsigned char*, unsigned long*, unsigned char const*, unsigned long) (compress.cc:141) ==2368== by 0x4022A7: test_compress_buf_method(unsigned char*, int, toku_compression_method) (compress-test.cc:54) ==2368== by 0x4023B4: test_compress_i(int, toku_compression_method, unsigned long*, unsigned long*) (compress-test.cc:66) ==2368== by 0x4024E8: test_compress(toku_compression_method, unsigned long*, unsigned long*) (compress-test.cc:83) ==2368== by 0x402841: test_compress_methods() (compress-test.cc:123) ==2368== by 0x402A13: test_main(int, char const**) (compress-test.cc:142) ==2368== by 0x4021DB: main (test.h:346) ==2368== ==2368== Conditional jump or move depends on uninitialised value(s) ==2368== at 0x4F5F32D: lz_encoder_prepare (lz_encoder.c:344) ==2368== by 0x4F5F8DA: lzma_lz_encoder_init (lz_encoder.c:516) ==2368== by 0x4F5F0CE: lzma_raw_coder_init (filter_common.c:212) ==2368== by 0x4F52FF1: block_encode_normal (block_buffer_encoder.c:192) ==2368== by 0x4F52FF1: lzma_block_buffer_encode (block_buffer_encoder.c:258) ==2368== by 0x4F4F63D: lzma_stream_buffer_encode (stream_buffer_encoder.c:93) ==2368== by 0x4F4F4A3: lzma_easy_buffer_encode (easy_buffer_encoder.c:27) ==2368== by 0x4F046E9: toku_compress(toku_compression_method, unsigned char*, unsigned long*, unsigned char const*, unsigned long) (compress.cc:141) ==2368== by 0x4022A7: test_compress_buf_method(unsigned char*, int, toku_compression_method) (compress-test.cc:54) ==2368== by 0x4023B4: test_compress_i(int, toku_compression_method, unsigned long*, unsigned long*) (compress-test.cc:66) ==2368== by 0x4024E8: test_compress(toku_compression_method, unsigned long*, unsigned long*) (compress-test.cc:83) ==2368== by 0x402841: test_compress_methods() (compress-test.cc:123) ==2368== by 0x402A13: test_main(int, char const**) (compress-test.cc:142) ==2368== by 0x4021DB: main (test.h:346) ==2368==

laurynas-biveinis · 2015-11-12T06:57:32Z

At least as far as MySQL is concerned, we should be using system libraries for distribution packaging

kuszmaul · 2015-11-12T12:37:11Z

Is the distribution packaging the same dissue as Rich's bug? Are you
suggesting that xz be removed from our sources and use the system-provided
xz instead? If so, be careful about making that change: the last time I
measured it, using the system's library version of the compressor resulted
in slower code. Maybe things are better now, or maybe
link-time-optimization was important to the performance of this call and it
will still be an issue, or maybe....

-Bradley

On Thu, Nov 12, 2015 at 1:57 AM, Laurynas Biveinis <[email protected]

wrote:

At least as far as MySQL is concerned, we should be using system libraries
for distribution packaging

—
Reply to this email directly or view it on GitHub
#339 (comment).

george-lorch · 2015-11-12T19:06:03Z

We should always try to use distro packages for 3rd party libs unless some truly horrible error or significant performance issue is present. Embedding 3rd party libs/source into a project where the same libs exist within a distros archives can prevent the product from being included within that distros archive. We ran into this when getting Percona Server 5.6 (without TokuDB) included into the Ubuntu distribution (it was an ssl lib IIRC that was the blocker) and we might have to submit PS 5.7 without TokuDB to Ubuntu for XenialXerus - 16.04 LTS because of the include libs. Somehow MariaDB got a pass on this last year but they got a lot of passes that we did not.

Besides the possible version mismatches and the already mentioned potential performance differences, another problem with relying on distro libs is that not all supported distros will have the dependent packages available and possibly force the user to need to build and install the libs themselves or force us to conditionally compile various lib use out for the distro. I don't know about lzma, seems to me it should be pretty widely available, but Snappy is not very likely to be available in any of the older supported distros.

Full circle is that for now we will probably have to continue embedding these (lzma, snappy) into our tree until we know for sure that all supported distros have the libs available.

I don't particularly like changing the embedded libs source at all regardless of what kind of bug is present, I would prefer to do a proper upstream bug report and absorb any new versions back into the tree if/when the bug gets fixed.

prohaska7 · 2015-11-12T19:12:52Z

The xz git repo (http://git.tukaani.org/?p=xz.git;a=summary) has this bug
fix on its head after the last tagged version v5.2.2.

There needs to be some test infrastructure that may be used to measure the
performance of the compressors so that the various versions can be
compared.

On Thu, Nov 12, 2015 at 2:06 PM, georgelorchpercona <
[email protected]> wrote:

We should always try to use distro packages for 3rd party libs unless some
truly horrible error or significant performance issue is present.
Embedding 3rd party libs/source into a project where the same libs exist
within a distros archives can prevent the product from being included
within that distros archive. We ran into this when getting Percona Server
5.6 (without TokuDB) included into the Ubuntu distribution (it was an ssl
lib IIRC that was the blocker) and we might have to submit PS 5.7 without
TokuDB to Ubuntu for XenialXerus - 16.04 LTS because of the include libs.
Somehow MariaDB got a pass on this last year but they got a lot of passes
that we did not.

Besides the possible version mismatches and the already mentioned
potential performance differences, another problem with relying on distro
libs is that not all supported distros will have the dependent packages
available and possibly force the user to need to build and install the libs
themselves or force us to conditionally compile various lib use out for the
distro. I don't know about lzma, seems to me it should be pretty widely
available, but Snappy is not very likely to be available in any of the
older supported distros.

Full circle is that for now we will probably have to continue embedding
these (lzma, snappy) into our tree until we know for sure that all
supported distros have the libs available.

I don't particularly like changing the embedded libs source at all
regardless of what kind of bug is present, I would prefer to do a proper
upstream bug report and absorb any new versions back into the tree if/when
the bug gets fixed.

—
Reply to this email directly or view it on GitHub
#339 (comment).

prohaska7 · 2016-02-03T14:42:09Z

'ctest -D ExperimentalMemCheck' fails due to the uninitialized variable in the lzma encoder. Please add this suppression so that a bunch of tests that currently fail due to this benign bug pass.

ft/valgrind.suppressions
{
    lzma_encode_is_not_valgrind_clean
    Memcheck:Cond
    fun:lz_encoder_prepare
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix uninitialized variable in xz compressor that causes compress-test to fail memcheck #339

fix uninitialized variable in xz compressor that causes compress-test to fail memcheck #339

prohaska7 commented Nov 11, 2015

laurynas-biveinis commented Nov 12, 2015

kuszmaul commented Nov 12, 2015

george-lorch commented Nov 12, 2015

prohaska7 commented Nov 12, 2015

prohaska7 commented Feb 3, 2016

fix uninitialized variable in xz compressor that causes compress-test to fail memcheck #339

Are you sure you want to change the base?

fix uninitialized variable in xz compressor that causes compress-test to fail memcheck #339

Conversation

prohaska7 commented Nov 11, 2015

laurynas-biveinis commented Nov 12, 2015

kuszmaul commented Nov 12, 2015

george-lorch commented Nov 12, 2015

prohaska7 commented Nov 12, 2015

prohaska7 commented Feb 3, 2016