Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discrepancy between memory_fits(...) and memory_usage(...) #527

Closed
SSoelvsten opened this issue Aug 10, 2023 · 24 comments · Fixed by #530
Closed

Discrepancy between memory_fits(...) and memory_usage(...) #527

SSoelvsten opened this issue Aug 10, 2023 · 24 comments · Fixed by #530
Assignees
Labels
🔥 bug Something isn't working 📁 internal This is where the ✨magic✨happens

Comments

@SSoelvsten
Copy link
Owner

SSoelvsten commented Aug 10, 2023

@nhusung tried to use the bdd-benchmark set to compare a new BDD library to existing ones. However, there are problems with Adiar:

$ build/src/adiar_queens_bdd -N 11 -M 4096
11-Queens (Adiar [BDD] 4096 MiB):

   Adiar [BDD] initialisation:
   | time (ms):              1

   Decision diagram construction:
Assert failed:  memory_fits and memory_usage should agree.
Expected:       unsafe_memory_usage(ret) <= memory_bytes
Source:         /[...]/bdd-benchmark/external/adiar/src/adiar/../adiar/internal/data_structures/sorter.h, line 56

Interestingly, they couldn’t observe a fail of this assertion in debug mode. Even in release mode, when inserting a debug print just above the mentioned line, the assert did not fail.

(I’m on Linux 6.4 and use clang 15.0.7)

Originally posted by @nhusung in #388 (comment)

@SSoelvsten SSoelvsten added the 🔥 bug Something isn't working label Aug 10, 2023
@SSoelvsten
Copy link
Owner Author

This seems related to #329 / #331

@SSoelvsten
Copy link
Owner Author

SSoelvsten commented Aug 10, 2023

@nhusung what version of Adiar/BDD Benchmarks are you using?

  • I tried to run the Queens example program on this repository with Debug mode and I do not get this assert statement failing.
  • I tried to run the latest version of the bdd-benchmark repository and I do not get that error either.

I do use Linux 6.5 and GCC v13.2, so it might also be a weird difference between the compilers. Just to be sure, can you try some of the following things for me, please?

  • If you are using an older version of the benchmarks repository (i.e. up until eb4a4e85), then try to update the version of Adiar to the latest non-breaking release, i.e. 1.2.2.

  • If it is at a later commit then you need to use one of the v2.0.0 beta versions (there are already commits forwarding to 2.0-beta.3).

    Remember to have updated your submodules.

    git submodule update --init --recursive

@SSoelvsten SSoelvsten self-assigned this Aug 10, 2023
@SSoelvsten SSoelvsten changed the title I tried to use the bdd-benchmark set to compare a new BDD library to existing ones. However, I’m experiencing problems with Adiar: Discrepancy between memory_fits(...) and memory_usage(...) Aug 10, 2023
@nhusung
Copy link
Contributor

nhusung commented Aug 10, 2023

I’m using the latest Adiar version (601adfd0) and my own fork of bdd-benchmark. I just tried the queens example from the adiar repo out, and the result is basically the same (again using clang 15.0.7):

./example/queens -M 4096 -N 11
| Initialized Adiar with 4096 MiB of memory
|
| 11-Queens : Board construction
Assert failed:	memory_fits and memory_usage should agree.
Expected:	unsafe_memory_usage(ret) <= memory_bytes
Source:		/[...]/adiar/src/adiar/../adiar/internal/data_structures/sorter.h, line 56
zsh: IOT instruction (core dumped)  ./example/queens -M 4096 -N 11

However, when compiling with GCC 13.2.1, the benchmark runs fine.

@SSoelvsten
Copy link
Owner Author

SSoelvsten commented Aug 10, 2023

Thanks for confirming it is a compiler-specific issue. I do not believe it makes Adiar use more memory than it is given (if it does, you'll get a warning) so if you need something quick, you should just comment out this line (or compile with Release instead).

The Clang-specific fix will take me some time to dig through.

@nhusung
Copy link
Contributor

nhusung commented Aug 10, 2023

No, building in Release mode does not fix the problem. The problem occurs in release mode only. (I used CC=clang CXX=clang++ cmake -G Ninja -DCMAKE_BUILD_TYPE=Release .. to configure Adiar).

Edit: This is not a severe problem for my benchmarks; I can simply use GCC-compiled binaries for Adiar. I just experienced severe performance issues with Sylvan when using GCC-compiled binaries. I have not analyzed that problem yet. I just thought that the comparison would be fairer when using the same compiler for all packages.

I also was not able to build the all target using clang. The culprit is some test of cnl. Maybe you want to replace add_subdirectory (external/cnl cnl) by add_subdirectory(external/cnl cnl EXCLUDE_FROM_ALL) in CMakeLists.txt (and add_subdirectory (external/tpie tpie) accordingly).

@SSoelvsten
Copy link
Owner Author

SSoelvsten commented Aug 10, 2023

It definitely would be the best if you can compare them with the same compiler. If I were you, I would just remove said line of Adiar.

The fact that it crashes for Release is an error of mine. It should have been a debug-only statement. This probably shows that I ought to clean up the assertion stuff ( See #422 and now #529 ).

Thanks for the hint on EXCLUDE_FROM_ALL tip; I definitely am not an expert on CMake. See #528 .

Regards to the memory usage computations with Clang, this is something I still have to dig into.

SSoelvsten added a commit that referenced this issue Aug 10, 2023
SSoelvsten added a commit that referenced this issue Aug 10, 2023
@SSoelvsten SSoelvsten added the 📁 internal This is where the ✨magic✨happens label Aug 10, 2023
@SSoelvsten
Copy link
Owner Author

SSoelvsten commented Aug 11, 2023

The adiar::sorter<memory_type::INTERNAL, T> class uses a tpie::array<T> inside of it. Specifically, the error occurs in the memory_fits(...) function which essentially is a wrapper for the tpie::array<T>'s memory_fits(...). Yet, just for a sanity check, I also check that the number of elements that TPIE claims fit indeed fits into this amount of memory. That is, I check it with its (hopefully) inverse memory_usage(...) function.

The memory computations are in a helper class linear_memory_base in tpie/util.h. Here is a noteworthy snippet of it.

template <typename child_t> 
struct linear_memory_base {
	static constexpr memory_size_type memory_usage(memory_size_type size) noexcept
	{
		return static_cast<memory_size_type>(
			floor(static_cast<double>(size) * child_t::memory_coefficient()
				+ child_t::memory_overhead())
		);
	}

	static constexpr memory_size_type memory_fits(memory_size_type memory) noexcept
	{
		return static_cast<memory_size_type>(
			floor((memory - child_t::memory_overhead()) / child_t::memory_coefficient())
		);
	}

	...
};

The child_t values are given from array.h.

template <typename T, typename Allocator = allocator<T> >
class array : public linear_memory_base<array<T> > {
	...

	static constexpr double memory_coefficient() noexcept
	{
		return (double)sizeof(T);
	}

	static constexpr double memory_overhead() noexcept
	{
		return sizeof(array);
	}

	...
};

There are multiple small things that could be part of the issue

  1. The use of double may lack precision during computation that rounds a value downwards
  2. The use of floor (from <cmath.h>) rounds the result down even further.
  3. The value size is cast to a double, but the value memory is not. Furthermore, the memory_coefficient() includes an explicit (C-like) cast but not memory_overhead().

I'll put back on the detective hat and see what I can do. 🤠 From here though, I probably need to install Clang 15 locally on my machine.

@nhusung
Copy link
Contributor

nhusung commented Aug 11, 2023

Yes, I think you need some version of clang to debug the problem. As I said: I couldn’t observe a fail of this assertion in debug mode. Even in release mode, when inserting a debug print (std::cerr << "..." << std::endl) just above assertion, the example worked. This makes me feel like there is some kind of undefined behavior. I just compiled Adiar with the Undefined Behavior Sanitizer and it was indeed triggered a few times:

$ example/queens -N 11 -M 4096                                 
| Initialized Adiar with 4096 MiB of memory
|
| 11-Queens : Board construction
/[...]/adiar/external/tpie/tpie/../tpie/array.h:56:43: runtime error: addition of unsigned offset to 0x560994593118 overflowed to 0x560994593100
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /[...]/adiar/external/tpie/tpie/../tpie/array.h:56:43 in 
/[...]/adiar/external/tpie/tpie/../tpie/array.h:56:43: runtime error: addition of unsigned offset to 0x560994591fa0 overflowed to 0x560994591f90
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /[...]/adiar/external/tpie/tpie/../tpie/array.h:56:43 in 
/[...]/adiar/external/tpie/tpie/../tpie/array.h:56:43: runtime error: addition of unsigned offset to 0x5609947950b8 overflowed to 0x5609947950a0
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /[...]/adiar/external/tpie/tpie/../tpie/array.h:56:43 in 
/[...]/adiar/external/tpie/tpie/../tpie/array.h:56:43: runtime error: addition of unsigned offset to 0x560994994e70 overflowed to 0x560994994e60
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /[...]/adiar/external/tpie/tpie/../tpie/array.h:56:43 in 
|  | time: 9.87168 s
|  | largest BDD  : 1027599 nodes
|  | final size: 94822 nodes
/[...]/adiar/external/tpie/tpie/../tpie/array.h:56:43: runtime error: addition of unsigned offset to 0x5609947c7660 overflowed to 0x5609947c7648
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /[...]/adiar/external/tpie/tpie/../tpie/array.h:56:43 in 
| 11-Queens : Counting assignments
|  | number of solutions: 2680
|  | time: 0.0287112 s
Adiar statistics

| Not gathered; please compile with 'ADIAR_STATS'.

Here is the CMake command I used: CC=clang CXX=clang++ CXXFLAGS='-fno-omit-frame-pointer -fsanitize=undefined' cmake -G Ninja -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_EXPORT_COMPILE_COMMANDS=1 ..

If you set the environment variable UBSAN_OPTIONS=print_stacktrace=1, you also get a backtrace.

EDIT: I don’t know if the reported errors above are related to this bug. They should be fixed though. Here, the problem seems to be that boost calls the advance method with (size_t) -1, which causes a wrapping pointer addition. AFAIK this is not allowed in C/C++. I’ll open an issue there.

@SSoelvsten
Copy link
Owner Author

Hmm, I don't think that is related. I double-checked my sorter<...> implementation and there are no negative numbers ever used, so I do not believe I accidentally misused the tpie::array<T>.

@nhusung
Copy link
Contributor

nhusung commented Aug 11, 2023

I just “reactivated” the assert by using if (unsafe_memory_usage(ret) > memory_bytes) { abort(); }. Maybe I have found the culprit:

UBSAN_OPTIONS=print_stacktrace=1 example/queens -N 11 -M 4096
| Initialized Adiar with 4096 MiB of memory
|
| 11-Queens : Board construction
/[...]/adiar/external/tpie/tpie/../tpie/util.h:108:4: runtime error: 1.84467e+19 is outside the range of representable values of type 'unsigned long'
    #0 0x5604ed2ee6f0 in tpie::linear_memory_base<tpie::array<adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>, tpie::allocator<adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>>>::memory_usage(unsigned long) /[...]/adiar/external/tpie/tpie/../tpie/util.h:108:4
    #1 0x5604ed2ee6f0 in adiar::internal::sorter<(adiar::memory_mode_t)1, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>, adiar::internal::request_data_lt<1ul, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>>::unsafe_memory_usage(unsigned long) /[...]/adiar/src/adiar/../adiar/internal/data_structures/sorter.h:37:14
    #2 0x5604ed2ee6f0 in adiar::internal::sorter<(adiar::memory_mode_t)1, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>, adiar::internal::request_data_lt<1ul, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>>::memory_fits(unsigned long) /[...]/adiar/src/adiar/../adiar/internal/data_structures/sorter.h:55:11
    #3 0x5604ed2ee6f0 in adiar::internal::sorter<(adiar::memory_mode_t)1, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>, adiar::internal::request_data_lt<1ul, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>>::memory_usage(unsigned long) /[...]/adiar/src/adiar/../adiar/internal/data_structures/sorter.h:44:47
    #4 0x5604ed2ee6f0 in adiar::internal::sorter<(adiar::memory_mode_t)2, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>, adiar::internal::request_data_lt<1ul, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>>::sorter(unsigned long, unsigned long, unsigned long, adiar::internal::request_data_lt<1ul, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>) /[...]/adiar/src/adiar/../adiar/internal/data_structures/sorter.h:210:13
    #5 0x5604ed2ee6f0 in std::__detail::_MakeUniq<adiar::internal::sorter<(adiar::memory_mode_t)2, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>, adiar::internal::request_data_lt<1ul, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>>>::__single_object std::make_unique<adiar::internal::sorter<(adiar::memory_mode_t)2, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>, adiar::internal::request_data_lt<1ul, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>>, unsigned long&, unsigned long&, unsigned long&, adiar::internal::request_data_lt<1ul, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>&>(unsigned long&, unsigned long&, unsigned long&, adiar::internal::request_data_lt<1ul, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>&) /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/bits/unique_ptr.h:1070:34
    #6 0x5604ed2ec9a1 in std::unique_ptr<adiar::internal::sorter<(adiar::memory_mode_t)2, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>, adiar::internal::request_data_lt<1ul, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>>, std::default_delete<adiar::internal::sorter<(adiar::memory_mode_t)2, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>, adiar::internal::request_data_lt<1ul, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>>>> adiar::make_unique<adiar::internal::sorter<(adiar::memory_mode_t)2, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>, adiar::internal::request_data_lt<1ul, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>>, unsigned long&, unsigned long&, unsigned long&, adiar::internal::request_data_lt<1ul, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>&>(unsigned long&, unsigned long&, unsigned long&, adiar::internal::request_data_lt<1ul, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>&) /[...]/adiar/src/adiar/../adiar/internal/memory.h:57:12
    #7 0x5604ed2ec9a1 in adiar::internal::sorter<(adiar::memory_mode_t)2, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>, adiar::internal::request_data_lt<1ul, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>>::make_unique(unsigned long, unsigned long, unsigned long, adiar::internal::request_data_lt<1ul, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>) /[...]/adiar/src/adiar/../adiar/internal/data_structures/sorter.h:182:14
    #8 0x5604ed2ec9a1 in adiar::internal::levelized_priority_queue<adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>, adiar::internal::request_data_lt<1ul, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>, 1u, (adiar::memory_mode_t)2, adiar::internal::shared_file_ptr<adiar::internal::levelized_file<adiar::internal::node, false>>, 2ul, std::less<unsigned int>, false, 0u>::init_buckets() /[...]/adiar/src/adiar/../adiar/internal/data_structures/levelized_priority_queue.h:468:45
    #9 0x5604ed2ebffe in adiar::internal::levelized_priority_queue<adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>, adiar::internal::request_data_lt<1ul, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>, 1u, (adiar::memory_mode_t)2, adiar::internal::shared_file_ptr<adiar::internal::levelized_file<adiar::internal::node, false>>, 2ul, std::less<unsigned int>, false, 0u>::levelized_priority_queue(adiar::internal::dd const (&) [2], unsigned long, unsigned long, adiar::stats_t::levelized_priority_queue_t&) /[...]/adiar/src/adiar/../adiar/internal/data_structures/levelized_priority_queue.h:422:7
    #10 0x5604ed2e3059 in adiar::apply_prod2_policy::unreduced_t adiar::internal::__prod2_ra<adiar::apply_prod2_policy, adiar::internal::levelized_priority_queue<adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>, adiar::internal::request_data_lt<1ul, adiar::internal::request_data<(unsigned char)2, adiar::internal::with_parent, (unsigned char)0, (unsigned char)2>>, 1u, (adiar::memory_mode_t)2, adiar::internal::shared_file_ptr<adiar::internal::levelized_file<adiar::internal::node, false>>, 2ul, std::less<unsigned int>, false, 0u>>(adiar::apply_prod2_policy::reduced_t const&, adiar::apply_prod2_policy::reduced_t const&, std::function<adiar::internal::__uid<adiar::internal::ptr_uint64> (adiar::internal::ptr_uint64 const&, adiar::internal::ptr_uint64 const&)> const&, unsigned long, unsigned long) /[...]/adiar/src/adiar/../adiar/internal/algorithms/prod2.h:252:12
    #11 0x5604ed2e0119 in adiar::apply_prod2_policy::unreduced_t adiar::internal::prod2<adiar::apply_prod2_policy>(adiar::apply_prod2_policy::reduced_t const&, adiar::apply_prod2_policy::reduced_t const&, std::function<adiar::internal::__uid<adiar::internal::ptr_uint64> (adiar::internal::ptr_uint64 const&, adiar::internal::ptr_uint64 const&)> const&) /[...]/adiar/src/adiar/../adiar/internal/algorithms/prod2.h:740:16
    #12 0x5604ed2df295 in adiar::bdd_apply(adiar::bdd const&, adiar::bdd const&, std::function<adiar::internal::__uid<adiar::internal::ptr_uint64> (adiar::internal::ptr_uint64 const&, adiar::internal::ptr_uint64 const&)> const&) /[...]/adiar/src/adiar/bdd/apply.cpp:121:12
    #13 0x5604ed1fb1a8 in adiar::bdd_and(adiar::bdd const&, adiar::bdd const&) /[...]/adiar/src/adiar/../adiar/bdd.h:196:12
    #14 0x5604ed1fb1a8 in adiar::bdd::operator&=(adiar::bdd&&) /[...]/adiar/src/adiar/bdd/bdd.cpp:105:20
    #15 0x5604ed1d78a9 in n_queens_B() /[...]/adiar/example/queens.cpp:164:9
    #16 0x5604ed1d8559 in main /[...]/adiar/example/queens.cpp:381:22
    #17 0x7fc626027ccf  (/usr/lib/libc.so.6+0x27ccf) (BuildId: 8f3291ff820508330af3a5ccb75de43d474ef19a)
    #18 0x7fc626027d89 in __libc_start_main (/usr/lib/libc.so.6+0x27d89) (BuildId: 8f3291ff820508330af3a5ccb75de43d474ef19a)
    #19 0x5604ed1a35a4 in _start (/[...]/adiar/build-sanitize/example/queens+0x1085a4) (BuildId: 6503149e3e0828dd03163e3d9477f88792dfc077)

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /[...]/adiar/external/tpie/tpie/../tpie/util.h:108:4 in 

@nhusung
Copy link
Contributor

nhusung commented Aug 11, 2023

Hmm, I don't think that is related. I double-checked my sorter<...> implementation and there are no negative numbers ever used, so I do not believe I accidentally misused the tpie::array<T>.

No, no, it’s a TPIE internal thing. It happens during internal_priority_queue::push().

@SSoelvsten
Copy link
Owner Author

SSoelvsten commented Aug 11, 2023

Line 108 of <tpie/util.h> seems much more likely to be the culprit, since that is the memory_usage function. Based on the stacktrace the problem is in the initialization of a bucket in the adiar::levelized_priority_queue<...> within the Apply sweep (the random-access version).

The problem stems from the memory_usage function which (quite ironically) tries to do some extra arithmetic to safe-guard against tpie::memory_usage(...) overflowing.

      const tpie::memory_size_type max_value = std::numeric_limits<tpie::memory_size_type>::max();
      const tpie::memory_size_type max_elem = memory_fits(max_value);
      if (no_elements > max_elem) {
        return max_value;
      }
      return unsafe_memory_usage(no_elements);

Indeed the value 1.84467e+19 is a 64-bit unsigned max value. That is, in your compilation with Clang 15, the input, an intermediate value, or the final result is cast to an unsigned long rather than an unsigned long long as is expected (tpie::memory_size_type itself is just a typedef of std::size_t). Hence, it overflows, creating undefined behaviour.

@SSoelvsten
Copy link
Owner Author

SSoelvsten commented Aug 11, 2023

I have created the following unit test in my fork of TPIE which passes with GCC 13 but not Clang 16 on my Fedora 38 machine.

bool array_memory_prediction_test() {
	...

	{ // Test for maximum 'unsigned long long' value
		unsigned long long i_mem = std::numeric_limits<unsigned long long>::max();

		const memory_size_type size  = tpie::array<int>::memory_fits(i_mem);
		const memory_size_type o_mem = tpie::array<int>::memory_usage(size);

		TEST_ENSURE(o_mem <= i_mem,
					"memory_fits() and memory_usage() expected to be their safe inverse");
	}

	return true;

Specifically, I am able to get it to fail with CMake set up as follows

make -D CMAKE_C_COMPILER=clang -D CMAKE_CXX_COMPILER=clang++ -D CMAKE_LINKER=lld -D CMAKE_BUILD_TYPE=RelWithDebInfo ..

In fact, it looks like the build type is relevant to show this issue. If it is not set, then the problem does not occur.

Specifically it works fine for values in [64, 1024] but not the maximum value of tpie::size_type, tpie::memory_size_type, unsigned long long nor std::size_t.

@SSoelvsten
Copy link
Owner Author

SSoelvsten commented Aug 11, 2023

I am sadly unable to get ubsan to work on my Fedora machine (I get linker errors, as it tries to find ubsan in an empty directory). I've cleaned up the code on this branch to split the one-liner into four different statements. I was hoping this would allow me to figure out where the cast to unsigned long was.

@SSoelvsten
Copy link
Owner Author

SSoelvsten commented Aug 11, 2023

Yet, with a little binary search I can tell you that the bug does not occur with std::numeric_limits<std::size_t>::max() >> 11 for a tpie::array<int> (but does when right-shifting by 10). This is equivalent to 8196 TiB. So with a right-shift of 12 we should be safe for a quick hotfix.

@SSoelvsten SSoelvsten mentioned this issue Aug 11, 2023
@nhusung
Copy link
Contributor

nhusung commented Aug 11, 2023

I am sadly unable to get ubsan to work on my Fedora machine (I get linker errors, as it tries to find ubsan in an empty directory).

Strange 🤔 I just installed clang etc. in a F38 docker image and UBSan works out of the box. I only have problems with printing the stack trace (possibly related to llvm-symbolizer in my Docker container), so here is the output for ead90ebe from my native Arch installation:

/[...]/tpie/tpie/../tpie/util.h:110:40: runtime error: 1.84467e+19 is outside the range of representable values of type 'unsigned long'
    #0 0x55ef5b68885f in tpie::linear_memory_base<tpie::array<int, tpie::allocator<int>>>::memory_usage(unsigned long) /[...]/tpie/tpie/../tpie/util.h:110:40
    #1 0x55ef5b68885f in array_memory_prediction_test() /[...]/tpie/test/unit/test_array.cpp:248:34
    #2 0x55ef5b68e264 in tpie::tests& tpie::tests::test<bool (*)()>(bool (*)(), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) /[...]/tpie/tpie/../tpie/unittest.h:442:17
    #3 0x55ef5b68bd56 in main /[...]/tpie/test/unit/test_array.cpp:523:4
    #4 0x7f3338027ccf  (/usr/lib/libc.so.6+0x27ccf) (BuildId: 8f3291ff820508330af3a5ccb75de43d474ef19a)
    #5 0x7f3338027d89 in __libc_start_main (/usr/lib/libc.so.6+0x27d89) (BuildId: 8f3291ff820508330af3a5ccb75de43d474ef19a)
    #6 0x55ef5b64ff04 in _start (/[...]/tpie/build/test/unit/ut-array+0x72f04) (BuildId: aca5afdf9052bc5aad3bc8c0f0a2e1ab14d9c051)

@SSoelvsten
Copy link
Owner Author

SSoelvsten commented Aug 11, 2023

So it is either in the call to floor(...) (which should overload to the double input type) or it is in the static_cast<memory_size_type>(...); I cannot expect the return from memory_size_type to memory_size_type creates an issue.

I probably did something wrong on my end, but I've spent 7 hours trying to fix this - and it's not really worth it. This whole problem is due to me being extremely paranoid about overflows in the first place. Since I primarily use memory_fits(...) on realistic numbers there is no need to try and safe-guard against such stupidly large values. The PR #530 ought to have fixed this, right? If anything I probably even should drop all assertions and the magic value tpie_memory_max.

@nhusung
Copy link
Contributor

nhusung commented Aug 11, 2023

So it is either in the call to floor(...) (which should overload to the double input type) or it is in the static_cast<memory_size_type>(...); I cannot expect the return from memory_size_type to memory_size_type creates an issue.

UBSan complains about the static_cast.

@SSoelvsten
Copy link
Owner Author

SSoelvsten commented Aug 11, 2023

I read it the same way too, but knowing that I know very little, I was not personally going to rule out the floor, haha. 😆

@nhusung
Copy link
Contributor

nhusung commented Aug 11, 2023

The sane way for performing the conversion would probably be something as ugly as:

double d = fmin(floor(memory_d), static_cast<double>(std::numeric_limits<memory_size_type>::max() / 2 + 1));
return static_cast<memory_size_type>(d);

§7.3.10p1 of the (draft of) the C++20 standard explicitly states:

A prvalue of a floating-point type can be converted to a prvalue of an integer type. The conversion truncates;
that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented
in the destination type.

I know Adiar/TPIE too little to know why the approximate values are so large. But undefined behavior must not occur by any means. An allocation larger than std::numeric_limits<memory_size_type>::max() / 2 + 1 doesn’t make a lot of sense to me, so I would think that clamping here is fine. The largest double ≤ std::numeric_limits<memory_size_type>::max() would be work as well, but I’m not sure how to compute this double. Casting std::numeric_limits<memory_size_type>::max() to double does not seem to work due to rounding imprecision.

An alternative would be std::lround(), but this only works for signed integers. And the docs state:

If the result of the rounding is outside the range of the return type, FE_INVALID is raised and an implementation-defined value is returned

TL;DR: Safe double-to-integer conversions seem to be a mess in C++

@SSoelvsten
Copy link
Owner Author

SSoelvsten commented Aug 12, 2023

Yes, before the fix in #530 , I tried to safe-guard against memory_usage creating a result beyond numeric_limits<memory_size_type>::max() by calling the inverse function memory_fits with the largest possible number of elements (being the 64-bit maximum value too). Yet, this essentially has resulted in me triggering this undefined behaviour.

I have moved your fix onto the TPIE branch I mentioned above. I'll test later whether it makes the unit test pass. I am unsure whether it will be accepted by the TPIE maintainers, considering that this function is called reasonably often and it fixes a technically unrealistic use-case (if it wasn't for me being a dummy). An alternative fix they might be ok with is to insert an assertion.

@SSoelvsten
Copy link
Owner Author

SSoelvsten commented Aug 12, 2023

Most of all: thanks for all the help and support with tracking down this issue. Your fork includes a lot of good changes to the benchmarking repository. Do you mind if I try to cherry-pick some of these changes (of course trying to keep your authorship of the commit). There are also other things with the benchmarking repository that I really want to fix (the use of EXIT, FLUSH and ugly INFO to mention a few). I primarily mention them to apologize in advance. 🙂

Furthermore, looking at your newly added CNF example, one may argue that the construct_clauses function is a bad usage of Adiar (see adiar::bdd_ite's remark). Usually, a BDD package would use O(N) time to create a clause of N variables since it only needs to add a new root. Adiar will in such a case use O(N2 log N2) time; for each call you will essentially copy over the entire input and attempt to reduce it. For cases like this, Adiar provides a builder which allows you to do it in O(N) time (hence all adapters simulate a builder too).

There are reasonable arguments to keep it as-is, but you should be warned that Adiar will run surprisingly badly (in a way that arguably is unrepresentative).

@nhusung
Copy link
Contributor

nhusung commented Aug 12, 2023

First of all, I really have to thank you for all your work here and for fixing the issues so quickly. BDD-Benchmark was a huge timesaver for me, because otherwise I would have had to implement all the adapters and benchmarks on my own. If you can cherry-pick some of the changes I made to BDD-Benchmark, I’d really appreciate it (I also don’t mind should you mess up the authorship). Some time ago I thought that I should probably create pull requests but then I felt like my git history was a bit messed up (e.g. one commit making Cudd’s ZDD implementation fail) and I would need to do some rebasing.

Thanks for the hint concerning the CNF benchmark. I didn’t immediately get the sense behind the builder (I haven’t read a lot of Adiar’s docs so far) but I was already suspecting some reason like that. I think I’ll rewrite the function in the near future. I don’t expect the difference to be big, though, since the clauses usually consist of ≤ 10 variables.

@SSoelvsten
Copy link
Owner Author

SSoelvsten commented Aug 12, 2023

Thanks! I'll try to cherry-pick as much of your fixes for CMake and the adapters as I can. That is, assuming I don't run into too much of a merge conflict; otherwise, I might leave it for a reimplementation.

Adiar is kind of weird in the sense that it gets faster (per node processed) as inputs and outputs get larger (until disk I/O is the bottle-neck even for Adiar). For tiny instances, there is a major relative performance difference between Adiar and other BDD packages which accumulates. With v1.2 we fixed this as much as is possible without changing the fundamental algorithm. For more information, see our ATVA 2023 paper on the optimisation in v1.2 (the extended paper is available right now on arXiv).

You will also notice, that CAL has a jump from being pretty good to much worse around "moderate" instance sizes. This is essentially for the same reason: their breadth-first algorithms are not competitive for small BDDs. Hence, for small inputs (< 219) they switch to the classic depth-first recursive algorithms. In Adiar we do not have such a thing (yet... in #444 I have what I believe to be a good idea that works for Adiar)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🔥 bug Something isn't working 📁 internal This is where the ✨magic✨happens
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants