Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

struct/class that require cacheline alignment may not work when using new (before c++17). #2416

Open
ehds opened this issue Oct 19, 2023 · 6 comments

Comments

@ehds
Copy link
Contributor

ehds commented Oct 19, 2023

Is your feature request related to a problem? (你需要的功能是否与某个问题有关?)
当前 brpc 代码默认使用的 C++ 标准为 11,不支持 align new (since c++17 https://en.cppreference.com/w/cpp/memory/new/operator_new) .

如果某个 class 指定了 alignment 要求(例如 BAIDU_CACHELINE_ALIGNMENT),代码中使用 new 的方式来分配其对象时,地址有可能并不是严格按照其对齐方式的。

要让一个变量或结构体按cacheline对齐,可以include <butil/macros.h>后使用BAIDU_CACHELINE_ALIGNMENT宏,请自行grep brpc的代码了解用法。

https://github.com/apache/brpc/blob/master/docs/cn/atomic_instructions.md#cacheline

例如以下代码,使用当前的编译选项,就可能会出现错误。

class BAIDU_CACHELINE_ALIGNMENT A {
    int i;
};

int main() {
    for(size_t i =0 ;i<100;i++) {
        A* a = new A();
        // maybe fail.
        assert((reinterpret_cast<uintptr_t>(a) & (63)) == 0);
     }

Describe the solution you'd like (描述你期望的解决方法)

使用 new 来分配指定对齐要求的类时,内存地址应满足对齐的要求。
升级为 C++17 标准,或者开启 -faligned_new (gcc 7.4+,clang 7.1.0+ 都已经支持) .

目前来看是强行关闭了该警告信息(不知道具体原因).

brpc/CMakeLists.txt

Lines 70 to 72 in f3fe5fc

if(NOT (CMAKE_CXX_COMPILER_VERSION VERSION_LESS 7.0))
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-aligned-new")
endif()

Describe alternatives you've considered (描述你想到的折衷方案)
对于需要对齐的类,在使用 new 分配内存时使用 aligned_alloc/posix_memalign 等函数申请 alignment 内存,再使用 Placement new 指定内存空间进行初始化.

Additional context/screenshots (更多上下文/截图)
os: 20.04.1-Ubuntu
compiler: clang version 10.0.0-4ubuntu1
cpu: x86_64, cache_alignment : 64 byte

例如对于 class BAIDU_CACHELINE_ALIGNMENT/*note*/ Socket 类:

image

socket 的地址为 0x00005555567f1530, 并不是 64 byte 对齐,违反了要求,可能引起 false-sharing.

@ehds ehds changed the title Struct that require cacheline alignment may not work when using new (before c++17). struct/class that require cacheline alignment may not work when using new (before c++17). Oct 19, 2023
@qycyfjy
Copy link

qycyfjy commented Oct 21, 2023

没开-faligned-new还用new来动态分配__attribute__((aligned()))标记的类, 真是aligned了个寂寞

图片

@ehds
Copy link
Contributor Author

ehds commented Oct 22, 2023

没开-faligned-new还用new来动态分配__attribute__((aligned()))标记的类, 真是aligned了个寂寞

是的,这里还有另外一个隐含的问题。
如果 brpc 作为第三方库,编译时未加 -faligned-new, 但项目中使用了其头文件中的声明的某个类(cache line alignment),并且项目编译使用了 aligned_new, 而该类的析构函数(在 brpc 三方库中定义)调用 delete 释放对象空间时也不会调用

void operator delete  ( void* ptr, std::size_t sz,  std::align_val_t al ) noexcept;

也就是说 new (aligned_new) 和 delete 不匹配,delete 的 size 可能会小于实际 new 的 size (为类对齐要求的 padding)。
这里可能会产生一些问题, 例如一些项目中覆盖了 malloc/free,new/delete 定制化了 allocator,会导致申请和释放的大小不一致。 或者在开启了 ASAN 检查时,会检查出 new/delete 类型不匹配。

@uvletter
Copy link

=================================================================
==155368==ERROR: AddressSanitizer: new-delete-type-mismatch on 0x621000030d00 in thread T22 (brpc_timer):
  object passed to delete has wrong type:
  size of the allocated type:   4096 bytes;
  size of the deallocated type: 4096 bytes.
  alignment of the allocated type:   default-aligned;
  alignment of the deallocated type: 64 bytes.
    #0 0x7ff6e41a35d5 in operator delete(void*, unsigned long, std::align_val_t) (/usr/lib64/libasan.so.5+0x1105d5)
    #1 0x752bc9 in bvar::detail::AgentGroup<bvar::detail::AgentCombiner<long, long, bvar::detail::AddTo<long> >::Agent>::_destroy_tls_blocks() external/com_github_apache_brpc/src/bvar/detail/agent_group.h:166
    #2 0x752bc9 in bvar::detail::AgentGroup<bvar::detail::AgentCombiner<long, long, bvar::detail::AddTo<long> >::Agent>::_destroy_tls_blocks() external/com_github_apache_brpc/src/bvar/detail/agent_group.h:161
    #3 0x150d34f in butil::detail::ThreadExitHelper::~ThreadExitHelper() external/com_github_apache_brpc/src/butil/thread_local.cpp:41
    #4 0x150d34f in delete_thread_exit_helper external/com_github_apache_brpc/src/butil/thread_local.cpp:77
    #5 0x150d34f in delete_thread_exit_helper external/com_github_apache_brpc/src/butil/thread_local.cpp:76
    #6 0x7ff6e3c68ca1 in __nptl_deallocate_tsd /usr/src/debug/glibc-2.17-c758a686/nptl/pthread_create.c:155
    #7 0x7ff6e3c68eb2 in start_thread /usr/src/debug/glibc-2.17-c758a686/nptl/pthread_create.c:314
    #8 0x7ff6e2690b0c in clone (/lib64/libc.so.6+0xfeb0c)

0x621000030d00 is located 0 bytes inside of 4096-byte region [0x621000030d00,0x621000031d00)
allocated by thread T22 (brpc_timer) here:
    #0 0x7ff6e41a1767 in operator new(unsigned long, std::nothrow_t const&) (/usr/lib64/libasan.so.5+0x10e767)
    #1 0x1227afe in bvar::detail::AgentGroup<bvar::detail::AgentCombiner<long, long, bvar::detail::AddTo<long> >::Agent>::get_or_create_tls_agent(int) external/com_github_apache_brpc/src/bvar/detail/agent_group.h:150
    #2 0x1227afe in bvar::detail::AgentCombiner<long, long, bvar::detail::AddTo<long> >::get_or_create_tls_agent() external/com_github_apache_brpc/src/bvar/detail/combiner.h:297
    #3 0x1227afe in bvar::Reducer<long, bvar::detail::AddTo<long>, bvar::detail::MinusFrom<long> >::operator<<(long) external/com_github_apache_brpc/src/bvar/reducer.h:194
    #4 0x1227afe in int bthread::TaskGroup::start_background<true>(unsigned long*, bthread_attr_t const*, void* (*)(void*), void*) external/com_github_apache_brpc/src/bthread/task_group.cpp:453
    #5 0x11fa28f in bthread::start_from_non_worker(unsigned long*, bthread_attr_t const*, void* (*)(void*), void*) external/com_github_apache_brpc/src/bthread/bthread.cpp:150
    #6 0x11fa28f in bthread_start_background external/com_github_apache_brpc/src/bthread/bthread.cpp:230
    #7 0xaac90d in braft::RepeatedTimerTask::on_timedout(void*) external/com_meituan_mraft/src/braft/repeated_timer_task.cpp:120
    #8 0x1200a04 in bthread::TimerThread::Task::run_and_delete() external/com_github_apache_brpc/src/bthread/timer_thread.cpp:282
    #9 0x1202fb5 in bthread::TimerThread::run() external/com_github_apache_brpc/src/bthread/timer_thread.cpp:395
    #10 0x12075ef in bthread::TimerThread::run_this(void*) external/com_github_apache_brpc/src/bthread/timer_thread.cpp:122
    #11 0x7ff6e3c68ea4 in start_thread /usr/src/debug/glibc-2.17-c758a686/nptl/pthread_create.c:307

I met the same problem. I compile my project in C++17, while the compilation option for BRPC is C++11, so ASAN detects the dismatch between new and delete.

@ehds
Copy link
Contributor Author

ehds commented Feb 26, 2024

I met the same problem. I compile my project in C++17, while the compilation option for BRPC is C++11, so ASAN detects the dismatch between new and delete.

#2421 Try this fix.

@uvletter
Copy link

uvletter commented Feb 26, 2024 via email

@yanglimingcn
Copy link
Contributor

yanglimingcn commented Feb 29, 2024

请问针对这种c++不同版本不匹配的问题,c++官方哪里有个统一的列表说明吗?想看看是否还有别的坑。

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants