output of reduce_scatter is incorrect #303

Open
wdfnst opened this issue Mar 23, 2021 · 1 comment

wdfnst commented Mar 23, 2021

#include <iostream>
#include <memory>
#include <vector>

#include "gloo/allreduce_ring.h"
#include "gloo/reduce_scatter.h"
#include "gloo/rendezvous/context.h"
#include "gloo/rendezvous/file_store.h"
#include "gloo/rendezvous/prefix_store.h"
#include "gloo/transport/tcp/device.h"

int main() {
  // Note: the setup of `context` (rendezvous and transport) is not shown in this report.
  int num_elements = 12;
  int buffer_data[] = {1, 2, 3, 4, 5, 6, 11, 12, 13, 14, 15, 16};
  std::vector<int*> sendbuf;
  std::cout << context->rank << "before-send:";
  for (int i = 0; i < num_elements; i++) {
      sendbuf.push_back( &((int*)buffer_data)[i]);
//       std::cout << ((int*)buffer_data)[i] << " ";
      std::cout << *(sendbuf[i]) << " ";
  }
  std::cout << std::endl;
  std::vector<int> recvcountsbuff({6, 6});
  gloo::ReduceScatterHalvingDoubling<int> rs_hd(
          context,
          sendbuf,
          num_elements,
          recvcountsbuff
          );
  rs_hd.run();
  std::cout << context->rank << "after-send:";
  for (int i = 0; i < num_elements / 2; i++) {
      std::cout << ((int*)buffer_data)[i] << " ";
//       std::cout << *(sendbuf[i]) << " ";
  }
  std::cout << std::endl;
  return 0;
}

compile:

g++ -lstdc++ --std=c++11 example2.cc libgloo.a -ldl -pthread -o example2

run on node-1:
env PREFIX="aaa" SIZE=2 RANK=0 ./example2
run on node-2:
env PREFIX="aaa" SIZE=2 RANK=1 ./example2

expected output:
rank-0: [2, 4, 6, 8, 10, 12]
rank-1: [22, 24, 26, 28, 30, 32]
actual output:
rank-0: [862031072 862031072 862031072 862031072 862031072 862031072]
rank-1: [197264104 197264104 197264104 197264104 197264104 197264104]

@maxhgerlach

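The arguments you pass to the constructor of ReduceScatterHalvingDoubling are the problem. Each entry in the pointer vector should point to the start of a whole local buffer of num_elements values, but your sendbuf holds one pointer per element.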

Something like the following should work:

  gloo::ReduceScatterHalvingDoubling<int> rs_hd(
          context,
          std::vector<int*>{buffer_data},  // <- vector holding just one pointer
          num_elements,
          recvcountsbuff
          );
  rs_hd.run();

Afterwards buffer_data will hold the scattered reduced data.
