Wrong output for igebs2d, igebr2d (matrix broadcast) in ScaLAPACK #98

Open

j7168908jx opened this issue May 21, 2024 · 0 comments
j7168908jx commented May 21, 2024

I am writing some C++ code that calls ScaLAPACK, and I ran into this problem.

After reducing the problem to a minimal example: I want to broadcast a general matrix (here the special case of a 1x1 matrix) across a process grid, so that every process row of the grid ends up with it.

For example, the grid has 2 rows x 1 column, and process 1 wants to broadcast a value to process 0.

Here is a minimal example (~50 lines) showing the strange result:

// build with (library versions should not matter):
// mpicxx --std=c++17 test.cpp -L/opt/scalapack-2.2.0 -L/opt/LAPACK/3.10.0/ -L/opt/OpenBLAS/0.3.19/lib64 -lscalapack -llapack -lopenblas -lgfortran -o test.x
// run with:
// OMP_NUM_THREADS=1 LD_LIBRARY_PATH=/opt/LAPACK/3.10.0:/opt/OpenBLAS/0.3.19/lib64:/opt/MPICH/4.0.2/lib:$LD_LIBRARY_PATH mpirun -np 2 ./test.x


#include <iostream>
#include <thread>
#include <chrono>
#include "mpi.h"


extern "C" {

  void Cblacs_get(const int ictxt, const int what, int *val);
  void Cblacs_pinfo(int *myrank, int *nprocs);
  void Cblacs_gridinit(int *ictxt, const char *order, const int nprow, const int npcol);
  void Cblacs_gridinfo(const int ictxt, int *nprow, int *npcol, int *myrow, int *mycol);
  void Cblacs_gridexit(const int ictxt);
  void descinit_(int *desc,
      const int *m, const int *n, const int *mb, const int *nb, const int *irsrc, const int *icsrc, const int *ictxt, const int *lld, int *info);

  void Cigebs2d(
    const int ConTxt, const char *scope, const char *top,
    const int m, const int n, const int *A, const int lda);
  void Cigebr2d(
    const int ConTxt, const char *scope, const char *top,
    const int m, const int n, int *A, const int lda,
    const int rsrc, const int csrc);


  // compute LOCr or LOCc (local size of data for distributed array)
  int numroc_(const int *n, const int *nb, const int *iproc, const int *isrcproc, const int *nprocs);

}

int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);

  int myid, numprocs;
  int ictxt, myrow, mycol;
  int nprow = 2, npcol = 1;
  int magic = 4;
  Cblacs_pinfo(&myid, &numprocs);
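  // what = 0: query the default BLACS system context, then create the nprow x npcol grid on it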
  Cblacs_get(0, 0, &ictxt);
  Cblacs_gridinit(&ictxt, "Row", nprow, npcol);
  Cblacs_gridinfo(ictxt, &nprow, &npcol, &myrow, &mycol);
  std::this_thread::sleep_for(std::chrono::seconds(magic-myid));
  std::cout << "[" << myid << "] :" << "ictxt: " << ictxt << std::endl;
  std::cout << "[" << myid << "] :" << "nprow and npcol: " << nprow << " " << npcol << std::endl;
  std::cout << "[" << myid << "] :" << "myrow and mycol: " << myrow << " " << mycol << std::endl;

  char charc = 'c', chars = ' ';  // scope 'c' = broadcast within a process column; ' ' = default topology

  int vcurrow = 1;  // grid row of the process that owns the value to broadcast
  int sendv, recvv;
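  // Process at row vcurrow broadcast-sends the 1x1 value within its column;
  // the other process in the column broadcast-receives it.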
  if (myid == vcurrow) {
    sendv = 2;
    Cigebs2d(ictxt, &charc, &chars, 1, 1, &sendv, 1);
    std::cout << "[" << myid << "] :" << "sendv: " << sendv << std::endl;

  } else {
    Cigebr2d(ictxt, &charc, &chars, 1, 1, &recvv, 1, mycol, vcurrow);
    std::cout << "[" << myid << "] :" << "recvv: " << recvv << std::endl;

  }

  Cblacs_gridexit(ictxt);
  MPI_Finalize();
  return 0;
}

The output is


[1] :ictxt: 0
[1] :nprow and npcol: 2 1
[1] :myrow and mycol: 1 0
[0] :ictxt: 0
[0] :nprow and npcol: 2 1
[0] :myrow and mycol: 0 0
[0] :recvv: 4
[1] :sendv: 4

This completely confuses me. What's more, the printed values are actually the value of the magic variable. Why?

(The sleep calls are only there to make the output of the two ranks easier to read.)

Also, I tried replacing igebs2d/igebr2d with MPI_Bcast, and that works well (tested only on this example); roughly as sketched below.
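
To be concrete, the replacement is roughly the sketch below (assuming the process id from Cblacs_pinfo coincides with the rank in MPI_COMM_WORLD, which it does in this run); with it both processes end up holding the value 2 as expected.

  // The same broadcast expressed with plain MPI: rank vcurrow owns the value,
  // and every rank, the sender included, calls MPI_Bcast with that rank as root.
  int v = (myid == vcurrow) ? 2 : -1;
  MPI_Bcast(&v, 1, MPI_INT, vcurrow, MPI_COMM_WORLD);
  std::cout << "[" << myid << "] :" << "v: " << v << std::endl;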
