Avoid casting (masking?) between R internal objects and std/arma vectors #16

Open
gvegayon opened this issue Nov 19, 2019 · 0 comments

The problem

To my understanding, moving from SEXP objects to C++ classes such as std::vector or arma::vec is generally efficient: if the argument is passed by reference (&), the underlying data structure doesn't change. Even so, the operation does seem to take some time. When computing the likelihood function of more complex models, where full enumeration is needed, a full iteration takes about half a second (anecdotally), and that builds up.

Moreover, memory allocation is a clear bottleneck. It happens at various stages, in particular:

  1. For each element of the vector of sufficient statistics (which may be reduced because iso-statistics are used).
  2. For each parameter/network when computing the gradient.

Together, these can mean allocating and deallocating large arrays several times throughout the optimization process, since each call to the likelihood function currently requires it.

The solution

In an ideal world, we would allocate memory only once and then reuse those structures again and again. This should significantly reduce the overhead associated with allocation.
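A minimal sketch of the allocate-once idea, in plain C++ so that it stands alone. Everything named here (`Workspace`, `log_normalizing_constant`, the row-major `support` layout) is hypothetical, and the weighted sum of exp() terms is an assumption about how the normalizing constant is computed; in the package the support would presumably be an arma::mat rather than a std::vector.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical workspace: the buffer is sized once and reused by every call.
struct Workspace {
    std::vector<double> exp_terms;  // one slot per row of the support
    explicit Workspace(std::size_t n_rows) : exp_terms(n_rows) {}
};

// Assumed form of the computation: log( sum_i w_i * exp(theta' s_i) ).
// `support` is row-major with weights.size() rows and theta.size() columns
// (a stand-in for arma::mat). No memory is allocated inside this function.
inline double log_normalizing_constant(
    const std::vector<double>& support,
    const std::vector<double>& weights,
    const std::vector<double>& theta,
    Workspace& ws)
{
    const std::size_t n = weights.size();
    const std::size_t k = theta.size();
    assert(support.size() == n * k);
    assert(ws.exp_terms.size() >= n);

    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        double dot = 0.0;
        for (std::size_t j = 0; j < k; ++j)
            dot += support[i * k + j] * theta[j];
        ws.exp_terms[i] = weights[i] * std::exp(dot);  // overwrite, never resize
        total += ws.exp_terms[i];
    }
    return std::log(total);
}
```

The caller would build one `Workspace` before the optimizer starts and pass the same object to every likelihood evaluation, so the only per-call cost is overwriting the buffer.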

Likewise, it would be nice to pass all the arrays used to compute the likelihood function only once, at the first call of the function. Since we are not modifying any of these elements (they are read-only during the optimization), we can keep them stored behind a pointer on the C++ side.

For this, we would need to create a class holding the following information:

  • (std::vector< arma::mat >) The support of the sufficient statistics, one per network.
  • (std::vector< double >) The vector of exp() values associated with every realization of the sufficient statistic.
  • (std::vector< arma::uvec >) Vector of vectors of weights associated with each realization of the sufficient statistics.
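One possible shape for such a class is sketched below. All names are hypothetical (`LikelihoodData`, `Matrix`, `Weights`), the standard-library aliases are stand-ins for arma::mat and arma::uvec so the example compiles without Armadillo, and sizing the exp() buffer as the total number of support rows across networks is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix  = std::vector<std::vector<double>>;  // stand-in for arma::mat
using Weights = std::vector<unsigned long long>;   // stand-in for arma::uvec

// Hypothetical container, built once at the first likelihood call.
class LikelihoodData {
public:
    LikelihoodData(std::vector<Matrix> support, std::vector<Weights> weights)
        : support_(std::move(support)), weights_(std::move(weights))
    {
        assert(support_.size() == weights_.size());
        std::size_t total_rows = 0;
        for (std::size_t i = 0; i < support_.size(); ++i) {
            // One weight per realization (row) of the sufficient statistics.
            assert(support_[i].size() == weights_[i].size());
            total_rows += support_[i].size();
        }
        exp_values_.resize(total_rows);  // the only allocation of this buffer
    }

    std::size_t n_networks() const { return support_.size(); }

    // Read-only access: these inputs do not change during the optimization.
    const Matrix&  support(std::size_t i) const { return support_[i]; }
    const Weights& weights(std::size_t i) const { return weights_[i]; }

    // Scratch space that each likelihood call overwrites in place.
    std::vector<double>& exp_values() { return exp_values_; }

private:
    std::vector<Matrix>  support_;
    std::vector<Weights> weights_;
    std::vector<double>  exp_values_;
};
```

With Rcpp, an object like this could plausibly be created once and handed back to R as an external pointer (for example through `Rcpp::XPtr`), so that later calls receive only the pointer and the parameter vector.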

(TBC)

@gvegayon gvegayon mentioned this issue Jan 10, 2020