
Python implementation refactor, numba kernel optimizations, some GPU implementations #8

Open · wants to merge 60 commits into base: master

Conversation

@din14970 din14970 commented Jan 2, 2023

After more than a year I finally found a bit of spare time to work on this project some more. I have taken your implementation, @berkels, and made improvements in terms of:

  • general Pythonic code quality and formatting: descriptive naming, separation into modules/classes, use of type hints, ...
  • improved CPU performance by optimizing the numba JIT functions, using multiple cores in some of the kernels, and caching some pre-computed results so they don't get recomputed.
  • support for running the calculation on the GPU, using cupy to replace numpy and numba.cuda kernels to replace numba.jit (see the cuda_kernels module). Unfortunately, at this point the GPU performs worse in most cases due to data transfers and allocations.
  • unit tests that show the code still works the same way as before and as expected.

What still needs to happen:

  • further work out the user-facing API that currently resides in the JNNR class in implementation/implementation.py; support writing intermediate results to a file.
  • human-readable docs explaining how the code actually works and how it implements what is written in the papers would be nice. This will not be in the scope of this PR, though.
  • I don't think the multistage optimization works as intended yet, as described in https://arxiv.org/pdf/1901.01709.pdf. In particular, I'm not sure how the bias correction step is supposed to be implemented. This is in JNNR.run. Some help here would be appreciated.
  • benchmarking against matchseries.

@berkels I would really appreciate it if you could skim over this if/when you find time and comment on:

  • the bias reduction step issue
  • naming of functions and variables, descriptions in docstrings, ... (is it accurate/correct?)
  • tips on further improving performance
  • anything else you notice

din14970 and others added 30 commits April 28, 2021 20:45
…_integrate_pd_over_cells_single and is how one typically iterates in FE
… hand that does "constant in normal direction" extension
…ion problem as non-linear least squares problem
…mization problem as non-linear least squares problem
…rix vector multiplication with a constant matrix and a constant shift
…1 (will make the multi level descent easier)
…rmalized integration domain instead of in pixels (will make the multi level descent easier)
@din14970 din14970 requested a review from berkels January 2, 2023 21:17
@din14970 din14970 self-assigned this Jan 2, 2023
)
self.state.deformations = ComplexSignal2D(dp.stack(corrected_images))
self.state.completed_stages = stage + 1
L *= self.config.regularization.factor_stage
Collaborator


This seems slightly different from what my C++ implementation is doing. There, the first stage uses lambda and all other stages use lambda*extraStagesLambdaFactor, i.e. the regularization parameter is only changed when moving from stage 1 to stage 2. It is not changed further when going from stage 2 to stage 3.
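To make the difference concrete (a minimal sketch with hypothetical function names, not code from either implementation): multiplying `L` by the factor at the end of every stage compounds the factor, whereas the C++ behavior applies it exactly once:

```python
def stage_lambda_cpp(lam, factor, stage):
    """Regularization parameter as the C++ code uses it:
    stage 1 uses lam, every later stage uses lam * factor."""
    return lam if stage == 1 else lam * factor

def stage_lambda_compounding(lam, factor, stage):
    """What repeating `L *= factor` at the end of each stage yields:
    the factor compounds with every additional stage."""
    return lam * factor ** (stage - 1)

# With lam = 1.0 and factor = 0.1:
#   C++ behavior: stage 1 -> 1.0, stage 2 -> 0.1, stage 3 -> 0.1
#   compounding:  stage 1 -> 1.0, stage 2 -> 0.1, stage 3 -> 0.01
```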

@berkels (Collaborator) commented Jan 3, 2023

The bias correction step indeed still seems to be missing. Let $$N[\psi]:=\frac12\sum_{i=1}^{n}\int_\Omega\lVert\phi_i(\psi(x))-x\rVert^2\mathrm{d}x.$$ To compute the minimizing $\phi$, we do a gradient flow of $N$ with respect to $\psi$. This needs the first variation
$$\langle N'[\psi,\phi_1,\ldots,\phi_n],\zeta\rangle=\sum_{i=1}^{n}\int_\Omega (D\phi_i)(\psi(x))^T(\phi_i(\psi(x))-x)\cdot\zeta\mathrm{d}x.$$ This is very similar to how the registration itself is done. This just doesn't have a regularizer and includes all deformations, not just one.

@berkels (Collaborator) commented Jan 3, 2023

Note $N$ can be seen as sum of data terms typical for registration. Let $x_1$ and $x_2$ denote the components of $x$, i.e. $x=(x_1,x_2)$. Furthermore, denote the components of $\phi_i(x)$ with $\phi_i(x)=(\phi_{i,1}(x),\phi_{i,2}(x))$. Then,
$$N[\psi]=\frac12\sum_{i=1}^{n}\sum_{j=1}^2\int_\Omega(\phi_{i,j}(\psi(x))-x_j)^2\mathrm{d}x.$$
and
$$\langle N'[\psi,\phi_1,\ldots,\phi_n],\zeta\rangle=\sum_{i=1}^{n}\sum_{j=1}^2\int_\Omega(\phi_{i,j}(\psi(x))-x_j)\nabla\phi_{i,j}(\psi(x))\cdot\zeta\,\mathrm{d}x.$$
Using this, you should be able to implement this using the data term and its gradient you implemented for the registration. There the images would be $\phi_{i,j}$ and $x_j$. The C++ code uses the interpretation above, but reusing your existing registration should work as well and take less effort.
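The reuse described above could be sketched like this (a hypothetical numpy/scipy sketch with invented names and conventions, not code from the PR or the C++ project): treat each component $\phi_{i,j}$ as a "template image", $x_j$ as the "reference image", and accumulate the data-term gradient over all deformations.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def bias_gradient(psi, phis, spacing=1.0):
    """First variation of N[psi] = 1/2 * sum_i ||phi_i(psi(x)) - x||^2
    on a pixel grid.  Hypothetical sketch, not the PR's actual API.

    psi  : (2, H, W) map; psi[k] holds the k-th coordinate of psi(x).
    phis : list of (2, H, W) deformation fields phi_i.
    Returns the gradient with respect to psi, shape (2, H, W)."""
    H, W = psi.shape[1:]
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    identity = np.stack([ys, xs]).astype(float)
    grad = np.zeros_like(psi)
    for phi in phis:
        for j in range(2):  # each component phi_{i,j} acts as an "image"
            # phi_{i,j}(psi(x)), sampled with bilinear interpolation
            warped = map_coordinates(phi[j], psi, order=1, mode="nearest")
            residual = warped - identity[j]        # phi_{i,j}(psi(x)) - x_j
            gy, gx = np.gradient(phi[j], spacing)  # nabla phi_{i,j}
            # evaluate the gradient of phi_{i,j} at psi(x) as well
            grad[0] += residual * map_coordinates(gy, psi, order=1, mode="nearest")
            grad[1] += residual * map_coordinates(gx, psi, order=1, mode="nearest")
    return grad
```

A gradient-flow step for the bias correction would then update `psi` against this gradient; for `phis` that are all the identity, the gradient vanishes, as expected from the formula.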

@berkels (Collaborator) commented Jan 3, 2023

BTW: I'm very happy to see that you made a lot of progress on this!

@berkels (Collaborator) commented Jan 3, 2023

FYI, the C++ code handles the bias correction in SeriesMatching::reduceDeformations in projects/electronMicroscopy/matchSeries.h. I don't think this is very helpful as a reference, though.
