Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallelize search for solutions #3

Open
jmitchell opened this issue Jan 14, 2017 · 4 comments
Open

Parallelize search for solutions #3

jmitchell opened this issue Jan 14, 2017 · 4 comments

Comments

@jmitchell
Copy link
Owner

jmitchell commented Jan 14, 2017

Backtracking problems are embarrassingly parallel, and Backtrex should exploit that internally.

After the core API is stabilized, including implementing #2, split the search space into multiple sub-problems and delegate them to a worker pool. As solutions are discovered, have workers report them back so they can be concatenated to the solution stream.

The only clear drawback is the order in which solutions are discovered would no longer be consistent, but providing an API for sequential search (like the existing one) fixes that.

The benefits of supporting parallelism include scaling to the limits of a single BEAM node, and potentially beyond to more nodes until communication latency becomes the bottleneck. If sub-problem delegation and work stealing are implemented in a way that optimizes communication between coordinating processes to other coordinating processes or workers, the potential horizontal scalability could go far.

More detailed walkthrough

Rough sketch of the concept with the Sudoku solver in mind:

  1. Spawn a pool of worker processes.
  2. Ask the Sudoku.Solver for the next unknown cell and its possible values.
  3. While there are worker processes ready for more work, assign one the original problem, except with the current unknown cell assigned to one of the possible values that hasn't been explored.
  4. If all possible values for the current cell have been assigned and more workers are unoccupied, allow each of them to ask for an unexplored sub-problem from one of the occupied workers.
  5. Whenever a worker process finds a solution, it sends it back to the behaviour.
  6. Whenever a worker finishes exploring the entire solution space of its sub-problem, re-enqueue it to the worker pool (may need to explicitly trigger work stealing described in step 4).

Existing tools?

OTP behaviours and newer Elixir behaviours, like GenStage, may be well suited for implementing this concept. They may even offer better strategies. Research what's out there, and consider options while avoiding assumptions about the topology of the search space (number of unknowns, number of values for each unknown, and time require to compute them), the number workers, or the delegation strategy.

Desired invariants

  1. The union of solution spaces of all delegated sub-problems must be the solution space of the original problem (nothing is missed).
  2. The intersection of every explored sub-problem with every other explored sub-problem is the empty set (no work is repeated).
  3. After the requested number of solutions have been found, searching should suspend at a resumable checkpoint. Lazy Streams should help here. (Consider adding a function to suspend work before all the requested solutions have been found.)

Invariants 1 and 2 may seem obvious, but it can be challenging when processes crash. It may even be impossible when certain coordinating processes crash. Invariant 1 is a strong requirement, whereas bounded compromise on 2 (rework) may be acceptable.

@jmitchell
Copy link
Owner Author

Existing backtracking research may clarify details of the design. I've started a wiki page to collect resources and thoughts on how they do or don't apply to this problem.

@jmitchell
Copy link
Owner Author

Study this optimized, parallel N-queens solver in Rust.

@jmitchell
Copy link
Owner Author

Finish #12 (Generate profiling reports) first.

@cognivore
Copy link

👀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants