Jacobi #641

Closed
wants to merge 1 commit into from


yodada
Collaborator

@yodada yodada commented Mar 29, 2022

This PR merges the Jacobi kernel code, which can be found at software/spmd/bsg_cuda_lite_runtime/jacobi/.

Jacobi 3D takes an input of size Nx * Ny * Nz. This implementation is unrolled along Nx and distributes Ny and Nz across tileX and tileY, respectively, so the minimal valid input is 64 * 18 * 10. Another valid input is 126 * 18 * 10. Note that along Nx each step reads 64 inputs and generates 62 outputs, so consecutive steps overlap by two inputs (hence 126 = 64 + 62).
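
To make the Nx arithmetic concrete, here is a minimal Python sketch of which Nx values fit the pattern described above. It assumes a valid Nx is one 64-wide read followed by zero or more 62-wide strides, which is an inference from the two sizes given in this PR, not something taken from the kernel source. The names `is_valid_nx` and `step_windows` are hypothetical helpers, and the Ny/Nz constraints are not modeled.

```python
READ_WIDTH = 64   # inputs read along Nx per step (from the PR description)
WRITE_WIDTH = 62  # outputs generated along Nx per step (from the PR description)


def is_valid_nx(nx):
    """Assumed rule: Nx = 64 + k * 62 for some integer k >= 0."""
    return nx >= READ_WIDTH and (nx - READ_WIDTH) % WRITE_WIDTH == 0


def step_windows(nx):
    """Return the half-open [start, stop) input range read by each step along Nx."""
    if not is_valid_nx(nx):
        raise ValueError(f"Nx={nx} does not fit the assumed 64 + k*62 pattern")
    # Each step advances by the number of outputs it produced, so
    # neighbouring windows share READ_WIDTH - WRITE_WIDTH = 2 inputs.
    return [(start, start + READ_WIDTH)
            for start in range(0, nx - READ_WIDTH + 1, WRITE_WIDTH)]


if __name__ == "__main__":
    for nx in (64, 126):
        print(nx, step_windows(nx))
```

Under this assumption, Nx = 64 gives a single window and Nx = 126 gives two windows that share two inputs, which matches the overlap mentioned above.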

@drichmond
Contributor

Merged kernel code into bespoke-silicon-group/bsg_replicant#779

@drichmond drichmond closed this Mar 29, 2022