You should see in your private repositories
-
test_homework2_solution.py
: The test suit used to grade the assignments. The performance was tested by timing yourgauss_seidel
code for 4 different problem sizes: 128, 512, 1024 and 2048. More details on this below in the specific issues section. -
output.txt
: The raw output when I ran the tests -
result.txt
: The parsed test and timing results. I have not used thejacobi
timings in your performance grades.
You should also see comments in your Canvas grade book for any errors I caught. The points were scaled to the rubric determined for Homework 2.
-
Not initializing arrays. Most students had the declaration part correct. But many of you while using your
linalg.c
functions likemat_vec
within yourjacobi
andgauss_seidel
functions would pass it arrays that have not been initialized to 0. Remember that the test suit works withwrappers.py
, which hasout
initialized to 0. While grading, I have assigned a generous partial credit for everyone who had correct logic and failed thejacobi
andgauss_seidel
tests due to this error. But this is a very dangerous mistake to make. -
Inefficient code:
jacobi
andgauss_seidel
functions for many of you had either- Quite a few temporary storage arrays being declared and used. If you can work with the matrix
A
in place, you can make the code more efficient and cleaner. CreatingL
,D
andU
matrices needN^2
storage each. These are just copies ofA
’s lower triangular , diagonal and upper triangular sections respectively. ”Better”: just work with the right elements ofA
itself (refer to solution code). So, always look at your algorithm, anything that is either not being changed, or won’t be needed to be preserved, use those matrices in-place - Many operations that need be done only once, done at each iteration. For example, decomposing the matrix
A
at every iteration, but the only array that changes at every iteration is the guess solution. DecomposingA
once and reusing the decomposed matrices will help with reducing the computational overhead. - Not used your
linalg.c
functions. While this did help some not have initialization errors mentioned above, using functions that have been designed to do all the steps you would need injacobi
andgauss_seidel
algorithms is much more efficient than rewriting the loops from scratch.
- Quite a few temporary storage arrays being declared and used. If you can work with the matrix
Almost all of you have this correct! The few mistakes I saw were
-
Incorrect declarations:
double vec_norm(double* v, int N) { int norm = 0; for (int i=0; i<N; ++i) { norm += v[i] * v[i]; } return norm = sqrt(norm); }
Here
norm
needs to be adouble
, not an integer. -
Missing initialization:
double vec_norm(double* v, int N) { double norm; for (int i=0; i<N; ++i) { norm += v[i] * v[i]; } return norm = sqrt(norm); }
Here
norm
needs to be initialized to0
before using it.
- Very few had these incorrect, missing initialization was the only common issue I observed.
- Most people have misunderstood what Part 2 of Question 2 was asking. There seems to also be a confusion regarding contiguous memory access and cache creating a copy of memory at and after a requested location.
- Part 1: Due to a confusion in interpreting this question, I have regraded everyone and added a point on the contiguous memory access part of the question. Please do not mix up the concepts of contiguous memory access and the "hypothetical" cache behavior question in Part 2.
- Part 2:
solve_upper_triangular
uses back substitution, we start at row N-1 and go "backwards" to row 0. If the cache creates a copy of the memory at and after row N-1, then it would have a copy of the one non-zero element in row N-1, but that is the last element. For row N-2, it would create a copy of the second-last and last column entries of U, and maybe the copy a few 0s from the last row, first few columns. This is an inefficiency, most of the cached copies are not useful in back substitution. This pattern will be observed as the rows reduce towards row 0, with the cache memory copies becoming more efficient as we work towards row 0.
Along with the general issues observed above, the most common failed test was:
- Timing tests: Each problem was run 10 times, and an average was taken across the 10 iterations. If your code took longer than 60 seconds, the test has failed (I would note that it timed out). I computed the std dev range your timing fell in (as per the Grading document rubric). Many of you passed the smaller size problems, most passed the 128 size problem with timings in the -/+ 1 std dev range. But due to efficiency issues noted above, codes would fail at 512 size problems or above. So I weighed the timings for the 128, 512, 1024 and 2048 sizes as 2:1:1:1, so as to maximize partial credit. Ignore the
jacobi
timings. - If
test_gauss_seidel
failed, I did not run any timing tests. Unfortunately, I can’t assign any partial performance points (it doesn’t matter how fast you can do something incorrectly). I have made this up by assigning very generous partial credit in the “Passes Tests” grade. - Some of you copied your working
jacobi
logic/code intogauss_seidel
. I could not assign any partial credit for thegauss_seidel
tests and performance tests.