Matrix-Multiplication-Assignment

Performance optimization of Diagonal Matrix Multiplication (DMM) of two N × N matrices, implemented three ways:

Using a single thread
Using multiple threads
Using CUDA
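
To make the single-thread versus multi-thread comparison concrete, here is a minimal C++ sketch. The README does not define the exact DMM operation, so the sketch substitutes an ordinary dense N × N product C = A * B as a stand-in; the kernel for the assignment itself may differ. The names `Matrix`, `multiply_rows`, `multiply_single`, and `multiply_multi` are hypothetical and are not taken from the repository.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// ASSUMPTION: square matrices stored row-major in a flat vector of size n * n.
using Matrix = std::vector<double>;

// Accumulates rows [row_begin, row_end) of C = A * B.
// ASSUMPTION: an ordinary dense product stands in for DMM, which the README does not define.
void multiply_rows(const Matrix& a, const Matrix& b, Matrix& c,
                   std::size_t n, std::size_t row_begin, std::size_t row_end) {
    for (std::size_t i = row_begin; i < row_end; ++i) {
        // i-k-j loop order walks B and C along rows, which is friendlier to the cache.
        for (std::size_t k = 0; k < n; ++k) {
            const double aik = a[i * n + k];
            for (std::size_t j = 0; j < n; ++j) {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
}

// Single-threaded baseline: one call covers every row.
Matrix multiply_single(const Matrix& a, const Matrix& b, std::size_t n) {
    Matrix c(n * n, 0.0);
    multiply_rows(a, b, c, n, 0, n);
    return c;
}

// Multi-threaded variant: each thread owns a contiguous block of output rows,
// so no two threads write the same element and no locking is needed.
Matrix multiply_multi(const Matrix& a, const Matrix& b, std::size_t n,
                      unsigned num_threads) {
    Matrix c(n * n, 0.0);
    num_threads = std::max(1u, num_threads);
    const std::size_t chunk = (n + num_threads - 1) / num_threads;  // ceiling division
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(n, static_cast<std::size_t>(t) * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // more threads than rows: nothing left to assign
        workers.emplace_back([&, begin, end] {
            multiply_rows(a, b, c, n, begin, end);
        });
    }
    for (auto& w : workers) w.join();
    return c;
}
```

Calling `multiply_single(a, b, n)` and `multiply_multi(a, b, n, std::thread::hardware_concurrency())` on the same inputs should produce identical results, which gives a simple correctness check before timing either version. A CUDA version would typically follow the same partitioning idea at a finer grain, for example one GPU thread per output element, but that is not shown here.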