We plan to try optimizing the implementation of cuBSIM4load_kernel, but first we wanted to ask whether there is already a "wish list" of tasks you would rather have done on CUSPICE. According to the profiling results, cuBSIM4load_kernel takes up 45% of the time and csrMv_kernel takes up 40%, while the CPU spends 80.54% of its time waiting.
#Cudalaunch nvprof code#
So far we have profiled the code using the NVIDIA profiler with the "c5315_ann.net" file. We want to work on the CUDA version of ngspice as a class project for a parallel computing course.
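
To get a rough idea of how much optimizing cuBSIM4load_kernel alone could help, here is a small back-of-the-envelope estimate using Amdahl's law. It is only a sketch: it assumes the 45% figure is the kernel's share of the total run time (the profile may instead report it as a share of GPU time), and the function name and the speedup factors tried are ours, chosen for illustration.

```python
def amdahl_speedup(fraction, kernel_speedup):
    """Overall speedup when `fraction` of the run time is made
    `kernel_speedup` times faster and the rest is unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / kernel_speedup)


# Assumption: 45% is cuBSIM4load_kernel's share of the total run time.
LOAD_FRACTION = 0.45

if __name__ == "__main__":
    # Hypothetical kernel speedups, not measured results.
    for s in (2, 4, 10):
        overall = amdahl_speedup(LOAD_FRACTION, s)
        print(f"kernel {s}x faster -> overall {overall:.2f}x")

    # Upper bound if the kernel cost dropped to zero.
    print(f"upper bound: {1.0 / (1.0 - LOAD_FRACTION):.2f}x")
```

Under that assumption, even a very large speedup of this one kernel is bounded by 1 / (1 - 0.45), so csrMv_kernel and the CPU wait time would still matter for the overall result.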