Enhanced CNN performance by parallelizing the forward convolution layer. Using OpenMP, achieved a 13.9x speedup, reducing execution time from 25 s to 1.8 s. Leveraged GPU parallelism with CUDA through ...
Parallelizing a C++ CNN framework. Repository: iorais/CNN-CPP-Parallel on GitHub.
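
To illustrate what parallelizing a forward convolution layer with OpenMP can look like, here is a minimal sketch. It is not taken from the repository: the `Tensor3` struct, the `conv_forward` function, the flattened `[channel][row][col]` layout, and the choice of a stride-1 convolution without padding are all assumptions made for this example. The idea is that every output element is computed independently, so the loops over output channels and output rows can be split across threads without locking.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tensor type (not from the repository): [channel][row][col],
// stored flattened in row-major order.
struct Tensor3 {
    std::size_t c, h, w;
    std::vector<float> data;

    Tensor3(std::size_t c_, std::size_t h_, std::size_t w_)
        : c(c_), h(h_), w(w_), data(c_ * h_ * w_, 0.0f) {}

    float& at(std::size_t ch, std::size_t y, std::size_t x) {
        return data[(ch * h + y) * w + x];
    }
    float at(std::size_t ch, std::size_t y, std::size_t x) const {
        return data[(ch * h + y) * w + x];
    }
};

// Assumed setup: stride 1, no padding, square k x k kernels.
// weights are flattened as [out_c][in_c][k][k].
Tensor3 conv_forward(const Tensor3& in, const std::vector<float>& weights,
                     std::size_t out_c, std::size_t k) {
    assert(k >= 1 && in.h >= k && in.w >= k);
    assert(weights.size() == out_c * in.c * k * k);

    const std::size_t oh = in.h - k + 1;
    const std::size_t ow = in.w - k + 1;
    Tensor3 out(out_c, oh, ow);

    const int n_oc = static_cast<int>(out_c);
    const int n_oh = static_cast<int>(oh);

    // Each (oc, y) pair writes a disjoint row of the output, so the two
    // outer loops can be distributed across threads with no synchronization.
    // collapse(2) needs OpenMP 3.0 or later; without OpenMP enabled the
    // pragma is ignored and the loops run serially.
    #pragma omp parallel for collapse(2) schedule(static)
    for (int oc = 0; oc < n_oc; ++oc) {
        for (int y = 0; y < n_oh; ++y) {
            for (std::size_t x = 0; x < ow; ++x) {
                float sum = 0.0f;
                for (std::size_t ic = 0; ic < in.c; ++ic) {
                    for (std::size_t ky = 0; ky < k; ++ky) {
                        for (std::size_t kx = 0; kx < k; ++kx) {
                            const std::size_t wi =
                                ((static_cast<std::size_t>(oc) * in.c + ic) * k + ky) * k + kx;
                            sum += in.at(ic, static_cast<std::size_t>(y) + ky, x + kx) * weights[wi];
                        }
                    }
                }
                out.at(static_cast<std::size_t>(oc), static_cast<std::size_t>(y), x) = sum;
            }
        }
    }
    return out;
}
```

With GCC or Clang, a sketch like this would typically be built with `-O2 -fopenmp`. The thread count can be set through the `OMP_NUM_THREADS` environment variable. Any speedup depends on the layer sizes and the hardware, so the figures quoted above should not be expected from this example.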