CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
cutlass::debug Namespace Reference
template<typename Fragment>
CUTLASS_DEVICE void dump_fragment(Fragment const &frag, int N = 0, int M = 0, int S = 1)

template<typename Element>
CUTLASS_DEVICE void dump_shmem(Element const *ptr, size_t size, int S = 1)

template<typename Fragment>
CUTLASS_DEVICE void cutlass::debug::dump_fragment(Fragment const &frag, int N = 0, int M = 0, int S = 1)
The first N threads each dump the first M elements of their fragment, with a stride of S elements. If N is not specified (default 0), every thread dumps its data. If M is not specified (default 0), all elements of the fragment are dumped.
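
The selection rule described above can be sketched on the host in plain C++. This is only an illustration of the documented semantics, not the CUTLASS implementation: the helper name dumped_indices and its thread_id, total_threads, and fragment_size parameters are hypothetical, and the sketch assumes that "the first M elements with a stride of S" means indices 0, S, 2S, ... below M.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side helper (not part of CUTLASS). It returns the
// fragment indices that a given thread would print for
// dump_fragment(frag, N, M, S), assuming a value of 0 for N or M means
// "all threads" or "all elements", as the description states.
std::vector<int> dumped_indices(int thread_id, int total_threads,
                                int fragment_size, int N = 0, int M = 0,
                                int S = 1) {
  assert(S >= 1);                      // a stride below 1 would never advance
  if (N == 0) N = total_threads;       // N not specified: every thread dumps
  if (M == 0) M = fragment_size;       // M not specified: whole fragment
  std::vector<int> indices;
  if (thread_id >= N) return indices;  // only the first N threads print
  for (int i = 0; i < M; i += S) {     // first M elements, every S-th one
    indices.push_back(i);
  }
  return indices;
}
```

Under these assumptions, calling dumped_indices(0, 32, 8, 2, 6, 2) yields the indices 0, 2, and 4 for thread 0, while the same call for thread 2 yields nothing because only the first two threads print.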
template<typename Element>
CUTLASS_DEVICE void cutlass::debug::dump_shmem(Element const *ptr, size_t size, int S = 1)
Dumps the contents of shared memory. ptr is the starting address, size is the number of elements to dump, and S is the stride between dumped elements (default 1).
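
As a rough host-side analogue of the strided dump described above, the following C++ sketch formats every S-th element of a buffer. The function format_strided is a hypothetical stand-in, not a CUTLASS API, and it assumes that size bounds the range being walked (indices 0, S, 2S, ... below size); the description does not say whether size counts the elements traversed or the elements printed.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical stand-in for a strided memory dump (not the CUTLASS
// implementation). It walks the buffer from ptr and appends every S-th
// element, followed by a space, to the returned string.
template <typename Element>
std::string format_strided(Element const *ptr, std::size_t size, int S = 1) {
  assert(ptr != nullptr);
  assert(S >= 1);  // a stride below 1 would never advance
  std::ostringstream out;
  for (std::size_t i = 0; i < size; i += static_cast<std::size_t>(S)) {
    out << ptr[i] << ' ';
  }
  return out.str();
}
```

For a six-element int buffer holding 0 through 5, format_strided(buf, 6, 2) visits indices 0, 2, and 4 under the assumption stated above.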