
CUTLASS: output_tile_thread_map.h File Reference


CUTLASS: CUDA Templates for Linear Algebra Subroutines and Solvers



Metaprogram for determining the mapping of output elements to threads for epilogue tiles.

#include "cutlass/cutlass.h"
#include "cutlass/numeric_types.h"
#include "cutlass/array.h"
#include "cutlass/layout/matrix.h"
#include "cutlass/matrix_shape.h"
#include "cutlass/tensor_ref.h"
#include "cutlass/fast_math.h"


[Go to the source code of this file.](output__tile__thread__map_8h_source.html)


Classes

| Kind | Name | Description |
| --- | --- | --- |
| struct | cutlass::epilogue::threadblock::OutputTileShape< Column, Row, Group, Cluster, Tile > | Tuple defining point in output tile. |
| struct | cutlass::epilogue::threadblock::OutputTileThreadMap< ThreadMap_, Shape_, Iterations_, Delta_, Count_ > | |
| struct | cutlass::epilogue::threadblock::detail::RowArrangement< Shape, WarpsRemaining, ElementsPerAccess, ElementSize, Is2dTile > | RowArrangement determines how one or more warps cover a region of consecutive rows. |
| struct | cutlass::epilogue::threadblock::detail::RowArrangement< Shape, WarpsRemaining, ElementsPerAccess, ElementSize, false > | RowArrangement in which each warp's access is a 1D tiled arrangement. |
| struct | cutlass::epilogue::threadblock::detail::RowArrangement< Shape, WarpsRemaining, ElementsPerAccess, ElementSize, true > | RowArrangement in which each warp's access is a 2D tiled arrangement. |
| struct | cutlass::epilogue::threadblock::detail::RowArrangement< Shape, WarpsRemaining, ElementsPerAccess, ElementSize, true >::Detail | |
| struct | cutlass::epilogue::threadblock::OutputTileOptimalThreadMap< Shape_, Count_, Threads, ElementsPerAccess, ElementSize > | |
| struct | cutlass::epilogue::threadblock::OutputTileOptimalThreadMap< Shape_, Count_, Threads, ElementsPerAccess, ElementSize >::Detail | |
| struct | cutlass::epilogue::threadblock::OutputTileOptimalThreadMap< Shape_, Count_, Threads, ElementsPerAccess, ElementSize >::CompactedThreadMap | Compacted thread map in which the 4D region is contiguous. |
| struct | cutlass::epilogue::threadblock::InterleavedOutputTileThreadMap< WarpCount_, MmaCount_, Threads, ElementsPerAccess, ElementSize > | |
| struct | cutlass::epilogue::threadblock::InterleavedOutputTileThreadMap< WarpCount_, MmaCount_, Threads, ElementsPerAccess, ElementSize >::Detail | |
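As a rough illustration of the kind of compile-time "tuple" that OutputTileShape describes, the sketch below defines a stand-alone five-dimensional shape and derives a total element count from it. The names `TileShapeSketch`, `kWarpSizeSketch`, `column_iterations`, and `ExampleShape` are hypothetical, as are the warp width of 32 and the example extents. None of this is taken from the header, so treat it as a sketch under those assumptions and not as the CUTLASS definition.

```cpp
// Hypothetical stand-in for a five-dimensional output tile shape.
// This is NOT the CUTLASS definition; it only mirrors the template
// parameter order < Column, Row, Group, Cluster, Tile > listed above.
template <int Column, int Row, int Group, int Cluster, int Tile>
struct TileShapeSketch {
  static constexpr int kColumn = Column;
  static constexpr int kRow = Row;
  static constexpr int kGroup = Group;
  static constexpr int kCluster = Cluster;
  static constexpr int kTile = Tile;

  // Total number of points addressed by the five extents.
  static constexpr int kCount = Column * Row * Group * Cluster * Tile;
};

// Assumed warp width, for illustration only.
constexpr int kWarpSizeSketch = 32;

// Number of column-wise accesses one warp would need to cover a row of
// `columns` elements if every thread handled `elements_per_access`
// contiguous elements per access (a 1D arrangement along the row).
constexpr int column_iterations(int columns, int elements_per_access) {
  return columns / (elements_per_access * kWarpSizeSketch);
}

// Example extents chosen arbitrarily for the illustration.
using ExampleShape = TileShapeSketch<128, 8, 2, 1, 1>;

static_assert(ExampleShape::kCount == 128 * 8 * 2, "count is the product of the extents");
static_assert(column_iterations(ExampleShape::kColumn, 4) == 1,
              "32 threads x 4 elements cover 128 columns in one pass");
```

Because everything is evaluated at compile time, a thread map built this way adds no runtime cost; the mapping is fixed when the template is instantiated.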


Namespaces

| Namespace |
| --- |
| cutlass |
| cutlass::epilogue |
| cutlass::epilogue::threadblock |
| cutlass::epilogue::threadblock::detail |


Generated by Doxygen 1.8.11