CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
device/gemm_splitk_parallel.h File Reference
Template for GEMM performing a reduction over K partitions in parallel.
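
To make the idea of "a reduction over K partitions in parallel" concrete, the following is a minimal CPU sketch of the split-K scheme, not the CUTLASS implementation. The function name `splitk_gemm_reference`, the row-major layouts, and the `alpha`/`beta` linear scaling are assumptions chosen for illustration. Each K partition writes its own partial M x N result into a workspace, and a second pass sums the partials and applies the scaling.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a CUTLASS API): computes D = alpha * A * B + beta * C
// for row-major A (M x K), B (K x N), C (M x N) by splitting K into `slices` partitions.
std::vector<float> splitk_gemm_reference(int M, int N, int K, int slices,
                                         float alpha,
                                         const std::vector<float>& A,
                                         const std::vector<float>& B,
                                         float beta,
                                         const std::vector<float>& C) {
  const std::size_t tile = static_cast<std::size_t>(M) * N;

  // Workspace: one M x N partial accumulator tile per K partition.
  std::vector<float> workspace(static_cast<std::size_t>(slices) * tile, 0.0f);
  const int k_per_slice = (K + slices - 1) / slices;  // ceiling division

  // Phase 1: partitions do not depend on each other, so a GPU could run them concurrently.
  for (int s = 0; s < slices; ++s) {
    const int k_begin = s * k_per_slice;
    const int k_end = std::min(K, k_begin + k_per_slice);
    float* partial = workspace.data() + static_cast<std::size_t>(s) * tile;
    for (int m = 0; m < M; ++m) {
      for (int n = 0; n < N; ++n) {
        float acc = 0.0f;
        for (int k = k_begin; k < k_end; ++k) {
          acc += A[static_cast<std::size_t>(m) * K + k] *
                 B[static_cast<std::size_t>(k) * N + n];
        }
        partial[static_cast<std::size_t>(m) * N + n] = acc;
      }
    }
  }

  // Phase 2: sum the partial tiles across partitions, then apply the linear scaling.
  std::vector<float> D(tile, 0.0f);
  for (std::size_t i = 0; i < tile; ++i) {
    float sum = 0.0f;
    for (int s = 0; s < slices; ++s) {
      sum += workspace[static_cast<std::size_t>(s) * tile + i];
    }
    D[i] = alpha * sum + beta * C[i];
  }
  return D;
}
```

In this sketch the workspace grows linearly with the number of partitions, which is the usual cost of trading extra memory for more parallelism when K is large relative to M and N. The two phases loosely mirror the headers included by this file: a GEMM kernel for the partial products and a separate split-K reduction kernel.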
#include "cutlass/cutlass.h"
#include "cutlass/numeric_types.h"
#include "cutlass/arch/arch.h"
#include "cutlass/device_kernel.h"
#include "cutlass/gemm/threadblock/threadblock_swizzle.h"
#include "cutlass/gemm/kernel/gemm.h"
#include "cutlass/gemm/kernel/default_gemm_splitk_parallel.h"
#include "cutlass/gemm/device/default_gemm_configuration.h"
#include "cutlass/epilogue/thread/conversion_op.h"
#include "cutlass/reduction/kernel/reduce_split_k.h"
#include "cutlass/reduction/thread/reduction_operators.h"
[Figure: include dependency graph for device/gemm_splitk_parallel.h]
[Go to the source code of this file.](device_2gemm__splitk__parallel_8h_source.html)
| Kind | Name | Description |
| --- | --- | --- |
| class | cutlass::gemm::device::GemmSplitKParallel< ElementA_, LayoutA_, ElementB_, LayoutB_, ElementC_, LayoutC_, ElementAccumulator_, OperatorClass_, ArchTag_, ThreadblockShape_, WarpShape_, InstructionShape_, EpilogueOutputOp_, ConvertScaledOp_, ReductionOp_, ThreadblockSwizzle_, Stages, kAlignmentA, kAlignmentB, Operator_ > | |
| struct | cutlass::gemm::device::GemmSplitKParallel< ElementA_, LayoutA_, ElementB_, LayoutB_, ElementC_, LayoutC_, ElementAccumulator_, OperatorClass_, ArchTag_, ThreadblockShape_, WarpShape_, InstructionShape_, EpilogueOutputOp_, ConvertScaledOp_, ReductionOp_, ThreadblockSwizzle_, Stages, kAlignmentA, kAlignmentB, Operator_ >::Arguments | Argument structure. |
| class | [cutlass::gemm::device::GemmSplitKParallel< ElementA_, LayoutA_, ElementB_, LayoutB_, ElementC_, layout::ColumnMajor, ElementAccumulator_, OperatorClass_, ArchTag_, ThreadblockShape_, WarpShape_, InstructionShape_, EpilogueOutputOp_, ConvertScaledOp_, ReductionOp_, ThreadblockSwizzle_, Stages, kAlignmentA, kAlignmentB, Operator_ >](classcutlass_1_1gemm_1_1device_1_1GemmSplitKParallel_3_01ElementA___00_01LayoutA___00_01ElementBbe7c1f7154ad5b5bf9d4d28301e2b457.html) | Partial specialization for column-major output. |
| struct | [cutlass::gemm::device::GemmSplitKParallel< ElementA_, LayoutA_, ElementB_, LayoutB_, ElementC_, layout::ColumnMajor, ElementAccumulator_, OperatorClass_, ArchTag_, ThreadblockShape_, WarpShape_, InstructionShape_, EpilogueOutputOp_, ConvertScaledOp_, ReductionOp_, ThreadblockSwizzle_, Stages, kAlignmentA, kAlignmentB, Operator_ >::Arguments](structcutlass_1_1gemm_1_1device_1_1GemmSplitKParallel_3_01ElementA___00_01LayoutA___00_01Elementafcb1aeaf2035a7ac769d7acc233423b.html) | Argument structure. |
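
The partial specialization for column-major output listed above can be understood through a standard identity: a column-major M x N matrix occupies the same memory as its row-major N x M transpose, and C = A * B implies C^T = B^T * A^T. The sketch below illustrates that identity on the CPU. It assumes, for simplicity, that A, B, and C are all column-major, and the names `gemm_row_major` and `gemm_column_major` are hypothetical, not CUTLASS APIs. It is meant only to show why a row-major-output GEMM can plausibly be reused by swapping operands and exchanging M and N.

```cpp
#include <cstddef>
#include <vector>

// Row-major GEMM: returns X (rows x inner) * Y (inner x cols), all row-major.
std::vector<float> gemm_row_major(int rows, int cols, int inner,
                                  const std::vector<float>& X,
                                  const std::vector<float>& Y) {
  std::vector<float> out(static_cast<std::size_t>(rows) * cols, 0.0f);
  for (int r = 0; r < rows; ++r) {
    for (int c = 0; c < cols; ++c) {
      float acc = 0.0f;
      for (int k = 0; k < inner; ++k) {
        acc += X[static_cast<std::size_t>(r) * inner + k] *
               Y[static_cast<std::size_t>(k) * cols + c];
      }
      out[static_cast<std::size_t>(r) * cols + c] = acc;
    }
  }
  return out;
}

// Hypothetical wrapper: A (M x K), B (K x N), and the result C (M x N) are column-major.
std::vector<float> gemm_column_major(int M, int N, int K,
                                     const std::vector<float>& A,
                                     const std::vector<float>& B) {
  // Column-major A is bitwise identical to row-major A^T (K x M), and
  // column-major B to row-major B^T (N x K). Computing
  //   C^T (N x M) = B^T (N x K) * A^T (K x M)
  // in row-major form yields exactly the column-major storage of C.
  return gemm_row_major(N, M, K, B, A);
}
```

No data is moved or transposed in this approach: only the roles of the two operands and the two output dimensions are exchanged.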

| Namespace |
| --- |
| cutlass |
| cutlass::gemm |
| cutlass::gemm::device |
Generated by Doxygen 1.8.11