CUTLASS: CUDA Templates for Linear Algebra Subroutines and Solvers
default_gemm_splitk_parallel.h
/***************************************************************************************************
 * Copyright (c) 2017-2019, NVIDIA CORPORATION. All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without modification, are permitted
 * provided that the following conditions are met:
 *     * Redistributions of source code must retain the above copyright notice, this list of
 *       conditions and the following disclaimer.
 *     * Redistributions in binary form must reproduce the above copyright notice, this list of
 *       conditions and the following disclaimer in the documentation and/or other materials
 *       provided with the distribution.
 *     * Neither the name of the NVIDIA CORPORATION nor the names of its contributors may be used
 *       to endorse or promote products derived from this software without specific prior written
 *       permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
 * FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NVIDIA CORPORATION BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
 * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
 * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 *
 **************************************************************************************************/

#pragma once

#include "cutlass/cutlass.h"
#include "cutlass/gemm/kernel/default_gemm.h"
#include "cutlass/gemm/kernel/gemm_splitk_parallel.h"

namespace cutlass {
namespace gemm {
namespace kernel {

template <
  /// Element type of the A operand
  typename ElementA_,
  /// Layout of the A operand
  typename LayoutA_,
  /// Alignment of A accesses, in elements
  int kAlignmentA,
  /// Element type of the B operand
  typename ElementB_,
  /// Layout of the B operand
  typename LayoutB_,
  /// Alignment of B accesses, in elements
  int kAlignmentB,
  /// Element type of the C operand (not forwarded to DefaultGemm below)
  typename ElementC_,
  /// Layout of the C operand
  typename LayoutC_,
  /// Element type used for accumulation
  typename ElementAccumulator,
  /// Operator class tag
  typename OperatorClass,
  /// Target architecture tag
  typename ArchTag,
  /// Threadblock-level tile shape
  typename ThreadblockShape,
  /// Warp-level tile shape
  typename WarpShape,
  /// Instruction-level tile shape
  typename InstructionShape,
  /// Epilogue output operator
  typename EpilogueOutputOp,
  /// Threadblock swizzling function
  typename ThreadblockSwizzle,
  /// Number of pipeline stages in the mainloop
  int Stages,
  /// Operation performed by the GEMM
  typename Operator
>
struct DefaultGemmSplitKParallel {

  /// Instantiate DefaultGemm with ElementAccumulator in the output-element position,
  /// so each K partition is written at accumulator precision.
  using Default = DefaultGemm<
    ElementA_,
    LayoutA_,
    kAlignmentA,
    ElementB_,
    LayoutB_,
    kAlignmentB,
    ElementAccumulator,
    LayoutC_,
    ElementAccumulator,
    OperatorClass,
    ArchTag,
    ThreadblockShape,
    WarpShape,
    InstructionShape,
    EpilogueOutputOp,
    ThreadblockSwizzle,
    Stages,
    false,  // serial split-K disabled
    Operator
  >;

  /// Define the matrix multiply operator.
  using Mma = typename Default::Mma;

  /// Define the epilogue.
  using Epilogue = typename Default::Epilogue;

  /// Define the kernel-level GEMM.
  using GemmKernel = kernel::GemmSplitKParallel<Mma, Epilogue, ThreadblockSwizzle>;
};

} // namespace kernel
} // namespace gemm
} // namespace cutlass
cutlass::gemm::kernel::DefaultGemm
  Definition: default_gemm.h:116
cutlass::gemm::kernel::DefaultGemmSplitKParallel
  Definition: default_gemm_splitk_parallel.h:88
cutlass::gemm::kernel::DefaultGemmSplitKParallel::Mma
  typename Default::Mma Mma
  Define the matrix multiply operator.
  Definition: default_gemm_splitk_parallel.h:115
cutlass::gemm::kernel::DefaultGemmSplitKParallel::Epilogue
  typename Default::Epilogue Epilogue
  Define the epilogue.
  Definition: default_gemm_splitk_parallel.h:118
cutlass::gemm::kernel::GemmSplitKParallel
  Definition: kernel/gemm_splitk_parallel.h:49
cutlass/gemm/kernel/gemm_splitk_parallel.h
  Template for GEMM performing a reduction over K partitions in parallel.
cutlass/gemm/kernel/default_gemm.h
  Default kernel-level GEMM definitions combine threadblock-scoped matrix multiply-add with the appropr...
cutlass/cutlass.h
  Basic include for CUTLASS.
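The phrase "a reduction over K partitions in parallel" can be illustrated on the CPU. The following is a rough sketch, not CUTLASS code: the function name `splitk_parallel_gemm`, the row-major layouts, and the choice of `float` inputs with a `double` workspace are all assumptions for illustration. Each K slice writes its own partial product into a workspace at accumulator precision, and a second pass sums the slices and converts to the output type.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU sketch of parallel split-K GEMM: D = A * B, with
// row-major A (M x K), B (K x N), D (M x N). Not a CUTLASS API.
std::vector<float> splitk_parallel_gemm(const std::vector<float>& A,
                                        const std::vector<float>& B,
                                        std::size_t M, std::size_t N,
                                        std::size_t K, std::size_t slices) {
  assert(slices > 0 && A.size() == M * K && B.size() == K * N);

  // Phase 1: one M x N partial product per K slice, kept at accumulator
  // precision. Slices are independent, so they could run concurrently.
  std::vector<double> workspace(slices * M * N, 0.0);
  const std::size_t chunk = (K + slices - 1) / slices;
  for (std::size_t s = 0; s < slices; ++s) {
    const std::size_t k_begin = std::min(K, s * chunk);
    const std::size_t k_end = std::min(K, k_begin + chunk);
    double* partial = workspace.data() + s * M * N;
    for (std::size_t m = 0; m < M; ++m) {
      for (std::size_t n = 0; n < N; ++n) {
        double acc = 0.0;
        for (std::size_t k = k_begin; k < k_end; ++k) {
          acc += static_cast<double>(A[m * K + k]) *
                 static_cast<double>(B[k * N + n]);
        }
        partial[m * N + n] = acc;
      }
    }
  }

  // Phase 2: reduce across slices, then convert to the output element type.
  std::vector<float> D(M * N);
  for (std::size_t i = 0; i < M * N; ++i) {
    double sum = 0.0;
    for (std::size_t s = 0; s < slices; ++s) {
      sum += workspace[s * M * N + i];
    }
    D[i] = static_cast<float>(sum);
  }
  return D;
}
```

Keeping the workspace at accumulator precision means rounding to the output type happens once, after the reduction, rather than once per slice.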