CUTLASS: default_gemm_splitk_parallel.h Source File

CUTLASS: CUDA Templates for Linear Algebra Subroutines and Solvers

default_gemm_splitk_parallel.h

1 /***************************************************************************************************

2 * Copyright (c) 2017-2019, NVIDIA CORPORATION. All rights reserved.

3 *

4 * Redistribution and use in source and binary forms, with or without modification, are permitted

5 * provided that the following conditions are met:

6 * * Redistributions of source code must retain the above copyright notice, this list of

7 * conditions and the following disclaimer.

8 * * Redistributions in binary form must reproduce the above copyright notice, this list of

9 * conditions and the following disclaimer in the documentation and/or other materials

10 * provided with the distribution.

11 * * Neither the name of the NVIDIA CORPORATION nor the names of its contributors may be used

12 * to endorse or promote products derived from this software without specific prior written

13 * permission.

14 *

15 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR

16 * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND

17 * FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NVIDIA CORPORATION BE LIABLE

18 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,

19 * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;

20 * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,

21 * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE

22 * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

23 *

24 **************************************************************************************************/

25

36 #pragma once

37

38 #include "cutlass/cutlass.h"

39 #include "cutlass/gemm/kernel/default_gemm.h"

40 #include "cutlass/gemm/kernel/gemm_splitk_parallel.h"

41

43

44 namespace cutlass {

45 namespace gemm {

46 namespace kernel {

47

49

50 template <

52 typename ElementA_,

54 typename LayoutA_,

56 int kAlignmentA,

58 typename ElementB_,

60 typename LayoutB_,

62 int kAlignmentB,

64 typename ElementC_,

66 typename LayoutC_,

68 typename ElementAccumulator,

70 typename OperatorClass,

72 typename ArchTag,

74 typename ThreadblockShape,

76 typename WarpShape,

78 typename InstructionShape,

80 typename EpilogueOutputOp,

82 typename ThreadblockSwizzle,

84 int Stages,

86 typename Operator

87 >

88 struct DefaultGemmSplitKParallel {

89

92 using Default = DefaultGemm<

93 ElementA_,

94 LayoutA_,

95 kAlignmentA,

96 ElementB_,

97 LayoutB_,

98 kAlignmentB,

99 ElementAccumulator,

100 LayoutC_,

101 ElementAccumulator,

102 OperatorClass,

103 ArchTag,

104 ThreadblockShape,

105 WarpShape,

106 InstructionShape,

107 EpilogueOutputOp,

108 ThreadblockSwizzle,

109 Stages,

110 false,

111 Operator

112 >;

113

115 using Mma = typename Default::Mma;

116

118 using Epilogue = typename Default::Epilogue;

119

121 using GemmKernel = kernel::GemmSplitKParallel<Mma, Epilogue, ThreadblockSwizzle>;

122 };

123

125

126 } // namespace kernel

127 } // namespace gemm

128 } // namespace cutlass

129
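The listing composes its kernel type in three steps: it instantiates `DefaultGemm` with `ElementAccumulator` in the position where the output element would normally go, passes `false` for what appears to be a serial split-K flag (the listing does not name that parameter, so this reading is an assumption), and then reuses the resulting `Mma` and `Epilogue` in `GemmSplitKParallel`. The sketch below mimics that composition with stand-in types so it compiles without CUTLASS. Every name in the `sketch` namespace is hypothetical and is not part of the library.

```cpp
#include <type_traits>

namespace sketch {

// Hypothetical stand-in for a narrow output type such as half precision.
struct HalfStub {};

// Hypothetical stand-in for DefaultGemm: it only records the output element
// type and the (assumed) serial split-K flag it was instantiated with.
template <typename ElementC, bool SplitKSerial>
struct MockDefaultGemm {
  struct Mma {};
  struct Epilogue {
    using OutputElement = ElementC;
    static constexpr bool kSplitKSerial = SplitKSerial;
  };
};

// Hypothetical stand-in for the split-K parallel kernel template.
template <typename Mma_, typename Epilogue_>
struct MockSplitKParallelKernel {
  using Mma = Mma_;
  using Epilogue = Epilogue_;
};

// Mirrors the structure of the listing: the final output type ElementC_ is
// accepted but the underlying GEMM is configured to emit ElementAccumulator,
// so partial products keep accumulator precision until they are reduced.
template <typename ElementC_, typename ElementAccumulator>
struct MockDefaultSplitKParallel {
  using Default = MockDefaultGemm<ElementAccumulator, false>;
  using Mma = typename Default::Mma;
  using Epilogue = typename Default::Epilogue;
  using GemmKernel = MockSplitKParallelKernel<Mma, Epilogue>;
};

}  // namespace sketch
```

With `ElementC_ = HalfStub` and `ElementAccumulator = float`, the composed epilogue in this sketch writes `float`, which is the point of the substitution: narrowing to the final output type is left to whatever runs after the partial results are combined.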

cutlass::gemm::kernel::DefaultGemm

Definition: default_gemm.h:116

cutlass

Definition: aligned_buffer.h:35

cutlass::gemm::kernel::DefaultGemmSplitKParallel

Definition: default_gemm_splitk_parallel.h:88

cutlass::gemm::kernel::DefaultGemmSplitKParallel::Mma

typename Default::Mma Mma

Define the matrix multiply operator.

Definition: default_gemm_splitk_parallel.h:115

gemm_splitk_parallel.h

Template for GEMM performing a reduction over K partitions in parallel.
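As a rough illustration of what "a reduction over K partitions in parallel" means, the host-side sketch below splits the K dimension into slices, computes one partial M x N product per slice into a workspace, and then sums the partials. It is a plain C++ model of the arithmetic only, not CUTLASS code: the function name `splitk_parallel_gemm`, the row-major layouts, and the workspace arrangement are assumptions made for the example, and on a GPU the slice loop would run as independent threadblocks rather than sequentially.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration of split-K with a separate reduction step.
// A is M x K, B is K x N, both row-major; the result is M x N, row-major.
std::vector<float> splitk_parallel_gemm(const std::vector<float>& A,
                                        const std::vector<float>& B,
                                        int M, int N, int K, int slices) {
  assert(M >= 0 && N >= 0 && K >= 0 && slices > 0);
  assert(A.size() == static_cast<std::size_t>(M) * K);
  assert(B.size() == static_cast<std::size_t>(K) * N);

  const std::size_t tile = static_cast<std::size_t>(M) * N;
  const int k_per_slice = (K + slices - 1) / slices;  // ceiling division

  // Workspace holds one partial M x N product per K slice.
  std::vector<float> workspace(static_cast<std::size_t>(slices) * tile, 0.0f);

  // Phase 1: each slice multiplies its own range of K. The slices do not
  // depend on one another, which is what allows them to run in parallel.
  for (int s = 0; s < slices; ++s) {
    const int k_begin = std::min(K, s * k_per_slice);
    const int k_end = std::min(K, k_begin + k_per_slice);
    float* partial = workspace.data() + static_cast<std::size_t>(s) * tile;
    for (int m = 0; m < M; ++m)
      for (int n = 0; n < N; ++n)
        for (int k = k_begin; k < k_end; ++k)
          partial[m * N + n] += A[m * K + k] * B[k * N + n];
  }

  // Phase 2: reduce the partial products across slices.
  std::vector<float> C(tile, 0.0f);
  for (int s = 0; s < slices; ++s)
    for (std::size_t i = 0; i < tile; ++i)
      C[i] += workspace[static_cast<std::size_t>(s) * tile + i];
  return C;
}
```

Keeping the partial products in the accumulator type until the final sum avoids rounding each slice to a narrower output type before the reduction, at the cost of a workspace that grows with the number of slices.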

default_gemm.h

Default kernel-level GEMM definitions combine threadblock-scoped matrix multiply-add with the appropr...

cutlass::gemm::kernel::DefaultGemmSplitKParallel::Epilogue

typename Default::Epilogue Epilogue

Define the epilogue.

Definition: default_gemm_splitk_parallel.h:118

cutlass::gemm::kernel::GemmSplitKParallel

Definition: kernel/gemm_splitk_parallel.h:49

cutlass.h

Basic include for CUTLASS.

