
CUTLASS: default_gemv.h Source File


CUTLASS

CUDA Templates for Linear Algebra Subroutines and Solvers

default_gemv.h


/***************************************************************************************************
 * Copyright (c) 2017-2019, NVIDIA CORPORATION. All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without modification, are permitted
 * provided that the following conditions are met:
 *     * Redistributions of source code must retain the above copyright notice, this list of
 *       conditions and the following disclaimer.
 *     * Redistributions in binary form must reproduce the above copyright notice, this list of
 *       conditions and the following disclaimer in the documentation and/or other materials
 *       provided with the distribution.
 *     * Neither the name of the NVIDIA CORPORATION nor the names of its contributors may be used
 *       to endorse or promote products derived from this software without specific prior written
 *       permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
 * FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NVIDIA CORPORATION BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
 * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
 * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 *
 **************************************************************************************************/

#pragma once

#include "cutlass/gemm/threadblock/gemv.h"
#include "cutlass/gemm/threadblock/default_gemv_core.h"
#include "cutlass/gemm/threadblock/threadblock_swizzle.h"

namespace cutlass {
namespace gemm {
namespace kernel {

template <
  typename ThreadBlockShape_,
  typename ThreadShape_,
  typename ElementA_,
  typename LayoutA_,
  typename ElementB_,
  typename LayoutB_,
  typename ElementCD_,
  typename LayoutCD_,
  typename ElementAccumulator_ = ElementCD_>
struct DefaultGemv {

  /// Shape of threadblock-level matrix operation (concept: GemmShape)
  using ThreadBlockShape = ThreadBlockShape_;

  /// Shape of thread-level matrix operation (concept: GemmShape)
  using ThreadShape = ThreadShape_;

  /// Data type of multiplicand A
  using ElementA = ElementA_;

  /// Layout of multiplicand A
  using LayoutA = LayoutA_;

  /// Data type of multiplicand B
  using ElementB = ElementB_;

  /// Layout of multiplicand B
  using LayoutB = LayoutB_;

  /// Data type of accumulators
  using ElementAccumulator = ElementAccumulator_;

  /// Layout of accumulators (same as C/D)
  using LayoutAccumulator = LayoutCD_;

  /// Data type of input/output matrix C/D
  using ElementCD = ElementCD_;

  /// Layout of input/output matrix C/D
  using LayoutCD = LayoutCD_;

  // Define the core components
  using Core = typename cutlass::gemm::threadblock::DefaultGemvCore<
      ThreadBlockShape, ThreadShape, ElementA, LayoutA, ElementB, LayoutB,
      ElementAccumulator, LayoutAccumulator>;

  // Define the threadblock-scoped gemv
  using ThreadBlockGemv = cutlass::gemm::threadblock::Gemv<Core>;

  // Iterator for multiplicand A
  using IteratorA = typename ThreadBlockGemv::IteratorA;

  // Iterator for multiplicand B
  using IteratorB = typename ThreadBlockGemv::IteratorB;

  /// Policy for the iterator that reads/writes C/D
  using IteratorPolicyCD = typename platform::conditional<
      platform::is_same<LayoutCD, layout::RowMajor>::value,
      cutlass::transform::PitchLinearTilePolicyStripminedThreadContiguous<
          layout::PitchLinearShape<ThreadBlockShape::kN, ThreadBlockShape::kM>, Core::kThreadsPerN, ThreadShape::kN>,
      cutlass::transform::PitchLinearTilePolicyStripminedThreadStrided<
          layout::PitchLinearShape<ThreadBlockShape::kM, ThreadBlockShape::kN>, Core::kThreadsPerN, ThreadShape::kM>>::type;

  /// Iterator that reads/writes C/D
  using IteratorCD = cutlass::transform::threadblock::PredicatedTileIterator<
      cutlass::MatrixShape<ThreadBlockShape::kM, ThreadBlockShape::kN>, ElementCD, LayoutCD, 0, IteratorPolicyCD>;

  /// Fragment storage for C/D
  using FragmentCD = typename IteratorCD::Fragment;

  // Define the threadblock swizzle
  using ThreadBlockSwizzle = cutlass::gemm::threadblock::GemvBatchedStridedThreadblockDefaultSwizzle;
};

} // namespace kernel
} // namespace gemm
} // namespace cutlass
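The `IteratorPolicyCD` alias above chooses between two policies at compile time, depending on whether `LayoutCD` is row-major, and swaps the order of the tile extents when it does so. The following is a minimal standalone sketch of that selection pattern using the standard `<type_traits>` equivalents. Every type in it (`RowMajor`, `ColumnMajor`, `PitchLinearShape`, `ThreadContiguousPolicy`, `ThreadStridedPolicy`, `SelectPolicy`) is a hypothetical stand-in defined here for illustration, not the CUTLASS type of the same or similar name.

```cpp
#include <type_traits>

// Hypothetical stand-ins for layout tags; not the CUTLASS types.
struct RowMajor {};
struct ColumnMajor {};

// Hypothetical pitch-linear shape: kContiguous is the fastest-varying extent.
template <int Contiguous, int Strided>
struct PitchLinearShape {
  static constexpr int kContiguous = Contiguous;
  static constexpr int kStrided = Strided;
};

// Hypothetical policy tags that only record the shape they were given.
template <typename Shape>
struct ThreadContiguousPolicy { using TileShape = Shape; };

template <typename Shape>
struct ThreadStridedPolicy { using TileShape = Shape; };

// Select a policy from the layout at compile time. An M-by-N row-major tile
// is contiguous along N, so N becomes the contiguous extent; for any other
// layout the extents are passed as (M, N).
template <int M, int N, typename Layout>
using SelectPolicy = typename std::conditional<
    std::is_same<Layout, RowMajor>::value,
    ThreadContiguousPolicy<PitchLinearShape<N, M>>,
    ThreadStridedPolicy<PitchLinearShape<M, N>>>::type;

static_assert(
    std::is_same<SelectPolicy<1, 64, RowMajor>,
                 ThreadContiguousPolicy<PitchLinearShape<64, 1>>>::value,
    "row-major selects the thread-contiguous policy with (N, M) extents");

static_assert(
    std::is_same<SelectPolicy<1, 64, ColumnMajor>,
                 ThreadStridedPolicy<PitchLinearShape<1, 64>>>::value,
    "any other layout selects the thread-strided policy with (M, N) extents");
```

Because the choice is made with type traits, the unused branch costs nothing at run time, and a mismatch between layout and policy shows up as a compile error rather than a wrong memory access pattern.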

cutlass::MatrixShape

Describes the size of a matrix tile.

Definition: matrix_shape.h:42

cutlass

Definition: aligned_buffer.h:35

cutlass::platform::is_same

std::is_same (false specialization)

Definition: platform.h:394

cutlass::gemm::kernel::DefaultGemv::ElementAccumulator

ElementAccumulator_ ElementAccumulator

Data type of accumulators.

Definition: default_gemv.h:78
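In the template, `ElementAccumulator_` defaults to `ElementCD_`, so the accumulator type only needs to be named when it should differ from the output type. A small standalone sketch of that default-argument pattern follows; `GemvTypes` and `accumulate_dot` are hypothetical names introduced here, and the code does not use CUTLASS.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical miniature of the pattern: the accumulator type defaults to the
// output element type unless the caller names a different one.
template <typename ElementCD_, typename ElementAccumulator_ = ElementCD_>
struct GemvTypes {
  using ElementCD = ElementCD_;
  using ElementAccumulator = ElementAccumulator_;
};

// Dot product that sums in Types::ElementAccumulator and converts the result
// to Types::ElementCD once at the end.
template <typename Types, typename ElementIn>
typename Types::ElementCD accumulate_dot(const ElementIn* a, const ElementIn* b, int k) {
  assert(k >= 0);
  using Acc = typename Types::ElementAccumulator;
  Acc acc = Acc(0);
  for (int i = 0; i < k; ++i) {
    acc += static_cast<Acc>(a[i]) * static_cast<Acc>(b[i]);
  }
  return static_cast<typename Types::ElementCD>(acc);
}

static_assert(
    std::is_same<GemvTypes<float>::ElementAccumulator, float>::value,
    "accumulator defaults to the output element type");

static_assert(
    std::is_same<GemvTypes<float, double>::ElementAccumulator, double>::value,
    "accumulator can be overridden explicitly");
```

For example, `accumulate_dot<GemvTypes<std::int32_t>>(a, b, k)` with 8-bit inputs accumulates in 32-bit integers, which avoids overflowing the narrow input type.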

cutlass::gemm::kernel::DefaultGemv::IteratorB

typename ThreadBlockGemv::IteratorB IteratorB

Definition: default_gemv.h:101

cutlass::gemm::kernel::DefaultGemv::ThreadShape

ThreadShape_ ThreadShape

Shape of thread-level matrix operation (concept: GemmShape)

Definition: default_gemv.h:63

cutlass::gemm::kernel::DefaultGemv::LayoutCD

LayoutCD_ LayoutCD

Layout of input/output matrix C/D.

Definition: default_gemv.h:87

cutlass::gemm::kernel::DefaultGemv::LayoutAccumulator

LayoutCD_ LayoutAccumulator

Layout of accumulators (same as C/D)

Definition: default_gemv.h:81

cutlass::layout::PitchLinearShape

Template defining a shape used by pitch-linear operators.

Definition: pitch_linear.h:43

cutlass::gemm::kernel::DefaultGemv::ElementA

ElementA_ ElementA

Data type of multiplicand A.

Definition: default_gemv.h:66

cutlass::gemm::kernel::DefaultGemv::FragmentCD

typename IteratorCD::Fragment FragmentCD

Fragment storage for C/D.

Definition: default_gemv.h:116

cutlass::gemm::threadblock::Gemv

Structure to compute the matrix-vector product using SIMT math instructions.

Definition: gemv.h:50
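For orientation, the operation being computed is an ordinary matrix-vector product. The sketch below is a plain host-side reference in standard C++, assuming a dense row-major matrix; `reference_gemv` is a hypothetical helper, not part of CUTLASS, and it says nothing about how the threadblock-scoped kernel tiles or batches the work.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side reference: y = A * x, where A is a dense rows-by-cols matrix
// stored in row-major order and x has cols elements.
template <typename Element>
std::vector<Element> reference_gemv(const std::vector<Element>& A,
                                    const std::vector<Element>& x,
                                    std::size_t rows,
                                    std::size_t cols) {
  assert(A.size() == rows * cols);
  assert(x.size() == cols);

  std::vector<Element> y(rows);
  for (std::size_t r = 0; r < rows; ++r) {
    Element acc = Element(0);
    for (std::size_t c = 0; c < cols; ++c) {
      acc += A[r * cols + c] * x[c];  // row r of A dotted with x
    }
    y[r] = acc;
  }
  return y;
}
```

A reference like this is mainly useful for checking the output of a device kernel on small problem sizes.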

cutlass::gemm::kernel::DefaultGemv::LayoutA

LayoutA_ LayoutA

Layout of multiplicand A.

Definition: default_gemv.h:69

cutlass::gemm::threadblock::DefaultGemvCore

Definition: default_gemv_core.h:68

cutlass::platform::conditional

std::conditional (true specialization)

Definition: platform.h:325

cutlass::gemm::kernel::DefaultGemv

Definition: default_gemv.h:57

cutlass::gemm::kernel::DefaultGemv::IteratorPolicyCD

typename platform::conditional< platform::is_same< LayoutCD, layout::RowMajor >::value, cutlass::transform::PitchLinearTilePolicyStripminedThreadContiguous< layout::PitchLinearShape< ThreadBlockShape::kN, ThreadBlockShape::kM >, Core::kThreadsPerN, ThreadShape::kN >, cutlass::transform::PitchLinearTilePolicyStripminedThreadStrided< layout::PitchLinearShape< ThreadBlockShape::kM, ThreadBlockShape::kN >, Core::kThreadsPerN, ThreadShape::kM >>::type IteratorPolicyCD

Policy for the iterator that reads/writes C/D.

Definition: default_gemv.h:109

cutlass::gemm::threadblock::Gemv::IteratorA

typename Core_::IteratorA IteratorA

Iterates over A in global memory.

Definition: gemv.h:58

cutlass::transform::threadblock::PredicatedTileIterator

Definition: transform/threadblock/predicated_tile_iterator.h:133

default_gemv_core.h

Defines basic properties needed by CTA-level batched GEMV assuming expectations about data layout of ...

cutlass::gemm::threadblock::Gemv::IteratorB

typename Core_::IteratorB IteratorB

Iterates over B in global memory.

Definition: gemv.h:61

cutlass::gemm::threadblock::GemvBatchedStridedThreadblockDefaultSwizzle

Threadblock swizzling function for batched GEMVs.

Definition: gemm/threadblock/threadblock_swizzle.h:296
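A threadblock swizzle decides which tile of the problem each thread block works on. The sketch below shows the simplest possible version of that idea, assuming one grid dimension indexes tiles along N and the other indexes batch tiles, with an identity mapping from block index to tile. `GridShape`, `TileCoord`, `ceil_div`, `grid_shape`, and `tile_offset` are all hypothetical names, and this is not a description of what `GemvBatchedStridedThreadblockDefaultSwizzle` actually does.

```cpp
#include <cassert>

// Hypothetical launch-grid extents: x counts tiles along N, y counts batch tiles.
struct GridShape {
  int x;
  int y;
};

// Hypothetical tile coordinate handed to one thread block.
struct TileCoord {
  int n;
  int batch;
};

// Number of tiles of size `tile` needed to cover `extent` elements.
inline int ceil_div(int extent, int tile) {
  assert(tile > 0);
  return (extent + tile - 1) / tile;
}

// Grid needed to cover a problem with `problem_n` columns and `problem_batch` batches.
inline GridShape grid_shape(int problem_n, int problem_batch, int tile_n, int tile_batch) {
  return GridShape{ceil_div(problem_n, tile_n), ceil_div(problem_batch, tile_batch)};
}

// Identity swizzle: block (x, y) works on tile (n = x, batch = y).
inline TileCoord tile_offset(int block_idx_x, int block_idx_y) {
  return TileCoord{block_idx_x, block_idx_y};
}
```

More elaborate swizzles permute the block indices before computing the tile coordinate, typically to improve cache locality between neighbouring blocks.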

gemv.h

Template for a threadblock-scoped GEMV kernel.

cutlass::gemm::kernel::DefaultGemv::IteratorA

typename ThreadBlockGemv::IteratorA IteratorA

Definition: default_gemv.h:98

threadblock_swizzle.h

Implements several possible threadblock-swizzling functions mapping blockIdx to GEMM problems...

cutlass::gemm::kernel::DefaultGemv::Core

typename cutlass::gemm::threadblock::DefaultGemvCore< ThreadBlockShape, ThreadShape, ElementA, LayoutA, ElementB, LayoutB, ElementAccumulator, LayoutAccumulator > Core

Definition: default_gemv.h:92

cutlass::gemm::kernel::DefaultGemv::ElementB

ElementB_ ElementB

Data type of multiplicand B.

Definition: default_gemv.h:72

cutlass::transform::PitchLinearTilePolicyStripminedThreadStrided

Definition: pitch_linear_thread_map.h:168

cutlass::gemm::kernel::DefaultGemv::LayoutB

LayoutB_ LayoutB

Layout of multiplicand B.

Definition: default_gemv.h:75

cutlass::transform::PitchLinearTilePolicyStripminedThreadContiguous

Definition: pitch_linear_thread_map.h:140

cutlass::gemm::kernel::DefaultGemv::ThreadBlockShape

ThreadBlockShape_ ThreadBlockShape

Shape of threadblock-level matrix operation (concept: GemmShape)

Definition: default_gemv.h:60

cutlass::gemm::kernel::DefaultGemv::ElementCD

ElementCD_ ElementCD

Data type of input/output matrix C/D.

Definition: default_gemv.h:84

