This file is automatically generated from the operator definition (def) files by a script. Do not modify it directly; edit the operator definitions instead.
Each operator input or output is differentiable, non-differentiable, or of undefined differentiability. If a variable's differentiability is not specified, it is undefined.
| Operator | Since version | |
|---|---|---|
| <a href="#Abs">Abs</a> | <a href="Changelog.md#Abs-13">13</a>, <a href="Changelog.md#Abs-6">6</a>, <a href="Changelog.md#Abs-1">1</a> | |
| <a href="#Acos">Acos</a> | <a href="Changelog.md#Acos-22">22</a>, <a href="Changelog.md#Acos-7">7</a> | |
| <a href="#Acosh">Acosh</a> | <a href="Changelog.md#Acosh-22">22</a>, <a href="Changelog.md#Acosh-9">9</a> | |
| <a href="#Add">Add</a> | <a href="Changelog.md#Add-14">14</a>, <a href="Changelog.md#Add-13">13</a>, <a href="Changelog.md#Add-7">7</a>, <a href="Changelog.md#Add-6">6</a>, <a href="Changelog.md#Add-1">1</a> | |
| <a href="#And">And</a> | <a href="Changelog.md#And-7">7</a>, <a href="Changelog.md#And-1">1</a> | |
| <a href="#ArgMax">ArgMax</a> | <a href="Changelog.md#ArgMax-13">13</a>, <a href="Changelog.md#ArgMax-12">12</a>, <a href="Changelog.md#ArgMax-11">11</a>, <a href="Changelog.md#ArgMax-1">1</a> | |
| <a href="#ArgMin">ArgMin</a> | <a href="Changelog.md#ArgMin-13">13</a>, <a href="Changelog.md#ArgMin-12">12</a>, <a href="Changelog.md#ArgMin-11">11</a>, <a href="Changelog.md#ArgMin-1">1</a> | |
| <a href="#Asin">Asin</a> | <a href="Changelog.md#Asin-22">22</a>, <a href="Changelog.md#Asin-7">7</a> | |
| <a href="#Asinh">Asinh</a> | <a href="Changelog.md#Asinh-22">22</a>, <a href="Changelog.md#Asinh-9">9</a> | |
| <a href="#Atan">Atan</a> | <a href="Changelog.md#Atan-22">22</a>, <a href="Changelog.md#Atan-7">7</a> | |
| <a href="#Atanh">Atanh</a> | <a href="Changelog.md#Atanh-22">22</a>, <a href="Changelog.md#Atanh-9">9</a> | |
| <a href="#AveragePool">AveragePool</a> | <a href="Changelog.md#AveragePool-22">22</a>, <a href="Changelog.md#AveragePool-19">19</a>, <a href="Changelog.md#AveragePool-11">11</a>, <a href="Changelog.md#AveragePool-10">10</a>, <a href="Changelog.md#AveragePool-7">7</a>, <a href="Changelog.md#AveragePool-1">1</a> | |
| <a href="#BatchNormalization">BatchNormalization</a> | <a href="Changelog.md#BatchNormalization-15">15</a>, <a href="Changelog.md#BatchNormalization-14">14</a>, <a href="Changelog.md#BatchNormalization-9">9</a>, <a href="Changelog.md#BatchNormalization-7">7</a>, <a href="Changelog.md#BatchNormalization-6">6</a>, <a href="Changelog.md#BatchNormalization-1">1</a> | |
| <a href="#BitCast">BitCast</a> | <a href="Changelog.md#BitCast-26">26</a> | |
| <a href="#BitShift">BitShift</a> | <a href="Changelog.md#BitShift-11">11</a> | |
| <a href="#BitwiseAnd">BitwiseAnd</a> | <a href="Changelog.md#BitwiseAnd-18">18</a> | |
| <a href="#BitwiseNot">BitwiseNot</a> | <a href="Changelog.md#BitwiseNot-18">18</a> | |
| <a href="#BitwiseOr">BitwiseOr</a> | <a href="Changelog.md#BitwiseOr-18">18</a> | |
| <a href="#BitwiseXor">BitwiseXor</a> | <a href="Changelog.md#BitwiseXor-18">18</a> | |
| <a href="#Cast">Cast</a> | <a href="Changelog.md#Cast-25">25</a>, <a href="Changelog.md#Cast-24">24</a>, <a href="Changelog.md#Cast-23">23</a>, <a href="Changelog.md#Cast-21">21</a>, <a href="Changelog.md#Cast-19">19</a>, <a href="Changelog.md#Cast-13">13</a>, <a href="Changelog.md#Cast-9">9</a>, <a href="Changelog.md#Cast-6">6</a>, <a href="Changelog.md#Cast-1">1</a> | |
| <a href="#Ceil">Ceil</a> | <a href="Changelog.md#Ceil-13">13</a>, <a href="Changelog.md#Ceil-6">6</a>, <a href="Changelog.md#Ceil-1">1</a> | |
| <a href="#Col2Im">Col2Im</a> | <a href="Changelog.md#Col2Im-18">18</a> | |
| <a href="#Compress">Compress</a> | <a href="Changelog.md#Compress-11">11</a>, <a href="Changelog.md#Compress-9">9</a> | |
| <a href="#Concat">Concat</a> | <a href="Changelog.md#Concat-13">13</a>, <a href="Changelog.md#Concat-11">11</a>, <a href="Changelog.md#Concat-4">4</a>, <a href="Changelog.md#Concat-1">1</a> | |
| <a href="#ConcatFromSequence">ConcatFromSequence</a> | <a href="Changelog.md#ConcatFromSequence-11">11</a> | |
| <a href="#Constant">Constant</a> | <a href="Changelog.md#Constant-25">25</a>, <a href="Changelog.md#Constant-24">24</a>, <a href="Changelog.md#Constant-23">23</a>, <a href="Changelog.md#Constant-21">21</a>, <a href="Changelog.md#Constant-19">19</a>, <a href="Changelog.md#Constant-13">13</a>, <a href="Changelog.md#Constant-12">12</a>, <a href="Changelog.md#Constant-11">11</a>, <a href="Changelog.md#Constant-9">9</a>, <a href="Changelog.md#Constant-1">1</a> | |
| <a href="#ConstantOfShape">ConstantOfShape</a> | <a href="Changelog.md#ConstantOfShape-25">25</a>, <a href="Changelog.md#ConstantOfShape-24">24</a>, <a href="Changelog.md#ConstantOfShape-23">23</a>, <a href="Changelog.md#ConstantOfShape-21">21</a>, <a href="Changelog.md#ConstantOfShape-20">20</a>, <a href="Changelog.md#ConstantOfShape-9">9</a> | |
| <a href="#Conv">Conv</a> | <a href="Changelog.md#Conv-22">22</a>, <a href="Changelog.md#Conv-11">11</a>, <a href="Changelog.md#Conv-1">1</a> | |
| <a href="#ConvInteger">ConvInteger</a> | <a href="Changelog.md#ConvInteger-10">10</a> | |
| <a href="#ConvTranspose">ConvTranspose</a> | <a href="Changelog.md#ConvTranspose-22">22</a>, <a href="Changelog.md#ConvTranspose-11">11</a>, <a href="Changelog.md#ConvTranspose-1">1</a> | |
| <a href="#Cos">Cos</a> | <a href="Changelog.md#Cos-22">22</a>, <a href="Changelog.md#Cos-7">7</a> | |
| <a href="#Cosh">Cosh</a> | <a href="Changelog.md#Cosh-22">22</a>, <a href="Changelog.md#Cosh-9">9</a> | |
| <a href="#CumProd">CumProd</a> | <a href="Changelog.md#CumProd-26">26</a> | |
| <a href="#CumSum">CumSum</a> | <a href="Changelog.md#CumSum-14">14</a>, <a href="Changelog.md#CumSum-11">11</a> | |
| <a href="#DFT">DFT</a> | <a href="Changelog.md#DFT-20">20</a>, <a href="Changelog.md#DFT-17">17</a> | |
| <a href="#DeformConv">DeformConv</a> | <a href="Changelog.md#DeformConv-22">22</a>, <a href="Changelog.md#DeformConv-19">19</a> | |
| <a href="#DepthToSpace">DepthToSpace</a> | <a href="Changelog.md#DepthToSpace-13">13</a>, <a href="Changelog.md#DepthToSpace-11">11</a>, <a href="Changelog.md#DepthToSpace-1">1</a> | |
| <a href="#DequantizeLinear">DequantizeLinear</a> | <a href="Changelog.md#DequantizeLinear-25">25</a>, <a href="Changelog.md#DequantizeLinear-24">24</a>, <a href="Changelog.md#DequantizeLinear-23">23</a>, <a href="Changelog.md#DequantizeLinear-21">21</a>, <a href="Changelog.md#DequantizeLinear-19">19</a>, <a href="Changelog.md#DequantizeLinear-13">13</a>, <a href="Changelog.md#DequantizeLinear-10">10</a> | |
| <a href="#Det">Det</a> | <a href="Changelog.md#Det-22">22</a>, <a href="Changelog.md#Det-11">11</a> | |
| <a href="#Div">Div</a> | <a href="Changelog.md#Div-14">14</a>, <a href="Changelog.md#Div-13">13</a>, <a href="Changelog.md#Div-7">7</a>, <a href="Changelog.md#Div-6">6</a>, <a href="Changelog.md#Div-1">1</a> | |
| <a href="#Dropout">Dropout</a> | <a href="Changelog.md#Dropout-22">22</a>, <a href="Changelog.md#Dropout-13">13</a>, <a href="Changelog.md#Dropout-12">12</a>, <a href="Changelog.md#Dropout-10">10</a>, <a href="Changelog.md#Dropout-7">7</a>, <a href="Changelog.md#Dropout-6">6</a>, <a href="Changelog.md#Dropout-1">1</a> | |
| <a href="#Einsum">Einsum</a> | <a href="Changelog.md#Einsum-12">12</a> | |
| <a href="#Equal">Equal</a> | <a href="Changelog.md#Equal-19">19</a>, <a href="Changelog.md#Equal-13">13</a>, <a href="Changelog.md#Equal-11">11</a>, <a href="Changelog.md#Equal-7">7</a>, <a href="Changelog.md#Equal-1">1</a> | |
| <a href="#Erf">Erf</a> | <a href="Changelog.md#Erf-13">13</a>, <a href="Changelog.md#Erf-9">9</a> | |
| <a href="#Exp">Exp</a> | <a href="Changelog.md#Exp-13">13</a>, <a href="Changelog.md#Exp-6">6</a>, <a href="Changelog.md#Exp-1">1</a> | |
| <a href="#Expand">Expand</a> | <a href="Changelog.md#Expand-13">13</a>, <a href="Changelog.md#Expand-8">8</a> | |
| <a href="#EyeLike">EyeLike</a> | <a href="Changelog.md#EyeLike-22">22</a>, <a href="Changelog.md#EyeLike-9">9</a> | |
| <a href="#Flatten">Flatten</a> | <a href="Changelog.md#Flatten-25">25</a>, <a href="Changelog.md#Flatten-24">24</a>, <a href="Changelog.md#Flatten-23">23</a>, <a href="Changelog.md#Flatten-21">21</a>, <a href="Changelog.md#Flatten-13">13</a>, <a href="Changelog.md#Flatten-11">11</a>, <a href="Changelog.md#Flatten-9">9</a>, <a href="Changelog.md#Flatten-1">1</a> | |
| <a href="#Floor">Floor</a> | <a href="Changelog.md#Floor-13">13</a>, <a href="Changelog.md#Floor-6">6</a>, <a href="Changelog.md#Floor-1">1</a> | |
| <a href="#GRU">GRU</a> | <a href="Changelog.md#GRU-22">22</a>, <a href="Changelog.md#GRU-14">14</a>, <a href="Changelog.md#GRU-7">7</a>, <a href="Changelog.md#GRU-3">3</a>, <a href="Changelog.md#GRU-1">1</a> | |
| <a href="#Gather">Gather</a> | <a href="Changelog.md#Gather-13">13</a>, <a href="Changelog.md#Gather-11">11</a>, <a href="Changelog.md#Gather-1">1</a> | |
| <a href="#GatherElements">GatherElements</a> | <a href="Changelog.md#GatherElements-13">13</a>, <a href="Changelog.md#GatherElements-11">11</a> | |
| <a href="#GatherND">GatherND</a> | <a href="Changelog.md#GatherND-13">13</a>, <a href="Changelog.md#GatherND-12">12</a>, <a href="Changelog.md#GatherND-11">11</a> | |
| <a href="#Gemm">Gemm</a> | <a href="Changelog.md#Gemm-13">13</a>, <a href="Changelog.md#Gemm-11">11</a>, <a href="Changelog.md#Gemm-9">9</a>, <a href="Changelog.md#Gemm-7">7</a>, <a href="Changelog.md#Gemm-6">6</a>, <a href="Changelog.md#Gemm-1">1</a> | |
| <a href="#GlobalAveragePool">GlobalAveragePool</a> | <a href="Changelog.md#GlobalAveragePool-22">22</a>, <a href="Changelog.md#GlobalAveragePool-1">1</a> | |
| <a href="#GlobalLpPool">GlobalLpPool</a> | <a href="Changelog.md#GlobalLpPool-22">22</a>, <a href="Changelog.md#GlobalLpPool-2">2</a>, <a href="Changelog.md#GlobalLpPool-1">1</a> | |
| <a href="#GlobalMaxPool">GlobalMaxPool</a> | <a href="Changelog.md#GlobalMaxPool-22">22</a>, <a href="Changelog.md#GlobalMaxPool-1">1</a> | |
| <a href="#Greater">Greater</a> | <a href="Changelog.md#Greater-13">13</a>, <a href="Changelog.md#Greater-9">9</a>, <a href="Changelog.md#Greater-7">7</a>, <a href="Changelog.md#Greater-1">1</a> | |
| <a href="#GridSample">GridSample</a> | <a href="Changelog.md#GridSample-22">22</a>, <a href="Changelog.md#GridSample-20">20</a>, <a href="Changelog.md#GridSample-16">16</a> | |
| <a href="#Hardmax">Hardmax</a> | <a href="Changelog.md#Hardmax-13">13</a>, <a href="Changelog.md#Hardmax-11">11</a>, <a href="Changelog.md#Hardmax-1">1</a> | |
| <a href="#Identity">Identity</a> | <a href="Changelog.md#Identity-25">25</a>, <a href="Changelog.md#Identity-24">24</a>, <a href="Changelog.md#Identity-23">23</a>, <a href="Changelog.md#Identity-21">21</a>, <a href="Changelog.md#Identity-19">19</a>, <a href="Changelog.md#Identity-16">16</a>, <a href="Changelog.md#Identity-14">14</a>, <a href="Changelog.md#Identity-13">13</a>, <a href="Changelog.md#Identity-1">1</a> | |
| <a href="#If">If</a> | <a href="Changelog.md#If-25">25</a>, <a href="Changelog.md#If-24">24</a>, <a href="Changelog.md#If-23">23</a>, <a href="Changelog.md#If-21">21</a>, <a href="Changelog.md#If-19">19</a>, <a href="Changelog.md#If-16">16</a>, <a href="Changelog.md#If-13">13</a>, <a href="Changelog.md#If-11">11</a>, <a href="Changelog.md#If-1">1</a> | |
| <a href="#ImageDecoder">ImageDecoder</a> | <a href="Changelog.md#ImageDecoder-20">20</a> | |
| <a href="#InstanceNormalization">InstanceNormalization</a> | <a href="Changelog.md#InstanceNormalization-22">22</a>, <a href="Changelog.md#InstanceNormalization-6">6</a>, <a href="Changelog.md#InstanceNormalization-1">1</a> | |
| <a href="#IsInf">IsInf</a> | <a href="Changelog.md#IsInf-20">20</a>, <a href="Changelog.md#IsInf-10">10</a> | |
| <a href="#IsNaN">IsNaN</a> | <a href="Changelog.md#IsNaN-20">20</a>, <a href="Changelog.md#IsNaN-13">13</a>, <a href="Changelog.md#IsNaN-9">9</a> | |
| <a href="#LRN">LRN</a> | <a href="Changelog.md#LRN-13">13</a>, <a href="Changelog.md#LRN-1">1</a> | |
| <a href="#LSTM">LSTM</a> | <a href="Changelog.md#LSTM-22">22</a>, <a href="Changelog.md#LSTM-14">14</a>, <a href="Changelog.md#LSTM-7">7</a>, <a href="Changelog.md#LSTM-1">1</a> | |
| <a href="#Less">Less</a> | <a href="Changelog.md#Less-13">13</a>, <a href="Changelog.md#Less-9">9</a>, <a href="Changelog.md#Less-7">7</a>, <a href="Changelog.md#Less-1">1</a> | |
| <a href="#Log">Log</a> | <a href="Changelog.md#Log-13">13</a>, <a href="Changelog.md#Log-6">6</a>, <a href="Changelog.md#Log-1">1</a> | |
| <a href="#Loop">Loop</a> | <a href="Changelog.md#Loop-25">25</a>, <a href="Changelog.md#Loop-24">24</a>, <a href="Changelog.md#Loop-23">23</a>, <a href="Changelog.md#Loop-21">21</a>, <a href="Changelog.md#Loop-19">19</a>, <a href="Changelog.md#Loop-16">16</a>, <a href="Changelog.md#Loop-13">13</a>, <a href="Changelog.md#Loop-11">11</a>, <a href="Changelog.md#Loop-1">1</a> | |
| <a href="#LpNormalization">LpNormalization</a> | <a href="Changelog.md#LpNormalization-22">22</a>, <a href="Changelog.md#LpNormalization-1">1</a> | |
| <a href="#LpPool">LpPool</a> | <a href="Changelog.md#LpPool-22">22</a>, <a href="Changelog.md#LpPool-18">18</a>, <a href="Changelog.md#LpPool-11">11</a>, <a href="Changelog.md#LpPool-2">2</a>, <a href="Changelog.md#LpPool-1">1</a> | |
| <a href="#MatMul">MatMul</a> | <a href="Changelog.md#MatMul-13">13</a>, <a href="Changelog.md#MatMul-9">9</a>, <a href="Changelog.md#MatMul-1">1</a> | |
| <a href="#MatMulInteger">MatMulInteger</a> | <a href="Changelog.md#MatMulInteger-10">10</a> | |
| <a href="#Max">Max</a> | <a href="Changelog.md#Max-13">13</a>, <a href="Changelog.md#Max-12">12</a>, <a href="Changelog.md#Max-8">8</a>, <a href="Changelog.md#Max-6">6</a>, <a href="Changelog.md#Max-1">1</a> | |
| <a href="#MaxPool">MaxPool</a> | <a href="Changelog.md#MaxPool-22">22</a>, <a href="Changelog.md#MaxPool-12">12</a>, <a href="Changelog.md#MaxPool-11">11</a>, <a href="Changelog.md#MaxPool-10">10</a>, <a href="Changelog.md#MaxPool-8">8</a>, <a href="Changelog.md#MaxPool-1">1</a> | |
| <a href="#MaxRoiPool">MaxRoiPool</a> | <a href="Changelog.md#MaxRoiPool-22">22</a>, <a href="Changelog.md#MaxRoiPool-1">1</a> | |
| <a href="#MaxUnpool">MaxUnpool</a> | <a href="Changelog.md#MaxUnpool-22">22</a>, <a href="Changelog.md#MaxUnpool-11">11</a>, <a href="Changelog.md#MaxUnpool-9">9</a> | |
| <a href="#Mean">Mean</a> | <a href="Changelog.md#Mean-13">13</a>, <a href="Changelog.md#Mean-8">8</a>, <a href="Changelog.md#Mean-6">6</a>, <a href="Changelog.md#Mean-1">1</a> | |
| <a href="#MelWeightMatrix">MelWeightMatrix</a> | <a href="Changelog.md#MelWeightMatrix-17">17</a> | |
| <a href="#Min">Min</a> | <a href="Changelog.md#Min-13">13</a>, <a href="Changelog.md#Min-12">12</a>, <a href="Changelog.md#Min-8">8</a>, <a href="Changelog.md#Min-6">6</a>, <a href="Changelog.md#Min-1">1</a> | |
| <a href="#Mod">Mod</a> | <a href="Changelog.md#Mod-13">13</a>, <a href="Changelog.md#Mod-10">10</a> | |
| <a href="#Mul">Mul</a> | <a href="Changelog.md#Mul-14">14</a>, <a href="Changelog.md#Mul-13">13</a>, <a href="Changelog.md#Mul-7">7</a>, <a href="Changelog.md#Mul-6">6</a>, <a href="Changelog.md#Mul-1">1</a> | |
| <a href="#Multinomial">Multinomial</a> | <a href="Changelog.md#Multinomial-22">22</a>, <a href="Changelog.md#Multinomial-7">7</a> | |
| <a href="#Neg">Neg</a> | <a href="Changelog.md#Neg-13">13</a>, <a href="Changelog.md#Neg-6">6</a>, <a href="Changelog.md#Neg-1">1</a> | |
| <a href="#NonMaxSuppression">NonMaxSuppression</a> | <a href="Changelog.md#NonMaxSuppression-11">11</a>, <a href="Changelog.md#NonMaxSuppression-10">10</a> | |
| <a href="#NonZero">NonZero</a> | <a href="Changelog.md#NonZero-13">13</a>, <a href="Changelog.md#NonZero-9">9</a> | |
| <a href="#Not">Not</a> | <a href="Changelog.md#Not-1">1</a> | |
| <a href="#OneHot">OneHot</a> | <a href="Changelog.md#OneHot-11">11</a>, <a href="Changelog.md#OneHot-9">9</a> | |
| <a href="#Optional">Optional</a> | <a href="Changelog.md#Optional-15">15</a> | |
| <a href="#OptionalGetElement">OptionalGetElement</a> | <a href="Changelog.md#OptionalGetElement-18">18</a>, <a href="Changelog.md#OptionalGetElement-15">15</a> | |
| <a href="#OptionalHasElement">OptionalHasElement</a> | <a href="Changelog.md#OptionalHasElement-18">18</a>, <a href="Changelog.md#OptionalHasElement-15">15</a> | |
| <a href="#Or">Or</a> | <a href="Changelog.md#Or-7">7</a>, <a href="Changelog.md#Or-1">1</a> | |
| <a href="#Pad">Pad</a> | <a href="Changelog.md#Pad-25">25</a>, <a href="Changelog.md#Pad-24">24</a>, <a href="Changelog.md#Pad-23">23</a>, <a href="Changelog.md#Pad-21">21</a>, <a href="Changelog.md#Pad-19">19</a>, <a href="Changelog.md#Pad-18">18</a>, <a href="Changelog.md#Pad-13">13</a>, <a href="Changelog.md#Pad-11">11</a>, <a href="Changelog.md#Pad-2">2</a>, <a href="Changelog.md#Pad-1">1</a> | |
| <a href="#Pow">Pow</a> | <a href="Changelog.md#Pow-15">15</a>, <a href="Changelog.md#Pow-13">13</a>, <a href="Changelog.md#Pow-12">12</a>, <a href="Changelog.md#Pow-7">7</a>, <a href="Changelog.md#Pow-1">1</a> | |
| <a href="#QLinearConv">QLinearConv</a> | <a href="Changelog.md#QLinearConv-10">10</a> | |
| <a href="#QLinearMatMul">QLinearMatMul</a> | <a href="Changelog.md#QLinearMatMul-21">21</a>, <a href="Changelog.md#QLinearMatMul-10">10</a> | |
| <a href="#QuantizeLinear">QuantizeLinear</a> | <a href="Changelog.md#QuantizeLinear-25">25</a>, <a href="Changelog.md#QuantizeLinear-24">24</a>, <a href="Changelog.md#QuantizeLinear-23">23</a>, <a href="Changelog.md#QuantizeLinear-21">21</a>, <a href="Changelog.md#QuantizeLinear-19">19</a>, <a href="Changelog.md#QuantizeLinear-13">13</a>, <a href="Changelog.md#QuantizeLinear-10">10</a> | |
| <a href="#RNN">RNN</a> | <a href="Changelog.md#RNN-22">22</a>, <a href="Changelog.md#RNN-14">14</a>, <a href="Changelog.md#RNN-7">7</a>, <a href="Changelog.md#RNN-1">1</a> | |
| <a href="#RandomNormal">RandomNormal</a> | <a href="Changelog.md#RandomNormal-22">22</a>, <a href="Changelog.md#RandomNormal-1">1</a> | |
| <a href="#RandomNormalLike">RandomNormalLike</a> | <a href="Changelog.md#RandomNormalLike-22">22</a>, <a href="Changelog.md#RandomNormalLike-1">1</a> | |
| <a href="#RandomUniform">RandomUniform</a> | <a href="Changelog.md#RandomUniform-22">22</a>, <a href="Changelog.md#RandomUniform-1">1</a> | |
| <a href="#RandomUniformLike">RandomUniformLike</a> | <a href="Changelog.md#RandomUniformLike-22">22</a>, <a href="Changelog.md#RandomUniformLike-1">1</a> | |
| <a href="#Reciprocal">Reciprocal</a> | <a href="Changelog.md#Reciprocal-13">13</a>, <a href="Changelog.md#Reciprocal-6">6</a>, <a href="Changelog.md#Reciprocal-1">1</a> | |
| <a href="#ReduceMax">ReduceMax</a> | <a href="Changelog.md#ReduceMax-20">20</a>, <a href="Changelog.md#ReduceMax-18">18</a>, <a href="Changelog.md#ReduceMax-13">13</a>, <a href="Changelog.md#ReduceMax-12">12</a>, <a href="Changelog.md#ReduceMax-11">11</a>, <a href="Changelog.md#ReduceMax-1">1</a> | |
| <a href="#ReduceMean">ReduceMean</a> | <a href="Changelog.md#ReduceMean-18">18</a>, <a href="Changelog.md#ReduceMean-13">13</a>, <a href="Changelog.md#ReduceMean-11">11</a>, <a href="Changelog.md#ReduceMean-1">1</a> | |
| <a href="#ReduceMin">ReduceMin</a> | <a href="Changelog.md#ReduceMin-20">20</a>, <a href="Changelog.md#ReduceMin-18">18</a>, <a href="Changelog.md#ReduceMin-13">13</a>, <a href="Changelog.md#ReduceMin-12">12</a>, <a href="Changelog.md#ReduceMin-11">11</a>, <a href="Changelog.md#ReduceMin-1">1</a> | |
| <a href="#ReduceProd">ReduceProd</a> | <a href="Changelog.md#ReduceProd-18">18</a>, <a href="Changelog.md#ReduceProd-13">13</a>, <a href="Changelog.md#ReduceProd-11">11</a>, <a href="Changelog.md#ReduceProd-1">1</a> | |
| <a href="#ReduceSum">ReduceSum</a> | <a href="Changelog.md#ReduceSum-13">13</a>, <a href="Changelog.md#ReduceSum-11">11</a>, <a href="Changelog.md#ReduceSum-1">1</a> | |
| <a href="#RegexFullMatch">RegexFullMatch</a> | <a href="Changelog.md#RegexFullMatch-20">20</a> | |
| <a href="#Reshape">Reshape</a> | <a href="Changelog.md#Reshape-25">25</a>, <a href="Changelog.md#Reshape-24">24</a>, <a href="Changelog.md#Reshape-23">23</a>, <a href="Changelog.md#Reshape-21">21</a>, <a href="Changelog.md#Reshape-19">19</a>, <a href="Changelog.md#Reshape-14">14</a>, <a href="Changelog.md#Reshape-13">13</a>, <a href="Changelog.md#Reshape-5">5</a>, <a href="Changelog.md#Reshape-1">1</a> | |
| <a href="#Resize">Resize</a> | <a href="Changelog.md#Resize-19">19</a>, <a href="Changelog.md#Resize-18">18</a>, <a href="Changelog.md#Resize-13">13</a>, <a href="Changelog.md#Resize-11">11</a>, <a href="Changelog.md#Resize-10">10</a> | |
| <a href="#ReverseSequence">ReverseSequence</a> | <a href="Changelog.md#ReverseSequence-10">10</a> | |
| <a href="#RoiAlign">RoiAlign</a> | <a href="Changelog.md#RoiAlign-22">22</a>, <a href="Changelog.md#RoiAlign-16">16</a>, <a href="Changelog.md#RoiAlign-10">10</a> | |
| <a href="#Round">Round</a> | <a href="Changelog.md#Round-22">22</a>, <a href="Changelog.md#Round-11">11</a> | |
| <a href="#STFT">STFT</a> | <a href="Changelog.md#STFT-17">17</a> | |
| <a href="#Scan">Scan</a> | <a href="Changelog.md#Scan-25">25</a>, <a href="Changelog.md#Scan-24">24</a>, <a href="Changelog.md#Scan-23">23</a>, <a href="Changelog.md#Scan-21">21</a>, <a href="Changelog.md#Scan-19">19</a>, <a href="Changelog.md#Scan-16">16</a>, <a href="Changelog.md#Scan-11">11</a>, <a href="Changelog.md#Scan-9">9</a>, <a href="Changelog.md#Scan-8">8</a> | |
| <a href="#Scatter">Scatter</a> (deprecated) | <a href="Changelog.md#Scatter-11">11</a>, <a href="Changelog.md#Scatter-9">9</a> | |
| <a href="#ScatterElements">ScatterElements</a> | <a href="Changelog.md#ScatterElements-18">18</a>, <a href="Changelog.md#ScatterElements-16">16</a>, <a href="Changelog.md#ScatterElements-13">13</a>, <a href="Changelog.md#ScatterElements-11">11</a> | |
| <a href="#ScatterND">ScatterND</a> | <a href="Changelog.md#ScatterND-18">18</a>, <a href="Changelog.md#ScatterND-16">16</a>, <a href="Changelog.md#ScatterND-13">13</a>, <a href="Changelog.md#ScatterND-11">11</a> | |
| <a href="#SequenceAt">SequenceAt</a> | <a href="Changelog.md#SequenceAt-11">11</a> | |
| <a href="#SequenceConstruct">SequenceConstruct</a> | <a href="Changelog.md#SequenceConstruct-11">11</a> | |
| <a href="#SequenceEmpty">SequenceEmpty</a> | <a href="Changelog.md#SequenceEmpty-11">11</a> | |
| <a href="#SequenceErase">SequenceErase</a> | <a href="Changelog.md#SequenceErase-11">11</a> | |
| <a href="#SequenceInsert">SequenceInsert</a> | <a href="Changelog.md#SequenceInsert-11">11</a> | |
| <a href="#SequenceLength">SequenceLength</a> | <a href="Changelog.md#SequenceLength-11">11</a> | |
| <a href="#Shape">Shape</a> | <a href="Changelog.md#Shape-25">25</a>, <a href="Changelog.md#Shape-24">24</a>, <a href="Changelog.md#Shape-23">23</a>, <a href="Changelog.md#Shape-21">21</a>, <a href="Changelog.md#Shape-19">19</a>, <a href="Changelog.md#Shape-15">15</a>, <a href="Changelog.md#Shape-13">13</a>, <a href="Changelog.md#Shape-1">1</a> | |
| <a href="#Sigmoid">Sigmoid</a> | <a href="Changelog.md#Sigmoid-13">13</a>, <a href="Changelog.md#Sigmoid-6">6</a>, <a href="Changelog.md#Sigmoid-1">1</a> | |
| <a href="#Sign">Sign</a> | <a href="Changelog.md#Sign-13">13</a>, <a href="Changelog.md#Sign-9">9</a> | |
| <a href="#Sin">Sin</a> | <a href="Changelog.md#Sin-22">22</a>, <a href="Changelog.md#Sin-7">7</a> | |
| <a href="#Sinh">Sinh</a> | <a href="Changelog.md#Sinh-22">22</a>, <a href="Changelog.md#Sinh-9">9</a> | |
| <a href="#Size">Size</a> | <a href="Changelog.md#Size-25">25</a>, <a href="Changelog.md#Size-24">24</a>, <a href="Changelog.md#Size-23">23</a>, <a href="Changelog.md#Size-21">21</a>, <a href="Changelog.md#Size-19">19</a>, <a href="Changelog.md#Size-13">13</a>, <a href="Changelog.md#Size-1">1</a> | |
| <a href="#Slice">Slice</a> | <a href="Changelog.md#Slice-13">13</a>, <a href="Changelog.md#Slice-11">11</a>, <a href="Changelog.md#Slice-10">10</a>, <a href="Changelog.md#Slice-1">1</a> | |
| <a href="#SpaceToDepth">SpaceToDepth</a> | <a href="Changelog.md#SpaceToDepth-13">13</a>, <a href="Changelog.md#SpaceToDepth-1">1</a> | |
| <a href="#Split">Split</a> | <a href="Changelog.md#Split-18">18</a>, <a href="Changelog.md#Split-13">13</a>, <a href="Changelog.md#Split-11">11</a>, <a href="Changelog.md#Split-2">2</a>, <a href="Changelog.md#Split-1">1</a> | |
| <a href="#SplitToSequence">SplitToSequence</a> | <a href="Changelog.md#SplitToSequence-24">24</a>, <a href="Changelog.md#SplitToSequence-11">11</a> | |
| <a href="#Sqrt">Sqrt</a> | <a href="Changelog.md#Sqrt-13">13</a>, <a href="Changelog.md#Sqrt-6">6</a>, <a href="Changelog.md#Sqrt-1">1</a> | |
| <a href="#Squeeze">Squeeze</a> | <a href="Changelog.md#Squeeze-25">25</a>, <a href="Changelog.md#Squeeze-24">24</a>, <a href="Changelog.md#Squeeze-23">23</a>, <a href="Changelog.md#Squeeze-21">21</a>, <a href="Changelog.md#Squeeze-13">13</a>, <a href="Changelog.md#Squeeze-11">11</a>, <a href="Changelog.md#Squeeze-1">1</a> | |
| <a href="#StringConcat">StringConcat</a> | <a href="Changelog.md#StringConcat-20">20</a> | |
| <a href="#StringNormalizer">StringNormalizer</a> | <a href="Changelog.md#StringNormalizer-10">10</a> | |
| <a href="#StringSplit">StringSplit</a> | <a href="Changelog.md#StringSplit-20">20</a> | |
| <a href="#Sub">Sub</a> | <a href="Changelog.md#Sub-14">14</a>, <a href="Changelog.md#Sub-13">13</a>, <a href="Changelog.md#Sub-7">7</a>, <a href="Changelog.md#Sub-6">6</a>, <a href="Changelog.md#Sub-1">1</a> | |
| <a href="#Sum">Sum</a> | <a href="Changelog.md#Sum-13">13</a>, <a href="Changelog.md#Sum-8">8</a>, <a href="Changelog.md#Sum-6">6</a>, <a href="Changelog.md#Sum-1">1</a> | |
| <a href="#Tan">Tan</a> | <a href="Changelog.md#Tan-22">22</a>, <a href="Changelog.md#Tan-7">7</a> | |
| <a href="#Tanh">Tanh</a> | <a href="Changelog.md#Tanh-13">13</a>, <a href="Changelog.md#Tanh-6">6</a>, <a href="Changelog.md#Tanh-1">1</a> | |
| <a href="#TensorScatter">TensorScatter</a> | <a href="Changelog.md#TensorScatter-24">24</a> | |
| <a href="#TfIdfVectorizer">TfIdfVectorizer</a> | <a href="Changelog.md#TfIdfVectorizer-9">9</a> | |
| <a href="#Tile">Tile</a> | <a href="Changelog.md#Tile-13">13</a>, <a href="Changelog.md#Tile-6">6</a>, <a href="Changelog.md#Tile-1">1</a> | |
| <a href="#TopK">TopK</a> | <a href="Changelog.md#TopK-24">24</a>, <a href="Changelog.md#TopK-11">11</a>, <a href="Changelog.md#TopK-10">10</a>, <a href="Changelog.md#TopK-1">1</a> | |
| <a href="#Transpose">Transpose</a> | <a href="Changelog.md#Transpose-25">25</a>, <a href="Changelog.md#Transpose-24">24</a>, <a href="Changelog.md#Transpose-23">23</a>, <a href="Changelog.md#Transpose-21">21</a>, <a href="Changelog.md#Transpose-13">13</a>, <a href="Changelog.md#Transpose-1">1</a> | |
| <a href="#Trilu">Trilu</a> | <a href="Changelog.md#Trilu-14">14</a> | |
| <a href="#Unique">Unique</a> | <a href="Changelog.md#Unique-11">11</a> | |
| <a href="#Unsqueeze">Unsqueeze</a> | <a href="Changelog.md#Unsqueeze-25">25</a>, <a href="Changelog.md#Unsqueeze-24">24</a>, <a href="Changelog.md#Unsqueeze-23">23</a>, <a href="Changelog.md#Unsqueeze-21">21</a>, <a href="Changelog.md#Unsqueeze-13">13</a>, <a href="Changelog.md#Unsqueeze-11">11</a>, <a href="Changelog.md#Unsqueeze-1">1</a> | |
| <a href="#Upsample">Upsample</a> (deprecated) | <a href="Changelog.md#Upsample-10">10</a>, <a href="Changelog.md#Upsample-9">9</a>, <a href="Changelog.md#Upsample-7">7</a> | |
| <a href="#Where">Where</a> | <a href="Changelog.md#Where-16">16</a>, <a href="Changelog.md#Where-9">9</a> | |
| <a href="#Xor">Xor</a> | <a href="Changelog.md#Xor-7">7</a>, <a href="Changelog.md#Xor-1">1</a> | |
| **Function** | **Since version** | **Function version** |
| <a href="#AffineGrid">AffineGrid</a> | <a href="Changelog.md#AffineGrid-20">20</a> | 20 |
| <a href="#Attention">Attention</a> | <a href="Changelog.md#Attention-24">24</a>, <a href="Changelog.md#Attention-23">23</a> | 24 |
| <a href="#Bernoulli">Bernoulli</a> | <a href="Changelog.md#Bernoulli-22">22</a>, <a href="Changelog.md#Bernoulli-15">15</a> | 22 |
| <a href="#BlackmanWindow">BlackmanWindow</a> | <a href="Changelog.md#BlackmanWindow-17">17</a> | 17 |
| <a href="#CastLike">CastLike</a> | <a href="Changelog.md#CastLike-25">25</a>, <a href="Changelog.md#CastLike-24">24</a>, <a href="Changelog.md#CastLike-23">23</a>, <a href="Changelog.md#CastLike-21">21</a>, <a href="Changelog.md#CastLike-19">19</a>, <a href="Changelog.md#CastLike-15">15</a> | 25 |
| <a href="#Celu">Celu</a> | <a href="Changelog.md#Celu-12">12</a> | 12 |
| <a href="#CenterCropPad">CenterCropPad</a> | <a href="Changelog.md#CenterCropPad-18">18</a> | 18 |
| <a href="#Clip">Clip</a> | <a href="Changelog.md#Clip-13">13</a>, <a href="Changelog.md#Clip-12">12</a>, <a href="Changelog.md#Clip-11">11</a>, <a href="Changelog.md#Clip-6">6</a>, <a href="Changelog.md#Clip-1">1</a> | 13 |
| <a href="#DynamicQuantizeLinear">DynamicQuantizeLinear</a> | <a href="Changelog.md#DynamicQuantizeLinear-11">11</a> | 11 |
| <a href="#Elu">Elu</a> | <a href="Changelog.md#Elu-22">22</a>, <a href="Changelog.md#Elu-6">6</a>, <a href="Changelog.md#Elu-1">1</a> | 18 |
| <a href="#Gelu">Gelu</a> | <a href="Changelog.md#Gelu-20">20</a> | 20 |
| <a href="#GreaterOrEqual">GreaterOrEqual</a> | <a href="Changelog.md#GreaterOrEqual-16">16</a>, <a href="Changelog.md#GreaterOrEqual-12">12</a> | 16 |
| <a href="#GroupNormalization">GroupNormalization</a> | <a href="Changelog.md#GroupNormalization-21">21</a>, <a href="Changelog.md#GroupNormalization-18">18</a> | 21 |
| <a href="#HammingWindow">HammingWindow</a> | <a href="Changelog.md#HammingWindow-17">17</a> | 17 |
| <a href="#HannWindow">HannWindow</a> | <a href="Changelog.md#HannWindow-17">17</a> | 17 |
| <a href="#HardSigmoid">HardSigmoid</a> | <a href="Changelog.md#HardSigmoid-22">22</a>, <a href="Changelog.md#HardSigmoid-6">6</a>, <a href="Changelog.md#HardSigmoid-1">1</a> | 18 |
| <a href="#HardSwish">HardSwish</a> | <a href="Changelog.md#HardSwish-22">22</a>, <a href="Changelog.md#HardSwish-14">14</a> | 22 |
| <a href="#LayerNormalization">LayerNormalization</a> | <a href="Changelog.md#LayerNormalization-17">17</a> | 17, 18 |
| <a href="#LeakyRelu">LeakyRelu</a> | <a href="Changelog.md#LeakyRelu-16">16</a>, <a href="Changelog.md#LeakyRelu-6">6</a>, <a href="Changelog.md#LeakyRelu-1">1</a> | 16 |
| <a href="#LessOrEqual">LessOrEqual</a> | <a href="Changelog.md#LessOrEqual-16">16</a>, <a href="Changelog.md#LessOrEqual-12">12</a> | 16 |
| <a href="#LogSoftmax">LogSoftmax</a> | <a href="Changelog.md#LogSoftmax-13">13</a>, <a href="Changelog.md#LogSoftmax-11">11</a>, <a href="Changelog.md#LogSoftmax-1">1</a> | 13, 18 |
| <a href="#MeanVarianceNormalization">MeanVarianceNormalization</a> | <a href="Changelog.md#MeanVarianceNormalization-13">13</a>, <a href="Changelog.md#MeanVarianceNormalization-9">9</a> | 13, 18 |
| <a href="#Mish">Mish</a> | <a href="Changelog.md#Mish-22">22</a>, <a href="Changelog.md#Mish-18">18</a> | 22 |
| <a href="#NegativeLogLikelihoodLoss">NegativeLogLikelihoodLoss</a> | <a href="Changelog.md#NegativeLogLikelihoodLoss-22">22</a>, <a href="Changelog.md#NegativeLogLikelihoodLoss-13">13</a>, <a href="Changelog.md#NegativeLogLikelihoodLoss-12">12</a> | 22 |
| <a href="#PRelu">PRelu</a> | <a href="Changelog.md#PRelu-16">16</a>, <a href="Changelog.md#PRelu-9">9</a>, <a href="Changelog.md#PRelu-7">7</a>, <a href="Changelog.md#PRelu-6">6</a>, <a href="Changelog.md#PRelu-1">1</a> | 16 |
| <a href="#RMSNormalization">RMSNormalization</a> | <a href="Changelog.md#RMSNormalization-23">23</a> | 23 |
| <a href="#Range">Range</a> | <a href="Changelog.md#Range-11">11</a> | 11 |
| <a href="#ReduceL1">ReduceL1</a> | <a href="Changelog.md#ReduceL1-18">18</a>, <a href="Changelog.md#ReduceL1-13">13</a>, <a href="Changelog.md#ReduceL1-11">11</a>, <a href="Changelog.md#ReduceL1-1">1</a> | 18 |
| <a href="#ReduceL2">ReduceL2</a> | <a href="Changelog.md#ReduceL2-18">18</a>, <a href="Changelog.md#ReduceL2-13">13</a>, <a href="Changelog.md#ReduceL2-11">11</a>, <a href="Changelog.md#ReduceL2-1">1</a> | 18 |
| <a href="#ReduceLogSum">ReduceLogSum</a> | <a href="Changelog.md#ReduceLogSum-18">18</a>, <a href="Changelog.md#ReduceLogSum-13">13</a>, <a href="Changelog.md#ReduceLogSum-11">11</a>, <a href="Changelog.md#ReduceLogSum-1">1</a> | 18 |
| <a href="#ReduceLogSumExp">ReduceLogSumExp</a> | <a href="Changelog.md#ReduceLogSumExp-18">18</a>, <a href="Changelog.md#ReduceLogSumExp-13">13</a>, <a href="Changelog.md#ReduceLogSumExp-11">11</a>, <a href="Changelog.md#ReduceLogSumExp-1">1</a> | 18 |
| <a href="#ReduceSumSquare">ReduceSumSquare</a> | <a href="Changelog.md#ReduceSumSquare-18">18</a>, <a href="Changelog.md#ReduceSumSquare-13">13</a>, <a href="Changelog.md#ReduceSumSquare-11">11</a>, <a href="Changelog.md#ReduceSumSquare-1">1</a> | 18 |
| <a href="#Relu">Relu</a> | <a href="Changelog.md#Relu-14">14</a>, <a href="Changelog.md#Relu-13">13</a>, <a href="Changelog.md#Relu-6">6</a>, <a href="Changelog.md#Relu-1">1</a> | 18 |
| <a href="#RotaryEmbedding">RotaryEmbedding</a> | <a href="Changelog.md#RotaryEmbedding-23">23</a> | 23 |
| <a href="#Selu">Selu</a> | <a href="Changelog.md#Selu-22">22</a>, <a href="Changelog.md#Selu-6">6</a>, <a href="Changelog.md#Selu-1">1</a> | 18 |
| <a href="#SequenceMap">SequenceMap</a> | <a href="Changelog.md#SequenceMap-17">17</a> | 17 |
| <a href="#Shrink">Shrink</a> | <a href="Changelog.md#Shrink-9">9</a> | 18 |
| <a href="#Softmax">Softmax</a> | <a href="Changelog.md#Softmax-13">13</a>, <a href="Changelog.md#Softmax-11">11</a>, <a href="Changelog.md#Softmax-1">1</a> | 13, 18 |
| <a href="#SoftmaxCrossEntropyLoss">SoftmaxCrossEntropyLoss</a> | <a href="Changelog.md#SoftmaxCrossEntropyLoss-13">13</a>, <a href="Changelog.md#SoftmaxCrossEntropyLoss-12">12</a> | 13 |
| <a href="#Softplus">Softplus</a> | <a href="Changelog.md#Softplus-22">22</a>, <a href="Changelog.md#Softplus-1">1</a> | 18 |
| <a href="#Softsign">Softsign</a> | <a href="Changelog.md#Softsign-22">22</a>, <a href="Changelog.md#Softsign-1">1</a> | 18 |
| <a href="#Swish">Swish</a> | <a href="Changelog.md#Swish-24">24</a> | 24 |
| <a href="#ThresholdedRelu">ThresholdedRelu</a> | <a href="Changelog.md#ThresholdedRelu-22">22</a>, <a href="Changelog.md#ThresholdedRelu-10">10</a> | 18 |
| Operator | Since version | |
|---|---|---|
| <a href="#ai.onnx.preview.training.Adagrad">ai.onnx.preview.training.Adagrad</a> | <a href="Changelog.md#ai.onnx.preview.training.Adagrad-1">1</a> | |
| <a href="#ai.onnx.preview.training.Adam">ai.onnx.preview.training.Adam</a> | <a href="Changelog.md#ai.onnx.preview.training.Adam-1">1</a> | |
| <a href="#ai.onnx.preview.training.Gradient">ai.onnx.preview.training.Gradient</a> | <a href="Changelog.md#ai.onnx.preview.training.Gradient-1">1</a> | |
| <a href="#ai.onnx.preview.training.Momentum">ai.onnx.preview.training.Momentum</a> | <a href="Changelog.md#ai.onnx.preview.training.Momentum-1">1</a> | |
Abs takes one input tensor (Tensor<T>) and produces one output tensor (Tensor<T>) in which the absolute value, y = abs(x), is applied elementwise.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Abs-1">1</a>, <a href="Changelog.md#Abs-6">6</a>
node = onnx.helper.make_node(
"Abs",
inputs=["x"],
outputs=["y"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.abs(x)
expect(node, inputs=[x], outputs=[y], name="test_abs")
# SPDX-License-Identifier: Apache-2.0
from __future__ import annotations
import numpy as np
def abs(input: np.ndarray) -> np.ndarray: # noqa: A001
return np.abs(input) # type: ignore[no-any-return]
Calculates the arccosine (inverse of cosine) of the given input tensor, element-wise.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Acos-7">7</a>
node = onnx.helper.make_node(
"Acos",
inputs=["x"],
outputs=["y"],
)
x = np.array([-0.5, 0, 0.5]).astype(np.float32)
y = np.arccos(x)
expect(node, inputs=[x], outputs=[y], name="test_acos_example")
x = np.random.rand(3, 4, 5).astype(np.float32)
y = np.arccos(x)
expect(node, inputs=[x], outputs=[y], name="test_acos")
Calculates the hyperbolic arccosine of the given input tensor element-wise.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Acosh-9">9</a>
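As a quick sanity check on the expected values quoted in the example that follows, the hyperbolic arccosine has the closed form ln(x + sqrt(x^2 - 1)) for x >= 1. The short sketch below is not part of the ONNX test suite; it only compares that identity against `np.arccosh` on the same inputs.

```python
import numpy as np

# Same inputs as the test_acosh_example case; all values are >= 1,
# which is the domain on which arccosh is real-valued.
x = np.array([10, np.e, 1], dtype=np.float64)

# Closed form of the inverse hyperbolic cosine.
closed_form = np.log(x + np.sqrt(x * x - 1.0))

print(np.allclose(closed_form, np.arccosh(x)))
```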
node = onnx.helper.make_node(
"Acosh",
inputs=["x"],
outputs=["y"],
)
x = np.array([10, np.e, 1]).astype(np.float32)
y = np.arccosh(x) # expected output [2.99322295, 1.65745449, 0.]
expect(node, inputs=[x], outputs=[y], name="test_acosh_example")
x = np.random.uniform(1.0, 10.0, (3, 4, 5)).astype(np.float32)
y = np.arccosh(x)
expect(node, inputs=[x], outputs=[y], name="test_acosh")
Performs element-wise binary addition (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check <a href="Broadcasting.md">the doc</a>.
(Opset 14 change): Extend supported types to include uint8, int8, uint16, and int16.
This version of the operator has been available since version 14 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Add-1">1</a>, <a href="Changelog.md#Add-6">6</a>, <a href="Changelog.md#Add-7">7</a>, <a href="Changelog.md#Add-13">13</a>
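To make "multidirectional broadcasting" concrete, here is a small NumPy sketch (an illustration only, not one of the ONNX test cases). Both operands are stretched: shapes are right-aligned, and every dimension of size 1 is expanded to match the other operand.

```python
import numpy as np

a = np.ones((3, 1, 5), dtype=np.float32)            # shape (3, 1, 5)
b = np.arange(4, dtype=np.float32).reshape(4, 1)    # shape    (4, 1)

# Right-aligned: (3, 1, 5) and (1, 4, 1) broadcast to (3, 4, 5).
# Neither input already has the output shape; both are expanded.
total = a + b

print(total.shape)
```

Here `total[i, j, k]` equals `1 + j`, because `a` contributes 1 everywhere and `b` varies only along the middle axis.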
node = onnx.helper.make_node(
"Add",
inputs=["x", "y"],
outputs=["sum"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(3, 4, 5).astype(np.float32)
expect(node, inputs=[x, y], outputs=[x + y], name="test_add")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.int8)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.int8)
expect(node, inputs=[x, y], outputs=[x + y], name="test_add_int8")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.int16)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.int16)
expect(node, inputs=[x, y], outputs=[x + y], name="test_add_int16")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8)
expect(node, inputs=[x, y], outputs=[x + y], name="test_add_uint8")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint16)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint16)
expect(node, inputs=[x, y], outputs=[x + y], name="test_add_uint16")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint32)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint32)
expect(node, inputs=[x, y], outputs=[x + y], name="test_add_uint32")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint64)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint64)
expect(node, inputs=[x, y], outputs=[x + y], name="test_add_uint64")
node = onnx.helper.make_node(
"Add",
inputs=["x", "y"],
outputs=["sum"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(5).astype(np.float32)
expect(node, inputs=[x, y], outputs=[x + y], name="test_add_bcast")
Generates a 2D or 3D flow field (sampling grid), given a batch of affine matrices theta
(https://pytorch.org/docs/stable/generated/torch.nn.functional.affine_grid.html).
An affine matrix theta is applied to a position tensor represented in its homogeneous expression. Here is an example in 3D:
  [r00, r01, r02, t0]   [x]   [x']
  [r10, r11, r12, t1] * [y] = [y']
  [r20, r21, r22, t2]   [z]   [z']
  [0,   0,   0,   1 ]   [1]   [1 ]
where (x, y, z) is the position in the original space, (x', y', z') is the position in the output space.
The last row is always [0, 0, 0, 1] and is not stored in the affine matrix. Therefore we have theta of shape (N, 2, 3) for 2D or (N, 3, 4) for 3D.
The size input is used to define a grid of positions evenly spaced in the original 2D or 3D space, with each dimension ranging from -1 to 1.
The output grid contains positions in the output space.
When align_corners=1, consider -1 and 1 to refer to the centers of the corner pixels (mark v in illustration).
  v            v            v            v
  |-------------------|------------------|
  -1                  0                  1
When align_corners=0, consider -1 and 1 to refer to the outer edge of the corner pixels.
       v         v        v         v
  |------------------|-------------------|
  -1                 0                   1
This version of the operator has been available since version 20 of the default ONNX operator set.
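The description above can be sketched in NumPy for the 2D case. This is an illustrative sketch, not the reference implementation: the function name `affine_grid_2d` is hypothetical, and the output layout `(N, H, W, 2)` with the last axis ordered `(x, y)` is an assumption taken from the PyTorch function linked above.

```python
import numpy as np

def affine_grid_2d(theta, size, align_corners=0):
    """theta: (N, 2, 3) affine matrices; size: (N, C, H, W).

    Returns a grid of shape (N, H, W, 2) holding (x', y') positions.
    """
    _, _, H, W = size

    def axis_positions(n):
        if align_corners:
            # -1 and 1 are the centers of the corner pixels.
            return np.linspace(-1.0, 1.0, n)
        # -1 and 1 are the outer edges, so pixel centers sit half a pixel inside.
        return (2.0 * np.arange(n) + 1.0) / n - 1.0

    xs = axis_positions(W)
    ys = axis_positions(H)
    gx, gy = np.meshgrid(xs, ys)                          # each of shape (H, W)
    base = np.stack([gx, gy, np.ones_like(gx)], axis=-1)  # homogeneous (x, y, 1)

    # grid[n, h, w, :] = theta[n] @ base[h, w, :]
    return np.einsum("nij,hwj->nhwi", theta, base)

# With the identity transform the output grid equals the original grid.
identity = np.array([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]])
grid = affine_grid_2d(identity, (1, 3, 2, 3), align_corners=1)
print(grid[0, 0, :, 0])  # x positions along the first row
```

The last row `[0, 0, 1]` of the affine matrix is never materialized, which is why `theta` has shape `(N, 2, 3)` and the product directly yields two output coordinates.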
theta_2d = create_theta_2d()
N, C, H, W = len(theta_2d), 3, 5, 6
data_size = (H, W)
for align_corners in (0, 1):
node = onnx.helper.make_node(
"AffineGrid",
inputs=["theta", "size"],
outputs=["grid"],
align_corners=align_corners,
)
original_grid = construct_original_grid(data_size, align_corners)
grid = apply_affine_transform(theta_2d, original_grid)
test_name = "test_affine_grid_2d"
if align_corners == 1:
test_name += "_align_corners"
expect(
node,
inputs=[theta_2d, np.array([N, C, H, W], dtype=np.int64)],
outputs=[grid],
name=test_name,
)
theta_3d = create_theta_3d()
N, C, D, H, W = len(theta_3d), 3, 4, 5, 6
data_size = (D, H, W)
for align_corners in (0, 1):
node = onnx.helper.make_node(
"AffineGrid",
inputs=["theta", "size"],
outputs=["grid"],
align_corners=align_corners,
)
original_grid = construct_original_grid(data_size, align_corners)
grid = apply_affine_transform(theta_3d, original_grid)
test_name = "test_affine_grid_3d"
if align_corners == 1:
test_name += "_align_corners"
expect(
node,
inputs=[theta_3d, np.array([N, C, D, H, W], dtype=np.int64)],
outputs=[grid],
name=test_name,
)
Returns the tensor resulting from performing the logical AND operation
elementwise on the input tensors A and B (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check <a href="Broadcasting.md">the doc</a>.
This version of the operator has been available since version 7 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#And-1">1</a>
node = onnx.helper.make_node(
"And",
inputs=["x", "y"],
outputs=["and"],
)
# 2d
x = (np.random.randn(3, 4) > 0).astype(bool)
y = (np.random.randn(3, 4) > 0).astype(bool)
z = np.logical_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_and2d")
# 3d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
y = (np.random.randn(3, 4, 5) > 0).astype(bool)
z = np.logical_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_and3d")
# 4d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
y = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
z = np.logical_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_and4d")
node = onnx.helper.make_node(
"And",
inputs=["x", "y"],
outputs=["and"],
)
# 3d vs 1d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
y = (np.random.randn(5) > 0).astype(bool)
z = np.logical_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_and_bcast3v1d")
# 3d vs 2d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
y = (np.random.randn(4, 5) > 0).astype(bool)
z = np.logical_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_and_bcast3v2d")
# 4d vs 2d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
y = (np.random.randn(5, 6) > 0).astype(bool)
z = np.logical_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_and_bcast4v2d")
# 4d vs 3d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
y = (np.random.randn(4, 5, 6) > 0).astype(bool)
z = np.logical_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_and_bcast4v3d")
# 4d vs 4d
x = (np.random.randn(1, 4, 1, 6) > 0).astype(bool)
y = (np.random.randn(3, 1, 5, 6) > 0).astype(bool)
z = np.logical_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_and_bcast4v4d")
Computes the indices of the max elements of the input tensor along the provided axis. The resulting tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, the reduced dimension is pruned from the resulting tensor. If select_last_index is True (default False), the index of the last occurrence of the max is selected when the max appears more than once in the input; otherwise the index of the first occurrence is selected. The type of the output tensor is integer.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#ArgMax-1">1</a>, <a href="Changelog.md#ArgMax-11">11</a>, <a href="Changelog.md#ArgMax-12">12</a>
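The examples that follow call two helpers, `argmax_use_numpy` and `argmax_use_numpy_select_last_index`, whose definitions are not shown in this section. A minimal sketch of what they could look like is given here, assuming they simply wrap `np.argmax`; the actual helpers in the ONNX test suite may differ in detail.

```python
import numpy as np

def argmax_use_numpy(data, axis=0, keepdims=1):
    # np.argmax returns the FIRST index on ties.
    result = np.argmax(data, axis=axis)
    if keepdims == 1:
        result = np.expand_dims(result, axis)
    return result.astype(np.int64)

def argmax_use_numpy_select_last_index(data, axis=0, keepdims=1):
    # Flip along the axis so the last occurrence becomes the first,
    # then map the index back to the original orientation.
    flipped = np.flip(data, axis)
    result = data.shape[axis] - 1 - np.argmax(flipped, axis=axis)
    if keepdims == 1:
        result = np.expand_dims(result, axis)
    return result.astype(np.int64)

data = np.array([[2, 2], [3, 10]], dtype=np.float32)
first = argmax_use_numpy(data, axis=1, keepdims=1)
last = argmax_use_numpy_select_last_index(data, axis=1, keepdims=1)
print(first.tolist(), last.tolist())
```

The two results differ only in the first row, where the maximum value 2 appears twice.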
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
keepdims = 1
node = onnx.helper.make_node(
"ArgMax", inputs=["data"], outputs=["result"], keepdims=keepdims
)
# result: [[1, 1]]
result = argmax_use_numpy(data, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmax_default_axis_example",
)
data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [1, 3, 4]
result = argmax_use_numpy(data, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmax_default_axis_random",
)
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
keepdims = 1
node = onnx.helper.make_node(
"ArgMax",
inputs=["data"],
outputs=["result"],
keepdims=keepdims,
select_last_index=True,
)
# result: [[1, 1]]
result = argmax_use_numpy_select_last_index(data, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmax_default_axis_example_select_last_index",
)
data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [1, 3, 4]
result = argmax_use_numpy_select_last_index(data, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmax_default_axis_random_select_last_index",
)
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
axis = 1
keepdims = 1
node = onnx.helper.make_node(
"ArgMax", inputs=["data"], outputs=["result"], axis=axis, keepdims=keepdims
)
# result: [[0], [1]]
result = argmax_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
node, inputs=[data], outputs=[result], name="test_argmax_keepdims_example"
)
data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 1, 4]
result = argmax_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
node, inputs=[data], outputs=[result], name="test_argmax_keepdims_random"
)
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
axis = 1
keepdims = 1
node = onnx.helper.make_node(
"ArgMax",
inputs=["data"],
outputs=["result"],
axis=axis,
keepdims=keepdims,
select_last_index=True,
)
# result: [[1], [1]]
result = argmax_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmax_keepdims_example_select_last_index",
)
data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 1, 4]
result = argmax_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmax_keepdims_random_select_last_index",
)
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
axis = -1
keepdims = 1
node = onnx.helper.make_node(
"ArgMax", inputs=["data"], outputs=["result"], axis=axis, keepdims=keepdims
)
# result: [[0], [1]]
result = argmax_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmax_negative_axis_keepdims_example",
)
data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 3, 1]
result = argmax_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmax_negative_axis_keepdims_random",
)
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
axis = -1
keepdims = 1
node = onnx.helper.make_node(
"ArgMax",
inputs=["data"],
outputs=["result"],
axis=axis,
keepdims=keepdims,
select_last_index=True,
)
# result: [[1], [1]]
result = argmax_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmax_negative_axis_keepdims_example_select_last_index",
)
data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 3, 1]
result = argmax_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmax_negative_axis_keepdims_random_select_last_index",
)
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
axis = 1
keepdims = 0
node = onnx.helper.make_node(
"ArgMax", inputs=["data"], outputs=["result"], axis=axis, keepdims=keepdims
)
# result: [0, 1]
result = argmax_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmax_no_keepdims_example",
)
data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 4]
result = argmax_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
node, inputs=[data], outputs=[result], name="test_argmax_no_keepdims_random"
)
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
axis = 1
keepdims = 0
node = onnx.helper.make_node(
"ArgMax",
inputs=["data"],
outputs=["result"],
axis=axis,
keepdims=keepdims,
select_last_index=True,
)
# result: [1, 1]
result = argmax_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmax_no_keepdims_example_select_last_index",
)
data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 4]
result = argmax_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmax_no_keepdims_random_select_last_index",
)
Computes the indices of the min elements of the input tensor along the provided axis. The resulting tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, the reduced dimension is pruned from the resulting tensor. If select_last_index is True (default False), the index of the last occurrence of the min is selected when the min appears more than once in the input; otherwise the index of the first occurrence is selected. The type of the output tensor is integer.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#ArgMin-1">1</a>, <a href="Changelog.md#ArgMin-11">11</a>, <a href="Changelog.md#ArgMin-12">12</a>
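How axis, keepdims and select_last_index interact can be sketched with a single NumPy function. The name `argmin_sketch` is hypothetical and is not the helper used by the examples in this section; it is only meant to show the resulting shapes and the tie-breaking rule.

```python
import numpy as np

def argmin_sketch(data, axis=0, keepdims=1, select_last_index=False):
    if select_last_index:
        # Search the reversed axis, then convert back to an original index.
        result = data.shape[axis] - 1 - np.argmin(np.flip(data, axis), axis=axis)
    else:
        result = np.argmin(data, axis=axis)
    if keepdims == 1:
        # Keep the reduced axis with length 1 so the rank is unchanged.
        result = np.expand_dims(result, axis)
    return result.astype(np.int64)

data = np.array([[2, 1], [3, 10]], dtype=np.float32)
default_axis = argmin_sketch(data)                   # reduces axis 0, shape (1, 2)
no_keepdims = argmin_sketch(data, axis=1, keepdims=0)  # shape (2,)
print(default_axis.shape, no_keepdims.shape)
```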
data = np.array([[2, 1], [3, 10]], dtype=np.float32)
keepdims = 1
node = onnx.helper.make_node(
"ArgMin", inputs=["data"], outputs=["result"], keepdims=keepdims
)
# result: [[0, 0]]
result = argmin_use_numpy(data, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmin_default_axis_example",
)
data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [1, 3, 4]
result = argmin_use_numpy(data, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmin_default_axis_random",
)
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
keepdims = 1
node = onnx.helper.make_node(
"ArgMin",
inputs=["data"],
outputs=["result"],
keepdims=keepdims,
select_last_index=True,
)
# result: [[0, 0]]
result = argmin_use_numpy_select_last_index(data, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmin_default_axis_example_select_last_index",
)
data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [1, 3, 4]
result = argmin_use_numpy_select_last_index(data, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmin_default_axis_random_select_last_index",
)
data = np.array([[2, 1], [3, 10]], dtype=np.float32)
axis = 1
keepdims = 1
node = onnx.helper.make_node(
"ArgMin", inputs=["data"], outputs=["result"], axis=axis, keepdims=keepdims
)
# The content of result is : [[1], [0]]
result = argmin_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
node, inputs=[data], outputs=[result], name="test_argmin_keepdims_example"
)
data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 1, 4]
result = argmin_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
node, inputs=[data], outputs=[result], name="test_argmin_keepdims_random"
)
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
axis = 1
keepdims = 1
node = onnx.helper.make_node(
"ArgMin",
inputs=["data"],
outputs=["result"],
axis=axis,
keepdims=keepdims,
select_last_index=True,
)
# result: [[1], [0]]
result = argmin_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmin_keepdims_example_select_last_index",
)
data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 1, 4]
result = argmin_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmin_keepdims_random_select_last_index",
)
data = np.array([[2, 1], [3, 10]], dtype=np.float32)
axis = -1
keepdims = 1
node = onnx.helper.make_node(
"ArgMin", inputs=["data"], outputs=["result"], axis=axis, keepdims=keepdims
)
# The content of result is : [[1], [0]]
result = argmin_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmin_negative_axis_keepdims_example",
)
data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 3, 1]
result = argmin_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmin_negative_axis_keepdims_random",
)
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
axis = -1
keepdims = 1
node = onnx.helper.make_node(
"ArgMin",
inputs=["data"],
outputs=["result"],
axis=axis,
keepdims=keepdims,
select_last_index=True,
)
# result: [[1], [0]]
result = argmin_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmin_negative_axis_keepdims_example_select_last_index",
)
data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 3, 1]
result = argmin_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmin_negative_axis_keepdims_random_select_last_index",
)
data = np.array([[2, 1], [3, 10]], dtype=np.float32)
axis = 1
keepdims = 0
node = onnx.helper.make_node(
"ArgMin", inputs=["data"], outputs=["result"], axis=axis, keepdims=keepdims
)
# result: [1, 0]
result = argmin_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmin_no_keepdims_example",
)
data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 4]
result = argmin_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
node, inputs=[data], outputs=[result], name="test_argmin_no_keepdims_random"
)
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
axis = 1
keepdims = 0
node = onnx.helper.make_node(
"ArgMin",
inputs=["data"],
outputs=["result"],
axis=axis,
keepdims=keepdims,
select_last_index=True,
)
# result: [1, 0]
result = argmin_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmin_no_keepdims_example_select_last_index",
)
data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 4]
result = argmin_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
node,
inputs=[data],
outputs=[result],
name="test_argmin_no_keepdims_random_select_last_index",
)
Calculates the arcsine (inverse of sine) of the given input tensor, element-wise.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Asin-7">7</a>
node = onnx.helper.make_node(
"Asin",
inputs=["x"],
outputs=["y"],
)
x = np.array([-0.5, 0, 0.5]).astype(np.float32)
y = np.arcsin(x)
expect(node, inputs=[x], outputs=[y], name="test_asin_example")
x = np.random.rand(3, 4, 5).astype(np.float32)
y = np.arcsin(x)
expect(node, inputs=[x], outputs=[y], name="test_asin")
Calculates the hyperbolic arcsine of the given input tensor element-wise.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Asinh-9">9</a>
node = onnx.helper.make_node(
"Asinh",
inputs=["x"],
outputs=["y"],
)
x = np.array([-1, 0, 1]).astype(np.float32)
y = np.arcsinh(x) # expected output [-0.88137358, 0., 0.88137358]
expect(node, inputs=[x], outputs=[y], name="test_asinh_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.arcsinh(x)
expect(node, inputs=[x], outputs=[y], name="test_asinh")
Calculates the arctangent (inverse of tangent) of the given input tensor, element-wise.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Atan-7">7</a>
node = onnx.helper.make_node(
"Atan",
inputs=["x"],
outputs=["y"],
)
x = np.array([-1, 0, 1]).astype(np.float32)
y = np.arctan(x)
expect(node, inputs=[x], outputs=[y], name="test_atan_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.arctan(x)
expect(node, inputs=[x], outputs=[y], name="test_atan")
Calculates the hyperbolic arctangent of the given input tensor element-wise.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Atanh-9">9</a>
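The hyperbolic arctangent is only real-valued on the open interval (-1, 1), where it equals 0.5 * ln((1 + x) / (1 - x)). The sketch below (an illustration, not an ONNX test case) checks that identity with NumPy and shows what NumPy returns at and beyond the edge of the domain; how a given ONNX backend behaves there is not specified in this section.

```python
import numpy as np

x = np.array([-0.5, 0.0, 0.5])
closed_form = 0.5 * np.log((1.0 + x) / (1.0 - x))
print(np.allclose(closed_form, np.arctanh(x)))

# Silence the floating-point warnings NumPy raises for these inputs.
with np.errstate(divide="ignore", invalid="ignore"):
    at_edge = np.arctanh(np.array([1.0]))   # diverges to +inf
    outside = np.arctanh(np.array([2.0]))   # not a real number: nan
print(at_edge, outside)
```

This is why the random test case in this section draws its inputs from a bounded range rather than from an unbounded normal distribution.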
node = onnx.helper.make_node(
"Atanh",
inputs=["x"],
outputs=["y"],
)
x = np.array([-0.5, 0, 0.5]).astype(np.float32)
y = np.arctanh(x) # expected output [-0.54930615, 0., 0.54930615]
expect(node, inputs=[x], outputs=[y], name="test_atanh_example")
x = np.random.uniform(0.0, 1.0, (3, 4, 5)).astype(np.float32)
y = np.arctanh(x)
expect(node, inputs=[x], outputs=[y], name="test_atanh")
Computes scaled dot product attention on query, key and value tensors, using an optional attention mask if passed.
This operator covers self and cross variants of the attention operation based on sequence lengths of K, Q and V.
For self attention, kv_sequence_length equals q_sequence_length.
For cross attention, query and key might have different lengths.
This operator also covers the 3 following variants based on the number of heads:
1) Multi-head attention (MHA): q_num_heads = kv_num_heads.
2) Grouped-query attention (GQA): q_num_heads > kv_num_heads, q_num_heads % kv_num_heads == 0.
3) Multi-query attention (MQA): q_num_heads > kv_num_heads, kv_num_heads=1.

Attention bias to be added is calculated based on attn_mask input and is_causal attribute:
1) attn_mask: A boolean mask where a value of True indicates that the element should take part in attention, or a float mask of the same type as query, key, value that is added to the attention score.
2) If is_causal is set to 1, attention scores above the diagonal are masked out, regardless of the attn_mask input.

With respect to KV cache update, this operator allows the following two use cases:
1) Cache update happens inside the Attention operator. In this case, the K and V inputs contain only the incoming tokens for the current autoregressive step, and the four optional inputs/outputs past and present key and value are all needed. The Attention op performs a Concat operation on the past and incoming key and value to form the present key and value, respectively. Note that this only works correctly for the special case where the past key and value do not contain padded tokens.
2) Cache update happens outside the Attention operator (for example, through the TensorScatter operator). In this case, the K and V inputs correspond to the entire cache tensor, so the four optional inputs/outputs past and present key and value should not be used. An additional input nonpad_kv_seqlen of shape (batch_size,) may be provided to indicate the number of non-padding tokens in each sample of the batch to save unnecessary computation. Here, the kv_sequence dimension of attn_mask can be shorter than K and V, but still needs to be at least as long as the maximum value of nonpad_kv_seqlen.

Both past and present state key/values are optional. They must be used together; providing only one of them is not allowed.
The following pattern is applied to the Q, K and V inputs after appropriate reshaping of K and V inputs based on sequence lengths and num heads provided:
      Q          K          V
      |          |          |
Q*sqrt(scale) K*sqrt(scale) |
      |          |          |
      |       Transpose     |
      |          |          |
      ---MatMul---          |
            |               |
attn_mask---Add             |
            |               |
  softcap (if provided)     |
            |               |
         Softmax            |
            |               |
            -----MatMul------
                    |
                    Y
This version of the operator has been available since version 24 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Attention-23">23</a>
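The pattern above can be sketched in NumPy for 4D inputs with equal head counts. This is a simplified illustration, not the `_compute_attention` helper used by the examples in this section: the function name `sdpa_sketch` is hypothetical, the default scale of 1/sqrt(head_size) and the softcap formula `softcap * tanh(x / softcap)` are assumptions, and the KV cache inputs and outputs are left out entirely.

```python
import numpy as np

def sdpa_sketch(Q, K, V, attn_mask=None, scale=None, softcap=None, is_causal=False):
    """Q: (B, H, Lq, D), K: (B, H, Lk, D), V: (B, H, Lk, Dv) -> (B, H, Lq, Dv)."""
    if scale is None:
        scale = 1.0 / np.sqrt(Q.shape[-1])  # assumed default
    q_len, kv_len = Q.shape[-2], K.shape[-2]

    # Build the additive attention bias from is_causal and attn_mask.
    bias = np.zeros((q_len, kv_len), dtype=Q.dtype)
    if is_causal:
        keep = np.tril(np.ones((q_len, kv_len), dtype=bool))
        bias = np.where(keep, bias, -np.inf)   # mask scores above the diagonal
    if attn_mask is not None:
        if attn_mask.dtype == bool:
            bias = bias + np.where(attn_mask, 0.0, -np.inf)  # True = take part
        else:
            bias = bias + attn_mask                           # float mask is added

    # Q*sqrt(scale) @ Transpose(K*sqrt(scale)), then Add the bias.
    scores = (Q * np.sqrt(scale)) @ np.swapaxes(K * np.sqrt(scale), -1, -2)
    scores = scores + bias
    if softcap is not None:
        scores = softcap * np.tanh(scores / softcap)  # assumed softcap formula

    # Numerically stable softmax over the key axis, then weight the values.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
Y = sdpa_sketch(Q, K, V)
print(Y.shape)
```

Multiplying both Q and K by sqrt(scale), rather than multiplying the product by scale, gives the same scores mathematically while keeping intermediate values smaller, which can help with low-precision types.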
node = onnx.helper.make_node("Attention", inputs=["Q", "K", "V"], outputs=["Y"])
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
Y, _, _, _ = _compute_attention(Q, K, V)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_4d",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
q_num_heads, kv_num_heads = 3, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
Q = np.random.rand(2, 4, 24).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 24).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_3d",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
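The 3D inputs in the example above carry all heads packed into the last dimension, which is why q_num_heads and kv_num_heads must be given as attributes. A sketch of the reshaping step is shown below. The helper names `split_heads` and `expand_kv_heads` are hypothetical, and both the packing order (heads first, then head_size, within the hidden dimension) and the use of `np.repeat` for the variants with q_num_heads > kv_num_heads are assumptions rather than something stated in this section.

```python
import numpy as np

def split_heads(x, num_heads):
    """(batch, seq, num_heads * head_size) -> (batch, num_heads, seq, head_size)."""
    batch, seq, hidden = x.shape
    head_size = hidden // num_heads
    return x.reshape(batch, seq, num_heads, head_size).transpose(0, 2, 1, 3)

def expand_kv_heads(kv, q_num_heads):
    """Repeat each KV head so that every query head has a matching KV head."""
    kv_num_heads = kv.shape[1]
    assert q_num_heads % kv_num_heads == 0
    return np.repeat(kv, q_num_heads // kv_num_heads, axis=1)

Q = np.random.rand(2, 4, 24).astype(np.float32)  # 3 query heads of size 8
K = np.random.rand(2, 6, 8).astype(np.float32)   # 1 key/value head of size 8

Q4 = split_heads(Q, 3)
K4 = expand_kv_heads(split_heads(K, 1), 3)
print(Q4.shape, K4.shape)
```

After this step both tensors have the same number of heads, so the per-head attention pattern can be applied unchanged.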
q_num_heads, kv_num_heads = 3, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask"],
outputs=["Y"],
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
Q = np.random.rand(2, 4, 24).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 24).astype(np.float32)
attn_mask = np.random.rand(4, 6).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V, attn_mask],
outputs=[Y],
name="test_attention_3d_attn_mask",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
q_num_heads, kv_num_heads = 3, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
is_causal=1,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
Q = np.random.rand(2, 4, 24).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 24).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
is_causal=1,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_3d_causal",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
q_num_heads, kv_num_heads = 3, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
Q = np.random.rand(2, 4, 24).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 30).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_3d_diff_heads_sizes",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
q_num_heads, kv_num_heads = 3, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask"],
outputs=["Y"],
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
Q = np.random.rand(2, 4, 24).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 30).astype(np.float32)
attn_mask = np.random.rand(4, 6).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V, attn_mask],
outputs=[Y],
name="test_attention_3d_diff_heads_sizes_attn_mask",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
q_num_heads, kv_num_heads = 3, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
is_causal=1,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
Q = np.random.rand(2, 4, 24).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 30).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
is_causal=1,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_3d_diff_heads_sizes_causal",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
scale = 1e-2
q_num_heads, kv_num_heads = 3, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
scale=scale,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
Q = np.random.rand(2, 4, 24).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 30).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
scale=scale,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_3d_diff_heads_sizes_scaled",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
q_num_heads, kv_num_heads = 3, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
softcap=3.0,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
Q = np.random.rand(2, 4, 24).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 30).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
softcap=3.0,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_3d_diff_heads_sizes_softcap",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
q_num_heads, kv_num_heads = 3, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "past_key", "past_value"],
outputs=["Y", "present_key", "present_value"],
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
past_sequence_length = 12
Q = np.random.rand(2, 4, 24).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 30).astype(np.float32)
attn_mask = np.random.rand(4, 6 + past_sequence_length).astype(np.float32)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
past_value = np.random.rand(2, 3, past_sequence_length, 10).astype(np.float32)
Y, present_key, present_value, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
past_key=past_key,
past_value=past_value,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V, attn_mask, past_key, past_value],
outputs=[Y, present_key, present_value],
name="test_attention_3d_diff_heads_with_past_and_present",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
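The shapes in this test can be checked by hand. The sketch below assumes that the 3D `K` and `V` are first split into heads, and that `present_key` and `present_value` are the past tensors concatenated with the new ones along the sequence axis. `split_heads` is a hypothetical helper, not part of the ONNX test utilities.

```python
import numpy as np


def split_heads(x: np.ndarray, num_heads: int) -> np.ndarray:
    """(batch, seq, hidden) -> (batch, num_heads, seq, head_size)."""
    batch, seq, hidden = x.shape
    return x.reshape(batch, seq, num_heads, hidden // num_heads).transpose(0, 2, 1, 3)


kv_num_heads, past_sequence_length = 3, 12
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 30).astype(np.float32)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
past_value = np.random.rand(2, 3, past_sequence_length, 10).astype(np.float32)

# Assumed: cached entries come first, new entries are appended on axis 2.
present_key = np.concatenate([past_key, split_heads(K, kv_num_heads)], axis=2)
present_value = np.concatenate([past_value, split_heads(V, kv_num_heads)], axis=2)

print(present_key.shape)    # (2, 3, 18, 8)
print(present_value.shape)  # (2, 3, 18, 10)
```

The total sequence length of 18 is why `attn_mask` has `6 + past_sequence_length` columns.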
q_num_heads, kv_num_heads = 9, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
Q = np.random.rand(2, 4, 72).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 24).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_3d_gqa",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
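In the grouped-query case, 9 query heads share 3 key/value heads, so each KV head serves 3 query heads. One way to picture this is to expand K and V to 9 heads before the usual attention. The sketch below uses `np.repeat`, so consecutive query heads share a KV head. That grouping order is an assumption here and should be checked against the reference implementation.

```python
import numpy as np

q_num_heads, kv_num_heads = 9, 3
group_size = q_num_heads // kv_num_heads  # 3 query heads per KV head

# K after head splitting: (batch, kv_num_heads, kv_seq, head_size)
K = np.random.rand(2, kv_num_heads, 6, 8).astype(np.float32)

# Assumed grouping: query heads 0..2 use KV head 0, 3..5 use KV head 1, and so on.
K_expanded = np.repeat(K, group_size, axis=1)

print(K_expanded.shape)  # (2, 9, 6, 8)
```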
q_num_heads, kv_num_heads = 9, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask"],
outputs=["Y"],
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
Q = np.random.rand(2, 4, 72).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 24).astype(np.float32)
attn_mask = np.random.rand(4, 6).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V, attn_mask],
outputs=[Y],
name="test_attention_3d_gqa_attn_mask",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
q_num_heads, kv_num_heads = 9, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
is_causal=1,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
Q = np.random.rand(2, 4, 72).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 24).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
is_causal=1,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_3d_gqa_causal",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
scale = 1e-2
q_num_heads, kv_num_heads = 9, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
scale=scale,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
Q = np.random.rand(2, 4, 72).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 24).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
scale=scale,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_3d_gqa_scaled",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
q_num_heads, kv_num_heads = 9, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
softcap=3.0,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
Q = np.random.rand(2, 4, 72).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 24).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
softcap=3.0,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_3d_gqa_softcap",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
q_num_heads, kv_num_heads = 9, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "past_key", "past_value"],
outputs=["Y", "present_key", "present_value"],
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
past_sequence_length = 12
Q = np.random.rand(2, 4, 72).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 24).astype(np.float32)
attn_mask = np.random.rand(4, 6 + past_sequence_length).astype(np.float32)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
past_value = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
Y, present_key, present_value, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
past_key=past_key,
past_value=past_value,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V, attn_mask, past_key, past_value],
outputs=[Y, present_key, present_value],
name="test_attention_3d_gqa_with_past_and_present",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
scale = 1e-2
q_num_heads, kv_num_heads = 3, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
scale=scale,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
Q = np.random.rand(2, 4, 24).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 24).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
scale=scale,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_3d_scaled",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
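The `scale` attribute sets the factor applied to the `Q @ K.T` scores. A minimal single-head sketch follows, assuming the common default of `1 / sqrt(head_size)` when no scale is given. `attention_scores` is a hypothetical helper name.

```python
import numpy as np


def attention_scores(q, k, scale=None):
    """Scaled dot-product scores for one head: (q_len, d) x (kv_len, d) -> (q_len, kv_len)."""
    if scale is None:
        scale = 1.0 / np.sqrt(q.shape[-1])  # assumed default
    return (q @ k.T) * scale


q = np.ones((4, 8), dtype=np.float32)
k = np.ones((6, 8), dtype=np.float32)

default_scores = attention_scores(q, k)             # each entry is 8 / sqrt(8)
custom_scores = attention_scores(q, k, scale=1e-2)  # each entry is 8 * 0.01
```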
q_num_heads, kv_num_heads = 3, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
softcap=3.0,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
Q = np.random.rand(2, 4, 24).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 24).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
softcap=3.0,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_3d_softcap",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
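`softcap` bounds the attention scores before softmax. The sketch below assumes the usual tanh-based form `softcap * tanh(x / softcap)`. The helper name `apply_softcap` is hypothetical.

```python
import numpy as np


def apply_softcap(scores: np.ndarray, softcap: float) -> np.ndarray:
    """Squash scores smoothly into the range [-softcap, softcap]."""
    return softcap * np.tanh(scores / softcap)


scores = np.array([-100.0, -1.0, 0.0, 1.0, 100.0], dtype=np.float32)
capped = apply_softcap(scores, softcap=3.0)
print(capped)
```

Small scores pass through almost unchanged, while large ones saturate near the cap.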
"""Test case to verify correct 3D to 4D transpose behavior.
This test verifies that 3D inputs are correctly reshaped and transposed
according to the ONNX specification:
[batch_size, seq_length, hidden_size] ->
[batch_size, seq_length, num_heads, head_size] ->
[batch_size, num_heads, seq_length, head_size]
"""
q_num_heads, kv_num_heads = 3, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
# Test inputs that will clearly demonstrate the transpose behavior
batch_size = 1
q_seq_length = 2
kv_seq_length = 2
head_size = 4
q_hidden_size = q_num_heads * head_size # 3 * 4 = 12
kv_hidden_size = kv_num_heads * head_size # 3 * 4 = 12
# Create structured inputs to verify correct transpose behavior
# Q has a pattern where each head's slice of the hidden dimension holds a constant value
Q = np.zeros((batch_size, q_seq_length, q_hidden_size), dtype=np.float32)
# Fill Q with pattern: head0=[1,1,1,1], head1=[2,2,2,2], head2=[3,3,3,3]
for head in range(q_num_heads):
    start_idx = head * head_size
    end_idx = start_idx + head_size
    Q[0, :, start_idx:end_idx] = float(head + 1)
K = np.ones((batch_size, kv_seq_length, kv_hidden_size), dtype=np.float32) * 0.1
V = np.ones((batch_size, kv_seq_length, kv_hidden_size), dtype=np.float32) * 0.1
Y, _, _, _ = _compute_attention(
Q,
K,
V,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_3d_transpose_verification",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
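The reshape-then-transpose path described in the docstring can be reproduced directly with NumPy. This sketch rebuilds the structured `Q` from the test and checks that each head ends up holding its own constant. `to_4d` is a hypothetical helper, not the function used by the reference implementation.

```python
import numpy as np


def to_4d(x: np.ndarray, num_heads: int) -> np.ndarray:
    """[batch, seq, hidden] -> [batch, seq, heads, head_size] -> [batch, heads, seq, head_size]."""
    batch, seq, hidden = x.shape
    head_size = hidden // num_heads
    return x.reshape(batch, seq, num_heads, head_size).transpose(0, 2, 1, 3)


q_num_heads, head_size = 3, 4
Q = np.zeros((1, 2, q_num_heads * head_size), dtype=np.float32)
for head in range(q_num_heads):
    Q[0, :, head * head_size : (head + 1) * head_size] = float(head + 1)

Q4 = to_4d(Q, q_num_heads)
print(Q4.shape)  # (1, 3, 2, 4)

# For contrast: a direct reshape without the transpose mixes values across heads.
wrong = Q.reshape(1, q_num_heads, 2, head_size)
```

With the correct path, head `h` contains only the value `h + 1`. The direct reshape does not have that property, which is the failure mode this test is designed to catch.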
q_num_heads, kv_num_heads = 3, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "past_key", "past_value"],
outputs=["Y", "present_key", "present_value"],
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
past_sequence_length = 12
Q = np.random.rand(2, 4, 24).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 24).astype(np.float32)
attn_mask = np.random.rand(4, 6 + past_sequence_length).astype(np.float32)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
past_value = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
Y, present_key, present_value, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
past_key=past_key,
past_value=past_value,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V, attn_mask, past_key, past_value],
outputs=[Y, present_key, present_value],
name="test_attention_3d_with_past_and_present",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
q_num_heads, kv_num_heads = 3, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "past_key", "past_value"],
outputs=["Y", "present_key", "present_value", "qk_matmul_output"],
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
past_sequence_length = 12
Q = np.random.rand(2, 4, 24).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 24).astype(np.float32)
attn_mask = np.random.rand(4, 6 + past_sequence_length).astype(np.float32)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
past_value = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
Y, present_key, present_value, qk_matmul_output = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
past_key=past_key,
past_value=past_value,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
)
expect(
node,
inputs=[Q, K, V, attn_mask, past_key, past_value],
outputs=[Y, present_key, present_value, qk_matmul_output],
name="test_attention_3d_with_past_and_present_qk_matmul",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
q_num_heads, kv_num_heads = 3, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "past_key", "past_value"],
outputs=["Y", "present_key", "present_value", "qk_matmul_output"],
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
qk_matmul_output_mode=1,
)
past_sequence_length = 12
Q = np.random.rand(2, 4, 24).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 24).astype(np.float32)
attn_mask = np.random.rand(4, 6 + past_sequence_length).astype(np.float32)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
past_value = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
Y, present_key, present_value, qk_matmul_output = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
past_key=past_key,
past_value=past_value,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
qk_matmul_output_mode=1,
)
expect(
node,
inputs=[Q, K, V, attn_mask, past_key, past_value],
outputs=[Y, present_key, present_value, qk_matmul_output],
name="test_attention_3d_with_past_and_present_qk_matmul_bias",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
q_num_heads, kv_num_heads = 3, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "past_key", "past_value"],
outputs=["Y", "present_key", "present_value", "qk_matmul_output"],
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
softcap=2.0,
qk_matmul_output_mode=2,
)
past_sequence_length = 12
Q = np.random.rand(2, 4, 24).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 24).astype(np.float32)
attn_mask = np.random.rand(4, 6 + past_sequence_length).astype(np.float32)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
past_value = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
Y, present_key, present_value, qk_matmul_output = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
past_key=past_key,
past_value=past_value,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
softcap=2.0,
qk_matmul_output_mode=2,
)
expect(
node,
inputs=[Q, K, V, attn_mask, past_key, past_value],
outputs=[Y, present_key, present_value, qk_matmul_output],
name="test_attention_3d_with_past_and_present_qk_matmul_softcap",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
q_num_heads, kv_num_heads = 3, 3
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "past_key", "past_value"],
outputs=["Y", "present_key", "present_value", "qk_matmul_output"],
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
qk_matmul_output_mode=3,
)
past_sequence_length = 12
Q = np.random.rand(2, 4, 24).astype(np.float32)
K = np.random.rand(2, 6, 24).astype(np.float32)
V = np.random.rand(2, 6, 24).astype(np.float32)
attn_mask = np.random.rand(4, 6 + past_sequence_length).astype(np.float32)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
past_value = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
Y, present_key, present_value, qk_matmul_output = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
past_key=past_key,
past_value=past_value,
q_num_heads=q_num_heads,
kv_num_heads=kv_num_heads,
qk_matmul_output_mode=3,
)
expect(
node,
inputs=[Q, K, V, attn_mask, past_key, past_value],
outputs=[Y, present_key, present_value, qk_matmul_output],
name="test_attention_3d_with_past_and_present_qk_matmul_softmax",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
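The four tests above differ only in `qk_matmul_output_mode`. Going by the test names (plain, bias, softcap, softmax), the mode appears to select which intermediate of the attention pipeline is exposed as `qk_matmul_output`. The single-head sketch below spells out that reading. The function `qk_stage` and the exact stage order are assumptions, not the reference code.

```python
import numpy as np


def softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def qk_stage(q, k, bias, softcap=0.0, mode=0):
    """Return the intermediate selected by `mode` (assumed meaning):
    0: scaled Q @ K^T, 1: after adding the mask bias,
    2: after softcap, 3: after softmax."""
    scores = (q @ k.T) / np.sqrt(q.shape[-1])
    if mode == 0:
        return scores
    scores = scores + bias
    if mode == 1:
        return scores
    if softcap > 0:
        scores = softcap * np.tanh(scores / softcap)
    if mode == 2:
        return scores
    return softmax(scores)


rng = np.random.default_rng(0)
q, k = rng.random((4, 8)), rng.random((18, 8))
bias = rng.random((4, 18))

stages = [qk_stage(q, k, bias, softcap=2.0, mode=m) for m in range(4)]
```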
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "", "", "nonpad_kv_seqlen"],
outputs=["Y"],
)
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 10).astype(np.float32)
attn_mask = np.random.rand(2, 3, 4, 4).astype(np.float32)
nonpad_kv_seqlen = np.array([3, 4], dtype=np.int64)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
nonpad_kv_seqlen=nonpad_kv_seqlen,
)
expect(
node,
inputs=[Q, K, V, attn_mask, nonpad_kv_seqlen],
outputs=[Y],
name="test_attention_4d_diff_heads_mask4d_padded_kv",
opset_imports=[onnx.helper.make_opsetid("", 24)],
)
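`nonpad_kv_seqlen` gives one length per batch entry, here `[3, 4]` against a key/value length of 6. The sketch below shows how such lengths could be turned into an additive bias, assuming they count the leading non-padding positions and that padded positions are masked with negative infinity. `padding_bias` is a hypothetical helper.

```python
import numpy as np


def padding_bias(nonpad_kv_seqlen: np.ndarray, kv_len: int) -> np.ndarray:
    """(batch,) valid lengths -> (batch, 1, 1, kv_len) bias, -inf on padded keys."""
    positions = np.arange(kv_len)                           # (kv_len,)
    valid = positions[None, :] < nonpad_kv_seqlen[:, None]  # (batch, kv_len)
    bias = np.where(valid, 0.0, -np.inf).astype(np.float32)
    return bias[:, None, None, :]


nonpad_kv_seqlen = np.array([3, 4], dtype=np.int64)
bias = padding_bias(nonpad_kv_seqlen, kv_len=6)
print(bias[:, 0, 0, :])
```

The two singleton axes let the bias broadcast over heads and query positions.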
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask"],
outputs=["Y"],
)
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
attn_mask = np.random.rand(2, 1, 4, 6).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
)
expect(
node,
inputs=[Q, K, V, attn_mask],
outputs=[Y],
name="test_attention_4d_attn_mask_3d",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
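Here the mask has a singleton head axis, so the same `(4, 6)` slice per batch entry is shared by all 3 heads. NumPy broadcasting shows the effective shape. This is only an illustration of the broadcast, and it assumes the mask is added to scores of shape `(batch, heads, q_len, kv_len)`.

```python
import numpy as np

scores = np.zeros((2, 3, 4, 6), dtype=np.float32)  # (batch, heads, q_len, kv_len)
attn_mask = np.random.rand(2, 1, 4, 6).astype(np.float32)

biased = scores + attn_mask  # head axis of size 1 is broadcast to 3
print(biased.shape)  # (2, 3, 4, 6)
```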
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask"],
outputs=["Y"],
is_causal=1,
)
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
attn_mask = np.random.rand(2, 1, 4, 6).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
is_causal=1,
)
expect(
node,
inputs=[Q, K, V, attn_mask],
outputs=[Y],
name="test_attention_4d_attn_mask_3d_causal",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask"],
outputs=["Y"],
)
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
attn_mask = np.random.rand(2, 3, 4, 6).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
)
expect(
node,
inputs=[Q, K, V, attn_mask],
outputs=[Y],
name="test_attention_4d_attn_mask_4d",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask"],
outputs=["Y"],
is_causal=1,
)
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
attn_mask = np.random.rand(2, 3, 4, 6).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
is_causal=1,
)
expect(
node,
inputs=[Q, K, V, attn_mask],
outputs=[Y],
name="test_attention_4d_attn_mask_4d_causal",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask"],
outputs=["Y"],
)
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
attn_mask = np.random.rand(4, 6).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
)
expect(
node,
inputs=[Q, K, V, attn_mask],
outputs=[Y],
name="test_attention_4d_attn_mask",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask"],
outputs=["Y"],
)
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
attn_mask = np.random.rand(4, 6).astype(bool)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
)
expect(
node,
inputs=[Q, K, V, attn_mask],
outputs=[Y],
name="test_attention_4d_attn_mask_bool",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask"],
outputs=["Y"],
)
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
attn_mask = np.random.rand(2, 3, 4, 6).astype(bool)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
)
expect(
node,
inputs=[Q, K, V, attn_mask],
outputs=[Y],
name="test_attention_4d_attn_mask_bool_4d",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
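One detail worth checking when reusing these snippets: `np.random.rand` draws from `[0, 1)`, and casting to `bool` maps every nonzero value to `True`, so the boolean masks above are almost certainly all `True`. The sketch below shows that, along with one way to draw a mixed mask and convert it to an additive bias. It assumes `True` means "attend" and `False` means "masked out". The 0.5 threshold and the helper name `bool_mask_to_bias` are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

all_true = rng.random((4, 6)).astype(bool)  # nonzero floats become True
mixed = rng.random((4, 6)) > 0.5            # roughly half True, half False


def bool_mask_to_bias(mask: np.ndarray) -> np.ndarray:
    """True -> 0.0 (keep), False -> -inf (masked), under the assumed convention."""
    return np.where(mask, 0.0, -np.inf).astype(np.float32)


bias = bool_mask_to_bias(mixed)
print(all_true.all(), mixed.sum())
```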
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
is_causal=1,
)
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
Y, _, _, _ = _compute_attention(Q, K, V, is_causal=1)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_4d_causal",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node("Attention", inputs=["Q", "K", "V"], outputs=["Y"])
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 10).astype(np.float32)
Y, _, _, _ = _compute_attention(Q, K, V)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_4d_diff_heads_sizes",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask"],
outputs=["Y"],
)
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 10).astype(np.float32)
attn_mask = np.random.rand(4, 6).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
)
expect(
node,
inputs=[Q, K, V, attn_mask],
outputs=[Y],
name="test_attention_4d_diff_heads_sizes_attn_mask",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
is_causal=1,
)
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 10).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
is_causal=1,
)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_4d_diff_heads_sizes_causal",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
scale = 1e-2
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
scale=scale,
)
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 10).astype(np.float32)
Y, _, _, _ = _compute_attention(Q, K, V, scale=scale)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_4d_diff_heads_sizes_scaled",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
softcap=2.0,
)
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 10).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
softcap=2.0,
)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_4d_diff_heads_sizes_softcap",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "past_key", "past_value"],
outputs=["Y", "present_key", "present_value"],
)
past_sequence_length = 12
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 10).astype(np.float32)
attn_mask = np.random.rand(4, 6 + past_sequence_length).astype(np.float32)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
past_value = np.random.rand(2, 3, past_sequence_length, 10).astype(np.float32)
Y, present_key, present_value, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
past_key=past_key,
past_value=past_value,
)
expect(
node,
inputs=[Q, K, V, attn_mask, past_key, past_value],
outputs=[Y, present_key, present_value],
name="test_attention_4d_diff_heads_with_past_and_present",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "past_key", "past_value"],
outputs=["Y", "present_key", "present_value"],
)
past_sequence_length = 12
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 10).astype(np.float32)
attn_mask = np.random.rand(2, 1, 4, 6 + past_sequence_length).astype(np.float32)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
past_value = np.random.rand(2, 3, past_sequence_length, 10).astype(np.float32)
Y, present_key, present_value, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
past_key=past_key,
past_value=past_value,
)
expect(
node,
inputs=[Q, K, V, attn_mask, past_key, past_value],
outputs=[Y, present_key, present_value],
name="test_attention_4d_diff_heads_with_past_and_present_mask3d",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "past_key", "past_value"],
outputs=["Y", "present_key", "present_value"],
)
past_sequence_length = 12
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 10).astype(np.float32)
attn_mask = np.random.rand(2, 3, 4, 6 + past_sequence_length).astype(np.float32)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
past_value = np.random.rand(2, 3, past_sequence_length, 10).astype(np.float32)
Y, present_key, present_value, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
past_key=past_key,
past_value=past_value,
)
expect(
node,
inputs=[Q, K, V, attn_mask, past_key, past_value],
outputs=[Y, present_key, present_value],
name="test_attention_4d_diff_heads_with_past_and_present_mask4d",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node("Attention", inputs=["Q", "K", "V"], outputs=["Y"])
Q = np.random.rand(2, 3, 4, 8).astype(np.float16)
K = np.random.rand(2, 3, 6, 8).astype(np.float16)
V = np.random.rand(2, 3, 6, 8).astype(np.float16)
Y, _, _, _ = _compute_attention(Q, K, V)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_4d_fp16",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node("Attention", inputs=["Q", "K", "V"], outputs=["Y"])
Q = np.random.rand(2, 9, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
Y, _, _, _ = _compute_attention(Q, K, V)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_4d_gqa",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask"],
outputs=["Y"],
)
Q = np.random.rand(2, 9, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
attn_mask = np.random.rand(4, 6).astype(np.float32)
Y, _, _, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
)
expect(
node,
inputs=[Q, K, V, attn_mask],
outputs=[Y],
name="test_attention_4d_gqa_attn_mask",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
is_causal=1,
)
Q = np.random.rand(2, 9, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
Y, _, _, _ = _compute_attention(Q, K, V, is_causal=1)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_4d_gqa_causal",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
scale = 1e-2
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
scale=scale,
)
Q = np.random.rand(2, 9, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
Y, _, _, _ = _compute_attention(Q, K, V, scale=scale)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_4d_gqa_scaled",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
softcap=2.0,
)
Q = np.random.rand(2, 9, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
Y, _, _, _ = _compute_attention(Q, K, V, softcap=2.0)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_4d_gqa_softcap",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "past_key", "past_value"],
outputs=["Y", "present_key", "present_value"],
)
past_sequence_length = 12
Q = np.random.rand(2, 9, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
attn_mask = np.random.rand(4, 6 + past_sequence_length).astype(np.float32)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
past_value = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
Y, present_key, present_value, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
past_key=past_key,
past_value=past_value,
)
expect(
node,
inputs=[Q, K, V, attn_mask, past_key, past_value],
outputs=[Y, present_key, present_value],
name="test_attention_4d_gqa_with_past_and_present",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "past_key", "past_value"],
outputs=["Y", "present_key", "present_value"],
)
past_sequence_length = 12
Q = np.random.rand(2, 9, 4, 8).astype(np.float16)
K = np.random.rand(2, 3, 6, 8).astype(np.float16)
V = np.random.rand(2, 3, 6, 8).astype(np.float16)
attn_mask = np.random.rand(4, 6 + past_sequence_length).astype(np.float16)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float16)
past_value = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float16)
Y, present_key, present_value, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
past_key=past_key,
past_value=past_value,
)
expect(
node,
inputs=[Q, K, V, attn_mask, past_key, past_value],
outputs=[Y, present_key, present_value],
name="test_attention_4d_gqa_with_past_and_present_fp16",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
scale = 1e-2
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
scale=scale,
)
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
Y, _, _, _ = _compute_attention(Q, K, V, scale=scale)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_4d_scaled",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y"],
softcap=2.0,
)
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
Y, _, _, _ = _compute_attention(Q, K, V, softcap=2.0)
expect(
node,
inputs=[Q, K, V],
outputs=[Y],
name="test_attention_4d_softcap",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "past_key", "past_value"],
outputs=["Y", "present_key", "present_value"],
)
past_sequence_length = 12
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
attn_mask = np.random.rand(4, 6 + past_sequence_length).astype(np.float32)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
past_value = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
Y, present_key, present_value, _ = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
past_key=past_key,
past_value=past_value,
)
expect(
node,
inputs=[Q, K, V, attn_mask, past_key, past_value],
outputs=[Y, present_key, present_value],
name="test_attention_4d_with_past_and_present",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "past_key", "past_value"],
outputs=["Y", "present_key", "present_value", "qk_matmul_output"],
)
past_sequence_length = 12
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
attn_mask = np.random.rand(4, 6 + past_sequence_length).astype(np.float32)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
past_value = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
Y, present_key, present_value, qk_matmul_output = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
past_key=past_key,
past_value=past_value,
)
expect(
node,
inputs=[Q, K, V, attn_mask, past_key, past_value],
outputs=[Y, present_key, present_value, qk_matmul_output],
name="test_attention_4d_with_past_and_present_qk_matmul",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "past_key", "past_value"],
outputs=["Y", "present_key", "present_value", "qk_matmul_output"],
qk_matmul_output_mode=1,
)
past_sequence_length = 12
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
attn_mask = np.random.rand(4, 6 + past_sequence_length).astype(np.float32)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
past_value = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
Y, present_key, present_value, qk_matmul_output = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
past_key=past_key,
past_value=past_value,
qk_matmul_output_mode=1,
)
expect(
node,
inputs=[Q, K, V, attn_mask, past_key, past_value],
outputs=[Y, present_key, present_value, qk_matmul_output],
name="test_attention_4d_with_past_and_present_qk_matmul_bias",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "past_key", "past_value"],
outputs=["Y", "present_key", "present_value", "qk_matmul_output"],
qk_matmul_output_mode=1,
)
past_sequence_length = 12
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
attn_mask = np.random.rand(2, 1, 4, 6 + past_sequence_length).astype(np.float32)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
past_value = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
Y, present_key, present_value, qk_matmul_output = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
past_key=past_key,
past_value=past_value,
qk_matmul_output_mode=1,
)
expect(
node,
inputs=[Q, K, V, attn_mask, past_key, past_value],
outputs=[Y, present_key, present_value, qk_matmul_output],
name="test_attention_4d_with_past_and_present_qk_matmul_bias_3d_mask",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "past_key", "past_value"],
outputs=["Y", "present_key", "present_value", "qk_matmul_output"],
qk_matmul_output_mode=1,
is_causal=1,
)
past_sequence_length = 12
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
attn_mask = np.random.rand(2, 1, 4, 6 + past_sequence_length).astype(np.float32)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
past_value = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
Y, present_key, present_value, qk_matmul_output = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
past_key=past_key,
past_value=past_value,
qk_matmul_output_mode=1,
is_causal=1,
)
expect(
node,
inputs=[Q, K, V, attn_mask, past_key, past_value],
outputs=[Y, present_key, present_value, qk_matmul_output],
name="test_attention_4d_with_past_and_present_qk_matmul_bias_3d_mask_causal",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "past_key", "past_value"],
outputs=["Y", "present_key", "present_value", "qk_matmul_output"],
qk_matmul_output_mode=1,
)
past_sequence_length = 12
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
attn_mask = np.random.rand(2, 3, 4, 6 + past_sequence_length).astype(np.float32)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
past_value = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
Y, present_key, present_value, qk_matmul_output = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
past_key=past_key,
past_value=past_value,
qk_matmul_output_mode=1,
)
expect(
node,
inputs=[Q, K, V, attn_mask, past_key, past_value],
outputs=[Y, present_key, present_value, qk_matmul_output],
name="test_attention_4d_with_past_and_present_qk_matmul_bias_4d_mask",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask", "past_key", "past_value"],
outputs=["Y", "present_key", "present_value", "qk_matmul_output"],
qk_matmul_output_mode=1,
is_causal=1,
)
past_sequence_length = 12
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
attn_mask = np.random.rand(2, 3, 4, 6 + past_sequence_length).astype(np.float32)
past_key = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
past_value = np.random.rand(2, 3, past_sequence_length, 8).astype(np.float32)
Y, present_key, present_value, qk_matmul_output = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
past_key=past_key,
past_value=past_value,
qk_matmul_output_mode=1,
is_causal=1,
)
expect(
node,
inputs=[Q, K, V, attn_mask, past_key, past_value],
outputs=[Y, present_key, present_value, qk_matmul_output],
name="test_attention_4d_with_past_and_present_qk_matmul_bias_4d_mask_causal",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V"],
outputs=["Y", "", "", "qk_matmul_output"],
)
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
Y, _, _, qk_matmul_output = _compute_attention(Q, K, V)
expect(
node,
inputs=[Q, K, V],
outputs=[Y, qk_matmul_output],
name="test_attention_4d_with_qk_matmul",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask"],
outputs=["Y", "", "", "qk_matmul_output"],
qk_matmul_output_mode=1,
)
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
attn_mask = np.random.rand(4, 6).astype(np.float32)
Y, _, _, qk_matmul_output = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
qk_matmul_output_mode=1,
)
expect(
node,
inputs=[Q, K, V, attn_mask],
outputs=[Y, qk_matmul_output],
name="test_attention_4d_with_qk_matmul_bias",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask"],
outputs=["Y", "", "", "qk_matmul_output"],
softcap=2.0,
qk_matmul_output_mode=2,
)
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
attn_mask = np.random.rand(4, 6).astype(np.float32)
Y, _, _, qk_matmul_output = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
softcap=2.0,
qk_matmul_output_mode=2,
)
expect(
node,
inputs=[Q, K, V, attn_mask],
outputs=[Y, qk_matmul_output],
name="test_attention_4d_with_qk_matmul_softcap",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
node = onnx.helper.make_node(
"Attention",
inputs=["Q", "K", "V", "attn_mask"],
outputs=["Y", "", "", "qk_matmul_output"],
qk_matmul_output_mode=3,
)
Q = np.random.rand(2, 3, 4, 8).astype(np.float32)
K = np.random.rand(2, 3, 6, 8).astype(np.float32)
V = np.random.rand(2, 3, 6, 8).astype(np.float32)
attn_mask = np.random.rand(4, 6).astype(np.float32)
Y, _, _, qk_matmul_output = _compute_attention(
Q,
K,
V,
attn_mask=attn_mask,
qk_matmul_output_mode=3,
)
expect(
node,
inputs=[Q, K, V, attn_mask],
outputs=[Y, qk_matmul_output],
name="test_attention_4d_with_qk_matmul_softmax",
opset_imports=[onnx.helper.make_opsetid("", 23)],
)
AveragePool consumes an input tensor X and applies average pooling across the tensor according to kernel sizes, stride sizes, and pad lengths. Average pooling consists of computing the average of all values in a subset of the input tensor, selected according to the kernel size, and downsampling the data into the output tensor Y for further processing. The output spatial shape is calculated differently depending on whether explicit padding (the pads attribute) or auto padding (the auto_pad attribute) is used. With explicit padding (https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html?highlight=maxpool#torch.nn.MaxPool2d):
output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - dilation[i] * (kernel_shape[i] - 1) - 1) / strides_spatial_shape[i] + 1)
or
output_spatial_shape[i] = ceil((input_spatial_shape[i] + pad_shape[i] - dilation[i] * (kernel_shape[i] - 1) - 1) / strides_spatial_shape[i] + 1)
if ceil_mode is enabled. pad_shape[i] is the sum of pads along axis i. Sliding windows that would start in the right padded region are ignored.
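The explicit-padding formula above can be checked by hand for a single spatial axis. The following is a minimal sketch, not the reference implementation: the helper name `pooled_size` and its parameters are hypothetical, and the handling of windows that start in the right padded region is one reading of the sentence above.

```python
def pooled_size(input_size, kernel, stride=1, pad_begin=0, pad_end=0,
                dilation=1, ceil_mode=False):
    """Output length along one spatial axis when explicit pads are used.

    Hypothetical helper for illustration; all names are assumptions.
    """
    pad_total = pad_begin + pad_end  # pad_shape[i] in the formula
    numerator = input_size + pad_total - dilation * (kernel - 1) - 1
    if ceil_mode:
        out = -(-numerator // stride) + 1  # ceil(numerator / stride) + 1
        # Ignore a last window that would start in the right padded region.
        if (out - 1) * stride >= input_size + pad_begin:
            out -= 1
    else:
        out = numerator // stride + 1  # floor(numerator / stride) + 1
    return out


# Shapes taken from the AveragePool examples in this section.
print(pooled_size(32, kernel=2))                                  # 1d/2d default
print(pooled_size(28, kernel=3, pad_begin=2, pad_end=2))          # 2d_pads
print(pooled_size(4, kernel=3, stride=2, ceil_mode=True))         # 2d_ceil
print(pooled_size(2, kernel=3, stride=3, pad_begin=1, pad_end=1,
                  ceil_mode=True))  # last window starts on the pad
```

Integer floor division is used instead of floating-point `floor`/`ceil` so the result stays exact for large sizes.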
auto_pad is a DEPRECATED attribute. If you are still using it, the output spatial shape is as follows when ceil_mode is enabled:
VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) + 1) / strides_spatial_shape[i])
SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])
or when ceil_mode is disabled (https://www.tensorflow.org/api_docs/python/tf/keras/layers/AveragePooling2D):
VALID: output_spatial_shape[i] = floor((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1)) / strides_spatial_shape[i]) + 1
SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = floor((input_spatial_shape[i] - 1) / strides_spatial_shape[i]) + 1
If auto_pad is SAME_UPPER or SAME_LOWER, the pad shape is:
pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) - input_spatial_shape[i]
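As a rough illustration of the auto_pad formulas, the sketch below computes the output size and total padding for one axis. The function names are hypothetical. The way the total pad is split between the beginning and end of the axis follows the SAME_LOWER and SAME_UPPER test snippets later in this section, not a statement in the text above.

```python
def auto_pad_output_size(auto_pad, input_size, kernel, stride=1,
                         dilation=1, ceil_mode=False):
    """Output length along one axis for a given auto_pad mode (illustrative)."""
    effective_kernel = (kernel - 1) * dilation + 1
    if auto_pad == "VALID":
        if ceil_mode:
            return -(-(input_size - effective_kernel + 1) // stride)  # ceil
        return (input_size - effective_kernel) // stride + 1
    if auto_pad in ("SAME_UPPER", "SAME_LOWER"):
        if ceil_mode:
            return -(-input_size // stride)  # ceil(input / stride)
        return (input_size - 1) // stride + 1
    raise ValueError(f"unsupported auto_pad: {auto_pad}")


def same_pad_total(input_size, output_size, kernel, stride=1, dilation=1):
    """Total padding along one axis for SAME_UPPER / SAME_LOWER."""
    effective_kernel = (kernel - 1) * dilation + 1
    return (output_size - 1) * stride + effective_kernel - input_size


def split_pad(pad_total, auto_pad):
    """Split the total pad into (begin, end); the odd element goes to the
    end for SAME_UPPER and to the beginning for SAME_LOWER."""
    small = pad_total // 2
    large = pad_total - small
    return (small, large) if auto_pad == "SAME_UPPER" else (large, small)


out = auto_pad_output_size("SAME_UPPER", 5, kernel=3, stride=2)
pad = same_pad_total(5, out, kernel=3, stride=2)
print(out, pad, split_pad(pad, "SAME_UPPER"))
```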
The output of each pooling window is divided by the number of elements in the window (padded elements are excluded from the count when the count_include_pad attribute is zero).
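The effect of count_include_pad is easiest to see on a small 1-D input. The sketch below is an assumption-laden illustration (stride 1, no dilation, a hypothetical helper named `average_pool_1d`), not the `pool` helper used by the test snippets.

```python
import numpy as np


def average_pool_1d(x, kernel, pad_begin=0, pad_end=0, count_include_pad=0):
    """1-D average pooling with stride 1 and no dilation (illustrative only)."""
    # NaN marks padded positions so they can be told apart from real zeros.
    padded = np.pad(x.astype(np.float64), (pad_begin, pad_end),
                    mode="constant", constant_values=np.nan)
    out = []
    for start in range(len(padded) - kernel + 1):
        window = padded[start:start + kernel]
        total = np.nansum(window)  # padded cells contribute 0 to the sum
        if count_include_pad:
            count = kernel  # divide by the full window size
        else:
            count = np.count_nonzero(~np.isnan(window))  # real elements only
        out.append(total / count)
    return np.array(out)


x = np.array([1, 2, 3, 4, 5])
print(average_pool_1d(x, kernel=3, pad_begin=1, pad_end=1, count_include_pad=0))
print(average_pool_1d(x, kernel=3, pad_begin=1, pad_end=1, count_include_pad=1))
```

Only the border windows differ between the two calls, because interior windows contain no padded elements.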
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#AveragePool-1">1</a>, <a href="Changelog.md#AveragePool-7">7</a>, <a href="Changelog.md#AveragePool-10">10</a>, <a href="Changelog.md#AveragePool-11">11</a>, <a href="Changelog.md#AveragePool-19">19</a>
"""input_shape: [1, 3, 32]
output_shape: [1, 3, 31]
"""
node = onnx.helper.make_node(
"AveragePool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2],
)
x = np.random.randn(1, 3, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = [2]
strides = [1]
out_shape, _ = get_output_shape_explicit_padding(
pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "AVG")
expect(node, inputs=[x], outputs=[y], name="test_averagepool_1d_default")
"""input_shape: [1, 1, 4, 4]
output_shape: [1, 1, 2, 2]
"""
node = onnx.helper.make_node(
"AveragePool",
inputs=["x"],
outputs=["y"],
kernel_shape=[3, 3],
strides=[2, 2],
ceil_mode=True,
)
x = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
]
).astype(np.float32)
y = np.array([[[[6, 7.5], [12, 13.5]]]]).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_averagepool_2d_ceil")
"""input_shape: [1, 3, 2, 2]
output_shape: [1, 3, 1, 1]
"""
node = onnx.helper.make_node(
"AveragePool",
inputs=["x"],
outputs=["y"],
kernel_shape=[3, 3],
strides=[3, 3],
pads=[1, 1, 1, 1],
ceil_mode=True,
count_include_pad=1,
)
x = np.array(
[
[
[[0.8580, 0.0786], [0.2692, 0.1537]],
[[0.8816, 0.4353], [0.5772, 0.6623]],
[[0.9067, 0.9483], [0.5970, 0.7630]],
]
]
).astype(np.float32)
y = np.array([[[[0.1511]], [[0.2841]], [[0.3572]]]]).astype(np.float32)
expect(
node,
inputs=[x],
outputs=[y],
name="test_averagepool_2d_ceil_last_window_starts_on_pad",
)
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 31, 31]
"""
node = onnx.helper.make_node(
"AveragePool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2, 2],
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = (2, 2)
strides = (1, 1)
out_shape, _ = get_output_shape_explicit_padding(
pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "AVG")
expect(node, inputs=[x], outputs=[y], name="test_averagepool_2d_default")
"""input_shape: [1, 1, 4, 4]
output_shape: [1, 1, 2, 2]
"""
node = onnx.helper.make_node(
"AveragePool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2, 2],
strides=[1, 1],
dilations=[2, 2],
ceil_mode=True,
)
# input shape: [1, 1, 4, 4]
x = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
]
).astype(np.float32)
y = np.array([[[[6, 7], [10, 11]]]]).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_averagepool_2d_dilations")
"""input_shape: [1, 3, 28, 28]
output_shape: [1, 3, 30, 30]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
"AveragePool",
inputs=["x"],
outputs=["y"],
kernel_shape=[3, 3],
pads=[2, 2, 2, 2],
)
x = np.random.randn(1, 3, 28, 28).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (3, 3)
strides = (1, 1)
pad_bottom = 2
pad_top = 2
pad_right = 2
pad_left = 2
pads = [pad_top, pad_left, pad_bottom, pad_right]
out_shape, extra_pads = get_output_shape_explicit_padding(
pads, x_shape[2:], kernel_shape, strides, ceil_mode=False
)
padded = np.pad(
x,
(
(0, 0),
(0, 0),
(extra_pads[0], extra_pads[2]),
(extra_pads[1], extra_pads[3]),
),
mode="constant",
constant_values=np.nan,
)
y = pool(
padded,
x_shape,
kernel_shape,
strides,
out_shape,
"AVG",
pads_required=extra_pads,
pads=pads,
)
expect(node, inputs=[x], outputs=[y], name="test_averagepool_2d_pads")
"""input_shape: [1, 3, 28, 28]
output_shape: [1, 3, 30, 30]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
"AveragePool",
inputs=["x"],
outputs=["y"],
kernel_shape=[3, 3],
pads=[2, 2, 2, 2],
count_include_pad=1,
)
x = np.random.randn(1, 3, 28, 28).astype(np.float32)
x_shape = np.shape(x)
dilations = (1, 1)
kernel_shape = (3, 3)
strides = (1, 1)
pad_bottom = 2
pad_top = 2
pad_right = 2
pad_left = 2
pads = [pad_top, pad_left, pad_bottom, pad_right]
out_shape, extra_pads = get_output_shape_explicit_padding(
pads, x_shape[2:], kernel_shape, strides, dilations, ceil_mode=False
)
padded = np.pad(
x,
(
(0, 0),
(0, 0),
(extra_pads[0], extra_pads[2]),
(extra_pads[1], extra_pads[3]),
),
mode="constant",
constant_values=0,
)
y = pool(
padded,
x_shape,
kernel_shape,
strides,
out_shape,
"AVG",
pads_required=extra_pads,
pads=pads,
count_include_pad=1,
)
expect(
node,
inputs=[x],
outputs=[y],
name="test_averagepool_2d_pads_count_include_pad",
)
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 5, 5]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
"AveragePool",
inputs=["x"],
outputs=["y"],
kernel_shape=[5, 5],
pads=[2, 2, 2, 2],
)
x = np.array(
[
[
[
[1, 2, 3, 4, 5],
[6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25],
]
]
]
).astype(np.float32)
y = np.array(
[
[
[
[7, 7.5, 8, 8.5, 9],
[9.5, 10, 10.5, 11, 11.5],
[12, 12.5, 13, 13.5, 14],
[14.5, 15, 15.5, 16, 16.5],
[17, 17.5, 18, 18.5, 19],
]
]
]
).astype(np.float32)
expect(
node, inputs=[x], outputs=[y], name="test_averagepool_2d_precomputed_pads"
)
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 5, 5]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
"AveragePool",
inputs=["x"],
outputs=["y"],
kernel_shape=[5, 5],
pads=[2, 2, 2, 2],
count_include_pad=1,
)
x = np.array(
[
[
[
[1, 2, 3, 4, 5],
[6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25],
]
]
]
).astype(np.float32)
y = np.array(
[
[
[
[2.5200, 3.6000, 4.8000, 4.0800, 3.2400],
[4.5600, 6.4000, 8.4000, 7.0400, 5.5200],
[7.2000, 10.0000, 13.0000, 10.8000, 8.4000],
[6.9600, 9.6000, 12.4000, 10.2400, 7.9200],
[6.1200, 8.4000, 10.8000, 8.8800, 6.8400],
]
]
]
).astype(np.float32)
expect(
node,
inputs=[x],
outputs=[y],
name="test_averagepool_2d_precomputed_pads_count_include_pad",
)
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 3, 3]
pad_shape: [2, 2] -> [1, 1, 1, 1] by axis
"""
node = onnx.helper.make_node(
"AveragePool",
inputs=["x"],
outputs=["y"],
kernel_shape=[3, 3],
strides=[2, 2],
auto_pad="SAME_UPPER",
)
x = np.array(
[
[
[
[1, 2, 3, 4, 5],
[6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25],
]
]
]
).astype(np.float32)
y = np.array([[[[4, 5.5, 7], [11.5, 13, 14.5], [19, 20.5, 22]]]]).astype(
np.float32
)
expect(
node,
inputs=[x],
outputs=[y],
name="test_averagepool_2d_precomputed_same_upper",
)
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 2, 2]
"""
node = onnx.helper.make_node(
"AveragePool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2, 2],
strides=[2, 2],
)
x = np.array(
[
[
[
[1, 2, 3, 4, 5],
[6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25],
]
]
]
).astype(np.float32)
y = np.array([[[[4, 6], [14, 16]]]]).astype(np.float32)
expect(
node,
inputs=[x],
outputs=[y],
name="test_averagepool_2d_precomputed_strides",
)
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 32, 32]
pad_shape: [1, 1] -> [1, 0, 1, 0] by axis
"""
node = onnx.helper.make_node(
"AveragePool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2, 2],
auto_pad="SAME_LOWER",
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (2, 2)
strides = (1, 1)
out_shape = get_output_shape_auto_pad(
"SAME_LOWER", x_shape[2:], kernel_shape, strides
)
pad_shape = get_pad_shape(
"SAME_LOWER", x_shape[2:], kernel_shape, strides, out_shape
)
pad_bottom = pad_shape[0] // 2
pad_top = pad_shape[0] - pad_bottom
pad_right = pad_shape[1] // 2
pad_left = pad_shape[1] - pad_right
padded = np.pad(
x,
((0, 0), (0, 0), (pad_top, pad_bottom), (pad_left, pad_right)),
mode="constant",
constant_values=np.nan,
)
pads = (pad_top, pad_left, pad_bottom, pad_right)
y = pool(
padded,
x_shape,
kernel_shape,
strides,
out_shape,
"AVG",
pads_required=pads,
pads=pads,
)
expect(node, inputs=[x], outputs=[y], name="test_averagepool_2d_same_lower")
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 32, 32]
pad_shape: [1, 1] -> [0, 1, 0, 1] by axis
"""
node = onnx.helper.make_node(
"AveragePool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2, 2],
auto_pad="SAME_UPPER",
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (2, 2)
strides = (1, 1)
out_shape = get_output_shape_auto_pad(
"SAME_UPPER", x_shape[2:], kernel_shape, strides
)
pad_shape = get_pad_shape(
"SAME_UPPER", x_shape[2:], kernel_shape, strides, out_shape
)
pad_top = pad_shape[0] // 2
pad_bottom = pad_shape[0] - pad_top
pad_left = pad_shape[1] // 2
pad_right = pad_shape[1] - pad_left
padded = np.pad(
x,
((0, 0), (0, 0), (pad_top, pad_bottom), (pad_left, pad_right)),
mode="constant",
constant_values=np.nan,
)
pads = (pad_top, pad_left, pad_bottom, pad_right)
y = pool(
padded,
x_shape,
kernel_shape,
strides,
out_shape,
"AVG",
pads_required=pads,
pads=pads,
)
expect(node, inputs=[x], outputs=[y], name="test_averagepool_2d_same_upper")
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 10, 10]
"""
node = onnx.helper.make_node(
"AveragePool",
inputs=["x"],
outputs=["y"],
kernel_shape=[5, 5],
strides=[3, 3],
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (5, 5)
strides = (3, 3)
out_shape, pads = get_output_shape_explicit_padding(
None, x_shape[2:], kernel_shape, strides, ceil_mode=False
)
padded = x
y = pool(
padded,
x_shape,
kernel_shape,
strides,
out_shape,
"AVG",
pads_required=pads,
pads=None,
)
expect(node, inputs=[x], outputs=[y], name="test_averagepool_2d_strides")
"""input_shape: [1, 3, 32, 32, 32]
output_shape: [1, 3, 31, 31, 31]
"""
node = onnx.helper.make_node(
"AveragePool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2, 2, 2],
)
x = np.random.randn(1, 3, 32, 32, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = [2, 2, 2]
strides = [1, 1, 1]
out_shape, _ = get_output_shape_explicit_padding(
pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "AVG")
expect(node, inputs=[x], outputs=[y], name="test_averagepool_3d_default")
"""input_shape: [1, 1, 4, 4, 4]
output_shape: [1, 1, 2, 2, 2]
"""
node = onnx.helper.make_node(
"AveragePool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2, 2, 2],
strides=[1, 1, 1],
dilations=[2, 2, 2],
ceil_mode=True,
)
# input shape: [1, 1, 4, 4, 4]
x = np.array(
[
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
],
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
],
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
],
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
],
]
]
]
).astype(np.float32)
y = np.array([[[[[6, 7], [10, 11]], [[6, 7], [10, 11]]]]]).astype(np.float32)
expect(
node, inputs=[x], outputs=[y], name="test_averagepool_3d_dilations_small"
)
x_shape = (32, 32, 32)
dilations = (2, 2, 2)
kernel_shape = (5, 5, 5)
strides = (3, 3, 3)
count_include_pad = 0
for count_include_pad in (0, 1):
    for ceil_mode in (True, False):
        node = onnx.helper.make_node(
            "AveragePool",
            inputs=["x"],
            outputs=["y"],
            kernel_shape=kernel_shape,
            strides=strides,
            dilations=dilations,
            count_include_pad=count_include_pad,
            ceil_mode=ceil_mode,
        )
        x = np.random.randn(1, 1, *x_shape).astype(np.float32)
        out_shape, extra_pads = get_output_shape_explicit_padding(
            None,
            x_shape,
            kernel_shape,
            strides,
            dilations=dilations,
            ceil_mode=ceil_mode,
        )
        padded = np.pad(
            x,
            (
                (0, 0),
                (0, 0),
                (extra_pads[0], extra_pads[3]),
                (extra_pads[1], extra_pads[4]),
                (extra_pads[2], extra_pads[5]),
            ),
            mode="constant",
            constant_values=0 if count_include_pad == 1 else np.nan,
        )
        y = pool(
            padded,
            (1, 1, *x_shape),
            kernel_shape,
            strides,
            out_shape,
            "AVG",
            pads_required=extra_pads,
            pads=None,
            dilations=dilations,
            count_include_pad=count_include_pad,
        )
        test_name = f"test_averagepool_3d_dilations_large_count_include_pad_is_{count_include_pad}_ceil_mode_is_{ceil_mode}"
        expect(node, inputs=[x], outputs=[y], name=test_name)
Carries out batch normalization as described in the paper https://arxiv.org/abs/1502.03167. There are five required inputs: 'X', 'scale', 'B', 'input_mean' and 'input_var'. Note that 'input_mean' and 'input_var' are expected to be the estimated statistics in inference mode (training_mode=False, default), and the running statistics in training mode (training_mode=True). The number of outputs depends on the mode: in training mode the outputs are Y, running_mean and running_var, while in inference mode the only output is Y.
When training_mode=False, extra outputs are invalid. The outputs are updated as follows when training_mode=True:
running_mean = input_mean * momentum + current_mean * (1 - momentum)
running_var = input_var * momentum + current_var * (1 - momentum)
Y = (X - current_mean) / sqrt(current_var + epsilon) * scale + B
where:
current_mean = ReduceMean(X, axis=all_except_channel_index)
current_var = ReduceVar(X, axis=all_except_channel_index)
Notice that ReduceVar refers to the population variance, which is equal to
sum(sqrd(x_i - x_avg)) / N
where N is the population size (this formula does not use sample size N - 1).
The computation of ReduceMean and ReduceVar uses float to avoid overflow for float16 inputs.
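The training-mode equations above can be sketched in NumPy as follows. This is an illustrative reading of the formulas, not the `_batchnorm_training_mode` helper used by the test snippets. The function name, the assumption that the channel axis is axis 1, and the default values for `momentum` and `epsilon` are all assumptions made for this example.

```python
import numpy as np


def batchnorm_training_sketch(x, scale, bias, input_mean, input_var,
                              momentum=0.9, epsilon=1e-5):
    """Training-mode batch normalization over an (N, C, D1, ..., Dn) tensor."""
    axes = tuple(i for i in range(x.ndim) if i != 1)  # all axes except channel
    x32 = x.astype(np.float32)  # statistics in float, e.g. for float16 inputs
    current_mean = x32.mean(axis=axes)
    current_var = x32.var(axis=axes)  # population variance (divides by N)

    running_mean = input_mean * momentum + current_mean * (1 - momentum)
    running_var = input_var * momentum + current_var * (1 - momentum)

    shape = (1, -1) + (1,) * (x.ndim - 2)  # broadcast per-channel vectors
    y = (x32 - current_mean.reshape(shape)) / np.sqrt(
        current_var.reshape(shape) + epsilon
    )
    y = y * scale.reshape(shape) + bias.reshape(shape)
    return y.astype(x.dtype), running_mean, running_var


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 4, 5)).astype(np.float32)
ones = np.ones(3, dtype=np.float32)
zeros = np.zeros(3, dtype=np.float32)
y, running_mean, running_var = batchnorm_training_sketch(x, ones, zeros, zeros, ones)
print(y.shape, running_mean.shape, running_var.shape)
```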
When training_mode=False:
Y = (X - input_mean) / sqrt(input_var + epsilon) * scale + B
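For comparison, the inference-mode formula uses the supplied statistics directly. The sketch below is a hedged illustration with a hypothetical function name, and it assumes that the channel axis is axis 1 and that `epsilon` defaults to 1e-5.

```python
import numpy as np


def batchnorm_inference_sketch(x, scale, bias, input_mean, input_var,
                               epsilon=1e-5):
    """Inference-mode batch normalization over an (N, C, D1, ..., Dn) tensor."""
    shape = (1, -1) + (1,) * (x.ndim - 2)  # broadcast per-channel vectors
    normalized = (x - input_mean.reshape(shape)) / np.sqrt(
        input_var.reshape(shape) + epsilon
    )
    return normalized * scale.reshape(shape) + bias.reshape(shape)


# Two channels with hand-picked statistics so the result is easy to verify.
x = np.array([[[[1.0, 3.0]], [[2.0, 8.0]]]])  # shape (1, 2, 1, 2)
y = batchnorm_inference_sketch(
    x,
    scale=np.array([2.0, 1.0]),
    bias=np.array([0.0, 1.0]),
    input_mean=np.array([1.0, 2.0]),
    input_var=np.array([4.0, 9.0]),
    epsilon=0.0,
)
print(y)
```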
For previous (deprecated) non-spatial cases, implementors are advised to flatten the input shape to (N x C * D1 * D2 * ... * Dn) before a BatchNormalization Op. This operator has optional inputs/outputs. See the doc for more details about the representation of optional arguments. An empty string may be used in the place of an actual argument's name to indicate a missing argument. Trailing optional arguments (those not followed by an argument that is present) may also be simply omitted.
This version of the operator has been available since version 15 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#BatchNormalization-1">1</a>, <a href="Changelog.md#BatchNormalization-6">6</a>, <a href="Changelog.md#BatchNormalization-7">7</a>, <a href="Changelog.md#BatchNormalization-9">9</a>, <a href="Changelog.md#BatchNormalization-14">14</a>
# input size: (2, 3, 4, 5)
x = np.random.randn(2, 3, 4, 5).astype(np.float32)
s = np.random.randn(3).astype(np.float32)
bias = np.random.randn(3).astype(np.float32)
mean = np.random.randn(3).astype(np.float32)
var = np.random.rand(3).astype(np.float32)
y = _batchnorm_test_mode(x, s, bias, mean, var).astype(np.float32)
node = onnx.helper.make_node(
"BatchNormalization",
inputs=["x", "s", "bias", "mean", "var"],
outputs=["y"],
)
# output size: (2, 3, 4, 5)
expect(
node,
inputs=[x, s, bias, mean, var],
outputs=[y],
name="test_batchnorm_example",
)
# input size: (2, 3, 4, 5)
x = np.random.randn(2, 3, 4, 5).astype(np.float32)
s = np.random.randn(3).astype(np.float32)
bias = np.random.randn(3).astype(np.float32)
mean = np.random.randn(3).astype(np.float32)
var = np.random.rand(3).astype(np.float32)
epsilon = 1e-2
y = _batchnorm_test_mode(x, s, bias, mean, var, epsilon).astype(np.float32)
node = onnx.helper.make_node(
"BatchNormalization",
inputs=["x", "s", "bias", "mean", "var"],
outputs=["y"],
epsilon=epsilon,
)
# output size: (2, 3, 4, 5)
expect(
node,
inputs=[x, s, bias, mean, var],
outputs=[y],
name="test_batchnorm_epsilon",
)
# input size: (2, 3, 4, 5)
x = np.random.randn(2, 3, 4, 5).astype(np.float32)
s = np.random.randn(3).astype(np.float32)
bias = np.random.randn(3).astype(np.float32)
mean = np.random.randn(3).astype(np.float32)
var = np.random.rand(3).astype(np.float32)
# using np.bool(1) while generating test data fails with "'bool' object has no attribute 'dtype'"
# working around by using np.byte(1).astype(bool)
training_mode = 1
y, output_mean, output_var = _batchnorm_training_mode(x, s, bias, mean, var)
node = onnx.helper.make_node(
"BatchNormalization",
inputs=["x", "s", "bias", "mean", "var"],
outputs=["y", "output_mean", "output_var"],
training_mode=training_mode,
)
# output size: (2, 3, 4, 5)
expect(
node,
inputs=[x, s, bias, mean, var],
outputs=[y, output_mean, output_var],
name="test_batchnorm_example_training_mode",
)
# input size: (2, 3, 4, 5)
x = np.random.randn(2, 3, 4, 5).astype(np.float32)
s = np.random.randn(3).astype(np.float32)
bias = np.random.randn(3).astype(np.float32)
mean = np.random.randn(3).astype(np.float32)
var = np.random.rand(3).astype(np.float32)
training_mode = 1
momentum = 0.9
epsilon = 1e-2
y, output_mean, output_var = _batchnorm_training_mode(
x, s, bias, mean, var, momentum, epsilon
)
node = onnx.helper.make_node(
"BatchNormalization",
inputs=["x", "s", "bias", "mean", "var"],
outputs=["y", "output_mean", "output_var"],
epsilon=epsilon,
training_mode=training_mode,
)
# output size: (2, 3, 4, 5)
expect(
node,
inputs=[x, s, bias, mean, var],
outputs=[y, output_mean, output_var],
name="test_batchnorm_epsilon_training_mode",
)
Draws binary random numbers (0 or 1) from a Bernoulli distribution. The input tensor should be a tensor containing probabilities p (a value in the range [0,1]) to be used for drawing the binary random number, where an output of 1 is produced with probability p and an output of 0 is produced with probability (1-p).
This operator is non-deterministic and may not produce the same values in different implementations (even if a seed is specified).
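One way to picture the sampling rule is to compare each probability with a uniform draw. The sketch below is an assumption for illustration only: `bernoulli_sketch` is a hypothetical name, it is not the `bernoulli_reference_implementation` used by the test snippets, and a conforming implementation may draw its random numbers differently.

```python
import numpy as np


def bernoulli_sketch(p, dtype=np.float32, seed=None):
    """Draw 0/1 samples where each output is 1 with probability p."""
    rng = np.random.default_rng(seed)
    uniform = rng.random(p.shape)  # uniform draws in [0, 1)
    return (uniform < p).astype(dtype)


p = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
print(bernoulli_sketch(p, seed=0))
```

With this construction an input of 0 always yields 0 and an input of 1 always yields 1; every other entry varies with the random draw.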
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Bernoulli-15">15</a>
node = onnx.helper.make_node(
"Bernoulli",
inputs=["x"],
outputs=["y"],
dtype=onnx.TensorProto.DOUBLE,
)
x = np.random.uniform(0.0, 1.0, 10).astype(np.float32)
y = bernoulli_reference_implementation(x, float)
expect(node, inputs=[x], outputs=[y], name="test_bernoulli_double")
seed = float(0)
node = onnx.helper.make_node(
"Bernoulli",
inputs=["x"],
outputs=["y"],
seed=seed,
)
x = np.random.uniform(0.0, 1.0, 10).astype(np.float32)
y = bernoulli_reference_implementation(x, np.float32)
expect(node, inputs=[x], outputs=[y], name="test_bernoulli_seed")
node = onnx.helper.make_node(
"Bernoulli",
inputs=["x"],
outputs=["y"],
)
x = np.random.uniform(0.0, 1.0, 10).astype(float)
y = bernoulli_reference_implementation(x, float)
expect(node, inputs=[x], outputs=[y], name="test_bernoulli")
Reinterprets the binary representation of a tensor as a different data type, specified by the 'to' attribute. Unlike Cast, BitCast preserves the exact bit pattern without any value conversion.
The target data type must have the same bit-width as the input data type. The output tensor has the same shape as the input tensor. All types except string are supported. Implementations must treat the underlying bytes as little endian.
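The difference between a value conversion and a bit-pattern reinterpretation can be illustrated with NumPy, where `astype` converts values and `view` reinterprets the bytes. This is a minimal sketch: `bitcast_sketch` is a hypothetical helper, and the example assumes a little-endian host so that NumPy's native byte order matches the little-endian requirement above.

```python
import numpy as np


def bitcast_sketch(x, to_dtype):
    """Reinterpret the bits of x as to_dtype; both must have the same width."""
    to_dtype = np.dtype(to_dtype)
    if x.dtype.itemsize != to_dtype.itemsize:
        raise ValueError("BitCast requires source and target of equal bit-width")
    return x.view(to_dtype)  # same bytes, same shape, different dtype


x = np.array([1.0, -2.5, 3.75], dtype=np.float32)
print(x.astype(np.int32))            # value conversion, as a Cast would do
print(bitcast_sketch(x, np.int32))   # raw IEEE-754 bit patterns
print(bitcast_sketch(bitcast_sketch(x, np.int32), np.float32))  # round trip
```

The round trip returns the original values because no bits are changed in either direction.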
This version of the operator has been available since version 26 of the default ONNX operator set.
"""Test bitcasting 2D array from float32 to int32."""
node = onnx.helper.make_node(
"BitCast",
inputs=["x"],
outputs=["y"],
to=onnx.TensorProto.INT32,
)
x = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], dtype=np.float32)
y = x.view(np.int32)
expect(node, inputs=[x], outputs=[y], name="test_bitcast_2d_float32_to_int32")
"""Test bitcasting from bool to uint8 (same size)."""
node = onnx.helper.make_node(
"BitCast",
inputs=["x"],
outputs=["y"],
to=onnx.TensorProto.UINT8,
)
x = np.array([True, False, True, False], dtype=np.bool_)
y = x.view(np.uint8)
expect(node, inputs=[x], outputs=[y], name="test_bitcast_bool_to_uint8")
"""Test bitcasting from float32 to int32 (same size)."""
node = onnx.helper.make_node(
"BitCast",
inputs=["x"],
outputs=["y"],
to=onnx.TensorProto.INT32,
)
x = np.array([1.0, -2.5, 3.75], dtype=np.float32)
y = x.view(np.int32)
expect(node, inputs=[x], outputs=[y], name="test_bitcast_float32_to_int32")
"""Test bitcasting from float64 to int64 (same size)."""
node = onnx.helper.make_node(
"BitCast",
inputs=["x"],
outputs=["y"],
to=onnx.TensorProto.INT64,
)
x = np.array([1.0, -2.5, 3.75], dtype=np.float64)
y = x.view(np.int64)
expect(node, inputs=[x], outputs=[y], name="test_bitcast_float64_to_int64")
"""Test bitcasting from int32 to float32 (same size)."""
node = onnx.helper.make_node(
"BitCast",
inputs=["x"],
outputs=["y"],
to=onnx.TensorProto.FLOAT,
)
x = np.array([1065353216, -1071644672, 1081081856], dtype=np.int32)
y = x.view(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_bitcast_int32_to_float32")
"""Test bitcasting from int64 to float64 (same size)."""
node = onnx.helper.make_node(
"BitCast",
inputs=["x"],
outputs=["y"],
to=onnx.TensorProto.DOUBLE,
)
x = np.array(
[4607182418800017408, -4611686018427387904, 4614256656552045184],
dtype=np.int64,
)
y = x.view(np.float64)
expect(node, inputs=[x], outputs=[y], name="test_bitcast_int64_to_float64")
"""Test bitcasting from int8 to uint8 (same size, different signedness)."""
node = onnx.helper.make_node(
"BitCast",
inputs=["x"],
outputs=["y"],
to=onnx.TensorProto.UINT8,
)
x = np.array([-1, -128, 127, 0], dtype=np.int8)
y = x.view(np.uint8)
expect(node, inputs=[x], outputs=[y], name="test_bitcast_int8_to_uint8")
"""Test bitcasting scalar from float32 to int32."""
node = onnx.helper.make_node(
"BitCast",
inputs=["x"],
outputs=["y"],
to=onnx.TensorProto.INT32,
)
x = np.array(1.0, dtype=np.float32)
y = x.view(np.int32)
expect(
node, inputs=[x], outputs=[y], name="test_bitcast_scalar_float32_to_int32"
)
"""Test bitcasting from uint16 to int16 (same size, different signedness)."""
node = onnx.helper.make_node(
"BitCast",
inputs=["x"],
outputs=["y"],
to=onnx.TensorProto.INT16,
)
x = np.array([1, 32768, 65535], dtype=np.uint16)
y = x.view(np.int16)
expect(node, inputs=[x], outputs=[y], name="test_bitcast_uint16_to_int16")
"""Test bitcasting from uint32 to int32 (same size, different signedness)."""
node = onnx.helper.make_node(
"BitCast",
inputs=["x"],
outputs=["y"],
to=onnx.TensorProto.INT32,
)
x = np.array([4294967295, 2147483648, 2147483647], dtype=np.uint32)
y = x.view(np.int32)
expect(node, inputs=[x], outputs=[y], name="test_bitcast_uint32_to_int32")
The bitwise shift operator performs an element-wise operation. For each input element, if the attribute "direction" is "RIGHT", this operator moves its binary representation toward the right side so that the input value is effectively decreased. If the attribute "direction" is "LEFT", the bits of the binary representation move toward the left side, which increases the actual value. The input X is the tensor to be shifted and the input Y specifies the shift amounts. For example, if "direction" is "RIGHT", X is [1, 4], and Y is [1, 1], the corresponding output Z would be [0, 2]. If "direction" is "LEFT" with X=[1, 2] and Y=[1, 2], the corresponding output Z would be [2, 8].
Because this operator supports Numpy-style broadcasting, X's and Y's shapes are not necessarily identical. This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
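The two worked examples from the description, plus one broadcasting case, can be reproduced with NumPy. This is only a sketch: the helper name `bitshift` is hypothetical, and NumPy's `left_shift` and `right_shift` are used as a stand-in for the operator on unsigned integer inputs.

```python
import numpy as np


def bitshift(x, y, direction):
    """Element-wise shift of x by y bits (hypothetical helper, NumPy stand-in)."""
    if direction == "LEFT":
        return np.left_shift(x, y)
    if direction == "RIGHT":
        return np.right_shift(x, y)
    raise ValueError(f"Unknown direction: {direction}")


# Examples taken from the description above.
right = bitshift(
    np.array([1, 4], dtype=np.uint8), np.array([1, 1], dtype=np.uint8), "RIGHT"
)
left = bitshift(
    np.array([1, 2], dtype=np.uint8), np.array([1, 2], dtype=np.uint8), "LEFT"
)

# Broadcasting: a (2, 2) tensor shifted by a single-element tensor.
broadcast = bitshift(
    np.array([[16, 4], [1, 8]], dtype=np.uint8), np.array([1], dtype=np.uint8), "RIGHT"
)

print(right.tolist())      # [0, 2]
print(left.tolist())       # [2, 8]
print(broadcast.tolist())  # [[8, 2], [0, 4]]
```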
This version of the operator has been available since version 11 of the default ONNX operator set.
node = onnx.helper.make_node(
"BitShift", inputs=["x", "y"], outputs=["z"], direction="LEFT"
)
x = np.array([16, 4, 1]).astype(np.uint16)
y = np.array([1, 2, 3]).astype(np.uint16)
z = x << y # expected output [32, 16, 8]
expect(node, inputs=[x, y], outputs=[z], name="test_bitshift_left_uint16")
node = onnx.helper.make_node(
"BitShift", inputs=["x", "y"], outputs=["z"], direction="LEFT"
)
x = np.array([16, 4, 1]).astype(np.uint32)
y = np.array([1, 2, 3]).astype(np.uint32)
z = x << y # expected output [32, 16, 8]
expect(node, inputs=[x, y], outputs=[z], name="test_bitshift_left_uint32")
node = onnx.helper.make_node(
"BitShift", inputs=["x", "y"], outputs=["z"], direction="LEFT"
)
x = np.array([16, 4, 1]).astype(np.uint64)
y = np.array([1, 2, 3]).astype(np.uint64)
z = x << y # expected output [32, 16, 8]
expect(node, inputs=[x, y], outputs=[z], name="test_bitshift_left_uint64")
node = onnx.helper.make_node(
"BitShift", inputs=["x", "y"], outputs=["z"], direction="LEFT"
)
x = np.array([16, 4, 1]).astype(np.uint8)
y = np.array([1, 2, 3]).astype(np.uint8)
z = x << y # expected output [32, 16, 8]
expect(node, inputs=[x, y], outputs=[z], name="test_bitshift_left_uint8")
node = onnx.helper.make_node(
"BitShift", inputs=["x", "y"], outputs=["z"], direction="RIGHT"
)
x = np.array([16, 4, 1]).astype(np.uint16)
y = np.array([1, 2, 3]).astype(np.uint16)
z = x >> y # expected output [8, 1, 0]
expect(node, inputs=[x, y], outputs=[z], name="test_bitshift_right_uint16")
node = onnx.helper.make_node(
"BitShift", inputs=["x", "y"], outputs=["z"], direction="RIGHT"
)
x = np.array([16, 4, 1]).astype(np.uint32)
y = np.array([1, 2, 3]).astype(np.uint32)
z = x >> y # expected output [8, 1, 0]
expect(node, inputs=[x, y], outputs=[z], name="test_bitshift_right_uint32")
node = onnx.helper.make_node(
"BitShift", inputs=["x", "y"], outputs=["z"], direction="RIGHT"
)
x = np.array([16, 4, 1]).astype(np.uint64)
y = np.array([1, 2, 3]).astype(np.uint64)
z = x >> y # expected output [8, 1, 0]
expect(node, inputs=[x, y], outputs=[z], name="test_bitshift_right_uint64")
node = onnx.helper.make_node(
"BitShift", inputs=["x", "y"], outputs=["z"], direction="RIGHT"
)
x = np.array([16, 4, 1]).astype(np.uint8)
y = np.array([1, 2, 3]).astype(np.uint8)
z = x >> y # expected output [8, 1, 0]
expect(node, inputs=[x, y], outputs=[z], name="test_bitshift_right_uint8")
Returns the tensor resulting from performing the bitwise `and` operation
elementwise on the input tensors A and B (with NumPy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
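As a small illustration of the broadcasting behavior, the sketch below applies NumPy's `bitwise_and` to a 2-D tensor and a 1-D tensor. The concrete values are made up for the example, and NumPy is assumed to mirror the operator's semantics here.

```python
import numpy as np

# A has shape (2, 2); B has shape (2,) and is broadcast across the rows of A.
a = np.array([[0b1100, 0b1010], [0b1111, 0b0001]], dtype=np.uint8)
b = np.array([0b1010, 0b0110], dtype=np.uint8)

c = np.bitwise_and(a, b)

print(c.tolist())  # [[8, 2], [10, 0]]
```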
This version of the operator has been available since version 18 of the default ONNX operator set.
node = onnx.helper.make_node(
"BitwiseAnd",
inputs=["x", "y"],
outputs=["bitwiseand"],
)
# 2d
x = create_random_int((3, 4), np.int32)
y = create_random_int((3, 4), np.int32)
z = np.bitwise_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_and_i32_2d")
# 3d
x = create_random_int((3, 4, 5), np.int16)
y = create_random_int((3, 4, 5), np.int16)
z = np.bitwise_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_and_i16_3d")
node = onnx.helper.make_node(
"BitwiseAnd",
inputs=["x", "y"],
outputs=["bitwiseand"],
)
# 3d vs 1d
x = create_random_int((3, 4, 5), np.uint64)
y = create_random_int((5,), np.uint64)
z = np.bitwise_and(x, y)
expect(
node, inputs=[x, y], outputs=[z], name="test_bitwise_and_ui64_bcast_3v1d"
)
# 4d vs 3d
x = create_random_int((3, 4, 5, 6), np.uint8)
y = create_random_int((4, 5, 6), np.uint8)
z = np.bitwise_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_and_ui8_bcast_4v3d")
Returns the bitwise not of the input tensor element-wise.
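For intuition, the following sketch shows what a bitwise not does to signed and unsigned 8-bit values, using NumPy's `bitwise_not` as a stand-in. For two's-complement signed integers the result equals `-x - 1`; for unsigned 8-bit integers it equals `255 - x`.

```python
import numpy as np

signed = np.array([0, 1, -1, 127], dtype=np.int8)
unsigned = np.array([0, 1, 255], dtype=np.uint8)

not_signed = np.bitwise_not(signed)      # every bit flipped, same dtype
not_unsigned = np.bitwise_not(unsigned)

print(not_signed.tolist())    # [-1, -2, 0, -128]
print(not_unsigned.tolist())  # [255, 254, 0]
```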
This version of the operator has been available since version 18 of the default ONNX operator set.
node = onnx.helper.make_node(
"BitwiseNot",
inputs=["x"],
outputs=["bitwise_not"],
)
# 2d
x = create_random_int((3, 4), np.int32)
y = np.bitwise_not(x)
expect(node, inputs=[x], outputs=[y], name="test_bitwise_not_2d")
# 3d
x = create_random_int((3, 4, 5), np.uint16)
y = np.bitwise_not(x)
expect(node, inputs=[x], outputs=[y], name="test_bitwise_not_3d")
# 4d
x = create_random_int((3, 4, 5, 6), np.uint8)
y = np.bitwise_not(x)
expect(node, inputs=[x], outputs=[y], name="test_bitwise_not_4d")
Returns the tensor resulting from performing the bitwise `or` operation
elementwise on the input tensors A and B (with NumPy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
This version of the operator has been available since version 18 of the default ONNX operator set.
node = onnx.helper.make_node(
"BitwiseOr",
inputs=["x", "y"],
outputs=["bitwiseor"],
)
# 2d
x = create_random_int((3, 4), np.int32)
y = create_random_int((3, 4), np.int32)
z = np.bitwise_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_or_i32_2d")
# 4d
x = create_random_int((3, 4, 5, 6), np.int8)
y = create_random_int((3, 4, 5, 6), np.int8)
z = np.bitwise_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_or_i16_4d")
node = onnx.helper.make_node(
"BitwiseOr",
inputs=["x", "y"],
outputs=["bitwiseor"],
)
# 3d vs 1d
x = create_random_int((3, 4, 5), np.uint64)
y = create_random_int((5,), np.uint64)
z = np.bitwise_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_or_ui64_bcast_3v1d")
# 4d vs 3d
x = create_random_int((3, 4, 5, 6), np.uint8)
y = create_random_int((4, 5, 6), np.uint8)
z = np.bitwise_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_or_ui8_bcast_4v3d")
Returns the tensor resulting from performing the bitwise `xor` operation
elementwise on the input tensors A and B (with NumPy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
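A short sketch of the xor semantics with broadcasting, again using NumPy as an assumed stand-in: applying the same mask twice restores the original tensor, which is a quick sanity check for any implementation. The tensor values are arbitrary.

```python
import numpy as np

x = np.arange(24, dtype=np.uint8).reshape(2, 3, 4)   # 3-D tensor
mask = np.array([0x0F, 0xF0, 0xFF, 0x00], dtype=np.uint8)  # 1-D, broadcast on last axis

toggled = np.bitwise_xor(x, mask)
restored = np.bitwise_xor(toggled, mask)  # xor with the same mask undoes the toggle

print(toggled.shape)                 # (2, 3, 4)
print(np.array_equal(restored, x))   # True
```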
This version of the operator has been available since version 18 of the default ONNX operator set.
node = onnx.helper.make_node(
"BitwiseXor",
inputs=["x", "y"],
outputs=["bitwisexor"],
)
# 3d vs 1d
x = create_random_int((3, 4, 5), np.uint64)
y = create_random_int((5,), np.uint64)
z = np.bitwise_xor(x, y)
expect(
node, inputs=[x, y], outputs=[z], name="test_bitwise_xor_ui64_bcast_3v1d"
)
# 4d vs 3d
x = create_random_int((3, 4, 5, 6), np.uint8)
y = create_random_int((4, 5, 6), np.uint8)
z = np.bitwise_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_xor_ui8_bcast_4v3d")
node = onnx.helper.make_node(
"BitwiseXor",
inputs=["x", "y"],
outputs=["bitwisexor"],
)
# 2d
x = create_random_int((3, 4), np.int32)
y = create_random_int((3, 4), np.int32)
z = np.bitwise_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_xor_i32_2d")
# 3d
x = create_random_int((3, 4, 5), np.int16)
y = create_random_int((3, 4, 5), np.int16)
z = np.bitwise_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_xor_i16_3d")
Generates a Blackman window as described in the paper https://ieeexplore.ieee.org/document/1455106.
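The window can be sketched in NumPy as follows. The coefficients 0.42, 0.5, and 0.08 are the ones used in the test cases in this section; the function name `blackman_window` and the boolean `periodic` parameter are illustrative choices, not part of the operator definition.

```python
import numpy as np


def blackman_window(size, periodic=True):
    """Blackman window of length `size` (illustrative sketch).

    A periodic window divides by `size`; a symmetric one divides by `size - 1`.
    """
    n = np.arange(size, dtype=np.float64)
    denom = size if periodic else size - 1
    return (
        0.42
        - 0.5 * np.cos(2 * np.pi * n / denom)
        + 0.08 * np.cos(4 * np.pi * n / denom)
    )


symmetric = blackman_window(10, periodic=False)
periodic = blackman_window(10, periodic=True)
```

The symmetric variant should agree with `np.blackman(10)`, and the periodic variant with the first 10 samples of `np.blackman(11)`.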
This version of the operator has been available since version 17 of the default ONNX operator set.
# Test periodic window
node = onnx.helper.make_node(
"BlackmanWindow",
inputs=["x"],
outputs=["y"],
)
size = np.int32(10)
a0 = 0.42
a1 = -0.5
a2 = 0.08
y = a0
y += a1 * np.cos(2 * np.pi * np.arange(0, size, 1, dtype=np.float32) / size)
y += a2 * np.cos(4 * np.pi * np.arange(0, size, 1, dtype=np.float32) / size)
expect(
node,
inputs=[size],
outputs=[y.astype(np.float32)],
name="test_blackmanwindow",
)
# Test symmetric window
node = onnx.helper.make_node(
"BlackmanWindow", inputs=["x"], outputs=["y"], periodic=0
)
size = np.int32(10)
a0 = 0.42
a1 = -0.5
a2 = 0.08
y = a0
y += a1 * np.cos(
2 * np.pi * np.arange(0, size, 1, dtype=np.float32) / (size - 1)
)
y += a2 * np.cos(
4 * np.pi * np.arange(0, size, 1, dtype=np.float32) / (size - 1)
)
expect(
node,
inputs=[size],
outputs=[y.astype(np.float32)],
name="test_blackmanwindow_symmetric",
)
The operator casts the elements of a given input tensor to a data type specified by the 'to' argument and returns an output tensor of the same size in the converted type. The 'to' argument must be one of the data types specified in the 'DataType' enum field in the TensorProto message.
Casting from string tensors in plain (e.g., "3.14" and "1000") and scientific numeric representations (e.g., "1e-5" and "1E8") to float types is supported. For example, converting the string "100.5" to an integer may yield the result 100. Some string literals are reserved for special floating-point values: "+INF" (and "INF"), "-INF", and "NaN" denote positive infinity, negative infinity, and not-a-number, respectively. Any string that exactly matches "+INF" in a case-insensitive way is mapped to positive infinity. Similarly, this case-insensitive rule applies to "INF" and "NaN". When casting from numeric tensors to string tensors, a plain floating-point representation (such as "314.15926") is used. Converting a non-numerical-literal string such as "Hello World!" is undefined behavior. Converting a string that represents a floating-point value, such as "2.718", to INT is also undefined behavior.
Conversion from a numerical type to any numerical type is always allowed. Users must be aware of the precision loss and value change caused by the range difference between the two types. For example, a 64-bit float 3.1415926459 may be rounded to a 32-bit float 3.141592. Similarly, converting an integer 36 to Boolean may produce 1 because bits that cannot be stored in the target type are truncated.
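To make the accepted literals concrete, here is a minimal sketch that parses them with Python's built-in `float`, which happens to accept the same plain, scientific, and case-insensitive special-value spellings. It is only an analogy: the helper name `parse_float_literal` is hypothetical, and Python raises `ValueError` for text such as "Hello World!", whereas the operator leaves that case undefined.

```python
import math


def parse_float_literal(text):
    """Parse a numeric string literal (hypothetical helper, Python stand-in)."""
    return float(text)


literals = ["3.14", "1000", "1e-5", "1E8", "+INF", "inf", "-INF", "NaN", "nan"]
values = [parse_float_literal(s) for s in literals]

for literal, value in zip(literals, values):
    print(f"{literal!r} -> {value}")
```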
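The kinds of value change described above can be observed with NumPy casts. This is a sketch of NumPy's behavior, assumed here as an analogy for the operator: narrowing a float loses precision, a non-zero integer becomes True, and an out-of-range integer wraps around when narrowed to a smaller integer type.

```python
import numpy as np

# 64-bit float narrowed to 32-bit: close to the original, but not identical.
x64 = np.float64(3.1415926459)
x32 = np.float32(x64)

# Integer to bool: NumPy maps every non-zero value to True.
as_bool = np.array([36, 0, -1]).astype(bool)

# int16 to int8: the higher bits are discarded, so 200 wraps to -56.
wrapped = np.array([200], dtype=np.int16).astype(np.int8)

print(float(x64), float(x32))
print(as_bool.tolist())   # [True, False, True]
print(int(wrapped[0]))    # -56
```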
In more detail, the conversion among numerical types should follow these rules if the destination type is not a float 8 type.
Casting from bool produces {1.0, 0.0} for floating-point destinations and {1, 0} for fixed-point destinations.
Float 8 types (E4M3FN, E4M3FNUZ, E5M2, E5M2FNUZ) were introduced to speed up the training of
deep models. By default, the conversion of a float x obeys
the following rules. [x] means the value rounded to
the target mantissa width.
| x | E4M3FN | E4M3FNUZ | E5M2 | E5M2FNUZ |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| -0 | -0 | 0 | -0 | 0 |
| NaN | NaN | NaN | NaN | NaN |
| Inf | FLT_MAX | FLT_MAX | FLT_MAX | FLT_MAX |
| -Inf | -FLT_MAX | -FLT_MAX | -FLT_MAX | -FLT_MAX |
| [x] > FLT_MAX | FLT_MAX | FLT_MAX | FLT_MAX | FLT_MAX |
| [x] < -FLT_MAX | -FLT_MAX | -FLT_MAX | -FLT_MAX | -FLT_MAX |
| else | RNE | RNE | RNE | RNE |
The behavior changes if the parameter 'saturate' is set to False. The rules then become:
| x | E4M3FN | E4M3FNUZ | E5M2 | E5M2FNUZ |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| -0 | -0 | 0 | -0 | 0 |
| NaN | NaN | NaN | NaN | NaN |
| -NaN | -NaN | NaN | -NaN | NaN |
| Inf | NaN | NaN | Inf | NaN |
| -Inf | -NaN | NaN | -Inf | NaN |
| [x] > FLT_MAX | NaN | NaN | Inf | NaN |
| [x] < -FLT_MAX | NaN | NaN | -Inf | NaN |
| else | RNE | RNE | RNE | RNE |
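The overflow rows of the two tables above can be sketched as a single NumPy function. Several things here are assumptions made for illustration: the function name `handle_overflow`, the `FLT_MAX` dictionary (448 for E4M3FN, 240 for E4M3FNUZ, 57344 for E5M2 and E5M2FNUZ), and the simplification that the comparison uses x itself rather than the rounded value [x]. Rounding to the target mantissa (the RNE row) and the sign handling of zero and NaN are not modeled.

```python
import numpy as np

# Largest finite magnitude of each float 8 format (assumed constants).
FLT_MAX = {
    "E4M3FN": 448.0,
    "E4M3FNUZ": 240.0,
    "E5M2": 57344.0,
    "E5M2FNUZ": 57344.0,
}


def handle_overflow(x, fmt, saturate=True):
    """Apply only the out-of-range rules of the tables (illustrative sketch)."""
    x = np.asarray(x, dtype=np.float32)
    limit = FLT_MAX[fmt]
    too_big = np.abs(x) > limit  # True for +/-Inf, False for NaN
    if saturate:
        # Saturating mode: clamp to +/-FLT_MAX.
        return np.where(too_big, np.copysign(limit, x), x)
    if fmt == "E5M2":
        # E5M2 is the only one of the four formats with infinities.
        return np.where(too_big, np.copysign(np.inf, x), x)
    # The other formats have no infinity, so overflow becomes NaN.
    return np.where(too_big, np.nan, x)


sat = handle_overflow([1.0, 1e6, -np.inf, np.nan], "E4M3FN")
nosat_e5m2 = handle_overflow([1e6, -np.inf], "E5M2", saturate=False)
nosat_e4m3 = handle_overflow([1e6, 1.0], "E4M3FN", saturate=False)
```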
FLOAT8E8M0 type was introduced to enable Microscaling (MX) formats.
When casting to FLOAT8E8M0, the rounding behavior can be specified using the round_mode and saturate attributes.
The current CUDA behavior is to round up and saturate. Casting negative values to FLOAT8E8M0 gives undefined behavior.
The following table describes the casting behavior of special values to FLOAT8E8M0 in the two most common cases.
| x | saturate + up | non-saturate + nearest |
|---|---|---|
| 0 | 0 | NaN |
| -0 | Unspecified | Unspecified |
| NaN | NaN | NaN |
| Inf | E8M0_MAX | NaN |
| x > E8M0_MAX | E8M0_MAX | NaN |
| x < E8M0_MIN | E8M0_MIN | NaN |
| x < 0 | Unspecified | Unspecified |
This version of the operator has been available since version 25 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Cast-1">1</a>, <a href="Changelog.md#Cast-6">6</a>, <a href="Changelog.md#Cast-9">9</a>, <a href="Changelog.md#Cast-13">13</a>, <a href="Changelog.md#Cast-19">19</a>, <a href="Changelog.md#Cast-21">21</a>, <a href="Changelog.md#Cast-23">23</a>, <a href="Changelog.md#Cast-24">24</a>
test_cases = [
("FLOAT", "FLOAT16"),
("FLOAT", "DOUBLE"),
("FLOAT16", "FLOAT"),
("FLOAT16", "DOUBLE"),
("DOUBLE", "FLOAT"),
("DOUBLE", "FLOAT16"),
("FLOAT", "BFLOAT16"),
("BFLOAT16", "FLOAT"),
("FLOAT", "FLOAT8E4M3FN"),
("FLOAT16", "FLOAT8E4M3FN"),
("FLOAT", "FLOAT8E4M3FNUZ"),
("FLOAT16", "FLOAT8E4M3FNUZ"),
("FLOAT8E4M3FN", "FLOAT"),
("FLOAT8E4M3FN", "FLOAT16"),
("FLOAT8E4M3FNUZ", "FLOAT"),
("FLOAT8E4M3FNUZ", "FLOAT16"),
("FLOAT", "FLOAT8E5M2"),
("FLOAT16", "FLOAT8E5M2"),
("FLOAT", "FLOAT8E5M2FNUZ"),
("FLOAT16", "FLOAT8E5M2FNUZ"),
("FLOAT8E5M2", "FLOAT"),
("FLOAT8E5M2", "FLOAT16"),
("FLOAT8E5M2FNUZ", "FLOAT"),
("FLOAT8E5M2FNUZ", "FLOAT16"),
("FLOAT", "UINT4"),
("FLOAT16", "UINT4"),
("FLOAT", "INT4"),
("FLOAT16", "INT4"),
("UINT4", "FLOAT"),
("UINT4", "FLOAT16"),
("UINT4", "UINT8"),
("INT4", "FLOAT"),
("INT4", "FLOAT16"),
("INT4", "INT8"),
("FLOAT4E2M1", "FLOAT"),
("FLOAT4E2M1", "FLOAT16"),
("FLOAT", "FLOAT4E2M1"),
("FLOAT16", "FLOAT4E2M1"),
("FLOAT", "UINT2"),
("FLOAT16", "UINT2"),
("FLOAT", "INT2"),
("FLOAT16", "INT2"),
("UINT2", "FLOAT"),
("UINT2", "FLOAT16"),
("UINT2", "UINT8"),
("INT2", "FLOAT"),
("INT2", "FLOAT16"),
("INT2", "INT8"),
]
for from_type, to_type in test_cases:
    if from_type == to_type:
        # Skip cases where from_type and to_type are the same
        continue
    from_dtype = getattr(TensorProto, from_type)
    to_dtype = getattr(TensorProto, to_type)
    from_np_dtype = tensor_dtype_to_np_dtype(from_dtype)
    to_np_dtype = tensor_dtype_to_np_dtype(to_dtype)
    if from_type == "BFLOAT16" or to_type == "BFLOAT16":
        np_fp32 = np.array(
            [
                "0.47892547",
                "0.48033667",
                "0.49968487",
                "0.81910545",
                "0.47031248",
                "0.816468",
                "0.21087195",
                "0.7229038",
                "NaN",
                "INF",
                "+INF",
                "-INF",
            ],
            dtype=np.float32,
        )
        input_shape = (3, 4)
    elif from_type in F8_TYPES or to_type in F8_TYPES:
        np_fp32 = np.array(
            [
                "0.47892547",
                "0.48033667",
                "0.49968487",
                "0.81910545",
                "0.47031248",
                "0.7229038",
                "1000000",
                "1e-7",
                "NaN",
                "INF",
                "+INF",
                "-INF",
                "-0.0000001",
                "0.0000001",
                "-1000000",
            ],
            dtype=np.float32,
        )
        input_shape = (3, 5)
    elif from_type in ("UINT4", "INT4") or to_type in ("UINT4", "INT4"):
        np_fp32 = np.arange(-9, 16).astype(np.float32)
        input_shape = (5, 5)
    elif from_type in ("UINT2", "INT2") or to_type in ("UINT2", "INT2"):
        np_fp32 = np.arange(-3, 4).astype(np.float32)
        input_shape = (7, 1)
    elif from_type == "FLOAT4E2M1" or to_type == "FLOAT4E2M1":
        np_fp32 = np.array(
            [
                "0.48",
                "0.25",
                "1.05",
                "-3.5",
                "-8",
                "9",
                "1000000",
                "1e-7",
                "NaN",
                "INF",
                "+INF",
                "-INF",
                "-4",
                "0.01",
                "-0.0",
            ],
            dtype=np.float32,
        )
        input_shape = (3, 5)
    else:
        np_fp32 = np.array(
            [
                "0.47892547",
                "0.48033667",
                "0.49968487",
                "0.81910545",
                "0.47031248",
                "0.816468",
                "0.21087195",
                "0.7229038",
                "NaN",
                "INF",
                "+INF",
                "-INF",
            ],
            dtype=np.float32,
        ).reshape([3, 4])
        input_shape = (3, 4)
    if from_type in F8_TYPES:
        np_from = onnx.numpy_helper.saturate_cast(np_fp32, from_np_dtype)
        input = make_tensor(
            "input",
            from_dtype,
            input_shape,
            vals=np_from,
            raw=True,
        )
    elif from_type in FOUR_BIT_TYPES:
        np_from = np_fp32.astype(from_np_dtype)
        packed = onnx.numpy_helper._pack_4bitx2(np_from)
        # No byteswap needed on big-endian machines as _pack_4bitx2()
        # returns a numpy array with uint8 datatype.
        input = make_tensor(
            "input", from_dtype, input_shape, vals=packed.tobytes(), raw=True
        )
    elif from_type in TWO_BIT_TYPES:
        np_from = np_fp32.astype(from_np_dtype)
        packed = onnx.numpy_helper._pack_2bitx4(np_from)
        input = make_tensor(
            "input", from_dtype, input_shape, vals=packed.tobytes(), raw=True
        )
    else:
        np_from = np_fp32.astype(from_np_dtype)
        input = make_tensor(
            "input", from_dtype, input_shape, vals=np_from, raw=True
        )
    if to_type in F8_TYPES:
        output = make_tensor(
            "output",
            to_dtype,
            input_shape,
            vals=onnx.numpy_helper.saturate_cast(np_from, to_np_dtype),
            raw=True,
        )
    elif to_type in FOUR_BIT_TYPES:
        packed = onnx.numpy_helper._pack_4bitx2(np_from.astype(to_np_dtype))
        # No byteswap needed on big-endian machines as _pack_4bitx2()
        # returns a numpy array with uint8 datatype.
        output = make_tensor(
            "output", to_dtype, input_shape, vals=packed.tobytes(), raw=True
        )
    elif to_type in TWO_BIT_TYPES:
        packed = onnx.numpy_helper._pack_2bitx4(np_from.astype(to_np_dtype))
        output = make_tensor(
            "output", to_dtype, input_shape, vals=packed.tobytes(), raw=True
        )
    else:
        output = make_tensor(
            "output",
            to_dtype,
            input_shape,
            vals=np_from.astype(to_np_dtype),
            raw=True,
        )
    node = onnx.helper.make_node(
        "Cast",
        inputs=["input"],
        outputs=["output"],
        to=to_dtype,
    )
    expect(
        node,
        inputs=[input],
        outputs=[output],
        name="test_cast_" + from_type + "_to_" + to_type,
    )
np_fp32 = np.array(
[
"0.0",
"0.124",
"0.25",
"0.5",
"1.1",
"2.0",
"4.0",
"8.0",
],
dtype=np.float32,
)
test_cases = [
("FLOAT", "FLOAT8E8M0"),
("FLOAT16", "FLOAT8E8M0"),
("FLOAT8E8M0", "FLOAT"),
("FLOAT8E8M0", "FLOAT16"),
]
for from_type, to_type in test_cases:
    if from_type == "FLOAT":
        input_np = np_fp32
        output_np = to_float8e8m0(np_fp32)
    elif from_type == "FLOAT16":
        input_np = np_fp32.astype(np.float16)
        output_np = to_float8e8m0(input_np)
    elif from_type == "FLOAT8E8M0":
        input_np = to_float8e8m0(np_fp32)
        if to_type == "FLOAT":
            output_np = input_np.astype(np.float32)
        elif to_type == "FLOAT16":
            output_np = input_np.astype(np.float16)
        else:
            raise ValueError(
                f"Conversion from {from_type} to {to_type} is not tested."
            )
    else:
        raise ValueError(
            f"Conversion from {from_type} to {to_type} is not tested."
        )
    input = make_tensor(
        "input",
        getattr(TensorProto, from_type),
        [2, 4],
        input_np,
        raw=True,
    )
    output = make_tensor(
        "output",
        getattr(TensorProto, to_type),
        [2, 4],
        output_np,
        raw=True,
    )
    if to_type == "FLOAT8E8M0":
        node = onnx.helper.make_node(
            "Cast",
            inputs=["input"],
            outputs=["output"],
            to=getattr(TensorProto, to_type),
            saturate=1,
            round_mode="up",
        )
    else:
        node = onnx.helper.make_node(
            "Cast",
            inputs=["input"],
            outputs=["output"],
            to=getattr(TensorProto, to_type),
        )
    expect(
        node,
        inputs=[input],
        outputs=[output],
        name="test_cast_e8m0_" + from_type + "_to_" + to_type,
    )
test_cases = itertools.product(
[
"FLOAT",
"FLOAT16",
],
[
"FLOAT8E4M3FN",
"FLOAT8E4M3FNUZ",
"FLOAT8E5M2",
"FLOAT8E5M2FNUZ",
],
)
input_shape = (3, 5)
for from_type, to_type in test_cases:
    from_dtype = getattr(TensorProto, from_type)
    to_dtype = getattr(TensorProto, to_type)
    from_np_dtype = tensor_dtype_to_np_dtype(from_dtype)
    to_np_dtype = tensor_dtype_to_np_dtype(to_dtype)
    np_fp32 = np.array(
        [
            "0.47892547",
            "0.48033667",
            "0.49968487",
            "0.81910545",
            "0.47031248",
            "0.7229038",
            "1000000",
            "1e-7",
            "NaN",
            "INF",
            "+INF",
            "-INF",
            "-0.0000001",
            "0.0000001",
            "-1000000",
        ],
        dtype=np.float32,
    )
    input = make_tensor(
        "input",
        from_dtype,
        input_shape,
        vals=np_fp32.astype(from_np_dtype),
        raw=True,
    )
    output = make_tensor(
        "output",
        to_dtype,
        input_shape,
        vals=np_fp32.astype(from_np_dtype).astype(to_np_dtype),
        raw=True,
    )
    node = onnx.helper.make_node(
        "Cast",
        inputs=["input"],
        outputs=["output"],
        to=to_dtype,
        saturate=0,
    )
    expect(
        node,
        inputs=[input],
        outputs=[output],
        name="test_cast_no_saturate_" + from_type + "_to_" + to_type,
    )
The operator casts the elements of a given input tensor (the first input) to the same data type as the elements of the second input tensor. See documentation of the Cast operator for further details.
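In NumPy terms, the idea can be sketched in one line: only the data type of the second tensor matters, not its shape or contents. The helper name `cast_like` is hypothetical, and NumPy's `astype` is assumed as a stand-in for the Cast semantics.

```python
import numpy as np


def cast_like(x, like):
    """Cast x to the element type of `like` (hypothetical helper)."""
    return x.astype(like.dtype)


x = np.array([0.5, 1.5, 2.0], dtype=np.float32)
like = np.empty((0,), dtype=np.float16)  # an empty tensor is enough to carry the type

y = cast_like(x, like)

print(y.dtype, y.shape)  # float16 (3,)
```

The test cases below follow the same pattern: the `like` input is built with shape `(0,)` and no values.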
This version of the operator has been available since version 25 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#CastLike-15">15</a>, <a href="Changelog.md#CastLike-19">19</a>, <a href="Changelog.md#CastLike-21">21</a>, <a href="Changelog.md#CastLike-23">23</a>, <a href="Changelog.md#CastLike-24">24</a>
test_cases = [
("FLOAT", "FLOAT16"),
("FLOAT", "DOUBLE"),
("FLOAT16", "FLOAT"),
("FLOAT16", "DOUBLE"),
("DOUBLE", "FLOAT"),
("DOUBLE", "FLOAT16"),
("FLOAT", "BFLOAT16"),
("BFLOAT16", "FLOAT"),
("FLOAT", "FLOAT8E4M3FN"),
("FLOAT16", "FLOAT8E4M3FN"),
("FLOAT", "FLOAT8E4M3FNUZ"),
("FLOAT16", "FLOAT8E4M3FNUZ"),
("FLOAT8E4M3FN", "FLOAT"),
("FLOAT8E4M3FN", "FLOAT16"),
("FLOAT8E4M3FNUZ", "FLOAT"),
("FLOAT8E4M3FNUZ", "FLOAT16"),
("FLOAT", "FLOAT8E5M2"),
("FLOAT16", "FLOAT8E5M2"),
("FLOAT", "FLOAT8E5M2FNUZ"),
("FLOAT16", "FLOAT8E5M2FNUZ"),
("FLOAT8E5M2", "FLOAT"),
("FLOAT8E5M2", "FLOAT16"),
("FLOAT8E5M2FNUZ", "FLOAT"),
("FLOAT8E5M2FNUZ", "FLOAT16"),
("FLOAT", "UINT4"),
("FLOAT16", "UINT4"),
("FLOAT", "INT4"),
("FLOAT16", "INT4"),
("UINT4", "FLOAT"),
("UINT4", "FLOAT16"),
("UINT4", "UINT8"),
("INT4", "FLOAT"),
("INT4", "FLOAT16"),
("INT4", "INT8"),
("FLOAT4E2M1", "FLOAT"),
("FLOAT4E2M1", "FLOAT16"),
("FLOAT", "FLOAT4E2M1"),
("FLOAT16", "FLOAT4E2M1"),
("FLOAT", "UINT2"),
("FLOAT16", "UINT2"),
("FLOAT", "INT2"),
("FLOAT16", "INT2"),
("UINT2", "FLOAT"),
("UINT2", "FLOAT16"),
("UINT2", "UINT8"),
("INT2", "FLOAT"),
("INT2", "FLOAT16"),
("INT2", "INT8"),
]
f8_types = {"FLOAT8E4M3FN", "FLOAT8E4M3FNUZ", "FLOAT8E5M2", "FLOAT8E5M2FNUZ"}
for from_type, to_type in test_cases:
    if from_type == to_type:
        # Skip cases where from_type and to_type are the same
        continue
    from_dtype = getattr(TensorProto, from_type)
    to_dtype = getattr(TensorProto, to_type)
    from_np_dtype = tensor_dtype_to_np_dtype(from_dtype)
    to_np_dtype = tensor_dtype_to_np_dtype(to_dtype)
    if from_type == "BFLOAT16" or to_type == "BFLOAT16":
        np_fp32 = np.array(
            [
                "0.47892547",
                "0.48033667",
                "0.49968487",
                "0.81910545",
                "0.47031248",
                "0.816468",
                "0.21087195",
                "0.7229038",
                "NaN",
                "INF",
                "+INF",
                "-INF",
            ],
            dtype=np.float32,
        )
        input_shape = (3, 4)
    elif from_type in f8_types or to_type in f8_types:
        np_fp32 = np.array(
            [
                "0.47892547",
                "0.48033667",
                "0.49968487",
                "0.81910545",
                "0.47031248",
                "0.7229038",
                "1000000",
                "1e-7",
                "NaN",
                "INF",
                "+INF",
                "-INF",
                "-0.0000001",
                "0.0000001",
                "-1000000",
            ],
            dtype=np.float32,
        )
        input_shape = (3, 5)
    elif from_type in ("UINT4", "INT4") or to_type in ("UINT4", "INT4"):
        np_fp32 = np.arange(-9, 16).astype(np.float32)
        input_shape = (5, 5)
    elif from_type in ("UINT2", "INT2") or to_type in ("UINT2", "INT2"):
        np_fp32 = np.arange(-3, 4).astype(np.float32)
        input_shape = (7, 1)
    elif from_type == "FLOAT4E2M1" or to_type == "FLOAT4E2M1":
        np_fp32 = np.array(
            [
                "0.48",
                "0.25",
                "1.05",
                "-3.5",
                "-8",
                "9",
                "1000000",
                "1e-7",
                "NaN",
                "INF",
                "+INF",
                "-INF",
                "-4",
                "0.01",
                "-0.0",
            ],
            dtype=np.float32,
        )
        input_shape = (3, 5)
    else:
        np_fp32 = np.array(
            [
                "0.47892547",
                "0.48033667",
                "0.49968487",
                "0.81910545",
                "0.47031248",
                "0.816468",
                "0.21087195",
                "0.7229038",
                "NaN",
                "INF",
                "+INF",
                "-INF",
            ],
            dtype=np.float32,
        ).reshape([3, 4])
        input_shape = (3, 4)
    if from_type in F8_TYPES:
        np_from = onnx.numpy_helper.saturate_cast(np_fp32, from_np_dtype)
        input = make_tensor(
            "input",
            from_dtype,
            input_shape,
            vals=np_from,
            raw=True,
        )
    elif from_type in FOUR_BIT_TYPES:
        np_from = np_fp32.astype(from_np_dtype)
        packed = onnx.numpy_helper._pack_4bitx2(np_from)
        # No byteswap needed on big-endian machines as _pack_4bitx2()
        # returns a numpy array with uint8 datatype.
        input = make_tensor(
            "input", from_dtype, input_shape, vals=packed.tobytes(), raw=True
        )
    elif from_type in TWO_BIT_TYPES:
        np_from = np_fp32.astype(from_np_dtype)
        packed = onnx.numpy_helper._pack_2bitx4(np_from)
        # No byteswap needed on big-endian machines as _pack_2bitx4()
        # returns a numpy array with uint8 datatype.
        input = make_tensor(
            "input", from_dtype, input_shape, vals=packed.tobytes(), raw=True
        )
    else:
        np_from = np_fp32.astype(from_np_dtype)
        input = make_tensor(
            "input", from_dtype, input_shape, vals=np_from, raw=True
        )
    if to_type in F8_TYPES:
        output = make_tensor(
            "output",
            to_dtype,
            input_shape,
            vals=onnx.numpy_helper.saturate_cast(np_from, to_np_dtype),
            raw=True,
        )
    elif to_type in FOUR_BIT_TYPES:
        packed = onnx.numpy_helper._pack_4bitx2(np_from.astype(to_np_dtype))
        # No byteswap needed on big-endian machines as _pack_4bitx2()
        # returns a numpy array with uint8 datatype.
        output = make_tensor(
            "output", to_dtype, input_shape, vals=packed.tobytes(), raw=True
        )
    elif to_type in TWO_BIT_TYPES:
        packed = onnx.numpy_helper._pack_2bitx4(np_from.astype(to_np_dtype))
        # No byteswap needed on big-endian machines as _pack_2bitx4()
        # returns a numpy array with uint8 datatype.
        output = make_tensor(
            "output", to_dtype, input_shape, vals=packed.tobytes(), raw=True
        )
    else:
        output = make_tensor(
            "output",
            to_dtype,
            input_shape,
            vals=np_from.astype(to_np_dtype),
            raw=True,
        )
    like = make_tensor("like", to_dtype, (0,), vals=[])
    node = onnx.helper.make_node(
        "CastLike",
        inputs=["input", "like"],
        outputs=["output"],
    )
    expect(
        node,
        inputs=[input, like],
        outputs=[output],
        name="test_castlike_" + from_type + "_to_" + to_type,
    )
test_cases = itertools.product(
[
"FLOAT",
"FLOAT16",
],
[
"FLOAT8E4M3FN",
"FLOAT8E4M3FNUZ",
"FLOAT8E5M2",
"FLOAT8E5M2FNUZ",
],
)
input_shape = (3, 5)
for from_type, to_type in test_cases:
    from_dtype = getattr(TensorProto, from_type)
    to_dtype = getattr(TensorProto, to_type)
    from_np_dtype = tensor_dtype_to_np_dtype(from_dtype)
    to_np_dtype = tensor_dtype_to_np_dtype(to_dtype)
    np_fp32 = np.array(
        [
            "0.47892547",
            "0.48033667",
            "0.49968487",
            "0.81910545",
            "0.47031248",
            "0.7229038",
            "1000000",
            "1e-7",
            "NaN",
            "INF",
            "+INF",
            "-INF",
            "-0.0000001",
            "0.0000001",
            "-1000000",
        ],
        dtype=np.float32,
    )
    input = make_tensor(
        "input",
        from_dtype,
        input_shape,
        vals=np_fp32.astype(from_np_dtype),
        raw=True,
    )
    output = make_tensor(
        "output",
        to_dtype,
        input_shape,
        vals=np_fp32.astype(from_np_dtype).astype(to_np_dtype),
        raw=True,
    )
    like = make_tensor("like", to_dtype, (0,), vals=[])
    node = onnx.helper.make_node(
        "CastLike",
        inputs=["input", "like"],
        outputs=["output"],
        saturate=0,
    )
    expect(
        node,
        inputs=[input, like],
        outputs=[output],
        name="test_castlike_no_saturate_" + from_type + "_to_" + to_type,
    )
Ceil takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the ceil function, y = ceil(x), is applied to the tensor elementwise. If x is integral, +0, -0, NaN, or infinite, x itself is returned.
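The special cases listed above can be checked with NumPy's `ceil`, which is assumed here to match the operator's behavior. The input values are chosen to cover an ordinary negative and positive value, an integral value, both zeros, NaN, and both infinities.

```python
import numpy as np

x = np.array(
    [-1.5, 1.2, 3.0, 0.0, -0.0, np.nan, np.inf, -np.inf], dtype=np.float32
)
y = np.ceil(x)

print(y)
# Integral values, signed zeros, NaN, and infinities come back unchanged;
# np.signbit confirms that -0.0 keeps its sign.
print(np.signbit(y[4]))  # True
```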
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Ceil-1">1</a>, <a href="Changelog.md#Ceil-6">6</a>
node = onnx.helper.make_node(
"Ceil",
inputs=["x"],
outputs=["y"],
)
x = np.array([-1.5, 1.2]).astype(np.float32)
y = np.ceil(x) # expected output [-1., 2.]
expect(node, inputs=[x], outputs=[y], name="test_ceil_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.ceil(x)
expect(node, inputs=[x], outputs=[y], name="test_ceil")
Continuously Differentiable Exponential Linear Units: Performs the linear unit element-wise on the input tensor X using the formula:
max(0,x) + min(0,alpha*(exp(x/alpha)-1))
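A direct NumPy transcription of the formula is sketched below. The function name `celu` and the default `alpha=1.0` are choices made for this example. For x >= 0 the second term vanishes and the input passes through; for very negative x the output approaches -alpha.

```python
import numpy as np


def celu(x, alpha=1.0):
    """max(0, x) + min(0, alpha * (exp(x / alpha) - 1)), applied element-wise."""
    x = np.asarray(x, dtype=np.float64)
    positive_part = np.maximum(0.0, x)
    negative_part = np.minimum(0.0, alpha * (np.exp(x / alpha) - 1.0))
    return positive_part + negative_part


out = celu([-1e9, -2.0, 0.0, 3.0], alpha=2.0)
print(out)  # first entry is -alpha, last two entries are 0 and 3
```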
This version of the operator has been available since version 12 of the default ONNX operator set.
alpha = 2.0
node = onnx.helper.make_node(
"Celu",
inputs=["X"],
outputs=["Y"],
alpha=alpha,
)
input_data = np.array(
[
[
[[0.8439683], [0.5665144], [0.05836735]],
[[0.02916367], [0.12964272], [0.5060197]],
[[0.79538304], [0.9411346], [0.9546573]],
],
[
[[0.17730942], [0.46192095], [0.26480448]],
[[0.6746842], [0.01665257], [0.62473077]],
[[0.9240844], [0.9722341], [0.11965699]],
],
[
[[0.41356155], [0.9129373], [0.59330076]],
[[0.81929934], [0.7862604], [0.11799799]],
[[0.69248444], [0.54119414], [0.07513223]],
],
],
dtype=np.float32,
)
# Calculate expected output data
positive_input = np.maximum(0, input_data)
negative_input = np.minimum(0, alpha * (np.exp(input_data / alpha) - 1))
expected_output = positive_input + negative_input
expect(node, inputs=[input_data], outputs=[expected_output], name="test_celu")
Center crop or pad an input to given dimensions.
The crop/pad dimensions can be specified for a subset of the axes; unspecified dimensions will remain unchanged.
If the input dimensions are larger than the target crop dimensions, a centered cropping window will be extracted from the input. The starting value for the cropping window is rounded down, which means that if the difference between the input shape and the crop shape is odd, the cropping window will be shifted half a pixel to the left of the input center.
If the input dimensions are smaller than the target crop dimensions, the input will be padded equally on both sides to center it in the output. In cases where the total number of padding pixels is odd, an additional pixel will be added to the right side.
The padding value used is zero.
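The crop and pad rules above can be sketched in NumPy as follows. The function name `center_crop_pad` and its signature are illustrative; the start offsets use floor division, which gives the "rounded down" crop start and puts the extra padding element on the right, as described.

```python
import numpy as np


def center_crop_pad(x, shape, axes=None):
    """Center crop or zero-pad x to `shape` along `axes` (illustrative sketch)."""
    axes = range(x.ndim) if axes is None else axes
    in_slices = [slice(None)] * x.ndim   # region read from the input
    out_slices = [slice(None)] * x.ndim  # region written in the output
    out_shape = list(x.shape)
    for axis, target in zip(axes, shape):
        axis = axis % x.ndim  # allow negative axes
        size = x.shape[axis]
        target = int(target)
        out_shape[axis] = target
        if size > target:
            # Crop: the start offset is rounded down.
            start = (size - target) // 2
            in_slices[axis] = slice(start, start + target)
        elif size < target:
            # Pad: any odd extra element lands on the right side.
            start = (target - size) // 2
            out_slices[axis] = slice(start, start + size)
    y = np.zeros(out_shape, dtype=x.dtype)
    y[tuple(out_slices)] = x[tuple(in_slices)]
    return y


x = np.arange(20 * 8 * 3, dtype=np.float32).reshape(20, 8, 3)
y_full = center_crop_pad(x, [10, 10, 3])            # crop dim 0, pad dim 1
y_axes = center_crop_pad(x, [10, 9], axes=[-3, -2])  # negative axes
```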
This version of the operator has been available since version 18 of the default ONNX operator set.
node = onnx.helper.make_node(
"CenterCropPad",
inputs=["x", "shape"],
outputs=["y"],
)
# First dim is even diff, second is uneven
x = np.random.randn(20, 10, 3).astype(np.float32)
shape = np.array([10, 7, 3], dtype=np.int64)
y = x[5:15, 1:8, :]
expect(node, inputs=[x, shape], outputs=[y], name="test_center_crop_pad_crop")
node = onnx.helper.make_node(
"CenterCropPad",
inputs=["x", "shape"],
outputs=["y"],
)
# Cropping on first dim, padding on second, third stays the same
x = np.random.randn(20, 8, 3).astype(np.float32)
shape = np.array([10, 10, 3], dtype=np.int64)
y = np.zeros([10, 10, 3], dtype=np.float32)
y[:, 1:9, :] = x[5:15, :, :]
expect(
node,
inputs=[x, shape],
outputs=[y],
name="test_center_crop_pad_crop_and_pad",
)
node = onnx.helper.make_node(
"CenterCropPad",
inputs=["x", "shape"],
outputs=["y"],
axes=[1, 2],
)
# Cropping on second dim, padding on third, first stays the same
x = np.random.randn(3, 20, 8).astype(np.float32)
shape = np.array([10, 9], dtype=np.int64)
y = np.zeros([3, 10, 9], dtype=np.float32)
y[:, :, :8] = x[:, 5:15, :]
expect(
node,
inputs=[x, shape],
outputs=[y],
name="test_center_crop_pad_crop_axes_chw",
)
node = onnx.helper.make_node(
"CenterCropPad",
inputs=["x", "shape"],
outputs=["y"],
axes=[0, 1],
)
# Cropping on first dim, padding on second, third stays the same
x = np.random.randn(20, 8, 3).astype(np.float32)
shape = np.array([10, 9], dtype=np.int64)
y = np.zeros([10, 9, 3], dtype=np.float32)
y[:, :8, :] = x[5:15, :, :]
expect(
node,
inputs=[x, shape],
outputs=[y],
name="test_center_crop_pad_crop_axes_hwc",
)
node = onnx.helper.make_node(
"CenterCropPad",
inputs=["x", "shape"],
outputs=["y"],
axes=[-3, -2],
)
# Cropping on first dim, padding on second, third stays the same
x = np.random.randn(20, 8, 3).astype(np.float32)
shape = np.array([10, 9], dtype=np.int64)
y = np.zeros([10, 9, 3], dtype=np.float32)
y[:, :8, :] = x[5:15, :, :]
expect(
node,
inputs=[x, shape],
outputs=[y],
name="test_center_crop_pad_crop_negative_axes_hwc",
)
node = onnx.helper.make_node(
"CenterCropPad",
inputs=["x", "shape"],
outputs=["y"],
)
# First dim is even diff, second is uneven
x = np.random.randn(10, 7, 3).astype(np.float32)
shape = np.array([20, 10, 3], dtype=np.int64)
y = np.zeros([20, 10, 3], dtype=np.float32)
y[5:15, 1:8, :] = x
expect(node, inputs=[x, shape], outputs=[y], name="test_center_crop_pad_pad")
Clip operator limits the given input within an interval. The interval is specified by the inputs 'min' and 'max'. They default to numeric_limits::lowest() and numeric_limits::max(), respectively. When 'min' is greater than 'max', the clip operator sets all the 'input' values to the value of 'max'. Thus, this is equivalent to 'Min(max, Max(input, min))'.
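The equivalence with 'Min(max, Max(input, min))' and the defaults for missing bounds can be sketched in NumPy. The names `clip` and `_limits` are hypothetical helpers; `np.finfo` and `np.iinfo` are used as counterparts of `numeric_limits::lowest()` and `numeric_limits::max()`.

```python
import numpy as np


def _limits(dtype):
    """Lowest and highest representable values of a dtype (hypothetical helper)."""
    if np.issubdtype(dtype, np.floating):
        info = np.finfo(dtype)
    else:
        info = np.iinfo(dtype)
    return info.min, info.max


def clip(x, min_val=None, max_val=None):
    """Min(max, Max(x, min)); missing bounds fall back to the dtype limits."""
    lowest, highest = _limits(x.dtype)
    lo = lowest if min_val is None else min_val
    hi = highest if max_val is None else max_val
    return np.minimum(hi, np.maximum(x, lo))


x = np.array([-2, 0, 6], dtype=np.float32)
inside = clip(x, np.float32(-1), np.float32(1))
swapped = clip(x, np.float32(2), np.float32(1))  # min > max: every value becomes max
unbounded = clip(x)

print(inside.tolist())     # [-1.0, 0.0, 1.0]
print(swapped.tolist())    # [1.0, 1.0, 1.0]
print(unbounded.tolist())  # [-2.0, 0.0, 6.0]
```

Applying Max before Min is what makes the min-greater-than-max case collapse to 'max'.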
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Clip-1">1</a>, <a href="Changelog.md#Clip-6">6</a>, <a href="Changelog.md#Clip-11">11</a>, <a href="Changelog.md#Clip-12">12</a>
node = onnx.helper.make_node(
"Clip",
inputs=["x", "min", "max"],
outputs=["y"],
)
x = np.array([-2, 0, 2]).astype(np.float32)
min_val = np.float32(-1)
max_val = np.float32(1)
y = np.clip(x, min_val, max_val) # expected output [-1., 0., 1.]
expect(
node, inputs=[x, min_val, max_val], outputs=[y], name="test_clip_example"
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, min_val, max_val)
expect(node, inputs=[x, min_val, max_val], outputs=[y], name="test_clip")
node = onnx.helper.make_node(
"Clip",
inputs=["x", "min", "max"],
outputs=["y"],
)
min_val = np.float32(-5)
max_val = np.float32(5)
x = np.array([-1, 0, 1]).astype(np.float32)
y = np.array([-1, 0, 1]).astype(np.float32)
expect(
node, inputs=[x, min_val, max_val], outputs=[y], name="test_clip_inbounds"
)
x = np.array([-6, 0, 6]).astype(np.float32)
y = np.array([-5, 0, 5]).astype(np.float32)
expect(
node, inputs=[x, min_val, max_val], outputs=[y], name="test_clip_outbounds"
)
x = np.array([-1, 0, 6]).astype(np.float32)
y = np.array([-1, 0, 5]).astype(np.float32)
expect(
node,
inputs=[x, min_val, max_val],
outputs=[y],
name="test_clip_splitbounds",
)
x = np.array([-2, 0, 6]).astype(np.float32)
y = np.array([1, 1, 1]).astype(np.float32)
min_val = np.float32(2)
max_val = np.float32(1)
expect(
node,
inputs=[x, min_val, max_val],
outputs=[y],
name="test_clip_min_greater_than_max",
)
node = onnx.helper.make_node(
"Clip",
inputs=["x", "min"],
outputs=["y"],
)
min_val = np.float32(0)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, min_val, np.inf)
expect(node, inputs=[x, min_val], outputs=[y], name="test_clip_default_min")
no_min = "" # optional input, not supplied
node = onnx.helper.make_node(
"Clip",
inputs=["x", no_min, "max"],
outputs=["y"],
)
max_val = np.float32(0)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, -np.inf, max_val)
expect(node, inputs=[x, max_val], outputs=[y], name="test_clip_default_max")
no_max = "" # optional input, not supplied
node = onnx.helper.make_node(
"Clip",
inputs=["x", no_min, no_max],
outputs=["y"],
)
x = np.array([-1, 0, 1]).astype(np.float32)
y = np.array([-1, 0, 1]).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_clip_default_inbounds")
node = onnx.helper.make_node(
"Clip",
inputs=["x", "min"],
outputs=["y"],
)
min_val = np.int8(0)
x = np.random.randn(3, 4, 5).astype(np.int8)
y = np.clip(x, min_val, np.iinfo(np.int8).max)
expect(
node, inputs=[x, min_val], outputs=[y], name="test_clip_default_int8_min"
)
no_min = "" # optional input, not supplied
node = onnx.helper.make_node(
"Clip",
inputs=["x", no_min, "max"],
outputs=["y"],
)
max_val = np.int8(0)
x = np.random.randn(3, 4, 5).astype(np.int8)
y = np.clip(x, np.iinfo(np.int8).min, max_val)
expect(
node, inputs=[x, max_val], outputs=[y], name="test_clip_default_int8_max"
)
no_max = "" # optional input, not supplied
node = onnx.helper.make_node(
"Clip",
inputs=["x", no_min, no_max],
outputs=["y"],
)
x = np.array([-1, 0, 1]).astype(np.int8)
y = np.array([-1, 0, 1]).astype(np.int8)
expect(node, inputs=[x], outputs=[y], name="test_clip_default_int8_inbounds")
The operator rearranges column blocks back into a multidimensional image.
Col2Im behaves similarly to PyTorch's fold (https://pytorch.org/docs/stable/generated/torch.nn.Fold.html), but it only supports batched multi-dimensional image tensors. Another Python implementation with N-dimensional support can be found at https://github.com/f-dangel/unfoldNd/.
NOTE: Although specifying image_shape looks redundant because it could be calculated from convolution formulas, it is required as input for more advanced scenarios, as explained in PyTorch's implementation (https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/Col2Im.cpp#L10).
This version of the operator has been available since version 18 of the default ONNX operator set.
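The rearrangement can be sketched in NumPy for the two-dimensional case. This is an illustrative sketch only, not the reference implementation: the function name `col2im_2d` is hypothetical, it handles exactly two spatial axes, and it assumes `pads` follows the order `[h_begin, w_begin, h_end, w_end]`. Values from overlapping blocks are summed.

```python
import numpy as np


def col2im_2d(cols, image_shape, block_shape,
              strides=(1, 1), dilations=(1, 1), pads=(0, 0, 0, 0)):
    """Hypothetical 2-D sketch: scatter-add column blocks into an image."""
    n, ck, num_blocks = cols.shape
    h, w = image_shape
    kh, kw = block_shape
    c = ck // (kh * kw)
    # Number of block positions along each spatial axis.
    out_h = (h + pads[0] + pads[2] - dilations[0] * (kh - 1) - 1) // strides[0] + 1
    out_w = (w + pads[1] + pads[3] - dilations[1] * (kw - 1) - 1) // strides[1] + 1
    assert out_h * out_w == num_blocks, "input does not match the given shapes"
    padded = np.zeros((n, c, h + pads[0] + pads[2], w + pads[1] + pads[3]),
                      dtype=cols.dtype)
    blocks = cols.reshape(n, c, kh, kw, out_h, out_w)
    for i in range(kh):
        for j in range(kw):
            for oy in range(out_h):
                for ox in range(out_w):
                    y = oy * strides[0] + i * dilations[0]
                    x = ox * strides[1] + j * dilations[1]
                    padded[:, :, y, x] += blocks[:, :, i, j, oy, ox]
    # Drop the padding border to recover the requested image size.
    return padded[:, :, pads[0]:pads[0] + h, pads[1]:pads[1] + w]


# Five 1x5 blocks, one per image row: column l holds the values of row l.
cols = np.arange(1, 26, dtype=np.float32).reshape(5, 5).T[np.newaxis]  # (1, 5, 5)
image = col2im_2d(cols, image_shape=(5, 5), block_shape=(1, 5))
print(image.shape)  # (1, 1, 5, 5)
```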
input = np.array(
[
[
[1.0, 6.0, 11.0, 16.0, 21.0], # (1, 5, 5)
[2.0, 7.0, 12.0, 17.0, 22.0],
[3.0, 8.0, 13.0, 18.0, 23.0],
[4.0, 9.0, 14.0, 19.0, 24.0],
[5.0, 0.0, 15.0, 20.0, 25.0],
]
]
).astype(np.float32)
image_shape = np.array([5, 5]).astype(np.int64)
block_shape = np.array([1, 5]).astype(np.int64)
node = onnx.helper.make_node(
"Col2Im", ["input", "image_shape", "block_shape"], ["output"]
)
output = np.array(
[
[
[
[1.0, 2.0, 3.0, 4.0, 5.0], # (1, 1, 5, 5)
[6.0, 7.0, 8.0, 9.0, 0.0],
[11.0, 12.0, 13.0, 14.0, 15.0],
[16.0, 17.0, 18.0, 19.0, 20.0],
[21.0, 22.0, 23.0, 24.0, 25.0],
]
]
]
).astype(np.float32)
expect(
node,
inputs=[input, image_shape, block_shape],
outputs=[output],
name="test_col2im",
)
input = np.array(
[
[
[1, 6, 11, 16, 21, 26, 31, 36, 41, 46, 51, 56], # (1, 10, 12)
[2, 7, 12, 17, 22, 27, 32, 37, 42, 47, 52, 57],
[3, 8, 13, 18, 23, 28, 33, 38, 43, 48, 53, 58],
[4, 9, 14, 19, 24, 29, 34, 39, 44, 49, 54, 59],
[5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60],
[61, 66, 71, 76, 81, 86, 91, 96, 101, 106, 111, 116],
[62, 67, 72, 77, 82, 87, 92, 97, 102, 107, 112, 117],
[63, 68, 73, 78, 83, 88, 93, 98, 103, 108, 113, 118],
[64, 69, 74, 79, 84, 89, 94, 99, 104, 109, 114, 119],
[65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120],
]
]
).astype(np.float32)
image_shape = np.array([3, 4, 5]).astype(np.int64)
block_shape = np.array([1, 1, 5]).astype(np.int64)
output = np.array(
[
[
[
[
[1, 2, 3, 4, 5], # (1, 2, 3, 4, 5)
[6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
],
[
[21, 22, 23, 24, 25],
[26, 27, 28, 29, 30],
[31, 32, 33, 34, 35],
[36, 37, 38, 39, 40],
],
[
[41, 42, 43, 44, 45],
[46, 47, 48, 49, 50],
[51, 52, 53, 54, 55],
[56, 57, 58, 59, 60],
],
],
[
[
[61, 62, 63, 64, 65],
[66, 67, 68, 69, 70],
[71, 72, 73, 74, 75],
[76, 77, 78, 79, 80],
],
[
[81, 82, 83, 84, 85],
[86, 87, 88, 89, 90],
[91, 92, 93, 94, 95],
[96, 97, 98, 99, 100],
],
[
[101, 102, 103, 104, 105],
[106, 107, 108, 109, 110],
[111, 112, 113, 114, 115],
[116, 117, 118, 119, 120],
],
],
]
]
).astype(np.float32)
node = onnx.helper.make_node(
"Col2Im", ["input", "image_shape", "block_shape"], ["output"]
)
expect(
node,
inputs=[input, image_shape, block_shape],
outputs=[output],
name="test_col2im_5d",
)
input = np.array(
[
[
[1.0, 5.0, 9.0, 13.0, 17], # (1, 4, 5)
[2.0, 6.0, 10.0, 14.0, 18],
[3.0, 7.0, 11.0, 15.0, 19],
[4.0, 8.0, 12.0, 16.0, 20],
]
]
).astype(np.float32)
image_shape = np.array([6, 6]).astype(np.int64)
block_shape = np.array([2, 2]).astype(np.int64)
output = np.array(
[
[
[
[1.0, 0.0, 0.0, 0.0, 0.0, 2.0], # (1, 1, 6, 6)
[8.0, 0.0, 0.0, 0.0, 0.0, 10.0],
[16.0, 0.0, 0.0, 0.0, 0.0, 18.0],
[24.0, 0.0, 0.0, 0.0, 0.0, 26.0],
[32.0, 0.0, 0.0, 0.0, 0.0, 34.0],
[19.0, 0.0, 0.0, 0.0, 0.0, 20.0],
]
]
]
).astype(np.float32)
node = onnx.helper.make_node(
"Col2Im",
["input", "image_shape", "block_shape"],
["output"],
dilations=[1, 5],
)
expect(
node,
inputs=[input, image_shape, block_shape],
outputs=[output],
name="test_col2im_dilations",
)
input = np.array(
[
[
[
1.0,
6.0,
11.0,
16.0,
21.0,
26,
31,
36,
41,
46,
51,
56,
61,
66,
71,
], # (1, 5, 15)
[
2.0,
7.0,
12.0,
17.0,
22.0,
27,
32,
37,
42,
47,
52,
57,
62,
67,
72,
],
[
3.0,
8.0,
13.0,
18.0,
23.0,
28,
33,
38,
43,
48,
53,
58,
63,
68,
73,
],
[
4.0,
9.0,
14.0,
19.0,
24.0,
29,
34,
39,
44,
49,
54,
59,
64,
69,
74,
],
[
5.0,
10.0,
15.0,
20.0,
25.0,
30,
35,
40,
45,
50,
55,
60,
65,
70,
75,
],
]
]
).astype(np.float32)
image_shape = np.array([5, 5]).astype(np.int64)
block_shape = np.array([1, 5]).astype(np.int64)
output = np.array(
[
[
[
[8.0, 21.0, 24.0, 27.0, 24.0], # (1, 1, 5, 5)
[38.0, 66.0, 69.0, 72.0, 54.0],
[68.0, 111.0, 114.0, 117.0, 84.0],
[98.0, 156.0, 159.0, 162.0, 114.0],
[128.0, 201.0, 204.0, 207.0, 144.0],
]
]
]
).astype(np.float32)
node = onnx.helper.make_node(
"Col2Im",
["input", "image_shape", "block_shape"],
["output"],
pads=[0, 1, 0, 1],
)
expect(
node,
inputs=[input, image_shape, block_shape],
outputs=[output],
name="test_col2im_pads",
)
input = np.array(
[
[
[0.0, 0.0, 0.0, 0.0], # (1, 9, 4)
[1.0, 1.0, 1.0, 1.0],
[1.0, 1.0, 1.0, 1.0],
[1.0, 1.0, 1.0, 1.0],
[0.0, 0.0, 0.0, 0.0],
[0.0, 0.0, 0.0, 0.0],
[0.0, 0.0, 0.0, 0.0],
[1.0, 1.0, 1.0, 1.0],
[0.0, 0.0, 0.0, 0.0],
]
]
).astype(np.float32)
image_shape = np.array([5, 5]).astype(np.int64)
block_shape = np.array([3, 3]).astype(np.int64)
output = np.array(
[
[
[
[0.0, 1.0, 1.0, 1.0, 1.0], # (1, 1, 5, 5)
[1.0, 0.0, 1.0, 0.0, 0.0],
[0.0, 2.0, 1.0, 2.0, 1.0],
[1.0, 0.0, 1.0, 0.0, 0.0],
[0.0, 1.0, 0.0, 1.0, 0.0],
]
]
]
).astype(np.float32)
node = onnx.helper.make_node(
"Col2Im",
["input", "image_shape", "block_shape"],
["output"],
strides=[2, 2],
)
expect(
node,
inputs=[input, image_shape, block_shape],
outputs=[output],
name="test_col2im_strides",
)
Selects slices from an input tensor along a given axis where condition evaluates to True for each axis index. In case axis is not provided, input is flattened before elements are selected. Compress behaves like numpy.compress: https://docs.scipy.org/doc/numpy/reference/generated/numpy.compress.html
This version of the operator has been available since version 11 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Compress-9">9</a>
node = onnx.helper.make_node(
"Compress",
inputs=["input", "condition"],
outputs=["output"],
axis=0,
)
input = np.array([[1, 2], [3, 4], [5, 6]]).astype(np.float32)
condition = np.array([0, 1, 1])
output = np.compress(condition, input, axis=0)
# print(output)
# [[ 3. 4.]
# [ 5. 6.]]
expect(
node,
inputs=[input, condition.astype(bool)],
outputs=[output],
name="test_compress_0",
)
node = onnx.helper.make_node(
"Compress",
inputs=["input", "condition"],
outputs=["output"],
axis=1,
)
input = np.array([[1, 2], [3, 4], [5, 6]]).astype(np.float32)
condition = np.array([0, 1])
output = np.compress(condition, input, axis=1)
# print(output)
# [[ 2.]
# [ 4.]
# [ 6.]]
expect(
node,
inputs=[input, condition.astype(bool)],
outputs=[output],
name="test_compress_1",
)
node = onnx.helper.make_node(
"Compress",
inputs=["input", "condition"],
outputs=["output"],
)
input = np.array([[1, 2], [3, 4], [5, 6]]).astype(np.float32)
condition = np.array([0, 1, 0, 0, 1])
output = np.compress(condition, input)
# print(output)
# [ 2., 5.]
expect(
node,
inputs=[input, condition.astype(bool)],
outputs=[output],
name="test_compress_default_axis",
)
node = onnx.helper.make_node(
"Compress",
inputs=["input", "condition"],
outputs=["output"],
axis=-1,
)
input = np.array([[1, 2], [3, 4], [5, 6]]).astype(np.float32)
condition = np.array([0, 1])
output = np.compress(condition, input, axis=-1)
# print(output)
# [[ 2.]
# [ 4.]
# [ 6.]]
expect(
node,
inputs=[input, condition.astype(bool)],
outputs=[output],
name="test_compress_negative_axis",
)
Concatenate a list of tensors into a single tensor. All input tensors must have the same shape, except for the dimension size of the axis to concatenate on.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Concat-1">1</a>, <a href="Changelog.md#Concat-4">4</a>, <a href="Changelog.md#Concat-11">11</a>
test_cases: dict[str, Sequence[Any]] = {
"1d": ([1, 2], [3, 4]),
"2d": ([[1, 2], [3, 4]], [[5, 6], [7, 8]]),
"3d": (
[[[1, 2], [3, 4]], [[5, 6], [7, 8]]],
[[[9, 10], [11, 12]], [[13, 14], [15, 16]]],
),
}
for test_case, values_ in test_cases.items():
values = [np.asarray(v, dtype=np.float32) for v in values_]
for i in range(len(values[0].shape)):
in_args = ["value" + str(k) for k in range(len(values))]
node = onnx.helper.make_node(
"Concat", inputs=list(in_args), outputs=["output"], axis=i
)
output = np.concatenate(values, i)
expect(
node,
inputs=list(values),
outputs=[output],
name="test_concat_" + test_case + "_axis_" + str(i),
)
for i in range(-len(values[0].shape), 0):
in_args = ["value" + str(k) for k in range(len(values))]
node = onnx.helper.make_node(
"Concat", inputs=list(in_args), outputs=["output"], axis=i
)
output = np.concatenate(values, i)
expect(
node,
inputs=list(values),
outputs=[output],
name="test_concat_" + test_case + "_axis_negative_" + str(abs(i)),
)
Concatenate a sequence of tensors into a single tensor. All input tensors must have the same shape, except for the dimension size of the axis to concatenate on. By default 'new_axis' is 0, and the behavior is similar to numpy.concatenate. When 'new_axis' is 1, the behavior is similar to numpy.stack.
This version of the operator has been available since version 11 of the default ONNX operator set.
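The difference between the two 'new_axis' settings can be illustrated with the NumPy functions named above. This is a sketch of the analogy only; the tensor sequence is modelled as a plain Python list, which is an assumption made for illustration.

```python
import numpy as np

# A tensor sequence, modelled here as a list of equally shaped arrays.
sequence = [
    np.array([[1, 2], [3, 4]], dtype=np.float32),
    np.array([[5, 6], [7, 8]], dtype=np.float32),
]

# new_axis=0: join along an existing axis, like numpy.concatenate.
concatenated = np.concatenate(sequence, axis=0)
print(concatenated.shape)  # (4, 2)

# new_axis=1: insert a new axis at position `axis`, like numpy.stack.
stacked = np.stack(sequence, axis=0)
print(stacked.shape)  # (2, 2, 2)
```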
This operator produces a constant tensor. Exactly one of the attributes value, sparse_value, or value_* must be specified.
This version of the operator has been available since version 25 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Constant-1">1</a>, <a href="Changelog.md#Constant-9">9</a>, <a href="Changelog.md#Constant-11">11</a>, <a href="Changelog.md#Constant-12">12</a>, <a href="Changelog.md#Constant-13">13</a>, <a href="Changelog.md#Constant-19">19</a>, <a href="Changelog.md#Constant-21">21</a>, <a href="Changelog.md#Constant-23">23</a>, <a href="Changelog.md#Constant-24">24</a>
values = np.random.randn(5, 5).astype(np.float32)
node = onnx.helper.make_node(
"Constant",
inputs=[],
outputs=["values"],
value=onnx.helper.make_tensor(
name="const_tensor",
data_type=onnx.TensorProto.FLOAT,
dims=values.shape,
vals=values.flatten().astype(float),
),
)
expect(node, inputs=[], outputs=[values], name="test_constant")
Generate a tensor with given value and shape.
This version of the operator has been available since version 25 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#ConstantOfShape-9">9</a>, <a href="Changelog.md#ConstantOfShape-20">20</a>, <a href="Changelog.md#ConstantOfShape-21">21</a>, <a href="Changelog.md#ConstantOfShape-23">23</a>, <a href="Changelog.md#ConstantOfShape-24">24</a>
x = np.array([4, 3, 2]).astype(np.int64)
tensor_value = onnx.helper.make_tensor(
"value", onnx.TensorProto.FLOAT, [1], [1]
)
node = onnx.helper.make_node(
"ConstantOfShape",
inputs=["x"],
outputs=["y"],
value=tensor_value,
)
y = np.ones(x, dtype=np.float32)
expect(node, inputs=[x], outputs=[y], name="test_constantofshape_float_ones")
x = np.array(
[
0,
]
).astype(np.int64)
tensor_value = onnx.helper.make_tensor(
"value", onnx.TensorProto.INT32, [1], [0]
)
node = onnx.helper.make_node(
"ConstantOfShape",
inputs=["x"],
outputs=["y"],
value=tensor_value,
)
y = np.zeros(x, dtype=np.int32)
expect(
node, inputs=[x], outputs=[y], name="test_constantofshape_int_shape_zero"
)
x = np.array([10, 6]).astype(np.int64)
tensor_value = onnx.helper.make_tensor(
"value", onnx.TensorProto.INT32, [1], [0]
)
node = onnx.helper.make_node(
"ConstantOfShape",
inputs=["x"],
outputs=["y"],
value=tensor_value,
)
y = np.zeros(x, dtype=np.int32)
expect(node, inputs=[x], outputs=[y], name="test_constantofshape_int_zeros")
The convolution operator consumes an input tensor and a filter, and computes the output.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Conv-1">1</a>, <a href="Changelog.md#Conv-11">11</a>
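As an illustration of how the output is computed, the sketch below performs a direct two-dimensional convolution in NumPy. It is not the reference implementation: the helper name `conv2d_reference` is hypothetical, it assumes group=1 and dilations=[1, 1], it slides the filter without flipping it, and it assumes `pads` follows the order `[h_begin, w_begin, h_end, w_end]`.

```python
import numpy as np


def conv2d_reference(x, w, pads=(0, 0, 0, 0), strides=(1, 1)):
    """Hypothetical helper: direct 2-D convolution, group=1, dilations=1."""
    n, _, _, _ = x.shape
    c_out, _, kh, kw = w.shape
    # Zero-pad the two spatial axes only.
    xp = np.pad(x, ((0, 0), (0, 0), (pads[0], pads[2]), (pads[1], pads[3])))
    out_h = (xp.shape[2] - kh) // strides[0] + 1
    out_w = (xp.shape[3] - kw) // strides[1] + 1
    y = np.zeros((n, c_out, out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * strides[0], j * strides[1]
            window = xp[:, :, top:top + kh, left:left + kw]  # (n, c_in, kh, kw)
            # Sum over input channels and kernel positions per output channel.
            y[:, :, i, j] = np.tensordot(window, w, axes=([1, 2, 3], [1, 2, 3]))
    return y


x = np.arange(25, dtype=np.float32).reshape(1, 1, 5, 5)
w = np.ones((1, 1, 3, 3), dtype=np.float32)

y_valid = conv2d_reference(x, w)                     # no padding
y_same = conv2d_reference(x, w, pads=(1, 1, 1, 1))   # one pixel of zero padding
print(y_valid.shape, y_same.shape)  # (1, 1, 3, 3) (1, 1, 5, 5)
```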
x = np.array(
[
[
[
[0.0, 1.0, 2.0, 3.0, 4.0], # (1, 1, 5, 5) input tensor
[5.0, 6.0, 7.0, 8.0, 9.0],
[10.0, 11.0, 12.0, 13.0, 14.0],
[15.0, 16.0, 17.0, 18.0, 19.0],
[20.0, 21.0, 22.0, 23.0, 24.0],
]
]
]
).astype(np.float32)
W = np.array(
[
[
[
[1.0, 1.0, 1.0], # (1, 1, 3, 3) tensor for convolution weights
[1.0, 1.0, 1.0],
[1.0, 1.0, 1.0],
]
]
]
).astype(np.float32)
# Convolution with padding
node_with_padding = onnx.helper.make_node(
"Conv",
inputs=["x", "W"],
outputs=["y"],
kernel_shape=[3, 3],
# Default values for other attributes: strides=[1, 1], dilations=[1, 1], groups=1
pads=[1, 1, 1, 1],
)
y_with_padding = np.array(
[
[
[
[12.0, 21.0, 27.0, 33.0, 24.0], # (1, 1, 5, 5) output tensor
[33.0, 54.0, 63.0, 72.0, 51.0],
[63.0, 99.0, 108.0, 117.0, 81.0],
[93.0, 144.0, 153.0, 162.0, 111.0],
[72.0, 111.0, 117.0, 123.0, 84.0],
]
]
]
).astype(np.float32)
expect(
node_with_padding,
inputs=[x, W],
outputs=[y_with_padding],
name="test_basic_conv_with_padding",
)
# Convolution without padding
node_without_padding = onnx.helper.make_node(
"Conv",
inputs=["x", "W"],
outputs=["y"],
kernel_shape=[3, 3],
# Default values for other attributes: strides=[1, 1], dilations=[1, 1], groups=1
pads=[0, 0, 0, 0],
)
y_without_padding = np.array(
[
[
[
[54.0, 63.0, 72.0], # (1, 1, 3, 3) output tensor
[99.0, 108.0, 117.0],
[144.0, 153.0, 162.0],
]
]
]
).astype(np.float32)
expect(
node_without_padding,
inputs=[x, W],
outputs=[y_without_padding],
name="test_basic_conv_without_padding",
)
x = np.array(
[
[
[
[0.0, 1.0, 2.0, 3.0, 4.0], # (1, 1, 5, 5) input tensor
[5.0, 6.0, 7.0, 8.0, 9.0],
[10.0, 11.0, 12.0, 13.0, 14.0],
[15.0, 16.0, 17.0, 18.0, 19.0],
[20.0, 21.0, 22.0, 23.0, 24.0],
]
]
]
).astype(np.float32)
W = np.array(
[
[
[
[1.0, 1.0, 1.0], # (1, 1, 3, 3) tensor for convolution weights
[1.0, 1.0, 1.0],
[1.0, 1.0, 1.0],
]
]
]
).astype(np.float32)
# Convolution with auto_pad='SAME_LOWER' and strides=2
node = onnx.helper.make_node(
"Conv",
inputs=["x", "W"],
outputs=["y"],
auto_pad="SAME_LOWER",
kernel_shape=[3, 3],
strides=[2, 2],
)
y = np.array(
[[[[12.0, 27.0, 24.0], [63.0, 108.0, 81.0], [72.0, 117.0, 84.0]]]]
).astype(np.float32)
expect(node, inputs=[x, W], outputs=[y], name="test_conv_with_autopad_same")
x = np.array(
[
[
[
[0.0, 1.0, 2.0, 3.0, 4.0], # (1, 1, 7, 5) input tensor
[5.0, 6.0, 7.0, 8.0, 9.0],
[10.0, 11.0, 12.0, 13.0, 14.0],
[15.0, 16.0, 17.0, 18.0, 19.0],
[20.0, 21.0, 22.0, 23.0, 24.0],
[25.0, 26.0, 27.0, 28.0, 29.0],
[30.0, 31.0, 32.0, 33.0, 34.0],
]
]
]
).astype(np.float32)
W = np.array(
[
[
[
[1.0, 1.0, 1.0], # (1, 1, 3, 3) tensor for convolution weights
[1.0, 1.0, 1.0],
[1.0, 1.0, 1.0],
]
]
]
).astype(np.float32)
# Convolution with strides=2 and padding
node_with_padding = onnx.helper.make_node(
"Conv",
inputs=["x", "W"],
outputs=["y"],
kernel_shape=[3, 3],
pads=[1, 1, 1, 1],
strides=[
2,
2,
], # Default values for other attributes: dilations=[1, 1], groups=1
)
y_with_padding = np.array(
[
[
[
[12.0, 27.0, 24.0], # (1, 1, 4, 3) output tensor
[63.0, 108.0, 81.0],
[123.0, 198.0, 141.0],
[112.0, 177.0, 124.0],
]
]
]
).astype(np.float32)
expect(
node_with_padding,
inputs=[x, W],
outputs=[y_with_padding],
name="test_conv_with_strides_padding",
)
# Convolution with strides=2 and no padding
node_without_padding = onnx.helper.make_node(
"Conv",
inputs=["x", "W"],
outputs=["y"],
kernel_shape=[3, 3],
pads=[0, 0, 0, 0],
strides=[
2,
2,
], # Default values for other attributes: dilations=[1, 1], groups=1
)
y_without_padding = np.array(
[
[
[
[54.0, 72.0], # (1, 1, 3, 2) output tensor
[144.0, 162.0],
[234.0, 252.0],
]
]
]
).astype(np.float32)
expect(
node_without_padding,
inputs=[x, W],
outputs=[y_without_padding],
name="test_conv_with_strides_no_padding",
)
# Convolution with strides=2 and padding only along one dimension (the H dimension in NxCxHxW tensor)
node_with_asymmetric_padding = onnx.helper.make_node(
"Conv",
inputs=["x", "W"],
outputs=["y"],
kernel_shape=[3, 3],
pads=[1, 0, 1, 0],
strides=[
2,
2,
], # Default values for other attributes: dilations=[1, 1], groups=1
)
y_with_asymmetric_padding = np.array(
[
[
[
[21.0, 33.0], # (1, 1, 4, 2) output tensor
[99.0, 117.0],
[189.0, 207.0],
[171.0, 183.0],
]
]
]
).astype(np.float32)
expect(
node_with_asymmetric_padding,
inputs=[x, W],
outputs=[y_with_asymmetric_padding],
name="test_conv_with_strides_and_asymmetric_padding",
)
The integer convolution operator consumes an input tensor, its zero-point, a filter, and its zero-point, and computes the output. The product MUST never overflow. The accumulation may overflow if and only if it is performed in 32 bits.
This version of the operator has been available since version 10 of the default ONNX operator set.
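The role of the zero-points can be sketched in NumPy: each operand is widened to 32 bits, its zero-point is subtracted, and the products are accumulated as int32. This is an illustrative sketch for a single channel without padding, not the reference implementation; treating an omitted filter zero-point as 0 is an assumption here.

```python
import numpy as np

x = np.arange(2, 11, dtype=np.uint8).reshape(1, 1, 3, 3)
x_zero_point = np.uint8(1)
w = np.ones((1, 1, 2, 2), dtype=np.uint8)
w_zero_point = np.uint8(0)  # assumed default when the input is omitted

# Widen to int32 before subtracting so that uint8 arithmetic cannot wrap around.
x32 = x.astype(np.int32) - np.int32(x_zero_point)
w32 = w.astype(np.int32) - np.int32(w_zero_point)

kh, kw = w32.shape[2:]
out_h = x32.shape[2] - kh + 1
out_w = x32.shape[3] - kw + 1
y = np.zeros((1, 1, out_h, out_w), dtype=np.int32)
for i in range(out_h):
    for j in range(out_w):
        # Multiply the window by the filter and accumulate in 32 bits.
        y[0, 0, i, j] = np.sum(x32[0, 0, i:i + kh, j:j + kw] * w32[0, 0])

print(y.reshape(-1))  # [12 16 24 28]
```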
x = (
np.array([2, 3, 4, 5, 6, 7, 8, 9, 10])
.astype(np.uint8)
.reshape((1, 1, 3, 3))
)
x_zero_point = np.uint8(1)
w_zero_points = np.array([0, 1], dtype=np.uint8)
w = np.array([1, 1, 1, 1, 1, 1, 1, 1]).astype(np.uint8).reshape((2, 1, 2, 2))
y = (
np.array(
[
1,
3,
5,
3,
5,
12,
16,
9,
11,
24,
28,
15,
7,
15,
17,
9,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
]
)
.astype(np.int32)
.reshape((1, 2, 4, 4))
)
# ConvInteger with padding
convinteger_node_with_padding = onnx.helper.make_node(
"ConvInteger",
inputs=["x", "w", "x_zero_point", "w_zero_points"],
outputs=["y"],
pads=[1, 1, 1, 1],
)
expect(
convinteger_node_with_padding,
inputs=[x, w, x_zero_point, w_zero_points],
outputs=[y],
name="test_convinteger_with_padding",
)
x = (
np.array([2, 3, 4, 5, 6, 7, 8, 9, 10])
.astype(np.uint8)
.reshape((1, 1, 3, 3))
)
x_zero_point = np.uint8(1)
w = np.array([1, 1, 1, 1]).astype(np.uint8).reshape((1, 1, 2, 2))
y = np.array([12, 16, 24, 28]).astype(np.int32).reshape(1, 1, 2, 2)
# ConvInteger without padding
convinteger_node = onnx.helper.make_node(
"ConvInteger", inputs=["x", "w", "x_zero_point"], outputs=["y"]
)
expect(
convinteger_node,
inputs=[x, w, x_zero_point],
outputs=[y],
name="test_convinteger_without_padding",
)
The convolution transpose operator consumes an input tensor and a filter, and computes the output.
If the pads parameter is provided, the shape of the output is calculated via the following equation:
output_shape[i] = stride[i] * (input_size[i] - 1) + output_padding[i] + ((kernel_shape[i] - 1) * dilations[i] + 1) - pads[start_i] - pads[end_i]
output_shape can also be explicitly specified, in which case pads values are auto-generated using these equations:
total_padding[i] = stride[i] * (input_size[i] - 1) + output_padding[i] + ((kernel_shape[i] - 1) * dilations[i] + 1) - output_shape[i]
If (auto_pad == SAME_UPPER): pads[start_i] = total_padding[i]/2; pads[end_i] = total_padding[i] - (total_padding[i]/2)
Else: pads[start_i] = total_padding[i] - (total_padding[i]/2); pads[end_i] = (total_padding[i]/2).
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#ConvTranspose-1">1</a>, <a href="Changelog.md#ConvTranspose-11">11</a>
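The shape and padding equations above can be evaluated per spatial axis with plain arithmetic. The sketch below is illustrative only: both function names are hypothetical, `/` in the equations is assumed to mean integer division, and only non-negative `total_padding` values are considered.

```python
def conv_transpose_output_size(input_size, stride, kernel, dilation=1,
                               output_padding=0, pad_start=0, pad_end=0):
    """Output size along one spatial axis when pads are given."""
    return (stride * (input_size - 1) + output_padding
            + ((kernel - 1) * dilation + 1) - pad_start - pad_end)


def pads_from_output_size(input_size, stride, kernel, output_size,
                          dilation=1, output_padding=0, auto_pad="NOTSET"):
    """(pad_start, pad_end) along one axis when output_shape is given."""
    total = (stride * (input_size - 1) + output_padding
             + ((kernel - 1) * dilation + 1) - output_size)
    half = total // 2  # assumption: "/" in the equations is integer division
    if auto_pad == "SAME_UPPER":
        return half, total - half
    return total - half, half


# 3x3 input, 3x3 kernel, strides=[3, 2], pads=[1, 2, 1, 2] gives a 7x3 output.
height = conv_transpose_output_size(3, stride=3, kernel=3, pad_start=1, pad_end=1)
width = conv_transpose_output_size(3, stride=2, kernel=3, pad_start=2, pad_end=2)
print(height, width)  # 7 3

# Size-3 axis, kernel 3, stride 2, requested output size 6 with SAME_UPPER.
print(pads_from_output_size(3, stride=2, kernel=3, output_size=6,
                            auto_pad="SAME_UPPER"))  # (0, 1)
```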
x = np.array(
[[[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]]] # (1, 1, 3, 3)
).astype(np.float32)
W = np.array(
[
[
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]], # (1, 2, 3, 3)
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
]
]
).astype(np.float32)
node = onnx.helper.make_node("ConvTranspose", ["X", "W"], ["Y"])
y = np.array(
[
[
[
[0.0, 1.0, 3.0, 3.0, 2.0], # (1, 2, 5, 5)
[3.0, 8.0, 15.0, 12.0, 7.0],
[9.0, 21.0, 36.0, 27.0, 15.0],
[9.0, 20.0, 33.0, 24.0, 13.0],
[6.0, 13.0, 21.0, 15.0, 8.0],
],
[
[0.0, 1.0, 3.0, 3.0, 2.0],
[3.0, 8.0, 15.0, 12.0, 7.0],
[9.0, 21.0, 36.0, 27.0, 15.0],
[9.0, 20.0, 33.0, 24.0, 13.0],
[6.0, 13.0, 21.0, 15.0, 8.0],
],
]
]
).astype(np.float32)
expect(node, inputs=[x, W], outputs=[y], name="test_convtranspose")
x = np.array([[[0.0, 1.0, 2.0]]]).astype(np.float32) # (1, 1, 3)
W = np.array([[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]]).astype( # (1, 2, 3)
np.float32
)
node = onnx.helper.make_node("ConvTranspose", ["X", "W"], ["Y"])
y = np.array(
[[[0.0, 1.0, 3.0, 3.0, 2.0], [0.0, 1.0, 3.0, 3.0, 2.0]]] # (1, 2, 5)
).astype(np.float32)
expect(node, inputs=[x, W], outputs=[y], name="test_convtranspose_1d")
x = np.array(
[
[
[
[
[0.0, 1.0, 2.0, 3.0, 4.0], # (1, 1, 3, 4, 5)
[5.0, 6.0, 7.0, 8.0, 9.0],
[10.0, 11.0, 12.0, 13.0, 14.0],
[15.0, 16.0, 17.0, 18.0, 19.0],
],
[
[20.0, 21.0, 22.0, 23.0, 24.0],
[25.0, 26.0, 27.0, 28.0, 29.0],
[30.0, 31.0, 32.0, 33.0, 34.0],
[35.0, 36.0, 37.0, 38.0, 39.0],
],
[
[40.0, 41.0, 42.0, 43.0, 44.0],
[45.0, 46.0, 47.0, 48.0, 49.0],
[50.0, 51.0, 52.0, 53.0, 54.0],
[55.0, 56.0, 57.0, 58.0, 59.0],
],
]
]
]
).astype(np.float32)
W = np.array(
[
[
[
[
[1.0, 1.0, 1.0], # (1, 2, 3, 3, 3)
[1.0, 1.0, 1.0],
[1.0, 1.0, 1.0],
],
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
],
[
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
],
]
]
).astype(np.float32)
node = onnx.helper.make_node("ConvTranspose", ["X", "W"], ["Y"])
y = np.array(
[
[
[
[
[0.0, 1.0, 3.0, 6.0, 9.0, 7.0, 4.0], # (1, 2, 5, 6, 7)
[5.0, 12.0, 21.0, 27.0, 33.0, 24.0, 13.0],
[15.0, 33.0, 54.0, 63.0, 72.0, 51.0, 27.0],
[30.0, 63.0, 99.0, 108.0, 117.0, 81.0, 42.0],
[25.0, 52.0, 81.0, 87.0, 93.0, 64.0, 33.0],
[15.0, 31.0, 48.0, 51.0, 54.0, 37.0, 19.0],
],
[
[20.0, 42.0, 66.0, 72.0, 78.0, 54.0, 28.0],
[50.0, 104.0, 162.0, 174.0, 186.0, 128.0, 66.0],
[90.0, 186.0, 288.0, 306.0, 324.0, 222.0, 114.0],
[120.0, 246.0, 378.0, 396.0, 414.0, 282.0, 144.0],
[90.0, 184.0, 282.0, 294.0, 306.0, 208.0, 106.0],
[50.0, 102.0, 156.0, 162.0, 168.0, 114.0, 58.0],
],
[
[60.0, 123.0, 189.0, 198.0, 207.0, 141.0, 72.0],
[135.0, 276.0, 423.0, 441.0, 459.0, 312.0, 159.0],
[225.0, 459.0, 702.0, 729.0, 756.0, 513.0, 261.0],
[270.0, 549.0, 837.0, 864.0, 891.0, 603.0, 306.0],
[195.0, 396.0, 603.0, 621.0, 639.0, 432.0, 219.0],
[105.0, 213.0, 324.0, 333.0, 342.0, 231.0, 117.0],
],
[
[60.0, 122.0, 186.0, 192.0, 198.0, 134.0, 68.0],
[130.0, 264.0, 402.0, 414.0, 426.0, 288.0, 146.0],
[210.0, 426.0, 648.0, 666.0, 684.0, 462.0, 234.0],
[240.0, 486.0, 738.0, 756.0, 774.0, 522.0, 264.0],
[170.0, 344.0, 522.0, 534.0, 546.0, 368.0, 186.0],
[90.0, 182.0, 276.0, 282.0, 288.0, 194.0, 98.0],
],
[
[40.0, 81.0, 123.0, 126.0, 129.0, 87.0, 44.0],
[85.0, 172.0, 261.0, 267.0, 273.0, 184.0, 93.0],
[135.0, 273.0, 414.0, 423.0, 432.0, 291.0, 147.0],
[150.0, 303.0, 459.0, 468.0, 477.0, 321.0, 162.0],
[105.0, 212.0, 321.0, 327.0, 333.0, 224.0, 113.0],
[55.0, 111.0, 168.0, 171.0, 174.0, 117.0, 59.0],
],
],
[
[
[0.0, 1.0, 3.0, 6.0, 9.0, 7.0, 4.0],
[5.0, 12.0, 21.0, 27.0, 33.0, 24.0, 13.0],
[15.0, 33.0, 54.0, 63.0, 72.0, 51.0, 27.0],
[30.0, 63.0, 99.0, 108.0, 117.0, 81.0, 42.0],
[25.0, 52.0, 81.0, 87.0, 93.0, 64.0, 33.0],
[15.0, 31.0, 48.0, 51.0, 54.0, 37.0, 19.0],
],
[
[20.0, 42.0, 66.0, 72.0, 78.0, 54.0, 28.0],
[50.0, 104.0, 162.0, 174.0, 186.0, 128.0, 66.0],
[90.0, 186.0, 288.0, 306.0, 324.0, 222.0, 114.0],
[120.0, 246.0, 378.0, 396.0, 414.0, 282.0, 144.0],
[90.0, 184.0, 282.0, 294.0, 306.0, 208.0, 106.0],
[50.0, 102.0, 156.0, 162.0, 168.0, 114.0, 58.0],
],
[
[60.0, 123.0, 189.0, 198.0, 207.0, 141.0, 72.0],
[135.0, 276.0, 423.0, 441.0, 459.0, 312.0, 159.0],
[225.0, 459.0, 702.0, 729.0, 756.0, 513.0, 261.0],
[270.0, 549.0, 837.0, 864.0, 891.0, 603.0, 306.0],
[195.0, 396.0, 603.0, 621.0, 639.0, 432.0, 219.0],
[105.0, 213.0, 324.0, 333.0, 342.0, 231.0, 117.0],
],
[
[60.0, 122.0, 186.0, 192.0, 198.0, 134.0, 68.0],
[130.0, 264.0, 402.0, 414.0, 426.0, 288.0, 146.0],
[210.0, 426.0, 648.0, 666.0, 684.0, 462.0, 234.0],
[240.0, 486.0, 738.0, 756.0, 774.0, 522.0, 264.0],
[170.0, 344.0, 522.0, 534.0, 546.0, 368.0, 186.0],
[90.0, 182.0, 276.0, 282.0, 288.0, 194.0, 98.0],
],
[
[40.0, 81.0, 123.0, 126.0, 129.0, 87.0, 44.0],
[85.0, 172.0, 261.0, 267.0, 273.0, 184.0, 93.0],
[135.0, 273.0, 414.0, 423.0, 432.0, 291.0, 147.0],
[150.0, 303.0, 459.0, 468.0, 477.0, 321.0, 162.0],
[105.0, 212.0, 321.0, 327.0, 333.0, 224.0, 113.0],
[55.0, 111.0, 168.0, 171.0, 174.0, 117.0, 59.0],
],
],
]
]
).astype(np.float32)
expect(node, inputs=[x, W], outputs=[y], name="test_convtranspose_3d")
x = np.array(
[[[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]]] # (1, 1, 3, 3)
).astype(np.float32)
W = np.array(
[
[
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]], # (1, 2, 3, 3)
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
]
]
).astype(np.float32)
y = np.array(
[
[
[
[0.0, 0.0, 1.0, 1.0, 3.0, 2.0, 2.0, 0.0], # (1, 2, 10, 8)
[0.0, 0.0, 1.0, 1.0, 3.0, 2.0, 2.0, 0.0],
[0.0, 0.0, 1.0, 1.0, 3.0, 2.0, 2.0, 0.0],
[3.0, 3.0, 7.0, 4.0, 9.0, 5.0, 5.0, 0.0],
[3.0, 3.0, 7.0, 4.0, 9.0, 5.0, 5.0, 0.0],
[3.0, 3.0, 7.0, 4.0, 9.0, 5.0, 5.0, 0.0],
[6.0, 6.0, 13.0, 7.0, 15.0, 8.0, 8.0, 0.0],
[6.0, 6.0, 13.0, 7.0, 15.0, 8.0, 8.0, 0.0],
[6.0, 6.0, 13.0, 7.0, 15.0, 8.0, 8.0, 0.0],
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
],
[
[0.0, 0.0, 1.0, 1.0, 3.0, 2.0, 2.0, 0.0],
[0.0, 0.0, 1.0, 1.0, 3.0, 2.0, 2.0, 0.0],
[0.0, 0.0, 1.0, 1.0, 3.0, 2.0, 2.0, 0.0],
[3.0, 3.0, 7.0, 4.0, 9.0, 5.0, 5.0, 0.0],
[3.0, 3.0, 7.0, 4.0, 9.0, 5.0, 5.0, 0.0],
[3.0, 3.0, 7.0, 4.0, 9.0, 5.0, 5.0, 0.0],
[6.0, 6.0, 13.0, 7.0, 15.0, 8.0, 8.0, 0.0],
[6.0, 6.0, 13.0, 7.0, 15.0, 8.0, 8.0, 0.0],
[6.0, 6.0, 13.0, 7.0, 15.0, 8.0, 8.0, 0.0],
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
],
]
]
).astype(np.float32)
node = onnx.helper.make_node(
"ConvTranspose", ["X", "W"], ["Y"], strides=[3, 2], output_shape=[10, 8]
)
expect(node, inputs=[x, W], outputs=[y], name="test_convtranspose_output_shape")
node = onnx.helper.make_node(
"ConvTranspose", ["X", "W"], ["Y"], strides=[3, 2], output_padding=[1, 1]
)
expect(node, inputs=[x, W], outputs=[y], name="test_convtranspose_pad")
node = onnx.helper.make_node(
"ConvTranspose",
["X", "W"],
["Y"],
name="test",
strides=[3, 2],
output_shape=[10, 8],
kernel_shape=[3, 3],
output_padding=[1, 1],
)
expect(node, inputs=[x, W], outputs=[y], name="test_convtranspose_kernel_shape")
x = np.array(
[[[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]]] # (1, 1, 3, 3)
).astype(np.float32)
W = np.array(
[
[
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]], # (1, 2, 3, 3)
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
]
]
).astype(np.float32)
node = onnx.helper.make_node(
"ConvTranspose", ["X", "W"], ["Y"], auto_pad="SAME_UPPER", strides=[2, 2]
)
y = np.array(
[
[
[
[0.0, 0.0, 1.0, 1.0, 3.0, 2.0],
[0.0, 0.0, 1.0, 1.0, 3.0, 2.0],
[3.0, 3.0, 8.0, 5.0, 12.0, 7.0],
[3.0, 3.0, 7.0, 4.0, 9.0, 5.0],
[9.0, 9.0, 20.0, 11.0, 24.0, 13.0],
[6.0, 6.0, 13.0, 7.0, 15.0, 8.0],
],
[
[0.0, 0.0, 1.0, 1.0, 3.0, 2.0],
[0.0, 0.0, 1.0, 1.0, 3.0, 2.0],
[3.0, 3.0, 8.0, 5.0, 12.0, 7.0],
[3.0, 3.0, 7.0, 4.0, 9.0, 5.0],
[9.0, 9.0, 20.0, 11.0, 24.0, 13.0],
[6.0, 6.0, 13.0, 7.0, 15.0, 8.0],
],
]
]
).astype(np.float32)
expect(node, inputs=[x, W], outputs=[y], name="test_convtranspose_autopad_same")
x = np.array(
[[[[3.0, 8.0, 1.0], [9.0, 5.0, 7.0], [3.0, 2.0, 6.0]]]] # (1, 1, 3, 3)
).astype(np.float32)
W = np.array([[[[7.0, 2.0], [1.0, 9.0]]]]).astype(np.float32) # (1, 1, 2, 2)
node = onnx.helper.make_node(
"ConvTranspose", ["X", "W"], ["Y"], dilations=[2, 2]
)
y = np.array(
[
[
[
[21.0, 56.0, 13.0, 16.0, 2.0], # (1, 1, 5, 5)
[63.0, 35.0, 67.0, 10.0, 14.0],
[24.0, 22.0, 76.0, 76.0, 21.0],
[9.0, 5.0, 88.0, 45.0, 63.0],
[3.0, 2.0, 33.0, 18.0, 54.0],
]
]
]
).astype(np.float32)
expect(node, inputs=[x, W], outputs=[y], name="test_convtranspose_dilations")
x = np.array(
[
[
[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]],
[[9.0, 10.0, 11.0], [12.0, 13.0, 14.0], [15.0, 16.0, 17.0]],
]
]
).astype(np.float32)
W = np.array(
[
[
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
],
[
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
],
]
).astype(np.float32)
node = onnx.helper.make_node("ConvTranspose", ["X", "W"], ["Y"], group=2)
y = np.array(
[
[
[
[0.0, 1.0, 3.0, 3.0, 2.0],
[3.0, 8.0, 15.0, 12.0, 7.0],
[9.0, 21.0, 36.0, 27.0, 15.0],
[9.0, 20.0, 33.0, 24.0, 13.0],
[6.0, 13.0, 21.0, 15.0, 8.0],
],
[
[9.0, 19.0, 30.0, 21.0, 11.0],
[21.0, 44.0, 69.0, 48.0, 25.0],
[36.0, 75.0, 117.0, 81.0, 42.0],
[27.0, 56.0, 87.0, 60.0, 31.0],
[15.0, 31.0, 48.0, 33.0, 17.0],
],
]
]
).astype(np.float32)
expect(node, inputs=[x, W], outputs=[y], name="test_convtranspose_group_2")
x = np.array(
[
[
[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]],
[[9.0, 10.0, 11.0], [12.0, 13.0, 14.0], [15.0, 16.0, 17.0]],
],
[
[[18.0, 19.0, 20.0], [21.0, 22.0, 23.0], [24.0, 25.0, 26.0]],
[[9.0, 10.0, 11.0], [12.0, 13.0, 14.0], [15.0, 16.0, 17.0]],
],
[
[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]],
[[9.0, 10.0, 11.0], [12.0, 13.0, 14.0], [15.0, 16.0, 17.0]],
],
]
).astype(np.float32)
W = np.array(
[
[
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
],
[
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
],
]
).astype(np.float32)
node = onnx.helper.make_node("ConvTranspose", ["X", "W"], ["Y"], group=2)
y = np.array(
[
[
[
[0.0, 1.0, 3.0, 3.0, 2.0],
[3.0, 8.0, 15.0, 12.0, 7.0],
[9.0, 21.0, 36.0, 27.0, 15.0],
[9.0, 20.0, 33.0, 24.0, 13.0],
[6.0, 13.0, 21.0, 15.0, 8.0],
],
[
[9.0, 19.0, 30.0, 21.0, 11.0],
[21.0, 44.0, 69.0, 48.0, 25.0],
[36.0, 75.0, 117.0, 81.0, 42.0],
[27.0, 56.0, 87.0, 60.0, 31.0],
[15.0, 31.0, 48.0, 33.0, 17.0],
],
],
[
[
[18.0, 37.0, 57.0, 39.0, 20.0],
[39.0, 80.0, 123.0, 84.0, 43.0],
[63.0, 129.0, 198.0, 135.0, 69.0],
[45.0, 92.0, 141.0, 96.0, 49.0],
[24.0, 49.0, 75.0, 51.0, 26.0],
],
[
[9.0, 19.0, 30.0, 21.0, 11.0],
[21.0, 44.0, 69.0, 48.0, 25.0],
[36.0, 75.0, 117.0, 81.0, 42.0],
[27.0, 56.0, 87.0, 60.0, 31.0],
[15.0, 31.0, 48.0, 33.0, 17.0],
],
],
[
[
[0.0, 1.0, 3.0, 3.0, 2.0],
[3.0, 8.0, 15.0, 12.0, 7.0],
[9.0, 21.0, 36.0, 27.0, 15.0],
[9.0, 20.0, 33.0, 24.0, 13.0],
[6.0, 13.0, 21.0, 15.0, 8.0],
],
[
[9.0, 19.0, 30.0, 21.0, 11.0],
[21.0, 44.0, 69.0, 48.0, 25.0],
[36.0, 75.0, 117.0, 81.0, 42.0],
[27.0, 56.0, 87.0, 60.0, 31.0],
[15.0, 31.0, 48.0, 33.0, 17.0],
],
],
]
).astype(np.float32)
expect(
node, inputs=[x, W], outputs=[y], name="test_convtranspose_group_2_image_3"
)
x = np.array(
[[[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]]] # (1, 1, 3, 3)
).astype(np.float32)
W = np.array(
[
[
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]], # (1, 2, 3, 3)
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
]
]
).astype(np.float32)
node = onnx.helper.make_node(
"ConvTranspose", ["X", "W"], ["Y"], strides=[3, 2], pads=[1, 2, 1, 2]
)
y = np.array(
[
[
[
[1.0, 1.0, 3.0], # (1, 2, 7, 3)
[1.0, 1.0, 3.0],
[7.0, 4.0, 9.0],
[7.0, 4.0, 9.0],
[7.0, 4.0, 9.0],
[13.0, 7.0, 15.0],
[13.0, 7.0, 15.0],
],
[
[1.0, 1.0, 3.0],
[1.0, 1.0, 3.0],
[7.0, 4.0, 9.0],
[7.0, 4.0, 9.0],
[7.0, 4.0, 9.0],
[13.0, 7.0, 15.0],
[13.0, 7.0, 15.0],
],
]
]
).astype(np.float32)
expect(node, inputs=[x, W], outputs=[y], name="test_convtranspose_pads")
Calculates the cosine of the given input tensor, element-wise.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Cos-7">7</a>
node = onnx.helper.make_node(
"Cos",
inputs=["x"],
outputs=["y"],
)
x = np.array([-1, 0, 1]).astype(np.float32)
y = np.cos(x)
expect(node, inputs=[x], outputs=[y], name="test_cos_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.cos(x)
expect(node, inputs=[x], outputs=[y], name="test_cos")
Calculates the hyperbolic cosine of the given input tensor, element-wise.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Cosh-9">9</a>
node = onnx.helper.make_node(
"Cosh",
inputs=["x"],
outputs=["y"],
)
x = np.array([-1, 0, 1]).astype(np.float32)
y = np.cosh(x) # expected output [1.54308069, 1., 1.54308069]
expect(node, inputs=[x], outputs=[y], name="test_cosh_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.cosh(x)
expect(node, inputs=[x], outputs=[y], name="test_cosh")
Performs the cumulative product of the input elements along the given axis.
By default, the product is inclusive, meaning the first element is copied as is.
Setting the `exclusive` attribute to 1 excludes the current element from each product, so the first output element is 1 (the empty product).
The product can also be taken in the opposite direction along the axis by setting the `reverse` attribute to 1.
Example:
input_x = [1, 2, 3]
axis=0
output = [1, 2, 6]
exclusive=1
output = [1, 1, 2]
exclusive=0
reverse=1
output = [6, 6, 3]
exclusive=1
reverse=1
output = [6, 3, 1]
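The four attribute combinations above can be reproduced with a short NumPy sketch. The helper name `cumprod_reference` is hypothetical and is not part of ONNX; it assumes the input has at least one dimension, and it builds the exclusive result by shifting the inclusive one by a single position.

```python
import numpy as np


def cumprod_reference(x, axis=0, exclusive=0, reverse=0):
    """Illustrative cumulative product; hypothetical helper, not the ONNX implementation."""
    x = np.asarray(x)
    if reverse:
        x = np.flip(x, axis=axis)
    y = np.cumprod(x, axis=axis, dtype=x.dtype)
    if exclusive:
        # Shift the inclusive result by one position along the axis and
        # start from 1, the empty product.
        y = np.roll(y, 1, axis=axis)
        first = [slice(None)] * y.ndim
        first[axis] = 0
        y[tuple(first)] = 1
    if reverse:
        y = np.flip(y, axis=axis)
    return y


x = np.array([1, 2, 3])
print(cumprod_reference(x))                          # [1 2 6]
print(cumprod_reference(x, exclusive=1))             # [1 1 2]
print(cumprod_reference(x, reverse=1))               # [6 6 3]
print(cumprod_reference(x, exclusive=1, reverse=1))  # [6 3 1]
```

Flipping before and after the scan keeps the reverse case a thin wrapper around the forward one, so a negative `axis` works without extra handling.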
This version of the operator has been available since version 26 of the default ONNX operator set.
node = onnx.helper.make_node("CumProd", inputs=["x", "axis"], outputs=["y"])
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0]).astype(np.float64)
axis = np.array(0, dtype=np.int32)
y = np.array([1.0, 2.0, 6.0, 24.0, 120.0]).astype(np.float64)
expect(node, inputs=[x, axis], outputs=[y], name="test_cumprod_1d")
node = onnx.helper.make_node(
"CumProd", inputs=["x", "axis"], outputs=["y"], exclusive=1
)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0]).astype(np.float64)
axis = np.array(0, dtype=np.int32)
y = np.array([1.0, 1.0, 2.0, 6.0, 24.0]).astype(np.float64)
expect(node, inputs=[x, axis], outputs=[y], name="test_cumprod_1d_exclusive")
node = onnx.helper.make_node(
"CumProd", inputs=["x", "axis"], outputs=["y"], exclusive=1
)
x = np.array([1, 2, 3, 4, 5]).astype(np.int32)
axis = np.array(0, dtype=np.int32)
y = np.array([1, 1, 2, 6, 24]).astype(np.int32)
expect(
node, inputs=[x, axis], outputs=[y], name="test_cumprod_1d_int32_exclusive"
)
node = onnx.helper.make_node(
"CumProd", inputs=["x", "axis"], outputs=["y"], reverse=1
)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0]).astype(np.float64)
axis = np.array(0, dtype=np.int32)
y = np.array([120.0, 120.0, 60.0, 20.0, 5.0]).astype(np.float64)
expect(node, inputs=[x, axis], outputs=[y], name="test_cumprod_1d_reverse")
node = onnx.helper.make_node(
"CumProd", inputs=["x", "axis"], outputs=["y"], reverse=1, exclusive=1
)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0]).astype(np.float64)
axis = np.array(0, dtype=np.int32)
y = np.array([120.0, 60.0, 20.0, 5.0, 1.0]).astype(np.float64)
expect(
node,
inputs=[x, axis],
outputs=[y],
name="test_cumprod_1d_reverse_exclusive",
)
node = onnx.helper.make_node(
"CumProd",
inputs=["x", "axis"],
outputs=["y"],
)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).astype(np.float64).reshape((2, 3))
axis = np.array(0, dtype=np.int32)
y = (
np.array([1.0, 2.0, 3.0, 4.0, 10.0, 18.0])
.astype(np.float64)
.reshape((2, 3))
)
expect(node, inputs=[x, axis], outputs=[y], name="test_cumprod_2d_axis_0")
node = onnx.helper.make_node(
"CumProd",
inputs=["x", "axis"],
outputs=["y"],
)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).astype(np.float64).reshape((2, 3))
axis = np.array(1, dtype=np.int32)
y = (
np.array([1.0, 2.0, 6.0, 4.0, 20.0, 120.0])
.astype(np.float64)
.reshape((2, 3))
)
expect(node, inputs=[x, axis], outputs=[y], name="test_cumprod_2d_axis_1")
node = onnx.helper.make_node(
"CumProd",
inputs=["x", "axis"],
outputs=["y"],
)
x = np.array([1, 2, 3, 4, 5, 6]).astype(np.int32).reshape((2, 3))
axis = np.array(0, dtype=np.int32)
y = np.array([1, 2, 3, 4, 10, 18]).astype(np.int32).reshape((2, 3))
expect(node, inputs=[x, axis], outputs=[y], name="test_cumprod_2d_int32")
node = onnx.helper.make_node(
"CumProd",
inputs=["x", "axis"],
outputs=["y"],
)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).astype(np.float64).reshape((2, 3))
axis = np.array(-1, dtype=np.int32)
y = (
np.array([1.0, 2.0, 6.0, 4.0, 20.0, 120.0])
.astype(np.float64)
.reshape((2, 3))
)
expect(
node, inputs=[x, axis], outputs=[y], name="test_cumprod_2d_negative_axis"
)
Performs the cumulative sum of the input elements along the given axis.
By default, the sum is inclusive, meaning the first element is copied as is.
Setting the `exclusive` attribute to 1 excludes the current element from each sum, so the first output element is 0 (the empty sum).
The summation can also be performed in the opposite direction along the axis by setting the `reverse` attribute to 1.
Example:
input_x = [1, 2, 3]
axis=0
output = [1, 3, 6]
exclusive=1
output = [0, 1, 3]
exclusive=0
reverse=1
output = [6, 5, 3]
exclusive=1
reverse=1
output = [5, 3, 0]
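As a rough illustration of the example above, the exclusive sum can be obtained from the inclusive one by subtracting each element from its own running total. The helper `cumsum_reference` below is a hypothetical name, not an ONNX API, and the subtraction shortcut assumes finite inputs (it would turn infinities into NaN).

```python
import numpy as np


def cumsum_reference(x, axis=0, exclusive=0, reverse=0):
    """Illustrative cumulative sum; hypothetical helper, not the ONNX implementation."""
    x = np.asarray(x)
    if reverse:
        x = np.flip(x, axis=axis)
    y = np.cumsum(x, axis=axis, dtype=x.dtype)
    if exclusive:
        # Exclusive sum = inclusive sum minus the element itself.
        y = y - x
    if reverse:
        y = np.flip(y, axis=axis)
    return y


x = np.array([1, 2, 3])
print(cumsum_reference(x))                          # [1 3 6]
print(cumsum_reference(x, exclusive=1))             # [0 1 3]
print(cumsum_reference(x, reverse=1))               # [6 5 3]
print(cumsum_reference(x, exclusive=1, reverse=1))  # [5 3 0]
```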
This version of the operator has been available since version 14 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#CumSum-11">11</a>
node = onnx.helper.make_node("CumSum", inputs=["x", "axis"], outputs=["y"])
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0]).astype(np.float64)
axis = np.int32(0)
y = np.array([1.0, 3.0, 6.0, 10.0, 15.0]).astype(np.float64)
expect(node, inputs=[x, axis], outputs=[y], name="test_cumsum_1d")
node = onnx.helper.make_node(
"CumSum", inputs=["x", "axis"], outputs=["y"], exclusive=1
)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0]).astype(np.float64)
axis = np.int32(0)
y = np.array([0.0, 1.0, 3.0, 6.0, 10.0]).astype(np.float64)
expect(node, inputs=[x, axis], outputs=[y], name="test_cumsum_1d_exclusive")
node = onnx.helper.make_node(
"CumSum", inputs=["x", "axis"], outputs=["y"], exclusive=1
)
x = np.array([1, 2, 3, 4, 5]).astype(np.int32)
axis = np.int32(0)
y = np.array([0, 1, 3, 6, 10]).astype(np.int32)
expect(
node, inputs=[x, axis], outputs=[y], name="test_cumsum_1d_int32_exclusive"
)
node = onnx.helper.make_node(
"CumSum", inputs=["x", "axis"], outputs=["y"], reverse=1
)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0]).astype(np.float64)
axis = np.int32(0)
y = np.array([15.0, 14.0, 12.0, 9.0, 5.0]).astype(np.float64)
expect(node, inputs=[x, axis], outputs=[y], name="test_cumsum_1d_reverse")
node = onnx.helper.make_node(
"CumSum", inputs=["x", "axis"], outputs=["y"], reverse=1, exclusive=1
)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0]).astype(np.float64)
axis = np.int32(0)
y = np.array([14.0, 12.0, 9.0, 5.0, 0.0]).astype(np.float64)
expect(
node, inputs=[x, axis], outputs=[y], name="test_cumsum_1d_reverse_exclusive"
)
node = onnx.helper.make_node(
"CumSum",
inputs=["x", "axis"],
outputs=["y"],
)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).astype(np.float64).reshape((2, 3))
axis = np.int32(0)
y = np.array([1.0, 2.0, 3.0, 5.0, 7.0, 9.0]).astype(np.float64).reshape((2, 3))
expect(node, inputs=[x, axis], outputs=[y], name="test_cumsum_2d_axis_0")
node = onnx.helper.make_node(
"CumSum",
inputs=["x", "axis"],
outputs=["y"],
)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).astype(np.float64).reshape((2, 3))
axis = np.int32(1)
y = np.array([1.0, 3.0, 6.0, 4.0, 9.0, 15.0]).astype(np.float64).reshape((2, 3))
expect(node, inputs=[x, axis], outputs=[y], name="test_cumsum_2d_axis_1")
node = onnx.helper.make_node(
"CumSum",
inputs=["x", "axis"],
outputs=["y"],
)
x = np.array([1, 2, 3, 4, 5, 6]).astype(np.int32).reshape((2, 3))
axis = np.int32(0)
y = np.array([1, 2, 3, 5, 7, 9]).astype(np.int32).reshape((2, 3))
expect(node, inputs=[x, axis], outputs=[y], name="test_cumsum_2d_int32")
node = onnx.helper.make_node(
"CumSum",
inputs=["x", "axis"],
outputs=["y"],
)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).astype(np.float64).reshape((2, 3))
axis = np.int32(-1)
y = np.array([1.0, 3.0, 6.0, 4.0, 9.0, 15.0]).astype(np.float64).reshape((2, 3))
expect(node, inputs=[x, axis], outputs=[y], name="test_cumsum_2d_negative_axis")
Computes the discrete Fourier Transform (DFT) of the input.
Assuming the input has shape [M, N], where N is the dimension over which the
DFT is computed and M denotes the conceptual "all other dimensions,"
the DFT y[m, k] of shape [M, N] is defined as
$$y[m, k] = \sum_{n=0}^{N-1} e^{-2 \pi j \frac{k n}{N} } x[m, n] ,$$
and the inverse transform is defined as
$$x[m, n] = \frac{1}{N} \sum_{k=0}^{N-1} e^{2 \pi j \frac{k n}{N} } y[m, k] ,$$
where $j$ is the imaginary unit.
The actual shape of the output is specified in the "output" section.
Reference: https://docs.scipy.org/doc/scipy/tutorial/fft.html
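The two formulas above can be checked numerically with a direct, O(N^2) evaluation. This is only a sketch: the function name `naive_dft` is hypothetical, it works on a complex-valued [M, N] array rather than on the real/imaginary layout the operator uses for its tensors, and it is compared against `np.fft.fft` purely as a sanity check.

```python
import numpy as np


def naive_dft(x, inverse=False):
    """Evaluate the DFT definition along the last axis of an [M, N] array."""
    x = np.asarray(x, dtype=np.complex128)
    n_points = x.shape[-1]
    n = np.arange(n_points)
    k = n.reshape(-1, 1)
    sign = 1.0 if inverse else -1.0
    # basis[k, n] = exp(sign * 2*pi*j * k*n / N)
    basis = np.exp(sign * 2j * np.pi * k * n / n_points)
    # y[m, k] = sum_n basis[k, n] * x[m, n]
    y = x @ basis.T
    return y / n_points if inverse else y


x = np.arange(12.0).reshape(3, 4)
y = naive_dft(x)
print(np.allclose(y, np.fft.fft(x, axis=-1)))     # True
print(np.allclose(naive_dft(y, inverse=True), x))  # True
```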
This version of the operator has been available since version 20 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#DFT-17">17</a>
node = onnx.helper.make_node("DFT", inputs=["x", "", "axis"], outputs=["y"])
x = np.arange(0, 100).reshape(10, 10).astype(np.float32)
axis = np.array(1, dtype=np.int64)
y = np.fft.fft(x, axis=0)
x = x.reshape(1, 10, 10, 1)
y = np.stack((y.real, y.imag), axis=2).astype(np.float32).reshape(1, 10, 10, 2)
expect(node, inputs=[x, axis], outputs=[y], name="test_dft")
node = onnx.helper.make_node("DFT", inputs=["x", "", "axis"], outputs=["y"])
x = np.arange(0, 100).reshape(10, 10).astype(np.float32)
axis = np.array(2, dtype=np.int64)
y = np.fft.fft(x, axis=1)
x = x.reshape(1, 10, 10, 1)
y = np.stack((y.real, y.imag), axis=2).astype(np.float32).reshape(1, 10, 10, 2)
expect(node, inputs=[x, axis], outputs=[y], name="test_dft_axis")
node = onnx.helper.make_node(
"DFT", inputs=["x", "", "axis"], outputs=["y"], inverse=1
)
x = np.arange(0, 100, dtype=np.complex64).reshape(10, 10)
axis = np.array(1, dtype=np.int64)
y = np.fft.ifft(x, axis=0)
x = np.stack((x.real, x.imag), axis=2).astype(np.float32).reshape(1, 10, 10, 2)
y = np.stack((y.real, y.imag), axis=2).astype(np.float32).reshape(1, 10, 10, 2)
expect(node, inputs=[x, axis], outputs=[y], name="test_dft_inverse")
# Test RFFT (Real FFT): real input -> one-sided complex output
node = onnx.helper.make_node(
"DFT", inputs=["x", "", "axis"], outputs=["y"], onesided=1
)
x = np.arange(0, 100).reshape(10, 10).astype(np.float32)
axis = np.array(1, dtype=np.int64)
y = np.fft.rfft(x, axis=0)
x = x.reshape(1, 10, 10, 1)
y = np.stack((y.real, y.imag), axis=2).astype(np.float32).reshape(1, 6, 10, 2)
expect(node, inputs=[x, axis], outputs=[y], name="test_dft_rfft")
# Test IRFFT (Inverse Real FFT): one-sided complex input -> real output
node = onnx.helper.make_node(
"DFT", inputs=["x", "", "axis"], outputs=["y"], onesided=1, inverse=1
)
# Create one-sided complex input (6 bins for signal length 10)
x = np.fft.rfft(np.arange(0, 100).reshape(10, 10), axis=0).astype(np.complex64)
axis = np.array(1, dtype=np.int64)
y = np.fft.irfft(x, n=10, axis=0)
x = np.stack((x.real, x.imag), axis=2).astype(np.float32).reshape(1, 6, 10, 2)
y = y.reshape(1, 10, 10, 1).astype(np.float32)
expect(node, inputs=[x, axis], outputs=[y], name="test_dft_irfft")
node = onnx.helper.make_node("DFT", inputs=["x"], outputs=["y"], axis=1)
x = np.arange(0, 100).reshape(10, 10).astype(np.float32)
y = np.fft.fft(x, axis=0)
x = x.reshape(1, 10, 10, 1)
y = np.stack((y.real, y.imag), axis=2).astype(np.float32).reshape(1, 10, 10, 2)
expect(
node,
inputs=[x],
outputs=[y],
name="test_dft_opset19",
opset_imports=[onnx.helper.make_opsetid("", 19)],
)
node = onnx.helper.make_node("DFT", inputs=["x"], outputs=["y"], axis=2)
x = np.arange(0, 100).reshape(10, 10).astype(np.float32)
y = np.fft.fft(x, axis=1)
x = x.reshape(1, 10, 10, 1)
y = np.stack((y.real, y.imag), axis=2).astype(np.float32).reshape(1, 10, 10, 2)
expect(
node,
inputs=[x],
outputs=[y],
name="test_dft_axis_opset19",
opset_imports=[onnx.helper.make_opsetid("", 19)],
)
node = onnx.helper.make_node(
"DFT", inputs=["x"], outputs=["y"], inverse=1, axis=1
)
x = np.arange(0, 100, dtype=np.complex64).reshape(
10,
10,
)
y = np.fft.ifft(x, axis=0)
x = np.stack((x.real, x.imag), axis=2).astype(np.float32).reshape(1, 10, 10, 2)
y = np.stack((y.real, y.imag), axis=2).astype(np.float32).reshape(1, 10, 10, 2)
expect(
node,
inputs=[x],
outputs=[y],
name="test_dft_inverse_opset19",
opset_imports=[onnx.helper.make_opsetid("", 19)],
)
# Test RFFT (Real FFT): real input -> one-sided complex output
node = onnx.helper.make_node(
"DFT", inputs=["x"], outputs=["y"], onesided=1, axis=1
)
x = np.arange(0, 100).reshape(10, 10).astype(np.float32)
y = np.fft.rfft(x, axis=0)
x = x.reshape(1, 10, 10, 1)
y = np.stack((y.real, y.imag), axis=2).astype(np.float32).reshape(1, 6, 10, 2)
expect(
node,
inputs=[x],
outputs=[y],
name="test_dft_rfft_opset19",
opset_imports=[onnx.helper.make_opsetid("", 19)],
)
# Test IRFFT (Inverse Real FFT): one-sided complex input -> real output
node = onnx.helper.make_node(
"DFT", inputs=["x"], outputs=["y"], onesided=1, inverse=1, axis=1
)
# Create one-sided complex input (6 bins for signal length 10)
x = np.fft.rfft(np.arange(0, 100).reshape(10, 10), axis=0).astype(np.complex64)
y = np.fft.irfft(x, n=10, axis=0)
x = np.stack((x.real, x.imag), axis=2).astype(np.float32).reshape(1, 6, 10, 2)
y = y.reshape(1, 10, 10, 1).astype(np.float32)
expect(
node,
inputs=[x],
outputs=[y],
name="test_dft_irfft_opset19",
opset_imports=[onnx.helper.make_opsetid("", 19)],
)
Performs deformable convolution as described in https://arxiv.org/abs/1703.06211 and https://arxiv.org/abs/1811.11168. This operator specification supports the general N-D case. Note that most common use cases have 2D or 3D data.
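To make the role of the offsets concrete, here is a minimal sketch of the 2-D, single-channel case with stride 1, no padding, no mask, and no bias. The helper names are hypothetical. The offset layout (channel `2 * k` holds the h-offset and `2 * k + 1` the w-offset of kernel element `k`) and the treatment of out-of-range samples as zero are assumptions inferred from the comments in this operator's examples, not a statement of the reference implementation.

```python
import numpy as np


def bilinear_sample(img, h, w):
    """Bilinearly interpolate img at fractional (h, w); points outside count as 0."""
    h0, w0 = int(np.floor(h)), int(np.floor(w))
    value = 0.0
    for hi, wi in ((h0, w0), (h0, w0 + 1), (h0 + 1, w0), (h0 + 1, w0 + 1)):
        if 0 <= hi < img.shape[0] and 0 <= wi < img.shape[1]:
            weight = (1 - abs(h - hi)) * (1 - abs(w - wi))
            value += weight * img[hi, wi]
    return value


def deform_conv2d_single(x, weight, offset):
    """x: (H, W), weight: (kH, kW), offset: (2*kH*kW, oH, oW)."""
    kh, kw = weight.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    y = np.zeros((oh, ow), dtype=np.float64)
    for i in range(oh):
        for j in range(ow):
            for p in range(kh):
                for q in range(kw):
                    idx = 2 * (p * kw + q)
                    # Regular sampling position plus the learned offset.
                    h = i + p + offset[idx, i, j]
                    w = j + q + offset[idx + 1, i, j]
                    y[i, j] += weight[p, q] * bilinear_sample(x, h, w)
    return y


x = np.arange(9, dtype=np.float64).reshape(3, 3)
weight = np.ones((2, 2))
offset = np.zeros((8, 2, 2))
offset[0, 0, 0] = 0.5   # h-offset of kernel element [0, 0] at output [0, 0]
offset[5, 0, 1] = -0.1  # w-offset of kernel element [1, 0] at output [0, 1]
print(deform_conv2d_single(x, weight, offset))
```

With all offsets at zero the loop reduces to an ordinary convolution; the two non-zero offsets shift one sampling point each, which is why only the first row of the result changes.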
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#DeformConv-19">19</a>
X = np.arange(9).astype(np.float32)
X.shape = (1, 1, 3, 3)
W = np.ones((1, 1, 2, 2), dtype=np.float32)
# Convolution with padding
offset_with_padding = np.zeros((1, 8, 4, 4), dtype=np.float32)
# h-coord of [0, 0] element of kernel, at output position [0, 0]
offset_with_padding[0, 0, 0, 0] = 0.5
# w-coord of [1, 0] element of kernel, at output position [1, 2]
offset_with_padding[0, 5, 1, 2] = -0.1
node_with_padding = onnx.helper.make_node(
"DeformConv",
inputs=["X", "W", "offset_with_padding"],
outputs=["Y_with_padding"],
kernel_shape=[2, 2],
pads=[1, 1, 1, 1],
)
Y_with_padding = np.array(
[
[
[
[0.0, 1.0, 3.0, 2.0], # (1, 1, 4, 4) output tensor
[3.0, 8.0, 11.9, 7.0],
[9.0, 20.0, 24.0, 13.0],
[6.0, 13.0, 15.0, 8.0],
]
]
]
).astype(np.float32)
expect(
node_with_padding,
inputs=[X, W, offset_with_padding],
outputs=[Y_with_padding],
name="test_basic_deform_conv_with_padding",
)
# Convolution without padding
offset_without_padding = np.zeros((1, 8, 2, 2), dtype=np.float32)
# h-coord of [0, 0] element of kernel, at output position [0, 0]
offset_without_padding[0, 0, 0, 0] = 0.5
# w-coord of [1, 0] element of kernel, at output position [0, 1]
offset_without_padding[0, 5, 0, 1] = -0.1
node_without_padding = onnx.helper.make_node(
"DeformConv",
inputs=["X", "W", "offset_without_padding"],
outputs=["Y_without_padding"],
kernel_shape=[2, 2],
pads=[0, 0, 0, 0],
)
Y_without_padding = np.array(
[
[
[
[9.5, 11.9], # (1, 1, 2, 2) output tensor
[20.0, 24.0],
]
]
]
).astype(np.float32)
expect(
node_without_padding,
inputs=[X, W, offset_without_padding],
outputs=[Y_without_padding],
name="test_basic_deform_conv_without_padding",
)
X = np.arange(9).astype(np.float32)
X.shape = (1, 1, 3, 3)
W = np.ones((1, 1, 2, 2), dtype=np.float32)
B = np.ones((1,), dtype=np.float32)
offset = np.zeros((1, 8, 2, 2), dtype=np.float32)
# h-coord of [0, 0] element of kernel, at output position [0, 0]
offset[0, 0, 0, 0] = 0.5
# w-coord of [1, 0] element of kernel, at output position [0, 1]
offset[0, 5, 0, 1] = -0.1
mask = np.ones((1, 4, 2, 2), dtype=np.float32)
mask[0, 2, 1, 1] = 0.2 # [1, 0] element of kernel at output position [1, 1]
node = onnx.helper.make_node(
"DeformConv",
inputs=["X", "W", "offset", "B", "mask"],
outputs=["Y"],
kernel_shape=[2, 2],
pads=[0, 0, 0, 0],
)
Y = np.array(
[
[
[
[10.5, 12.9], # (1, 1, 2, 2) output tensor
[21.0, 19.4],
]
]
]
).astype(np.float32)
expect(
node,
inputs=[X, W, offset, B, mask],
outputs=[Y],
name="test_deform_conv_with_mask_bias",
)
X = np.zeros((1, 2, 3, 3), dtype=np.float32)
X[0, 0] = np.reshape(np.arange(9).astype(np.float32), (3, 3))
X[0, 1] = np.reshape(np.arange(8, -1, -1).astype(np.float32), (3, 3))
X.shape = (1, 2, 3, 3)
W = np.ones((1, 2, 2, 2), dtype=np.float32)
offset = np.zeros((1, 16, 2, 2), dtype=np.float32)
# h-coord of [0, 0] element of kernel in channel 0, at output position [0, 0]
offset[0, 0, 0, 0] = 0.5
# w-coord of [1, 0] element of kernel in channel 1, at output position [0, 1]
offset[0, 13, 0, 1] = -0.1
node = onnx.helper.make_node(
"DeformConv",
inputs=["X", "W", "offset"],
outputs=["Y"],
kernel_shape=[2, 2],
pads=[0, 0, 0, 0],
offset_group=2,
)
Y = np.array(
[
[
[
[33.5, 32.1], # (1, 1, 2, 2) output tensor
[32.0, 32.0],
]
]
]
).astype(np.float32)
expect(
node,
inputs=[X, W, offset],
outputs=[Y],
name="test_deform_conv_with_multiple_offset_groups",
)
DepthToSpace rearranges (permutes) data from depth into blocks of spatial data.
This is the reverse transformation of SpaceToDepth. More specifically, this op outputs a copy of
the input tensor where values from the depth dimension are moved in spatial blocks to the height
and width dimensions. By default, mode = DCR.
In the DCR mode, elements along the depth dimension from the input tensor are rearranged in the
following order: depth, column, and then row. The output y is computed from the input x as below:
b, c, h, w = x.shape
tmp = np.reshape(x, [b, blocksize, blocksize, c // (blocksize**2), h, w])
tmp = np.transpose(tmp, [0, 3, 4, 1, 5, 2])
y = np.reshape(tmp, [b, c // (blocksize**2), h * blocksize, w * blocksize])
In the CRD mode, elements along the depth dimension from the input tensor are rearranged in the following order: column, row, and then depth. The output y is computed from the input x as below:
b, c, h, w = x.shape
tmp = np.reshape(x, [b, c // (blocksize ** 2), blocksize, blocksize, h, w])
tmp = np.transpose(tmp, [0, 1, 4, 2, 5, 3])
y = np.reshape(tmp, [b, c // (blocksize ** 2), h * blocksize, w * blocksize])
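The two snippets above can be wrapped into one runnable function for experimentation. The function name `depth_to_space` and the tiny 8-channel input are illustrative choices, not part of the specification.

```python
import numpy as np


def depth_to_space(x, blocksize, mode="DCR"):
    """Illustrative wrapper around the DCR and CRD reshapes shown above."""
    b, c, h, w = x.shape
    c_out = c // (blocksize**2)
    if mode == "DCR":
        tmp = np.reshape(x, [b, blocksize, blocksize, c_out, h, w])
        tmp = np.transpose(tmp, [0, 3, 4, 1, 5, 2])
    elif mode == "CRD":
        tmp = np.reshape(x, [b, c_out, blocksize, blocksize, h, w])
        tmp = np.transpose(tmp, [0, 1, 4, 2, 5, 3])
    else:
        raise ValueError(f"unknown mode: {mode}")
    return np.reshape(tmp, [b, c_out, h * blocksize, w * blocksize])


# Eight channels holding the values 0..7, one pixel each.
x = np.arange(8).reshape(1, 8, 1, 1)
print(depth_to_space(x, 2, "DCR")[0].tolist())  # [[[0, 2], [4, 6]], [[1, 3], [5, 7]]]
print(depth_to_space(x, 2, "CRD")[0].tolist())  # [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]
```

With one pixel per channel the output shows directly which input channel lands in each position of a block: DCR interleaves the output channels across the depth dimension, while CRD keeps each output channel's block contiguous.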
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#DepthToSpace-1">1</a>, <a href="Changelog.md#DepthToSpace-11">11</a>
node = onnx.helper.make_node(
"DepthToSpace", inputs=["x"], outputs=["y"], blocksize=2, mode="CRD"
)
# (1, 8, 2, 3) input tensor
x = np.array(
[
[
[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]],
[[9.0, 10.0, 11.0], [12.0, 13.0, 14.0]],
[[18.0, 19.0, 20.0], [21.0, 22.0, 23.0]],
[[27.0, 28.0, 29.0], [30.0, 31.0, 32.0]],
[[36.0, 37.0, 38.0], [39.0, 40.0, 41.0]],
[[45.0, 46.0, 47.0], [48.0, 49.0, 50.0]],
[[54.0, 55.0, 56.0], [57.0, 58.0, 59.0]],
[[63.0, 64.0, 65.0], [66.0, 67.0, 68.0]],
]
]
).astype(np.float32)
# (1, 2, 4, 6) output tensor
y = np.array(
[
[
[
[0.0, 9.0, 1.0, 10.0, 2.0, 11.0],
[18.0, 27.0, 19.0, 28.0, 20.0, 29.0],
[3.0, 12.0, 4.0, 13.0, 5.0, 14.0],
[21.0, 30.0, 22.0, 31.0, 23.0, 32.0],
],
[
[36.0, 45.0, 37.0, 46.0, 38.0, 47.0],
[54.0, 63.0, 55.0, 64.0, 56.0, 65.0],
[39.0, 48.0, 40.0, 49.0, 41.0, 50.0],
[57.0, 66.0, 58.0, 67.0, 59.0, 68.0],
],
]
]
).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_depthtospace_crd_mode_example")
node = onnx.helper.make_node(
"DepthToSpace", inputs=["x"], outputs=["y"], blocksize=2, mode="DCR"
)
# (1, 8, 2, 3) input tensor
x = np.array(
[
[
[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]],
[[9.0, 10.0, 11.0], [12.0, 13.0, 14.0]],
[[18.0, 19.0, 20.0], [21.0, 22.0, 23.0]],
[[27.0, 28.0, 29.0], [30.0, 31.0, 32.0]],
[[36.0, 37.0, 38.0], [39.0, 40.0, 41.0]],
[[45.0, 46.0, 47.0], [48.0, 49.0, 50.0]],
[[54.0, 55.0, 56.0], [57.0, 58.0, 59.0]],
[[63.0, 64.0, 65.0], [66.0, 67.0, 68.0]],
]
]
).astype(np.float32)
# (1, 2, 4, 6) output tensor
y = np.array(
[
[
[
[0.0, 18.0, 1.0, 19.0, 2.0, 20.0],
[36.0, 54.0, 37.0, 55.0, 38.0, 56.0],
[3.0, 21.0, 4.0, 22.0, 5.0, 23.0],
[39.0, 57.0, 40.0, 58.0, 41.0, 59.0],
],
[
[9.0, 27.0, 10.0, 28.0, 11.0, 29.0],
[45.0, 63.0, 46.0, 64.0, 47.0, 65.0],
[12.0, 30.0, 13.0, 31.0, 14.0, 32.0],
[48.0, 66.0, 49.0, 67.0, 50.0, 68.0],
],
]
]
).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_depthtospace_example")
The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the
full-precision tensor. The dequantization formula is y = (x - x_zero_point) * x_scale. x_scale and x_zero_point
must have the same shape, which determines the quantization granularity: a scalar for per-tensor/per-layer quantization,
a 1-D tensor for per-axis quantization, or a tensor with the same rank as the input for blocked quantization.
See QuantizeLinear for details on quantization granularity.
x_zero_point and x must have the same type. x and y must have the same shape. When dequantizing
int32, there is no zero point (it is assumed to be 0).
The zero point is usually not used when quantizing to float8 and 4-bit types, but the dequantization formula remains the same
for consistency. The output type is determined by the attribute output_dtype. If output_dtype is not supplied, the output type
is the same as that of x_scale. The output type also determines the precision of the multiplication operation.
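A minimal NumPy sketch of the formula for the per-tensor and per-axis cases follows. The helper `dequantize_linear` is hypothetical, it ignores blocked quantization and the output_dtype attribute, and it assumes the subtraction is carried out in the scale's floating-point type so that unsigned inputs do not wrap around.

```python
import numpy as np


def dequantize_linear(x, x_scale, x_zero_point=None, axis=1):
    """Per-tensor and per-axis dequantization only; hypothetical helper."""
    x_scale = np.asarray(x_scale)
    if x_zero_point is None:
        zero_point = np.zeros_like(x_scale)
    else:
        zero_point = np.asarray(x_zero_point)
    if x_scale.ndim == 1:
        # Per-axis: reshape so the 1-D parameters broadcast along `axis`.
        shape = [1] * x.ndim
        shape[axis] = -1
        x_scale = x_scale.reshape(shape)
        zero_point = zero_point.reshape(shape)
    out_dtype = x_scale.dtype
    # y = (x - x_zero_point) * x_scale, computed in the scale's precision.
    return (x.astype(out_dtype) - zero_point.astype(out_dtype)) * x_scale


# Per-tensor: scalar scale and zero point.
x = np.array([0, 3, 128, 255], dtype=np.uint8)
print(dequantize_linear(x, np.float32(2), np.uint8(128)))  # [-256. -250.    0.  254.]

# Per-axis: one scale and zero point per row (axis 0).
x2 = np.array([[10, 20], [30, 40]], dtype=np.uint8)
scale = np.array([2.0, 0.5], dtype=np.float32)
zero = np.array([10, 30], dtype=np.uint8)
print(dequantize_linear(x2, scale, zero, axis=0).tolist())  # [[0.0, 20.0], [0.0, 5.0]]
```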
This version of the operator has been available since version 25 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#DequantizeLinear-10">10</a>, <a href="Changelog.md#DequantizeLinear-13">13</a>, <a href="Changelog.md#DequantizeLinear-19">19</a>, <a href="Changelog.md#DequantizeLinear-21">21</a>, <a href="Changelog.md#DequantizeLinear-23">23</a>, <a href="Changelog.md#DequantizeLinear-24">24</a>
node = onnx.helper.make_node(
"DequantizeLinear",
inputs=["x", "x_scale", "x_zero_point"],
outputs=["y"],
)
# 1-D tensor zero point and scale of size equal to axis 1 of the input tensor
x = np.array(
[
[
[[3, 89], [34, 200], [74, 59]],
[[5, 24], [24, 87], [32, 13]],
[[245, 99], [4, 142], [121, 102]],
],
],
dtype=np.uint8,
)
x_scale = np.array([2, 4, 5], dtype=np.float32)
x_zero_point = np.array([84, 24, 196], dtype=np.uint8)
y = (
x.astype(np.float32) - x_zero_point.reshape(1, 3, 1, 1).astype(np.float32)
) * x_scale.reshape(1, 3, 1, 1)
expect(
node,
inputs=[x, x_scale, x_zero_point],
outputs=[y],
name="test_dequantizelinear_axis",
)
node = onnx.helper.make_node(
"DequantizeLinear",
inputs=["x", "x_scale", "x_zero_point"],
outputs=["y"],
axis=1,
block_size=2,
)
x = np.array(
[
[
[[3, 89], [34, 200], [74, 59]],
[[5, 24], [24, 87], [32, 13]],
[[5, 12], [12, 33], [65, 42]],
[[245, 99], [4, 142], [121, 102]],
],
],
dtype=np.uint8,
)
x_scale = np.array(
[
[
[[3.0, 2.0], [4.0, 1.0], [2.0, 2.0]],
[[5.0, 2.0], [4.0, 3.0], [5.0, 2.0]],
],
],
dtype=np.float32,
)
x_zero_point = np.array(
[
[
[[1, 0], [0, 1], [2, 20]],
[[3, 2], [4, 3], [15, 2]],
],
],
dtype=np.uint8,
)
# x.shape = (1, 4, 3, 2)
# x_scale.shape = (1, 2, 3, 2)
assert x_scale.shape == x_zero_point.shape
block_axis = 1
# The block shape is [x.shape[i] // x_scale.shape[i] for i in range(len(x.shape))] = (1, 2, 1, 1)
assert all(
x.shape[i] == x_scale.shape[i]
for i in range(len(x.shape))
if i != block_axis
)
assert x.shape[block_axis] % x_scale.shape[block_axis] == 0
repeats = x.shape[block_axis] // x_scale.shape[block_axis]
# Create element-wise scale and zero point
x_scale_elementwise = np.repeat(x_scale, repeats=repeats, axis=block_axis)
x_zero_point_elementwise = np.repeat(
x_zero_point, repeats=repeats, axis=block_axis
)
y = (
x.astype(np.float32) - x_zero_point_elementwise.astype(np.float32)
) * x_scale_elementwise
expect(
node,
inputs=[x, x_scale, x_zero_point],
outputs=[y],
name="test_dequantizelinear_blocked",
)
node = onnx.helper.make_node(
"DequantizeLinear",
inputs=["x", "x_scale", "x_zero_point"],
outputs=["y"],
)
# scalar zero point and scale
x = np.array([0, 3, 128, 255]).astype(np.uint8)
x_scale = np.float32(2)
x_zero_point = np.uint8(128)
y = np.array([-256, -250, 0, 254], dtype=np.float32)
expect(
node,
inputs=[x, x_scale, x_zero_point],
outputs=[y],
name="test_dequantizelinear",
)
node = onnx.helper.make_node(
"DequantizeLinear",
inputs=["x", "x_scale"],
outputs=["y"],
axis=0,
)
# scalar scale, no zero point
x = make_tensor("x", TensorProto.FLOAT8E4M3FN, [5], [0, 0.5, 1, 448, -104])
x_scale = np.float32(2)
y = np.array([0.0, 1.0, 2.0, 896.0, -208.0], dtype=np.float32)
expect(
node,
inputs=[x, x_scale],
outputs=[y],
name="test_dequantizelinear_e4m3fn",
)
node = onnx.helper.make_node(
"DequantizeLinear",
inputs=["x", "x_scale"],
outputs=["y"],
axis=0,
)
# scalar scale, no zero point
x = make_tensor("x", TensorProto.FLOAT8E4M3FN, [5], [0, 0.5, 1, 448, -104])
x_scale = np.float16(2)
y = np.array([0.0, 1.0, 2.0, 896.0, -208.0], dtype=np.float16)
expect(
node,
inputs=[x, x_scale],
outputs=[y],
name="test_dequantizelinear_e4m3fn_float16",
)
node = onnx.helper.make_node(
"DequantizeLinear",
inputs=["x", "x_scale", "zero_point"],
outputs=["y"],
axis=0,
)
# scalar zero point and scale
x = make_tensor("x", TensorProto.FLOAT8E4M3FN, [5], [0, 0.5, 1, 448, -104])
zero_point = make_tensor("zero_point", TensorProto.FLOAT8E4M3FN, [1], [0])
x_scale = np.float32(2)
y = np.array([0.0, 1.0, 2.0, 896.0, -208.0], dtype=np.float32)
expect(
node,
inputs=[x, x_scale, zero_point],
outputs=[y],
name="test_dequantizelinear_e4m3fn_zero_point",
)
node = onnx.helper.make_node(
"DequantizeLinear",
inputs=["x", "x_scale"],
outputs=["y"],
axis=0,
)
# scalar scale, no zero point
x = make_tensor("x", TensorProto.FLOAT8E5M2, [5], [0, 0.5, 1, 49152, -96])
x_scale = np.float32(2)
y = np.array([0.0, 1.0, 2.0, 98304.0, -192.0], dtype=np.float32)
expect(
node,
inputs=[x, x_scale],
outputs=[y],
name="test_dequantizelinear_e5m2",
)
node = onnx.helper.make_node(
"DequantizeLinear",
inputs=["x", "x_scale", "x_zero_point"],
outputs=["y"],
axis=0,
)
# scalar zero point and scale
x = make_tensor("x", TensorProto.FLOAT4E2M1, [5], [0, 1, -1, 1.5, -4])
x_scale = np.float32(2)
x_zero_point = make_tensor("x_zero_point", TensorProto.FLOAT4E2M1, (1,), [0])
y = np.array([0, 2, -2, 3, -8], dtype=np.float32)
expect(
node,
inputs=[x, x_scale, x_zero_point],
outputs=[y],
name="test_dequantizelinear_float4e2m1",
)
node = onnx.helper.make_node(
"DequantizeLinear",
inputs=["x", "x_scale", "x_zero_point"],
outputs=["y"],
)
x = np.array([-300, -30, -1025, 1270]).astype(np.int16)
x_scale = np.float32(2)
x_zero_point = np.int16(-1024)
y = np.array([1448.0, 1988.0, -2.0, 4588.0], dtype=np.float32)
expect(
node,
inputs=[x, x_scale, x_zero_point],
outputs=[y],
name="test_dequantizelinear_int16",
)
node = onnx.helper.make_node(
"DequantizeLinear",
inputs=["x", "x_scale", "x_zero_point"],
outputs=["y"],
axis=0,
)
# scalar zero point and scale
x = make_tensor("x", TensorProto.INT2, [4], [0, 1, -1, -2])
x_scale = np.float32(2)
x_zero_point = make_tensor("x_zero_point", TensorProto.INT2, (1,), [1])
y = np.array([-2, 0, -4, -6], dtype=np.float32)
expect(
node,
inputs=[x, x_scale, x_zero_point],
outputs=[y],
name="test_dequantizelinear_int2",
)
node = onnx.helper.make_node(
"DequantizeLinear",
inputs=["x", "x_scale", "x_zero_point"],
outputs=["y"],
axis=0,
)
# scalar zero point and scale
x = make_tensor("x", TensorProto.INT4, [5], [0, 1, 7, -4, -8])
x_scale = np.float32(2)
x_zero_point = make_tensor("x_zero_point", TensorProto.INT4, (1,), [1])
y = np.array([-2, 0, 12, -10, -18], dtype=np.float32)
expect(
node,
inputs=[x, x_scale, x_zero_point],
outputs=[y],
name="test_dequantizelinear_int4",
)
node = onnx.helper.make_node(
"DequantizeLinear",
inputs=["x", "x_scale", "x_zero_point"],
outputs=["y"],
)
x = np.array([30000, 31000, 32768, 33000]).astype(np.uint16)
x_scale = np.float32(2)
x_zero_point = np.uint16(32767)
y = np.array([-5534.0, -3534.0, 2.0, 466.0], dtype=np.float32)
expect(
node,
inputs=[x, x_scale, x_zero_point],
outputs=[y],
name="test_dequantizelinear_uint16",
)
node = onnx.helper.make_node(
"DequantizeLinear",
inputs=["x", "x_scale", "x_zero_point"],
outputs=["y"],
axis=0,
)
# scalar zero point and scale
x = make_tensor("x", TensorProto.UINT2, [4], [0, 1, 2, 3])
x_scale = np.float32(2)
x_zero_point = make_tensor("x_zero_point", TensorProto.UINT2, (1,), [1])
y = np.array([-2, 0, 2, 4], dtype=np.float32)
expect(
node,
inputs=[x, x_scale, x_zero_point],
outputs=[y],
name="test_dequantizelinear_uint2",
)
node = onnx.helper.make_node(
"DequantizeLinear",
inputs=["x", "x_scale", "x_zero_point"],
outputs=["y"],
axis=0,
)
# scalar zero point and scale
x = make_tensor("x", TensorProto.UINT4, [5], [0, 1, 7, 10, 15])
x_scale = np.float32(2)
x_zero_point = make_tensor("x_zero_point", TensorProto.UINT4, (1,), [1])
y = np.array([-2, 0, 12, 18, 28], dtype=np.float32)
expect(
node,
inputs=[x, x_scale, x_zero_point],
outputs=[y],
name="test_dequantizelinear_uint4",
)
Det calculates the determinant of a square matrix or of a batch of square matrices.
Det takes one input tensor of shape [*, M, M], where * is zero or more batch dimensions,
and the innermost 2 dimensions form square matrices.
The output is a tensor of shape [*], containing the determinants of all input submatrices.
For example, when the input is 2-D, the output is a scalar (its shape is empty: []).
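The shape rule can be illustrated with `np.linalg.det`, which follows the same [*, M, M] to [*] convention. The variable names and the identity-matrix batch are illustrative only.

```python
import numpy as np

single = np.arange(4, dtype=np.float32).reshape(2, 2)
# A (2, 4) batch of 3x3 identity matrices, i.e. shape (2, 4, 3, 3).
batch = np.tile(np.eye(3, dtype=np.float32), (2, 4, 1, 1))

det_single = np.linalg.det(single)
det_batch = np.linalg.det(batch)

print(np.shape(det_single))  # ()
print(det_batch.shape)       # (2, 4)
```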
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Det-11">11</a>
node = onnx.helper.make_node(
"Det",
inputs=["x"],
outputs=["y"],
)
x = np.arange(4).reshape(2, 2).astype(np.float32)
y = np.linalg.det(x) # expect -2
expect(node, inputs=[x], outputs=[y], name="test_det_2d")
node = onnx.helper.make_node(
"Det",
inputs=["x"],
outputs=["y"],
)
x = np.array([[[1, 2], [3, 4]], [[1, 2], [2, 1]], [[1, 3], [3, 1]]]).astype(
np.float32
)
y = np.linalg.det(x) # expect array([-2., -3., -8.])
expect(node, inputs=[x], outputs=[y], name="test_det_nd")
Performs element-wise binary division (with NumPy-style broadcasting support).
This operator supports multidirectional (i.e., NumPy-style) broadcasting; for more details, see the broadcasting documentation.
For integer inputs, the result is computed using truncating division (rounding toward zero). Opset 14 extended the supported types to include uint8, int8, uint16, and int16.
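Truncating division differs from NumPy's `//` operator, which rounds toward negative infinity, whenever the operands have opposite signs and the division is inexact. The sketch below shows one way to obtain the truncated result from the floored one; `trunc_div` is a hypothetical helper name, and division by zero is not handled.

```python
import numpy as np


def trunc_div(x, y):
    """Integer division rounding toward zero (hypothetical helper)."""
    q = x // y
    # Floor and truncation disagree only for inexact divisions of
    # operands with opposite signs; add 1 back in exactly those cases.
    adjust = ((x % y) != 0) & ((x < 0) != (y < 0))
    return (q + adjust).astype(x.dtype)


x = np.array([-3, 3, -3, 3], dtype=np.int32)
y = np.array([2, 2, -2, -2], dtype=np.int32)
print((x // y).tolist())         # [-2, 1, 1, -2]  (floor)
print(trunc_div(x, y).tolist())  # [-1, 1, 1, -1]  (toward zero)
```

For non-negative operands the two roundings coincide, so `//` yields the same result as truncation in that case.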
This version of the operator has been available since version 14 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Div-1">1</a>, <a href="Changelog.md#Div-6">6</a>, <a href="Changelog.md#Div-7">7</a>, <a href="Changelog.md#Div-13">13</a>
node = onnx.helper.make_node(
"Div",
inputs=["x", "y"],
outputs=["z"],
)
x = np.array([3, 4]).astype(np.float32)
y = np.array([1, 2]).astype(np.float32)
z = x / y # expected output [3., 2.]
expect(node, inputs=[x, y], outputs=[z], name="test_div_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.rand(3, 4, 5).astype(np.float32) + 1.0
z = x / y
expect(node, inputs=[x, y], outputs=[z], name="test_div")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.int8)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.int8) + 1
z = x // y
expect(node, inputs=[x, y], outputs=[z], name="test_div_int8")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.int16)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.int16) + 1
z = x // y
expect(node, inputs=[x, y], outputs=[z], name="test_div_int16")
x = np.array([-3, 3, -3, 3], dtype=np.int32)
y = np.array([2, 2, -2, -2], dtype=np.int32)
z = np.array([-1, 1, 1, -1], dtype=np.int32)
expect(node, inputs=[x, y], outputs=[z], name="test_div_int32_trunc")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8) + 1
z = x // y
expect(node, inputs=[x, y], outputs=[z], name="test_div_uint8")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint16)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint16) + 1
z = x // y
expect(node, inputs=[x, y], outputs=[z], name="test_div_uint16")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint32)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint32) + 1
z = x // y
expect(node, inputs=[x, y], outputs=[z], name="test_div_uint32")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint64)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint64) + 1
z = x // y
expect(node, inputs=[x, y], outputs=[z], name="test_div_uint64")
node = onnx.helper.make_node(
"Div",
inputs=["x", "y"],
outputs=["z"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.rand(5).astype(np.float32) + 1.0
z = x / y
expect(node, inputs=[x, y], outputs=[z], name="test_div_bcast")
Dropout takes an input floating-point tensor, an optional input ratio (floating-point scalar) and an optional input training_mode (boolean scalar). It produces two tensor outputs,
output (floating-point tensor) and mask (optional Tensor<bool>). If training_mode is true, the output Y is a random dropout of the input.
Note that this Dropout scales the masked input data by the following equation, so to convert a trained model to inference mode,
the user can simply omit the training_mode input or set it to false.
output = scale * data * mask,
where
scale = 1. / (1. - ratio).
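The equation above can be sketched in NumPy as follows. This is only an illustration: `dropout_sketch` is a hypothetical name, the default ratio of 0.5 and the use of `numpy.random.default_rng` are assumptions, and the random stream will not match any particular runtime.

```python
import numpy as np


def dropout_sketch(data, ratio=0.5, training_mode=False, seed=0):
    """Return (output, mask) following output = scale * data * mask."""
    if not training_mode or ratio == 0.0:
        # Inference mode: data passes through and nothing is masked.
        return data, np.ones(data.shape, dtype=bool)
    rng = np.random.default_rng(seed)
    # Keep each element with probability (1 - ratio); ratio must be < 1.
    mask = rng.uniform(0.0, 1.0, data.shape) >= ratio
    scale = 1.0 / (1.0 - ratio)
    return (scale * data * mask).astype(data.dtype), mask


x = np.ones((3, 4, 5), dtype=np.float32)
y, mask = dropout_sketch(x, ratio=0.75, training_mode=True)
# Kept elements are scaled by 1 / (1 - 0.75) = 4; dropped elements are 0.
y_inference, mask_inference = dropout_sketch(x, ratio=0.75)
```

Scaling the kept elements by 1 / (1 - ratio) keeps the expected value of each output element equal to the input, which is why inference mode can return the data unchanged.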
This operator has optional inputs/outputs. See the doc for more details about the representation of optional arguments. An empty string may be used in the place of an actual argument's name to indicate a missing argument. Trailing optional arguments (those not followed by an argument that is present) may also be simply omitted.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Dropout-1">1</a>, <a href="Changelog.md#Dropout-6">6</a>, <a href="Changelog.md#Dropout-7">7</a>, <a href="Changelog.md#Dropout-10">10</a>, <a href="Changelog.md#Dropout-12">12</a>, <a href="Changelog.md#Dropout-13">13</a>
seed = np.int64(0)
node = onnx.helper.make_node("Dropout", inputs=["x"], outputs=["y"], seed=seed)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = dropout(x)
expect(node, inputs=[x], outputs=[y], name="test_dropout_default")
seed = np.int64(0)
node = onnx.helper.make_node(
"Dropout", inputs=["x"], outputs=["y", "z"], seed=seed
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y, z = dropout(x, return_mask=True)
expect(node, inputs=[x], outputs=[y, z], name="test_dropout_default_mask")
seed = np.int64(0)
node = onnx.helper.make_node(
"Dropout", inputs=["x", "r"], outputs=["y", "z"], seed=seed
)
r = np.float32(0.1)
x = np.random.randn(3, 4, 5).astype(np.float32)
y, z = dropout(x, r, return_mask=True)
expect(
node, inputs=[x, r], outputs=[y, z], name="test_dropout_default_mask_ratio"
)
node = onnx.helper.make_node(
"Dropout",
inputs=["x"],
outputs=["y"],
)
x = np.array([-1, 0, 1]).astype(np.float32)
y = x
expect(
node,
inputs=[x],
outputs=[y],
name="test_dropout_default_old",
opset_imports=[helper.make_opsetid("", 11)],
)
seed = np.int64(0)
node = onnx.helper.make_node(
"Dropout", inputs=["x", "r"], outputs=["y"], seed=seed
)
r = np.float32(0.1)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = dropout(x, r)
expect(node, inputs=[x, r], outputs=[y], name="test_dropout_default_ratio")
node = onnx.helper.make_node(
"Dropout",
inputs=["x"],
outputs=["y"],
ratio=0.2,
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = x
expect(
node,
inputs=[x],
outputs=[y],
name="test_dropout_random_old",
opset_imports=[helper.make_opsetid("", 11)],
)
seed = np.int64(0)
node = onnx.helper.make_node(
"Dropout", inputs=["x", "r", "t"], outputs=["y"], seed=seed
)
x = np.random.randn(3, 4, 5).astype(np.float32)
r = np.float32(0.75)
t = np.bool_(True)
y = dropout(x, r, training_mode=t)
expect(node, inputs=[x, r, t], outputs=[y], name="test_training_dropout")
seed = np.int64(0)
node = onnx.helper.make_node(
"Dropout", inputs=["x", "r", "t"], outputs=["y"], seed=seed
)
x = np.random.randn(3, 4, 5).astype(np.float32)
r = np.float32(0.5)
t = np.bool_(True)
y = dropout(x, r, training_mode=t)
expect(
node, inputs=[x, r, t], outputs=[y], name="test_training_dropout_default"
)
seed = np.int64(0)
node = onnx.helper.make_node(
"Dropout", inputs=["x", "r", "t"], outputs=["y", "z"], seed=seed
)
x = np.random.randn(3, 4, 5).astype(np.float32)
r = np.float32(0.5)
t = np.bool_(True)
y, z = dropout(x, r, training_mode=t, return_mask=True)
expect(
node,
inputs=[x, r, t],
outputs=[y, z],
name="test_training_dropout_default_mask",
)
seed = np.int64(0)
node = onnx.helper.make_node(
"Dropout", inputs=["x", "r", "t"], outputs=["y"], seed=seed
)
x = np.random.randn(3, 4, 5).astype(np.float32)
r = np.float32(0.0)
t = np.bool_(True)
y = dropout(x, r, training_mode=t)
expect(
node, inputs=[x, r, t], outputs=[y], name="test_training_dropout_zero_ratio"
)
seed = np.int64(0)
node = onnx.helper.make_node(
"Dropout", inputs=["x", "r", "t"], outputs=["y", "z"], seed=seed
)
x = np.random.randn(3, 4, 5).astype(np.float32)
r = np.float32(0.0)
t = np.bool_(True)
y, z = dropout(x, r, training_mode=t, return_mask=True)
expect(
node,
inputs=[x, r, t],
outputs=[y, z],
name="test_training_dropout_zero_ratio_mask",
)
seed = np.int64(0)
node = onnx.helper.make_node(
"Dropout", inputs=["x", "r", "t"], outputs=["y", "z"], seed=seed
)
x = np.random.randn(3, 4, 5).astype(np.float32)
r = np.float32(0.75)
t = np.bool_(True)
y, z = dropout(x, r, training_mode=t, return_mask=True)
expect(
node, inputs=[x, r, t], outputs=[y, z], name="test_training_dropout_mask"
)
A function that fuses the calculation of scale, zero point, and FP32-to-8-bit conversion of FP32 input data. It outputs the scale, zero point, and quantized input for a given FP32 input. The scale is calculated as:
y_scale = (maximum(0, max(x)) - minimum(0, min(x))) / (qmax - qmin)
Zero point is calculated as:
intermediate_zero_point = qmin - min(x)/y_scale
y_zero_point = cast(round(saturate(intermediate_zero_point)))
Data quantization formula is:
y = saturate (round (x / y_scale) + y_zero_point)
This version of the operator has been available since version 11 of the default ONNX operator set.
node = onnx.helper.make_node(
"DynamicQuantizeLinear",
inputs=["x"],
outputs=["y", "y_scale", "y_zero_point"],
)
# expected scale 0.0196078438 and zero point 153
X = np.array([0, 2, -3, -2.5, 1.34, 0.5]).astype(np.float32)
x_min = np.minimum(0, np.min(X))
x_max = np.maximum(0, np.max(X))
Y_Scale = np.float32((x_max - x_min) / (255 - 0)) # uint8 -> [0, 255]
Y_ZeroPoint = np.clip(round((0 - x_min) / Y_Scale), 0, 255).astype(np.uint8)
Y = np.clip(np.round(X / Y_Scale) + Y_ZeroPoint, 0, 255).astype(np.uint8)
expect(
node,
inputs=[X],
outputs=[Y, Y_Scale, Y_ZeroPoint],
name="test_dynamicquantizelinear",
)
# expected scale 0.0156862754 and zero point 255
X = np.array([-1.0, -2.1, -1.3, -2.5, -3.34, -4.0]).astype(np.float32)
x_min = np.minimum(0, np.min(X))
x_max = np.maximum(0, np.max(X))
Y_Scale = np.float32((x_max - x_min) / (255 - 0)) # uint8 -> [0, 255]
Y_ZeroPoint = np.clip(round((0 - x_min) / Y_Scale), 0, 255).astype(np.uint8)
Y = np.clip(np.round(X / Y_Scale) + Y_ZeroPoint, 0, 255).astype(np.uint8)
expect(
node,
inputs=[X],
outputs=[Y, Y_Scale, Y_ZeroPoint],
name="test_dynamicquantizelinear_max_adjusted",
)
X = (
np.array([1, 2.1, 1.3, 2.5, 3.34, 4.0, 1.5, 2.6, 3.9, 4.0, 3.0, 2.345])
.astype(np.float32)
.reshape((3, 4))
)
# expected scale 0.0156862754 and zero point 0
x_min = np.minimum(0, np.min(X))
x_max = np.maximum(0, np.max(X))
Y_Scale = np.float32((x_max - x_min) / (255 - 0)) # uint8 -> [0, 255]
Y_ZeroPoint = np.clip(round((0 - x_min) / Y_Scale), 0, 255).astype(np.uint8)
Y = np.clip(np.round(X / Y_Scale) + Y_ZeroPoint, 0, 255).astype(np.uint8)
expect(
node,
inputs=[X],
outputs=[Y, Y_Scale, Y_ZeroPoint],
name="test_dynamicquantizelinear_min_adjusted",
)
An einsum of the form term1, term2 -> output-term produces an output tensor using the following equation:
output[output-term] = reduce-sum( input1[term1] * input2[term2] )
where the reduce-sum performs a summation over all the indices occurring in the input terms (term1, term2) that do not occur in the output-term.
The Einsum operator evaluates algebraic tensor operations on a sequence of tensors, using the Einstein summation convention. The equation string contains a comma-separated sequence of lowercase letters. Each term corresponds to an operand tensor, and the characters within the terms correspond to the operand's dimensions.
This sequence may be followed by "->" to separate the left and right hand side of the equation. If the equation contains "->" followed by the right-hand side, the explicit (not classical) form of the Einstein summation is performed, and the right-hand side indices indicate output tensor dimensions. In other cases, output indices are (implicitly) set to the alphabetically sorted sequence of indices appearing exactly once in the equation.
When a dimension character is repeated in the left-hand side, it represents summation along the dimension.
The equation may contain an ellipsis ("...") to enable broadcasting. An ellipsis must indicate a fixed number of dimensions. Specifically, every occurrence of an ellipsis in the equation must represent the same number of dimensions. The right-hand side may contain exactly one ellipsis. In implicit mode, the ellipsis dimensions are set to the beginning of the output. The equation string may contain space (U+0020) characters.
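The explicit and implicit forms can be compared with `numpy.einsum`, used here only as a stand-in under the assumption that its conventions agree with the operator for these simple equations. The array names are illustrative.

```python
import numpy as np

A = np.arange(6, dtype=np.float64).reshape(2, 3)
B = np.arange(12, dtype=np.float64).reshape(3, 4)

# Explicit form: j is absent from the output term, so it is summed over.
explicit = np.einsum("ij,jk->ik", A, B)

# Implicit form: the output indices are the sorted indices that appear
# exactly once (i and k), so this is the same matrix product.
implicit = np.einsum("ij,jk", A, B)

# Implicit form with the single term "ji": the output is "ij", a transpose.
transposed = np.einsum("ji", A)

# An ellipsis broadcasts over the leading batch dimension.
batch = np.random.randn(5, 2, 3)
batched = np.einsum("...ij,jk->...ik", batch, B)
print(batched.shape)  # (5, 2, 4)
```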
This version of the operator has been available since version 12 of the default ONNX operator set.
Eqn = "...ii ->...i"
node = onnx.helper.make_node(
"Einsum", inputs=["x"], outputs=["y"], equation=Eqn
)
X = np.random.randn(3, 5, 5)
Z = einsum_reference_implementation(Eqn, (X,))
expect(node, inputs=[X], outputs=[Z], name="test_einsum_batch_diagonal")
Eqn = "bij, bjk -> bik"
node = onnx.helper.make_node(
"Einsum", inputs=["x", "y"], outputs=["z"], equation=Eqn
)
X = np.random.randn(5, 2, 3)
Y = np.random.randn(5, 3, 4)
Z = einsum_reference_implementation(Eqn, (X, Y))
expect(node, inputs=[X, Y], outputs=[Z], name="test_einsum_batch_matmul")
Eqn = "i,i"
node = onnx.helper.make_node(
"Einsum", inputs=["x", "y"], outputs=["z"], equation=Eqn
)
X = np.random.randn(5)
Y = np.random.randn(5)
Z = einsum_reference_implementation(Eqn, (X, Y))
expect(node, inputs=[X, Y], outputs=[Z], name="test_einsum_inner_prod")
Eqn = "->"
node = onnx.helper.make_node(
"Einsum", inputs=["x"], outputs=["y"], equation=Eqn
)
X = np.array(5.0) # scalar input
Z = einsum_reference_implementation(Eqn, (X,))
expect(node, inputs=[X], outputs=[Z], name="test_einsum_scalar")
Eqn = "ij->i"
node = onnx.helper.make_node(
"Einsum", inputs=["x"], outputs=["y"], equation=Eqn
)
X = np.random.randn(3, 4)
Z = einsum_reference_implementation(Eqn, (X,))
expect(node, inputs=[X], outputs=[Z], name="test_einsum_sum")
Eqn = "ij->ji"
node = onnx.helper.make_node(
"Einsum", inputs=["x"], outputs=["y"], equation=Eqn
)
X = np.random.randn(3, 4)
Y = einsum_reference_implementation(Eqn, (X,))
expect(node, inputs=[X], outputs=[Y], name="test_einsum_transpose")
Elu takes one input data (Tensor<T>) and produces one output data
(Tensor<T>) where the function f(x) = alpha * (exp(x) - 1.) for x < 0, f(x) = x for x >= 0., is applied to the tensor elementwise.
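The piecewise definition can be written directly with `numpy.where`, as in the sketch below. The name `elu_sketch` is hypothetical, and a default `alpha` of 1.0 is assumed.

```python
import numpy as np


def elu_sketch(x, alpha=1.0):
    """f(x) = alpha * (exp(x) - 1) for x < 0, f(x) = x for x >= 0."""
    # Clamp before exponentiating so large positive inputs cannot overflow.
    negative_branch = alpha * np.expm1(np.minimum(x, 0))
    return np.where(x < 0, negative_branch, x).astype(x.dtype)


x = np.array([-1, 0, 1], dtype=np.float32)
y = elu_sketch(x, alpha=2.0)  # approximately [-1.2642411, 0., 1.]
```

For x = -1 and alpha = 2 the negative branch gives 2 * (exp(-1) - 1), roughly -1.2642411, while non-negative inputs pass through unchanged.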
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Elu-1">1</a>, <a href="Changelog.md#Elu-6">6</a>
node = onnx.helper.make_node("Elu", inputs=["x"], outputs=["y"], alpha=2.0)
x = np.array([-1, 0, 1]).astype(np.float32)
# expected output [-1.2642411, 0., 1.]
y = np.clip(x, 0, np.inf) + (np.exp(np.clip(x, -np.inf, 0)) - 1) * 2.0
expect(node, inputs=[x], outputs=[y], name="test_elu_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, 0, np.inf) + (np.exp(np.clip(x, -np.inf, 0)) - 1) * 2.0
expect(node, inputs=[x], outputs=[y], name="test_elu")
default_alpha = 1.0
node = onnx.helper.make_node(
"Elu",
inputs=["x"],
outputs=["y"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, 0, np.inf) + (np.exp(np.clip(x, -np.inf, 0)) - 1) * default_alpha
expect(node, inputs=[x], outputs=[y], name="test_elu_default")
Returns the tensor resulting from performing the equal logical operation
elementwise on the input tensors A and B (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
This version of the operator has been available since version 19 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Equal-1">1</a>, <a href="Changelog.md#Equal-7">7</a>, <a href="Changelog.md#Equal-11">11</a>, <a href="Changelog.md#Equal-13">13</a>
node = onnx.helper.make_node(
"Equal",
inputs=["x", "y"],
outputs=["z"],
)
x = (np.random.randn(3, 4, 5) * 10).astype(np.int32)
y = (np.random.randn(3, 4, 5) * 10).astype(np.int32)
z = np.equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_equal")
x = (np.random.randn(3, 4, 5) * 10).astype(np.int8)
y = (np.random.randn(3, 4, 5) * 10).astype(np.int8)
z = np.equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_equal_int8")
x = (np.random.randn(3, 4, 5) * 10).astype(np.int16)
y = (np.random.randn(3, 4, 5) * 10).astype(np.int16)
z = np.equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_equal_int16")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8)
z = np.equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_equal_uint8")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint16)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint16)
z = np.equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_equal_uint16")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint32)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint32)
z = np.equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_equal_uint32")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint64)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint64)
z = np.equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_equal_uint64")
node = onnx.helper.make_node(
"Equal",
inputs=["x", "y"],
outputs=["z"],
)
x = (np.random.randn(3, 4, 5) * 10).astype(np.int32)
y = (np.random.randn(5) * 10).astype(np.int32)
z = np.equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_equal_bcast")
node = onnx.helper.make_node(
"Equal",
inputs=["x", "y"],
outputs=["z"],
)
x = np.array(["string1", "string2"], dtype=np.dtype(object))
y = np.array(["string1", "string3"], dtype=np.dtype(object))
z = np.equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_equal_string")
node = onnx.helper.make_node(
"Equal",
inputs=["x", "y"],
outputs=["z"],
)
x = np.array(["string1", "string2"], dtype=np.dtype(object))
y = np.array(["string1"], dtype=np.dtype(object))
z = np.equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_equal_string_broadcast")
Computes the error function of the given input tensor element-wise.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Erf-9">9</a>
node = onnx.helper.make_node(
"Erf",
inputs=["x"],
outputs=["y"],
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
y = np.vectorize(math.erf)(x).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_erf")
Calculates the exponential of the given input tensor, element-wise.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Exp-1">1</a>, <a href="Changelog.md#Exp-6">6</a>
node = onnx.helper.make_node(
"Exp",
inputs=["x"],
outputs=["y"],
)
x = np.array([-1, 0, 1]).astype(np.float32)
y = np.exp(x) # expected output [0.36787945, 1., 2.71828175]
expect(node, inputs=[x], outputs=[y], name="test_exp_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.exp(x)
expect(node, inputs=[x], outputs=[y], name="test_exp")
Broadcasts the input tensor following the given shape and the broadcast rule. The broadcast rule is similar to numpy.array(input) * numpy.ones(shape): dimensions are right-aligned, and two corresponding dimensions must either have the same value or one of them must be equal to 1. This operator is also similar to numpy.broadcast_to(input, shape), but the major difference is that numpy.broadcast_to() does not allow shape to be smaller than input.size(). It is possible that output.shape is not equal to shape, when some dimensions in shape are equal to 1, or when shape.ndim < input.shape.ndim.
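The cases where the output shape differs from the requested shape are easy to check in NumPy. The sketch below follows the `data * numpy.ones(shape)` analogy. `expand_sketch` is a hypothetical name and not an ONNX API.

```python
import numpy as np


def expand_sketch(data, shape):
    """Broadcast data against shape, as in data * numpy.ones(shape)."""
    return data * np.ones(shape, dtype=data.dtype)


data = np.arange(6, dtype=np.float32).reshape(2, 3, 1)

# shape has fewer dimensions than data, so the output keeps data's rank.
out = expand_sketch(data, [3, 4])
print(out.shape)  # (2, 3, 4)

# A 1 in shape leaves the matching input dimension unchanged.
print(expand_sketch(data, [2, 1, 1]).shape)  # (2, 3, 1)

# numpy.broadcast_to rejects a target with fewer dimensions than the input.
try:
    np.broadcast_to(data, (3, 4))
    broadcast_to_failed = False
except ValueError:
    broadcast_to_failed = True
```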
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Expand-8">8</a>
node = onnx.helper.make_node(
"Expand",
inputs=["data", "new_shape"],
outputs=["expanded"],
)
shape = [3, 1]
data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[1.], [2.], [3.]]
new_shape = [2, 1, 6]
expanded = data * np.ones(new_shape, dtype=np.float32)
# print(expanded)
# [[[1., 1., 1., 1., 1., 1.],
# [2., 2., 2., 2., 2., 2.],
# [3., 3., 3., 3., 3., 3.]],
#
# [[1., 1., 1., 1., 1., 1.],
# [2., 2., 2., 2., 2., 2.],
# [3., 3., 3., 3., 3., 3.]]]
new_shape = np.array(new_shape, dtype=np.int64)
expect(
node,
inputs=[data, new_shape],
outputs=[expanded],
name="test_expand_dim_changed",
)
node = onnx.helper.make_node(
"Expand",
inputs=["data", "new_shape"],
outputs=["expanded"],
)
shape = [3, 1]
new_shape = [3, 4]
data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[1.], [2.], [3.]]
expanded = np.tile(data, 4)
# print(expanded)
# [[1., 1., 1., 1.],
# [2., 2., 2., 2.],
# [3., 3., 3., 3.]]
new_shape = np.array(new_shape, dtype=np.int64)
expect(
node,
inputs=[data, new_shape],
outputs=[expanded],
name="test_expand_dim_unchanged",
)
Generate a 2D tensor (matrix) with ones on the diagonal and zeros everywhere else. Only 2D tensors are supported, i.e. input T1 must be of rank 2. The shape of the output tensor is the same as the input tensor. The data type can be specified by the 'dtype' argument. If 'dtype' is not specified, then the type of input tensor is used. By default, the main diagonal is populated with ones, but attribute 'k' can be used to populate upper or lower diagonals. The 'dtype' argument must be one of the data types specified in the 'DataType' enum field in the TensorProto message and be valid as an output type.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#EyeLike-9">9</a>
shape = (4, 5)
off_diagonal_offset = 1
node = onnx.helper.make_node(
"EyeLike",
inputs=["x"],
outputs=["y"],
k=off_diagonal_offset,
dtype=onnx.TensorProto.FLOAT,
)
x = np.random.randint(0, 100, size=shape, dtype=np.int32)
y = np.eye(shape[0], shape[1], k=off_diagonal_offset, dtype=np.float32)
expect(
node,
inputs=[x],
outputs=[y],
name="test_eyelike_populate_off_main_diagonal",
)
shape = (3, 4)
node = onnx.helper.make_node(
"EyeLike",
inputs=["x"],
outputs=["y"],
dtype=onnx.TensorProto.DOUBLE,
)
x = np.random.randint(0, 100, size=shape, dtype=np.int32)
y = np.eye(shape[0], shape[1], dtype=np.float64)
expect(node, inputs=[x], outputs=[y], name="test_eyelike_with_dtype")
shape = (4, 4)
node = onnx.helper.make_node(
"EyeLike",
inputs=["x"],
outputs=["y"],
)
x = np.random.randint(0, 100, size=shape, dtype=np.int32)
y = np.eye(shape[0], shape[1], dtype=np.int32)
expect(node, inputs=[x], outputs=[y], name="test_eyelike_without_dtype")
Flattens the input tensor into a 2D matrix. If input tensor has shape (d_0, d_1, ... d_n) then the output will have shape (d_0 X d_1 ... d_(axis-1), d_axis X d_(axis+1) ... X dn).
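The output shape for a given axis can be computed with a few lines of Python. This is a sketch: `flatten_shape` is a hypothetical helper, and it assumes that a negative axis counts back from the last dimension and that axis defaults to 1.

```python
import math


def flatten_shape(shape, axis=1):
    """Return the 2D output shape for an input shape and axis."""
    # A negative axis counts back from the last dimension.
    if axis < 0:
        axis += len(shape)
    # math.prod of an empty sequence is 1, which covers axis == 0.
    return (math.prod(shape[:axis]), math.prod(shape[axis:]))


shape = (2, 3, 4, 5)
print(flatten_shape(shape, 0))   # (1, 120)
print(flatten_shape(shape, 1))   # (2, 60)
print(flatten_shape(shape, 2))   # (6, 20)
print(flatten_shape(shape, -1))  # (24, 5)
```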
This version of the operator has been available since version 25 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Flatten-1">1</a>, <a href="Changelog.md#Flatten-9">9</a>, <a href="Changelog.md#Flatten-11">11</a>, <a href="Changelog.md#Flatten-13">13</a>, <a href="Changelog.md#Flatten-21">21</a>, <a href="Changelog.md#Flatten-23">23</a>, <a href="Changelog.md#Flatten-24">24</a>
shape = (2, 3, 4, 5)
a = np.random.random_sample(shape).astype(np.float32)
for i in range(len(shape)):
    node = onnx.helper.make_node(
        "Flatten",
        inputs=["a"],
        outputs=["b"],
        axis=i,
    )
    new_shape = (1, -1) if i == 0 else (np.prod(shape[0:i]).astype(int), -1)
    b = np.reshape(a, new_shape)
    expect(node, inputs=[a], outputs=[b], name="test_flatten_axis" + str(i))
shape = (2, 3, 4, 5)
a = np.random.random_sample(shape).astype(np.float32)
for i in range(-len(shape), 0):
    node = onnx.helper.make_node(
        "Flatten",
        inputs=["a"],
        outputs=["b"],
        axis=i,
    )
    new_shape = (np.prod(shape[0:i]).astype(int), -1)
    b = np.reshape(a, new_shape)
    expect(
        node,
        inputs=[a],
        outputs=[b],
        name="test_flatten_negative_axis" + str(abs(i)),
    )
node = onnx.helper.make_node(
"Flatten",
inputs=["a"],
outputs=["b"], # Default value for axis: axis=1
)
shape = (5, 4, 3, 2)
a = np.random.random_sample(shape).astype(np.float32)
new_shape = (5, 24)
b = np.reshape(a, new_shape)
expect(node, inputs=[a], outputs=[b], name="test_flatten_default_axis")
Floor takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the floor function, y = floor(x), is applied to the tensor elementwise. If x is integral, +0, -0, NaN, or infinite, x itself is returned.
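The special cases can be checked with `numpy.floor`, used here as a stand-in on the assumption that it follows the same IEEE 754 behavior. The variable names are illustrative.

```python
import numpy as np

# Integral values, signed zeros, and infinities are returned unchanged.
special = np.array([2.0, 0.0, -0.0, np.inf, -np.inf], dtype=np.float32)
floored = np.floor(special)
same_values = np.array_equal(floored, special)
# -0.0 compares equal to +0.0, so the sign bits are compared separately.
same_signs = np.array_equal(np.signbit(floored), np.signbit(special))

# NaN stays NaN.
nan_stays_nan = bool(np.isnan(np.floor(np.float32(np.nan))))

# Non-integral values round toward negative infinity.
rounded = np.floor(np.array([-1.5, 1.2], dtype=np.float32))
print(rounded.tolist())  # [-2.0, 1.0]
```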
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Floor-1">1</a>, <a href="Changelog.md#Floor-6">6</a>
node = onnx.helper.make_node(
"Floor",
inputs=["x"],
outputs=["y"],
)
x = np.array([-1.5, 1.2, 2]).astype(np.float32)
y = np.floor(x) # expected output [-2., 1., 2.]
expect(node, inputs=[x], outputs=[y], name="test_floor_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.floor(x)
expect(node, inputs=[x], outputs=[y], name="test_floor")
Computes a one-layer GRU. This operator is usually supported via some custom implementation such as CuDNN.
Notations:
* X - input tensor
* z - update gate
* r - reset gate
* h - hidden gate
* t - time step (t-1 means previous time step)
* W[zrh] - W parameter weight matrix for update, reset, and hidden gates
* R[zrh] - R recurrence weight matrix for update, reset, and hidden gates
* Wb[zrh] - W bias vectors for update, reset, and hidden gates
* Rb[zrh] - R bias vectors for update, reset, and hidden gates
* WB[zrh] - W parameter weight matrix for backward update, reset, and hidden gates
* RB[zrh] - R recurrence weight matrix for backward update, reset, and hidden gates
* WBb[zrh] - W bias vectors for backward update, reset, and hidden gates
* RBb[zrh] - R bias vectors for backward update, reset, and hidden gates
* H - Hidden state
* num_directions - 2 if direction == bidirectional else 1

Activation functions:
NOTE: Below are optional
Equations (Default: f=Sigmoid, g=Tanh):
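A minimal NumPy sketch of one forward time step with the default activations is given below. It assumes the usual GRU formulation with the gates stacked in z, r, h order, a single direction, and the reset gate applied before the recurrent matrix multiplication (linear_before_reset = 0). The names `sigmoid` and `gru_step` are hypothetical helpers, not ONNX APIs.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru_step(x_t, h_prev, W, R, Wb, Rb):
    """One forward GRU time step (hypothetical helper).

    x_t: [batch, input_size], h_prev: [batch, hidden_size]
    W: [3*hidden_size, input_size], R: [3*hidden_size, hidden_size]
    Wb, Rb: [3*hidden_size]; gates are stacked in z, r, h order.
    """
    W_z, W_r, W_h = np.split(W, 3)
    R_z, R_r, R_h = np.split(R, 3)
    Wb_z, Wb_r, Wb_h = np.split(Wb, 3)
    Rb_z, Rb_r, Rb_h = np.split(Rb, 3)

    z = sigmoid(x_t @ W_z.T + h_prev @ R_z.T + Wb_z + Rb_z)  # update gate
    r = sigmoid(x_t @ W_r.T + h_prev @ R_r.T + Wb_r + Rb_r)  # reset gate
    # Candidate hidden state; the reset gate scales the previous state.
    h = np.tanh(x_t @ W_h.T + (r * h_prev) @ R_h.T + Rb_h + Wb_h)
    # Blend the candidate with the previous hidden state.
    return (1.0 - z) * h + z * h_prev


hidden_size, input_size = 4, 2
x_t = np.ones((1, input_size))
h_prev = np.full((1, hidden_size), 2.0)
W = np.zeros((3 * hidden_size, input_size))
R = np.zeros((3 * hidden_size, hidden_size))
b = np.zeros(3 * hidden_size)
H = gru_step(x_t, h_prev, W, R, b, b)
```

With all-zero weights both gates evaluate to 0.5 and the candidate state is 0, so the new hidden state is half of the previous one.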
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#GRU-1">1</a>, <a href="Changelog.md#GRU-3">3</a>, <a href="Changelog.md#GRU-7">7</a>, <a href="Changelog.md#GRU-14">14</a>
input = np.array([[[1.0, 2.0]], [[3.0, 4.0]], [[5.0, 6.0]]]).astype(np.float32)
input_size = 2
hidden_size = 6
number_of_gates = 3
weight_scale = 0.2
layout = 1
node = onnx.helper.make_node(
"GRU",
inputs=["X", "W", "R"],
outputs=["Y", "Y_h"],
hidden_size=hidden_size,
layout=layout,
)
W = weight_scale * np.ones(
(1, number_of_gates * hidden_size, input_size)
).astype(np.float32)
R = weight_scale * np.ones(
(1, number_of_gates * hidden_size, hidden_size)
).astype(np.float32)
gru = GRUHelper(X=input, W=W, R=R, layout=layout)
Y, Y_h = gru.step()
expect(
node,
inputs=[input, W, R],
outputs=[Y.astype(np.float32), Y_h.astype(np.float32)],
name="test_gru_batchwise",
)
input = np.array([[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]).astype(np.float32)
input_size = 2
hidden_size = 5
weight_scale = 0.1
number_of_gates = 3
node = onnx.helper.make_node(
"GRU", inputs=["X", "W", "R"], outputs=["", "Y_h"], hidden_size=hidden_size
)
W = weight_scale * np.ones(
(1, number_of_gates * hidden_size, input_size)
).astype(np.float32)
R = weight_scale * np.ones(
(1, number_of_gates * hidden_size, hidden_size)
).astype(np.float32)
gru = GRUHelper(X=input, W=W, R=R)
_, Y_h = gru.step()
expect(
node,
inputs=[input, W, R],
outputs=[Y_h.astype(np.float32)],
name="test_gru_defaults",
)
input = np.array([[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]]).astype(
np.float32
)
input_size = 3
hidden_size = 3
weight_scale = 0.1
custom_bias = 0.1
number_of_gates = 3
node = onnx.helper.make_node(
"GRU",
inputs=["X", "W", "R", "B"],
outputs=["", "Y_h"],
hidden_size=hidden_size,
)
W = weight_scale * np.ones(
(1, number_of_gates * hidden_size, input_size)
).astype(np.float32)
R = weight_scale * np.ones(
(1, number_of_gates * hidden_size, hidden_size)
).astype(np.float32)
# Adding custom bias
W_B = custom_bias * np.ones((1, number_of_gates * hidden_size)).astype(
np.float32
)
R_B = np.zeros((1, number_of_gates * hidden_size)).astype(np.float32)
B = np.concatenate((W_B, R_B), axis=1)
gru = GRUHelper(X=input, W=W, R=R, B=B)
_, Y_h = gru.step()
expect(
node,
inputs=[input, W, R, B],
outputs=[Y_h.astype(np.float32)],
name="test_gru_with_initial_bias",
)
input = np.array(
[
[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]],
[[10.0, 11.0, 12.0], [13.0, 14.0, 15.0], [16.0, 17.0, 18.0]],
]
).astype(np.float32)
input_size = 3
hidden_size = 5
number_of_gates = 3
node = onnx.helper.make_node(
"GRU",
inputs=["X", "W", "R", "B"],
outputs=["", "Y_h"],
hidden_size=hidden_size,
)
W = np.random.randn(1, number_of_gates * hidden_size, input_size).astype(
np.float32
)
R = np.random.randn(1, number_of_gates * hidden_size, hidden_size).astype(
np.float32
)
# Adding custom bias
W_B = np.random.randn(1, number_of_gates * hidden_size).astype(np.float32)
R_B = np.random.randn(1, number_of_gates * hidden_size).astype(np.float32)
B = np.concatenate((W_B, R_B), axis=1)
gru = GRUHelper(X=input, W=W, R=R, B=B)
_, Y_h = gru.step()
expect(
node,
inputs=[input, W, R, B],
outputs=[Y_h.astype(np.float32)],
name="test_gru_seq_length",
)
Given a data tensor of rank r >= 1 and an indices tensor of rank q, gathers
entries of the axis dimension of data (by default the outer-most one, axis=0) indexed by indices, and concatenates
them in an output tensor of rank q + (r - 1).
It is an indexing operation that indexes into the input data along a single (specified) axis.
Each entry in indices produces an (r-1)-dimensional slice of the input tensor.
The entire operation produces, conceptually, a q-dimensional tensor of (r-1)-dimensional slices,
which is arranged into a (q + (r-1))-dimensional tensor, with the q dimensions taking the
place of the original axis that is being indexed into.
The following few examples illustrate how Gather works for specific shapes of data,
indices, and given value of axis:
| data shape | indices shape | axis | output shape | output equation |
|---|---|---|---|---|
| (P, Q) | ( ) (a scalar) | 0 | (Q) | output[q] = data[indices, q] |
| (P, Q, R) | ( ) (a scalar) | 1 | (P, R) | output[p, r] = data[p, indices, r] |
| (P, Q) | (R, S) | 0 | (R, S, Q) | output[r, s, q] = data[indices[r, s], q] |
| (P, Q) | (R, S) | 1 | (P, R, S) | output[p, r, s] = data[p, indices[r, s]] |
More generally, if axis = 0, let k = indices[i_{0}, ..., i_{q-1}]
then output[i_{0}, ..., i_{q-1}, j_{0}, ..., j_{r-2}] = input[k , j_{0}, ..., j_{r-2}]:
data = [
[1.0, 1.2],
[2.3, 3.4],
[4.5, 5.7],
]
indices = [
[0, 1],
[1, 2],
]
output = [
[
[1.0, 1.2],
[2.3, 3.4],
],
[
[2.3, 3.4],
[4.5, 5.7],
],
]
If axis = 1, let k = indices[i_{0}, ..., i_{q-1}]
then output[j_{0}, i_{0}, ..., i_{q-1}, j_{1}, ..., j_{r-2}] = input[j_{0}, k, j_{1}, ..., j_{r-2}]:
data = [
[1.0, 1.2, 1.9],
[2.3, 3.4, 3.9],
[4.5, 5.7, 5.9],
]
indices = [
[0, 2],
]
axis = 1
output = [
[[1.0, 1.9]],
[[2.3, 3.9]],
[[4.5, 5.9]],
]
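Both worked examples above can be reproduced with `numpy.take`, which is used here as an assumed reference for the indexing semantics. The variable names are illustrative.

```python
import numpy as np

# axis = 0 example: each index selects a whole row of data.
data0 = np.array([[1.0, 1.2], [2.3, 3.4], [4.5, 5.7]])
indices0 = np.array([[0, 1], [1, 2]])
out0 = np.take(data0, indices0, axis=0)
print(out0.shape)  # (2, 2, 2)

# axis = 1 example: the indices shape replaces the indexed axis.
data1 = np.array([[1.0, 1.2, 1.9], [2.3, 3.4, 3.9], [4.5, 5.7, 5.9]])
indices1 = np.array([[0, 2]])
out1 = np.take(data1, indices1, axis=1)
print(out1.shape)  # (3, 1, 2)
```

In both cases the output rank is q + (r - 1) = 2 + 1 = 3.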
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Gather-1">1</a>, <a href="Changelog.md#Gather-11">11</a>
node = onnx.helper.make_node(
"Gather",
inputs=["data", "indices"],
outputs=["y"],
axis=0,
)
data = np.random.randn(5, 4, 3, 2).astype(np.float32)
indices = np.array([0, 1, 3])
y = np.take(data, indices, axis=0)
expect(
node,
inputs=[data, indices.astype(np.int64)],
outputs=[y],
name="test_gather_0",
)
node = onnx.helper.make_node(
"Gather",
inputs=["data", "indices"],
outputs=["y"],
axis=1,
)
data = np.random.randn(5, 4, 3, 2).astype(np.float32)
indices = np.array([0, 1, 3])
y = np.take(data, indices, axis=1)
expect(
node,
inputs=[data, indices.astype(np.int64)],
outputs=[y],
name="test_gather_1",
)
node = onnx.helper.make_node(
"Gather",
inputs=["data", "indices"],
outputs=["y"],
axis=1,
)
data = np.random.randn(3, 3).astype(np.float32)
indices = np.array([[0, 2]])
y = np.take(data, indices, axis=1)
expect(
node,
inputs=[data, indices.astype(np.int64)],
outputs=[y],
name="test_gather_2d_indices",
)
node = onnx.helper.make_node(
"Gather",
inputs=["data", "indices"],
outputs=["y"],
axis=0,
)
data = np.arange(10).astype(np.float32)
indices = np.array([0, -9, -10])
y = np.take(data, indices, axis=0)
# print(y)
# [0. 1. 0.]
expect(
node,
inputs=[data, indices.astype(np.int64)],
outputs=[y],
name="test_gather_negative_indices",
)
GatherElements takes two inputs data and indices of the same rank r >= 1
and an optional attribute axis that identifies an axis of data
(by default, the outer-most axis, that is axis 0). It is an indexing operation
that produces its output by indexing into the input data tensor at index
positions determined by elements of the indices tensor.
Its output shape is the same as the shape of indices and consists of one value
(gathered from the data) for each element in indices.
For instance, in the 3-D case (r = 3), the output produced is determined by the following equations:
out[i][j][k] = input[index[i][j][k]][j][k] if axis = 0,
out[i][j][k] = input[i][index[i][j][k]][k] if axis = 1,
out[i][j][k] = input[i][j][index[i][j][k]] if axis = 2,
This operator is also the inverse of ScatterElements. It is similar to Torch's gather operation.
Example 1:
data = [
[1, 2],
[3, 4],
]
indices = [
[0, 0],
[1, 0],
]
axis = 1
output = [
[1, 1],
[4, 3],
]
Example 2:
data = [
[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
]
indices = [
[1, 2, 0],
[2, 0, 0],
]
axis = 0
output = [
[4, 8, 3],
[7, 2, 3],
]
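For illustration, both examples can be reproduced with `numpy.take_along_axis`. The wrapper name `gather_elements_sketch` is hypothetical, and the sketch assumes that indices matches data on every dimension except axis.

```python
import numpy as np


def gather_elements_sketch(data, indices, axis=0):
    """Pick one value of data per element of indices along axis."""
    # Assumes indices matches data on every dimension except axis.
    return np.take_along_axis(data, indices, axis=axis)


# Example 1: axis = 1
data1 = np.array([[1, 2], [3, 4]])
indices1 = np.array([[0, 0], [1, 0]])
out1 = gather_elements_sketch(data1, indices1, axis=1)
print(out1.tolist())  # [[1, 1], [4, 3]]

# Example 2: axis = 0
data2 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
indices2 = np.array([[1, 2, 0], [2, 0, 0]])
out2 = gather_elements_sketch(data2, indices2, axis=0)
print(out2.tolist())  # [[4, 8, 3], [7, 2, 3]]
```

In each case the output has the same shape as indices.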
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#GatherElements-11">11</a>
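The per-element equations above can be written as a short loop. The following is a minimal sketch, not the reference implementation used by the tests below; the name `gather_elements_sketch` is hypothetical, and the sketch relies on NumPy's own handling of negative indices.

```python
import numpy as np


def gather_elements_sketch(data, indices, axis=0):
    """Loop-based GatherElements that follows the per-element equations."""
    data = np.asarray(data)
    indices = np.asarray(indices)
    # The output has the same shape as indices.
    out = np.empty(indices.shape, dtype=data.dtype)
    for pos in np.ndindex(*indices.shape):
        src = list(pos)
        # Replace the coordinate along `axis` with the gathered index;
        # negative values count from the end, as in NumPy indexing.
        src[axis] = int(indices[pos])
        out[pos] = data[tuple(src)]
    return out


data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.float32)
indices = np.array([[1, 2, 0], [2, 0, 0]], dtype=np.int64)
result = gather_elements_sketch(data, indices, axis=0)
print(result)  # same values as the output of Example 2
```

When indices has the same extent as data on every non-axis dimension, `np.take_along_axis(data, indices, axis)` should give the same result without an explicit loop.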
axis = 1
node = onnx.helper.make_node(
"GatherElements",
inputs=["data", "indices"],
outputs=["y"],
axis=axis,
)
data = np.array([[1, 2], [3, 4]], dtype=np.float32)
indices = np.array([[0, 0], [1, 0]], dtype=np.int32)
y = gather_elements(data, indices, axis)
# print(y) produces
# [[1, 1],
# [4, 3]]
expect(
node,
inputs=[data, indices.astype(np.int64)],
outputs=[y],
name="test_gather_elements_0",
)
axis = 0
node = onnx.helper.make_node(
"GatherElements",
inputs=["data", "indices"],
outputs=["y"],
axis=axis,
)
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.float32)
indices = np.array([[1, 2, 0], [2, 0, 0]], dtype=np.int32)
y = gather_elements(data, indices, axis)
# print(y) produces
# [[4, 8, 3],
# [7, 2, 3]]
expect(
node,
inputs=[data, indices.astype(np.int64)],
outputs=[y],
name="test_gather_elements_1",
)
axis = 0
node = onnx.helper.make_node(
"GatherElements",
inputs=["data", "indices"],
outputs=["y"],
axis=axis,
)
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.float32)
indices = np.array([[-1, -2, 0], [-2, 0, 0]], dtype=np.int32)
y = gather_elements(data, indices, axis)
# print(y) produces
# [[7, 5, 3],
# [4, 2, 3]]
expect(
node,
inputs=[data, indices.astype(np.int64)],
outputs=[y],
name="test_gather_elements_negative_indices",
)
Given data tensor of rank r >= 1, indices tensor of rank q >= 1, and batch_dims integer b, this operator gathers
slices of data into an output tensor of rank q + r - indices_shape[-1] - 1 - b.
indices is a q-dimensional integer tensor, best thought of as a (q-1)-dimensional tensor of index-tuples into data,
where each element defines a slice of data.
batch_dims (denoted as b) is an integer indicating the number of batch dimensions, i.e., the leading b dimensions of the
data tensor and the indices tensor represent the batches, and the gather starts from dimension b+1.
Some salient points about the inputs' rank and shape:
r >= 1 and q >= 1 are to be honored. There is no dependency condition to be met between ranks r and q
The first b dimensions of the shape of indices tensor and data tensor must be equal.
b < min(q, r) is to be honored.
The indices_shape[-1] should have a value between 1 (inclusive) and rank r-b (inclusive)
All values in indices are expected to be within bounds [-s, s-1] along an axis of size s, i.e., -data_shape[i] <= indices[...,i] <= data_shape[i] - 1.
It is an error if any of the index values are out of bounds.
The output is computed as follows:
The output tensor is obtained by mapping each index-tuple in the indices tensor to the corresponding slice of the input data.
If indices_shape[-1] > r-b => error condition
If indices_shape[-1] == r-b, since the rank of indices is q, indices can be thought of as N (q-b-1)-dimensional tensors
containing 1-D tensors of dimension r-b, where N is an integer equal to the product of 1 and all the elements in the batch dimensions
of the indices_shape. Let us think of each such 1-D tensor of dimension r-b as indices_slice. Each scalar value corresponding to data[0:b-1,indices_slice]
is filled into the corresponding location of the (q-b-1)-dimensional tensor to form the output tensor (Example 1 below).
If indices_shape[-1] < r-b, since the rank of indices is q, indices can be thought of as N (q-b-1)-dimensional tensors
containing 1-D tensors of dimension < r-b. Let us think of each such 1-D tensor as indices_slice. Each tensor slice corresponding
to data[0:b-1, indices_slice , :] is filled into the corresponding location of the (q-b-1)-dimensional tensor
to form the output tensor (Examples 2, 3, 4 and 5 below).
This operator is the inverse of ScatterND.
Example 1
batch_dims = 0
data = [[0,1],[2,3]] # data_shape = [2, 2]
indices = [[0,0],[1,1]] # indices_shape = [2, 2]
output = [0,3] # output_shape = [2]
Example 2
batch_dims = 0
data = [[0,1],[2,3]] # data_shape = [2, 2]
indices = [[1],[0]] # indices_shape = [2, 1]
output = [[2,3],[0,1]] # output_shape = [2, 2]
Example 3
batch_dims = 0
data = [[[0,1],[2,3]],[[4,5],[6,7]]] # data_shape = [2, 2, 2]
indices = [[0,1],[1,0]] # indices_shape = [2, 2]
output = [[2,3],[4,5]] # output_shape = [2, 2]
Example 4
batch_dims = 0
data = [[[0,1],[2,3]],[[4,5],[6,7]]] # data_shape = [2, 2, 2]
indices = [[[0,1]],[[1,0]]] # indices_shape = [2, 1, 2]
output = [[[2,3]],[[4,5]]] # output_shape = [2, 1, 2]
Example 5
batch_dims = 1
data = [[[0,1],[2,3]],[[4,5],[6,7]]] # data_shape = [2, 2, 2]
indices = [[1],[0]] # indices_shape = [2, 1]
output = [[2,3],[4,5]] # output_shape = [2, 2]
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#GatherND-11">11</a>, <a href="Changelog.md#GatherND-12">12</a>
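The mapping from index-tuples to slices described above can be sketched in a few lines of NumPy. This is an illustrative sketch under the stated rank constraints, not the `gather_nd_impl` helper used by the tests below; the name `gather_nd_sketch` is hypothetical.

```python
import numpy as np


def gather_nd_sketch(data, indices, batch_dims=0):
    """Gather slices of data addressed by the index-tuples in indices."""
    data = np.asarray(data)
    indices = np.asarray(indices)
    b = batch_dims
    k = indices.shape[-1]  # length of each index-tuple
    assert 1 <= k <= data.ndim - b, "indices_shape[-1] must be in [1, r-b]"
    assert indices.shape[:b] == data.shape[:b], "batch dimensions must match"
    # Output rank is (q - 1) + (r - b - k), i.e. q + r - k - 1 - b.
    out_shape = indices.shape[:-1] + data.shape[b + k:]
    out = np.empty(out_shape, dtype=data.dtype)
    for pos in np.ndindex(*indices.shape[:-1]):
        batch = pos[:b]  # the leading b coordinates select the batch
        index_tuple = tuple(int(i) for i in indices[pos])
        out[pos] = data[batch + index_tuple]
    return out


data = np.array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]], dtype=np.int32)
indices = np.array([[1], [0]], dtype=np.int64)
result = gather_nd_sketch(data, indices, batch_dims=1)
print(result)  # same values as the output of Example 5
```

Building the output shape as `indices.shape[:-1] + data.shape[b + k:]` makes the rank formula from the description directly visible.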
node = onnx.helper.make_node(
"GatherND",
inputs=["data", "indices"],
outputs=["output"],
)
data = np.array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]], dtype=np.float32)
indices = np.array([[[0, 1]], [[1, 0]]], dtype=np.int64)
output = gather_nd_impl(data, indices, 0)
expected_output = np.array([[[2, 3]], [[4, 5]]], dtype=np.float32)
assert np.array_equal(output, expected_output)
expect(
node,
inputs=[data, indices],
outputs=[output],
name="test_gathernd_example_float32",
)
node = onnx.helper.make_node(
"GatherND",
inputs=["data", "indices"],
outputs=["output"],
)
data = np.array([[0, 1], [2, 3]], dtype=np.int32)
indices = np.array([[0, 0], [1, 1]], dtype=np.int64)
output = gather_nd_impl(data, indices, 0)
expected_output = np.array([0, 3], dtype=np.int32)
assert np.array_equal(output, expected_output)
expect(
node,
inputs=[data, indices],
outputs=[output],
name="test_gathernd_example_int32",
)
node = onnx.helper.make_node(
"GatherND",
inputs=["data", "indices"],
outputs=["output"],
batch_dims=1,
)
data = np.array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]], dtype=np.int32)
indices = np.array([[1], [0]], dtype=np.int64)
output = gather_nd_impl(data, indices, 1)
expected_output = np.array([[2, 3], [4, 5]], dtype=np.int32)
assert np.array_equal(output, expected_output)
expect(
node,
inputs=[data, indices],
outputs=[output],
name="test_gathernd_example_int32_batch_dim1",
)
Gelu takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the Gaussian error linear units function, $y = 0.5 * x * (1 + erf(x/sqrt(2)))$, is applied to the tensor elementwise. If the attribute "approximate" is set to "tanh", the approximation $y = 0.5 * x * (1 + Tanh(sqrt(2/\pi) * (x + 0.044715 * x^3)))$ is used instead and applied to the tensor elementwise.
This version of the operator has been available since version 20 of the default ONNX operator set.
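To see how close the two formulas are, the following sketch evaluates both on the same input. The function names `gelu_exact` and `gelu_tanh` are hypothetical; they only restate the two expressions from the description.

```python
import math

import numpy as np


def gelu_exact(x):
    # y = 0.5 * x * (1 + erf(x / sqrt(2)))
    return 0.5 * x * (1.0 + np.vectorize(math.erf)(x / np.sqrt(2.0)))


def gelu_tanh(x):
    # y = 0.5 * x * (1 + tanh(sqrt(2 / pi) * (x + 0.044715 * x^3)))
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


x = np.array([-1.0, 0.0, 1.0])
exact = gelu_exact(x)
approx = gelu_tanh(x)
# Largest absolute gap between the exact form and the tanh approximation
print(np.max(np.abs(exact - approx)))
```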
node = onnx.helper.make_node("Gelu", inputs=["x"], outputs=["y"])
x = np.array([-1, 0, 1]).astype(np.float32)
# expected output [-0.15865526, 0., 0.84134474]
y = (0.5 * x * (1 + np.vectorize(math.erf)(x / np.sqrt(2)))).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_gelu_default_1")
x = np.random.randn(3, 4, 5).astype(np.float32)
# x is random here; for reference, x = [3, 4, 5] would give [2.99595031, 3.99987331, 4.99999857]
y = (0.5 * x * (1 + np.vectorize(math.erf)(x / np.sqrt(2)))).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_gelu_default_2")
node = onnx.helper.make_node(
"Gelu", inputs=["x"], outputs=["y"], approximate="tanh"
)
x = np.array([-1, 0, 1]).astype(np.float32)
# expected output [-0.158808, 0., 0.841192]
y = (
0.5
* x
* (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * np.power(x, 3))))
).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_gelu_tanh_1")
x = np.random.randn(3, 4, 5).astype(np.float32)
# x is random here; for reference, x = [3, 4, 5] would give [2.9963627, 3.99993, 4.9999995]
y = (
0.5
* x
* (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * np.power(x, 3))))
).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_gelu_tanh_2")
General Matrix multiplication: https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms#Level_3
Compute Y = alpha * A' * B' + beta * C, where input tensor A has shape (M, K) or (K, M), input tensor B has shape (K, N) or (N, K), input tensor C is broadcastable to shape (M, N), and output tensor Y has shape (M, N). A will be transposed before doing the computation if attribute transA is non-zero, same for B and transB. This operator supports unidirectional broadcasting (tensor C should be unidirectional broadcastable to tensor A * B); for more details please check the doc. This operator has optional inputs/outputs. See the doc for more details about the representation of optional arguments. An empty string may be used in the place of an actual argument's name to indicate a missing argument. Trailing optional arguments (those not followed by an argument that is present) may also be simply omitted.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Gemm-1">1</a>, <a href="Changelog.md#Gemm-6">6</a>, <a href="Changelog.md#Gemm-7">7</a>, <a href="Changelog.md#Gemm-9">9</a>, <a href="Changelog.md#Gemm-11">11</a>
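The formula Y = alpha * A' * B' + beta * C can be sketched directly in NumPy. This is a minimal illustration, not the `gemm_reference_implementation` helper called in the tests below; the name `gemm_sketch` is hypothetical, and a missing C is treated as a zero bias.

```python
import numpy as np


def gemm_sketch(a, b, c=None, alpha=1.0, beta=1.0, transA=0, transB=0):
    """Compute alpha * A' @ B' + beta * C with optional transposes."""
    a = a.T if transA else a  # A' has shape (M, K)
    b = b.T if transB else b  # B' has shape (K, N)
    y = alpha * (a @ b)       # shape (M, N)
    if c is not None:
        # C is broadcast to (M, N); NumPy broadcasting covers the
        # unidirectional case where only C is expanded.
        y = y + beta * c
    return y


rng = np.random.default_rng(0)
a = rng.random((4, 3)).astype(np.float32)  # transposed to (3, 4)
b = rng.random((5, 4)).astype(np.float32)  # transposed to (4, 5)
c = rng.random((1, 5)).astype(np.float32)  # broadcast over the 3 rows
y = gemm_sketch(a, b, c, alpha=0.25, beta=0.35, transA=1, transB=1)
print(y.shape)  # (3, 5)
```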
node = onnx.helper.make_node(
"Gemm",
inputs=["a", "b", "c"],
outputs=["y"],
alpha=0.25,
beta=0.35,
transA=1,
transB=1,
)
a = np.random.ranf([4, 3]).astype(np.float32)
b = np.random.ranf([5, 4]).astype(np.float32)
c = np.random.ranf([1, 5]).astype(np.float32)
y = gemm_reference_implementation(
a, b, c, transA=1, transB=1, alpha=0.25, beta=0.35
)
expect(node, inputs=[a, b, c], outputs=[y], name="test_gemm_all_attributes")
node = onnx.helper.make_node(
"Gemm", inputs=["a", "b", "c"], outputs=["y"], alpha=0.5
)
a = np.random.ranf([3, 5]).astype(np.float32)
b = np.random.ranf([5, 4]).astype(np.float32)
c = np.zeros([1, 4]).astype(np.float32)
y = gemm_reference_implementation(a, b, c, alpha=0.5)
expect(node, inputs=[a, b, c], outputs=[y], name="test_gemm_alpha")
node = onnx.helper.make_node(
"Gemm", inputs=["a", "b", "c"], outputs=["y"], beta=0.5
)
a = np.random.ranf([2, 7]).astype(np.float32)
b = np.random.ranf([7, 4]).astype(np.float32)
c = np.random.ranf([1, 4]).astype(np.float32)
y = gemm_reference_implementation(a, b, c, beta=0.5)
expect(node, inputs=[a, b, c], outputs=[y], name="test_gemm_beta")
node = onnx.helper.make_node("Gemm", inputs=["a", "b", "c"], outputs=["y"])
a = np.random.ranf([3, 6]).astype(np.float32)
b = np.random.ranf([6, 4]).astype(np.float32)
c = np.random.ranf([3, 4]).astype(np.float32)
y = gemm_reference_implementation(a, b, c)
expect(
node, inputs=[a, b, c], outputs=[y], name="test_gemm_default_matrix_bias"
)
node = onnx.helper.make_node("Gemm", inputs=["a", "b"], outputs=["y"])
a = np.random.ranf([2, 10]).astype(np.float32)
b = np.random.ranf([10, 3]).astype(np.float32)
y = gemm_reference_implementation(a, b)
expect(node, inputs=[a, b], outputs=[y], name="test_gemm_default_no_bias")
node = onnx.helper.make_node("Gemm", inputs=["a", "b", "c"], outputs=["y"])
a = np.random.ranf([2, 3]).astype(np.float32)
b = np.random.ranf([3, 4]).astype(np.float32)
c = np.array(3.14).astype(np.float32)
y = gemm_reference_implementation(a, b, c)
expect(
node, inputs=[a, b, c], outputs=[y], name="test_gemm_default_scalar_bias"
)
node = onnx.helper.make_node("Gemm", inputs=["a", "b", "c"], outputs=["y"])
a = np.random.ranf([3, 7]).astype(np.float32)
b = np.random.ranf([7, 3]).astype(np.float32)
c = np.random.ranf([1]).astype(np.float32)
y = gemm_reference_implementation(a, b, c)
expect(
node,
inputs=[a, b, c],
outputs=[y],
name="test_gemm_default_single_elem_vector_bias",
)
node = onnx.helper.make_node("Gemm", inputs=["a", "b", "c"], outputs=["y"])
a = np.random.ranf([2, 7]).astype(np.float32)
b = np.random.ranf([7, 4]).astype(np.float32)
c = np.random.ranf([1, 4]).astype(np.float32)
y = gemm_reference_implementation(a, b, c)
expect(
node, inputs=[a, b, c], outputs=[y], name="test_gemm_default_vector_bias"
)
node = onnx.helper.make_node("Gemm", inputs=["a", "b", "c"], outputs=["y"])
a = np.random.ranf([3, 5]).astype(np.float32)
b = np.random.ranf([5, 4]).astype(np.float32)
c = np.zeros([1, 4]).astype(np.float32)
y = gemm_reference_implementation(a, b, c)
expect(node, inputs=[a, b, c], outputs=[y], name="test_gemm_default_zero_bias")
node = onnx.helper.make_node(
"Gemm", inputs=["a", "b", "c"], outputs=["y"], transA=1
)
a = np.random.ranf([6, 3]).astype(np.float32)
b = np.random.ranf([6, 4]).astype(np.float32)
c = np.zeros([1, 4]).astype(np.float32)
y = gemm_reference_implementation(a, b, c, transA=1)
expect(node, inputs=[a, b, c], outputs=[y], name="test_gemm_transposeA")
node = onnx.helper.make_node(
"Gemm", inputs=["a", "b", "c"], outputs=["y"], transB=1
)
a = np.random.ranf([3, 6]).astype(np.float32)
b = np.random.ranf([4, 6]).astype(np.float32)
c = np.zeros([1, 4]).astype(np.float32)
y = gemm_reference_implementation(a, b, c, transB=1)
expect(node, inputs=[a, b, c], outputs=[y], name="test_gemm_transposeB")
GlobalAveragePool consumes an input tensor X and applies average pooling across the values in the same channel. This is equivalent to AveragePool with kernel size equal to the spatial dimension of input tensor.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#GlobalAveragePool-1">1</a>
node = onnx.helper.make_node(
"GlobalAveragePool",
inputs=["x"],
outputs=["y"],
)
x = np.random.randn(1, 3, 5, 5).astype(np.float32)
y = np.mean(x, axis=tuple(range(2, np.ndim(x))), keepdims=True)
expect(node, inputs=[x], outputs=[y], name="test_globalaveragepool")
node = onnx.helper.make_node(
"GlobalAveragePool",
inputs=["x"],
outputs=["y"],
)
x = np.array(
[
[
[
[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
]
]
]
).astype(np.float32)
y = np.array([[[[5]]]]).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_globalaveragepool_precomputed")
GlobalLpPool consumes an input tensor X and applies Lp pooling across the values in the same channel. This is equivalent to LpPool with kernel size equal to the spatial dimension of input tensor.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#GlobalLpPool-1">1</a>, <a href="Changelog.md#GlobalLpPool-2">2</a>
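As a rough illustration, global Lp pooling reduces each channel to the p-norm of its spatial values. The sketch below assumes an integer attribute `p` with a default of 2 and an (N, C, spatial...) input layout; the function name `global_lp_pool_sketch` is hypothetical.

```python
import numpy as np


def global_lp_pool_sketch(x, p=2):
    """Reduce every spatial axis of x (N, C, d1, ..., dn) to size 1 with a p-norm."""
    spatial_axes = tuple(range(2, x.ndim))
    # (sum |x|^p) ^ (1/p), keeping the reduced axes as size-1 dimensions
    return np.sum(np.abs(x) ** p, axis=spatial_axes, keepdims=True) ** (1.0 / p)


x = np.array([[[[3.0, 4.0], [0.0, 0.0]]]], dtype=np.float32)
y = global_lp_pool_sketch(x, p=2)
print(y.shape)  # (1, 1, 1, 1)
```

With p = 2 the single channel above reduces to sqrt(3^2 + 4^2) = 5.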
GlobalMaxPool consumes an input tensor X and applies max pooling across the values in the same channel. This is equivalent to MaxPool with kernel size equal to the spatial dimension of input tensor.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#GlobalMaxPool-1">1</a>
node = onnx.helper.make_node(
"GlobalMaxPool",
inputs=["x"],
outputs=["y"],
)
x = np.random.randn(1, 3, 5, 5).astype(np.float32)
y = np.max(x, axis=tuple(range(2, np.ndim(x))), keepdims=True)
expect(node, inputs=[x], outputs=[y], name="test_globalmaxpool")
node = onnx.helper.make_node(
"GlobalMaxPool",
inputs=["x"],
outputs=["y"],
)
x = np.array(
[
[
[
[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
]
]
]
).astype(np.float32)
y = np.array([[[[9]]]]).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_globalmaxpool_precomputed")
Returns the tensor resulting from performing the greater logical operation
elementwise on the input tensors A and B (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Greater-1">1</a>, <a href="Changelog.md#Greater-7">7</a>, <a href="Changelog.md#Greater-9">9</a>
node = onnx.helper.make_node(
"Greater",
inputs=["x", "y"],
outputs=["greater"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(3, 4, 5).astype(np.float32)
z = np.greater(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater")
x = np.random.randn(3, 4, 5).astype(np.int8)
y = np.random.randn(3, 4, 5).astype(np.int8)
z = np.greater(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater_int8")
x = np.random.randn(3, 4, 5).astype(np.int16)
y = np.random.randn(3, 4, 5).astype(np.int16)
z = np.greater(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater_int16")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8)
z = np.greater(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater_uint8")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint16)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint16)
z = np.greater(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater_uint16")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint32)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint32)
z = np.greater(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater_uint32")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint64)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint64)
z = np.greater(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater_uint64")
node = onnx.helper.make_node(
"GreaterOrEqual",
inputs=["x", "y"],
outputs=["greater_equal"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(3, 4, 5).astype(np.float32)
z = np.greater_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater_equal")
x = np.random.randn(3, 4, 5).astype(np.int8)
y = np.random.randn(3, 4, 5).astype(np.int8)
z = np.greater_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater_equal_int8")
x = np.random.randn(3, 4, 5).astype(np.int16)
y = np.random.randn(3, 4, 5).astype(np.int16)
z = np.greater_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater_equal_int16")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8)
z = np.greater_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater_equal_uint8")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint16)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint16)
z = np.greater_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater_equal_uint16")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint32)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint32)
z = np.greater_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater_equal_uint32")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint64)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint64)
z = np.greater_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater_equal_uint64")
node = onnx.helper.make_node(
"Greater",
inputs=["x", "y"],
outputs=["greater"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(5).astype(np.float32)
z = np.greater(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater_bcast")
node = onnx.helper.make_node(
"GreaterOrEqual",
inputs=["x", "y"],
outputs=["greater_equal"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(5).astype(np.float32)
z = np.greater_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater_equal_bcast")
Returns the tensor resulting from performing the greater_equal logical operation
elementwise on the input tensors A and B (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
This version of the operator has been available since version 16 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#GreaterOrEqual-12">12</a>
Given an input X and a flow-field grid, computes the output Y using X values and pixel locations from the grid.
For spatial input X with shape (N, C, H, W), the grid will have shape (N, H_out, W_out, 2),
the output Y will have shape (N, C, H_out, W_out). For volumetric input X with shape (N, C, D, H, W),
the grid will have shape (N, D_out, H_out, W_out, 3), the output Y will have shape (N, C, D_out, H_out, W_out).
More generally, for an input X of rank r+2 with shape (N, C, d1, d2, ..., dr),
the grid will have shape (N, D1_out, D2_out, ..., Dr_out, r), the output Y will have shape (N, C, D1_out, D2_out, ..., Dr_out).
The tensor X contains values at centers of square pixels (voxels, etc) locations such as (n, c, d1_in, d2_in, ..., dr_in).
The (n, d1_out, d2_out, ..., dr_out, :) values from the tensor grid are the normalized positions for interpolating the values
at the (n, c, d1_out, d2_out, ..., dr_out) locations from the output tensor Y using a specified interpolation method (the mode)
and a padding mode (for grid positions falling outside the 2-dimensional image).
For example, the values in grid[n, h_out, w_out, :] are size-2 vectors specifying normalized positions in the 2-dimensional space of X.
They are used to interpolate output values of Y[n, c, h_out, w_out].
The GridSample operator is often used to implement the grid generator and sampler in Spatial Transformer Networks. See also torch.nn.functional.grid_sample.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#GridSample-16">16</a>, <a href="Changelog.md#GridSample-20">20</a>
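To make the role of the normalized grid positions concrete, here is a small sketch of linear sampling for the 4-D case with zeros padding. It assumes the coordinate convention of torch.nn.functional.grid_sample (the last grid axis holds x, then y, and the mapping to pixel positions depends on align_corners); the names `denormalize` and `grid_sample_linear_2d` are hypothetical, and this is not the reference implementation.

```python
import numpy as np


def denormalize(coord, size, align_corners):
    """Map a normalized coordinate in [-1, 1] to a pixel position (assumed convention)."""
    if align_corners:
        return (coord + 1.0) / 2.0 * (size - 1)
    return ((coord + 1.0) * size - 1.0) / 2.0


def grid_sample_linear_2d(X, grid, align_corners=0):
    """Bilinear sampling of X (N, C, H, W) at grid (N, H_out, W_out, 2), zeros padding."""
    N, C, H, W = X.shape
    _, H_out, W_out, _ = grid.shape
    Y = np.zeros((N, C, H_out, W_out), dtype=np.float64)
    for n in range(N):
        for i in range(H_out):
            for j in range(W_out):
                # grid[..., 0] is the x (width) position, grid[..., 1] the y (height) one
                x = denormalize(float(grid[n, i, j, 0]), W, align_corners)
                y = denormalize(float(grid[n, i, j, 1]), H, align_corners)
                x0, y0 = int(np.floor(x)), int(np.floor(y))
                # Blend the four neighbouring pixels; out-of-bounds ones contribute zero.
                for yy, wy in ((y0, y0 + 1 - y), (y0 + 1, y - y0)):
                    for xx, wx in ((x0, x0 + 1 - x), (x0 + 1, x - x0)):
                        if 0 <= yy < H and 0 <= xx < W:
                            Y[n, :, i, j] += wy * wx * X[n, :, yy, xx]
    return Y.astype(X.dtype)


X = np.array([[[[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]]], dtype=np.float32)
Grid = np.array(
    [[
        [[-1.0, -1.0], [-0.5, -0.5], [-0.2, -0.2], [0.0, 0.0]],
        [[0.0, 0.0], [-0.2, -0.2], [0.5, 0.5], [1.0, 1.0]],
    ]],
    dtype=np.float32,
)
Y0 = grid_sample_linear_2d(X, Grid, align_corners=0)
Y1 = grid_sample_linear_2d(X, Grid, align_corners=1)
print(Y0.shape)  # (1, 1, 2, 4)
```

Under these assumptions, `Y0` and `Y1` agree with the `Y_bilinear` and `Y_align_corners` tensors of the `test_gridsample_bilinear` and `test_gridsample_aligncorners_true` cases in this section.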
node = onnx.helper.make_node(
"GridSample",
inputs=["X", "Grid"],
outputs=["Y"],
mode="linear",
padding_mode="zeros",
align_corners=0,
)
# X shape, [N, C, H, W] - [1, 1, 4, 4]
X = np.array(
[
[
[
[0.0, 1.0, 2.0, 3.0],
[4.0, 5.0, 6.0, 7.0],
[8.0, 9.0, 10.0, 11.0],
[12.0, 13.0, 14.0, 15.0],
]
]
],
dtype=np.float32,
)
# Grid shape, [N, H_out, W_out, 2] - [1, 6, 6, 2]
Grid = np.array(
[
[
[
[-1.0000, -1.0000],
[-0.6000, -1.0000],
[-0.2000, -1.0000],
[0.2000, -1.0000],
[0.6000, -1.0000],
[1.0000, -1.0000],
],
[
[-1.0000, -0.6000],
[-0.6000, -0.6000],
[-0.2000, -0.6000],
[0.2000, -0.6000],
[0.6000, -0.6000],
[1.0000, -0.6000],
],
[
[-1.0000, -0.2000],
[-0.6000, -0.2000],
[-0.2000, -0.2000],
[0.2000, -0.2000],
[0.6000, -0.2000],
[1.0000, -0.2000],
],
[
[-1.0000, 0.2000],
[-0.6000, 0.2000],
[-0.2000, 0.2000],
[0.2000, 0.2000],
[0.6000, 0.2000],
[1.0000, 0.2000],
],
[
[-1.0000, 0.6000],
[-0.6000, 0.6000],
[-0.2000, 0.6000],
[0.2000, 0.6000],
[0.6000, 0.6000],
[1.0000, 0.6000],
],
[
[-1.0000, 1.0000],
[-0.6000, 1.0000],
[-0.2000, 1.0000],
[0.2000, 1.0000],
[0.6000, 1.0000],
[1.0000, 1.0000],
],
]
],
dtype=np.float32,
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 6, 6]
Y = np.array(
[
[
[
[0.0000, 0.1500, 0.5500, 0.9500, 1.3500, 0.7500],
[0.6000, 1.5000, 2.3000, 3.1000, 3.9000, 2.1000],
[2.2000, 4.7000, 5.5000, 6.3000, 7.1000, 3.7000],
[3.8000, 7.9000, 8.7000, 9.5000, 10.3000, 5.3000],
[5.4000, 11.1000, 11.9000, 12.7000, 13.5000, 6.9000],
[3.0000, 6.1500, 6.5500, 6.9500, 7.3500, 3.7500],
]
]
],
dtype=np.float32,
)
expect(node, inputs=[X, Grid], outputs=[Y], name="test_gridsample")
# X shape, [N, C, H, W] - [1, 1, 3, 2]
X = np.array(
[[[[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]]],
dtype=np.float32,
)
# Grid shape, [N, H_out, W_out, 2] - [1, 2, 4, 2]
Grid = np.array(
[
[
[
[-1.0000, -1.0000],
[-0.5000, -0.5000],
[-0.2000, -0.2000],
[0.0000, 0.0000],
],
[
[0.0000, 0.0000],
[-0.2000, -0.2000],
[0.5000, 0.5000],
[1.0000, 1.0000],
],
]
],
dtype=np.float32,
)
# setting mode = 'linear' (bilinear for 4-D input), default align_corners = 0
node = onnx.helper.make_node(
"GridSample",
inputs=["X", "Grid"],
outputs=["Y"],
mode="linear",
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_bilinear = np.array(
[[[[0.0000, 0.5000, 1.7000, 2.5000], [2.5000, 1.7000, 4.5000, 1.2500]]]],
dtype=np.float32,
)
expect(
node,
inputs=[X, Grid],
outputs=[Y_bilinear],
name="test_gridsample_bilinear",
)
# setting mode = 'linear' (bilinear for 4-D input), align_corners = 1
node = onnx.helper.make_node(
"GridSample",
inputs=["X", "Grid"],
outputs=["Y"],
mode="linear",
align_corners=1,
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_align_corners = np.array(
[[[[0.0000, 1.2500, 2.0000, 2.5000], [2.5000, 2.0000, 3.7500, 5.0000]]]],
dtype=np.float32,
)
expect(
node,
inputs=[X, Grid],
outputs=[Y_align_corners],
name="test_gridsample_aligncorners_true",
)
# setting mode = 'nearest'
node = onnx.helper.make_node(
"GridSample",
inputs=["X", "Grid"],
outputs=["Y"],
mode="nearest",
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_nearest = np.array(
[[[[0.0, 0.0, 2.0, 2.0], [2.0, 2.0, 5.0, 0.0]]]],
dtype=np.float32,
)
expect(
node, inputs=[X, Grid], outputs=[Y_nearest], name="test_gridsample_nearest"
)
# setting mode = 'cubic' (bicubic for 4-D input)
node = onnx.helper.make_node(
"GridSample",
inputs=["X", "Grid"],
outputs=["Y"],
mode="cubic",
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_bicubic = np.array(
[[[[-0.1406, 0.3828, 1.7556, 2.9688], [2.9688, 1.7556, 5.1445, 1.3906]]]],
dtype=np.float32,
)
expect(
node, inputs=[X, Grid], outputs=[Y_bicubic], name="test_gridsample_bicubic"
)
# ============================================================================
# Additional tests
# The reference output tensors were generated using PyTorch 2.0.
Grid = np.array(
[
[
[[-1.0, -0.8], [-0.6, -0.5], [-0.1, -0.2], [0.7, 0.0]],
[[0.0, 0.4], [0.2, -0.2], [-0.3, 0.5], [-1.0, 1.0]],
]
],
dtype=np.float32,
)
node = onnx.helper.make_node(
"GridSample",
inputs=["X", "Grid"],
outputs=["Y"],
mode="nearest",
align_corners=0,
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_nearest = np.array(
[[[[0.0, 0.0, 2.0, 3.0], [4.0, 3.0, 4.0, 4.0]]]],
dtype=np.float32,
)
expect(
node,
inputs=[X, Grid],
outputs=[Y_nearest],
name="test_gridsample_nearest_align_corners_0_additional_1",
)
# setting mode = 'nearest'
node = onnx.helper.make_node(
"GridSample",
inputs=["X", "Grid"],
outputs=["Y"],
mode="nearest",
align_corners=1,
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_nearest = np.array(
[[[[0.0, 0.0, 2.0, 3.0], [2.0, 3.0, 4.0, 4.0]]]],
dtype=np.float32,
)
expect(
node,
inputs=[X, Grid],
outputs=[Y_nearest],
name="test_gridsample_nearest_align_corners_1_additional_1",
)
node = onnx.helper.make_node(
"GridSample",
inputs=["X", "Grid"],
outputs=["Y"],
mode="linear",
align_corners=0,
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_bilinear = np.array(
[[[[0.0000, 0.4500, 1.8000, 2.4000], [3.7000, 2.1000, 3.7000, 1.0000]]]],
dtype=np.float32,
)
expect(
node,
inputs=[X, Grid],
outputs=[Y_bilinear],
name="test_gridsample_bilinear_align_corners_0_additional_1",
)
node = onnx.helper.make_node(
"GridSample",
inputs=["X", "Grid"],
outputs=["Y"],
mode="linear",
align_corners=1,
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_bilinear = np.array(
[[[[0.4000, 1.2000, 2.0500, 2.8500], [3.3000, 2.2000, 3.3500, 4.0000]]]],
dtype=np.float32,
)
expect(
node,
inputs=[X, Grid],
outputs=[Y_bilinear],
name="test_gridsample_bilinear_align_corners_1_additional_1",
)
# These two new bicubic tests produce a slightly higher error (~5e-5)
node = onnx.helper.make_node(
"GridSample",
inputs=["X", "Grid"],
outputs=["Y"],
mode="cubic",
align_corners=0,
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_bicubic = np.array(
[
[
[
[-0.173250, 0.284265, 1.923106, 2.568000],
[5.170375, 2.284414, 4.744844, 1.046875],
]
]
],
dtype=np.float32,
)
expect(
node,
inputs=[X, Grid],
outputs=[Y_bicubic],
name="test_gridsample_bicubic_align_corners_0_additional_1",
)
node = onnx.helper.make_node(
"GridSample",
inputs=["X", "Grid"],
outputs=["Y"],
mode="cubic",
align_corners=1,
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_bicubic = np.array(
[
[
[
[0.304001, 1.128750, 2.266270, 3.144844],
[4.531500, 2.455360, 4.599819, 4.000000],
]
]
],
dtype=np.float32,
)
expect(
node,
inputs=[X, Grid],
outputs=[Y_bicubic],
name="test_gridsample_bicubic_align_corners_1_additional_1",
)
# X shape, [N, C, H, W] - [1, 1, 3, 2]
X = np.array(
[[[[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]]],
dtype=np.float32,
)
# Grid shape, [N, H_out, W_out, 2] - [1, 2, 4, 2]
Grid = np.array(
[
[
[
[-10.0000, -10.0000],
[-5.0000, -5.0000],
[-0.2000, -0.2000],
[10.0000, 10.0000],
],
[
[10.0000, 10.0000],
[-0.2000, -0.2000],
[5.0000, 5.0000],
[10.0000, 10.0000],
],
]
],
dtype=np.float32,
)
# setting padding_mode = 'zeros'
node = onnx.helper.make_node(
"GridSample",
inputs=["X", "Grid"],
outputs=["Y"],
padding_mode="zeros",
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_zeros = np.array(
[[[[0.0000, 0.0000, 1.7000, 0.0000], [0.0000, 1.7000, 0.0000, 0.0000]]]],
dtype=np.float32,
)
expect(
node,
inputs=[X, Grid],
outputs=[Y_zeros],
name="test_gridsample_zeros_padding",
)
# setting padding_mode = 'border'
node = onnx.helper.make_node(
"GridSample",
inputs=["X", "Grid"],
outputs=["Y"],
padding_mode="border",
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_border = np.array(
[[[[0.0000, 0.0000, 1.7000, 5.0000], [5.0000, 1.7000, 5.0000, 5.0000]]]],
dtype=np.float32,
)
expect(
node,
inputs=[X, Grid],
outputs=[Y_border],
name="test_gridsample_border_padding",
)
# setting padding_mode = 'reflection'
node = onnx.helper.make_node(
"GridSample",
inputs=["X", "Grid"],
outputs=["Y"],
padding_mode="reflection",
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_reflection = np.array(
[[[[2.5000, 0.0000, 1.7000, 2.5000], [2.5000, 1.7000, 5.0000, 2.5000]]]],
dtype=np.float32,
)
expect(
node,
inputs=[X, Grid],
outputs=[Y_reflection],
name="test_gridsample_reflection_padding",
)
X = np.array(
[
[
[
[[1.0, 2.0], [3.0, 4.0]],
[[5.0, 6.0], [7.0, 8.0]],
[[9.0, 10.0], [11.0, 12.0]],
]
]
],
dtype=np.float32,
)
Grid = np.array(
[
[
[
[[-1.0, -1.0, -1.0], [-1.0, -0.5, 0.3]],
[[-0.5, -0.5, -0.5], [1.0, -0.6, -1.0]],
[[-0.2, -0.2, -0.2], [0.4, 0.2, 0.6]],
[[0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]],
],
[
[[0.0, 0.0, 0.0], [-1.0, 1.0, 0.0]],
[[-0.2, -0.2, -0.2], [1.0, 0.4, -0.2]],
[[0.5, 0.5, 0.5], [-1.0, -0.8, 0.8]],
[[1.0, 1.0, 1.0], [0.4, 0.6, -0.3]],
],
]
],
dtype=np.float32,
)
node = onnx.helper.make_node(
"GridSample",
inputs=["X", "Grid"],
outputs=["Y"],
mode="nearest",
align_corners=0,
)
# Y shape, [N, C, D_out, H_out, W_out] - [1, 1, 2, 4, 2]
Y_nearest = np.array(
[
[
[
[[1.0, 5.0], [1.0, 0.0], [5.0, 12.0], [5.0, 5.0]],
[[5.0, 0.0], [5.0, 0.0], [12.0, 9.0], [0.0, 8.0]],
]
]
],
dtype=np.float32,
)
expect(
node,
inputs=[X, Grid],
outputs=[Y_nearest],
name="test_gridsample_volumetric_nearest_align_corners_0",
)
node = onnx.helper.make_node(
"GridSample",
inputs=["X", "Grid"],
outputs=["Y"],
mode="nearest",
align_corners=1,
)
# Y shape, [N, C, D_out, H_out, W_out] - [1, 1, 2, 4, 2]
Y_nearest = np.array(
[
[
[
[[1.0, 5.0], [1.0, 2.0], [5.0, 12.0], [5.0, 5.0]],
[[5.0, 7.0], [5.0, 8.0], [12.0, 9.0], [12.0, 8.0]],
]
]
],
dtype=np.float32,
)
expect(
node,
inputs=[X, Grid],
outputs=[Y_nearest],
name="test_gridsample_volumetric_nearest_align_corners_1",
)
node = onnx.helper.make_node(
"GridSample",
inputs=["X", "Grid"],
outputs=["Y"],
mode="linear",
align_corners=0,
)
# Y shape, [N, C, D_out, H_out, W_out] - [1, 1, 2, 4, 2]
Y_bilinear = np.array(
[
[
[
[
[0.1250, 3.4000],
[2.0000, 0.4500],
[4.7000, 10.9000],
[6.5000, 3.0000],
],
[
[6.5000, 1.7500],
[4.7000, 3.3000],
[11.0000, 2.5200],
[1.5000, 5.4900],
],
]
]
],
dtype=np.float32,
)
expect(
node,
inputs=[X, Grid],
outputs=[Y_bilinear],
name="test_gridsample_volumetric_bilinear_align_corners_0",
)
node = onnx.helper.make_node(
"GridSample",
inputs=["X", "Grid"],
outputs=["Y"],
mode="linear",
align_corners=1,
)
# Y shape, [N, C, D_out, H_out, W_out] - [1, 1, 2, 4, 2]
Y_bilinear = np.array(
[
[
[
[
[1.0000, 6.7000],
[3.7500, 2.4000],
[5.4000, 9.3000],
[6.5000, 6.0000],
],
[
[6.5000, 7.0000],
[5.4000, 6.6000],
[9.2500, 8.4000],
[12.0000, 6.1000],
],
]
]
],
dtype=np.float32,
)
expect(
node,
inputs=[X, Grid],
outputs=[Y_bilinear],
name="test_gridsample_volumetric_bilinear_align_corners_1",
)
A GroupNormalization function. Carries out group normalization as described in the paper https://arxiv.org/abs/1803.08494
This operator transforms input according to
y = scale * (x - mean) / sqrt(variance + epsilon) + bias,
where the mean and variance are computed per instance per group of channels, and
scale and bias should be specified for each channel. The number of
channels should be divisible by the number of groups num_groups so that there are
an equal number of channels per group.
The overall computation has two stages: the first stage normalizes the elements to
have zero mean and unit variance for each instance in each group, and the second
stage scales and shifts the results of the first stage. The floating-point precision
used in the first stage is determined by the stash_type attribute. For example,
if stash_type is 1, the operator casts all input variables to 32-bit float,
performs the computation, and finally casts the normalized results back to the
original type of X. The second stage does not depend on stash_type.
When the number of groups is the same as the number of channels, this operator is equivalent to InstanceNormalization. When there is only one group, this operator is equivalent to LayerNormalization.
This version of the operator has been available since version 21 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#GroupNormalization-18">18</a>
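The two stages described above can be sketched in NumPy as follows. This is a minimal illustration, not the reference implementation: the function name `group_normalization` is hypothetical, the default `epsilon=1e-5` is an assumption, and `stash_type` casting is ignored.

```python
import numpy as np


def group_normalization(x, num_groups, scale, bias, epsilon=1e-5):
    """Hypothetical sketch: normalize per (instance, group), then scale and shift per channel."""
    n, c = x.shape[:2]
    if c % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")
    # Stage 1: fold the channels of each group together with the spatial dimensions
    # and normalize every (instance, group) slice to zero mean and unit variance.
    grouped = x.reshape(n, num_groups, -1)
    mean = grouped.mean(axis=2, keepdims=True)
    var = grouped.var(axis=2, keepdims=True)
    normalized = ((grouped - mean) / np.sqrt(var + epsilon)).reshape(x.shape)
    # Stage 2: per-channel scale and bias, broadcast over the spatial dimensions.
    param_shape = (1, c) + (1,) * (x.ndim - 2)
    return scale.reshape(param_shape) * normalized + bias.reshape(param_shape)


x = np.random.randn(3, 4, 2, 2).astype(np.float32)
scale = np.random.randn(4).astype(np.float32)
bias = np.random.randn(4).astype(np.float32)
y = group_normalization(x, num_groups=2, scale=scale, bias=bias)
```

With `num_groups` equal to the number of channels, each group holds a single channel, so the statistics are computed per instance per channel as in InstanceNormalization.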
c = 4
num_groups = 2
x = np.random.randn(3, c, 2, 2).astype(np.float32)
scale = np.random.randn(c).astype(np.float32)
bias = np.random.randn(c).astype(np.float32)
epsilon = 1e-2
y = _group_normalization(x, num_groups, scale, bias, epsilon).astype(np.float32)
node = onnx.helper.make_node(
"GroupNormalization",
inputs=["x", "scale", "bias"],
outputs=["y"],
epsilon=epsilon,
num_groups=num_groups,
)
expect(
node,
inputs=[x, scale, bias],
outputs=[y],
name="test_group_normalization_epsilon",
)
c = 4
num_groups = 2
x = np.random.randn(3, c, 2, 2).astype(np.float32)
scale = np.random.randn(c).astype(np.float32)
bias = np.random.randn(c).astype(np.float32)
y = _group_normalization(x, num_groups, scale, bias).astype(np.float32)
node = onnx.helper.make_node(
"GroupNormalization",
inputs=["x", "scale", "bias"],
outputs=["y"],
num_groups=num_groups,
)
expect(
node,
inputs=[x, scale, bias],
outputs=[y],
name="test_group_normalization_example",
)
Generates a Hamming window as described in the paper https://ieeexplore.ieee.org/document/1455106.
This version of the operator has been available since version 17 of the default ONNX operator set.
# Test periodic window
node = onnx.helper.make_node(
"HammingWindow",
inputs=["x"],
outputs=["y"],
)
size = np.int32(10)
a0 = 25 / 46
a1 = 1 - a0
y = a0 - a1 * np.cos(2 * np.pi * np.arange(0, size, 1, dtype=np.float32) / size)
expect(
node,
inputs=[size],
outputs=[y.astype(np.float32)],
name="test_hammingwindow",
)
# Test symmetric window
node = onnx.helper.make_node(
"HammingWindow", inputs=["x"], outputs=["y"], periodic=0
)
size = np.int32(10)
a0 = 25 / 46
a1 = 1 - a0
y = a0 - a1 * np.cos(
2 * np.pi * np.arange(0, size, 1, dtype=np.float32) / (size - 1)
)
expect(
node,
inputs=[size],
outputs=[y.astype(np.float32)],
name="test_hammingwindow_symmetric",
)
Generates a Hann window as described in the paper https://ieeexplore.ieee.org/document/1455106.
This version of the operator has been available since version 17 of the default ONNX operator set.
# Test periodic window
node = onnx.helper.make_node(
"HannWindow",
inputs=["x"],
outputs=["y"],
)
size = np.int32(10)
a0 = 0.5
a1 = 0.5
y = a0 - a1 * np.cos(2 * np.pi * np.arange(0, size, 1, dtype=np.float32) / size)
expect(
node, inputs=[size], outputs=[y.astype(np.float32)], name="test_hannwindow"
)
# Test symmetric window
node = onnx.helper.make_node(
"HannWindow", inputs=["x"], outputs=["y"], periodic=0
)
size = np.int32(10)
a0 = 0.5
a1 = 0.5
y = a0 - a1 * np.cos(
2 * np.pi * np.arange(0, size, 1, dtype=np.float32) / (size - 1)
)
expect(
node,
inputs=[size],
outputs=[y.astype(np.float32)],
name="test_hannwindow_symmetric",
)
HardSigmoid takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the HardSigmoid function, y = max(0, min(1, alpha * x + beta)), is applied to the tensor elementwise.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#HardSigmoid-1">1</a>, <a href="Changelog.md#HardSigmoid-6">6</a>
node = onnx.helper.make_node(
"HardSigmoid", inputs=["x"], outputs=["y"], alpha=0.5, beta=0.6
)
x = np.array([-1, 0, 1]).astype(np.float32)
y = np.clip(x * 0.5 + 0.6, 0, 1) # expected output [0.1, 0.6, 1.]
expect(node, inputs=[x], outputs=[y], name="test_hardsigmoid_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x * 0.5 + 0.6, 0, 1)
expect(node, inputs=[x], outputs=[y], name="test_hardsigmoid")
default_alpha = 0.2
default_beta = 0.5
node = onnx.helper.make_node(
"HardSigmoid",
inputs=["x"],
outputs=["y"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x * default_alpha + default_beta, 0, 1)
expect(node, inputs=[x], outputs=[y], name="test_hardsigmoid_default")
HardSwish takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the HardSwish function, y = x * max(0, min(1, alpha * x + beta)) = x * HardSigmoid<alpha, beta>(x), where alpha = 1/6 and beta = 0.5, is applied to the tensor elementwise.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#HardSwish-14">14</a>
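The formula above translates directly into NumPy. The sketch below is one possible definition of a `hardswish` helper (the name mirrors the helper used in the example, but its actual implementation is not shown in this document, so treat this as an assumption).

```python
import numpy as np


def hardswish(x):
    # alpha = 1/6 and beta = 0.5 are fixed by the operator definition.
    alpha = 1.0 / 6.0
    beta = 0.5
    # x * HardSigmoid<alpha, beta>(x)
    return x * np.maximum(0, np.minimum(1, alpha * x + beta))


x = np.array([-4.0, -3.0, 0.0, 3.0, 4.0], dtype=np.float32)
y = hardswish(x)  # zero for x <= -3, equal to x for x >= 3
```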
node = onnx.helper.make_node(
"HardSwish",
inputs=["x"],
outputs=["y"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = hardswish(x)
expect(node, inputs=[x], outputs=[y], name="test_hardswish")
The operator computes the hardmax values for the given input:
Hardmax(element in input, axis) = 1 if the element is the first maximum value along the specified axis, 0 otherwise
The "axis" attribute indicates the dimension along which Hardmax will be performed. The output tensor has the same shape and contains the Hardmax values of the corresponding input.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Hardmax-1">1</a>, <a href="Changelog.md#Hardmax-11">11</a>
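A minimal NumPy sketch of the rule above is shown here. The function `hardmax` is a hypothetical stand-in for the helper used in the examples; it relies on `np.argmax` returning the index of the first maximum along the axis.

```python
import numpy as np


def hardmax(x, axis=-1):
    # np.argmax picks the first occurrence when the maximum is repeated,
    # which matches the "first maximum value" rule of the operator.
    first_max = np.argmax(x, axis=axis)
    y = np.zeros_like(x)
    # Write a 1 at the position of the first maximum along the chosen axis.
    np.put_along_axis(y, np.expand_dims(first_max, axis=axis), 1, axis=axis)
    return y


x = np.array([[3, 3, 3, 1]], dtype=np.float32)
y = hardmax(x)  # only the first of the three tied maxima is set to 1
```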
node = onnx.helper.make_node(
"Hardmax",
inputs=["x"],
outputs=["y"],
)
x = np.array([[3, 0, 1, 2], [2, 5, 1, 0], [0, 1, 3, 2], [0, 1, 2, 3]]).astype(
np.float32
)
# expect result:
# [[1. 0. 0. 0.]
# [0. 1. 0. 0.]
# [0. 0. 1. 0.]
# [0. 0. 0. 1.]]
y = hardmax(x)
expect(node, inputs=[x], outputs=[y], name="test_hardmax_example")
# For multiple occurrences of the maximal values, the first occurrence is selected for one-hot output
x = np.array([[3, 3, 3, 1]]).astype(np.float32)
# expect result:
# [[1, 0, 0, 0]]
y = hardmax(x)
expect(node, inputs=[x], outputs=[y], name="test_hardmax_one_hot")
x = np.random.randn(3, 4, 5).astype(np.float32)
node = onnx.helper.make_node(
"Hardmax",
inputs=["x"],
outputs=["y"],
axis=0,
)
y = hardmax(x, axis=0)
expect(node, inputs=[x], outputs=[y], name="test_hardmax_axis_0")
node = onnx.helper.make_node(
"Hardmax",
inputs=["x"],
outputs=["y"],
axis=1,
)
y = hardmax(x, axis=1)
expect(node, inputs=[x], outputs=[y], name="test_hardmax_axis_1")
node = onnx.helper.make_node(
"Hardmax",
inputs=["x"],
outputs=["y"],
axis=2,
)
y = hardmax(x, axis=2)
expect(node, inputs=[x], outputs=[y], name="test_hardmax_axis_2")
node = onnx.helper.make_node(
"Hardmax",
inputs=["x"],
outputs=["y"],
axis=-1,
)
y = hardmax(x, axis=-1)
expect(node, inputs=[x], outputs=[y], name="test_hardmax_negative_axis")
# default axis is -1
node = onnx.helper.make_node(
"Hardmax",
inputs=["x"],
outputs=["y"],
)
expect(node, inputs=[x], outputs=[y], name="test_hardmax_default_axis")
Identity operator
This version of the operator has been available since version 25 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Identity-1">1</a>, <a href="Changelog.md#Identity-13">13</a>, <a href="Changelog.md#Identity-14">14</a>, <a href="Changelog.md#Identity-16">16</a>, <a href="Changelog.md#Identity-19">19</a>, <a href="Changelog.md#Identity-21">21</a>, <a href="Changelog.md#Identity-23">23</a>, <a href="Changelog.md#Identity-24">24</a>
node = onnx.helper.make_node(
"Identity",
inputs=["x"],
outputs=["y"],
)
data = np.array(
[
[
[
[1, 2],
[3, 4],
]
]
],
dtype=np.float32,
)
expect(node, inputs=[data], outputs=[data], name="test_identity")
ten_in_tp = onnx.helper.make_tensor_type_proto(
onnx.TensorProto.FLOAT, shape=[5]
)
seq_in_tp = onnx.helper.make_sequence_type_proto(ten_in_tp)
opt_in_tp = onnx.helper.make_optional_type_proto(seq_in_tp)
identity_node = onnx.helper.make_node(
"Identity", inputs=["opt_in"], outputs=["opt_out"]
)
x = [np.array([1, 2, 3, 4, 5]).astype(np.float32)]
expect(
identity_node,
inputs=[x],
outputs=[x],
name="test_identity_opt",
opset_imports=[onnx.helper.make_opsetid("", 16)],
input_type_protos=[opt_in_tp],
output_type_protos=[opt_in_tp],
)
node = onnx.helper.make_node(
"Identity",
inputs=["x"],
outputs=["y"],
)
data = [
np.array(
[
[
[
[1, 2],
[3, 4],
]
]
],
dtype=np.float32,
),
np.array(
[
[
[
[2, 3],
[1, 5],
]
]
],
dtype=np.float32,
),
]
expect(node, inputs=[data], outputs=[data], name="test_identity_sequence")
If conditional
This version of the operator has been available since version 25 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#If-1">1</a>, <a href="Changelog.md#If-11">11</a>, <a href="Changelog.md#If-13">13</a>, <a href="Changelog.md#If-16">16</a>, <a href="Changelog.md#If-19">19</a>, <a href="Changelog.md#If-21">21</a>, <a href="Changelog.md#If-23">23</a>, <a href="Changelog.md#If-24">24</a>
# Given a bool scalar input cond.
# return constant tensor x if cond is True, otherwise return constant tensor y.
then_out = onnx.helper.make_tensor_value_info(
"then_out", onnx.TensorProto.FLOAT, [5]
)
else_out = onnx.helper.make_tensor_value_info(
"else_out", onnx.TensorProto.FLOAT, [5]
)
x = np.array([1, 2, 3, 4, 5]).astype(np.float32)
y = np.array([5, 4, 3, 2, 1]).astype(np.float32)
then_const_node = onnx.helper.make_node(
"Constant",
inputs=[],
outputs=["then_out"],
value=onnx.numpy_helper.from_array(x),
)
else_const_node = onnx.helper.make_node(
"Constant",
inputs=[],
outputs=["else_out"],
value=onnx.numpy_helper.from_array(y),
)
then_body = onnx.helper.make_graph(
[then_const_node], "then_body", [], [then_out]
)
else_body = onnx.helper.make_graph(
[else_const_node], "else_body", [], [else_out]
)
if_node = onnx.helper.make_node(
"If",
inputs=["cond"],
outputs=["res"],
then_branch=then_body,
else_branch=else_body,
)
cond = np.array(1).astype(bool)
res = x if cond else y
expect(
if_node,
inputs=[cond],
outputs=[res],
name="test_if",
opset_imports=[onnx.helper.make_opsetid("", 11)],
)
# Given a bool scalar input cond, return an empty optional sequence of
# tensor if True, return an optional sequence with value x
# (the input optional sequence) otherwise.
ten_in_tp = onnx.helper.make_tensor_type_proto(
onnx.TensorProto.FLOAT, shape=[5]
)
seq_in_tp = onnx.helper.make_sequence_type_proto(ten_in_tp)
then_out_tensor_tp = onnx.helper.make_tensor_type_proto(
onnx.TensorProto.FLOAT, shape=[5]
)
then_out_seq_tp = onnx.helper.make_sequence_type_proto(then_out_tensor_tp)
then_out_opt_tp = onnx.helper.make_optional_type_proto(then_out_seq_tp)
then_out = onnx.helper.make_value_info("optional_empty", then_out_opt_tp)
else_out_tensor_tp = onnx.helper.make_tensor_type_proto(
onnx.TensorProto.FLOAT, shape=[5]
)
else_out_seq_tp = onnx.helper.make_sequence_type_proto(else_out_tensor_tp)
else_out_opt_tp = onnx.helper.make_optional_type_proto(else_out_seq_tp)
else_out = onnx.helper.make_value_info("else_opt", else_out_opt_tp)
x = [np.array([1, 2, 3, 4, 5]).astype(np.float32)]
cond = np.array(0).astype(bool)
res = compute_if_outputs(x, cond)
opt_empty_in = onnx.helper.make_node(
"Optional", inputs=[], outputs=["optional_empty"], type=seq_in_tp
)
then_body = onnx.helper.make_graph([opt_empty_in], "then_body", [], [then_out])
else_const_node = onnx.helper.make_node(
"Constant",
inputs=[],
outputs=["x"],
value=onnx.numpy_helper.from_array(x[0]),
)
else_seq_node = onnx.helper.make_node(
"SequenceConstruct", inputs=["x"], outputs=["else_seq"]
)
else_optional_seq_node = onnx.helper.make_node(
"Optional", inputs=["else_seq"], outputs=["else_opt"]
)
else_body = onnx.helper.make_graph(
[else_const_node, else_seq_node, else_optional_seq_node],
"else_body",
[],
[else_out],
)
if_node = onnx.helper.make_node(
"If",
inputs=["cond"],
outputs=["sequence"],
then_branch=then_body,
else_branch=else_body,
)
expect(
if_node,
inputs=[cond],
outputs=[res],
name="test_if_opt",
output_type_protos=[else_out_opt_tp],
opset_imports=[onnx.helper.make_opsetid("", 16)],
)
# Given a bool scalar input cond.
# return constant sequence x if cond is True, otherwise return constant sequence y.
then_out = onnx.helper.make_tensor_sequence_value_info(
"then_out", onnx.TensorProto.FLOAT, shape=[5]
)
else_out = onnx.helper.make_tensor_sequence_value_info(
"else_out", onnx.TensorProto.FLOAT, shape=[5]
)
x = [np.array([1, 2, 3, 4, 5]).astype(np.float32)]
y = [np.array([5, 4, 3, 2, 1]).astype(np.float32)]
then_const_node = onnx.helper.make_node(
"Constant",
inputs=[],
outputs=["x"],
value=onnx.numpy_helper.from_array(x[0]),
)
then_seq_node = onnx.helper.make_node(
"SequenceConstruct", inputs=["x"], outputs=["then_out"]
)
else_const_node = onnx.helper.make_node(
"Constant",
inputs=[],
outputs=["y"],
value=onnx.numpy_helper.from_array(y[0]),
)
else_seq_node = onnx.helper.make_node(
"SequenceConstruct", inputs=["y"], outputs=["else_out"]
)
then_body = onnx.helper.make_graph(
[then_const_node, then_seq_node], "then_body", [], [then_out]
)
else_body = onnx.helper.make_graph(
[else_const_node, else_seq_node], "else_body", [], [else_out]
)
if_node = onnx.helper.make_node(
"If",
inputs=["cond"],
outputs=["res"],
then_branch=then_body,
else_branch=else_body,
)
cond = np.array(1).astype(bool)
res = x if cond else y
expect(
if_node,
inputs=[cond],
outputs=[res],
name="test_if_seq",
opset_imports=[onnx.helper.make_opsetid("", 13)],
)
Loads and decodes an image from a file. If it can't decode for any reason (e.g. corrupted encoded stream, invalid format), it will return an empty matrix. The following image formats are supported:
B0 = round_half_down((1/4) * A + (3/4) * B)
B1 = round_half_up((3/4) * B + (1/4) * C)
This method is the default chroma upsampling method in the well-established libjpeg-turbo library, where it is also referred to as "smooth" or "fancy" upsampling.
This version of the operator has been available since version 20 of the default ONNX operator set.
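The two rounding formulas for B0 and B1 can be sketched with integer arithmetic. This is an illustration under stated assumptions: A, B, and C are taken to be adjacent non-negative integer chroma samples, B0 and B1 are the two output samples that replace B, the function name `upsample_chroma_pair` is hypothetical, and the handling of samples at the image border is not covered.

```python
def upsample_chroma_pair(a: int, b: int, c: int) -> tuple[int, int]:
    """Upsample sample b (with neighbours a and c) into the two outputs B0 and B1."""
    # (1/4) * A + (3/4) * B; adding 1 before the shift rounds an exact .5 down.
    b0 = (a + 3 * b + 1) >> 2
    # (3/4) * B + (1/4) * C; adding 2 before the shift rounds an exact .5 up.
    b1 = (3 * b + c + 2) >> 2
    return b0, b1


print(upsample_chroma_pair(10, 20, 30))  # (17, 23): 17.5 rounds down, 22.5 rounds up
```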
node = onnx.helper.make_node(
"ImageDecoder",
inputs=["data"],
outputs=["output"],
pixel_format="RGB",
)
data, output = _generate_test_data(
"bmp", _image_decoder_data.image_decoder_decode_bmp_rgb, "RGB"
)
expect(
node,
inputs=[data],
outputs=[output],
name="test_image_decoder_decode_bmp_rgb",
)
node = onnx.helper.make_node(
"ImageDecoder",
inputs=["data"],
outputs=["output"],
pixel_format="RGB",
)
data, output = _generate_test_data(
"jpeg2000", _image_decoder_data.image_decoder_decode_jpeg2k_rgb, "RGB"
)
expect(
node,
inputs=[data],
outputs=[output],
name="test_image_decoder_decode_jpeg2k_rgb",
)
node = onnx.helper.make_node(
"ImageDecoder",
inputs=["data"],
outputs=["output"],
pixel_format="BGR",
)
data, output = _generate_test_data(
"jpeg", _image_decoder_data.image_decoder_decode_jpeg_bgr, "BGR"
)
expect(
node,
inputs=[data],
outputs=[output],
name="test_image_decoder_decode_jpeg_bgr",
)
node = onnx.helper.make_node(
"ImageDecoder",
inputs=["data"],
outputs=["output"],
pixel_format="Grayscale",
)
data, output = _generate_test_data(
"jpeg", _image_decoder_data.image_decoder_decode_jpeg_grayscale, "Grayscale"
)
expect(
node,
inputs=[data],
outputs=[output],
name="test_image_decoder_decode_jpeg_grayscale",
)
node = onnx.helper.make_node(
"ImageDecoder",
inputs=["data"],
outputs=["output"],
pixel_format="RGB",
)
data, output = _generate_test_data(
"jpeg", _image_decoder_data.image_decoder_decode_jpeg_rgb, "RGB"
)
expect(
node,
inputs=[data],
outputs=[output],
name="test_image_decoder_decode_jpeg_rgb",
)
node = onnx.helper.make_node(
"ImageDecoder",
inputs=["data"],
outputs=["output"],
pixel_format="RGB",
)
data, output = _generate_test_data(
"png", _image_decoder_data.image_decoder_decode_png_rgb, "RGB"
)
expect(
node,
inputs=[data],
outputs=[output],
name="test_image_decoder_decode_png_rgb",
)
node = onnx.helper.make_node(
"ImageDecoder",
inputs=["data"],
outputs=["output"],
pixel_format="RGB",
)
data, output = _generate_test_data(
"ppm", _image_decoder_data.image_decoder_decode_pnm_rgb, "RGB"
)
expect(
node,
inputs=[data],
outputs=[output],
name="test_image_decoder_decode_pnm_rgb",
)
node = onnx.helper.make_node(
"ImageDecoder",
inputs=["data"],
outputs=["output"],
pixel_format="RGB",
)
data, output = _generate_test_data(
"tiff", _image_decoder_data.image_decoder_decode_tiff_rgb, "RGB"
)
expect(
node,
inputs=[data],
outputs=[output],
name="test_image_decoder_decode_tiff_rgb",
)
node = onnx.helper.make_node(
"ImageDecoder",
inputs=["data"],
outputs=["output"],
pixel_format="RGB",
)
data, output = _generate_test_data(
"webp", _image_decoder_data.image_decoder_decode_webp_rgb, "RGB"
)
expect(
node,
inputs=[data],
outputs=[output],
name="test_image_decoder_decode_webp_rgb",
)
Carries out instance normalization as described in the paper https://arxiv.org/abs/1607.08022.
y = scale * (x - mean) / sqrt(variance + epsilon) + B, where mean and variance are computed per instance per channel.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#InstanceNormalization-1">1</a>, <a href="Changelog.md#InstanceNormalization-6">6</a>
def _instancenorm_test_mode(x, s, bias, epsilon=1e-5): # type: ignore
dims_x = len(x.shape)
axis = tuple(range(2, dims_x))
mean = np.mean(x, axis=axis, keepdims=True)
var = np.var(x, axis=axis, keepdims=True)
dim_ones = (1,) * (dims_x - 2)
s = s.reshape(-1, *dim_ones)
bias = bias.reshape(-1, *dim_ones)
return s * (x - mean) / np.sqrt(var + epsilon) + bias
# input size: (1, 2, 1, 3)
x = np.array([[[[-1, 0, 1]], [[2, 3, 4]]]]).astype(np.float32)
s = np.array([1.0, 1.5]).astype(np.float32)
bias = np.array([0, 1]).astype(np.float32)
y = _instancenorm_test_mode(x, s, bias).astype(np.float32)
node = onnx.helper.make_node(
"InstanceNormalization",
inputs=["x", "s", "bias"],
outputs=["y"],
)
# output size: (1, 2, 1, 3)
expect(node, inputs=[x, s, bias], outputs=[y], name="test_instancenorm_example")
# input size: (2, 3, 4, 5)
x = np.random.randn(2, 3, 4, 5).astype(np.float32)
s = np.random.randn(3).astype(np.float32)
bias = np.random.randn(3).astype(np.float32)
epsilon = 1e-2
y = _instancenorm_test_mode(x, s, bias, epsilon).astype(np.float32)
node = onnx.helper.make_node(
"InstanceNormalization",
inputs=["x", "s", "bias"],
outputs=["y"],
epsilon=epsilon,
)
# output size: (2, 3, 4, 5)
expect(node, inputs=[x, s, bias], outputs=[y], name="test_instancenorm_epsilon")
Map infinity to true and other values to false.
This version of the operator has been available since version 20 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#IsInf-10">10</a>
node = onnx.helper.make_node(
"IsInf",
inputs=["x"],
outputs=["y"],
)
x = np.array([-1.2, np.nan, np.inf, 2.8, -np.inf, np.inf], dtype=np.float32)
y = np.isinf(x)
expect(node, inputs=[x], outputs=[y], name="test_isinf")
node = onnx.helper.make_node(
"IsInf",
inputs=["x"],
outputs=["y"],
)
x = np.array([-1.2, np.nan, np.inf, 2.8, -np.inf, np.inf], dtype=np.float16)
y = np.isinf(x)
expect(node, inputs=[x], outputs=[y], name="test_isinf_float16")
node = onnx.helper.make_node(
"IsInf", inputs=["x"], outputs=["y"], detect_positive=0
)
x = np.array([-1.7, np.nan, np.inf, -3.6, -np.inf, np.inf], dtype=np.float32)
y = np.isneginf(x)
expect(node, inputs=[x], outputs=[y], name="test_isinf_negative")
node = onnx.helper.make_node(
"IsInf", inputs=["x"], outputs=["y"], detect_negative=0
)
x = np.array([-1.7, np.nan, np.inf, 3.6, -np.inf, np.inf], dtype=np.float32)
y = np.isposinf(x)
expect(node, inputs=[x], outputs=[y], name="test_isinf_positive")
Returns which elements of the input are NaN.
This version of the operator has been available since version 20 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#IsNaN-9">9</a>, <a href="Changelog.md#IsNaN-13">13</a>
node = onnx.helper.make_node(
"IsNaN",
inputs=["x"],
outputs=["y"],
)
x = np.array([-1.2, np.nan, np.inf, 2.8, -np.inf, np.inf], dtype=np.float16)
y = np.isnan(x)
expect(node, inputs=[x], outputs=[y], name="test_isnan_float16")
node = onnx.helper.make_node(
"IsNaN",
inputs=["x"],
outputs=["y"],
)
x = np.array([-1.2, np.nan, np.inf, 2.8, -np.inf, np.inf], dtype=np.float32)
y = np.isnan(x)
expect(node, inputs=[x], outputs=[y], name="test_isnan")
Local Response Normalization proposed in the AlexNet paper.
It normalizes over local input regions.
The local region is defined across the channels. For an element X[n, c, d1, ..., dk] in a tensor
of shape (N x C x D1 x D2 x ... x Dk), its region is
{X[n, i, d1, ..., dk] | max(0, c - floor((size - 1) / 2)) <= i <= min(C - 1, c + ceil((size - 1) / 2))}.
square_sum[n, c, d1, ..., dk] = sum(X[n, i, d1, ..., dk] ^ 2),
where max(0, c - floor((size - 1) / 2)) <= i <= min(C - 1, c + ceil((size - 1) / 2)).
Y[n, c, d1, ..., dk] = X[n, c, d1, ..., dk] / (bias + alpha / size * square_sum[n, c, d1, ..., dk] ) ^ beta
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#LRN-1">1</a>
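The definition above can be sketched per channel rather than per element. The function name `lrn` is hypothetical; the default values for `alpha`, `beta`, and `bias` are taken from the default example in this section.

```python
import math

import numpy as np


def lrn(x, size, alpha=0.0001, beta=0.75, bias=1.0):
    c = x.shape[1]
    square_sum = np.zeros_like(x)
    for ch in range(c):
        # Inclusive channel window [lo, hi], clipped to the valid channel range.
        lo = max(0, ch - math.floor((size - 1) / 2))
        hi = min(c - 1, ch + math.ceil((size - 1) / 2))
        square_sum[:, ch] = np.sum(x[:, lo : hi + 1] ** 2, axis=1)
    return x / (bias + alpha / size * square_sum) ** beta


x = np.random.randn(5, 5, 5, 5).astype(np.float32)
y = lrn(x, size=3)
```

Slicing whole channel windows at once avoids the element-by-element loop while computing the same sums, and it works for any number of trailing dimensions.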
alpha = 0.0001
beta = 0.75
bias = 1.0
nsize = 3
node = onnx.helper.make_node("LRN", inputs=["x"], outputs=["y"], size=3)
x = np.random.randn(5, 5, 5, 5).astype(np.float32)
square_sum = np.zeros((5, 5, 5, 5)).astype(np.float32)
for n, c, h, w in np.ndindex(x.shape):
square_sum[n, c, h, w] = sum(
x[
n,
max(0, c - math.floor((nsize - 1) / 2)) : min(
5, c + math.ceil((nsize - 1) / 2) + 1
),
h,
w,
]
** 2
)
y = x / ((bias + (alpha / nsize) * square_sum) ** beta)
expect(node, inputs=[x], outputs=[y], name="test_lrn_default")
alpha = 0.0002
beta = 0.5
bias = 2.0
nsize = 3
node = onnx.helper.make_node(
"LRN",
inputs=["x"],
outputs=["y"],
alpha=alpha,
beta=beta,
bias=bias,
size=nsize,
)
x = np.random.randn(5, 5, 5, 5).astype(np.float32)
square_sum = np.zeros((5, 5, 5, 5)).astype(np.float32)
for n, c, h, w in np.ndindex(x.shape):
square_sum[n, c, h, w] = sum(
x[
n,
max(0, c - math.floor((nsize - 1) / 2)) : min(
5, c + math.ceil((nsize - 1) / 2) + 1
),
h,
w,
]
** 2
)
y = x / ((bias + (alpha / nsize) * square_sum) ** beta)
expect(node, inputs=[x], outputs=[y], name="test_lrn")
Computes a one-layer LSTM. This operator is usually supported via some custom implementation such as CuDNN.
Notations:
* X - input tensor
* i - input gate
* o - output gate
* f - forget gate
* c - cell gate
* t - time step (t-1 means previous time step)
* W[iofc] - W parameter weight matrix for input, output, forget, and cell gates
* R[iofc] - R recurrence weight matrix for input, output, forget, and cell gates
* Wb[iofc] - W bias vectors for input, output, forget, and cell gates
* Rb[iofc] - R bias vectors for input, output, forget, and cell gates
* P[iof] - P peephole weight vector for input, output, and forget gates
* WB[iofc] - W parameter weight matrix for backward input, output, forget, and cell gates
* RB[iofc] - R recurrence weight matrix for backward input, output, forget, and cell gates
* WBb[iofc] - W bias vectors for backward input, output, forget, and cell gates
* RBb[iofc] - R bias vectors for backward input, output, forget, and cell gates
* PB[iof] - P peephole weight vector for backward input, output, and forget gates
* H - Hidden state
* num_directions - 2 if direction == bidirectional else 1

Activation functions:
NOTE: Below are optional
Equations (Default: f=Sigmoid, g=Tanh, h=Tanh):
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#LSTM-1">1</a>, <a href="Changelog.md#LSTM-7">7</a>, <a href="Changelog.md#LSTM-14">14</a>
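As a rough illustration of the notation, the sketch below runs a forward, single-direction LSTM with the default activations (f=Sigmoid, g=Tanh, h=Tanh). It assumes the standard LSTM recurrence, the default layout (X shaped [seq_length, batch_size, input_size]), and zero bias, zero peepholes, and zero initial states. The function `lstm_forward` is hypothetical and is not the `LSTMHelper` used in the examples.

```python
import numpy as np


def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))


def lstm_forward(X, W, R, hidden_size):
    """Hypothetical sketch: forward LSTM without bias, peepholes, or initial state."""
    seq_length, batch_size, _ = X.shape
    # Gate weights are stacked along axis 0 in the order i, o, f, c (W[iofc], R[iofc]).
    W_i, W_o, W_f, W_c = np.split(W[0], 4, axis=0)
    R_i, R_o, R_f, R_c = np.split(R[0], 4, axis=0)
    H = np.zeros((batch_size, hidden_size), dtype=X.dtype)
    C = np.zeros((batch_size, hidden_size), dtype=X.dtype)
    outputs = []
    for t in range(seq_length):
        i = sigmoid(X[t] @ W_i.T + H @ R_i.T)  # input gate
        f = sigmoid(X[t] @ W_f.T + H @ R_f.T)  # forget gate
        c = np.tanh(X[t] @ W_c.T + H @ R_c.T)  # cell gate
        C = f * C + i * c
        o = sigmoid(X[t] @ W_o.T + H @ R_o.T)  # output gate
        H = o * np.tanh(C)
        outputs.append(H)
    # Y: [seq_length, num_directions, batch_size, hidden_size]
    # Y_h: [num_directions, batch_size, hidden_size]
    return np.stack(outputs)[:, np.newaxis], H[np.newaxis]


X = np.array([[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]).astype(np.float32)
hidden_size = 3
W = 0.1 * np.ones((1, 4 * hidden_size, 2)).astype(np.float32)
R = 0.1 * np.ones((1, 4 * hidden_size, hidden_size)).astype(np.float32)
Y, Y_h = lstm_forward(X, W, R, hidden_size)
```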
input = np.array([[[1.0, 2.0]], [[3.0, 4.0]], [[5.0, 6.0]]]).astype(np.float32)
input_size = 2
hidden_size = 7
weight_scale = 0.3
number_of_gates = 4
layout = 1
node = onnx.helper.make_node(
"LSTM",
inputs=["X", "W", "R"],
outputs=["Y", "Y_h"],
hidden_size=hidden_size,
layout=layout,
)
W = weight_scale * np.ones(
(1, number_of_gates * hidden_size, input_size)
).astype(np.float32)
R = weight_scale * np.ones(
(1, number_of_gates * hidden_size, hidden_size)
).astype(np.float32)
lstm = LSTMHelper(X=input, W=W, R=R, layout=layout)
Y, Y_h = lstm.step()
expect(
node,
inputs=[input, W, R],
outputs=[Y.astype(np.float32), Y_h.astype(np.float32)],
name="test_lstm_batchwise",
)
input = np.array([[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]).astype(np.float32)
input_size = 2
hidden_size = 3
weight_scale = 0.1
number_of_gates = 4
node = onnx.helper.make_node(
"LSTM", inputs=["X", "W", "R"], outputs=["", "Y_h"], hidden_size=hidden_size
)
W = weight_scale * np.ones(
(1, number_of_gates * hidden_size, input_size)
).astype(np.float32)
R = weight_scale * np.ones(
(1, number_of_gates * hidden_size, hidden_size)
).astype(np.float32)
lstm = LSTMHelper(X=input, W=W, R=R)
_, Y_h = lstm.step()
expect(
node,
inputs=[input, W, R],
outputs=[Y_h.astype(np.float32)],
name="test_lstm_defaults",
)
input = np.array([[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]]).astype(
np.float32
)
input_size = 3
hidden_size = 4
weight_scale = 0.1
custom_bias = 0.1
number_of_gates = 4
node = onnx.helper.make_node(
"LSTM",
inputs=["X", "W", "R", "B"],
outputs=["", "Y_h"],
hidden_size=hidden_size,
)
W = weight_scale * np.ones(
(1, number_of_gates * hidden_size, input_size)
).astype(np.float32)
R = weight_scale * np.ones(
(1, number_of_gates * hidden_size, hidden_size)
).astype(np.float32)
# Adding custom bias
W_B = custom_bias * np.ones((1, number_of_gates * hidden_size)).astype(
np.float32
)
R_B = np.zeros((1, number_of_gates * hidden_size)).astype(np.float32)
B = np.concatenate((W_B, R_B), 1)
lstm = LSTMHelper(X=input, W=W, R=R, B=B)
_, Y_h = lstm.step()
expect(
node,
inputs=[input, W, R, B],
outputs=[Y_h.astype(np.float32)],
name="test_lstm_with_initial_bias",
)
input = np.array([[[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]]).astype(
np.float32
)
input_size = 4
hidden_size = 3
weight_scale = 0.1
number_of_gates = 4
number_of_peepholes = 3
node = onnx.helper.make_node(
"LSTM",
inputs=["X", "W", "R", "B", "sequence_lens", "initial_h", "initial_c", "P"],
outputs=["", "Y_h"],
hidden_size=hidden_size,
)
# Initializing Inputs
W = weight_scale * np.ones(
(1, number_of_gates * hidden_size, input_size)
).astype(np.float32)
R = weight_scale * np.ones(
(1, number_of_gates * hidden_size, hidden_size)
).astype(np.float32)
B = np.zeros((1, 2 * number_of_gates * hidden_size)).astype(np.float32)
seq_lens = np.repeat(input.shape[0], input.shape[1]).astype(np.int32)
init_h = np.zeros((1, input.shape[1], hidden_size)).astype(np.float32)
init_c = np.zeros((1, input.shape[1], hidden_size)).astype(np.float32)
P = weight_scale * np.ones((1, number_of_peepholes * hidden_size)).astype(
np.float32
)
lstm = LSTMHelper(
X=input, W=W, R=R, B=B, P=P, initial_c=init_c, initial_h=init_h
)
_, Y_h = lstm.step()
expect(
node,
inputs=[input, W, R, B, seq_lens, init_h, init_c, P],
outputs=[Y_h.astype(np.float32)],
name="test_lstm_with_peepholes",
)
This is layer normalization defined in ONNX as a function.
The overall computation can be split into two stages.
The first stage is standardization, which makes the
normalized elements have zero mean and unit variance.
The computation required by standardization can be
described by the following equations.
Mean = ReduceMean<axes=normalized_axes>(X)
D = Sub(X, Mean)
DD = Mul(D, D)
Var = ReduceMean<axes=normalized_axes>(DD)
VarEps = Add(Var, epsilon)
StdDev = Sqrt(VarEps)
InvStdDev = Reciprocal(StdDev)
Normalized = Mul(D, InvStdDev)
where normalized_axes is [axis, ..., rank of X - 1].
The variables Var and StdDev stand for variance and
standard deviation, respectively. The second output is
Mean and the last one is InvStdDev.
Depending on stash_type attribute, the actual computation
must happen in different floating-point precision.
For example, if stash_type is 1, this operator casts
all input variables to 32-bit float, performs the computation, and
finally casts Normalized back to the original type of X.
The second stage then scales and shifts the outcome of the
first stage using
NormalizedScaled = Mul(Normalized, Scale)
Y = Add(NormalizedScaled, B)
The second stage does not depend on stash_type.
All equations are in this syntax.
The same variable (i.e., input, output, and attribute) uses
the same name in the equations above and this operator's definition.
Let d[i] indicate the i-th dimension of X.
If X's shape is [d[0], ..., d[axis-1], d[axis], ..., d[rank-1]],
the shape of Mean and InvStdDev is [d[0], ..., d[axis-1], 1, ..., 1].
Y and X have the same shape. This operator supports unidirectional broadcasting
(tensors Scale and B should be unidirectional broadcastable to tensor X);
for more details please check the doc.
This version of the operator has been available since version 17 of the default ONNX operator set.
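The equations above map closely onto NumPy. The sketch below keeps the variable names from the equations; the function name `layer_normalization` is hypothetical, the default `epsilon=1e-5` is an assumption, and `stash_type` casting is ignored.

```python
import numpy as np


def layer_normalization(X, Scale, B, axis=-1, epsilon=1e-5):
    # normalized_axes = [axis, ..., rank of X - 1]
    axis = axis % X.ndim
    normalized_axes = tuple(range(axis, X.ndim))
    # Stage 1: standardization.
    Mean = np.mean(X, axis=normalized_axes, keepdims=True)
    D = X - Mean
    Var = np.mean(D * D, axis=normalized_axes, keepdims=True)
    InvStdDev = 1.0 / np.sqrt(Var + epsilon)
    Normalized = D * InvStdDev
    # Stage 2: scale and shift; Scale and B broadcast to the shape of X.
    Y = Normalized * Scale + B
    return Y, Mean, InvStdDev


X = np.random.randn(2, 3, 4).astype(np.float32)
Scale = np.random.randn(4).astype(np.float32)
B = np.random.randn(4).astype(np.float32)
Y, Mean, InvStdDev = layer_normalization(X, Scale, B, axis=-1)
```

Because the reductions keep their dimensions, Mean and InvStdDev come out with shape [d[0], ..., d[axis-1], 1, ..., 1], matching the shape rule stated above.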
X = np.random.randn(3, 4).astype(np.float32)
def case(axis: int) -> None:
normalized_shape = calculate_normalized_shape(X.shape, axis)
W = np.random.randn(*normalized_shape).astype(np.float32)
B = np.random.randn(*normalized_shape).astype(np.float32)
Y, mean, inv_std_dev = _layer_normalization(X, W, B, axis=axis)
node = onnx.helper.make_node(
"LayerNormalization",
inputs=["X", "W", "B"],
outputs=["Y", "Mean", "InvStdDev"],
axis=axis,
)
if axis < 0:
name = f"test_layer_normalization_2d_axis_negative_{-axis}"
else:
name = f"test_layer_normalization_2d_axis{axis}"
expect(node, inputs=[X, W, B], outputs=[Y, mean, inv_std_dev], name=name)
for i in range(len(X.shape)):
case(i)
case(i - len(X.shape))
epsilon = 1e-1
X = np.random.randn(2, 3, 5).astype(np.float32)
def case(axis: int) -> None:
normalized_shape = calculate_normalized_shape(X.shape, axis)
W = np.random.randn(*normalized_shape).astype(np.float32)
B = np.random.randn(*normalized_shape).astype(np.float32)
Y, mean, inv_std_dev = _layer_normalization(X, W, B, axis, epsilon)
node = onnx.helper.make_node(
"LayerNormalization",
inputs=["X", "W", "B"],
outputs=["Y", "Mean", "InvStdDev"],
axis=axis,
epsilon=epsilon,
)
if axis < 0:
name = f"test_layer_normalization_3d_axis_negative_{-axis}_epsilon"
else:
name = f"test_layer_normalization_3d_axis{axis}_epsilon"
expect(node, inputs=[X, W, B], outputs=[Y, mean, inv_std_dev], name=name)
for i in range(len(X.shape)):
case(i)
case(i - len(X.shape))
X = np.random.randn(2, 3, 4, 5).astype(np.float32)
# Default axis in LayerNormalization is -1.
normalized_shape = calculate_normalized_shape(X.shape, -1)
W = np.random.randn(*normalized_shape).astype(np.float32)
B = np.random.randn(*normalized_shape).astype(np.float32)
# Axis defaults to -1 in the reference implementation.
Y, mean, inv_std_dev = _layer_normalization(X, W, B)
# Not specifying axis attribute means -1.
node = onnx.helper.make_node(
"LayerNormalization",
inputs=["X", "W", "B"],
outputs=["Y", "Mean", "InvStdDev"],
)
expect(
node,
inputs=[X, W, B],
outputs=[Y, mean, inv_std_dev],
name="test_layer_normalization_default_axis",
)
X = np.random.randn(2, 3, 4, 5).astype(np.float32)
def case(axis: int) -> None:
normalized_shape = calculate_normalized_shape(X.shape, axis)
W = np.random.randn(*normalized_shape).astype(np.float32)
B = np.random.randn(*normalized_shape).astype(np.float32)
Y, mean, inv_std_dev = _layer_normalization(X, W, B, axis)
node = onnx.helper.make_node(
"LayerNormalization",
inputs=["X", "W", "B"],
outputs=["Y", "Mean", "InvStdDev"],
axis=axis,
)
if axis < 0:
name = f"test_layer_normalization_4d_axis_negative_{-axis}"
else:
name = f"test_layer_normalization_4d_axis{axis}"
expect(node, inputs=[X, W, B], outputs=[Y, mean, inv_std_dev], name=name)
for i in range(len(X.shape)):
case(i)
case(i - len(X.shape))
LeakyRelu takes input data (Tensor<T>) and an argument alpha, and produces one
output data (Tensor<T>) where the function f(x) = alpha * x for x < 0,
f(x) = x for x >= 0, is applied to the data tensor elementwise.
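The piecewise definition above can be written directly with NumPy. This is a small sketch for illustration; the function name `leaky_relu` is chosen here, and the default `alpha` of 0.01 follows the default used in the examples in this section.

```python
import numpy as np


def leaky_relu(x, alpha=0.01):
    """f(x) = alpha * x for x < 0, f(x) = x for x >= 0, applied elementwise."""
    return np.where(x < 0, alpha * x, x)


x = np.array([-1, 0, 1], dtype=np.float32)
y = leaky_relu(x, alpha=0.1)  # negative entries are scaled by alpha, others pass through
```

This gives the same values as the `np.clip`-based expression used in the test cases.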
This version of the operator has been available since version 16 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#LeakyRelu-1">1</a>, <a href="Changelog.md#LeakyRelu-6">6</a>
node = onnx.helper.make_node(
"LeakyRelu", inputs=["x"], outputs=["y"], alpha=0.1
)
x = np.array([-1, 0, 1]).astype(np.float32)
# expected output [-0.1, 0., 1.]
y = np.clip(x, 0, np.inf) + np.clip(x, -np.inf, 0) * 0.1
expect(node, inputs=[x], outputs=[y], name="test_leakyrelu_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, 0, np.inf) + np.clip(x, -np.inf, 0) * 0.1
expect(node, inputs=[x], outputs=[y], name="test_leakyrelu")
default_alpha = 0.01
node = onnx.helper.make_node(
"LeakyRelu",
inputs=["x"],
outputs=["y"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, 0, np.inf) + np.clip(x, -np.inf, 0) * default_alpha
expect(node, inputs=[x], outputs=[y], name="test_leakyrelu_default")
Returns the tensor resulting from performing the `less` logical operation
elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
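As a small illustration of the broadcasting behavior, the NumPy sketch below compares a `(2, 3)` tensor with a `(3,)` tensor and with a `(2, 1)` tensor. The values are made up for the example; only `np.less` and standard NumPy broadcasting are used.

```python
import numpy as np

x = np.arange(6, dtype=np.float32).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
row = np.array([1, 4, 2], dtype=np.float32)        # shape (3,), broadcast across rows
col = np.array([[2], [4]], dtype=np.float32)       # shape (2, 1), broadcast across columns

z_row = np.less(x, row)
z_col = np.less(x, col)
```

In both cases the result is a boolean tensor with the broadcast shape `(2, 3)`.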
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Less-1">1</a>, <a href="Changelog.md#Less-7">7</a>, <a href="Changelog.md#Less-9">9</a>
node = onnx.helper.make_node(
"Less",
inputs=["x", "y"],
outputs=["less"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(3, 4, 5).astype(np.float32)
z = np.less(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less")
x = np.random.randn(3, 4, 5).astype(np.int8)
y = np.random.randn(3, 4, 5).astype(np.int8)
z = np.less(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less_int8")
x = np.random.randn(3, 4, 5).astype(np.int16)
y = np.random.randn(3, 4, 5).astype(np.int16)
z = np.less(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less_int16")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8)
z = np.less(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less_uint8")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint16)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint16)
z = np.less(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less_uint16")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint32)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint32)
z = np.less(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less_uint32")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint64)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint64)
z = np.less(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less_uint64")
node = onnx.helper.make_node(
"LessOrEqual",
inputs=["x", "y"],
outputs=["less_equal"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(3, 4, 5).astype(np.float32)
z = np.less_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less_equal")
x = np.random.randn(3, 4, 5).astype(np.int8)
y = np.random.randn(3, 4, 5).astype(np.int8)
z = np.less_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less_equal_int8")
x = np.random.randn(3, 4, 5).astype(np.int16)
y = np.random.randn(3, 4, 5).astype(np.int16)
z = np.less_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less_equal_int16")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8)
z = np.less_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less_equal_uint8")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint16)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint16)
z = np.less_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less_equal_uint16")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint32)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint32)
z = np.less_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less_equal_uint32")
x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint64)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint64)
z = np.less_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less_equal_uint64")
node = onnx.helper.make_node(
"Less",
inputs=["x", "y"],
outputs=["less"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(5).astype(np.float32)
z = np.less(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less_bcast")
node = onnx.helper.make_node(
"LessOrEqual",
inputs=["x", "y"],
outputs=["less_equal"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(5).astype(np.float32)
z = np.less_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less_equal_bcast")
Returns the tensor resulting from performing the `less_equal` logical operation
elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
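One way to read `less_equal` is as the elementwise OR of `less` and `equal`. The NumPy sketch below checks that reading on a small integer example with broadcasting; the input values are made up for illustration.

```python
import numpy as np

a = np.array([[1, 5, 3], [4, 2, 6]], dtype=np.int32)
b = np.array([3, 5, 1], dtype=np.int32)  # broadcast against each row of a

le = np.less_equal(a, b)
combined = np.logical_or(np.less(a, b), np.equal(a, b))
```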
This version of the operator has been available since version 16 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#LessOrEqual-12">12</a>
Calculates the natural log of the given input tensor, element-wise.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Log-1">1</a>, <a href="Changelog.md#Log-6">6</a>
node = onnx.helper.make_node(
"Log",
inputs=["x"],
outputs=["y"],
)
x = np.array([1, 10]).astype(np.float32)
y = np.log(x) # expected output [0., 2.30258512]
expect(node, inputs=[x], outputs=[y], name="test_log_example")
x = np.exp(np.random.randn(3, 4, 5).astype(np.float32))
y = np.log(x)
expect(node, inputs=[x], outputs=[y], name="test_log")
The operator computes the log of softmax values for the given input:
LogSoftmax(input, axis) = Log(Softmax(input, axis=axis))
The "axis" attribute indicates the dimension along which LogSoftmax will be performed. The output tensor has the same shape and contains the LogSoftmax values of the corresponding input.
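The examples for this operator call a `logsoftmax` helper that is not shown here. A minimal NumPy sketch of such a helper is given below, under the assumption that it subtracts the per-axis maximum before exponentiating so that large inputs do not overflow. The name `logsoftmax_sketch` is hypothetical.

```python
import numpy as np


def logsoftmax_sketch(x, axis=-1):
    """Log(Softmax(x)) along `axis`, computed in a numerically stable way."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    return shifted - np.log(np.sum(np.exp(shifted), axis=axis, keepdims=True))


x = np.array([[0, 1, 2, 3], [10000, 10001, 10002, 10003]], dtype=np.float32)
y = logsoftmax_sketch(x)  # both rows give the same result despite the large offset
```

Because LogSoftmax is invariant to adding a constant along the axis, the two rows produce identical outputs, and exponentiating each row of `y` yields values that sum to 1.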
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#LogSoftmax-1">1</a>, <a href="Changelog.md#LogSoftmax-11">11</a>
node = onnx.helper.make_node(
"LogSoftmax",
inputs=["x"],
outputs=["y"],
)
x = np.array([[-1, 0, 1]]).astype(np.float32)
# expected output
# [[-2.4076061 -1.407606 -0.407606 ]]
y = logsoftmax(x)
expect(node, inputs=[x], outputs=[y], name="test_logsoftmax_example_1")
x = np.array([[0, 1, 2, 3], [10000, 10001, 10002, 10003]]).astype(np.float32)
# expected output
# [[-3.4401896 -2.4401896 -1.4401896 -0.44018966]
# [-3.4401896 -2.4401896 -1.4401896 -0.44018966]]
y = logsoftmax(x)
node = onnx.helper.make_node(
"LogSoftmax",
inputs=["x"],
outputs=["y"],
)
expect(node, inputs=[x], outputs=[y], name="test_logsoftmax_large_number")
x = np.abs(np.random.randn(3, 4, 5).astype(np.float32))
node = onnx.helper.make_node(
"LogSoftmax",
inputs=["x"],
outputs=["y"],
axis=0,
)
y = logsoftmax(x, axis=0)
expect(node, inputs=[x], outputs=[y], name="test_logsoftmax_axis_0")
node = onnx.helper.make_node(
"LogSoftmax",
inputs=["x"],
outputs=["y"],
axis=1,
)
y = logsoftmax(x, axis=1)
expect(node, inputs=[x], outputs=[y], name="test_logsoftmax_axis_1")
node = onnx.helper.make_node(
"LogSoftmax",
inputs=["x"],
outputs=["y"],
axis=2,
)
y = logsoftmax(x, axis=2)
expect(node, inputs=[x], outputs=[y], name="test_logsoftmax_axis_2")
node = onnx.helper.make_node(
"LogSoftmax",
inputs=["x"],
outputs=["y"],
axis=-1,
)
y = logsoftmax(x, axis=-1)
expect(node, inputs=[x], outputs=[y], name="test_logsoftmax_negative_axis")
# default axis is -1
node = onnx.helper.make_node(
"LogSoftmax",
inputs=["x"],
outputs=["y"],
)
expect(node, inputs=[x], outputs=[y], name="test_logsoftmax_default_axis")
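Generic Looping construct. This loop has two termination conditions: a maximum trip count (`max_trip_count`) and a loop-termination condition (`condition_var`). Either input may be omitted by passing an empty string.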
This table summarizes the operating modes of this operator with equivalent C-style code:

Operator inputs defined as (max_trip_count, condition_var).

* input ("", ""):

        for (int i=0; ; ++i) {
          cond = ... // Note this value is ignored, but is required in the body
        }

* input ("", cond) // Note this is analogous to a while loop

        bool cond = ...;
        for (int i=0; cond; ++i) {
          cond = ...;
        }

* input ("", 1) // Note this is analogous to a do-while loop

        bool cond = true
        for (int i=0; cond; ++i) {
          cond = ...;
        }

* input (trip_count, "") // Note this is analogous to a for loop

        int trip_count = ...
        for (int i=0; i < trip_count; ++i) {
          cond = ...; // ignored
        }

* input (trip_count, cond)

        int trip_count = ...;
        bool cond = ...;
        for (int i=0; i < trip_count && cond; ++i) {
          cond = ...;
        }
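The operating modes listed above can be emulated with a small Python driver. This is a sketch for illustration only: `run_loop` and `body` are hypothetical names, `None` stands in for an omitted (`""`) input, and the body is modelled as a function returning `(cond_out, new_state, scan_value)`.

```python
from typing import Any, Callable, List, Optional, Tuple

Body = Callable[[int, bool, Any], Tuple[bool, Any, Any]]


def run_loop(
    max_trip_count: Optional[int], cond: Optional[bool], body: Body, state: Any
) -> Tuple[Any, List[Any]]:
    """Emulate the loop modes; None means the corresponding input is omitted."""
    use_cond = cond is not None
    keep_going = bool(cond) if use_cond else True
    scan_outputs: List[Any] = []
    i = 0
    while (max_trip_count is None or i < max_trip_count) and keep_going:
        cond_out, state, scan_value = body(i, keep_going, state)
        scan_outputs.append(scan_value)
        if use_cond:  # the body's condition is ignored when cond is omitted
            keep_going = bool(cond_out)
        i += 1
    return state, scan_outputs


def body(i: int, cond_in: bool, total: int) -> Tuple[bool, int, int]:
    total = total + i
    return total < 3, total, total


for_mode = run_loop(4, None, body, 0)       # input (trip_count, "")
while_mode = run_loop(None, True, body, 0)  # input ("", cond)
both = run_loop(2, True, body, 0)           # input (trip_count, cond)
skipped = run_loop(5, False, body, 0)       # cond is false on entry: no iterations
```

In the for-loop mode the condition returned by the body has no effect, so all four iterations run; in the while-loop mode the loop stops as soon as the body returns a false condition.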
Sample usage - cond as well as trip count

    graph predict-net {
      %a = Constant[value = <Scalar Tensor [3]>]()
      %b = Constant[value = <Scalar Tensor [6]>]()
      %keepgoing = Constant[value = <Scalar Tensor [1]>]()
      %max_trip_count = Constant[value = <Scalar Tensor [10]>]()
      %keepgoing_out, %b_out, %user_defined_vals = Loop[body = <graph body-net>](%max_trip_count, %keepgoing, %b)
      return
    }
    graph body-net (
      %i[INT32, scalar]           // iteration number
      %keepgoing_in[BOOL, scalar] // incoming loop-termination-condition; not used
      %b_in[INT32, scalar]        // incoming value of loop-carried-dependency b
    ) {
      %my_local = Add(%a, %b_in)
      %b_out = Sub(%a, %b_in) // outgoing value of loop-carried-dependency b
      %keepgoing_out = Greater(%my_local, %b_out) // outgoing loop-termination-condition
      %user_defined_val = Add(%b_in, %b_in) // scan-output value to be accumulated
      return %keepgoing_out, %b_out, %user_defined_val
    }
Sample equivalent C code

    {
      /* User-defined code (enclosing scope) */
      int a = 3, b = 6;
      bool keepgoing = true; // Analogous to input cond
      /* End user-defined code */
      /* Implicitly-defined code */
      const int max_trip_count = 10; // Analogous to input M
      int user_defined_vals[]; // Imagine this is resizable
      /* End implicitly-defined code */
      /* initialize loop-carried variables and scan-output variables */
      bool keepgoing_out = keepgoing;
      int b_out = b;
      for (int i=0; i < max_trip_count && keepgoing_out; ++i) {
        /* Implicitly-defined code: bind actual parameter values
           to formal parameter variables of loop-body */
        bool keepgoing_in = keepgoing_out;
        int b_in = b_out;
        /* User-defined code (loop body) */
        int my_local = a + b_in; // Reading value "a" from the enclosing scope is fine
        b_out = a - b_in;
        keepgoing_out = my_local > b_out;
        int user_defined_val = b_in + b_in; // b_in and b_out are different variables
        /* End user-defined code */
        /* Implicitly-defined code */
        user_defined_vals[i] = user_defined_val; // accumulate scan-output values
      }
      // int t = my_local; // Can't do this. my_local is not accessible here.
      // The values below are bound to the output variables of the loop and therefore accessible
      // b_out; user_defined_vals; keepgoing_out;
    }
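There are several things of note in this code snippet: values from the enclosing scope (such as `a`) can be read inside the loop body; a value carried from one iteration to the next is modelled as a pair of variables, an input (`b_in`) and an output (`b_out`); and values created in the body (such as `my_local`) are not accessible in the enclosing scope except through the output variables of the loop.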
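The sample can also be traced in Python to see the values the loop produces. The following is a direct transliteration of the C code with the same constants (`a = 3`, `b = 6`, `max_trip_count = 10`), using a Python list for the resizable `user_defined_vals`. It is an illustration only, not part of the operator definition.

```python
# Enclosing scope
a, b = 3, 6
keepgoing = True
max_trip_count = 10

# Loop-carried variables and scan-output accumulator
keepgoing_out, b_out = keepgoing, b
user_defined_vals = []

i = 0
while i < max_trip_count and keepgoing_out:
    b_in = b_out                           # bind loop-carried input for this iteration
    my_local = a + b_in                    # "a" is read from the enclosing scope
    b_out = a - b_in                       # outgoing loop-carried value
    keepgoing_out = my_local > b_out       # outgoing termination condition
    user_defined_vals.append(b_in + b_in)  # accumulate the scan output
    i += 1

iterations = i
```

With these inputs the loop stops after two iterations because `keepgoing_out` becomes false, well before the trip-count limit is reached.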
Note that the semantics of this op support "diagonal" or "wavefront" execution. (See Step 3 here for an example: https://devblogs.nvidia.com/optimizing-recurrent-neural-networks-cudnn-5/). Frontends should emit multi-layer RNNs as a series of While operators (with time being the inner looping dimension), with each successive layer consuming the scan_outputs from the previous layer, possibly going through several point-wise operators (e.g. dropout, residual connections, linear layer).
The inputs and outputs of the loop body subgraph are matched to those of the Loop node by order rather than by name. The implementation determines the names from this order.
This version of the operator has been available since version 25 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Loop-1">1</a>, <a href="Changelog.md#Loop-11">11</a>, <a href="Changelog.md#Loop-13">13</a>, <a href="Changelog.md#Loop-16">16</a>, <a href="Changelog.md#Loop-19">19</a>, <a href="Changelog.md#Loop-21">21</a>, <a href="Changelog.md#Loop-23">23</a>, <a href="Changelog.md#Loop-24">24</a>
# Given a tensor x of values [x1, ..., xN], and initial tensor y
# sum up its elements using a scan
# returning the final state (y+x1+x2+...+xN) as well as the scan_output
# [y+x1, y+x1+x2, ..., y+x1+x2+...+xN]
y_in = onnx.helper.make_tensor_value_info("y_in", onnx.TensorProto.FLOAT, [1])
y_out = onnx.helper.make_tensor_value_info("y_out", onnx.TensorProto.FLOAT, [1])
scan_out = onnx.helper.make_tensor_value_info(
"scan_out", onnx.TensorProto.FLOAT, [1]
)
cond_in = onnx.helper.make_tensor_value_info(
"cond_in", onnx.TensorProto.BOOL, []
)
cond_out = onnx.helper.make_tensor_value_info(
"cond_out", onnx.TensorProto.BOOL, []
)
iter_count = onnx.helper.make_tensor_value_info(
"iter_count", onnx.TensorProto.INT64, []
)
x = np.array([1, 2, 3, 4, 5]).astype(np.float32)
y = np.array([-2]).astype(np.float32)
x_const_node = onnx.helper.make_node(
"Constant",
inputs=[],
outputs=["x"],
value=onnx.helper.make_tensor(
name="const_tensor_x",
data_type=onnx.TensorProto.FLOAT,
dims=x.shape,
vals=x.flatten().astype(float),
),
)
one_const_node = onnx.helper.make_node(
"Constant",
inputs=[],
outputs=["one"],
value=onnx.helper.make_tensor(
name="const_tensor_one",
data_type=onnx.TensorProto.INT64,
dims=(),
vals=[1],
),
)
i_add_node = onnx.helper.make_node(
"Add", inputs=["iter_count", "one"], outputs=["end"]
)
start_unsqueeze_node = onnx.helper.make_node(
"Unsqueeze", inputs=["iter_count"], outputs=["slice_start"], axes=[0]
)
end_unsqueeze_node = onnx.helper.make_node(
"Unsqueeze", inputs=["end"], outputs=["slice_end"], axes=[0]
)
slice_node = onnx.helper.make_node(
"Slice", inputs=["x", "slice_start", "slice_end"], outputs=["slice_out"]
)
y_add_node = onnx.helper.make_node(
"Add", inputs=["y_in", "slice_out"], outputs=["y_out"]
)
identity_node = onnx.helper.make_node(
"Identity", inputs=["cond_in"], outputs=["cond_out"]
)
scan_identity_node = onnx.helper.make_node(
"Identity", inputs=["y_out"], outputs=["scan_out"]
)
loop_body = onnx.helper.make_graph(
[
identity_node,
x_const_node,
one_const_node,
i_add_node,
start_unsqueeze_node,
end_unsqueeze_node,
slice_node,
y_add_node,
scan_identity_node,
],
"loop_body",
[iter_count, cond_in, y_in],
[cond_out, y_out, scan_out],
)
node = onnx.helper.make_node(
"Loop",
inputs=["trip_count", "cond", "y"],
outputs=["res_y", "res_scan"],
body=loop_body,
)
trip_count = np.array(5).astype(np.int64)
res_y = np.array([13]).astype(np.float32)
cond = np.array(1).astype(bool)
res_scan = np.array([-1, 1, 4, 8, 13]).astype(np.float32).reshape((5, 1))
expect(
node,
inputs=[trip_count, cond, y],
outputs=[res_y, res_scan],
name="test_loop11",
opset_imports=[onnx.helper.make_opsetid("", 11)],
)
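The expected outputs of this example can be cross-checked without building a graph. The sketch below recomputes the running sum described in the comments using `np.cumsum`; it is only a sanity check on the expected values, not a replacement for the Loop node.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=np.float32)
y = np.array([-2], dtype=np.float32)

running = y + np.cumsum(x)          # y+x1, y+x1+x2, ..., one entry per iteration
res_scan = running.reshape((5, 1))  # scan outputs are stacked along a new leading axis
res_y = running[-1:]                # final state after the last iteration
```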
# Given a tensor x of values [x1, ..., xN],
# Return a sequence of tensors of
# [[x1], [x1, x2], ..., [x1, ..., xN]]
seq_in = onnx.helper.make_tensor_sequence_value_info(
"seq_in", onnx.TensorProto.FLOAT, None
)
seq_out = onnx.helper.make_tensor_sequence_value_info(
"seq_out", onnx.TensorProto.FLOAT, None
)
cond_in = onnx.helper.make_tensor_value_info(
"cond_in", onnx.TensorProto.BOOL, []
)
cond_out = onnx.helper.make_tensor_value_info(
"cond_out", onnx.TensorProto.BOOL, []
)
iter_count = onnx.helper.make_tensor_value_info(
"iter_count", onnx.TensorProto.INT64, []
)
x = np.array([1, 2, 3, 4, 5]).astype(np.float32)
x_const_node = onnx.helper.make_node(
"Constant",
inputs=[],
outputs=["x"],
value=onnx.helper.make_tensor(
name="const_tensor_x",
data_type=onnx.TensorProto.FLOAT,
dims=x.shape,
vals=x.flatten().astype(float),
),
)
one_const_node = onnx.helper.make_node(
"Constant",
inputs=[],
outputs=["one"],
value=onnx.helper.make_tensor(
name="const_tensor_one",
data_type=onnx.TensorProto.INT64,
dims=(),
vals=[1],
),
)
zero_const_node = onnx.helper.make_node(
"Constant",
inputs=[],
outputs=["slice_start"],
value=onnx.helper.make_tensor(
name="const_tensor_zero",
data_type=onnx.TensorProto.INT64,
dims=(1,),
vals=[0],
),
)
axes_node = onnx.helper.make_node(
"Constant",
inputs=[],
outputs=["axes"],
value=onnx.helper.make_tensor(
name="const_tensor_axes",
data_type=onnx.TensorProto.INT64,
dims=(),
vals=[0],
),
)
add_node = onnx.helper.make_node(
"Add", inputs=["iter_count", "one"], outputs=["end"]
)
end_unsqueeze_node = onnx.helper.make_node(
"Unsqueeze", inputs=["end", "axes"], outputs=["slice_end"]
)
slice_node = onnx.helper.make_node(
"Slice", inputs=["x", "slice_start", "slice_end"], outputs=["slice_out"]
)
insert_node = onnx.helper.make_node(
"SequenceInsert", inputs=["seq_in", "slice_out"], outputs=["seq_out"]
)
identity_node = onnx.helper.make_node(
"Identity", inputs=["cond_in"], outputs=["cond_out"]
)
loop_body = onnx.helper.make_graph(
[
identity_node,
x_const_node,
one_const_node,
zero_const_node,
add_node,
axes_node,
end_unsqueeze_node,
slice_node,
insert_node,
],
"loop_body",
[iter_count, cond_in, seq_in],
[cond_out, seq_out],
)
node = onnx.helper.make_node(
"Loop",
inputs=["trip_count", "cond", "seq_empty"],
outputs=["seq_res"],
body=loop_body,
)
trip_count = np.array(5).astype(np.int64)
seq_empty: list[Any] = []
seq_res = [x[: int(i)] for i in x]
cond = np.array(1).astype(bool)
expect(
node,
inputs=[trip_count, cond, seq_empty],
outputs=[seq_res],
name="test_loop13_seq",
opset_imports=[onnx.helper.make_opsetid("", 13)],
input_type_protos=[
onnx.helper.make_tensor_type_proto(
onnx.TensorProto.INT64, trip_count.shape
),
onnx.helper.make_tensor_type_proto(onnx.TensorProto.BOOL, cond.shape),
onnx.helper.make_sequence_type_proto(
onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, [])
),
],
)
# Given a tensor sequence of values [x1, ..., xN], and an initial optional sequence of tensors [x0],
# Return a concatenated sequence of tensors of
# [x0, [x1], [x1, x2], ..., [x1, ..., xN]]
ten_in_tp = onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, [])
seq_in_tp = onnx.helper.make_sequence_type_proto(ten_in_tp)
opt_in_tp = onnx.helper.make_optional_type_proto(seq_in_tp)
opt_in = onnx.helper.make_value_info("opt_seq_in", opt_in_tp)
seq_out = onnx.helper.make_tensor_sequence_value_info(
"seq_out", onnx.TensorProto.FLOAT, []
)
cond_in = onnx.helper.make_tensor_value_info(
"cond_in", onnx.TensorProto.BOOL, []
)
cond_out = onnx.helper.make_tensor_value_info(
"cond_out", onnx.TensorProto.BOOL, []
)
iter_count = onnx.helper.make_tensor_value_info(
"iter_count", onnx.TensorProto.INT64, []
)
x0 = np.array(0).astype(np.float32)
x = np.array([1, 2, 3, 4, 5]).astype(np.float32)
optional_has_elem_node = onnx.helper.make_node(
"OptionalHasElement", inputs=["opt_seq_in"], outputs=["optional_has_elem"]
)
optional_is_none = onnx.helper.make_node(
"Not", inputs=["optional_has_elem"], outputs=["optional_is_none"]
)
optional_get_elem = onnx.helper.make_node(
"OptionalGetElement", inputs=["opt_seq_in"], outputs=["seq_in"]
)
constant_in = onnx.helper.make_node(
"Constant",
inputs=[],
outputs=["constant_in"],
value=onnx.helper.make_tensor(
name="const_tensor", data_type=onnx.TensorProto.FLOAT, dims=(), vals=[0]
),
)
seq_const_in = onnx.helper.make_node(
"SequenceConstruct", inputs=["constant_in"], outputs=["init_seq_in"]
)
then_seq_out = onnx.helper.make_tensor_sequence_value_info(
"init_seq_in", onnx.TensorProto.FLOAT, []
)
then_body = onnx.helper.make_graph(
[constant_in, seq_const_in], "then_body", [], [then_seq_out]
)
else_seq_out = onnx.helper.make_tensor_sequence_value_info(
"seq_in", onnx.TensorProto.FLOAT, []
)
else_body = onnx.helper.make_graph(
[optional_get_elem], "else_body", [], [else_seq_out]
)
if_node = onnx.helper.make_node(
"If",
inputs=["optional_is_none"],
outputs=["sequence"],
then_branch=then_body,
else_branch=else_body,
)
x_const_node = onnx.helper.make_node(
"Constant",
inputs=[],
outputs=["x"],
value=onnx.helper.make_tensor(
name="const_tensor_x",
data_type=onnx.TensorProto.FLOAT,
dims=x.shape,
vals=x.flatten().astype(float),
),
)
one_const_node = onnx.helper.make_node(
"Constant",
inputs=[],
outputs=["one"],
value=onnx.helper.make_tensor(
name="const_tensor_one",
data_type=onnx.TensorProto.INT64,
dims=(),
vals=[1],
),
)
zero_const_node = onnx.helper.make_node(
"Constant",
inputs=[],
outputs=["slice_start"],
value=onnx.helper.make_tensor(
name="const_tensor_zero",
data_type=onnx.TensorProto.INT64,
dims=(1,),
vals=[0],
),
)
axes_node = onnx.helper.make_node(
"Constant",
inputs=[],
outputs=["axes"],
value=onnx.helper.make_tensor(
name="const_tensor_axes",
data_type=onnx.TensorProto.INT64,
dims=(),
vals=[0],
),
)
add_node = onnx.helper.make_node(
"Add", inputs=["iter_count", "one"], outputs=["end"]
)
end_unsqueeze_node = onnx.helper.make_node(
"Unsqueeze", inputs=["end", "axes"], outputs=["slice_end"]
)
slice_node = onnx.helper.make_node(
"Slice", inputs=["x", "slice_start", "slice_end"], outputs=["slice_out"]
)
insert_node = onnx.helper.make_node(
"SequenceInsert", inputs=["sequence", "slice_out"], outputs=["seq_out"]
)
identity_node = onnx.helper.make_node(
"Identity", inputs=["cond_in"], outputs=["cond_out"]
)
loop_body = onnx.helper.make_graph(
[
identity_node,
optional_has_elem_node,
optional_is_none,
if_node,
x_const_node,
one_const_node,
zero_const_node,
add_node,
axes_node,
end_unsqueeze_node,
slice_node,
insert_node,
],
"loop_body",
[iter_count, cond_in, opt_in],
[cond_out, seq_out],
)
node = onnx.helper.make_node(
"Loop",
inputs=["trip_count", "cond", "opt_seq"],
outputs=["seq_res"],
body=loop_body,
)
trip_count = np.array(5).astype(np.int64)
cond = np.array(1).astype(bool)
seq_res = compute_loop_outputs(x, [x0], trip_count)
opt_seq_in: list[Any] = [x0]
expect(
node,
inputs=[trip_count, cond, opt_seq_in],
outputs=[seq_res],
name="test_loop16_seq_none",
opset_imports=[onnx.helper.make_opsetid("", 16)],
input_type_protos=[
onnx.helper.make_tensor_type_proto(
onnx.TensorProto.INT64, trip_count.shape
),
onnx.helper.make_tensor_type_proto(onnx.TensorProto.BOOL, cond.shape),
opt_in_tp,
],
)
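The example above calls `compute_loop_outputs`, which is not defined in this section. A minimal sketch of a helper with the behavior described in the comments (start from the optional initial sequence, then append one growing prefix of `x` per iteration) might look like the following. The name `compute_loop_outputs_sketch` and the handling of a missing sequence are assumptions.

```python
import numpy as np


def compute_loop_outputs_sketch(x, seq, trip_count):
    """Return seq followed by the prefixes x[:1], x[:2], ..., x[:trip_count]."""
    out = [] if seq is None else list(seq)  # assumed: a missing sequence starts empty
    for i in range(int(trip_count)):
        out.append(x[: i + 1])
    return out


x0 = np.array(0).astype(np.float32)
x = np.array([1, 2, 3, 4, 5]).astype(np.float32)
trip_count = np.array(5).astype(np.int64)

with_initial = compute_loop_outputs_sketch(x, [x0], trip_count)
without_initial = compute_loop_outputs_sketch(x, None, trip_count)
```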
Given a matrix, apply Lp-normalization along the provided axis.
The output is computed as: output = input / Lp_norm(input, axis).
When the Lp norm is zero (i.e., all elements along the axis are zero),
the output is defined to be zero to avoid division by zero.
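The rule above, including the zero-norm case, can be sketched in NumPy as follows. The function name `lp_normalize` is hypothetical, the defaults `axis=-1` and `p=2` follow the default example in this section, and the sketch only handles `p=1` and `p=2`.

```python
import numpy as np


def lp_normalize(x, axis=-1, p=2):
    """output = input / Lp_norm(input, axis), with 0 where the norm is 0."""
    if p == 1:
        norm = np.sum(np.abs(x), axis=axis, keepdims=True)
    elif p == 2:
        norm = np.sqrt(np.sum(x * x, axis=axis, keepdims=True))
    else:
        raise ValueError("this sketch only handles p=1 and p=2")
    safe_norm = np.where(norm == 0, 1, norm)  # avoid dividing by zero
    return np.where(norm == 0, 0, x / safe_norm).astype(x.dtype)


x = np.array([[3.0, 4.0], [0.0, 0.0]], dtype=np.float32)
y_l2 = lp_normalize(x, axis=1, p=2)
y_l1 = lp_normalize(x, axis=1, p=1)
```

Dividing by a substituted norm of 1 and masking afterwards avoids the runtime warning that a direct `x / norm` would raise for the all-zero row.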
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#LpNormalization-1">1</a>
node = onnx.helper.make_node("LpNormalization", inputs=["x"], outputs=["y"])
x = np.array(
[[[1.0, 2.0, 2.0], [3.0, 4.0, 0.0]], [[0.0, 5.0, 5.0], [6.0, 8.0, 0.0]]],
dtype=np.float32,
)
lp_norm_default = np.sqrt(np.sum(x**2, axis=-1, keepdims=True))
y = x / lp_norm_default
expect(node, inputs=[x], outputs=[y], name="test_lpnormalization_default")
node = onnx.helper.make_node(
"LpNormalization", inputs=["x"], outputs=["y"], axis=0, p=1
)
x = np.array([3.0, 4.0], dtype=np.float32)
l1_norm_axis_0 = np.sum(abs(x), axis=0, keepdims=True)
y = x / l1_norm_axis_0
expect(node, inputs=[x], outputs=[y], name="test_l1normalization_axis_0")
node = onnx.helper.make_node(
"LpNormalization", inputs=["x"], outputs=["y"], axis=1, p=1
)
x = np.array([[3.0, 4.0], [6.0, 8.0]], dtype=np.float32)
l1_norm_axis_1 = np.sum(abs(x), axis=1, keepdims=True)
y = x / l1_norm_axis_1
expect(node, inputs=[x], outputs=[y], name="test_l1normalization_axis_1")
node = onnx.helper.make_node(
"LpNormalization", inputs=["x"], outputs=["y"], axis=-1, p=1
)
x = np.array(
[[[1.0, 2.0, 2.0], [3.0, 4.0, 0.0]], [[0.0, 5.0, 5.0], [6.0, 8.0, 0.0]]],
dtype=np.float32,
)
l1_norm_axis_last = np.sum(abs(x), axis=-1, keepdims=True)
y = x / l1_norm_axis_last
expect(node, inputs=[x], outputs=[y], name="test_l1normalization_axis_last")
node = onnx.helper.make_node(
"LpNormalization", inputs=["x"], outputs=["y"], axis=0, p=2
)
x = np.array(
[[[1.0, 2.0, 2.0], [3.0, 4.0, 0.0]], [[0.0, 5.0, 5.0], [6.0, 8.0, 0.0]]],
dtype=np.float32,
)
l2_norm_axis_0 = np.sqrt(np.sum(x**2, axis=0, keepdims=True))
# When norm is 0, output is 0 (0/0 = 0)
y = np.where(l2_norm_axis_0 == 0, 0, x / l2_norm_axis_0)
expect(node, inputs=[x], outputs=[y], name="test_l2normalization_axis_0")
node = onnx.helper.make_node(
"LpNormalization", inputs=["x"], outputs=["y"], axis=1, p=2
)
x = np.array([[3.0, 4.0], [6.0, 8.0]], dtype=np.float32)
l2_norm_axis_1 = np.sqrt(np.sum(x**2, axis=1, keepdims=True))
y = x / l2_norm_axis_1
expect(node, inputs=[x], outputs=[y], name="test_l2normalization_axis_1")
LpPool consumes an input tensor X and applies Lp pooling across the tensor according to kernel sizes, stride sizes, and pad lengths. Lp pooling consists of computing the Lp norm over all values of a subset of the input tensor, selected according to the kernel size, and downsampling the data into the output tensor Y for further processing. The output spatial shape is computed as follows:
output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - {kernelSpatialShape}) / strides_spatial_shape[i] + 1)
or
output_spatial_shape[i] = ceil((input_spatial_shape[i] + pad_shape[i] - {kernelSpatialShape}) / strides_spatial_shape[i] + 1)
if ceil_mode is enabled. pad_shape[i] is the sum of pads along axis i.
auto_pad is a DEPRECATED attribute. If you are using it currently, the output spatial shape is computed as follows:
VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - {kernelSpatialShape} + 1) / strides_spatial_shape[i])
SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])
For SAME_UPPER or SAME_LOWER, the pad shape is computed as follows:
pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + {kernelSpatialShape} - input_spatial_shape[i]
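The shape formulas above can be checked against the shapes quoted in the examples with a few lines of Python. This sketch assumes that the `{kernelSpatialShape}` placeholder stands for the kernel size along axis `i` and that no dilation is used; the helper names are hypothetical. A naive 1-D Lp pooling without padding is included to show what "computing the Lp norm over a window" means.

```python
import math

import numpy as np


def lppool_output_size(in_size, kernel, stride=1, pad_total=0, ceil_mode=False):
    """Output size along one axis for explicit padding (pad_total = sum of pads)."""
    rounding = math.ceil if ceil_mode else math.floor
    return int(rounding((in_size + pad_total - kernel) / stride + 1))


def same_pad_total(in_size, kernel, stride=1):
    """Total padding along one axis for SAME_UPPER or SAME_LOWER."""
    out_size = math.ceil(in_size / stride)
    return (out_size - 1) * stride + kernel - in_size


def lppool_1d(x, kernel, stride=1, p=2):
    """Naive 1-D Lp pooling without padding: Lp norm of each window."""
    out_size = lppool_output_size(x.shape[0], kernel, stride)
    windows = [x[i * stride : i * stride + kernel] for i in range(out_size)]
    return np.array(
        [np.sum(np.abs(w) ** p) ** (1.0 / p) for w in windows], dtype=x.dtype
    )


pooled = lppool_1d(np.array([3.0, 4.0, 0.0, 5.0], dtype=np.float32), kernel=2, p=2)
```

For instance, `lppool_output_size(32, 2)` reproduces the 32 to 31 reduction quoted for the default 1-D example, and `same_pad_total(32, 2)` gives the total padding of 1 quoted for the `SAME_UPPER` and `SAME_LOWER` examples.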
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#LpPool-1">1</a>, <a href="Changelog.md#LpPool-2">2</a>, <a href="Changelog.md#LpPool-11">11</a>, <a href="Changelog.md#LpPool-18">18</a>
"""input_shape: [1, 3, 32]
output_shape: [1, 3, 31]
"""
p = 3
kernel_shape = [2]
strides = [1]
node = onnx.helper.make_node(
"LpPool",
inputs=["x"],
outputs=["y"],
kernel_shape=kernel_shape,
strides=strides,
p=p,
)
x = np.random.randn(1, 3, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
out_shape, _ = get_output_shape_explicit_padding(
pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "LPPOOL", p=p)
expect(node, inputs=[x], outputs=[y], name="test_lppool_1d_default")
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 31, 31]
"""
p = 4
node = onnx.helper.make_node(
"LpPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2, 2],
p=p,
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = (2, 2)
strides = (1, 1)
out_shape, _ = get_output_shape_explicit_padding(
pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "LPPOOL", p=p)
expect(node, inputs=[x], outputs=[y], name="test_lppool_2d_default")
"""input_shape: [1, 1, 4, 4]
output_shape: [1, 1, 2, 2]
"""
p = 2
node = onnx.helper.make_node(
"LpPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2, 2],
strides=[1, 1],
dilations=[2, 2],
p=p,
)
x = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
]
).astype(np.float32)
y = np.array(
[
[
[
[14.560219778561036, 16.24807680927192],
[21.633307652783937, 23.49468024894146],
]
]
]
).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_lppool_2d_dilations")
"""input_shape: [1, 3, 28, 28]
output_shape: [1, 3, 30, 30]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
p = 3
node = onnx.helper.make_node(
"LpPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[3, 3],
pads=[2, 2, 2, 2],
p=p,
)
x = np.random.randn(1, 3, 28, 28).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (3, 3)
strides = (1, 1)
pad_bottom = pad_top = pad_right = pad_left = 2
pads = [pad_top, pad_left, pad_bottom, pad_right]
out_shape, extra_pads = get_output_shape_explicit_padding(
pads, x_shape[2:], kernel_shape, strides
)
padded = np.pad(
x,
(
(0, 0),
(0, 0),
(extra_pads[0], extra_pads[2]),
(extra_pads[1], extra_pads[3]),
),
mode="constant",
constant_values=0,
)
y = pool(
padded,
x_shape,
kernel_shape,
strides,
out_shape,
"LPPOOL",
pads_required=extra_pads,
pads=pads,
p=p,
)
expect(node, inputs=[x], outputs=[y], name="test_lppool_2d_pads")
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 32, 32]
pad_shape: [1, 1] -> [1, 0, 1, 0] by axis
"""
p = 4
node = onnx.helper.make_node(
"LpPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2, 2],
auto_pad="SAME_LOWER",
p=p,
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (2, 2)
strides = (1, 1)
out_shape = get_output_shape_auto_pad(
"SAME_LOWER", x_shape[2:], kernel_shape, strides
)
pad_shape = get_pad_shape(
"SAME_LOWER", x_shape[2:], kernel_shape, strides, out_shape
)
pad_bottom = pad_shape[0] // 2
pad_top = pad_shape[0] - pad_bottom
pad_right = pad_shape[1] // 2
pad_left = pad_shape[1] - pad_right
padded = np.pad(
x,
((0, 0), (0, 0), (pad_top, pad_bottom), (pad_left, pad_right)),
mode="constant",
constant_values=0,
)
pads = [pad_top, pad_left, pad_bottom, pad_right]
y = pool(
padded, x_shape, kernel_shape, strides, out_shape, "LPPOOL", pads, pads, p=p
)
expect(node, inputs=[x], outputs=[y], name="test_lppool_2d_same_lower")
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 32, 32]
pad_shape: [1, 1] -> [0, 1, 0, 1] by axis
"""
p = 2
node = onnx.helper.make_node(
"LpPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2, 2],
auto_pad="SAME_UPPER",
p=p,
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (2, 2)
strides = (1, 1)
out_shape = get_output_shape_auto_pad(
"SAME_UPPER", x_shape[2:], kernel_shape, strides
)
pad_shape = get_pad_shape(
"SAME_UPPER", x_shape[2:], kernel_shape, strides, out_shape
)
pad_top = pad_shape[0] // 2
pad_bottom = pad_shape[0] - pad_top
pad_left = pad_shape[1] // 2
pad_right = pad_shape[1] - pad_left
padded = np.pad(
x,
((0, 0), (0, 0), (pad_top, pad_bottom), (pad_left, pad_right)),
mode="constant",
constant_values=0,
)
pads = [pad_top, pad_left, pad_bottom, pad_right]
y = pool(
padded, x_shape, kernel_shape, strides, out_shape, "LPPOOL", pads, pads, p=p
)
expect(node, inputs=[x], outputs=[y], name="test_lppool_2d_same_upper")
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 10, 10]
"""
p = 2
node = onnx.helper.make_node(
"LpPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[5, 5],
strides=[3, 3],
p=p,
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = (5, 5)
strides = (3, 3)
out_shape, _ = get_output_shape_explicit_padding(
pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "LPPOOL", p=p)
expect(node, inputs=[x], outputs=[y], name="test_lppool_2d_strides")
"""input_shape: [1, 3, 32, 32, 32]
output_shape: [1, 3, 31, 31, 31]
"""
p = 3
node = onnx.helper.make_node(
"LpPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2, 2, 2],
p=p,
)
x = np.random.randn(1, 3, 32, 32, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = [2, 2, 2]
strides = [1, 1, 1]
out_shape, _ = get_output_shape_explicit_padding(
pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "LPPOOL", p=p)
expect(node, inputs=[x], outputs=[y], name="test_lppool_3d_default")
Matrix product that behaves like numpy.matmul.
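Since the operator follows `numpy.matmul`, the output shape for each combination of input ranks can be checked directly in NumPy. The short sketch below uses the same input shapes as the test cases in this section and records the resulting shapes.

```python
import numpy as np

shapes = {
    # plain 2-D matrix product
    "2d": np.matmul(np.ones((3, 4)), np.ones((4, 3))).shape,
    # leading (batch) dimensions are broadcast against each other
    "bcast": np.matmul(np.ones((3, 1, 3, 4)), np.ones((1, 2, 4, 2))).shape,
    # a 1-D first argument is treated as a row vector, then the added axis is removed
    "1d_3d": np.matmul(np.ones(4), np.ones((2, 4, 1))).shape,
    # a 1-D second argument is treated as a column vector, then the added axis is removed
    "4d_1d": np.matmul(np.ones((1, 2, 4, 3)), np.ones(3)).shape,
    # two 1-D arguments give a scalar (inner product)
    "1d_1d": np.matmul(np.ones(3), np.ones(3)).shape,
}
```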
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#MatMul-1">1</a>, <a href="Changelog.md#MatMul-9">9</a>
node = onnx.helper.make_node(
"MatMul",
inputs=["a", "b"],
outputs=["c"],
)
# 2d
a = np.random.randn(3, 4).astype(np.float32)
b = np.random.randn(4, 3).astype(np.float32)
c = np.matmul(a, b)
expect(node, inputs=[a, b], outputs=[c], name="test_matmul_2d")
# 3d
a = np.random.randn(2, 3, 4).astype(np.float32)
b = np.random.randn(2, 4, 3).astype(np.float32)
c = np.matmul(a, b)
expect(node, inputs=[a, b], outputs=[c], name="test_matmul_3d")
# 4d
a = np.random.randn(1, 2, 3, 4).astype(np.float32)
b = np.random.randn(1, 2, 4, 3).astype(np.float32)
c = np.matmul(a, b)
expect(node, inputs=[a, b], outputs=[c], name="test_matmul_4d")
# broadcasting
a = np.random.randn(3, 1, 3, 4).astype(np.float32)
b = np.random.randn(1, 2, 4, 2).astype(np.float32)
c = np.matmul(a, b)
expect(node, inputs=[a, b], outputs=[c], name="test_matmul_bcast")
# 1d + 3d
a = np.random.randn(4).astype(np.float32)
b = np.random.randn(2, 4, 1).astype(np.float32)
c = np.matmul(a, b)
expect(node, inputs=[a, b], outputs=[c], name="test_matmul_1d_3d")
# 4d + 1d
a = np.random.randn(1, 2, 4, 3).astype(np.float32)
b = np.random.randn(3).astype(np.float32)
c = np.matmul(a, b)
expect(node, inputs=[a, b], outputs=[c], name="test_matmul_4d_1d")
# 1d + 1d
a = np.random.randn(3).astype(np.float32)
b = np.random.randn(3).astype(np.float32)
c = np.matmul(a, b)
expect(node, inputs=[a, b], outputs=[c], name="test_matmul_1d_1d")
Matrix product that behaves like numpy.matmul. The product MUST never overflow. The accumulation may overflow if and only if it is performed in 32 bits.
This version of the operator has been available since version 10 of the default ONNX operator set.
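A minimal NumPy sketch of the arithmetic, assuming per-tensor (scalar or one-element) zero points: both operands are widened to int32, the zero points are subtracted, and the product is accumulated in int32. The function name `matmul_integer` is hypothetical and is not part of the ONNX API.

```python
import numpy as np


def matmul_integer(A, B, a_zero_point=0, b_zero_point=0):
    """Sketch: subtract zero points in int32, then multiply and accumulate in int32."""
    a32 = A.astype(np.int32) - np.asarray(a_zero_point, dtype=np.int32)
    b32 = B.astype(np.int32) - np.asarray(b_zero_point, dtype=np.int32)
    return np.matmul(a32, b32)


A = np.array([[11, 7, 3], [10, 6, 2], [9, 5, 1], [8, 4, 0]], dtype=np.uint8)
B = np.array([[1, 4], [2, 5], [3, 6]], dtype=np.uint8)
Y = matmul_integer(A, B, np.array([12], dtype=np.uint8), np.array([0], dtype=np.uint8))
```

Widening before the subtraction matters: subtracting 12 from the uint8 value 11 directly would wrap around instead of producing -1.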
node = onnx.helper.make_node(
"MatMulInteger",
inputs=["A", "B", "a_zero_point", "b_zero_point"],
outputs=["Y"],
)
A = np.array(
[
[11, 7, 3],
[10, 6, 2],
[9, 5, 1],
[8, 4, 0],
],
dtype=np.uint8,
)
a_zero_point = np.array([12], dtype=np.uint8)
B = np.array(
[
[1, 4],
[2, 5],
[3, 6],
],
dtype=np.uint8,
)
b_zero_point = np.array([0], dtype=np.uint8)
output = np.array(
[
[-38, -83],
[-44, -98],
[-50, -113],
[-56, -128],
],
dtype=np.int32,
)
expect(
node,
inputs=[A, B, a_zero_point, b_zero_point],
outputs=[output],
name="test_matmulinteger",
)
Element-wise max of each of the input tensors (with Numpy-style broadcasting support). All inputs and outputs must have the same data type. This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Max-1">1</a>, <a href="Changelog.md#Max-6">6</a>, <a href="Changelog.md#Max-8">8</a>, <a href="Changelog.md#Max-12">12</a>
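The variadic, broadcasting behaviour can be sketched in NumPy by folding `np.maximum` over the inputs. This is only an illustration under the assumption that all inputs share a data type; `elementwise_max` is a hypothetical helper name.

```python
import functools

import numpy as np


def elementwise_max(*tensors):
    """Fold np.maximum over any number of broadcastable inputs."""
    return functools.reduce(np.maximum, tensors)


a = np.array([[1.0], [5.0]], dtype=np.float32)   # shape (2, 1)
b = np.array([2.0, 3.0, 4.0], dtype=np.float32)  # shape (3,)
result = elementwise_max(a, b)                   # broadcast to shape (2, 3)
```

With a single input, the fold returns that input unchanged, which mirrors the one-input test case.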
data_0 = np.array([3, 2, 1]).astype(np.float32)
data_1 = np.array([1, 4, 4]).astype(np.float32)
data_2 = np.array([2, 5, 3]).astype(np.float32)
result = np.array([3, 5, 4]).astype(np.float32)
node = onnx.helper.make_node(
"Max",
inputs=["data_0", "data_1", "data_2"],
outputs=["result"],
)
expect(
node,
inputs=[data_0, data_1, data_2],
outputs=[result],
name="test_max_example",
)
node = onnx.helper.make_node(
"Max",
inputs=["data_0"],
outputs=["result"],
)
expect(node, inputs=[data_0], outputs=[data_0], name="test_max_one_input")
result = np.maximum(data_0, data_1)
node = onnx.helper.make_node(
"Max",
inputs=["data_0", "data_1"],
outputs=["result"],
)
expect(
node, inputs=[data_0, data_1], outputs=[result], name="test_max_two_inputs"
)
for op_dtype in all_numeric_dtypes:
data_0 = np.array([3, 2, 1]).astype(op_dtype)
data_1 = np.array([1, 4, 4]).astype(op_dtype)
result = np.array([3, 4, 4]).astype(op_dtype)
node = onnx.helper.make_node(
"Max",
inputs=["data_0", "data_1"],
outputs=["result"],
)
expect(
node,
inputs=[data_0, data_1],
outputs=[result],
name=f"test_max_{np.dtype(op_dtype).name}",
)
MaxPool consumes an input tensor X and applies max pooling across the tensor according to kernel sizes, stride sizes, and pad lengths. Max pooling consists of computing the max over all values of a subset of the input tensor according to the kernel size and downsampling the data into the output tensor Y for further processing. The output spatial shape is calculated differently depending on whether explicit padding (the pads attribute) or auto padding (the auto_pad attribute) is used. With explicit padding (https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html?highlight=maxpool#torch.nn.MaxPool2d):
output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - dilation[i] * (kernel_shape[i] - 1) - 1) / strides_spatial_shape[i] + 1)
or
output_spatial_shape[i] = ceil((input_spatial_shape[i] + pad_shape[i] - dilation[i] * (kernel_shape[i] - 1) - 1) / strides_spatial_shape[i] + 1)
if ceil_mode is enabled. pad_shape[i] is the sum of pads along axis i. Sliding windows that would start in the right padded region are ignored.
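The explicit-padding formulas can be sketched for a single spatial axis as follows. The function name `pool_output_size` and its parameter names are hypothetical. The final adjustment is one possible reading of the rule that windows starting in the right padded region are ignored, and is an assumption rather than a quotation of the reference implementation.

```python
def pool_output_size(in_size, kernel, stride=1, pad_begin=0, pad_end=0,
                     dilation=1, ceil_mode=False):
    """Output length along one spatial axis when explicit padding is used."""
    effective_kernel = dilation * (kernel - 1) + 1
    span = in_size + pad_begin + pad_end - effective_kernel
    if ceil_mode:
        out = -(-span // stride) + 1   # integer ceil(span / stride) + 1
    else:
        out = span // stride + 1       # integer floor(span / stride) + 1
    # Assumed handling: drop a trailing window that would start in the right padding.
    if ceil_mode and (out - 1) * stride >= in_size + pad_begin:
        out -= 1
    return out


default_1d = pool_output_size(32, kernel=2)
strided = pool_output_size(32, kernel=5, stride=3)
ceil_2d = pool_output_size(4, kernel=3, stride=2, ceil_mode=True)
ceil_reduced = pool_output_size(2, kernel=1, stride=2, ceil_mode=True)
padded = pool_output_size(28, kernel=3, pad_begin=2, pad_end=2)
dilated = pool_output_size(4, kernel=2, dilation=2)
```

The sample calls use the input sizes, kernels, and strides of the test cases in this section, so the results can be compared against the output shapes quoted in their docstrings.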
auto_pad is a DEPRECATED attribute. If you are using it currently, the output spatial shape is as follows when ceil_mode is enabled:
VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) + 1) / strides_spatial_shape[i])
SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])
or when ceil_mode is disabled (https://www.tensorflow.org/api_docs/python/tf/keras/layers/AveragePooling2D):
VALID: output_spatial_shape[i] = floor((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1)) / strides_spatial_shape[i]) + 1
SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = floor((input_spatial_shape[i] - 1) / strides_spatial_shape[i]) + 1
The pad shape is as follows for SAME_UPPER or SAME_LOWER:
pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) - input_spatial_shape[i]
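For SAME_UPPER and SAME_LOWER with ceil_mode disabled, the output size and pad split along one axis can be sketched as below. The helper name `same_auto_pad` is hypothetical, and clamping the total padding at zero is a defensive assumption that is not part of the formula above.

```python
def same_auto_pad(in_size, kernel, stride=1, dilation=1, upper=True):
    """Return (output_size, pad_begin, pad_end) for SAME_UPPER / SAME_LOWER."""
    out = (in_size - 1) // stride + 1
    effective_kernel = (kernel - 1) * dilation + 1
    # Assumption: never return negative padding.
    pad_total = max((out - 1) * stride + effective_kernel - in_size, 0)
    small = pad_total // 2
    large = pad_total - small
    # SAME_UPPER puts the odd extra cell at the end, SAME_LOWER at the beginning.
    return (out, small, large) if upper else (out, large, small)


upper_32 = same_auto_pad(32, kernel=2, upper=True)
lower_32 = same_auto_pad(32, kernel=2, upper=False)
upper_5 = same_auto_pad(5, kernel=3, stride=2, upper=True)
```

The three calls correspond to the same_upper, same_lower, and precomputed_same_upper test cases in this section, whose docstrings list the per-axis pads.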
The output of each pooling window is the maximum of the elements in the window, excluding padding.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#MaxPool-1">1</a>, <a href="Changelog.md#MaxPool-8">8</a>, <a href="Changelog.md#MaxPool-10">10</a>, <a href="Changelog.md#MaxPool-11">11</a>, <a href="Changelog.md#MaxPool-12">12</a>
"""input_shape: [1, 3, 32]
output_shape: [1, 3, 31]
"""
node = onnx.helper.make_node(
"MaxPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2],
)
x = np.random.randn(1, 3, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = [2]
strides = [1]
out_shape, _ = get_output_shape_explicit_padding(
pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "MAX")
expect(node, inputs=[x], outputs=[y], name="test_maxpool_1d_default")
"""input_shape: [1, 1, 4, 4]
output_shape: [1, 1, 2, 2]
"""
node = onnx.helper.make_node(
"MaxPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[3, 3],
strides=[2, 2],
ceil_mode=True,
)
x = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
]
).astype(np.float32)
y = np.array([[[[11, 12], [15, 16]]]]).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_ceil")
"""input_shape: [1, 1, 2, 2]
output_shape: [1, 1, 1, 1]
"""
node = onnx.helper.make_node(
"MaxPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[1, 1],
strides=[2, 2],
ceil_mode=True,
)
x = np.array([[[[1, 2], [3, 4]]]]).astype(np.float32)
y = np.array([[[[1]]]]).astype(np.float32)
expect(
node,
inputs=[x],
outputs=[y],
name="test_maxpool_2d_ceil_output_size_reduce_by_one",
)
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 31, 31]
"""
node = onnx.helper.make_node(
"MaxPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2, 2],
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = (2, 2)
strides = (1, 1)
out_shape, _ = get_output_shape_explicit_padding(
pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "MAX")
expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_default")
"""input_shape: [1, 1, 4, 4]
output_shape: [1, 1, 2, 2]
"""
node = onnx.helper.make_node(
"MaxPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2, 2],
strides=[1, 1],
dilations=[2, 2],
)
x = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
]
).astype(np.float32)
y = np.array([[[[11, 12], [15, 16]]]]).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_dilations")
"""input_shape: [1, 3, 28, 28]
output_shape: [1, 3, 30, 30]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
"MaxPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[3, 3],
pads=[2, 2, 2, 2],
)
x = np.random.randn(1, 3, 28, 28).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (3, 3)
strides = (1, 1)
pad_bottom = pad_top = pad_right = pad_left = 2
pads = [pad_top, pad_left, pad_bottom, pad_right]
out_shape, extra_pads = get_output_shape_explicit_padding(
pads, x_shape[2:], kernel_shape, strides
)
padded = np.pad(
x,
((0, 0), (0, 0), (pad_top, pad_bottom), (pad_left, pad_right)),
mode="constant",
constant_values=np.nan,
)
y = pool(
padded,
x_shape,
kernel_shape,
strides,
out_shape,
"MAX",
pads_required=extra_pads,
pads=pads,
)
expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_pads")
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 5, 5]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
"MaxPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[5, 5],
pads=[2, 2, 2, 2],
)
x = np.array(
[
[
[
[1, 2, 3, 4, 5],
[6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25],
]
]
]
).astype(np.float32)
y = np.array(
[
[
[
[13, 14, 15, 15, 15],
[18, 19, 20, 20, 20],
[23, 24, 25, 25, 25],
[23, 24, 25, 25, 25],
[23, 24, 25, 25, 25],
]
]
]
).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_precomputed_pads")
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 3, 3]
pad_shape: [2, 2] -> [1, 1, 1, 1] by axis
"""
node = onnx.helper.make_node(
"MaxPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[3, 3],
strides=[2, 2],
auto_pad="SAME_UPPER",
)
x = np.array(
[
[
[
[1, 2, 3, 4, 5],
[6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25],
]
]
]
).astype(np.float32)
y = np.array([[[[7, 9, 10], [17, 19, 20], [22, 24, 25]]]]).astype(np.float32)
expect(
node, inputs=[x], outputs=[y], name="test_maxpool_2d_precomputed_same_upper"
)
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 2, 2]
"""
node = onnx.helper.make_node(
"MaxPool", inputs=["x"], outputs=["y"], kernel_shape=[2, 2], strides=[2, 2]
)
x = np.array(
[
[
[
[1, 2, 3, 4, 5],
[6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25],
]
]
]
).astype(np.float32)
y = np.array([[[[7, 9], [17, 19]]]]).astype(np.float32)
expect(
node, inputs=[x], outputs=[y], name="test_maxpool_2d_precomputed_strides"
)
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 32, 32]
pad_shape: [1, 1] -> [1, 0, 1, 0] by axis
"""
node = onnx.helper.make_node(
"MaxPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2, 2],
auto_pad="SAME_LOWER",
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (2, 2)
strides = (1, 1)
out_shape = get_output_shape_auto_pad(
"SAME_LOWER", x_shape[2:], kernel_shape, strides
)
pad_shape = get_pad_shape(
"SAME_LOWER", x_shape[2:], kernel_shape, strides, out_shape
)
pad_bottom = pad_shape[0] // 2
pad_top = pad_shape[0] - pad_bottom
pad_right = pad_shape[1] // 2
pad_left = pad_shape[1] - pad_right
padded = np.pad(
x,
((0, 0), (0, 0), (pad_top, pad_bottom), (pad_left, pad_right)),
mode="constant",
constant_values=np.nan,
)
pads = [pad_top, pad_left, pad_bottom, pad_right]
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "MAX", pads, pads)
expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_same_lower")
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 32, 32]
pad_shape: [1, 1] -> [0, 1, 0, 1] by axis
"""
node = onnx.helper.make_node(
"MaxPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2, 2],
auto_pad="SAME_UPPER",
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (2, 2)
strides = (1, 1)
out_shape = get_output_shape_auto_pad(
"SAME_UPPER", x_shape[2:], kernel_shape, strides
)
pad_shape = get_pad_shape(
"SAME_UPPER", x_shape[2:], kernel_shape, strides, out_shape
)
pad_top = pad_shape[0] // 2
pad_bottom = pad_shape[0] - pad_top
pad_left = pad_shape[1] // 2
pad_right = pad_shape[1] - pad_left
padded = np.pad(
x,
((0, 0), (0, 0), (pad_top, pad_bottom), (pad_left, pad_right)),
mode="constant",
constant_values=np.nan,
)
pads = [pad_top, pad_left, pad_bottom, pad_right]
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "MAX", pads, pads)
expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_same_upper")
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 10, 10]
"""
node = onnx.helper.make_node(
"MaxPool", inputs=["x"], outputs=["y"], kernel_shape=[5, 5], strides=[3, 3]
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = (5, 5)
strides = (3, 3)
out_shape, pads = get_output_shape_explicit_padding(
pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "MAX")
expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_strides")
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 5, 5]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
"MaxPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[5, 5],
pads=[2, 2, 2, 2],
)
x = np.array(
[
[
[
[1, 2, 3, 4, 5],
[6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25],
]
]
]
).astype(np.uint8)
y = np.array(
[
[
[
[13, 14, 15, 15, 15],
[18, 19, 20, 20, 20],
[23, 24, 25, 25, 25],
[23, 24, 25, 25, 25],
[23, 24, 25, 25, 25],
]
]
]
).astype(np.uint8)
expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_uint8")
"""input_shape: [1, 3, 32, 32, 32]
output_shape: [1, 3, 31, 31, 31]
"""
node = onnx.helper.make_node(
"MaxPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2, 2, 2],
)
x = np.random.randn(1, 3, 32, 32, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = [2, 2, 2]
strides = [1, 1, 1]
out_shape, _ = get_output_shape_explicit_padding(
pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "MAX")
expect(node, inputs=[x], outputs=[y], name="test_maxpool_3d_default")
"""input_shape: [1, 1, 4, 4, 4]
output_shape: [1, 1, 2, 2, 2]
"""
node = onnx.helper.make_node(
"MaxPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2, 2, 2],
strides=[1, 1, 1],
dilations=[2, 2, 2],
)
x = np.array(
[
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
],
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
],
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
],
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
],
]
]
]
).astype(np.float32)
y = np.array([[[[[11, 12], [15, 16]], [[11, 12], [15, 16]]]]]).astype(
np.float32
)
expect(node, inputs=[x], outputs=[y], name="test_maxpool_3d_dilations")
"""input_shape: [1, 1, 4, 4, 4]
output_shape: [1, 1, 2, 2, 2]
"""
dilations = [2, 2, 2]
kernel_shape = [2, 2, 2]
strides = [1, 1, 1]
ceil_mode = False
node = onnx.helper.make_node(
"MaxPool",
inputs=["x"],
outputs=["y"],
kernel_shape=[2, 2, 2],
strides=[1, 1, 1],
dilations=dilations,
)
x = np.array(
[
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
],
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
],
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
],
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
],
]
]
]
).astype(np.float32)
x_shape = x.shape[2:]
out_shape, pads = get_output_shape_explicit_padding(
None, x_shape, kernel_shape, strides, dilations, ceil_mode=ceil_mode
)
padded = x
y = pool(
padded,
(1, 1, *x_shape),
kernel_shape,
strides,
out_shape,
"MAX",
pads_required=pads,
pads=None,
dilations=dilations,
)
expect(
node, inputs=[x], outputs=[y], name="test_maxpool_3d_dilations_use_ref_impl"
)
x_shape = (32, 32, 32)
dilations = (2, 2, 2)
kernel_shape = (5, 5, 5)
strides = (3, 3, 3)
ceil_mode = True
node = onnx.helper.make_node(
"MaxPool",
inputs=["x"],
outputs=["y"],
kernel_shape=kernel_shape,
strides=strides,
dilations=dilations,
ceil_mode=ceil_mode,
)
x = np.random.randn(1, 1, *x_shape).astype(np.float32)
out_shape, pads = get_output_shape_explicit_padding(
None, x_shape, kernel_shape, strides, dilations, ceil_mode=ceil_mode
)
padded = np.pad(
x,
(
(0, 0),
(0, 0),
(pads[0], pads[3]),
(pads[1], pads[4]),
(pads[2], pads[5]),
),
mode="constant",
constant_values=0,
)
y = pool(
padded,
(1, 1, *x_shape),
kernel_shape,
strides,
out_shape,
"MAX",
pads_required=pads,
pads=None,
dilations=dilations,
)
expect(
node,
inputs=[x],
outputs=[y],
name="test_maxpool_3d_dilations_use_ref_impl_large",
)
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 5, 5]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
"MaxPool",
inputs=["x"],
outputs=["y", "z"],
kernel_shape=[5, 5],
pads=[2, 2, 2, 2],
)
x = np.array(
[
[
[
[1, 2, 3, 4, 5],
[6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25],
]
]
]
).astype(np.float32)
y = np.array(
[
[
[
[13, 14, 15, 15, 15],
[18, 19, 20, 20, 20],
[23, 24, 25, 25, 25],
[23, 24, 25, 25, 25],
[23, 24, 25, 25, 25],
]
]
]
).astype(np.float32)
z = np.array(
[
[
[
[12, 13, 14, 14, 14],
[17, 18, 19, 19, 19],
[22, 23, 24, 24, 24],
[22, 23, 24, 24, 24],
[22, 23, 24, 24, 24],
]
]
]
).astype(np.int64)
expect(
node,
inputs=[x],
outputs=[y, z],
name="test_maxpool_with_argmax_2d_precomputed_pads",
)
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 2, 2]
"""
node = onnx.helper.make_node(
"MaxPool",
inputs=["x"],
outputs=["y", "z"],
kernel_shape=[2, 2],
strides=[2, 2],
storage_order=1,
)
x = np.array(
[
[
[
[1, 2, 3, 4, 5],
[6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25],
]
]
]
).astype(np.float32)
y = np.array([[[[7, 9], [17, 19]]]]).astype(np.float32)
z = np.array([[[[6, 16], [8, 18]]]]).astype(np.int64)
expect(
node,
inputs=[x],
outputs=[y, z],
name="test_maxpool_with_argmax_2d_precomputed_strides",
)
ROI max pool consumes an input tensor X and regions of interest (RoIs) and applies max pooling across each RoI to produce a 4-D output tensor of shape (num_rois, channels, pooled_shape[0], pooled_shape[1]).
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#MaxRoiPool-1">1</a>
MaxUnpool essentially computes the partial inverse of the MaxPool op. The input information to this op is typically the output information from a MaxPool op. The first input tensor X is the tensor that needs to be unpooled, which is typically the pooled tensor (first output) from MaxPool. The second input tensor, I, contains the indices to the (locally maximal) elements corresponding to the elements in the first input tensor X. Input tensor I is typically the second output of the MaxPool op. The third (optional) input is a tensor that specifies the output size of the unpooling operation.
MaxUnpool is intended to compute a 'partial' inverse of the MaxPool op. 'Partial' because all the non-maximal values from the original input to MaxPool are set to zero in the output of the MaxUnpool op. Pooling the result of an unpooling operation should give back the original input to the unpooling op.
MaxUnpool can produce the same output size for several input sizes, which makes the unpooling op ambiguous. The third input argument, output_size, is meant to disambiguate the op and produce an output tensor of known/predictable size.
In addition to the inputs, MaxUnpool takes three attributes, namely kernel_shape, strides, and pads, which define the exact unpooling op. The attributes typically have the same values as the corresponding pooling op that the unpooling op is trying to invert.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#MaxUnpool-9">9</a>, <a href="Changelog.md#MaxUnpool-11">11</a>
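A minimal sketch of the unpooling step without the optional third input, under two assumptions: the index tensor holds flat indices into the whole output tensor, and the default output size along each spatial axis is (input - 1) * stride + kernel - pads. The function name `max_unpool` is hypothetical.

```python
import numpy as np


def max_unpool(X, I, kernel_shape, strides=None, pads=None):
    """Scatter the pooled values X to the positions in I; all other cells are zero."""
    n_spatial = len(kernel_shape)
    strides = strides or [1] * n_spatial
    pads = pads or [0] * (2 * n_spatial)
    out_spatial = [
        (X.shape[2 + d] - 1) * strides[d] + kernel_shape[d]
        - pads[d] - pads[d + n_spatial]
        for d in range(n_spatial)
    ]
    out = np.zeros(tuple(X.shape[:2]) + tuple(out_spatial), dtype=X.dtype)
    # np.put indexes the flattened target array and writes in place.
    np.put(out, I.ravel(), X.ravel())
    return out


xT = np.array([[[[1, 2], [3, 4]]]], dtype=np.float32)
xI = np.array([[[[5, 7], [13, 15]]]], dtype=np.int64)
y = max_unpool(xT, xI, kernel_shape=[2, 2], strides=[2, 2])
```

The call uses the same inputs as the test case that omits the output shape, so `y` can be compared with the expected tensor listed there.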
node = onnx.helper.make_node(
"MaxUnpool",
inputs=["xT", "xI", "output_shape"],
outputs=["y"],
kernel_shape=[2, 2],
strides=[2, 2],
)
xT = np.array([[[[5, 6], [7, 8]]]], dtype=np.float32)
xI = np.array([[[[5, 7], [13, 15]]]], dtype=np.int64)
output_shape = np.array((1, 1, 5, 5), dtype=np.int64)
y = np.array(
[
[
[
[0, 0, 0, 0, 0],
[0, 5, 0, 6, 0],
[0, 0, 0, 0, 0],
[0, 7, 0, 8, 0],
[0, 0, 0, 0, 0],
]
]
],
dtype=np.float32,
)
expect(
node,
inputs=[xT, xI, output_shape],
outputs=[y],
name="test_maxunpool_export_with_output_shape",
)
node = onnx.helper.make_node(
"MaxUnpool",
inputs=["xT", "xI"],
outputs=["y"],
kernel_shape=[2, 2],
strides=[2, 2],
)
xT = np.array([[[[1, 2], [3, 4]]]], dtype=np.float32)
xI = np.array([[[[5, 7], [13, 15]]]], dtype=np.int64)
y = np.array(
[[[[0, 0, 0, 0], [0, 1, 0, 2], [0, 0, 0, 0], [0, 3, 0, 4]]]],
dtype=np.float32,
)
expect(
node,
inputs=[xT, xI],
outputs=[y],
name="test_maxunpool_export_without_output_shape",
)
Element-wise mean of each of the input tensors (with Numpy-style broadcasting support). All inputs and outputs must have the same data type. This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Mean-1">1</a>, <a href="Changelog.md#Mean-6">6</a>, <a href="Changelog.md#Mean-8">8</a>
data_0 = np.array([3, 0, 2]).astype(np.float32)
data_1 = np.array([1, 3, 4]).astype(np.float32)
data_2 = np.array([2, 6, 6]).astype(np.float32)
result = np.array([2, 3, 4]).astype(np.float32)
node = onnx.helper.make_node(
"Mean",
inputs=["data_0", "data_1", "data_2"],
outputs=["result"],
)
expect(
node,
inputs=[data_0, data_1, data_2],
outputs=[result],
name="test_mean_example",
)
node = onnx.helper.make_node(
"Mean",
inputs=["data_0"],
outputs=["result"],
)
expect(node, inputs=[data_0], outputs=[data_0], name="test_mean_one_input")
result = np.divide(np.add(data_0, data_1), 2.0)
node = onnx.helper.make_node(
"Mean",
inputs=["data_0", "data_1"],
outputs=["result"],
)
expect(
node, inputs=[data_0, data_1], outputs=[result], name="test_mean_two_inputs"
)
A MeanVarianceNormalization Function: Perform mean variance normalization
on the input tensor X using the formula: (X - E[X]) / sqrt(E[(X - E[X])^2])
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#MeanVarianceNormalization-9">9</a>
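The formula can be written directly in NumPy. This sketch assumes reduction over axes (0, 2, 3) and a small epsilon of 1e-9 in the denominator, matching the test case for this operator. The function name and the random input are illustrative only.

```python
import numpy as np


def mean_variance_normalization(X, axes=(0, 2, 3), eps=1e-9):
    """(X - E[X]) / sqrt(E[(X - E[X])^2]), reduced over the given axes."""
    mean = np.mean(X, axis=axes, keepdims=True)
    variance = np.mean((X - mean) ** 2, axis=axes, keepdims=True)
    return (X - mean) / (np.sqrt(variance) + eps)


rng = np.random.default_rng(0)
X = rng.random((3, 3, 4, 2)).astype(np.float32)
Y = mean_variance_normalization(X)

# After normalization each channel (axis 1) has roughly zero mean and unit variance.
channel_mean = Y.mean(axis=(0, 2, 3))
channel_std = Y.std(axis=(0, 2, 3))
```

Computing the variance as E[(X - E[X])^2] is algebraically the same as the E[X^2] - E[X]^2 form used in the test case.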
node = onnx.helper.make_node(
"MeanVarianceNormalization", inputs=["X"], outputs=["Y"]
)
input_data = np.array(
[
[
[[0.8439683], [0.5665144], [0.05836735]],
[[0.02916367], [0.12964272], [0.5060197]],
[[0.79538304], [0.9411346], [0.9546573]],
],
[
[[0.17730942], [0.46192095], [0.26480448]],
[[0.6746842], [0.01665257], [0.62473077]],
[[0.9240844], [0.9722341], [0.11965699]],
],
[
[[0.41356155], [0.9129373], [0.59330076]],
[[0.81929934], [0.7862604], [0.11799799]],
[[0.69248444], [0.54119414], [0.07513223]],
],
],
dtype=np.float32,
)
# Calculate expected output data
data_mean = np.mean(input_data, axis=(0, 2, 3), keepdims=1)
data_mean_squared = np.power(data_mean, 2)
data_squared = np.power(input_data, 2)
data_squared_mean = np.mean(data_squared, axis=(0, 2, 3), keepdims=1)
std = np.sqrt(data_squared_mean - data_mean_squared)
expected_output = (input_data - data_mean) / (std + 1e-9)
expect(node, inputs=[input_data], outputs=[expected_output], name="test_mvn")
Generate a MelWeightMatrix that can be used to re-weight a Tensor containing linearly sampled frequency spectra (from DFT or STFT) into num_mel_bins frequency information based on the [lower_edge_hertz, upper_edge_hertz] range on the mel scale. This function defines the mel scale in terms of a frequency in hertz according to the following formula:
mel(f) = 2595 * log10(1 + f/700)
In the returned matrix, all the triangles (filterbanks) have a peak value of 1.0.
The returned MelWeightMatrix can be used to right-multiply a spectrogram S of shape [frames, num_spectrogram_bins] of linear scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram" M of shape [frames, num_mel_bins].
This version of the operator has been available since version 17 of the default ONNX operator set.
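The mel formula and its inverse, together with the right-multiplication described above, can be sketched as follows. The helper names `hz_to_mel` and `mel_to_hz` are hypothetical, and the zero-filled matrix `W` is only a stand-in with the shape of the operator output, not real filterbank weights.

```python
import numpy as np


def hz_to_mel(f):
    """mel(f) = 2595 * log10(1 + f / 700)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=np.float64) / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (np.power(10.0, np.asarray(m, dtype=np.float64) / 2595.0) - 1.0)


frames, num_spectrogram_bins, num_mel_bins = 5, 9, 8
S = np.ones((frames, num_spectrogram_bins), dtype=np.float32)         # stand-in spectrogram
W = np.zeros((num_spectrogram_bins, num_mel_bins), dtype=np.float32)  # stand-in weight matrix
M = S @ W  # "mel spectrogram" of shape [frames, num_mel_bins]
```

With dft_length = 16, as in the test case, num_spectrogram_bins is 16 // 2 + 1 = 9, which is where the 9 rows of the weight matrix come from.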
node = onnx.helper.make_node(
"MelWeightMatrix",
inputs=[
"num_mel_bins",
"dft_length",
"sample_rate",
"lower_edge_hertz",
"upper_edge_hertz",
],
outputs=["output"],
)
num_mel_bins = np.int32(8)
dft_length = np.int32(16)
sample_rate = np.int32(8192)
lower_edge_hertz = np.float32(0)
upper_edge_hertz = np.float32(8192 / 2)
num_spectrogram_bins = dft_length // 2 + 1
frequency_bins = np.arange(0, num_mel_bins + 2)
low_frequency_mel = 2595 * np.log10(1 + lower_edge_hertz / 700)
high_frequency_mel = 2595 * np.log10(1 + upper_edge_hertz / 700)
mel_step = (high_frequency_mel - low_frequency_mel) / frequency_bins.shape[0]
frequency_bins = frequency_bins * mel_step + low_frequency_mel
frequency_bins = 700 * (np.power(10, (frequency_bins / 2595)) - 1)
frequency_bins = ((dft_length + 1) * frequency_bins) // sample_rate
frequency_bins = frequency_bins.astype(int)
output = np.zeros((num_spectrogram_bins, num_mel_bins))
output.flags.writeable = True
for i in range(num_mel_bins):
lower_frequency_value = frequency_bins[i] # left
center_frequency_point = frequency_bins[i + 1] # center
higher_frequency_point = frequency_bins[i + 2] # right
low_to_center = center_frequency_point - lower_frequency_value
if low_to_center == 0:
output[center_frequency_point, i] = 1
else:
for j in range(lower_frequency_value, center_frequency_point + 1):
output[j, i] = float(j - lower_frequency_value) / float(
low_to_center
)
center_to_high = higher_frequency_point - center_frequency_point
if center_to_high > 0:
for j in range(center_frequency_point, higher_frequency_point):
output[j, i] = float(higher_frequency_point - j) / float(
center_to_high
)
# Expected output
# 1.000000, 1.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000,
# 0.000000, 0.000000, 1.000000, 1.000000, 0.000000, 0.000000, 0.000000, 0.000000,
# 0.000000, 0.000000, 0.000000, 0.000000, 1.000000, 0.000000, 0.000000, 0.000000,
# 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 1.000000, 0.000000, 0.000000,
# 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 1.000000, 0.000000,
# 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 1.000000,
# 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000,
# 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000,
# 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000,
output = output.astype(np.float32)
expect(
node,
inputs=[
num_mel_bins,
dft_length,
sample_rate,
lower_edge_hertz,
upper_edge_hertz,
],
outputs=[output],
name="test_melweightmatrix",
)
Element-wise min of each of the input tensors (with Numpy-style broadcasting support). All inputs and outputs must have the same data type. This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Min-1">1</a>, <a href="Changelog.md#Min-6">6</a>, <a href="Changelog.md#Min-8">8</a>, <a href="Changelog.md#Min-12">12</a>
data_0 = np.array([3, 2, 1]).astype(np.float32)
data_1 = np.array([1, 4, 4]).astype(np.float32)
data_2 = np.array([2, 5, 0]).astype(np.float32)
result = np.array([1, 2, 0]).astype(np.float32)
node = onnx.helper.make_node(
"Min",
inputs=["data_0", "data_1", "data_2"],
outputs=["result"],
)
expect(
node,
inputs=[data_0, data_1, data_2],
outputs=[result],
name="test_min_example",
)
node = onnx.helper.make_node(
"Min",
inputs=["data_0"],
outputs=["result"],
)
expect(node, inputs=[data_0], outputs=[data_0], name="test_min_one_input")
result = np.minimum(data_0, data_1)
node = onnx.helper.make_node(
"Min",
inputs=["data_0", "data_1"],
outputs=["result"],
)
expect(
node, inputs=[data_0, data_1], outputs=[result], name="test_min_two_inputs"
)
for op_dtype in all_numeric_dtypes:
data_0 = np.array([3, 2, 1]).astype(op_dtype)
data_1 = np.array([1, 4, 4]).astype(op_dtype)
result = np.array([1, 2, 1]).astype(op_dtype)
node = onnx.helper.make_node(
"Min",
inputs=["data_0", "data_1"],
outputs=["result"],
)
expect(
node,
inputs=[data_0, data_1],
outputs=[result],
name=f"test_min_{np.dtype(op_dtype).name}",
)
Mish: A Self Regularized Non-Monotonic Neural Activation Function.
Applies the Mish function element-wise to the input tensor X using the formula:
mish(x) = x * tanh(softplus(x)) = x * tanh(ln(1 + e^{x}))
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Mish-18">18</a>
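One way to evaluate the formula in NumPy is to compute softplus with `np.logaddexp(0, x)`, which equals ln(1 + e^x) but does not evaluate e^x on its own, so large inputs do not overflow. This is a sketch of the formula, not the operator's reference implementation, and the function name `mish` is chosen here for illustration.

```python
import numpy as np


def mish(x):
    """mish(x) = x * tanh(softplus(x)), with softplus(x) = ln(1 + e^x)."""
    return x * np.tanh(np.logaddexp(0.0, x))


x = np.array([-10.0, -1.0, 0.0, 1.0, 100.0], dtype=np.float32)
y = mish(x)

# For moderate inputs this agrees with the direct form of the formula.
moderate = np.linspace(-10, 10, 101, dtype=np.float32)
direct = moderate * np.tanh(np.log1p(np.exp(moderate)))
```

For large positive x the function approaches x, and for large negative x it approaches 0.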
node = onnx.helper.make_node("Mish", inputs=["X"], outputs=["Y"])
input_data = np.linspace(-10, 10, 10000, dtype=np.float32)
# Calculate expected output data
expected_output = input_data * np.tanh(np.log1p(np.exp(input_data)))
expect(node, inputs=[input_data], outputs=[expected_output], name="test_mish")
Performs an element-wise binary modulo operation.
The semantics and supported data types depend on the value of the fmod attribute which must be 0 (default), or 1.
If the fmod attribute is set to 0, T is constrained to integer data types and the semantics follow that of the Python %-operator.
The sign of the result is that of the divisor.
If fmod is set to 1, the behavior of this operator follows that of the fmod function in C and T is constrained to floating point data types.
The result of this operator is the remainder of the division operation x / y where x and y are respective elements of A and B. The result is exactly the value x - n * y, where n is x / y with its fractional part truncated.
The returned value has the same sign as x (except if x is -0) and is less than or equal to |y| in magnitude.
The following special cases apply when fmod is set to 1:
- If x is -0 and y is greater than zero, either +0 or -0 may be returned.
- If x is ±∞ and y is not NaN, NaN is returned.
- If y is ±0 and x is not NaN, NaN should be returned.
- If y is ±∞ and x is finite, x is returned.
- If either argument is NaN, NaN is returned.

This operator supports multidirectional (i.e., NumPy-style) broadcasting; for more details please check the doc.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Mod-10">10</a>
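The difference between the two modes can be seen with NumPy, using `np.mod` as a stand-in for fmod=0 and `np.fmod` as a stand-in for fmod=1. The last lines spell out the C-style remainder as x - n * y with n truncated toward zero. The variable names are illustrative only.

```python
import numpy as np

x = np.array([-4, 7, 5, 4, -7, 8], dtype=np.int64)
y = np.array([2, -3, 8, -2, 3, 5], dtype=np.int64)

python_style = np.mod(x, y)   # fmod=0: sign of the result follows the divisor y
c_style = np.fmod(x, y)       # fmod=1: sign of the result follows the dividend x

# C-style remainder written out as x - n * y, where n = trunc(x / y).
n = np.trunc(x / y).astype(np.int64)
by_definition = x - n * y
```

The two modes agree whenever x and y have the same sign and differ only for mixed-sign pairs such as (7, -3) and (-7, 3).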
node = onnx.helper.make_node(
"Mod",
inputs=["x", "y"],
outputs=["z"],
)
x = np.arange(0, 30).reshape([3, 2, 5]).astype(np.int32)
y = np.array([7]).astype(np.int32)
z = np.mod(x, y)
# array([[[0, 1, 2, 3, 4],
# [5, 6, 0, 1, 2]],
# [[3, 4, 5, 6, 0],
# [1, 2, 3, 4, 5]],
# [[6, 0, 1, 2, 3],
# [4, 5, 6, 0, 1]]], dtype=int32)
expect(node, inputs=[x, y], outputs=[z], name="test_mod_broadcast")
node = onnx.helper.make_node("Mod", inputs=["x", "y"], outputs=["z"], fmod=1)
x = np.array([-4, 7, 5, 4, -7, 8]).astype(np.int64)
y = np.array([2, -3, 8, -2, 3, 5]).astype(np.int64)
z = np.fmod(x, y) # expected output [ 0, 1, 5, 0, -1, 3]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_int64_fmod")
node = onnx.helper.make_node("Mod", inputs=["x", "y"], outputs=["z"], fmod=1)
x = np.array([-4.3, 7.2, 5.0, 4.3, -7.2, 8.0]).astype(np.float16)
y = np.array([2.1, -3.4, 8.0, -2.1, 3.4, 5.0]).astype(np.float16)
z = np.fmod(
x, y
) # expected output [-0.10156, 0.3984 , 5. , 0.10156, -0.3984 , 3.]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_mixed_sign_float16")
node = onnx.helper.make_node("Mod", inputs=["x", "y"], outputs=["z"], fmod=1)
x = np.array([-4.3, 7.2, 5.0, 4.3, -7.2, 8.0]).astype(np.float32)
y = np.array([2.1, -3.4, 8.0, -2.1, 3.4, 5.0]).astype(np.float32)
z = np.fmod(
x, y
) # expected output [-0.10000038, 0.39999962, 5. , 0.10000038, -0.39999962, 3.]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_mixed_sign_float32")
node = onnx.helper.make_node("Mod", inputs=["x", "y"], outputs=["z"], fmod=1)
x = np.array([-4.3, 7.2, 5.0, 4.3, -7.2, 8.0]).astype(np.float64)
y = np.array([2.1, -3.4, 8.0, -2.1, 3.4, 5.0]).astype(np.float64)
z = np.fmod(x, y) # expected output [-0.1, 0.4, 5. , 0.1, -0.4, 3.]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_mixed_sign_float64")
node = onnx.helper.make_node(
"Mod",
inputs=["x", "y"],
outputs=["z"],
)
x = np.array([-4, 7, 5, 4, -7, 8]).astype(np.int16)
y = np.array([2, -3, 8, -2, 3, 5]).astype(np.int16)
z = np.mod(x, y) # expected output [ 0, -2, 5, 0, 2, 3]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_mixed_sign_int16")
node = onnx.helper.make_node(
"Mod",
inputs=["x", "y"],
outputs=["z"],
)
x = np.array([-4, 7, 5, 4, -7, 8]).astype(np.int32)
y = np.array([2, -3, 8, -2, 3, 5]).astype(np.int32)
z = np.mod(x, y) # expected output [ 0, -2, 5, 0, 2, 3]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_mixed_sign_int32")
node = onnx.helper.make_node(
"Mod",
inputs=["x", "y"],
outputs=["z"],
)
x = np.array([-4, 7, 5, 4, -7, 8]).astype(np.int64)
y = np.array([2, -3, 8, -2, 3, 5]).astype(np.int64)
z = np.mod(x, y) # expected output [ 0, -2, 5, 0, 2, 3]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_mixed_sign_int64")
node = onnx.helper.make_node(
"Mod",
inputs=["x", "y"],
outputs=["z"],
)
x = np.array([-4, 7, 5, 4, -7, 8]).astype(np.int8)
y = np.array([2, -3, 8, -2, 3, 5]).astype(np.int8)
z = np.mod(x, y) # expected output [ 0, -2, 5, 0, 2, 3]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_mixed_sign_int8")
node = onnx.helper.make_node(
"Mod",
inputs=["x", "y"],
outputs=["z"],
)
x = np.array([4, 7, 5]).astype(np.uint16)
y = np.array([2, 3, 8]).astype(np.uint16)
z = np.mod(x, y) # expected output [0, 1, 5]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_uint16")
node = onnx.helper.make_node(
"Mod",
inputs=["x", "y"],
outputs=["z"],
)
x = np.array([4, 7, 5]).astype(np.uint32)
y = np.array([2, 3, 8]).astype(np.uint32)
z = np.mod(x, y) # expected output [0, 1, 5]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_uint32")
node = onnx.helper.make_node(
"Mod",
inputs=["x", "y"],
outputs=["z"],
)
x = np.array([4, 7, 5]).astype(np.uint64)
y = np.array([2, 3, 8]).astype(np.uint64)
z = np.mod(x, y) # expected output [0, 1, 5]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_uint64")
node = onnx.helper.make_node(
"Mod",
inputs=["x", "y"],
outputs=["z"],
)
x = np.array([4, 7, 5]).astype(np.uint8)
y = np.array([2, 3, 8]).astype(np.uint8)
z = np.mod(x, y) # expected output [0, 1, 5]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_uint8")
Performs element-wise binary multiplication (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
(Opset 14 change): Extend supported types to include uint8, int8, uint16, and int16.
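As an illustration of multidirectional broadcasting, the sketch below multiplies a (3, 1) tensor by a (1, 4) tensor, so both operands are expanded. It uses plain NumPy and assumes the operator follows the same shape rules; the shapes and values are made up for the example.

```python
import numpy as np

a = np.arange(3, dtype=np.float32).reshape(3, 1)           # shape (3, 1)
b = np.array([[10.0, 20.0, 30.0, 40.0]], dtype=np.float32)  # shape (1, 4)

# Each operand is stretched along its size-1 axis, giving a (3, 4) result.
c = a * b
print(c.shape)
```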
This version of the operator has been available since version 14 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Mul-1">1</a>, <a href="Changelog.md#Mul-6">6</a>, <a href="Changelog.md#Mul-7">7</a>, <a href="Changelog.md#Mul-13">13</a>
node = onnx.helper.make_node(
"Mul",
inputs=["x", "y"],
outputs=["z"],
)
x = np.array([1, 2, 3]).astype(np.float32)
y = np.array([4, 5, 6]).astype(np.float32)
z = x * y # expected output [4., 10., 18.]
expect(node, inputs=[x, y], outputs=[z], name="test_mul_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(3, 4, 5).astype(np.float32)
z = x * y
expect(node, inputs=[x, y], outputs=[z], name="test_mul")
x = np.random.randint(4, size=(3, 4, 5), dtype=np.int8)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.int8)
z = x * y
expect(node, inputs=[x, y], outputs=[z], name="test_mul_int8")
x = np.random.randint(4, size=(3, 4, 5), dtype=np.int16)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.int16)
z = x * y
expect(node, inputs=[x, y], outputs=[z], name="test_mul_int16")
x = np.random.randint(4, size=(3, 4, 5), dtype=np.uint8)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8)
z = x * y
expect(node, inputs=[x, y], outputs=[z], name="test_mul_uint8")
x = np.random.randint(4, size=(3, 4, 5), dtype=np.uint16)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint16)
z = x * y
expect(node, inputs=[x, y], outputs=[z], name="test_mul_uint16")
x = np.random.randint(4, size=(3, 4, 5), dtype=np.uint32)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint32)
z = x * y
expect(node, inputs=[x, y], outputs=[z], name="test_mul_uint32")
x = np.random.randint(4, size=(3, 4, 5), dtype=np.uint64)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint64)
z = x * y
expect(node, inputs=[x, y], outputs=[z], name="test_mul_uint64")
node = onnx.helper.make_node(
"Mul",
inputs=["x", "y"],
outputs=["z"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(5).astype(np.float32)
z = x * y
expect(node, inputs=[x, y], outputs=[z], name="test_mul_bcast")
Generate a tensor of samples from a multinomial distribution according to the probabilities of each of the possible outcomes.
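A rough NumPy sketch of this kind of sampling follows. It assumes, beyond what is stated here, that the input has shape (batch_size, class_size) and holds unnormalized log-probabilities, and that the output holds class indices as int32; `multinomial_sketch` and its parameters are hypothetical names, not part of ONNX.

```python
import numpy as np


def multinomial_sketch(log_probs, sample_size, seed=None):
    """Hypothetical helper: draw class indices per batch row (assumed semantics)."""
    rng = np.random.default_rng(seed)
    log_probs = np.asarray(log_probs, dtype=np.float64)
    # Softmax per row; subtracting the row maximum keeps exp() numerically stable.
    shifted = log_probs - log_probs.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    num_classes = probs.shape[1]
    rows = [rng.choice(num_classes, size=sample_size, p=row) for row in probs]
    return np.stack(rows).astype(np.int32)


samples = multinomial_sketch([[0.0, 0.0, 0.0], [0.0, -1e9, -1e9]], sample_size=5, seed=0)
print(samples.shape)  # (2, 5); the second row can only contain class 0
```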
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Multinomial-7">7</a>
Neg takes one input data (Tensor<T>) and produces one output data (Tensor<T>) in which the sign of each element is flipped; y = -x is applied to the tensor elementwise.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Neg-1">1</a>, <a href="Changelog.md#Neg-6">6</a>
node = onnx.helper.make_node(
"Neg",
inputs=["x"],
outputs=["y"],
)
x = np.array([-4, 2]).astype(np.float32)
y = np.negative(x) # expected output [4., -2.]
expect(node, inputs=[x], outputs=[y], name="test_neg_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.negative(x)
expect(node, inputs=[x], outputs=[y], name="test_neg")
A NegativeLogLikelihoodLoss operator computes (weighted) negative log likelihood loss. Its "input" tensor has the shape of (N, C, d1, d2, ..., dk) where k >= 0. The "input" tensor contains log-probabilities for input[n, :, d_1, d_2,..., d_k] being in a class of [0, C). The operator's "target" input tensor has the shape of (N, d1, d2, ..., dk). It encodes class labels (one of C classes) or it may contain a special value (indicated by an attribute ignore_index) for N x d1 x d2 x ... x dk samples. The loss value for input[n, :, d_1, d_2,...d_k] being classified as class c = target[n][d_1][d_2]...[d_k] is computed as:
loss[n][d_1][d_2]...[d_k] = -input[n][c][d_1][d_2]...[d_k].
When an optional "weight" is provided, the sample loss is calculated as:
loss[n][d_1][d_2]...[d_k] = -input[n][c][d_1][d_2]...[d_k] * weight[c].
The loss is zero when the target value equals ignore_index:
loss[n][d_1][d_2]...[d_k] = 0, when target[n][d_1][d_2]...[d_k] = ignore_index
If "reduction" attribute is set to "none", the operator's output will be the above loss with shape (N, d1, d2, ..., dk). If "reduction" attribute is set to "mean" (the default attribute value), the output loss is (weight) averaged:
mean(loss), if "weight" is not provided,
or if weight is provided,
sum(loss) / sum(weight[target[n][d_1][d_2]...[d_k]]), for all samples.
If "reduction" attribute is set to "sum", the output is a scalar: sum(loss).
See also https://pytorch.org/docs/stable/nn.html#torch.nn.NLLLoss.
Example 1:
# negative log likelihood loss, "none" reduction
import numpy as np

N, C, d1 = 2, 3, 2
input = [[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],
         [[0.0, 1.0], [2.0, 2.0], [1.0, 2.0]]]
target = [[2, 1], [0, 2]]
loss = np.zeros((N, d1))
for n in range(N):
    for d_1 in range(d1):
        c = target[n][d_1]  # class label of this sample
        loss[n][d_1] = -input[n][c][d_1]
print(loss)
# [[-3. -2.]
#  [-0. -2.]]
Example 2:
# weighted negative log likelihood loss, sum reduction
import numpy as np

N, C, d1 = 2, 3, 2
input = [[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],
         [[0.0, 1.0], [2.0, 2.0], [1.0, 2.0]]]
target = [[2, 1], [0, 2]]
weight = [0.2, 0.3, 0.1]
loss = np.zeros((N, d1))
for n in range(N):
    for d_1 in range(d1):
        c = target[n][d_1]  # class label of this sample
        loss[n][d_1] = -input[n][c][d_1] * weight[c]
loss = np.sum(loss)
print(loss)
# approximately -1.1
Example 3:
# weighted negative log likelihood loss, mean reduction
import numpy as np

N, C, d1 = 2, 3, 2
input = [[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],
         [[0.0, 1.0], [2.0, 2.0], [1.0, 2.0]]]
target = [[2, 1], [0, 2]]
weight = [0.2, 0.3, 0.1]
loss = np.zeros((N, d1))
weight_total = 0
for n in range(N):
    for d_1 in range(d1):
        c = target[n][d_1]  # class label of this sample
        loss[n][d_1] = -input[n][c][d_1] * weight[c]
        weight_total = weight_total + weight[c]
loss = np.sum(loss) / weight_total
print(loss)
# approximately -1.57
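The loop-based examples do not exercise ignore_index. The following is a vectorized NumPy sketch of the formulas above that also handles that case. It assumes that ignored samples contribute zero to both the numerator and the denominator of the "mean" reduction; `nll_loss_sketch` is a hypothetical name and is not the helper used by the ONNX test suite.

```python
import numpy as np


def nll_loss_sketch(log_probs, target, weight=None, reduction="mean", ignore_index=None):
    """Hypothetical reference sketch of the NLL loss formulas (not the ONNX helper)."""
    log_probs = np.asarray(log_probs, dtype=np.float64)  # (N, C, d1, ..., dk)
    target = np.asarray(target, dtype=np.int64)          # (N, d1, ..., dk)
    num_classes = log_probs.shape[1]
    if weight is None:
        weight = np.ones(num_classes)
    weight = np.asarray(weight, dtype=np.float64)

    # Positions that contribute to the loss.
    if ignore_index is None:
        keep = np.ones(target.shape, dtype=bool)
    else:
        keep = target != ignore_index
    # Replace ignored labels with a valid class so indexing stays in range.
    safe_target = np.where(keep, target, 0)

    # picked[n, d_1, ..., d_k] = input[n, c, d_1, ..., d_k] with c = target[n, d_1, ..., d_k]
    picked = np.take_along_axis(log_probs, np.expand_dims(safe_target, 1), axis=1).squeeze(1)
    sample_weight = np.where(keep, weight[safe_target], 0.0)
    loss = -picked * sample_weight

    if reduction == "none":
        return loss
    if reduction == "sum":
        return loss.sum()
    # "mean": weighted average (assumption: ignored samples carry zero weight).
    return loss.sum() / sample_weight.sum()


log_probs = [[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],
             [[0.0, 1.0], [2.0, 2.0], [1.0, 2.0]]]
labels = [[2, 1], [0, 2]]
class_weight = [0.2, 0.3, 0.1]
print(nll_loss_sketch(log_probs, labels, class_weight, reduction="mean", ignore_index=1))
```

With the data from Examples 1 to 3, the sketch reproduces their results when ignore_index is omitted.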
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#NegativeLogLikelihoodLoss-12">12</a>, <a href="Changelog.md#NegativeLogLikelihoodLoss-13">13</a>
reduction = "none"
node = onnx.helper.make_node(
"NegativeLogLikelihoodLoss",
inputs=["input", "target"],
outputs=["loss"],
reduction=reduction,
)
N, C = 3, 5
np.random.seed(0)
input = np.random.rand(N, C).astype(np.float32)
target = np.random.randint(0, high=C, size=(N,)).astype(np.int64)
negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
input, target, weight=None, reduction=reduction
)
expect(
node,
inputs=[input, target],
outputs=[negative_log_likelihood_loss],
name="test_nllloss_NC",
)
reduction = "mean"
node = onnx.helper.make_node(
"NegativeLogLikelihoodLoss",
inputs=["input", "target"],
outputs=["loss"],
reduction=reduction,
)
N, C, d1 = 3, 5, 2
np.random.seed(0)
input = np.random.rand(N, C, d1).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, d1)).astype(np.int64)
negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
input, target, weight=None, reduction=reduction
)
expect(
node,
inputs=[input, target],
outputs=[negative_log_likelihood_loss],
name="test_nllloss_NCd1",
)
reduction = "mean"
ignore_index = np.int64(1)
node = onnx.helper.make_node(
"NegativeLogLikelihoodLoss",
inputs=["input", "target"],
outputs=["loss"],
reduction=reduction,
ignore_index=ignore_index,
)
N, C, d1 = 3, 5, 2
np.random.seed(0)
input = np.random.rand(N, C, d1).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, d1)).astype(np.int64)
target[0][0] = np.int64(1)
negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
input, target, weight=None, reduction=reduction, ignore_index=ignore_index
)
expect(
node,
inputs=[input, target],
outputs=[negative_log_likelihood_loss],
name="test_nllloss_NCd1_ii",
)
reduction = "mean"
ignore_index = np.int64(-1)
node = onnx.helper.make_node(
"NegativeLogLikelihoodLoss",
inputs=["input", "target", "weight"],
outputs=["loss"],
reduction=reduction,
ignore_index=ignore_index,
)
N, C, dim1 = 3, 5, 6
np.random.seed(0)
input = np.random.rand(N, C, dim1).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1)).astype(np.int64)
target[0][0] = -1
weight = np.random.rand(C).astype(np.float32)
negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
input, target, weight=weight, reduction=reduction, ignore_index=ignore_index
)
expect(
node,
inputs=[input, target, weight],
outputs=[negative_log_likelihood_loss],
name="test_nllloss_NCd1_mean_weight_negative_ii",
)
reduction = "mean"
node = onnx.helper.make_node(
"NegativeLogLikelihoodLoss",
inputs=["input", "target", "weight"],
outputs=["loss"],
reduction=reduction,
)
N, C, d1 = 3, 5, 2
np.random.seed(0)
input = np.random.rand(N, C, d1).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, d1)).astype(np.int64)
weight = np.random.rand(C).astype(np.float32)
negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
input, target, weight=weight, reduction=reduction
)
expect(
node,
inputs=[input, target, weight],
outputs=[negative_log_likelihood_loss],
name="test_nllloss_NCd1_weight",
)
reduction = "mean"
ignore_index = np.int64(1)
node = onnx.helper.make_node(
"NegativeLogLikelihoodLoss",
inputs=["input", "target", "weight"],
outputs=["loss"],
reduction=reduction,
ignore_index=ignore_index,
)
N, C, d1 = 3, 5, 2
np.random.seed(0)
input = np.random.rand(N, C, d1).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, d1)).astype(np.int64)
target[0][0] = np.int64(1)
weight = np.random.rand(C).astype(np.float32)
negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
input, target, weight=weight, reduction=reduction, ignore_index=ignore_index
)
expect(
node,
inputs=[input, target, weight],
outputs=[negative_log_likelihood_loss],
name="test_nllloss_NCd1_weight_ii",
)
reduction = "none"
node = onnx.helper.make_node(
"NegativeLogLikelihoodLoss",
inputs=["input", "target"],
outputs=["loss"],
reduction=reduction,
)
N, C, dim1, dim2 = 3, 5, 6, 6
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1, dim2)).astype(np.int64)
negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
input, target, weight=None, reduction=reduction
)
expect(
node,
inputs=[input, target],
outputs=[negative_log_likelihood_loss],
name="test_nllloss_NCd1d2",
)
reduction = "mean"
ignore_index = np.int64(1)
node = onnx.helper.make_node(
"NegativeLogLikelihoodLoss",
inputs=["input", "target"],
outputs=["loss"],
reduction=reduction,
ignore_index=ignore_index,
)
N, C, dim1, dim2 = 3, 5, 6, 6
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1, dim2)).astype(np.int64)
target[0][0][0] = np.int64(1)
negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
input, target, reduction=reduction, ignore_index=ignore_index
)
expect(
node,
inputs=[input, target],
outputs=[negative_log_likelihood_loss],
name="test_nllloss_NCd1d2_no_weight_reduction_mean_ii",
)
reduction = "mean"
node = onnx.helper.make_node(
"NegativeLogLikelihoodLoss",
inputs=["input", "target"],
outputs=["loss"],
reduction=reduction,
)
N, C, dim1, dim2 = 3, 5, 6, 6
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1, dim2)).astype(np.int64)
negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
input, target, weight=None, reduction=reduction
)
expect(
node,
inputs=[input, target],
outputs=[negative_log_likelihood_loss],
name="test_nllloss_NCd1d2_reduction_mean",
)
reduction = "sum"
node = onnx.helper.make_node(
"NegativeLogLikelihoodLoss",
inputs=["input", "target"],
outputs=["loss"],
reduction=reduction,
)
N, C, dim1, dim2 = 3, 5, 6, 6
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1, dim2)).astype(np.int64)
negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
input, target, weight=None, reduction=reduction
)
expect(
node,
inputs=[input, target],
outputs=[negative_log_likelihood_loss],
name="test_nllloss_NCd1d2_reduction_sum",
)
reduction = "none"
node = onnx.helper.make_node(
"NegativeLogLikelihoodLoss",
inputs=["input", "target", "weight"],
outputs=["loss"],
reduction=reduction,
)
N, C, dim1, dim2 = 3, 5, 6, 6
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1, dim2)).astype(np.int64)
weight = np.random.rand(C).astype(np.float32)
negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
input, target, weight=weight, reduction=reduction
)
expect(
node,
inputs=[input, target, weight],
outputs=[negative_log_likelihood_loss],
name="test_nllloss_NCd1d2_with_weight",
)
reduction = "mean"
node = onnx.helper.make_node(
"NegativeLogLikelihoodLoss",
inputs=["input", "target", "weight"],
outputs=["loss"],
reduction=reduction,
)
N, C, dim1, dim2 = 3, 5, 6, 6
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1, dim2)).astype(np.int64)
weight = np.random.rand(C).astype(np.float32)
negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
input, target, weight=weight, reduction=reduction
)
expect(
node,
inputs=[input, target, weight],
outputs=[negative_log_likelihood_loss],
name="test_nllloss_NCd1d2_with_weight_reduction_mean",
)
reduction = "sum"
node = onnx.helper.make_node(
"NegativeLogLikelihoodLoss",
inputs=["input", "target", "weight"],
outputs=["loss"],
reduction=reduction,
)
N, C, dim1, dim2 = 3, 5, 6, 6
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1, dim2)).astype(np.int64)
weight = np.random.rand(C).astype(np.float32)
negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
input, target, weight=weight, reduction=reduction
)
expect(
node,
inputs=[input, target, weight],
outputs=[negative_log_likelihood_loss],
name="test_nllloss_NCd1d2_with_weight_reduction_sum",
)
reduction = "sum"
ignore_index = np.int64(0)
node = onnx.helper.make_node(
"NegativeLogLikelihoodLoss",
inputs=["input", "target", "weight"],
outputs=["loss"],
reduction=reduction,
ignore_index=ignore_index,
)
N, C, dim1, dim2 = 3, 5, 6, 6
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1, dim2)).astype(np.int64)
target[0][0][0] = np.int64(0)
weight = np.random.rand(C).astype(np.float32)
negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
input, target, weight=weight, reduction=reduction, ignore_index=ignore_index
)
expect(
node,
inputs=[input, target, weight],
outputs=[negative_log_likelihood_loss],
name="test_nllloss_NCd1d2_with_weight_reduction_sum_ii",
)
reduction = "none"
ignore_index = np.int64(-5)
node = onnx.helper.make_node(
"NegativeLogLikelihoodLoss",
inputs=["input", "target"],
outputs=["loss"],
reduction=reduction,
ignore_index=ignore_index,
)
N, C, dim1, dim2, dim3 = 3, 5, 6, 6, 5
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2, dim3).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1, dim2, dim3)).astype(
np.int64
)
target[0][0][0][0] = -5
negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
input, target, reduction=reduction, ignore_index=ignore_index
)
expect(
node,
inputs=[input, target],
outputs=[negative_log_likelihood_loss],
name="test_nllloss_NCd1d2d3_none_no_weight_negative_ii",
)
reduction = "sum"
ignore_index = np.int64(10)
node = onnx.helper.make_node(
"NegativeLogLikelihoodLoss",
inputs=["input", "target", "weight"],
outputs=["loss"],
reduction=reduction,
ignore_index=ignore_index,
)
N, C = 3, 5
np.random.seed(0)
input = np.random.rand(N, C).astype(np.float32)
target = np.random.randint(0, high=C, size=(N)).astype(np.int64)
target[0] = 10
weight = np.random.rand(C).astype(np.float32)
negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
input, target, weight=weight, reduction=reduction, ignore_index=ignore_index
)
expect(
node,
inputs=[input, target, weight],
outputs=[negative_log_likelihood_loss],
name="test_nllloss_NCd1d2d3_sum_weight_high_ii",
)
reduction = "mean"
node = onnx.helper.make_node(
"NegativeLogLikelihoodLoss",
inputs=["input", "target", "weight"],
outputs=["loss"],
reduction=reduction,
)
N, C, dim1, dim2, dim3, dim4, dim5 = 3, 5, 6, 6, 5, 3, 4
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2, dim3, dim4, dim5).astype(np.float32)
target = np.random.randint(
0, high=C, size=(N, dim1, dim2, dim3, dim4, dim5)
).astype(np.int64)
weight = np.random.rand(C).astype(np.float32)
negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
input, target, weight=weight, reduction=reduction
)
expect(
node,
inputs=[input, target, weight],
outputs=[negative_log_likelihood_loss],
name="test_nllloss_NCd1d2d3d4d5_mean_weight",
)
reduction = "none"
node = onnx.helper.make_node(
"NegativeLogLikelihoodLoss",
inputs=["input", "target"],
outputs=["loss"],
reduction=reduction,
)
N, C, dim1, dim2, dim3, dim4, dim5 = 3, 5, 6, 6, 5, 3, 4
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2, dim3, dim4, dim5).astype(np.float32)
target = np.random.randint(
0, high=C, size=(N, dim1, dim2, dim3, dim4, dim5)
).astype(np.int64)
negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
input, target, reduction=reduction
)
expect(
node,
inputs=[input, target],
outputs=[negative_log_likelihood_loss],
name="test_nllloss_NCd1d2d3d4d5_none_no_weight",
)
Filter out boxes that have high intersection-over-union (IOU) overlap with previously selected boxes. Bounding boxes with score less than score_threshold are removed. Bounding box format is indicated by attribute center_point_box. Boxes are suppressed if their IOU with a previously selected box is strictly greater than iou_threshold (i.e., boxes with IOU exactly equal to the threshold are kept). Note that this algorithm is agnostic to where the origin is in the coordinate system and more generally is invariant to orthogonal transformations and translations of the coordinate system; thus translations or reflections of the coordinate system result in the same boxes being selected by the algorithm. The selected_indices output is a set of integers indexing into the input collection of bounding boxes representing the selected boxes. The bounding box coordinates corresponding to the selected indices can then be obtained using the Gather or GatherND operation.
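The greedy selection described above can be sketched for a single batch and a single class as follows. This is an illustration only: it assumes corner-format boxes given as two diagonal corners in either order, and `iou` and `nms_sketch` are hypothetical names rather than ONNX APIs.

```python
import numpy as np


def iou(box_a, box_b):
    """IoU of two corner-format boxes whose diagonal corners may be in either order."""
    ay1, ay2 = sorted((box_a[0], box_a[2]))
    ax1, ax2 = sorted((box_a[1], box_a[3]))
    by1, by2 = sorted((box_b[0], box_b[2]))
    bx1, bx2 = sorted((box_b[1], box_b[3]))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter = inter_h * inter_w
    union = (ay2 - ay1) * (ax2 - ax1) + (by2 - by1) * (bx2 - bx1) - inter
    return inter / union if union > 0 else 0.0


def nms_sketch(boxes, scores, max_output_boxes, iou_threshold, score_threshold):
    """Greedy NMS for one batch and one class; returns the selected box indices."""
    order = np.argsort(-np.asarray(scores), kind="stable")  # highest score first
    selected = []
    for idx in order:
        if len(selected) >= max_output_boxes:
            break
        if scores[idx] < score_threshold:
            continue  # boxes scoring below the threshold are removed
        # Suppress only when IoU is strictly greater than the threshold.
        if all(iou(boxes[idx], boxes[kept]) <= iou_threshold for kept in selected):
            selected.append(int(idx))
    return selected


boxes = np.array([
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.1, 1.0, 1.1],
    [0.0, -0.1, 1.0, 0.9],
    [0.0, 10.0, 1.0, 11.0],
    [0.0, 10.1, 1.0, 11.1],
    [0.0, 100.0, 1.0, 101.0],
])
scores = np.array([0.9, 0.75, 0.6, 0.95, 0.5, 0.3])
print(nms_sketch(boxes, scores, 3, 0.5, 0.0))  # [3, 0, 5]
```

Sorting each coordinate pair before computing areas is what makes the sketch insensitive to flipped corners.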
This version of the operator has been available since version 11 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#NonMaxSuppression-10">10</a>
node = onnx.helper.make_node(
"NonMaxSuppression",
inputs=[
"boxes",
"scores",
"max_output_boxes_per_class",
"iou_threshold",
"score_threshold",
],
outputs=["selected_indices"],
center_point_box=1,
)
boxes = np.array(
[
[
[0.5, 0.5, 1.0, 1.0],
[0.5, 0.6, 1.0, 1.0],
[0.5, 0.4, 1.0, 1.0],
[0.5, 10.5, 1.0, 1.0],
[0.5, 10.6, 1.0, 1.0],
[0.5, 100.5, 1.0, 1.0],
]
]
).astype(np.float32)
scores = np.array([[[0.9, 0.75, 0.6, 0.95, 0.5, 0.3]]]).astype(np.float32)
max_output_boxes_per_class = np.array([3]).astype(np.int64)
iou_threshold = np.array([0.5]).astype(np.float32)
score_threshold = np.array([0.0]).astype(np.float32)
selected_indices = np.array([[0, 0, 3], [0, 0, 0], [0, 0, 5]]).astype(np.int64)
expect(
node,
inputs=[
boxes,
scores,
max_output_boxes_per_class,
iou_threshold,
score_threshold,
],
outputs=[selected_indices],
name="test_nonmaxsuppression_center_point_box_format",
)
node = onnx.helper.make_node(
"NonMaxSuppression",
inputs=[
"boxes",
"scores",
"max_output_boxes_per_class",
"iou_threshold",
"score_threshold",
],
outputs=["selected_indices"],
)
boxes = np.array(
[
[
[1.0, 1.0, 0.0, 0.0],
[0.0, 0.1, 1.0, 1.1],
[0.0, 0.9, 1.0, -0.1],
[0.0, 10.0, 1.0, 11.0],
[1.0, 10.1, 0.0, 11.1],
[1.0, 101.0, 0.0, 100.0],
]
]
).astype(np.float32)
scores = np.array([[[0.9, 0.75, 0.6, 0.95, 0.5, 0.3]]]).astype(np.float32)
max_output_boxes_per_class = np.array([3]).astype(np.int64)
iou_threshold = np.array([0.5]).astype(np.float32)
score_threshold = np.array([0.0]).astype(np.float32)
selected_indices = np.array([[0, 0, 3], [0, 0, 0], [0, 0, 5]]).astype(np.int64)
expect(
node,
inputs=[
boxes,
scores,
max_output_boxes_per_class,
iou_threshold,
score_threshold,
],
outputs=[selected_indices],
name="test_nonmaxsuppression_flipped_coordinates",
)
node = onnx.helper.make_node(
"NonMaxSuppression",
inputs=[
"boxes",
"scores",
"max_output_boxes_per_class",
"iou_threshold",
"score_threshold",
],
outputs=["selected_indices"],
)
boxes = np.array(
[
[
[0.0, 0.0, 1.0, 1.0],
[0.0, 0.0, 1.0, 1.0],
[0.0, 0.0, 1.0, 1.0],
[0.0, 0.0, 1.0, 1.0],
[0.0, 0.0, 1.0, 1.0],
[0.0, 0.0, 1.0, 1.0],
[0.0, 0.0, 1.0, 1.0],
[0.0, 0.0, 1.0, 1.0],
[0.0, 0.0, 1.0, 1.0],
[0.0, 0.0, 1.0, 1.0],
]
]
).astype(np.float32)
scores = np.array(
[[[0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9]]]
).astype(np.float32)
max_output_boxes_per_class = np.array([3]).astype(np.int64)
iou_threshold = np.array([0.5]).astype(np.float32)
score_threshold = np.array([0.0]).astype(np.float32)
selected_indices = np.array([[0, 0, 0]]).astype(np.int64)
expect(
node,
inputs=[
boxes,
scores,
max_output_boxes_per_class,
iou_threshold,
score_threshold,
],
outputs=[selected_indices],
name="test_nonmaxsuppression_identical_boxes",
)
"""Test boundary condition where IoU exactly equals threshold.
This test verifies that the comparison is strict (>), not inclusive (>=).
When IoU exactly equals the threshold, boxes should be KEPT, not suppressed.
This follows PyTorch's NMS implementation.
"""
node = onnx.helper.make_node(
"NonMaxSuppression",
inputs=[
"boxes",
"scores",
"max_output_boxes_per_class",
"iou_threshold",
"score_threshold",
],
outputs=["selected_indices"],
)
# Two boxes with 50% overlap in each dimension
# box1=[0,0,1,1], box2=[0.5,0.5,1.5,1.5]
# Intersection area = 0.5 * 0.5 = 0.25
# Union area = 1.0 + 1.0 - 0.25 = 1.75
# IoU = 0.25 / 1.75 (exact value computed below as float32)
boxes = np.array(
[
[
[0.0, 0.0, 1.0, 1.0], # box 0
[0.5, 0.5, 1.5, 1.5], # box 1 - overlaps box 0
]
]
).astype(np.float32)
scores = np.array([[[0.9, 0.8]]]).astype(np.float32)
max_output_boxes_per_class = np.array([3]).astype(np.int64)
# Compute the exact IoU value and use it as threshold
# This ensures the threshold exactly equals the IoU
exact_iou = np.float32(0.25 / 1.75)
iou_threshold = np.array([exact_iou]).astype(np.float32)
score_threshold = np.array([0.0]).astype(np.float32)
# Both boxes should be selected because IoU == threshold (not > threshold)
selected_indices = np.array([[0, 0, 0], [0, 0, 1]]).astype(np.int64)
expect(
node,
inputs=[
boxes,
scores,
max_output_boxes_per_class,
iou_threshold,
score_threshold,
],
outputs=[selected_indices],
name="test_nonmaxsuppression_iou_threshold_boundary",
)
node = onnx.helper.make_node(
"NonMaxSuppression",
inputs=[
"boxes",
"scores",
"max_output_boxes_per_class",
"iou_threshold",
"score_threshold",
],
outputs=["selected_indices"],
)
boxes = np.array(
[
[
[0.0, 0.0, 1.0, 1.0],
[0.0, 0.1, 1.0, 1.1],
[0.0, -0.1, 1.0, 0.9],
[0.0, 10.0, 1.0, 11.0],
[0.0, 10.1, 1.0, 11.1],
[0.0, 100.0, 1.0, 101.0],
]
]
).astype(np.float32)
scores = np.array([[[0.9, 0.75, 0.6, 0.95, 0.5, 0.3]]]).astype(np.float32)
max_output_boxes_per_class = np.array([2]).astype(np.int64)
iou_threshold = np.array([0.5]).astype(np.float32)
score_threshold = np.array([0.0]).astype(np.float32)
selected_indices = np.array([[0, 0, 3], [0, 0, 0]]).astype(np.int64)
expect(
node,
inputs=[
boxes,
scores,
max_output_boxes_per_class,
iou_threshold,
score_threshold,
],
outputs=[selected_indices],
name="test_nonmaxsuppression_limit_output_size",
)
node = onnx.helper.make_node(
"NonMaxSuppression",
inputs=[
"boxes",
"scores",
"max_output_boxes_per_class",
"iou_threshold",
"score_threshold",
],
outputs=["selected_indices"],
)
boxes = np.array([[[0.0, 0.0, 1.0, 1.0]]]).astype(np.float32)
scores = np.array([[[0.9]]]).astype(np.float32)
max_output_boxes_per_class = np.array([3]).astype(np.int64)
iou_threshold = np.array([0.5]).astype(np.float32)
score_threshold = np.array([0.0]).astype(np.float32)
selected_indices = np.array([[0, 0, 0]]).astype(np.int64)
expect(
node,
inputs=[
boxes,
scores,
max_output_boxes_per_class,
iou_threshold,
score_threshold,
],
outputs=[selected_indices],
name="test_nonmaxsuppression_single_box",
)
node = onnx.helper.make_node(
"NonMaxSuppression",
inputs=[
"boxes",
"scores",
"max_output_boxes_per_class",
"iou_threshold",
"score_threshold",
],
outputs=["selected_indices"],
)
boxes = np.array(
[
[
[0.0, 0.0, 1.0, 1.0],
[0.0, 0.1, 1.0, 1.1],
[0.0, -0.1, 1.0, 0.9],
[0.0, 10.0, 1.0, 11.0],
[0.0, 10.1, 1.0, 11.1],
[0.0, 100.0, 1.0, 101.0],
]
]
).astype(np.float32)
scores = np.array([[[0.9, 0.75, 0.6, 0.95, 0.5, 0.3]]]).astype(np.float32)
max_output_boxes_per_class = np.array([3]).astype(np.int64)
iou_threshold = np.array([0.5]).astype(np.float32)
score_threshold = np.array([0.0]).astype(np.float32)
selected_indices = np.array([[0, 0, 3], [0, 0, 0], [0, 0, 5]]).astype(np.int64)
expect(
node,
inputs=[
boxes,
scores,
max_output_boxes_per_class,
iou_threshold,
score_threshold,
],
outputs=[selected_indices],
name="test_nonmaxsuppression_suppress_by_IOU",
)
node = onnx.helper.make_node(
"NonMaxSuppression",
inputs=[
"boxes",
"scores",
"max_output_boxes_per_class",
"iou_threshold",
"score_threshold",
],
outputs=["selected_indices"],
)
boxes = np.array(
[
[
[0.0, 0.0, 1.0, 1.0],
[0.0, 0.1, 1.0, 1.1],
[0.0, -0.1, 1.0, 0.9],
[0.0, 10.0, 1.0, 11.0],
[0.0, 10.1, 1.0, 11.1],
[0.0, 100.0, 1.0, 101.0],
]
]
).astype(np.float32)
scores = np.array([[[0.9, 0.75, 0.6, 0.95, 0.5, 0.3]]]).astype(np.float32)
max_output_boxes_per_class = np.array([3]).astype(np.int64)
iou_threshold = np.array([0.5]).astype(np.float32)
score_threshold = np.array([0.4]).astype(np.float32)
selected_indices = np.array([[0, 0, 3], [0, 0, 0]]).astype(np.int64)
expect(
node,
inputs=[
boxes,
scores,
max_output_boxes_per_class,
iou_threshold,
score_threshold,
],
outputs=[selected_indices],
name="test_nonmaxsuppression_suppress_by_IOU_and_scores",
)
node = onnx.helper.make_node(
"NonMaxSuppression",
inputs=[
"boxes",
"scores",
"max_output_boxes_per_class",
"iou_threshold",
"score_threshold",
],
outputs=["selected_indices"],
)
boxes = np.array(
[
[
[0.0, 0.0, 1.0, 1.0],
[0.0, 0.1, 1.0, 1.1],
[0.0, -0.1, 1.0, 0.9],
[0.0, 10.0, 1.0, 11.0],
[0.0, 10.1, 1.0, 11.1],
[0.0, 100.0, 1.0, 101.0],
],
[
[0.0, 0.0, 1.0, 1.0],
[0.0, 0.1, 1.0, 1.1],
[0.0, -0.1, 1.0, 0.9],
[0.0, 10.0, 1.0, 11.0],
[0.0, 10.1, 1.0, 11.1],
[0.0, 100.0, 1.0, 101.0],
],
]
).astype(np.float32)
scores = np.array(
[[[0.9, 0.75, 0.6, 0.95, 0.5, 0.3]], [[0.9, 0.75, 0.6, 0.95, 0.5, 0.3]]]
).astype(np.float32)
max_output_boxes_per_class = np.array([2]).astype(np.int64)
iou_threshold = np.array([0.5]).astype(np.float32)
score_threshold = np.array([0.0]).astype(np.float32)
selected_indices = np.array(
[[0, 0, 3], [0, 0, 0], [1, 0, 3], [1, 0, 0]]
).astype(np.int64)
expect(
node,
inputs=[
boxes,
scores,
max_output_boxes_per_class,
iou_threshold,
score_threshold,
],
outputs=[selected_indices],
name="test_nonmaxsuppression_two_batches",
)
node = onnx.helper.make_node(
"NonMaxSuppression",
inputs=[
"boxes",
"scores",
"max_output_boxes_per_class",
"iou_threshold",
"score_threshold",
],
outputs=["selected_indices"],
)
boxes = np.array(
[
[
[0.0, 0.0, 1.0, 1.0],
[0.0, 0.1, 1.0, 1.1],
[0.0, -0.1, 1.0, 0.9],
[0.0, 10.0, 1.0, 11.0],
[0.0, 10.1, 1.0, 11.1],
[0.0, 100.0, 1.0, 101.0],
]
]
).astype(np.float32)
scores = np.array(
[[[0.9, 0.75, 0.6, 0.95, 0.5, 0.3], [0.9, 0.75, 0.6, 0.95, 0.5, 0.3]]]
).astype(np.float32)
max_output_boxes_per_class = np.array([2]).astype(np.int64)
iou_threshold = np.array([0.5]).astype(np.float32)
score_threshold = np.array([0.0]).astype(np.float32)
selected_indices = np.array(
[[0, 0, 3], [0, 0, 0], [0, 1, 3], [0, 1, 0]]
).astype(np.int64)
expect(
node,
inputs=[
boxes,
scores,
max_output_boxes_per_class,
iou_threshold,
score_threshold,
],
outputs=[selected_indices],
name="test_nonmaxsuppression_two_classes",
)
Returns the indices of the elements that are non-zero (in row-major order - by dimension). NonZero behaves similarly to numpy.nonzero: https://docs.scipy.org/doc/numpy/reference/generated/numpy.nonzero.html, but for scalar input, NonZero produces output shape (0, N) instead of (1, N), which is different from Numpy's behavior.
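To make the output layout concrete, the NumPy sketch below stacks the tuple returned by `np.nonzero` into one int64 array, in the same way the test case for this operator does. The input values are made up for illustration.

```python
import numpy as np

x = np.array([[1, 0, 2],
              [0, 0, 3]])

# One row per input dimension, one column per non-zero element (row-major order).
result = np.array(np.nonzero(x), dtype=np.int64)
print(result)    # [[0 0 1]
                 #  [0 2 2]]

# Transposing gives one (row, column) coordinate pair per non-zero element.
coords = result.T
```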
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#NonZero-9">9</a>
node = onnx.helper.make_node(
"NonZero",
inputs=["condition"],
outputs=["result"],
)
condition = np.array([[1, 0], [1, 1]], dtype=bool)
result = np.array(
np.nonzero(condition), dtype=np.int64
) # expected output [[0, 1, 1], [0, 0, 1]]
expect(node, inputs=[condition], outputs=[result], name="test_nonzero_example")
Returns the negation of the input tensor element-wise.
This version of the operator has been available since version 1 of the default ONNX operator set.
node = onnx.helper.make_node(
"Not",
inputs=["x"],
outputs=["not"],
)
# 2d
x = (np.random.randn(3, 4) > 0).astype(bool)
expect(node, inputs=[x], outputs=[np.logical_not(x)], name="test_not_2d")
# 3d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
expect(node, inputs=[x], outputs=[np.logical_not(x)], name="test_not_3d")
# 4d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
expect(node, inputs=[x], outputs=[np.logical_not(x)], name="test_not_4d")
Produces a one-hot tensor based on inputs. The locations represented by the index values in the 'indices' input tensor will have 'on_value' and the other locations will have 'off_value' in the output tensor, where 'on_value' and 'off_value' are specified as part of the required input argument 'values', which is a two-element tensor of format [off_value, on_value]. The rank of the output tensor will be one greater than the rank of the input tensor. The additional dimension is for the one-hot representation. The additional dimension will be inserted at the position specified by 'axis'. If 'axis' is not specified, then the additional dimension will be inserted as the innermost dimension, i.e. axis=-1. The size of the additional dimension is specified by the required scalar input 'depth'. The type of the output tensor is the same as the type of the 'values' input. Any entries in the 'indices' input tensor with values outside the range [-depth, depth-1] will result in a one-hot representation with all 'off_value' values in the output tensor.
when axis = 0:
output[input[i, j, k], i, j, k] = 1 for all i, j, k and 0 otherwise.
when axis = -1:
output[i, j, k, input[i, j, k]] = 1 for all i, j, k and 0 otherwise.
This version of the operator has been available since version 11 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#OneHot-9">9</a>
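The rules in the description (negative indices, out-of-range indices, and placement of the new axis) can be sketched in NumPy. The name one_hot_sketch is hypothetical and is not the one_hot helper that the test cases rely on; it is only one reading of the text, written under the assumption that 'values' is passed as a two-element array.

```python
import numpy as np


def one_hot_sketch(indices, depth, values, axis=-1):
    """Hypothetical helper: one reading of the OneHot description, not the reference code."""
    indices = np.asarray(indices).astype(np.int64)
    values = np.asarray(values)
    depth = int(depth)
    off_value, on_value = values[0], values[1]
    # Indices in [-depth, -1] count back from the end of the one-hot axis.
    wrapped = np.where(indices < 0, indices + depth, indices)
    # Compare against 0..depth-1 on a new trailing axis; indices outside
    # [-depth, depth-1] match nothing, so their slice stays all off_value.
    hot = wrapped[..., np.newaxis] == np.arange(depth)
    out = np.where(hot, on_value, off_value).astype(values.dtype)
    # Move the one-hot axis from the last position to the requested one.
    return np.moveaxis(out, -1, axis)


values = np.array([1, 3], dtype=np.float32)  # [off_value, on_value]
y = one_hot_sketch(np.array([0, -7, -8]), depth=10, values=values, axis=1)
print(y.shape)  # (3, 10)
```

With depth 10, the indices -7 and -8 land on positions 3 and 2, which agrees with the printed matrix in the negative-indices test case.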
axisValue = 1
on_value = 3
off_value = 1
output_type = np.float32
node = onnx.helper.make_node(
"OneHot",
inputs=["indices", "depth", "values"],
outputs=["y"],
axis=axisValue,
)
indices = np.array([[1, 9], [2, 4]], dtype=np.float32)
depth = np.float32(10)
values = np.array([off_value, on_value], dtype=output_type)
y = one_hot(indices, depth, axis=axisValue, dtype=output_type)
y = y * (on_value - off_value) + off_value
expect(
node,
inputs=[indices, depth, values],
outputs=[y],
name="test_onehot_with_axis",
)
axisValue = -2
on_value = 3
off_value = 1
output_type = np.float32
node = onnx.helper.make_node(
"OneHot",
inputs=["indices", "depth", "values"],
outputs=["y"],
axis=axisValue,
)
indices = np.array([[1, 9], [2, 4]], dtype=np.float32)
depth = np.float32(10)
values = np.array([off_value, on_value], dtype=output_type)
y = one_hot(indices, depth, axis=axisValue, dtype=output_type)
y = y * (on_value - off_value) + off_value
expect(
node,
inputs=[indices, depth, values],
outputs=[y],
name="test_onehot_with_negative_axis",
)
axisValue = 1
on_value = 3
off_value = 1
output_type = np.float32
node = onnx.helper.make_node(
"OneHot",
inputs=["indices", "depth", "values"],
outputs=["y"],
axis=axisValue,
)
indices = np.array([0, -7, -8], dtype=np.int64)
# print(y)
# [[3. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
# [1. 1. 1. 3. 1. 1. 1. 1. 1. 1.]
# [1. 1. 3. 1. 1. 1. 1. 1. 1. 1.]]
depth = np.float32(10)
values = np.array([off_value, on_value], dtype=output_type)
y = one_hot(indices, depth, axis=axisValue, dtype=output_type)
y = y * (on_value - off_value) + off_value
expect(
node,
inputs=[indices, depth, values],
outputs=[y],
name="test_onehot_negative_indices",
)
on_value = 5
off_value = 2
output_type = np.int32
node = onnx.helper.make_node(
"OneHot", inputs=["indices", "depth", "values"], outputs=["y"]
)
indices = np.array([0, 7, 8], dtype=np.int64)
depth = np.float32(12)
values = np.array([off_value, on_value], dtype=output_type)
y = one_hot(indices, depth, dtype=output_type)
y = y * (on_value - off_value) + off_value
expect(
node,
inputs=[indices, depth, values],
outputs=[y],
name="test_onehot_without_axis",
)
Constructs an optional-type value containing either an empty optional of a certain type specified by the attribute, or a non-empty value containing the input element.
This version of the operator has been available since version 15 of the default ONNX operator set.
If the input is a tensor or sequence type, it returns the input. If the input is an optional type, it outputs the element in the input. It is an error if the input is an empty optional-type (i.e. does not have an element) and the behavior is undefined in this case.
This version of the operator has been available since version 18 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#OptionalGetElement-15">15</a>
Returns true if (1) the input is an optional-type and contains an element, or, (2) the input is a tensor or sequence type. If the input is not provided or is an empty optional-type, this op returns false.
This version of the operator has been available since version 18 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#OptionalHasElement-15">15</a>
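The test cases call optional_has_element_reference_implementation and optional_get_element_reference_implementation without showing them. A minimal sketch of what such helpers might look like follows; the function names here are hypothetical, and modelling an empty optional (or a missing input) as Python None is an assumption.

```python
import numpy as np


def optional_has_element_sketch(optional=None):
    """Hypothetical stand-in: None models both a missing input and an empty optional."""
    return np.array(optional is not None)


def optional_get_element_sketch(optional):
    """Hypothetical stand-in: returns the wrapped element unchanged."""
    if optional is None:
        # The description leaves this case undefined; raising is a choice made here.
        raise ValueError("empty optional has no element")
    return optional


tensor = np.array([1, 2, 3, 4], dtype=np.float32)
print(optional_has_element_sketch(tensor))  # True
print(optional_has_element_sketch(None))    # False
```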
optional = None
tensor_type_proto = onnx.helper.make_tensor_type_proto(
elem_type=onnx.TensorProto.INT32, shape=[]
)
optional_type_proto = onnx.helper.make_optional_type_proto(tensor_type_proto)
# OptionalHasElement takes a tensor or optional as input
for input_type_proto in [tensor_type_proto, optional_type_proto]:
input_name_options = {
"empty": "optional_input",
"empty_no_input_name": "",
"empty_no_input": None,
}
for test_name_surfix, input_name in input_name_options.items():
if input_type_proto == tensor_type_proto and input_name:
# the input tensor cannot be empty if input name is provided.
continue
node = onnx.helper.make_node(
"OptionalHasElement",
inputs=[] if input_name is None else [input_name],
outputs=["output"],
)
output = optional_has_element_reference_implementation(optional)
test_name = (
"test_optional_has_element_"
+ test_name_surfix
+ (
"_optional_input"
if input_type_proto == optional_type_proto
else "_tensor_input"
)
)
expect(
node,
inputs=[optional] if input_name else [],
outputs=[output],
input_type_protos=[input_type_proto] if input_name else [],
name=test_name,
)
optional = [np.array([1, 2, 3, 4]).astype(np.int32)]
tensor_type_proto = onnx.helper.make_tensor_type_proto(
elem_type=onnx.TensorProto.INT32,
shape=[
4,
],
)
seq_type_proto = onnx.helper.make_sequence_type_proto(tensor_type_proto)
optional_type_proto = onnx.helper.make_optional_type_proto(seq_type_proto)
node = onnx.helper.make_node(
"OptionalGetElement", inputs=["optional_input"], outputs=["output"]
)
output = optional_get_element_reference_implementation(optional)
expect(
node,
inputs=[optional],
outputs=[output],
input_type_protos=[optional_type_proto],
name="test_optional_get_element_optional_sequence",
)
expect(
node,
inputs=[optional],
outputs=[output],
input_type_protos=[seq_type_proto],
name="test_optional_get_element_sequence",
)
optional = np.array([1, 2, 3, 4]).astype(np.float32)
tensor_type_proto = onnx.helper.make_tensor_type_proto(
elem_type=onnx.TensorProto.FLOAT,
shape=[
4,
],
)
optional_type_proto = onnx.helper.make_optional_type_proto(tensor_type_proto)
node = onnx.helper.make_node(
"OptionalGetElement", inputs=["optional_input"], outputs=["output"]
)
output = optional_get_element_reference_implementation(optional)
expect(
node,
inputs=[optional],
outputs=[output],
input_type_protos=[optional_type_proto],
name="test_optional_get_element_optional_tensor",
)
expect(
node,
inputs=[optional],
outputs=[output],
input_type_protos=[tensor_type_proto],
name="test_optional_get_element_tensor",
)
optional = np.array([1, 2, 3, 4]).astype(np.float32)
tensor_type_proto = onnx.helper.make_tensor_type_proto(
elem_type=onnx.TensorProto.FLOAT,
shape=[
4,
],
)
optional_type_proto = onnx.helper.make_optional_type_proto(tensor_type_proto)
# OptionalHasElement takes a tensor or optional as input
for input_type_protos in [tensor_type_proto, optional_type_proto]:
node = onnx.helper.make_node(
"OptionalHasElement", inputs=["optional_input"], outputs=["output"]
)
output = optional_has_element_reference_implementation(optional)
test_name = "test_optional_has_element_" + (
"optional_input"
if input_type_protos == optional_type_proto
else "tensor_input"
)
expect(
node,
inputs=[optional],
outputs=[output],
input_type_protos=[optional_type_proto],
name=test_name,
)
Returns the tensor resulting from performing the logical or operation
elementwise on the input tensors A and B (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
This version of the operator has been available since version 7 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Or-1">1</a>
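Because the test cases use random inputs, a small deterministic case may make the broadcasting easier to follow. This sketch assumes np.logical_or mirrors the operator, as the test cases themselves do.

```python
import numpy as np

x = np.array([[True, False, False], [False, False, True]])  # shape (2, 3)
y = np.array([False, True, False])                          # shape (3,)
z = np.logical_or(x, y)  # y is broadcast across both rows of x
print(z.shape)  # (2, 3)
```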
node = onnx.helper.make_node(
"Or",
inputs=["x", "y"],
outputs=["or"],
)
# 2d
x = (np.random.randn(3, 4) > 0).astype(bool)
y = (np.random.randn(3, 4) > 0).astype(bool)
z = np.logical_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_or2d")
# 3d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
y = (np.random.randn(3, 4, 5) > 0).astype(bool)
z = np.logical_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_or3d")
# 4d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
y = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
z = np.logical_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_or4d")
node = onnx.helper.make_node(
"Or",
inputs=["x", "y"],
outputs=["or"],
)
# 3d vs 1d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
y = (np.random.randn(5) > 0).astype(bool)
z = np.logical_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_or_bcast3v1d")
# 3d vs 2d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
y = (np.random.randn(4, 5) > 0).astype(bool)
z = np.logical_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_or_bcast3v2d")
# 4d vs 2d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
y = (np.random.randn(5, 6) > 0).astype(bool)
z = np.logical_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_or_bcast4v2d")
# 4d vs 3d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
y = (np.random.randn(4, 5, 6) > 0).astype(bool)
z = np.logical_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_or_bcast4v3d")
# 4d vs 4d
x = (np.random.randn(1, 4, 1, 6) > 0).astype(bool)
y = (np.random.randn(3, 1, 5, 6) > 0).astype(bool)
z = np.logical_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_or_bcast4v4d")
PRelu takes input data (Tensor<T>) and slope tensor as input, and produces one
output data (Tensor<T>) where the function f(x) = slope * x for x < 0,
f(x) = x for x >= 0, is applied to the data tensor elementwise.
This operator supports unidirectional broadcasting (tensor slope should be unidirectional broadcastable to input tensor X); for more details please check the doc.
This version of the operator has been available since version 16 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#PRelu-1">1</a>, <a href="Changelog.md#PRelu-6">6</a>, <a href="Changelog.md#PRelu-7">7</a>, <a href="Changelog.md#PRelu-9">9</a>
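The piecewise definition can be written directly with np.where. The sketch below uses a hypothetical helper name, prelu_sketch, and checks it against the clip-based formulation used in the test cases; slope has shape (3,) and is broadcast against x of shape (2, 3).

```python
import numpy as np


def prelu_sketch(x, slope):
    """Hypothetical helper: f(x) = slope * x for x < 0, f(x) = x otherwise."""
    return np.where(x < 0, slope * x, x)


x = np.array([[-2.0, -1.0, 0.0], [1.0, 2.0, -4.0]], dtype=np.float32)
slope = np.array([0.5, 0.1, 2.0], dtype=np.float32)
y = prelu_sketch(x, slope)

# Same result as the clip-based expression from the test cases.
y_clip = np.clip(x, 0, np.inf) + np.clip(x, -np.inf, 0) * slope
print(np.allclose(y, y_clip))  # True
```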
node = onnx.helper.make_node(
"PRelu",
inputs=["x", "slope"],
outputs=["y"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
slope = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, 0, np.inf) + np.clip(x, -np.inf, 0) * slope
expect(node, inputs=[x, slope], outputs=[y], name="test_prelu_example")
node = onnx.helper.make_node(
"PRelu",
inputs=["x", "slope"],
outputs=["y"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
slope = np.random.randn(5).astype(np.float32)
y = np.clip(x, 0, np.inf) + np.clip(x, -np.inf, 0) * slope
expect(node, inputs=[x, slope], outputs=[y], name="test_prelu_broadcast")
Given a tensor containing the data to be padded (data), a tensor containing the number of start and end pad values for each axis (pads), (optionally) a mode, and (optionally) constant_value,
a padded tensor (output) is generated.
The four supported modes are (similar to corresponding modes supported by numpy.pad):
constant (default) - pads with a given constant value as specified by constant_value (which defaults to 0, empty string, or False)
reflect - pads with the reflection of the vector mirrored on the first and last values of the vector along each axis
edge - pads with the edge values of array
wrap - wrap-around padding as if the data tensor forms a torus
Example 1 (constant mode):
Insert 0 pads to the beginning of the second dimension.
data = [
[1.0, 1.2],
[2.3, 3.4],
[4.5, 5.7],
]
pads = [0, 2, 0, 0]
mode = 'constant'
constant_value = 0.0
output = [
[0.0, 0.0, 1.0, 1.2],
[0.0, 0.0, 2.3, 3.4],
[0.0, 0.0, 4.5, 5.7],
]
Example 2 (reflect mode):
data = [
[1.0, 1.2],
[2.3, 3.4],
[4.5, 5.7],
]
pads = [0, 2, 0, 0]
mode = 'reflect'
output = [
[1.0, 1.2, 1.0, 1.2],
[2.3, 3.4, 2.3, 3.4],
[4.5, 5.7, 4.5, 5.7],
]
Example 3 (edge mode):
data = [
[1.0, 1.2],
[2.3, 3.4],
[4.5, 5.7],
]
pads = [0, 2, 0, 0]
mode = 'edge'
output = [
[1.0, 1.0, 1.0, 1.2],
[2.3, 2.3, 2.3, 3.4],
[4.5, 4.5, 4.5, 5.7],
]
Example 4 (wrap mode):
data = [
[1.0, 1.2],
[2.3, 3.4],
[4.5, 5.7],
]
pads = [2, 1, 1, 1]
mode = 'wrap'
output = [
[3.4, 2.3, 3.4, 2.3],
[5.7, 4.5, 5.7, 4.5],
[1.2, 1.0, 1.2, 1.0],
[3.4, 2.3, 3.4, 2.3],
[5.7, 4.5, 5.7, 4.5],
[1.2, 1.0, 1.2, 1.0],
]
This version of the operator has been available since version 25 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Pad-1">1</a>, <a href="Changelog.md#Pad-2">2</a>, <a href="Changelog.md#Pad-11">11</a>, <a href="Changelog.md#Pad-13">13</a>, <a href="Changelog.md#Pad-18">18</a>, <a href="Changelog.md#Pad-19">19</a>, <a href="Changelog.md#Pad-21">21</a>, <a href="Changelog.md#Pad-23">23</a>, <a href="Changelog.md#Pad-24">24</a>
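Since the description says the modes are similar to those of numpy.pad, the four worked examples can be cross-checked with NumPy. This is a sketch for these specific inputs only, not a claim that numpy.pad and Pad agree in every case; onnx_pads_to_numpy is a hypothetical helper that regroups the flat pads list.

```python
import numpy as np


def onnx_pads_to_numpy(pads):
    """Hypothetical helper: regroup ONNX pads into np.pad's per-axis (begin, end) pairs."""
    half = len(pads) // 2
    return list(zip(pads[:half], pads[half:]))


data = np.array([[1.0, 1.2], [2.3, 3.4], [4.5, 5.7]], dtype=np.float32)

# Examples 1 to 3 share pads = [0, 2, 0, 0]: two leading columns are added.
width = onnx_pads_to_numpy([0, 2, 0, 0])  # [(0, 0), (2, 0)]
constant = np.pad(data, width, mode="constant", constant_values=0.0)
reflect = np.pad(data, width, mode="reflect")
edge = np.pad(data, width, mode="edge")

# Example 4 pads both axes: 2 rows before, 1 row after, 1 column on each side.
wrap = np.pad(data, onnx_pads_to_numpy([2, 1, 1, 1]), mode="wrap")
print(wrap.shape)  # (6, 4)
```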
node = onnx.helper.make_node(
"Pad", inputs=["x", "pads", "value"], outputs=["y"], mode="constant"
)
x = np.random.randn(1, 3, 4, 5).astype(np.float32)
pads = np.array([0, 0, 1, 3, 0, 0, 2, 4]).astype(
np.int64
) # pad order [x1_begin, x2_begin, ..., x1_end, x2_end, ...]
value = np.float32(1.2)
y = pad_impl(x, pads, "constant", 1.2)
expect(node, inputs=[x, pads, value], outputs=[y], name="test_constant_pad")
node = onnx.helper.make_node(
"Pad", inputs=["x", "pads", "value", "axes"], outputs=["y"], mode="constant"
)
x = np.random.randn(1, 3, 4, 5).astype(np.float32)
pads = np.array([0, 3, 0, 4]).astype(
np.int64
) # pad order [x1_begin, x2_begin, ..., x1_end, x2_end, ...]
value = np.float32(1.2)
axes = np.array([1, 3], dtype=np.int64)
y = pad_impl(
x,
pads,
"constant",
1.2,
[1, 3],
)
expect(
node,
inputs=[x, pads, value, axes],
outputs=[y],
name="test_constant_pad_axes",
)
node = onnx.helper.make_node(
"Pad", inputs=["x", "pads", "value", "axes"], outputs=["y"], mode="constant"
)
x = np.random.randn(1, 3, 4, 5).astype(np.float32)
pads = np.array([0, 3, 0, 4]).astype(
np.int64
) # pad order [x1_begin, x2_begin, ..., x1_end, x2_end, ...]
value = np.float32(1.2)
axes = np.array([-3, -1], dtype=np.int64)
y = pad_impl(
x,
pads,
"constant",
1.2,
[-3, -1],
)
expect(
node,
inputs=[x, pads, value, axes],
outputs=[y],
name="test_constant_pad_negative_axes",
)
for mode in ("edge", "reflect", "wrap"):
node = onnx.helper.make_node(
"Pad", inputs=["x", "pads"], outputs=["y"], mode=mode
)
x = np.random.randn(1, 3, 4, 5).astype(np.int32)
pads = np.array([0, 0, 1, 1, 0, 0, 1, 1]).astype(
np.int64
) # pad order [x1_begin, x2_begin, ..., x1_end, x2_end, ...]
y = pad_impl(x, pads, mode)
expect(node, inputs=[x, pads], outputs=[y], name=f"test_{mode}_pad")
Pow takes input data (Tensor<T>) and exponent Tensor, and
produces one output data (Tensor<T>) where the function f(x) = x^exponent,
is applied to the data tensor elementwise.
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
This version of the operator has been available since version 15 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Pow-1">1</a>, <a href="Changelog.md#Pow-7">7</a>, <a href="Changelog.md#Pow-12">12</a>, <a href="Changelog.md#Pow-13">13</a>
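The description states that the output is a Tensor<T> like the input data, so with mixed base and exponent types one plausible NumPy reading is to compute the power with broadcasting and cast back to the base type. The helper name pow_sketch and the cast are assumptions made for illustration.

```python
import numpy as np


def pow_sketch(x, y):
    """Hypothetical helper: broadcast x ** y, then cast back to the base tensor's type."""
    return np.power(x, y).astype(x.dtype)


# Integer base with a float exponent: the result keeps the base's int64 type.
x = np.array([1, 2, 3], dtype=np.int64)
y = np.array([4, 5, 6], dtype=np.float32)
z = pow_sketch(x, y)
print(z.dtype)  # int64

# A 1-D exponent is broadcast across each row of a 2-D base.
zb = pow_sketch(
    np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32),
    np.array([1, 2, 3], dtype=np.float32),
)
print(zb.shape)  # (2, 3)
```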
node = onnx.helper.make_node(
"Pow",
inputs=["x", "y"],
outputs=["z"],
)
x = np.array([1, 2, 3]).astype(np.float32)
y = np.array([4, 5, 6]).astype(np.float32)
z = pow(x, y) # expected output [1., 32., 729.]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_example")
x = np.arange(60).reshape(3, 4, 5).astype(np.float32)
y = np.random.randn(3, 4, 5).astype(np.float32)
z = pow(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_pow")
node = onnx.helper.make_node(
"Pow",
inputs=["x", "y"],
outputs=["z"],
)
x = np.array([1, 2, 3]).astype(np.float32)
y = np.array(2).astype(np.float32)
z = pow(x, y) # expected output [1., 4., 9.]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_bcast_scalar")
node = onnx.helper.make_node(
"Pow",
inputs=["x", "y"],
outputs=["z"],
)
x = np.array([[1, 2, 3], [4, 5, 6]]).astype(np.float32)
y = np.array([1, 2, 3]).astype(np.float32)
# expected output [[1, 4, 27], [4, 25, 216]]
z = pow(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_pow_bcast_array")
node = onnx.helper.make_node(
"Pow",
inputs=["x", "y"],
outputs=["z"],
)
x = np.array([1, 2, 3]).astype(np.float32)
y = np.array([4, 5, 6]).astype(np.int64)
z = pow(x, y) # expected output [1., 32., 729.]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_types_float32_int64")
x = np.array([1, 2, 3]).astype(np.int64)
y = np.array([4, 5, 6]).astype(np.float32)
z = pow(x, y) # expected output [1, 32, 729]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_types_int64_float32")
x = np.array([1, 2, 3]).astype(np.float32)
y = np.array([4, 5, 6]).astype(np.int32)
z = pow(x, y) # expected output [1., 32., 729.]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_types_float32_int32")
x = np.array([1, 2, 3]).astype(np.int32)
y = np.array([4, 5, 6]).astype(np.float32)
z = pow(x, y) # expected output [1, 32, 729]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_types_int32_float32")
x = np.array([1, 2, 3]).astype(np.float32)
y = np.array([4, 5, 6]).astype(np.uint64)
z = pow(x, y) # expected output [1., 32., 729.]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_types_float32_uint64")
x = np.array([1, 2, 3]).astype(np.float32)
y = np.array([4, 5, 6]).astype(np.uint32)
z = pow(x, y) # expected output [1., 32., 729.]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_types_float32_uint32")
x = np.array([1, 2, 3]).astype(np.int64)
y = np.array([4, 5, 6]).astype(np.int64)
z = pow(x, y) # expected output [1, 32, 729]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_types_int64_int64")
x = np.array([1, 2, 3]).astype(np.int32)
y = np.array([4, 5, 6]).astype(np.int32)
z = pow(x, y) # expected output [1, 32, 729]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_types_int32_int32")
The convolution operator consumes a quantized input tensor, its scale and zero point, a quantized filter, its scale and zero point, and the output's scale and zero point, and computes the quantized output. Each scale and zero-point pair must have the same shape, meaning they must be either scalars (per tensor) or 1-D tensors (per output channel). Each input or output and its related zero point must have the same type. When bias is present, it must be quantized using scale = input scale * weight scale and a zero point of 0.
This version of the operator has been available since version 10 of the default ONNX operator set.
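For a single 1x1 filter on a single channel, the convolution reduces to an element-wise product, which makes the quantization arithmetic easy to trace. The sketch below covers only that special case and assumes the usual affine mapping real = (q - zero_point) * scale, which the description does not spell out; qlinear_conv_1x1_sketch is a hypothetical name. The constants are taken from the test case for this operator.

```python
import numpy as np


def qlinear_conv_1x1_sketch(x, x_scale, x_zp, w, w_scale, w_zp, y_scale, y_zp):
    """Hypothetical helper for one 1x1 uint8 filter (element-wise product only)."""
    # Dequantize, assuming real = (q - zero_point) * scale.
    x_real = (x.astype(np.float64) - float(x_zp)) * float(x_scale)
    w_real = (float(w) - float(w_zp)) * float(w_scale)
    # Requantize the real-valued product and saturate to the uint8 range.
    y = np.rint(x_real * w_real / float(y_scale)) + float(y_zp)
    return np.clip(y, 0, 255).astype(np.uint8)


x_row = np.array([255, 174, 162, 25, 203, 168, 58], dtype=np.uint8)
y_row = qlinear_conv_1x1_sketch(
    x_row,
    x_scale=0.00369204697,
    x_zp=132,
    w=0,
    w_scale=0.00172794575,
    w_zp=255,
    y_scale=0.00162681262,
    y_zp=123,
)
print(y_row.tolist())  # [0, 81, 93, 230, 52, 87, 197]
```

The printed row matches the first row of the expected output in the test case.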
node = onnx.helper.make_node(
"QLinearConv",
inputs=[
"x",
"x_scale",
"x_zero_point",
"w",
"w_scale",
"w_zero_point",
"y_scale",
"y_zero_point",
],
outputs=["y"],
)
x = np.array(
[
[255, 174, 162, 25, 203, 168, 58],
[15, 59, 237, 95, 129, 0, 64],
[56, 242, 153, 221, 168, 12, 166],
[232, 178, 186, 195, 237, 162, 237],
[188, 39, 124, 77, 80, 102, 43],
[127, 230, 21, 83, 41, 40, 134],
[255, 154, 92, 141, 42, 148, 247],
],
dtype=np.uint8,
).reshape((1, 1, 7, 7))
x_scale = np.float32(0.00369204697)
x_zero_point = np.uint8(132)
w = np.array([0], dtype=np.uint8).reshape((1, 1, 1, 1))
w_scale = np.array([0.00172794575], dtype=np.float32)
w_zero_point = np.array([255], dtype=np.uint8)
y_scale = np.float32(0.00162681262)
y_zero_point = np.uint8(123)
output = np.array(
[
[0, 81, 93, 230, 52, 87, 197],
[240, 196, 18, 160, 126, 255, 191],
[199, 13, 102, 34, 87, 243, 89],
[23, 77, 69, 60, 18, 93, 18],
[67, 216, 131, 178, 175, 153, 212],
[128, 25, 234, 172, 214, 215, 121],
[0, 101, 163, 114, 213, 107, 8],
],
dtype=np.uint8,
).reshape((1, 1, 7, 7))
expect(
node,
inputs=[
x,
x_scale,
x_zero_point,
w,
w_scale,
w_zero_point,
y_scale,
y_zero_point,
],
outputs=[output],
name="test_qlinearconv",
)
Matrix product that behaves like numpy.matmul. It consumes two quantized input tensors, their scales and zero points, and the scale and zero point of the output, and computes the quantized output. The quantization formula is y = saturate((x / y_scale) + y_zero_point). (x / y_scale) is rounded to the nearest value, with ties to even. Refer to https://en.wikipedia.org/wiki/Rounding for details. Scale and zero point must have the same shape. They must be either scalar (per tensor) or N-D tensor (per row for 'a' and per column for 'b'). Scalar refers to per tensor quantization whereas N-D refers to per row or per column quantization. If the input is 2D of shape [M, K] then the zero point and scale tensor may be an M element vector [v_1, v_2, ..., v_M] for per row quantization and a K element vector [v_1, v_2, ..., v_K] for per column quantization. If the input is an N-D tensor with shape [D1, D2, M, K] then the zero point and scale tensor may have shape [D1, D2, M, 1] for per row quantization and shape [D1, D2, 1, K] for per column quantization. The product must never overflow, and accumulation may overflow if and only if it is done in 32 bits.
This version of the operator has been available since version 21 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#QLinearMatMul-10">10</a>
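A minimal NumPy sketch of the per-tensor case follows: dequantize both operands, multiply, then requantize with the formula from the description. The helper name qlinear_matmul_sketch is hypothetical, and the dequantization step real = (q - zero_point) * scale is an assumption, since the description only states the output formula. The inputs are the uint8 2D values from the test case.

```python
import numpy as np


def qlinear_matmul_sketch(a, a_scale, a_zp, b, b_scale, b_zp, y_scale, y_zp):
    """Hypothetical helper for per-tensor scales and zero points."""
    # Dequantize both operands, assuming real = (q - zero_point) * scale.
    a_real = (a.astype(np.float64) - a_zp.astype(np.float64)) * a_scale
    b_real = (b.astype(np.float64) - b_zp.astype(np.float64)) * b_scale
    product = np.matmul(a_real, b_real)
    # y = saturate(round_half_even(product / y_scale) + y_zero_point)
    info = np.iinfo(y_zp.dtype)
    y = np.rint(product / y_scale) + y_zp.astype(np.int64)
    return np.clip(y, info.min, info.max).astype(y_zp.dtype)


a = np.array([[208, 236, 0, 238], [3, 214, 255, 29]], dtype=np.uint8)
b = np.array(
    [[152, 51, 244], [60, 26, 255], [0, 127, 246], [127, 254, 247]], dtype=np.uint8
)
y = qlinear_matmul_sketch(
    a,
    np.array([0.0066], dtype=np.float32),
    np.array([113], dtype=np.uint8),
    b,
    np.array([0.00705], dtype=np.float32),
    np.array([114], dtype=np.uint8),
    np.array([0.0107], dtype=np.float32),
    np.array([118], dtype=np.uint8),
)
print(y.tolist())  # [[168, 115, 255], [1, 66, 151]]
```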
for quant_type_name in ["uint8", "int8"]:
quant_type = getattr(np, quant_type_name)
for dtype_name in ["float32", "float16"]:
dtype = getattr(np, dtype_name)
node = onnx.helper.make_node(
"QLinearMatMul",
inputs=[
"a",
"a_scale",
"a_zero_point",
"b",
"b_scale",
"b_zero_point",
"y_scale",
"y_zero_point",
],
outputs=["y"],
)
# 2D
a = np.array([[208, 236, 0, 238], [3, 214, 255, 29]])
if quant_type == np.int8:
a -= 127
a = a.astype(quant_type)
a_scale = np.array([0.0066], dtype=dtype)
a_zero_point = np.array(
[113 - 127] if quant_type == np.int8 else [113], dtype=quant_type
)
b = np.array(
[[152, 51, 244], [60, 26, 255], [0, 127, 246], [127, 254, 247]]
)
if quant_type == np.int8:
b -= 127
b = b.astype(quant_type)
b_scale = np.array([0.00705], dtype=dtype)
b_zero_point = np.array(
[114 - 127] if quant_type == np.int8 else [114], dtype=quant_type
)
y_scale = np.array([0.0107], dtype=dtype)
y_zero_point = np.array(
[118 - 127] if quant_type == np.int8 else [118], dtype=quant_type
)
if quant_type == np.int8:
output = np.array([[41, -12, -9], [1, -75, 20]])
else:
output = np.array([[168, 115, 255], [1, 66, 151]])
output = output.astype(quant_type)
expect(
node,
inputs=[
a,
a_scale,
a_zero_point,
b,
b_scale,
b_zero_point,
y_scale,
y_zero_point,
],
outputs=[output],
name=f"test_qlinearmatmul_2D_{quant_type_name}_{dtype_name}",
)
# 3D
a = np.array(
[
[[208, 236, 0, 238], [3, 214, 255, 29]],
[[208, 236, 0, 238], [3, 214, 255, 29]],
],
)
if quant_type == np.int8:
a -= 127
a = a.astype(quant_type)
a_scale = np.array([0.0066], dtype=dtype)
a_zero_point = np.array(
[113 - 127] if quant_type == np.int8 else [113], dtype=quant_type
)
b = np.array(
[
[[152, 51, 244], [60, 26, 255], [0, 127, 246], [127, 254, 247]],
[[152, 51, 244], [60, 26, 255], [0, 127, 246], [127, 254, 247]],
],
)
if quant_type == np.int8:
b -= 127
b = b.astype(quant_type)
b_scale = np.array([0.00705], dtype=dtype)
b_zero_point = np.array([114], dtype=quant_type)
y_scale = np.array([0.0107], dtype=dtype)
y_zero_point = np.array(
[118 - 127] if quant_type == np.int8 else [118], dtype=quant_type
)
if quant_type == np.int8:
if dtype == np.float32:
output = np.array(
[
[[-86, 117, 120], [115, 39, -121]],
[[-86, 117, 120], [115, 39, -121]],
]
)
else:
output = np.array(
[
[[-86, 116, 119], [115, 39, -121]],
[[-86, 116, 119], [115, 39, -121]],
]
)
else:
output = np.array(
[
[[168, 115, 255], [1, 66, 151]],
[[168, 115, 255], [1, 66, 151]],
]
)
output = output.astype(quant_type)
expect(
node,
inputs=[
a,
a_scale,
a_zero_point,
b,
b_scale,
b_zero_point,
y_scale,
y_zero_point,
],
outputs=[output],
name=f"test_qlinearmatmul_3D_{quant_type_name}_{dtype_name}",
)
The linear quantization operator consumes a high-precision tensor, a scale, and a zero point to compute the
low-precision/quantized tensor. The scale factor and zero point must have the same shape, determining the quantization
granularity. The quantization formula is y = saturate((x / y_scale) + y_zero_point).
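Saturation is done according to the range of the output type:
uint16: [0, 65535]
int16: [-32768, 32767]
uint8: [0, 255]
int8: [-128, 127]
uint4: [0, 15]
int4: [-8, 7]
uint2: [0, 3]
int2: [-2, 1]
(x / y_scale) is rounded to the nearest value, with ties to even. Refer to https://en.wikipedia.org/wiki/Rounding for details.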
y_zero_point and y must have the same type. y_zero_point is usually not used for quantization to float8 and 4bit types, but the quantization
formula remains the same for consistency, and the type of the y_zero_point input still determines the quantization type.
x and y_scale are allowed to have different types. The type of y_scale determines the precision of the division operation between x and
y_scale, unless the precision attribute is specified.
There are three supported quantization granularities, determined by the shape of y_scale.
In all cases, y_zero_point must have the same shape as y_scale.
Per-tensor quantization: y_scale is a scalar.
Per-axis quantization: for an input of shape (D0, ..., Di, ..., Dn) and axis=i, y_scale is a 1-D tensor of length Di.
Blocked quantization: for x of shape (D0, ..., Di, ..., Dn), axis=i, and block size B, the y_scale shape is (D0, ..., ceil(Di/B), ..., Dn).
This version of the operator has been available since version 25 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#QuantizeLinear-10">10</a>, <a href="Changelog.md#QuantizeLinear-13">13</a>, <a href="Changelog.md#QuantizeLinear-19">19</a>, <a href="Changelog.md#QuantizeLinear-21">21</a>, <a href="Changelog.md#QuantizeLinear-23">23</a>, <a href="Changelog.md#QuantizeLinear-24">24</a>
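The formula y = saturate((x / y_scale) + y_zero_point) can be sketched in NumPy for integer outputs with per-tensor or per-axis scales. The helper name quantize_linear_sketch is hypothetical; np.rint is used because it rounds ties to even, and the default axis of 1 is inferred from the per-axis test case, which passes no axis for an input of shape (1, 3, 3, 2) with three scales. Blocked quantization and the float8 and 4-bit types are not covered.

```python
import numpy as np


def quantize_linear_sketch(x, y_scale, y_zero_point, axis=1):
    """Hypothetical helper: integer outputs, per-tensor or per-axis scales only."""
    y_scale = np.asarray(y_scale)
    y_zero_point = np.asarray(y_zero_point)
    if y_scale.ndim == 1:
        # Per-axis: reshape so scale and zero point broadcast along `axis`.
        shape = [1] * x.ndim
        shape[axis] = y_scale.shape[0]
        y_scale = y_scale.reshape(shape)
        y_zero_point = y_zero_point.reshape(shape)
    info = np.iinfo(y_zero_point.dtype)
    # np.rint rounds to nearest with ties to even (1.5 -> 2, 2.5 -> 2).
    y = np.rint(x / y_scale) + y_zero_point.astype(np.int64)
    # Saturate to the range of the output type.
    return np.clip(y, info.min, info.max).astype(y_zero_point.dtype)


x = np.array([0, 2, 3, 1000, -254, -1000], dtype=np.float32)
y = quantize_linear_sketch(x, np.float32(2), np.uint8(128))
print(y.tolist())  # [128, 129, 130, 255, 1, 0]
```

Here 3 / 2 = 1.5 rounds to 2, giving 130, while 1000 and -1000 saturate to 255 and 0.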
node = onnx.helper.make_node(
"QuantizeLinear",
inputs=["x", "y_scale", "y_zero_point"],
outputs=["y"],
)
x = np.array(
[
[
[[-162, 10], [-100, 232], [-20, -50]],
[[-76, 0], [0, 252], [32, -44]],
[[245, -485], [-960, -270], [-375, -470]],
],
],
dtype=np.float32,
)
y_scale = np.array([2, 4, 5], dtype=np.float32)
y_zero_point = np.array([84, 24, 196], dtype=np.uint8)
y = (x / y_scale.reshape(1, 3, 1, 1) + y_zero_point.reshape(1, 3, 1, 1)).astype(
np.uint8
)
expect(
node,
inputs=[x, y_scale, y_zero_point],
outputs=[y],
name="test_quantizelinear_axis",
)
node = onnx.helper.make_node(
"QuantizeLinear",
inputs=["x", "y_scale", "y_zero_point"],
outputs=["y"],
axis=1,
block_size=2,
)
x = np.array(
[
[6.0, 12.0, 50.0, 5.0],
[1.0, 8.0, 4.0, 5.0],
[0.0, 20.0, 10.0, 4.0],
],
dtype=np.float32,
)
y_scale = np.array(
[
[1.5, 2.5],
[3.0, 4.9],
[5.1, 6.9],
],
dtype=np.float32,
)
y_zero_point = np.array(
[
[0, 1],
[1, 0],
[2, 3],
],
dtype=np.uint8,
)
# x.shape = (3, 4)
# y_scale.shape = (3, 2)
assert y_scale.shape == y_zero_point.shape
block_axis = 1
# The block shape is [x.shape[i] // y_scale.shape[i] for i in range(len(x.shape))] = (1, 2)
assert all(
x.shape[i] == y_scale.shape[i]
for i in range(len(x.shape))
if i != block_axis
)
assert x.shape[block_axis] % y_scale.shape[block_axis] == 0
repeats = x.shape[block_axis] // y_scale.shape[block_axis]
# Create element-wise scale and zero point
y_scale_elementwise = np.repeat(y_scale, repeats=repeats, axis=block_axis)
y_zero_point_elementwise = np.repeat(
y_zero_point, repeats=repeats, axis=block_axis
)
y = np.rint(x / y_scale_elementwise + y_zero_point_elementwise).astype(np.uint8)
expect(
node,
inputs=[x, y_scale, y_zero_point],
outputs=[y],
name="test_quantizelinear_blocked_asymmetric",
)
node = onnx.helper.make_node(
"QuantizeLinear",
inputs=["x", "y_scale"],
outputs=["y"],
axis=1,
block_size=2,
output_dtype=TensorProto.INT16,
)
x = np.array(
[
[6.0, -8, -10, 5.0],
[1.0, 8.0, 4.0, 5.0],
[0.0, 20.0, 10.0, 4.0],
],
dtype=np.float32,
)
y_scale = np.array(
[
[1.5, 2.5],
[3.0, 4.9],
[5.1, 6.9],
],
dtype=np.float32,
)
# x.shape = (3, 4)
# y_scale.shape = (3, 2)
block_axis = 1
# The block shape is [x.shape[i] // y_scale.shape[i] for i in range(len(x.shape))] = (1, 2)
assert all(
x.shape[i] == y_scale.shape[i]
for i in range(len(x.shape))
if i != block_axis
)
assert x.shape[block_axis] % y_scale.shape[block_axis] == 0
repeats = x.shape[block_axis] // y_scale.shape[block_axis]
# Create element-wise scale and zero point
y_scale_elementwise = np.repeat(y_scale, repeats=repeats, axis=block_axis)
y_val = np.clip(
np.rint(x / y_scale_elementwise), a_min=-32768, a_max=32767
).astype(np.int16)
y = make_tensor(
"y",
TensorProto.INT16,
x.shape,
y_val,
)
expect(
node,
inputs=[x, y_scale],
outputs=[y],
name="test_quantizelinear_blocked_symmetric",
)
node = onnx.helper.make_node(
"QuantizeLinear",
inputs=["x", "y_scale", "y_zero_point"],
outputs=["y"],
)
x = np.array([0.0, 1.0, 2.0, 100000.0, 200.0]).astype(np.float32)
y_scale = np.float32(2)
y_zero_point = make_tensor("y_zero_point", TensorProto.FLOAT8E4M3FN, [1], [0])
y = make_tensor("y", TensorProto.FLOAT8E4M3FN, [5], [0, 0.5, 1, 448, 96])
expect(
node,
inputs=[x, y_scale, y_zero_point],
outputs=[y],
name="test_quantizelinear_e4m3fn",
)
node = onnx.helper.make_node(
"QuantizeLinear",
inputs=["x", "y_scale", "y_zero_point"],
outputs=["y"],
)
x = np.array([0.0, 1.0, 2.0, 100000.0, 200.0]).astype(np.float32)
y_scale = np.float32(2)
y_zero_point = make_tensor("y_zero_point", TensorProto.FLOAT8E5M2, [1], [0.0])
y = make_tensor("y", TensorProto.FLOAT8E5M2, [5], [0, 0.5, 1, 49152, 96])
expect(
node,
inputs=[x, y_scale, y_zero_point],
outputs=[y],
name="test_quantizelinear_e5m2",
)
node = onnx.helper.make_node(
"QuantizeLinear",
inputs=["x", "y_scale", "y_zero_point"],
outputs=["y"],
axis=0,
)
x = np.array(
[
[0.0, 2.5, 4.8, 8.6],
[-30, -20, 6, 9],
[-0.0, -2.5, -4.8, -8.6],
]
).astype(np.float32)
y_scale = np.asarray([2.0, 3.0, 4.0], dtype=np.float32)
y_zero_point = make_tensor(
"y_zero_point",
TensorProto.FLOAT4E2M1,
y_scale.shape,
np.zeros_like(y_scale),
)
y = make_tensor(
"y",
TensorProto.FLOAT4E2M1,
x.shape,
[0, 1, 2, 4, -6, -6, 2, 3, 0, -0.5, -1, -2],
)
expect(
node,
inputs=[x, y_scale, y_zero_point],
outputs=[y],
name="test_quantizelinear_float4e2m1",
)
node = onnx.helper.make_node(
"QuantizeLinear",
inputs=["x", "y_scale", "y_zero_point"],
outputs=["y"],
)
x = np.array(
[
0.0,
-514.0,
3.0,
-3.0,
2.9,
-2.9,
3.1,
-3.1,
65022.0,
-66046.0,
65023.0,
-66047.0,
65024.0,
-66048.0,
70000.0,
-70000.0,
]
).astype(np.float32)
y_scale = np.float32(2.0)
y_zero_point = np.int16(256)
y = np.array(
[
256,
-1,
258,
254,
257,
255,
258,
254,
32767,
-32767,
32767,
-32768,
32767,
-32768,
32767,
-32768,
]
).astype(np.int16)
expect(
node,
inputs=[x, y_scale, y_zero_point],
outputs=[y],
name="test_quantizelinear_int16",
)
node = onnx.helper.make_node(
"QuantizeLinear",
inputs=["x", "y_scale", "y_zero_point"],
outputs=["y"],
axis=0,
)
x = np.array(
[
[0.0, 2.5, 4.8, 8.6],
[-4.0, -3.0, 1.0, 2.0],
[-0.0, -2.5, -4.8, -8.6],
],
dtype=np.float32,
)
y_scale = np.asarray([2.0, 3.0, 4.0], dtype=np.float32)
y_zero_point = make_tensor(
"y_zero_point", TensorProto.INT2, y_scale.shape, np.zeros_like(y_scale)
)
y = make_tensor(
"y", TensorProto.INT2, x.shape, [0, 1, 1, 1, -1, -1, 0, 1, 0, -1, -1, -2]
)
expect(
node,
inputs=[x, y_scale, y_zero_point],
outputs=[y],
name="test_quantizelinear_int2",
)
node = onnx.helper.make_node(
"QuantizeLinear",
inputs=["x", "y_scale", "y_zero_point"],
outputs=["y"],
axis=0,
)
x = np.array(
[
[0.0, 2.5, 4.8, 8.6],
[-30, -20, 6, 9],
[12, 15, 16, 40],
]
).astype(np.float32)
y_scale = np.asarray([2.0, 3.0, 4.0], dtype=np.float32)
y_zero_point = make_tensor(
"y_zero_point", TensorProto.INT4, y_scale.shape, np.ones_like(y_scale)
)
y = make_tensor(
"y", TensorProto.INT4, x.shape, [1, 2, 3, 5, -8, -6, 3, 4, 4, 5, 5, 7]
)
expect(
node,
inputs=[x, y_scale, y_zero_point],
outputs=[y],
name="test_quantizelinear_int4",
)
node = onnx.helper.make_node(
"QuantizeLinear",
inputs=["x", "y_scale", "y_zero_point"],
outputs=["y"],
)
x = np.array([0, 2, 3, 1000, -254, -1000]).astype(np.float32)
y_scale = np.float32(2)
y_zero_point = np.uint8(128)
y = np.array([128, 129, 130, 255, 1, 0]).astype(np.uint8)
expect(
node,
inputs=[x, y_scale, y_zero_point],
outputs=[y],
name="test_quantizelinear",
)
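The expected values in the uint8 example above can be reproduced with a short NumPy sketch. It assumes the usual QuantizeLinear formula, y = saturate(round(x / y_scale) + y_zero_point), with round-half-to-even. The helper name `quantize_linear_uint8_sketch` is hypothetical and is not part of onnx.

```python
import numpy as np


def quantize_linear_uint8_sketch(x, y_scale, y_zero_point):
    """Illustrative only: per-tensor quantization to uint8."""
    # np.rint rounds halves to the nearest even integer (1.5 -> 2, 2.5 -> 2).
    q = np.rint(x / y_scale) + float(y_zero_point)
    # Saturate to the uint8 range before casting.
    return np.clip(q, 0, 255).astype(np.uint8)


x = np.array([0, 2, 3, 1000, -254, -1000], dtype=np.float32)
y = quantize_linear_uint8_sketch(x, np.float32(2), np.uint8(128))
```

Here 1000 / 2 + 128 exceeds 255 and saturates to 255, while -1000 / 2 + 128 is negative and saturates to 0.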
node = onnx.helper.make_node(
"QuantizeLinear",
inputs=["x", "y_scale", "y_zero_point"],
outputs=["y"],
)
x = np.array(
[
0.0,
-128.0,
3.0,
-3.0,
2.9,
-2.9,
3.1,
-3.1,
65536.0,
-65534.0,
70000.0,
-70000.0,
]
).astype(np.float32)
y_scale = np.float32(2.0)
y_zero_point = np.uint16(32767)
y = np.array(
[
32767,
32703,
32769,
32765,
32768,
32766,
32769,
32765,
65535,
0,
65535,
0,
]
).astype(np.uint16)
expect(
node,
inputs=[x, y_scale, y_zero_point],
outputs=[y],
name="test_quantizelinear_uint16",
)
node = onnx.helper.make_node(
"QuantizeLinear",
inputs=["x", "y_scale", "y_zero_point"],
outputs=["y"],
axis=0,
)
x = np.array(
[
[0.0, 2.5, 4.8, 8.6],
[-2.0, -1.0, 1.0, 3.0],
[4.0, 5.0, 6.0, 7.0],
],
dtype=np.float32,
)
y_scale = np.asarray([2.0, 3.0, 4.0], dtype=np.float32)
y_zero_point = make_tensor(
"y_zero_point", TensorProto.UINT2, y_scale.shape, np.zeros_like(y_scale)
)
y = make_tensor(
"y", TensorProto.UINT2, x.shape, [0, 1, 2, 3, 0, 0, 0, 1, 1, 1, 2, 2]
)
expect(
node,
inputs=[x, y_scale, y_zero_point],
outputs=[y],
name="test_quantizelinear_uint2",
)
node = onnx.helper.make_node(
"QuantizeLinear",
inputs=["x", "y_scale", "y_zero_point"],
outputs=["y"],
axis=0,
)
x = np.array(
[
[0.0, 2.5, 4.8, 8.6],
[-30, -20, 6, 9],
[12, 15, 16, 40],
]
).astype(np.float32)
y_scale = np.asarray([2.0, 3.0, 4.0], dtype=np.float32)
y_zero_point = make_tensor(
"y_zero_point", TensorProto.UINT4, y_scale.shape, np.ones_like(y_scale)
)
y = make_tensor(
"y", TensorProto.UINT4, x.shape, [1, 2, 3, 5, 0, 0, 3, 4, 4, 5, 5, 11]
)
expect(
node,
inputs=[x, y_scale, y_zero_point],
outputs=[y],
name="test_quantizelinear_uint4",
)
This is RMS normalization, defined in ONNX as a function, as described in the paper https://arxiv.org/pdf/1910.07467.
The overall computation can be split into two stages. The root mean squared norm is taken over the last D dimensions,
where D is the dimension of normalized_shape. For example, if normalized_shape is (3, 5) (a 2-dimensional shape),
the rms norm is computed over the last 2 dimensions of the input. The computation required by standardization can be
described by the following equations.
XSquared = Mul(X, X)
XSquaredMean = ReduceMean<axes=normalized_axes>(XSquared)
MeanSquareEpsilon = Add(XSquaredMean, epsilon)
RMS = Sqrt(MeanSquareEpsilon)
Normalized = Div(X, RMS)
where normalized_axes is [axis, ..., rank of X - 1]. The variable RMS stands for root mean square.
Depending on the stash_type attribute, the actual computation
must happen in a different floating-point precision.
For example, if stash_type is 1, this operator casts
all input variables to 32-bit float, performs the computation, and
finally casts Normalized back to the original type of X.
The second stage then scales the outcome of the first stage using:
Y = Mul(Normalized, Scale)
Let d[i] indicate the i-th dimension of X.
If X's shape is [d[0], ..., d[axis-1], d[axis], ..., d[rank-1]],
the shape of RMS is [d[0], ..., d[axis-1], 1, ..., 1].
Y and X have the same shape. This operator supports unidirectional broadcasting
(Scale should be unidirectional broadcastable to tensor X);
for more details please check the doc.
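The two stages described above can be sketched in NumPy as follows. This is a minimal illustration, not the reference implementation: the function name `rms_normalization_sketch` is hypothetical, the default `epsilon=1e-5` is an assumption, and the computation is fixed to 32-bit float (the stash_type = 1 case).

```python
import numpy as np


def rms_normalization_sketch(X, Scale, axis=-1, epsilon=1e-5):
    """Illustrative RMS normalization over axes [axis, ..., rank - 1]."""
    rank = X.ndim
    if axis < 0:
        axis += rank
    normalized_axes = tuple(range(axis, rank))

    # Stage 1: normalize by the root mean square, computed in float32.
    x = X.astype(np.float32)
    x_squared_mean = np.mean(x * x, axis=normalized_axes, keepdims=True)
    rms = np.sqrt(x_squared_mean + epsilon)  # shape [d[0], ..., d[axis-1], 1, ..., 1]
    normalized = (x / rms).astype(X.dtype)

    # Stage 2: scale; Scale is broadcast against the normalized tensor.
    return normalized * Scale


X = np.array([[3.0, 4.0]], dtype=np.float32)
Y = rms_normalization_sketch(X, np.ones(2, dtype=np.float32), epsilon=0.0)
```

With `epsilon=0.0` the mean of squares for the row `[3, 4]` is 12.5, so each element is divided by `sqrt(12.5)`.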
This version of the operator has been available since version 23 of the default ONNX operator set.
X = np.random.randn(3, 4).astype(np.float32)
def case(axis: int) -> None:
normalized_shape = calculate_normalized_shape(X.shape, axis)
W = np.random.randn(*normalized_shape).astype(np.float32)
Y = _rms_normalization(X, W, axis=axis)
node = onnx.helper.make_node(
"RMSNormalization",
inputs=["X", "W"],
outputs=["Y"],
axis=axis,
)
if axis < 0:
name = f"test_rms_normalization_2d_axis_negative_{-axis}"
else:
name = f"test_rms_normalization_2d_axis{axis}"
expect(node, inputs=[X, W], outputs=[Y], name=name)
for i in range(len(X.shape)):
case(i)
case(i - len(X.shape))
epsilon = 1e-1
X = np.random.randn(2, 3, 5).astype(np.float32)
def case(axis: int) -> None:
normalized_shape = calculate_normalized_shape(X.shape, axis)
W = np.random.randn(*normalized_shape).astype(np.float32)
Y = _rms_normalization(X, W, axis=axis, epsilon=epsilon)
node = onnx.helper.make_node(
"RMSNormalization",
inputs=["X", "W"],
outputs=["Y"],
axis=axis,
epsilon=epsilon,
)
if axis < 0:
name = f"test_rms_normalization_3d_axis_negative_{-axis}_epsilon"
else:
name = f"test_rms_normalization_3d_axis{axis}_epsilon"
expect(node, inputs=[X, W], outputs=[Y], name=name)
for i in range(len(X.shape)):
case(i)
case(i - len(X.shape))
X = np.random.randn(2, 3, 4, 5).astype(np.float32)
# Default axis in RMSNormalization is -1.
normalized_shape = calculate_normalized_shape(X.shape, -1)
W = np.random.randn(*normalized_shape).astype(np.float32)
# Axis defaults to -1 in the reference implementation.
Y = _rms_normalization(X, W)
# Not specifying axis attribute means -1.
node = onnx.helper.make_node(
"RMSNormalization",
inputs=["X", "W"],
outputs=["Y"],
)
expect(
node,
inputs=[X, W],
outputs=[Y],
name="test_rms_normalization_default_axis",
)
X = np.random.randn(2, 3, 4, 5).astype(np.float32)
def case(axis: int) -> None:
normalized_shape = calculate_normalized_shape(X.shape, axis)
W = np.random.randn(*normalized_shape).astype(np.float32)
Y = _rms_normalization(X, W, axis=axis)
node = onnx.helper.make_node(
"RMSNormalization",
inputs=["X", "W"],
outputs=["Y"],
axis=axis,
)
if axis < 0:
name = f"test_rms_normalization_4d_axis_negative_{-axis}"
else:
name = f"test_rms_normalization_4d_axis{axis}"
expect(node, inputs=[X, W], outputs=[Y], name=name)
for i in range(len(X.shape)):
case(i)
case(i - len(X.shape))
Computes a one-layer simple RNN. This operator is usually supported via some custom implementation such as CuDNN.
Notations:
* X - input tensor
* i - input gate
* t - time step (t-1 means previous time step)
* Wi - W parameter weight matrix for input gate
* Ri - R recurrence weight matrix for input gate
* Wbi - W parameter bias vector for input gate
* Rbi - R parameter bias vector for input gate
* WBi - W parameter weight matrix for backward input gate
* RBi - R recurrence weight matrix for backward input gate
* WBbi - WR bias vectors for backward input gate
* RBbi - RR bias vectors for backward input gate
* H - Hidden state
* num_directions - 2 if direction == bidirectional else 1

Activation functions:
NOTE: Below are optional
Equations (Default: f=Tanh):
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#RNN-1">1</a>, <a href="Changelog.md#RNN-7">7</a>, <a href="Changelog.md#RNN-14">14</a>
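Using the notation above, a forward-only simple RNN can be sketched in NumPy. The sketch assumes the standard recurrence Ht = f(Xt*(Wi^T) + Ht-1*(Ri^T) + Wbi + Rbi) with f = Tanh, the default layout (X shaped [seq_length, batch_size, input_size]), and a zero initial hidden state. The function name `simple_rnn_sketch` is hypothetical and unrelated to the `RNNHelper` class used in the examples.

```python
import numpy as np


def simple_rnn_sketch(X, W, R, B=None):
    """Illustrative forward simple RNN with tanh activation."""
    _, batch_size, _ = X.shape
    hidden_size = W.shape[1]
    Wi, Ri = W[0], R[0]  # drop the num_directions axis (1 for forward)
    if B is None:
        Wbi = np.zeros(hidden_size, dtype=X.dtype)
        Rbi = np.zeros(hidden_size, dtype=X.dtype)
    else:
        # B is the concatenation [Wbi, Rbi] along the last axis.
        Wbi, Rbi = B[0, :hidden_size], B[0, hidden_size:]

    H = np.zeros((batch_size, hidden_size), dtype=X.dtype)
    outputs = []
    for Xt in X:  # iterate over time steps
        H = np.tanh(Xt @ Wi.T + H @ Ri.T + Wbi + Rbi)
        outputs.append(H)

    Y = np.stack(outputs)[:, np.newaxis]  # [seq_length, 1, batch_size, hidden_size]
    Y_h = H[np.newaxis]                   # [1, batch_size, hidden_size]
    return Y, Y_h


X = np.array([[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]], dtype=np.float32)
W = 0.1 * np.ones((1, 4, 2), dtype=np.float32)
R = 0.1 * np.ones((1, 4, 4), dtype=np.float32)
Y, Y_h = simple_rnn_sketch(X, W, R)
```

With a single time step and a zero initial state, each hidden unit is simply `tanh(0.1 * sum(Xt))` for its batch row.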
input = np.array([[[1.0, 2.0]], [[3.0, 4.0]], [[5.0, 6.0]]]).astype(np.float32)
input_size = 2
hidden_size = 4
weight_scale = 0.5
layout = 1
node = onnx.helper.make_node(
"RNN",
inputs=["X", "W", "R"],
outputs=["Y", "Y_h"],
hidden_size=hidden_size,
layout=layout,
)
W = weight_scale * np.ones((1, hidden_size, input_size)).astype(np.float32)
R = weight_scale * np.ones((1, hidden_size, hidden_size)).astype(np.float32)
rnn = RNNHelper(X=input, W=W, R=R, layout=layout)
Y, Y_h = rnn.step()
expect(
node,
inputs=[input, W, R],
outputs=[Y.astype(np.float32), Y_h.astype(np.float32)],
name="test_simple_rnn_batchwise",
)
input = np.array([[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]).astype(np.float32)
input_size = 2
hidden_size = 4
weight_scale = 0.1
node = onnx.helper.make_node(
"RNN", inputs=["X", "W", "R"], outputs=["", "Y_h"], hidden_size=hidden_size
)
W = weight_scale * np.ones((1, hidden_size, input_size)).astype(np.float32)
R = weight_scale * np.ones((1, hidden_size, hidden_size)).astype(np.float32)
rnn = RNNHelper(X=input, W=W, R=R)
_, Y_h = rnn.step()
expect(
node,
inputs=[input, W, R],
outputs=[Y_h.astype(np.float32)],
name="test_simple_rnn_defaults",
)
input = np.array([[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]]).astype(
np.float32
)
input_size = 3
hidden_size = 5
custom_bias = 0.1
weight_scale = 0.1
node = onnx.helper.make_node(
"RNN",
inputs=["X", "W", "R", "B"],
outputs=["", "Y_h"],
hidden_size=hidden_size,
)
W = weight_scale * np.ones((1, hidden_size, input_size)).astype(np.float32)
R = weight_scale * np.ones((1, hidden_size, hidden_size)).astype(np.float32)
# Adding custom bias
W_B = custom_bias * np.ones((1, hidden_size)).astype(np.float32)
R_B = np.zeros((1, hidden_size)).astype(np.float32)
B = np.concatenate((W_B, R_B), axis=1)
rnn = RNNHelper(X=input, W=W, R=R, B=B)
_, Y_h = rnn.step()
expect(
node,
inputs=[input, W, R, B],
outputs=[Y_h.astype(np.float32)],
name="test_simple_rnn_with_initial_bias",
)
input = np.array(
[
[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]],
[[10.0, 11.0, 12.0], [13.0, 14.0, 15.0], [16.0, 17.0, 18.0]],
]
).astype(np.float32)
input_size = 3
hidden_size = 5
node = onnx.helper.make_node(
"RNN",
inputs=["X", "W", "R", "B"],
outputs=["", "Y_h"],
hidden_size=hidden_size,
)
W = np.random.randn(1, hidden_size, input_size).astype(np.float32)
R = np.random.randn(1, hidden_size, hidden_size).astype(np.float32)
# Adding custom bias
W_B = np.random.randn(1, hidden_size).astype(np.float32)
R_B = np.random.randn(1, hidden_size).astype(np.float32)
B = np.concatenate((W_B, R_B), axis=1)
rnn = RNNHelper(X=input, W=W, R=R, B=B)
_, Y_h = rnn.step()
expect(
node,
inputs=[input, W, R, B],
outputs=[Y_h.astype(np.float32)],
name="test_rnn_seq_length",
)
Generate a tensor with random values drawn from a normal distribution. The shape
of the tensor is specified by the shape argument, and the parameters of the normal distribution
are specified by mean and scale.
The data type is specified by the 'dtype' argument. The 'dtype' argument must be one of the data types specified in the 'DataType' enum field in the TensorProto message.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#RandomNormal-1">1</a>
Generate a tensor with random values drawn from a normal distribution.
The shape of the output tensor is copied from the shape of the input tensor,
and the parameters of the normal distribution are specified by mean and scale.
The data type is specified by the 'dtype' argument, or copied from the input tensor if not provided. The 'dtype' argument must be one of the data types specified in the 'DataType' enum field in the TensorProto message, and be valid as an output type.
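The shape and type rules above can be sketched with NumPy. This is an illustration only: `random_normal_like_sketch` is a hypothetical helper, it takes a NumPy dtype rather than a TensorProto DataType enum value, and the defaults `mean=0.0` and `scale=1.0` are assumptions.

```python
import numpy as np


def random_normal_like_sketch(input_tensor, mean=0.0, scale=1.0, dtype=None, seed=None):
    """Illustrative only: sample N(mean, scale) with the input's shape."""
    rng = np.random.default_rng(seed)
    # Fall back to the input's element type when dtype is not provided.
    out_dtype = input_tensor.dtype if dtype is None else np.dtype(dtype)
    sample = rng.normal(loc=mean, scale=scale, size=input_tensor.shape)
    return sample.astype(out_dtype)


template = np.zeros((2, 3), dtype=np.float32)
same_type = random_normal_like_sketch(template, seed=0)
as_double = random_normal_like_sketch(template, mean=5.0, scale=0.5, dtype=np.float64, seed=0)
```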
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#RandomNormalLike-1">1</a>
Generate a tensor with random values drawn from a uniform distribution. The shape
of the tensor is specified by the shape argument and the range by low and high.
The data type is specified by the 'dtype' argument. The 'dtype' argument must be one of the data types specified in the 'DataType' enum field in the TensorProto message.
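A comparable NumPy sketch for the uniform case is shown below. The helper name `random_uniform_sketch` is hypothetical, the defaults `low=0.0` and `high=1.0` are assumptions, and NumPy samples from the half-open interval [low, high), which may differ from a given backend's convention.

```python
import numpy as np


def random_uniform_sketch(shape, low=0.0, high=1.0, dtype=np.float32, seed=None):
    """Illustrative only: sample uniformly between low and high."""
    rng = np.random.default_rng(seed)
    return rng.uniform(low, high, size=tuple(shape)).astype(dtype)


sample = random_uniform_sketch([2, 3], low=-1.0, high=1.0, seed=0)
```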
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#RandomUniform-1">1</a>
Generate a tensor with random values drawn from a uniform distribution.
The shape of the output tensor is copied from the shape of the input tensor,
and the parameters of the uniform distribution are specified by low and high.
The data type is specified by the 'dtype' argument, or copied from the input tensor if not provided. The 'dtype' argument must be one of the data types specified in the 'DataType' enum field in the TensorProto message and be valid as an output type.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#RandomUniformLike-1">1</a>
Generate a tensor containing a sequence of numbers that begins at start and extends by increments of delta
up to limit (exclusive).
The number of elements in the output of range is computed as below:
number_of_elements = max( ceil( (limit - start) / delta ) , 0 )
The pseudocode determining the contents of the output is shown below:
for(int i=0; i<number_of_elements; ++i) {
output[i] = start + (i * delta);
}
Example 1
Inputs: start = 3, limit = 9, delta = 3
Output: [3, 6]
Example 2
Inputs: start = 10, limit = 4, delta = -2
Output: [10, 8, 6]
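The element-count formula and the pseudocode above translate directly to Python. The function name `range_sketch` is hypothetical; it returns a plain list rather than a tensor.

```python
import math


def range_sketch(start, limit, delta):
    """Illustrative only: mirrors the element-count formula and loop above."""
    number_of_elements = max(math.ceil((limit - start) / delta), 0)
    return [start + i * delta for i in range(number_of_elements)]


example_1 = range_sketch(3, 9, 3)     # limit is exclusive, so 9 is not included
example_2 = range_sketch(10, 4, -2)   # negative delta counts down
empty = range_sketch(5, 1, 2)         # wrong direction gives zero elements
```

The `max(..., 0)` clamp is what makes the last call return an empty sequence instead of failing on a negative count.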
This version of the operator has been available since version 11 of the default ONNX operator set.
node = onnx.helper.make_node(
"Range",
inputs=["start", "limit", "delta"],
outputs=["output"],
)
start = np.float32(1)
limit = np.float32(5)
delta = np.float32(2)
output = np.arange(
start, limit, delta, dtype=np.float32
) # expected output [1.0, 3.0]
expect(
node,
inputs=[start, limit, delta],
outputs=[output],
name="test_range_float_type_positive_delta",
)
node = onnx.helper.make_node(
"Range",
inputs=["start", "limit", "delta"],
outputs=["output"],
)
start = np.int32(10)
limit = np.int32(6)
delta = np.int32(-3)
output = np.arange(
start, limit, delta, dtype=np.int32
) # expected output [10, 7]
expect(
node,
inputs=[start, limit, delta],
outputs=[output],
name="test_range_int32_type_negative_delta",
)
Reciprocal takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the reciprocal, y = 1/x, is applied to the tensor elementwise.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Reciprocal-1">1</a>, <a href="Changelog.md#Reciprocal-6">6</a>
node = onnx.helper.make_node(
"Reciprocal",
inputs=["x"],
outputs=["y"],
)
x = np.array([-4, 2]).astype(np.float32)
y = np.reciprocal(x) # expected output [-0.25, 0.5]
expect(node, inputs=[x], outputs=[y], name="test_reciprocal_example")
x = np.random.rand(3, 4, 5).astype(np.float32) + 0.5
y = np.reciprocal(x)
expect(node, inputs=[x], outputs=[y], name="test_reciprocal")
Computes the L1 norm of the input tensor's elements along the provided axes. The resulting
tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
valid. Reduction over an empty set of values yields 0.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims
to False instead of True.
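The effect of keepdims and the empty-set rule can be seen in a small NumPy sketch. The helper name `reduce_l1_sketch` is hypothetical, and treating an empty `axes` as "reduce over all axes" is an assumption matching the default-axes examples in this section.

```python
import numpy as np


def reduce_l1_sketch(data, axes=None, keepdims=1):
    """Illustrative only: sum of absolute values along the given axes."""
    # Assumption: missing or empty axes means reducing over every axis.
    axis = None if axes is None or len(axes) == 0 else tuple(int(a) for a in axes)
    return np.sum(np.abs(data), axis=axis, keepdims=keepdims == 1)


data = np.arange(1, 13, dtype=np.float32).reshape(3, 2, 2)
kept = reduce_l1_sketch(data, axes=[2], keepdims=1)    # rank preserved: shape (3, 2, 1)
pruned = reduce_l1_sketch(data, axes=[2], keepdims=0)  # reduced axis removed: shape (3, 2)
empty = reduce_l1_sketch(np.zeros((2, 0, 4), dtype=np.float32), axes=[1])
```

Reducing along an axis of length 0 sums no values, so every entry of `empty` is 0.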
This version of the operator has been available since version 18 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#ReduceL1-1">1</a>, <a href="Changelog.md#ReduceL1-11">11</a>, <a href="Changelog.md#ReduceL1-13">13</a>
shape = [3, 2, 2]
axes = np.array([], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceL1", inputs=["data", "axes"], outputs=["reduced"], keepdims=keepdims
)
data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[[1., 2.], [3., 4.]], [[5., 6.], [7., 8.]], [[9., 10.], [11., 12.]]]
reduced = np.sum(a=np.abs(data), axis=None, keepdims=keepdims == 1)
# print(reduced)
# [[[78.]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_l1_default_axes_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(a=np.abs(data), axis=None, keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_l1_default_axes_keepdims_random",
)
shape = [3, 2, 2]
axes = np.array([2], dtype=np.int64)
keepdims = 0
node = onnx.helper.make_node(
"ReduceL1",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[[1., 2.], [3., 4.]], [[5., 6.], [7., 8.]], [[9., 10.], [11., 12.]]]
reduced = np.sum(a=np.abs(data), axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[3., 7.], [11., 15.], [19., 23.]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_l1_do_not_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(a=np.abs(data), axis=tuple(axes), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_l1_do_not_keepdims_random",
)
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 1, 4]
node = onnx.helper.make_node(
"ReduceL1",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([1], dtype=np.int64)
reduced = np.array(np.zeros(reduced_shape, dtype=np.float32))
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_l1_empty_set",
)
shape = [3, 2, 2]
axes = np.array([2], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceL1",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[[1., 2.], [3., 4.]], [[5., 6.], [7., 8.]], [[9., 10.], [11., 12.]]]
reduced = np.sum(a=np.abs(data), axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[3.], [7.]], [[11.], [15.]], [[19.], [23.]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_l1_keep_dims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(a=np.abs(data), axis=tuple(axes), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_l1_keep_dims_random",
)
shape = [3, 2, 2]
axes = np.array([-1], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceL1",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[[1., 2.], [3., 4.]], [[5., 6.], [7., 8.]], [[9., 10.], [11., 12.]]]
reduced = np.sum(a=np.abs(data), axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[3.], [7.]], [[11.], [15.]], [[19.], [23.]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_l1_negative_axes_keep_dims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(a=np.abs(data), axis=tuple(axes), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_l1_negative_axes_keep_dims_random",
)
Computes the L2 norm of the input tensor's elements along the provided axes. The resulting
tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
valid. Reduction over an empty set of values yields 0.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims
to False instead of True.
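As a rough illustration, the L2 reduction is the square root of the sum of squares along the chosen axes. The helper name `reduce_l2_sketch` is hypothetical; the comparison with `np.linalg.norm` is only a cross-check for the single-axis case.

```python
import numpy as np


def reduce_l2_sketch(data, axes, keepdims=1):
    """Illustrative only: sqrt of the sum of squares along the given axes."""
    axis = tuple(int(a) for a in axes)
    return np.sqrt(np.sum(np.square(data), axis=axis, keepdims=keepdims == 1))


data = np.arange(1, 13, dtype=np.float32).reshape(3, 2, 2)
pruned = reduce_l2_sketch(data, axes=[2], keepdims=0)
reference = np.linalg.norm(data, axis=2)  # vector 2-norm along the last axis
empty = reduce_l2_sketch(np.zeros((2, 0, 4), dtype=np.float32), axes=[1])
```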
This version of the operator has been available since version 18 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#ReduceL2-1">1</a>, <a href="Changelog.md#ReduceL2-11">11</a>, <a href="Changelog.md#ReduceL2-13">13</a>
shape = [3, 2, 2]
axes = np.array([], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceL2", inputs=["data", "axes"], outputs=["reduced"], keepdims=keepdims
)
data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[[1., 2.], [3., 4.]], [[5., 6.], [7., 8.]], [[9., 10.], [11., 12.]]]
reduced = np.sqrt(np.sum(a=np.square(data), axis=None, keepdims=keepdims == 1))
# print(reduced)
# [[[25.49509757]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_l2_default_axes_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sqrt(np.sum(a=np.square(data), axis=None, keepdims=keepdims == 1))
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_l2_default_axes_keepdims_random",
)
shape = [3, 2, 2]
axes = np.array([2], dtype=np.int64)
keepdims = 0
node = onnx.helper.make_node(
"ReduceL2",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[[1., 2.], [3., 4.]], [[5., 6.], [7., 8.]], [[9., 10.], [11., 12.]]]
reduced = np.sqrt(
np.sum(a=np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
)
# print(reduced)
# [[2.23606798, 5.],
# [7.81024968, 10.63014581],
# [13.45362405, 16.2788206]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_l2_do_not_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sqrt(
np.sum(a=np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_l2_do_not_keepdims_random",
)
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 1, 4]
node = onnx.helper.make_node(
"ReduceL2",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([1], dtype=np.int64)
reduced = np.array(np.zeros(reduced_shape, dtype=np.float32))
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_l2_empty_set",
)
shape = [3, 2, 2]
axes = np.array([2], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceL2",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[[1., 2.], [3., 4.]], [[5., 6.], [7., 8.]], [[9., 10.], [11., 12.]]]
reduced = np.sqrt(
np.sum(a=np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
)
# print(reduced)
# [[[2.23606798], [5.]]
# [[7.81024968], [10.63014581]]
# [[13.45362405], [16.2788206 ]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_l2_keep_dims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sqrt(
np.sum(a=np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_l2_keep_dims_random",
)
shape = [3, 2, 2]
axes = np.array([-1], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceL2",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[[1., 2.], [3., 4.]], [[5., 6.], [7., 8.]], [[9., 10.], [11., 12.]]]
reduced = np.sqrt(
np.sum(a=np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
)
# print(reduced)
# [[[2.23606798], [5.]]
# [[7.81024968], [10.63014581]]
# [[13.45362405], [16.2788206 ]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_l2_negative_axes_keep_dims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sqrt(
np.sum(a=np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_l2_negative_axes_keep_dims_random",
)
Computes the log sum of the input tensor's elements along the provided axes. The resulting
tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
valid. Reduction over an empty set of values yields minus infinity (if supported by the datatype) or undefined otherwise.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims
to False instead of True.
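A minimal NumPy sketch of this reduction is shown below, including the empty-set case. The helper name `reduce_log_sum_sketch` is hypothetical, and treating an empty `axes` as "reduce over all axes" is an assumption.

```python
import numpy as np


def reduce_log_sum_sketch(data, axes=None, keepdims=1):
    """Illustrative only: log of the sum along the given axes."""
    axis = None if axes is None or len(axes) == 0 else tuple(int(a) for a in axes)
    total = np.sum(data, axis=axis, keepdims=keepdims == 1)
    # log(0) is -inf; silence NumPy's divide-by-zero warning for that case.
    with np.errstate(divide="ignore"):
        return np.log(total)


ones = np.ones((3, 4, 5), dtype=np.float32)
reduced = reduce_log_sum_sketch(ones, axes=[2, 1], keepdims=0)  # log(20) per row
empty = reduce_log_sum_sketch(np.zeros((2, 0, 4), dtype=np.float32), axes=[1])
```

The empty sum is 0, and its logarithm is minus infinity for floating-point types.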
This version of the operator has been available since version 18 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#ReduceLogSum-1">1</a>, <a href="Changelog.md#ReduceLogSum-11">11</a>, <a href="Changelog.md#ReduceLogSum-13">13</a>
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 1, 4]
node = onnx.helper.make_node(
"ReduceLogSum",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([1], dtype=np.int64)
zero = np.array(np.zeros(reduced_shape, dtype=np.float32))
reduced = np.log(zero) # -inf
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_log_sum_empty_set",
)
node = onnx.helper.make_node(
"ReduceLogSum", inputs=["data", "axes"], outputs=["reduced"]
)
data = np.random.ranf([3, 4, 5]).astype(np.float32)
reduced = np.log(np.sum(data, keepdims=True))
axes = np.array([], dtype=np.int64)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_log_sum_default",
)
axes = np.array([-2], dtype=np.int64)
node = onnx.helper.make_node(
"ReduceLogSum", inputs=["data", "axes"], outputs=["reduced"]
)
data = np.random.ranf([3, 4, 5]).astype(np.float32)
reduced = np.log(np.sum(data, axis=tuple(axes), keepdims=True))
# print(reduced)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_log_sum_negative_axes",
)
shape = [3, 4, 5]
axes = np.array([2, 1], dtype=np.int64)
node = onnx.helper.make_node(
"ReduceLogSum",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=0,
)
data = np.random.ranf(shape).astype(np.float32)
reduced = np.log(np.sum(data, axis=tuple(axes), keepdims=False))
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_log_sum_desc_axes",
)
axes = np.array([0, 1], dtype=np.int64)
node = onnx.helper.make_node(
"ReduceLogSum",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=0,
)
data = np.random.ranf(shape).astype(np.float32)
reduced = np.log(np.sum(data, axis=tuple(axes), keepdims=False))
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_log_sum_asc_axes",
)
Computes the log sum exponent of the input tensor's elements along the provided axes. The resulting
tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
valid. Reduction over an empty set of values yields minus infinity (if supported by the datatype) or undefined otherwise.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims
to False instead of True.
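The examples in this section compute log(sum(exp(x))) directly. The sketch below uses the max-shift form instead, a common numerically safer variant that gives the same result for moderate inputs. It is not taken from the operator definition; `reduce_log_sum_exp_sketch` is a hypothetical name, and the sketch does not handle empty reductions.

```python
import numpy as np


def reduce_log_sum_exp_sketch(data, axes, keepdims=1):
    """Illustrative only: log-sum-exp with the maximum subtracted first."""
    axis = tuple(int(a) for a in axes)
    m = np.max(data, axis=axis, keepdims=True)
    # exp(data - m) is at most 1, so the sum cannot overflow.
    out = m + np.log(np.sum(np.exp(data - m), axis=axis, keepdims=True))
    return out if keepdims == 1 else np.squeeze(out, axis=axis)


data = np.array(
    [[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]], dtype=np.double
)
stable = reduce_log_sum_exp_sketch(data, axes=[1], keepdims=0)
naive = np.log(np.sum(np.exp(data), axis=1))
large = reduce_log_sum_exp_sketch(np.array([[1000.0, 1000.0]]), axes=[1], keepdims=0)
```

For the `large` input the direct form would overflow in `exp(1000)`, while the shifted form evaluates to `1000 + log(2)`.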
This version of the operator has been available since version 18 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#ReduceLogSumExp-1">1</a>, <a href="Changelog.md#ReduceLogSumExp-11">11</a>, <a href="Changelog.md#ReduceLogSumExp-13">13</a>
shape = [3, 2, 2]
axes = np.array([], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceLogSumExp",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]], dtype=np.double
)
reduced = np.log(np.sum(np.exp(data), axis=None, keepdims=keepdims == 1))
# print(reduced)
# [[[60.00671387]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_log_sum_exp_default_axes_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.double)
reduced = np.log(np.sum(np.exp(data), axis=None, keepdims=keepdims == 1))
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_log_sum_exp_default_axes_keepdims_random",
)
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 0
node = onnx.helper.make_node(
"ReduceLogSumExp",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]], dtype=np.double
)
reduced = np.log(np.sum(np.exp(data), axis=tuple(axes), keepdims=keepdims == 1))
# print(reduced)
# [[20., 2.31326175]
# [40.00004578, 2.31326175]
# [60.00671387, 2.31326175]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_log_sum_exp_do_not_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.double)
reduced = np.log(np.sum(np.exp(data), axis=tuple(axes), keepdims=keepdims == 1))
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_log_sum_exp_do_not_keepdims_random",
)
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 1, 4]
node = onnx.helper.make_node(
"ReduceLogSumExp",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([1], dtype=np.int64)
zero = np.array(np.zeros(reduced_shape, dtype=np.float32))
reduced = np.log(zero) # -inf
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_log_sum_exp_empty_set",
)
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceLogSumExp",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]], dtype=np.double
)
reduced = np.log(np.sum(np.exp(data), axis=tuple(axes), keepdims=keepdims == 1))
# print(reduced)
# [[[20., 2.31326175]]
# [[40.00004578, 2.31326175]]
# [[60.00671387, 2.31326175]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_log_sum_exp_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.double)
reduced = np.log(np.sum(np.exp(data), axis=tuple(axes), keepdims=keepdims == 1))
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_log_sum_exp_keepdims_random",
)
shape = [3, 2, 2]
axes = np.array([-2], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceLogSumExp",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]], dtype=np.double
)
reduced = np.log(np.sum(np.exp(data), axis=tuple(axes), keepdims=keepdims == 1))
# print(reduced)
# [[[20., 2.31326175]]
# [[40.00004578, 2.31326175]]
# [[60.00671387, 2.31326175]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_log_sum_exp_negative_axes_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.double)
reduced = np.log(
np.sum(np.exp(data), axis=tuple(axes.tolist()), keepdims=keepdims == 1)
)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_log_sum_exp_negative_axes_keepdims_random",
)
Computes the max of the input tensor's elements along the provided axes. The resulting
tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
valid. Reduction over an empty set of values yields minus infinity (if supported by the datatype) or the minimum value of the data type otherwise.
If the input data type is Boolean, the comparison should consider False < True.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims
to False instead of True.
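The Boolean ordering and the empty-set rule can be sketched with NumPy by passing an explicit identity to the reduction. The helper name `reduce_max_sketch` is hypothetical, and using False as the identity for Boolean inputs is an assumption (it is the smallest value under False < True).

```python
import numpy as np


def reduce_max_sketch(data, axes, keepdims=1):
    """Illustrative only: max along the given axes with an explicit identity."""
    if np.issubdtype(data.dtype, np.floating):
        identity = -np.inf                      # minus infinity when the type supports it
    elif data.dtype == np.bool_:
        identity = False                        # assumption: False < True
    else:
        identity = np.iinfo(data.dtype).min     # minimum value of the integer type
    return np.maximum.reduce(
        data, axis=tuple(int(a) for a in axes), keepdims=keepdims == 1, initial=identity
    )


bools = np.array([[True, True], [True, False], [False, True], [False, False]])
bool_max = reduce_max_sketch(bools, axes=[1])
empty_float = reduce_max_sketch(np.zeros((2, 0, 4), dtype=np.float32), axes=[1])
empty_int = reduce_max_sketch(np.zeros((0,), dtype=np.int32), axes=[0])
```

Without the `initial` argument NumPy refuses to take the maximum of an empty set, so the identity is what makes the empty-set rule explicit.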
This version of the operator has been available since version 20 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#ReduceMax-1">1</a>, <a href="Changelog.md#ReduceMax-11">11</a>, <a href="Changelog.md#ReduceMax-12">12</a>, <a href="Changelog.md#ReduceMax-13">13</a>, <a href="Changelog.md#ReduceMax-18">18</a>
axes = np.array([1], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceMax",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[True, True], [True, False], [False, True], [False, False]],
)
reduced = np.maximum.reduce(data, axis=tuple(axes), keepdims=bool(keepdims))
# print(reduced)
# [[True],
# [True],
# [True],
# [False]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_max_bool_inputs",
)
shape = [3, 2, 2]
axes = None
keepdims = 1
node = onnx.helper.make_node(
"ReduceMax", inputs=["data"], outputs=["reduced"], keepdims=keepdims
)
data = np.array(
[[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
dtype=np.float32,
)
reduced = np.maximum.reduce(data, axis=axes, keepdims=keepdims == 1)
expect(
node,
inputs=[data],
outputs=[reduced],
name="test_reduce_max_default_axes_keepdim_example",
opset_imports=[onnx.helper.make_opsetid("", 18)],
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.maximum.reduce(data, axis=axes, keepdims=keepdims == 1)
expect(
node,
inputs=[data],
outputs=[reduced],
name="test_reduce_max_default_axes_keepdims_random",
opset_imports=[onnx.helper.make_opsetid("", 18)],
)
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 0
node = onnx.helper.make_node(
"ReduceMax",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
dtype=np.float32,
)
reduced = np.maximum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[20., 2.]
# [40., 2.]
# [60., 2.]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_max_do_not_keepdims_example",
opset_imports=[onnx.helper.make_opsetid("", 18)],
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.maximum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_max_do_not_keepdims_random",
opset_imports=[onnx.helper.make_opsetid("", 18)],
)
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 1, 4]
node = onnx.helper.make_node(
"ReduceMax",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([1], dtype=np.int64)
one = np.array(np.ones(reduced_shape, dtype=np.float32))
zero = np.array(np.zeros(reduced_shape, dtype=np.float32))
reduced = -(one / zero) # -inf
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_max_empty_set",
)
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceMax",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
dtype=np.float32,
)
reduced = np.maximum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[20., 2.]]
# [[40., 2.]]
# [[60., 2.]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_max_keepdims_example",
opset_imports=[onnx.helper.make_opsetid("", 18)],
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.maximum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_max_keepdims_random",
opset_imports=[onnx.helper.make_opsetid("", 18)],
)
shape = [3, 2, 2]
axes = np.array([-2], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceMax",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
dtype=np.float32,
)
reduced = np.maximum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[20., 2.]]
# [[40., 2.]]
# [[60., 2.]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_max_negative_axes_keepdims_example",
opset_imports=[onnx.helper.make_opsetid("", 18)],
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.maximum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_max_negative_axes_keepdims_random",
opset_imports=[onnx.helper.make_opsetid("", 18)],
)
Computes the mean of the input tensor's elements along the provided axes. The resulting
tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
valid. The result of a reduction over an empty set of values is undefined.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims
to False instead of True.
This version of the operator has been available since version 18 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#ReduceMean-1">1</a>, <a href="Changelog.md#ReduceMean-11">11</a>, <a href="Changelog.md#ReduceMean-13">13</a>
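The difference in the keepdims default mentioned above can be seen directly in NumPy. This is only an illustrative sketch using plain `np.mean`; the variable names are made up for the example.

```python
import numpy as np

data = np.array(
    [[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
    dtype=np.float32,
)

# NumPy drops the reduced axis unless keepdims=True is passed explicitly.
numpy_default = np.mean(data, axis=1)
# ONNX defaults to keepdims=1, which corresponds to keepdims=True in NumPy.
onnx_default = np.mean(data, axis=1, keepdims=True)

print(numpy_default.shape)  # (3, 2)
print(onnx_default.shape)  # (3, 1, 2)
```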
shape = [3, 2, 2]
axes = np.array([], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceMean",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
dtype=np.float32,
)
reduced = np.mean(data, axis=None, keepdims=keepdims == 1)
# print(reduced)
# [[[18.25]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_mean_default_axes_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.mean(data, axis=None, keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_mean_default_axes_keepdims_random",
)
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 0
node = onnx.helper.make_node(
"ReduceMean",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
dtype=np.float32,
)
reduced = np.mean(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[12.5, 1.5]
# [35., 1.5]
# [57.5, 1.5]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_mean_do_not_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.mean(data, axis=tuple(axes), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_mean_do_not_keepdims_random",
)
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceMean",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
dtype=np.float32,
)
reduced = np.mean(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[12.5, 1.5]]
# [[35., 1.5]]
# [[57.5, 1.5]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_mean_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.mean(data, axis=tuple(axes), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_mean_keepdims_random",
)
shape = [3, 2, 2]
axes = np.array([-2], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceMean",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
dtype=np.float32,
)
reduced = np.mean(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[12.5, 1.5]]
# [[35., 1.5]]
# [[57.5, 1.5]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_mean_negative_axes_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.mean(data, axis=tuple(axes), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_mean_negative_axes_keepdims_random",
)
Computes the min of the input tensor's elements along the provided axes. The resulting
tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
valid. Reduction over an empty set of values yields plus infinity (if supported by the datatype) or the maximum value of the data type otherwise.
If the input data type is Boolean, the comparison should consider False < True.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims
to False instead of True.
This version of the operator has been available since version 20 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#ReduceMin-1">1</a>, <a href="Changelog.md#ReduceMin-11">11</a>, <a href="Changelog.md#ReduceMin-12">12</a>, <a href="Changelog.md#ReduceMin-13">13</a>, <a href="Changelog.md#ReduceMin-18">18</a>
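For Boolean inputs, the False < True ordering means a min-reduction behaves like a logical AND over the reduced axis, and a max-reduction like a logical OR. A small NumPy sketch of that equivalence (the comparison against `np.all` and `np.any` is an illustration, not a statement about how any backend implements the operator):

```python
import numpy as np

data = np.array([[True, True], [True, False], [False, True], [False, False]])

reduced_min = np.minimum.reduce(data, axis=1, keepdims=True)
reduced_max = np.maximum.reduce(data, axis=1, keepdims=True)

# Min over an axis is True only if every entry is True;
# max over an axis is True if at least one entry is True.
assert np.array_equal(reduced_min, np.all(data, axis=1, keepdims=True))
assert np.array_equal(reduced_max, np.any(data, axis=1, keepdims=True))
```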
axes = np.array([1], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceMin",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[True, True], [True, False], [False, True], [False, False]],
)
reduced = np.minimum.reduce(data, axis=tuple(axes), keepdims=bool(keepdims))
# print(reduced)
# [[ True],
# [False],
# [False],
# [False]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_min_bool_inputs",
)
shape = [3, 2, 2]
axes = None
keepdims = 1
node = onnx.helper.make_node(
"ReduceMin", inputs=["data"], outputs=["reduced"], keepdims=keepdims
)
data = np.array(
[[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
dtype=np.float32,
)
reduced = np.minimum.reduce(data, axis=axes, keepdims=keepdims == 1)
# print(reduced)
# [[[1.]]]
expect(
node,
inputs=[data],
outputs=[reduced],
name="test_reduce_min_default_axes_keepdims_example",
opset_imports=[onnx.helper.make_opsetid("", 18)],
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.minimum.reduce(data, axis=axes, keepdims=keepdims == 1)
expect(
node,
inputs=[data],
outputs=[reduced],
name="test_reduce_min_default_axes_keepdims_random",
opset_imports=[onnx.helper.make_opsetid("", 18)],
)
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 0
node = onnx.helper.make_node(
"ReduceMin",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
dtype=np.float32,
)
reduced = np.minimum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[5., 1.]
# [30., 1.]
# [55., 1.]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_min_do_not_keepdims_example",
opset_imports=[onnx.helper.make_opsetid("", 18)],
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.minimum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_min_do_not_keepdims_random",
opset_imports=[onnx.helper.make_opsetid("", 18)],
)
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 1, 4]
node = onnx.helper.make_node(
"ReduceMin",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([1], dtype=np.int64)
one = np.array(np.ones(reduced_shape, dtype=np.float32))
zero = np.array(np.zeros(reduced_shape, dtype=np.float32))
reduced = one / zero # inf
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_min_empty_set",
)
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceMin",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
dtype=np.float32,
)
reduced = np.minimum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[5., 1.]]
# [[30., 1.]]
# [[55., 1.]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_min_keepdims_example",
opset_imports=[onnx.helper.make_opsetid("", 18)],
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.minimum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_min_keepdims_random",
opset_imports=[onnx.helper.make_opsetid("", 18)],
)
shape = [3, 2, 2]
axes = np.array([-2], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceMin",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
dtype=np.float32,
)
reduced = np.minimum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[5., 1.]]
# [[30., 1.]]
# [[55., 1.]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_min_negative_axes_keepdims_example",
opset_imports=[onnx.helper.make_opsetid("", 18)],
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.minimum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_min_negative_axes_keepdims_random",
opset_imports=[onnx.helper.make_opsetid("", 18)],
)
Computes the product of the input tensor's elements along the provided axes. The resulting
tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
valid. Reduction over an empty set of values yields 1.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims
to False instead of True.
This version of the operator has been available since version 18 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#ReduceProd-1">1</a>, <a href="Changelog.md#ReduceProd-11">11</a>, <a href="Changelog.md#ReduceProd-13">13</a>
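The empty-set rule matches what `np.prod` does when the reduced axis has length zero, since the empty product is the multiplicative identity. A minimal sketch:

```python
import numpy as np

# Axis 1 has length zero, so each output element is a product of no values.
data = np.array([], dtype=np.float32).reshape([2, 0, 4])
reduced = np.prod(data, axis=1, keepdims=True)

print(reduced.shape)  # (2, 1, 4)
print(bool(np.all(reduced == 1)))  # True
```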
shape = [3, 2, 2]
axes = None
keepdims = 1
node = onnx.helper.make_node(
"ReduceProd", inputs=["data"], outputs=["reduced"], keepdims=keepdims
)
data = np.array(
[[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.prod(data, axis=axes, keepdims=keepdims == 1)
# print(reduced)
# [[[4.790016e+08]]]
expect(
node,
inputs=[data],
outputs=[reduced],
name="test_reduce_prod_default_axes_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.prod(data, axis=axes, keepdims=keepdims == 1)
expect(
node,
inputs=[data],
outputs=[reduced],
name="test_reduce_prod_default_axes_keepdims_random",
)
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 0
node = onnx.helper.make_node(
"ReduceProd",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.prod(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[3., 8.]
# [35., 48.]
# [99., 120.]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_prod_do_not_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.prod(data, axis=tuple(axes), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_prod_do_not_keepdims_random",
)
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 1, 4]
node = onnx.helper.make_node(
"ReduceProd",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([1], dtype=np.int64)
reduced = np.array(np.ones(reduced_shape, dtype=np.float32))
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_prod_empty_set",
)
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceProd",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.prod(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[3., 8.]]
# [[35., 48.]]
# [[99., 120.]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_prod_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.prod(data, axis=tuple(axes), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_prod_keepdims_random",
)
shape = [3, 2, 2]
axes = np.array([-2], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceProd",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.prod(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[3., 8.]]
# [[35., 48.]]
# [[99., 120.]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_prod_negative_axes_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.prod(data, axis=tuple(axes), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_prod_negative_axes_keepdims_random",
)
Computes the sum of the input tensor's elements along the provided axes. The resulting
tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
valid. Reduction over an empty set of values yields 0.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims
to False instead of True.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#ReduceSum-1">1</a>, <a href="Changelog.md#ReduceSum-11">11</a>
shape = [3, 2, 2]
axes = np.array([], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceSum", inputs=["data", "axes"], outputs=["reduced"], keepdims=keepdims
)
data = np.array(
[[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.sum(data, axis=None, keepdims=keepdims == 1)
# print(reduced)
# [[[78.]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_default_axes_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(data, axis=None, keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_default_axes_keepdims_random",
)
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 0
node = onnx.helper.make_node(
"ReduceSum", inputs=["data", "axes"], outputs=["reduced"], keepdims=keepdims
)
data = np.array(
[[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.sum(data, axis=tuple(axes.tolist()), keepdims=keepdims == 1)
# print(reduced)
# [[4., 6.]
# [12., 14.]
# [20., 22.]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_do_not_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(data, axis=tuple(axes.tolist()), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_do_not_keepdims_random",
)
shape = [3, 2, 2]
keepdims = 1
node = onnx.helper.make_node(
"ReduceSum",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
noop_with_empty_axes=True,
)
data = np.array(
[[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
axes = np.array([], dtype=np.int64)
reduced = np.array(data)
# print(reduced)
# [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_empty_axes_input_noop_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.array(data)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_empty_axes_input_noop",
)
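The default-axes and no-op examples above differ only in how an empty `axes` input is treated. The following sketch puts both behaviors in one function; `reduce_sum_sketch` is a hypothetical name, not the reference implementation used by the ONNX test suite.

```python
import numpy as np


def reduce_sum_sketch(data, axes=None, keepdims=1, noop_with_empty_axes=0):
    """Hypothetical sketch of how ReduceSum treats its axes input."""
    axes_empty = axes is None or len(axes) == 0
    if axes_empty and noop_with_empty_axes:
        # Empty axes with the attribute set: return the input unchanged.
        return np.array(data)
    if axes_empty:
        # Empty axes without the attribute: reduce over every axis.
        return np.sum(data, axis=None, keepdims=keepdims == 1)
    return np.sum(data, axis=tuple(int(a) for a in axes), keepdims=keepdims == 1)


data = np.array(
    [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
empty_axes = np.array([], dtype=np.int64)

reduced_all = reduce_sum_sketch(data, empty_axes)
unchanged = reduce_sum_sketch(data, empty_axes, noop_with_empty_axes=1)
```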
"""Test case with the reduced-axis of size zero."""
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 1, 4]
node = onnx.helper.make_node(
"ReduceSum",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([1], dtype=np.int64)
reduced = np.array(np.zeros(reduced_shape, dtype=np.float32))
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_empty_set",
)
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceSum", inputs=["data", "axes"], outputs=["reduced"], keepdims=keepdims
)
data = np.array(
[[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.sum(data, axis=tuple(axes.tolist()), keepdims=keepdims == 1)
# print(reduced)
# [[[4., 6.]]
# [[12., 14.]]
# [[20., 22.]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(data, axis=tuple(axes.tolist()), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_keepdims_random",
)
shape = [3, 2, 2]
axes = np.array([-2], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceSum", inputs=["data", "axes"], outputs=["reduced"], keepdims=keepdims
)
data = np.array(
[[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.sum(data, axis=tuple(axes.tolist()), keepdims=keepdims == 1)
# print(reduced)
# [[[4., 6.]]
# [[12., 14.]]
# [[20., 22.]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_negative_axes_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(data, axis=tuple(axes.tolist()), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_negative_axes_keepdims_random",
)
"""Test case with the non-reduced-axis of size zero."""
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 0, 1]
node = onnx.helper.make_node(
"ReduceSum",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([2], dtype=np.int64)
reduced = np.array([], dtype=np.float32).reshape(reduced_shape)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_empty_set_non_reduced_axis_zero",
)
Computes the sum of squares of the input tensor's elements along the provided axes. The resulting
tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
valid. Reduction over an empty set of values yields 0.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims
to False instead of True.
This version of the operator has been available since version 18 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#ReduceSumSquare-1">1</a>, <a href="Changelog.md#ReduceSumSquare-11">11</a>, <a href="Changelog.md#ReduceSumSquare-13">13</a>
shape = [3, 2, 2]
axes = np.array([], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceSumSquare",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.sum(np.square(data), axis=None, keepdims=keepdims == 1)
# print(reduced)
# [[[650.]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_square_default_axes_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(np.square(data), axis=None, keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_square_default_axes_keepdims_random",
)
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 0
node = onnx.helper.make_node(
"ReduceSumSquare",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.sum(np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[10., 20.]
# [74., 100.]
# [202., 244.]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_square_do_not_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_square_do_not_keepdims_random",
)
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 1, 4]
node = onnx.helper.make_node(
"ReduceSumSquare",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([1], dtype=np.int64)
reduced = np.array(np.zeros(reduced_shape, dtype=np.float32))
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_square_empty_set",
)
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceSumSquare",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.sum(np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[10., 20.]]
# [[74., 100.]]
# [[202., 244.]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_square_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_square_keepdims_random",
)
shape = [3, 2, 2]
axes = np.array([-2], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
"ReduceSumSquare",
inputs=["data", "axes"],
outputs=["reduced"],
keepdims=keepdims,
)
data = np.array(
[[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.sum(np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[10., 20.]]
# [[74., 100.]]
# [[202., 244.]]]
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_square_negative_axes_keepdims_example",
)
np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
expect(
node,
inputs=[data, axes],
outputs=[reduced],
name="test_reduce_sum_square_negative_axes_keepdims_random",
)
RegexFullMatch performs a full regex match on each element of the input tensor. If an element fully matches the regex pattern specified as an attribute, the corresponding element in the output is True and it is False otherwise. RE2 regex syntax is used.
This version of the operator has been available since version 20 of the default ONNX operator set.
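The element-wise full-match semantics can be approximated with Python's built-in `re` module. This is a sketch under a stated assumption: `re` is used as a stand-in for RE2, and the two syntaxes are not identical, so patterns that rely on engine-specific features may behave differently. The name `regex_full_match_sketch` is hypothetical.

```python
import re

import numpy as np


def regex_full_match_sketch(x, pattern):
    """Hypothetical sketch: True where the whole string matches the pattern."""
    compiled = re.compile(pattern)  # assumption: Python `re` instead of RE2
    flat = [compiled.fullmatch(s) is not None for s in x.ravel()]
    return np.array(flat, dtype=bool).reshape(x.shape)


x = np.array(["www.google.com", "www.facebook.com", "www.bbc.co.uk"]).astype(object)
result = regex_full_match_sketch(x, r"www\.[\w.-]+\.\bcom\b")
```

A partial match is not enough: the last input contains text the pattern could match as a prefix-free substring search in other APIs, but it does not end in `.com`, so the full match fails.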
node = onnx.helper.make_node(
"RegexFullMatch",
inputs=["X"],
outputs=["Y"],
pattern=r"www\.[\w.-]+\.\bcom\b",
)
x = np.array(["www.google.com", "www.facebook.com", "www.bbc.co.uk"]).astype(
object
)
result = np.array([True, True, False])
expect(node, inputs=[x], outputs=[result], name="test_regex_full_match_basic")
node = onnx.helper.make_node(
"RegexFullMatch",
inputs=["X"],
outputs=["Y"],
pattern=r"(\W|^)[\w.\-]{0,25}@(yahoo|gmail)\.com(\W|$)",
)
x = np.array(
[
["account@gmail.com", "account@hotmail.com"],
["not email", "account@yahoo.com"],
]
).astype(object)
result = np.array([[True, False], [False, True]])
expect(
node,
inputs=[x],
outputs=[result],
name="test_regex_full_match_email_domain",
)
node = onnx.helper.make_node(
"RegexFullMatch",
inputs=["X"],
outputs=["Y"],
pattern=r"(\W|^)[\w.\-]{0,25}@(yahoo|gmail)\.com(\W|$)",
)
x = np.array([[], []]).astype(object)
result = np.array([[], []]).astype(bool)
expect(
node,
inputs=[x],
outputs=[result],
name="test_regex_full_match_empty",
)
Relu takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the rectified linear function, y = max(0, x), is applied to the tensor elementwise.
This version of the operator has been available since version 14 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Relu-1">1</a>, <a href="Changelog.md#Relu-6">6</a>, <a href="Changelog.md#Relu-13">13</a>
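The formula y = max(0, x) maps directly onto an element-wise NumPy maximum. A minimal sketch, with `relu_sketch` as a hypothetical name:

```python
import numpy as np


def relu_sketch(x):
    """Hypothetical sketch: element-wise max(0, x), preserving the input dtype."""
    return np.maximum(x, 0)


x = np.array([-2.0, 0.0, 0.5, 3.0], dtype=np.float32)
y = relu_sketch(x)
```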
node = onnx.helper.make_node(
"Relu",
inputs=["x"],
outputs=["y"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, 0, np.inf)
expect(node, inputs=[x], outputs=[y], name="test_relu")
Reshape the input tensor similar to numpy.reshape. First input is the data tensor, second input is a shape tensor which specifies the output shape. It outputs the reshaped tensor. At most one dimension of the new shape can be -1. In this case, the value is inferred from the size of the tensor and the remaining dimensions. A dimension could also be 0, in which case the actual dimension value is unchanged (i.e. taken from the input tensor). If 'allowzero' is set, and the new shape includes 0, the dimension will be set explicitly to zero (i.e. not taken from input tensor). Shape (second input) could be an empty shape, which means converting to a scalar. The input tensor's shape and the output tensor's shape are required to have the same number of elements.
If the attribute 'allowzero' is set, it is invalid for the specified shape to contain both a zero value and -1, as the value of the dimension corresponding to -1 cannot be determined uniquely.
This version of the operator has been available since version 25 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Reshape-1">1</a>, <a href="Changelog.md#Reshape-5">5</a>, <a href="Changelog.md#Reshape-13">13</a>, <a href="Changelog.md#Reshape-14">14</a>, <a href="Changelog.md#Reshape-19">19</a>, <a href="Changelog.md#Reshape-21">21</a>, <a href="Changelog.md#Reshape-23">23</a>, <a href="Changelog.md#Reshape-24">24</a>
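The rules for 0, -1, and 'allowzero' described above can be sketched as a small shape-resolution helper. `resolve_reshape_shape` is a hypothetical name introduced for illustration; it is not the `reshape_reference_implementation` used by the ONNX tests, and its error handling is an assumption.

```python
import math


def resolve_reshape_shape(input_shape, shape, allowzero=0):
    """Hypothetical helper: turn a Reshape 'shape' input into concrete dims."""
    dims = [int(d) for d in shape]
    if dims.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    if allowzero and 0 in dims and -1 in dims:
        raise ValueError("with allowzero set, 0 and -1 cannot be combined")
    if not allowzero:
        # A 0 copies the dimension at the same index of the input shape.
        dims = [input_shape[i] if d == 0 else d for i, d in enumerate(dims)]
    total = math.prod(input_shape)
    if -1 in dims:
        known = math.prod(d for d in dims if d != -1)
        if known == 0 or total % known != 0:
            raise ValueError("cannot infer the -1 dimension")
        dims[dims.index(-1)] = total // known
    if math.prod(dims) != total:
        raise ValueError("element counts of input and output differ")
    return tuple(dims)


print(resolve_reshape_shape((2, 3, 4), [2, 0, 1, -1]))  # (2, 3, 1, 4)
print(resolve_reshape_shape((0, 3, 4), [3, 4, 0], allowzero=1))  # (3, 4, 0)
```

An empty `shape` resolves to `()`, which corresponds to the scalar case mentioned in the description.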
original_shape = [0, 3, 4]
test_cases = {
"allowzero_reordered": np.array([3, 4, 0], dtype=np.int64),
}
data = np.random.random_sample(original_shape).astype(np.float32)
for test_name, shape in test_cases.items():
node = onnx.helper.make_node(
"Reshape",
inputs=["data", "shape"],
outputs=["reshaped"],
allowzero=1, # if allowzero=1, final shape = (3, 4, 0)
# if allowzero=0, final shape = (3, 4, 4)
)
reshaped = reshape_reference_implementation(data, shape, allowzero=1)
expect(
node,
inputs=[data, shape],
outputs=[reshaped],
name="test_reshape_" + test_name,
)
original_shape = [2, 3, 4]
test_cases = {
"reordered_all_dims": np.array([4, 2, 3], dtype=np.int64),
"reordered_last_dims": np.array([2, 4, 3], dtype=np.int64),
"reduced_dims": np.array([2, 12], dtype=np.int64),
"extended_dims": np.array([2, 3, 2, 2], dtype=np.int64),
"one_dim": np.array([24], dtype=np.int64),
"negative_dim": np.array([2, -1, 2], dtype=np.int64),
"negative_extended_dims": np.array([-1, 2, 3, 4], dtype=np.int64),
"zero_dim": np.array([2, 0, 4, 1], dtype=np.int64),
"zero_and_negative_dim": np.array([2, 0, 1, -1], dtype=np.int64),
}
data = np.random.random_sample(original_shape).astype(np.float32)
for test_name, shape in test_cases.items():
node = onnx.helper.make_node(
"Reshape",
inputs=["data", "shape"],
outputs=["reshaped"],
)
reshaped = reshape_reference_implementation(data, shape)
expect(
node,
inputs=[data, shape],
outputs=[reshaped],
name="test_reshape_" + test_name,
)
Resize the input tensor. In general, it calculates every value in the output tensor as a weighted average of the neighborhood (a.k.a. sampling locations) in the input tensor. If the input "sizes" is not specified, each dimension value of the output tensor is:
output_dimension = floor(input_dimension * (roi_end - roi_start) * scale)
This version of the operator has been available since version 19 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Resize-10">10</a>, <a href="Changelog.md#Resize-11">11</a>, <a href="Changelog.md#Resize-13">13</a>, <a href="Changelog.md#Resize-18">18</a>
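The output-dimension formula above can be written out as a one-line helper. `resize_output_dim` is a hypothetical name, and the defaults `roi_start=0.0` and `roi_end=1.0` are an assumption meaning that the region of interest spans the whole axis.

```python
import math


def resize_output_dim(input_dimension, scale, roi_start=0.0, roi_end=1.0):
    """Hypothetical helper: output size along one axis when 'sizes' is absent."""
    return math.floor(input_dimension * (roi_end - roi_start) * scale)


print(resize_output_dim(4, 2.0))  # 8
print(resize_output_dim(4, 0.6))  # 2
```

Because of the floor, a scale that does not divide evenly rounds the output size down, so the effective scale along that axis can be smaller than the requested one.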
The coordinate of each dimension is transformed individually. Let's describe a case using axis x as an example.
Denote x_resized as the coordinate of axis x in the resized tensor,
x_original as the coordinate of axis x in the original tensor,
length_original as the length of the original tensor in axis x,
length_resized as the length of the resized tensor in axis x,
scale = length_resized / length_original,
output_width as the target length on axis x, which can be fractional when it is calculated from a scale factor,
and output_width_int as the effective output width as an integer.
if coordinate_transformation_mode is "half_pixel",
x_original = (x_resized + 0.5) / scale - 0.5
if coordinate_transformation_mode is "half_pixel_symmetric",
adjustment = output_width_int / output_width
center = length_original / 2
offset = center * (1 - adjustment)
x_original = offset + (x_resized + 0.5) / scale - 0.5
if coordinate_transformation_mode is "pytorch_half_pixel",
x_original = length_resized > 1 ? (x_resized + 0.5) / scale - 0.5 : 0
if coordinate_transformation_mode is "align_corners",
x_original = x_resized * (length_original - 1) / (length_resized - 1)
if coordinate_transformation_mode is "asymmetric",
x_original = x_resized / scale
if coordinate_transformation_mode is "tf_crop_and_resize",
x_original = length_resized > 1 ? start_x * (length_original - 1) + x_resized * (end_x - start_x) * (length_original - 1) / (length_resized - 1) : 0.5 * (start_x + end_x) * (length_original - 1)
.</dd>
<dt><tt>cubic_coeff_a</tt> : float (default is -0.75)</dt> <dd>The coefficient 'a' used in cubic interpolation. Two common choices are -0.5 (in some cases of TensorFlow) and -0.75 (in PyTorch). Check out Equation (4) in https://ieeexplore.ieee.org/document/1163711 for the details. This attribute is valid only if mode is "cubic".</dd>
<dt><tt>exclude_outside</tt> : int (default is 0)</dt> <dd>If set to 1, the weights of sampling locations outside the tensor are set to 0 and the remaining weights are renormalized so that they sum to 1.0. The default value is 0.</dd>
<dt><tt>extrapolation_value</tt> : float (default is 0.0)</dt> <dd>When coordinate_transformation_mode is "tf_crop_and_resize" and x_original is outside the range [0, length_original - 1], this value is used as the corresponding output value. Default is 0.0f.</dd>
<dt><tt>keep_aspect_ratio_policy</tt> : string (default is stretch)</dt> <dd> This attribute describes how to interpret the `sizes` input with regard to keeping the original aspect ratio of the input, and it is not applicable when the `scales` input is used. Given a set of sizes associated with a subset of axes (explicitly provided or default), let d = axes[i], where i is the index of the provided sizes.
If keep_aspect_ratio_policy is "stretch", the original aspect ratio is disregarded, and the input is resized to the specified size:
out_size[d] = sizes[i]
If keep_aspect_ratio_policy is "not_larger", the sizes are adjusted so that no extent of the output is larger than the specified size, while keeping the original aspect ratio:
scale = Min(sizes[i] / in_size[d])
out_size[d] = round_int(scale * in_size[d])
If keep_aspect_ratio_policy is "not_smaller", the sizes are adjusted so that no extent of the output is smaller than the specified size, while keeping the original aspect ratio:
scale = Max(sizes[i] / in_size[d])
out_size[d] = round_int(scale * in_size[d])
For non-resizable axes (those not specified in axes), the output size will be equal to the input size.
Note: round_int stands for computing the nearest integer value, rounding halfway cases up.</dd>
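The three `keep_aspect_ratio_policy` rules above can be sketched as follows. This is a minimal illustration, not the reference implementation; the helper names `round_int` and `resolve_sizes` are hypothetical, and `round_int` is assumed to be `floor(x + 0.5)` in line with the note on rounding halfway cases up.

```python
import math


def round_int(x):
    # Nearest integer, rounding halfway cases up.
    return math.floor(x + 0.5)


def resolve_sizes(in_size, sizes, axes, policy="stretch"):
    """Return the full output shape for the given sizes, axes, and policy."""
    out_size = list(in_size)  # axes not listed keep their input size
    if policy == "stretch":
        for i, d in enumerate(axes):
            out_size[d] = int(sizes[i])
        return out_size
    ratios = [sizes[i] / in_size[d] for i, d in enumerate(axes)]
    if policy == "not_larger":
        scale = min(ratios)
    elif policy == "not_smaller":
        scale = max(ratios)
    else:
        raise ValueError(f"unknown policy: {policy}")
    for d in axes:
        out_size[d] = round_int(scale * in_size[d])
    return out_size


# A 2x4 image with sizes [1, 3] on axes [2, 3]:
print(resolve_sizes([1, 1, 2, 4], [1, 3], [2, 3], "not_larger"))   # spatial part 1x2
print(resolve_sizes([1, 1, 2, 4], [1, 3], [2, 3], "not_smaller"))  # spatial part 2x3
```

A single scale is applied to every listed axis, which is what preserves the aspect ratio; only "stretch" lets each axis take its requested size independently.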
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "scales"],
outputs=["Y"],
mode="cubic",
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
scales = np.array([1.0, 1.0, 0.8, 0.8], dtype=np.float32)
# [[[[ 1.47119141 2.78125 4.08251953]
# [ 6.71142578 8.02148438 9.32275391]
# [11.91650391 13.2265625 14.52783203]]]]
output = interpolate_nd(
data, lambda x, _: cubic_coeffs(x), scale_factors=scales
).astype(np.float32)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_resize_downsample_scales_cubic",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "scales"],
outputs=["Y"],
mode="cubic",
cubic_coeff_a=-0.5,
exclude_outside=True,
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
scales = np.array([1.0, 1.0, 0.8, 0.8], dtype=np.float32)
# [[[[ 1.36812675 2.6695014 4.0133367 ]
# [ 6.57362535 7.875 9.2188353 ]
# [11.94896657 13.25034122 14.59417652]]]]
output = interpolate_nd(
data,
lambda x, _: cubic_coeffs(x, A=-0.5),
scale_factors=scales,
exclude_outside=True,
).astype(np.float32)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_resize_downsample_scales_cubic_A_n0p5_exclude_outside",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "scales"],
outputs=["Y"],
mode="cubic",
coordinate_transformation_mode="align_corners",
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
scales = np.array([1.0, 1.0, 0.8, 0.8], dtype=np.float32)
# [[[[ 1. 2.39519159 3.79038317]
# [ 6.58076634 7.97595793 9.37114951]
# [12.16153268 13.55672427 14.95191585]]]]
output = interpolate_nd(
data,
lambda x, _: cubic_coeffs(x),
scale_factors=scales,
coordinate_transformation_mode="align_corners",
).astype(np.float32)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_resize_downsample_scales_cubic_align_corners",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "scales"],
outputs=["Y"],
mode="cubic",
antialias=1,
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
scales = np.array([1.0, 1.0, 0.6, 0.6], dtype=np.float32)
# [[[[ 2.5180721 4.2858863]
# [ 9.589329 11.357142 ]]]]
output = interpolate_nd(
data, cubic_coeffs_antialias, scale_factors=scales
).astype(np.float32)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_resize_downsample_scales_cubic_antialias",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "scales"],
outputs=["Y"],
mode="linear",
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
]
]
],
dtype=np.float32,
)
scales = np.array([1.0, 1.0, 0.6, 0.6], dtype=np.float32)
# [[[[2.6666665 4.3333331]]]]
output = interpolate_nd(
data, lambda x, _: linear_coeffs(x), scale_factors=scales
).astype(np.float32)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_resize_downsample_scales_linear",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "scales"],
outputs=["Y"],
mode="linear",
coordinate_transformation_mode="align_corners",
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
]
]
],
dtype=np.float32,
)
scales = np.array([1.0, 1.0, 0.6, 0.6], dtype=np.float32)
# [[[[1. 3.142857]]]]
output = interpolate_nd(
data,
lambda x, _: linear_coeffs(x),
scale_factors=scales,
coordinate_transformation_mode="align_corners",
).astype(np.float32)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_resize_downsample_scales_linear_align_corners",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "scales"],
outputs=["Y"],
mode="linear",
antialias=1,
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
scales = np.array([1.0, 1.0, 0.6, 0.6], dtype=np.float32)
# [[[[ 2.875 4.5 ]
# [ 9.375 11. ]]]]
output = interpolate_nd(
data, linear_coeffs_antialias, scale_factors=scales
).astype(np.float32)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_resize_downsample_scales_linear_antialias",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "scales"],
outputs=["Y"],
mode="linear",
coordinate_transformation_mode="half_pixel_symmetric",
)
data = np.array([[[[1, 2, 3, 4]]]], dtype=np.float32)
scales = np.array([1.0, 1.0, 1.0, 0.6], dtype=np.float32)
# [[[[1.6666667, 3.3333333]]]]
output = interpolate_nd(
data,
lambda x, _: linear_coeffs(x),
scale_factors=scales,
coordinate_transformation_mode="half_pixel_symmetric",
).astype(np.float32)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_resize_downsample_scales_linear_half_pixel_symmetric",
)
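The expected output of the `half_pixel_symmetric` example above can be reproduced by hand. The sketch below covers only three coordinate transformation modes and a clamped 1-D linear interpolation; the helper names `to_original` and `lerp` are hypothetical. It assumes that `scale` is the user-supplied scale factor and that `output_width` is `length_original * scale` before flooring.

```python
import math


def to_original(x_resized, mode, length_original, scale):
    """Map an output coordinate to an input coordinate for a few modes."""
    if mode == "half_pixel":
        return (x_resized + 0.5) / scale - 0.5
    if mode == "asymmetric":
        return x_resized / scale
    if mode == "half_pixel_symmetric":
        output_width = length_original * scale       # may be fractional
        output_width_int = math.floor(output_width)  # effective integer width
        adjustment = output_width_int / output_width
        center = length_original / 2
        offset = center * (1 - adjustment)
        return offset + (x_resized + 0.5) / scale - 0.5
    raise ValueError(f"mode not covered by this sketch: {mode}")


def lerp(row, x):
    """Linear interpolation in a 1-D list, clamping x to the valid range."""
    x = min(max(x, 0.0), len(row) - 1)
    i0 = math.floor(x)
    i1 = min(i0 + 1, len(row) - 1)
    t = x - i0
    return (1 - t) * row[i0] + t * row[i1]


row = [1.0, 2.0, 3.0, 4.0]
out = [lerp(row, to_original(x, "half_pixel_symmetric", 4, 0.6)) for x in range(2)]
print(out)  # close to [1.6666667, 3.3333333]
```

With a scale of 0.6 the fractional width is 2.4 but the effective width is 2, so the sampling grid is shifted by `offset` to keep it centered on the input.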
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "scales"],
outputs=["Y"],
mode="nearest",
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
]
]
],
dtype=np.float32,
)
scales = np.array([1.0, 1.0, 0.6, 0.6], dtype=np.float32)
# [[[[1. 3.]]]]
output = interpolate_nd(
data, lambda x, _: nearest_coeffs(x), scale_factors=scales
).astype(np.float32)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_resize_downsample_scales_nearest",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "", "sizes"],
outputs=["Y"],
mode="cubic",
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
sizes = np.array([1, 1, 3, 3], dtype=np.int64)
# [[[[ 1.63078704 3.00462963 4.37847222]
# [ 7.12615741 8.5 9.87384259]
# [12.62152778 13.99537037 15.36921296]]]]
output = interpolate_nd(
data, lambda x, _: cubic_coeffs(x), output_size=sizes
).astype(np.float32)
expect(
node,
inputs=[data, sizes],
outputs=[output],
name="test_resize_downsample_sizes_cubic",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "", "sizes"],
outputs=["Y"],
mode="cubic",
antialias=1,
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
sizes = np.array([1, 1, 3, 3], dtype=np.int64)
# [[[[ 1.7750092 3.1200073 4.4650054]
# [ 7.1550016 8.5 9.844998 ]
# [12.534994 13.8799925 15.224991 ]]]]
output = interpolate_nd(data, cubic_coeffs_antialias, output_size=sizes).astype(
np.float32
)
expect(
node,
inputs=[data, sizes],
outputs=[output],
name="test_resize_downsample_sizes_cubic_antialias",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "", "sizes"],
outputs=["Y"],
mode="linear",
antialias=1,
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
sizes = np.array([1, 1, 3, 3], dtype=np.int64)
# [[[[ 2.3636363 3.590909 4.818182 ]
# [ 7.2727275 8.5 9.727273 ]
# [12.181818 13.409091 14.636364 ]]]]
output = interpolate_nd(
data, linear_coeffs_antialias, output_size=sizes
).astype(np.float32)
expect(
node,
inputs=[data, sizes],
outputs=[output],
name="test_resize_downsample_sizes_linear_antialias",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "", "sizes"],
outputs=["Y"],
mode="linear",
coordinate_transformation_mode="pytorch_half_pixel",
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
sizes = np.array([1, 1, 3, 1], dtype=np.int64)
# [[[[ 1.6666666]
# [ 7. ]
# [12.333333 ]]]]
output = interpolate_nd(
data,
lambda x, _: linear_coeffs(x),
output_size=sizes,
coordinate_transformation_mode="pytorch_half_pixel",
).astype(np.float32)
expect(
node,
inputs=[data, sizes],
outputs=[output],
name="test_resize_downsample_sizes_linear_pytorch_half_pixel",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "", "sizes"],
outputs=["Y"],
mode="nearest",
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
]
]
],
dtype=np.float32,
)
sizes = np.array([1, 1, 1, 3], dtype=np.int64)
# [[[[1. 2. 4.]]]]
output = interpolate_nd(
data, lambda x, _: nearest_coeffs(x), output_size=sizes
).astype(np.float32)
expect(
node,
inputs=[data, sizes],
outputs=[output],
name="test_resize_downsample_sizes_nearest",
)
keep_aspect_ratio_policy = "not_larger"
axes = [2, 3]
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "", "sizes"],
outputs=["Y"],
mode="nearest",
axes=axes,
keep_aspect_ratio_policy=keep_aspect_ratio_policy,
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
]
]
],
dtype=np.float32,
)
sizes = np.array([1, 3], dtype=np.int64) # Results in 1x2
# [[[[1. 3.]]]]
output = interpolate_nd(
data,
lambda x, _: nearest_coeffs(x),
output_size=sizes,
axes=axes,
keep_aspect_ratio_policy=keep_aspect_ratio_policy,
).astype(np.float32)
expect(
node,
inputs=[data, sizes],
outputs=[output],
name="test_resize_downsample_sizes_nearest_not_larger",
)
keep_aspect_ratio_policy = "not_smaller"
axes = [2, 3]
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "", "sizes"],
outputs=["Y"],
mode="nearest",
axes=axes,
keep_aspect_ratio_policy=keep_aspect_ratio_policy,
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
]
]
],
dtype=np.float32,
)
sizes = np.array([1, 3], dtype=np.int64) # Results in 2x3
# [[[[1. 2. 4.]
# [5. 6. 8.]]]]
output = interpolate_nd(
data,
lambda x, _: nearest_coeffs(x),
output_size=sizes,
axes=axes,
keep_aspect_ratio_policy=keep_aspect_ratio_policy,
).astype(np.float32)
expect(
node,
inputs=[data, sizes],
outputs=[output],
name="test_resize_downsample_sizes_nearest_not_smaller",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "roi", "", "sizes"],
outputs=["Y"],
mode="linear",
coordinate_transformation_mode="tf_crop_and_resize",
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
# Note: for some rois, the result may differ from TF's because of floating-point inaccuracy
roi = np.array([0, 0, 0.4, 0.6, 1, 1, 0.6, 0.8], dtype=np.float32)
sizes = np.array([1, 1, 3, 3], dtype=np.int64)
# [[[[ 7.6000004 7.9 8.2 ]
# [ 8.8 9.1 9.400001 ]
# [10. 10.3 10.6 ]]]]
output = interpolate_nd(
data,
lambda x, _: linear_coeffs(x),
output_size=sizes,
roi=roi,
coordinate_transformation_mode="tf_crop_and_resize",
).astype(np.float32)
expect(
node,
inputs=[data, roi, sizes],
outputs=[output],
name="test_resize_tf_crop_and_resize",
)
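The expected values in the `tf_crop_and_resize` example above follow directly from the coordinate formula for that mode. The following sketch recomputes them with plain NumPy. The helper names `crop_coords` and `bilinear` are hypothetical, the roi is assumed to be laid out as all start values followed by all end values in normalized coordinates, and the sketch handles only coordinates that fall inside the input, so it performs no extrapolation.

```python
import numpy as np


def crop_coords(length_original, length_resized, start, end):
    """x_original for every x_resized under tf_crop_and_resize."""
    if length_resized > 1:
        x = np.arange(length_resized)
        return (
            start * (length_original - 1)
            + x * (end - start) * (length_original - 1) / (length_resized - 1)
        )
    return np.array([0.5 * (start + end) * (length_original - 1)])


def bilinear(img, ys, xs):
    """Bilinear sampling of a 2-D array at in-range fractional coordinates."""
    h, w = img.shape
    out = np.empty((len(ys), len(xs)))
    for r, y in enumerate(ys):
        y0 = int(np.floor(y))
        y1 = min(y0 + 1, h - 1)
        ty = y - y0
        for c, x in enumerate(xs):
            x0 = int(np.floor(x))
            x1 = min(x0 + 1, w - 1)
            tx = x - x0
            top = (1 - tx) * img[y0, x0] + tx * img[y0, x1]
            bottom = (1 - tx) * img[y1, x0] + tx * img[y1, x1]
            out[r, c] = (1 - ty) * top + ty * bottom
    return out


img = np.arange(1, 17, dtype=np.float64).reshape(4, 4)
ys = crop_coords(4, 3, 0.4, 0.6)  # roi for axis 2: start 0.4, end 0.6
xs = crop_coords(4, 3, 0.6, 0.8)  # roi for axis 3: start 0.6, end 0.8
print(bilinear(img, ys, xs))
```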
axes = [2, 3]
node = onnx.helper.make_node(
"Resize",
inputs=["X", "roi", "", "sizes"],
outputs=["Y"],
mode="linear",
coordinate_transformation_mode="tf_crop_and_resize",
axes=axes,
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
# Note: for some rois, the result may differ from TF's because of floating-point inaccuracy
roi = np.array([0.4, 0.6, 0.6, 0.8], dtype=np.float32)
sizes = np.array([3, 3], dtype=np.int64)
# [[[[ 7.6000004 7.9 8.2 ]
# [ 8.8 9.1 9.400001 ]
# [10. 10.3 10.6 ]]]]
output = interpolate_nd(
data,
lambda x, _: linear_coeffs(x),
output_size=sizes,
roi=roi,
axes=axes,
coordinate_transformation_mode="tf_crop_and_resize",
).astype(np.float32)
expect(
node,
inputs=[data, roi, sizes],
outputs=[output],
name="test_resize_tf_crop_and_resize_axes_2_3",
)
axes = [3, 2]
node = onnx.helper.make_node(
"Resize",
inputs=["X", "roi", "", "sizes"],
outputs=["Y"],
mode="linear",
coordinate_transformation_mode="tf_crop_and_resize",
axes=axes,
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
# Note: for some rois, the result may differ from TF's because of floating-point inaccuracy
roi = np.array([0.6, 0.4, 0.8, 0.6], dtype=np.float32)
sizes = np.array([3, 3], dtype=np.int64)
# [[[[ 7.6000004 7.9 8.2 ]
# [ 8.8 9.1 9.400001 ]
# [10. 10.3 10.6 ]]]]
output = interpolate_nd(
data,
lambda x, _: linear_coeffs(x),
output_size=sizes,
roi=roi,
axes=axes,
coordinate_transformation_mode="tf_crop_and_resize",
).astype(np.float32)
expect(
node,
inputs=[data, roi, sizes],
outputs=[output],
name="test_resize_tf_crop_and_resize_axes_3_2",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "roi", "", "sizes"],
outputs=["Y"],
mode="linear",
coordinate_transformation_mode="tf_crop_and_resize",
extrapolation_value=10.0,
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
# Note: for some rois, the result may differ from TF's because of floating-point inaccuracy
roi = np.array([0, 0, 0.4, 0.6, 1, 1, 1.2, 1.7], dtype=np.float32)
sizes = np.array([1, 1, 3, 3], dtype=np.int64)
# [[[[ 7.6000004 10. 10. ]
# [12.400001 10. 10. ]
# [10. 10. 10. ]]]]
output = interpolate_nd(
data,
lambda x, _: linear_coeffs(x),
output_size=sizes,
roi=roi,
coordinate_transformation_mode="tf_crop_and_resize",
extrapolation_value=10.0,
).astype(np.float32)
expect(
node,
inputs=[data, roi, sizes],
outputs=[output],
name="test_resize_tf_crop_and_resize_extrapolation_value",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "scales"],
outputs=["Y"],
mode="cubic",
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
scales = np.array([1.0, 1.0, 2.0, 2.0], dtype=np.float32)
# [[[[ 0.47265625 0.76953125 1.24609375 1.875 2.28125
# 2.91015625 3.38671875 3.68359375]
# [ 1.66015625 1.95703125 2.43359375 3.0625 3.46875
# 4.09765625 4.57421875 4.87109375]
# [ 3.56640625 3.86328125 4.33984375 4.96875 5.375
# 6.00390625 6.48046875 6.77734375]
# [ 6.08203125 6.37890625 6.85546875 7.484375 7.890625
# 8.51953125 8.99609375 9.29296875]
# [ 7.70703125 8.00390625 8.48046875 9.109375 9.515625
# 10.14453125 10.62109375 10.91796875]
# [10.22265625 10.51953125 10.99609375 11.625 12.03125
# 12.66015625 13.13671875 13.43359375]
# [12.12890625 12.42578125 12.90234375 13.53125 13.9375
# 14.56640625 15.04296875 15.33984375]
# [13.31640625 13.61328125 14.08984375 14.71875 15.125
# 15.75390625 16.23046875 16.52734375]]]]
output = interpolate_nd(
data, lambda x, _: cubic_coeffs(x), scale_factors=scales
).astype(np.float32)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_resize_upsample_scales_cubic",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "scales"],
outputs=["Y"],
mode="cubic",
cubic_coeff_a=-0.5,
exclude_outside=True,
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
scales = np.array([1.0, 1.0, 2.0, 2.0], dtype=np.float32)
# [[[[ 0.55882353 0.81494204 1.35698249 1.89705882 2.39705882
# 2.93713516 3.47917561 3.73529412]
# [ 1.58329755 1.83941606 2.38145651 2.92153285 3.42153285
# 3.96160918 4.50364964 4.75976814]
# [ 3.75145936 4.00757787 4.54961832 5.08969466 5.58969466
# 6.12977099 6.67181144 6.92792995]
# [ 5.91176471 6.16788321 6.70992366 7.25 7.75
# 8.29007634 8.83211679 9.08823529]
# [ 7.91176471 8.16788321 8.70992366 9.25 9.75
# 10.29007634 10.83211679 11.08823529]
# [10.07207005 10.32818856 10.87022901 11.41030534 11.91030534
# 12.45038168 12.99242213 13.24854064]
# [12.24023186 12.49635036 13.03839082 13.57846715 14.07846715
# 14.61854349 15.16058394 15.41670245]
# [13.26470588 13.52082439 14.06286484 14.60294118 15.10294118
# 15.64301751 16.18505796 16.44117647]]]]
output = interpolate_nd(
data,
lambda x, _: cubic_coeffs(x, A=-0.5),
scale_factors=scales,
exclude_outside=True,
).astype(np.float32)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_resize_upsample_scales_cubic_A_n0p5_exclude_outside",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "scales"],
outputs=["Y"],
mode="cubic",
coordinate_transformation_mode="align_corners",
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
scales = np.array([1.0, 1.0, 2.0, 2.0], dtype=np.float32)
# [[[[ 1. 1.34110787 1.80029155 2.32944606 2.67055394
# 3.19970845 3.65889213 4. ]
# [ 2.36443149 2.70553936 3.16472303 3.69387755 4.03498542
# 4.56413994 5.02332362 5.36443149]
# [ 4.20116618 4.54227405 5.00145773 5.53061224 5.87172012
# 6.40087464 6.86005831 7.20116618]
# [ 6.31778426 6.65889213 7.1180758 7.64723032 7.98833819
# 8.51749271 8.97667638 9.31778426]
# [ 7.68221574 8.02332362 8.48250729 9.01166181 9.35276968
# 9.8819242 10.34110787 10.68221574]
# [ 9.79883382 10.13994169 10.59912536 11.12827988 11.46938776
# 11.99854227 12.45772595 12.79883382]
# [11.63556851 11.97667638 12.43586006 12.96501458 13.30612245
# 13.83527697 14.29446064 14.63556851]
# [13. 13.34110787 13.80029155 14.32944606 14.67055394
# 15.19970845 15.65889213 16. ]]]]
output = interpolate_nd(
data,
lambda x, _: cubic_coeffs(x),
scale_factors=scales,
coordinate_transformation_mode="align_corners",
).astype(np.float32)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_resize_upsample_scales_cubic_align_corners",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "scales"],
outputs=["Y"],
mode="cubic",
coordinate_transformation_mode="asymmetric",
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
scales = np.array([1.0, 1.0, 2.0, 2.0], dtype=np.float32)
# [[[[ 1. 1.40625 2. 2.5 3. 3.59375 4.
# 4.09375]
# [ 2.625 3.03125 3.625 4.125 4.625 5.21875 5.625
# 5.71875]
# [ 5. 5.40625 6. 6.5 7. 7.59375 8.
# 8.09375]
# [ 7. 7.40625 8. 8.5 9. 9.59375 10.
# 10.09375]
# [ 9. 9.40625 10. 10.5 11. 11.59375 12.
# 12.09375]
# [11.375 11.78125 12.375 12.875 13.375 13.96875 14.375
# 14.46875]
# [13. 13.40625 14. 14.5 15. 15.59375 16.
# 16.09375]
# [13.375 13.78125 14.375 14.875 15.375 15.96875 16.375
# 16.46875]]]]
output = interpolate_nd(
data,
lambda x, _: cubic_coeffs(x, A=-0.75),
scale_factors=scales,
coordinate_transformation_mode="asymmetric",
).astype(np.float32)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_resize_upsample_scales_cubic_asymmetric",
)
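As a sanity check on the asymmetric cubic example above, the first output row can be recomputed with a small cubic convolution kernel. This is a sketch under stated assumptions rather than the reference `cubic_coeffs` helper: the names `cubic_weights` and `sample_row` are hypothetical, the kernel is the standard cubic convolution form with coefficient `a`, and samples outside the row are assumed to take the nearest edge value.

```python
import math


def cubic_weights(ratio, a=-0.75):
    """Weights for the 4 neighbours at offsets -1, 0, 1, 2 from floor(x_original)."""

    def near(t):  # kernel for distances 0 <= t <= 1
        return ((a + 2) * t - (a + 3)) * t * t + 1

    def far(t):  # kernel for distances 1 < t < 2
        return ((a * t - 5 * a) * t + 8 * a) * t - 4 * a

    return [far(ratio + 1), near(ratio), near(1 - ratio), far(2 - ratio)]


def sample_row(row, x_original, a=-0.75):
    base = math.floor(x_original)
    weights = cubic_weights(x_original - base, a)
    # Assumption: indices outside the row are clamped to the nearest edge.
    idx = [min(max(base + k, 0), len(row) - 1) for k in (-1, 0, 1, 2)]
    return sum(w * row[i] for w, i in zip(weights, idx))


row = [1.0, 2.0, 3.0, 4.0]
# "asymmetric" mode with scale 2: x_original = x_resized / 2
first_row = [sample_row(row, x / 2.0) for x in range(8)]
print(first_row)
```

The four weights always sum to 1, and at integer coordinates they reduce to `[0, 1, 0, 0]`, which is why every second output value equals an input value.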
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "scales"],
outputs=["Y"],
mode="linear",
)
data = np.array(
[
[
[
[1, 2],
[3, 4],
]
]
],
dtype=np.float32,
)
scales = np.array([1.0, 1.0, 2.0, 2.0], dtype=np.float32)
# [[[[1. 1.25 1.75 2. ]
# [1.5 1.75 2.25 2.5 ]
# [2.5 2.75 3.25 3.5 ]
# [3. 3.25 3.75 4. ]]]]
output = interpolate_nd(
data, lambda x, _: linear_coeffs(x), scale_factors=scales
).astype(np.float32)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_resize_upsample_scales_linear",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "scales"],
outputs=["Y"],
mode="linear",
coordinate_transformation_mode="align_corners",
)
data = np.array(
[
[
[
[1, 2],
[3, 4],
]
]
],
dtype=np.float32,
)
scales = np.array([1.0, 1.0, 2.0, 2.0], dtype=np.float32)
# [[[[1. 1.33333333 1.66666667 2. ]
# [1.66666667 2. 2.33333333 2.66666667]
# [2.33333333 2.66666667 3. 3.33333333]
# [3. 3.33333333 3.66666667 4. ]]]]
output = interpolate_nd(
data,
lambda x, _: linear_coeffs(x),
scale_factors=scales,
coordinate_transformation_mode="align_corners",
).astype(np.float32)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_resize_upsample_scales_linear_align_corners",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "scales"],
outputs=["Y"],
mode="linear",
coordinate_transformation_mode="half_pixel_symmetric",
)
data = np.array([[[[1, 2], [3, 4]]]], dtype=np.float32)
scales = np.array([1.0, 1.0, 2.3, 2.94], dtype=np.float32)
# [[[[1. , 1.15986395, 1.5 , 1.84013605, 2. ],
# [1.56521738, 1.72508133, 2.06521738, 2.40535343, 2.56521738],
# [2.43478262, 2.59464657, 2.93478262, 3.27491867, 3.43478262],
# [3. , 3.15986395, 3.5 , 3.84013605, 4. ]]]]
output = interpolate_nd(
data,
lambda x, _: linear_coeffs(x),
scale_factors=scales,
coordinate_transformation_mode="half_pixel_symmetric",
).astype(np.float32)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_resize_upsample_scales_linear_half_pixel_symmetric",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "scales"],
outputs=["Y"],
mode="nearest",
)
data = np.array(
[
[
[
[1, 2],
[3, 4],
]
]
],
dtype=np.float32,
)
scales = np.array([1.0, 1.0, 2.0, 3.0], dtype=np.float32)
# [[[[1. 1. 1. 2. 2. 2.]
# [1. 1. 1. 2. 2. 2.]
# [3. 3. 3. 4. 4. 4.]
# [3. 3. 3. 4. 4. 4.]]]]
output = interpolate_nd(
data, lambda x, _: nearest_coeffs(x), scale_factors=scales
).astype(np.float32)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_resize_upsample_scales_nearest",
)
axes = [2, 3]
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "scales"],
outputs=["Y"],
mode="nearest",
axes=axes,
)
data = np.array(
[
[
[
[1, 2],
[3, 4],
]
]
],
dtype=np.float32,
)
scales = np.array([2.0, 3.0], dtype=np.float32)
# [[[[1. 1. 1. 2. 2. 2.]
# [1. 1. 1. 2. 2. 2.]
# [3. 3. 3. 4. 4. 4.]
# [3. 3. 3. 4. 4. 4.]]]]
output = interpolate_nd(
data, lambda x, _: nearest_coeffs(x), scale_factors=scales, axes=axes
).astype(np.float32)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_resize_upsample_scales_nearest_axes_2_3",
)
axes = [3, 2]
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "scales"],
outputs=["Y"],
mode="nearest",
axes=axes,
)
data = np.array(
[
[
[
[1, 2],
[3, 4],
]
]
],
dtype=np.float32,
)
scales = np.array([3.0, 2.0], dtype=np.float32)
# [[[[1. 1. 1. 2. 2. 2.]
# [1. 1. 1. 2. 2. 2.]
# [3. 3. 3. 4. 4. 4.]
# [3. 3. 3. 4. 4. 4.]]]]
output = interpolate_nd(
data, lambda x, _: nearest_coeffs(x), scale_factors=scales, axes=axes
).astype(np.float32)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_resize_upsample_scales_nearest_axes_3_2",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "", "sizes"],
outputs=["Y"],
mode="cubic",
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
sizes = np.array([1, 1, 9, 10], dtype=np.int64)
# [[[[ 0.45507922 0.64057922 0.97157922 1.42257922 1.90732922
# 2.22332922 2.70807922 3.15907922 3.49007922 3.67557922]
# [ 1.39437963 1.57987963 1.91087963 2.36187963 2.84662963
# 3.16262963 3.64737963 4.09837963 4.42937963 4.61487963]
# [ 2.95130693 3.13680693 3.46780693 3.91880693 4.40355693
# 4.71955693 5.20430693 5.65530693 5.98630693 6.17180693]
# [ 5.20525069 5.39075069 5.72175069 6.17275069 6.65750069
# 6.97350069 7.45825069 7.90925069 8.24025069 8.42575069]
# [ 6.88975 7.07525 7.40625 7.85725 8.342
# 8.658 9.14275 9.59375 9.92475 10.11025 ]
# [ 8.57424931 8.75974931 9.09074931 9.54174931 10.02649931
# 10.34249931 10.82724931 11.27824931 11.60924931 11.79474931]
# [10.82819307 11.01369307 11.34469307 11.79569307 12.28044307
# 12.59644307 13.08119307 13.53219307 13.86319307 14.04869307]
# [12.38512037 12.57062037 12.90162037 13.35262037 13.83737037
# 14.15337037 14.63812037 15.08912037 15.42012037 15.60562037]
# [13.32442078 13.50992078 13.84092078 14.29192078 14.77667078
# 15.09267078 15.57742078 16.02842078 16.35942078 16.54492078]]]]
output = interpolate_nd(
data, lambda x, _: cubic_coeffs(x), output_size=sizes
).astype(np.float32)
expect(
node,
inputs=[data, sizes],
outputs=[output],
name="test_resize_upsample_sizes_cubic",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "", "sizes"],
outputs=["Y"],
mode="nearest",
)
data = np.array(
[
[
[
[1, 2],
[3, 4],
]
]
],
dtype=np.float32,
)
sizes = np.array([1, 1, 7, 8], dtype=np.int64)
# [[[[1. 1. 1. 1. 2. 2. 2. 2.]
# [1. 1. 1. 1. 2. 2. 2. 2.]
# [1. 1. 1. 1. 2. 2. 2. 2.]
# [1. 1. 1. 1. 2. 2. 2. 2.]
# [3. 3. 3. 3. 4. 4. 4. 4.]
# [3. 3. 3. 3. 4. 4. 4. 4.]
# [3. 3. 3. 3. 4. 4. 4. 4.]]]]
output = interpolate_nd(
data, lambda x, _: nearest_coeffs(x), output_size=sizes
).astype(np.float32)
expect(
node,
inputs=[data, sizes],
outputs=[output],
name="test_resize_upsample_sizes_nearest",
)
axes = [2, 3]
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "", "sizes"],
outputs=["Y"],
mode="nearest",
axes=axes,
)
data = np.array(
[
[
[
[1, 2],
[3, 4],
]
]
],
dtype=np.float32,
)
sizes = np.array([7, 8], dtype=np.int64)
# [[[[1. 1. 1. 1. 2. 2. 2. 2.]
# [1. 1. 1. 1. 2. 2. 2. 2.]
# [1. 1. 1. 1. 2. 2. 2. 2.]
# [1. 1. 1. 1. 2. 2. 2. 2.]
# [3. 3. 3. 3. 4. 4. 4. 4.]
# [3. 3. 3. 3. 4. 4. 4. 4.]
# [3. 3. 3. 3. 4. 4. 4. 4.]]]]
output = interpolate_nd(
data, lambda x, _: nearest_coeffs(x), output_size=sizes, axes=axes
).astype(np.float32)
expect(
node,
inputs=[data, sizes],
outputs=[output],
name="test_resize_upsample_sizes_nearest_axes_2_3",
)
axes = [3, 2]
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "", "sizes"],
outputs=["Y"],
mode="nearest",
axes=axes,
)
data = np.array(
[
[
[
[1, 2],
[3, 4],
]
]
],
dtype=np.float32,
)
sizes = np.array([8, 7], dtype=np.int64)
# [[[[1. 1. 1. 1. 2. 2. 2. 2.]
# [1. 1. 1. 1. 2. 2. 2. 2.]
# [1. 1. 1. 1. 2. 2. 2. 2.]
# [1. 1. 1. 1. 2. 2. 2. 2.]
# [3. 3. 3. 3. 4. 4. 4. 4.]
# [3. 3. 3. 3. 4. 4. 4. 4.]
# [3. 3. 3. 3. 4. 4. 4. 4.]]]]
output = interpolate_nd(
data, lambda x, _: nearest_coeffs(x), output_size=sizes, axes=axes
).astype(np.float32)
expect(
node,
inputs=[data, sizes],
outputs=[output],
name="test_resize_upsample_sizes_nearest_axes_3_2",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "", "sizes"],
outputs=["Y"],
mode="nearest",
coordinate_transformation_mode="half_pixel",
nearest_mode="ceil",
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
sizes = np.array([1, 1, 8, 8], dtype=np.int64)
# [[[[ 1. 2. 2. 3. 3. 4. 4. 4.]
# [ 5. 6. 6. 7. 7. 8. 8. 8.]
# [ 5. 6. 6. 7. 7. 8. 8. 8.]
# [ 9. 10. 10. 11. 11. 12. 12. 12.]
# [ 9. 10. 10. 11. 11. 12. 12. 12.]
# [13. 14. 14. 15. 15. 16. 16. 16.]
# [13. 14. 14. 15. 15. 16. 16. 16.]
# [13. 14. 14. 15. 15. 16. 16. 16.]]]]
output = interpolate_nd(
data, lambda x, _: nearest_coeffs(x, mode="ceil"), output_size=sizes
).astype(np.float32)
expect(
node,
inputs=[data, sizes],
outputs=[output],
name="test_resize_upsample_sizes_nearest_ceil_half_pixel",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "", "sizes"],
outputs=["Y"],
mode="nearest",
coordinate_transformation_mode="align_corners",
nearest_mode="floor",
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
sizes = np.array([1, 1, 8, 8], dtype=np.int64)
# [[[[ 1. 1. 1. 2. 2. 3. 3. 4.]
# [ 1. 1. 1. 2. 2. 3. 3. 4.]
# [ 1. 1. 1. 2. 2. 3. 3. 4.]
# [ 5. 5. 5. 6. 6. 7. 7. 8.]
# [ 5. 5. 5. 6. 6. 7. 7. 8.]
# [ 9. 9. 9. 10. 10. 11. 11. 12.]
# [ 9. 9. 9. 10. 10. 11. 11. 12.]
# [13. 13. 13. 14. 14. 15. 15. 16.]]]]
output = interpolate_nd(
data,
lambda x, _: nearest_coeffs(x, mode="floor"),
output_size=sizes,
coordinate_transformation_mode="align_corners",
).astype(np.float32)
expect(
node,
inputs=[data, sizes],
outputs=[output],
name="test_resize_upsample_sizes_nearest_floor_align_corners",
)
keep_aspect_ratio_policy = "not_larger"
axes = [2, 3]
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "", "sizes"],
outputs=["Y"],
mode="nearest",
axes=axes,
keep_aspect_ratio_policy=keep_aspect_ratio_policy,
)
data = np.array(
[
[
[
[1, 2],
[3, 4],
]
]
],
dtype=np.float32,
)
sizes = np.array([7, 8], dtype=np.int64) # Results in 7x7
# [[[[1. 1. 1. 1. 2. 2. 2.]
# [1. 1. 1. 1. 2. 2. 2.]
# [1. 1. 1. 1. 2. 2. 2.]
# [1. 1. 1. 1. 2. 2. 2.]
# [3. 3. 3. 3. 4. 4. 4.]
# [3. 3. 3. 3. 4. 4. 4.]
# [3. 3. 3. 3. 4. 4. 4.]]]]
output = interpolate_nd(
data,
lambda x, _: nearest_coeffs(x),
output_size=sizes,
axes=axes,
keep_aspect_ratio_policy=keep_aspect_ratio_policy,
).astype(np.float32)
expect(
node,
inputs=[data, sizes],
outputs=[output],
name="test_resize_upsample_sizes_nearest_not_larger",
)
keep_aspect_ratio_policy = "not_smaller"
axes = [2, 3]
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "", "sizes"],
outputs=["Y"],
mode="nearest",
axes=axes,
keep_aspect_ratio_policy=keep_aspect_ratio_policy,
)
data = np.array(
[
[
[
[1, 2],
[3, 4],
]
]
],
dtype=np.float32,
)
sizes = np.array([7, 8], dtype=np.int64) # Results in 8x8
# [[[[1. 1. 1. 1. 2. 2. 2. 2.]
# [1. 1. 1. 1. 2. 2. 2. 2.]
# [1. 1. 1. 1. 2. 2. 2. 2.]
# [1. 1. 1. 1. 2. 2. 2. 2.]
# [3. 3. 3. 3. 4. 4. 4. 4.]
# [3. 3. 3. 3. 4. 4. 4. 4.]
# [3. 3. 3. 3. 4. 4. 4. 4.]
# [3. 3. 3. 3. 4. 4. 4. 4.]]]]
output = interpolate_nd(
data,
lambda x, _: nearest_coeffs(x),
output_size=sizes,
axes=axes,
keep_aspect_ratio_policy=keep_aspect_ratio_policy,
).astype(np.float32)
expect(
node,
inputs=[data, sizes],
outputs=[output],
name="test_resize_upsample_sizes_nearest_not_smaller",
)
node = onnx.helper.make_node(
"Resize",
inputs=["X", "", "", "sizes"],
outputs=["Y"],
mode="nearest",
coordinate_transformation_mode="asymmetric",
nearest_mode="round_prefer_ceil",
)
data = np.array(
[
[
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
]
],
dtype=np.float32,
)
sizes = np.array([1, 1, 8, 8], dtype=np.int64)
# [[[[ 1. 2. 2. 3. 3. 4. 4. 4.]
# [ 5. 6. 6. 7. 7. 8. 8. 8.]
# [ 5. 6. 6. 7. 7. 8. 8. 8.]
# [ 9. 10. 10. 11. 11. 12. 12. 12.]
# [ 9. 10. 10. 11. 11. 12. 12. 12.]
# [13. 14. 14. 15. 15. 16. 16. 16.]
# [13. 14. 14. 15. 15. 16. 16. 16.]
# [13. 14. 14. 15. 15. 16. 16. 16.]]]]
output = interpolate_nd(
data,
lambda x, _: nearest_coeffs(x, mode="round_prefer_ceil"),
output_size=sizes,
coordinate_transformation_mode="asymmetric",
).astype(np.float32)
expect(
node,
inputs=[data, sizes],
outputs=[output],
name="test_resize_upsample_sizes_nearest_round_prefer_ceil_asymmetric",
)
Reverse batch of sequences having different lengths specified by sequence_lens.
For each slice i along the batch axis, the operator reverses the first sequence_lens[i] elements along the time axis and copies the elements whose index is beyond sequence_lens[i] to the output unchanged. The output slice i therefore contains the first sequence_lens[i] elements in reversed order, followed by the remaining elements with their original values.
Example 1:
input = [[0.0, 4.0, 8.0, 12.0],
[1.0, 5.0, 9.0, 13.0],
[2.0, 6.0, 10.0, 14.0],
[3.0, 7.0, 11.0, 15.0]]
sequence_lens = [4, 3, 2, 1]
time_axis = 0
batch_axis = 1
output = [[3.0, 6.0, 9.0, 12.0],
[2.0, 5.0, 8.0, 13.0],
[1.0, 4.0, 10.0, 14.0],
[0.0, 7.0, 11.0, 15.0]]
Example 2:
input = [[0.0, 1.0, 2.0, 3.0 ],
[4.0, 5.0, 6.0, 7.0 ],
[8.0, 9.0, 10.0, 11.0],
[12.0, 13.0, 14.0, 15.0]]
sequence_lens = [1, 2, 3, 4]
time_axis = 1
batch_axis = 0
output = [[0.0, 1.0, 2.0, 3.0 ],
[5.0, 4.0, 6.0, 7.0 ],
[10.0, 9.0, 8.0, 11.0],
[15.0, 14.0, 13.0, 12.0]]
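The behaviour described above can be sketched in NumPy. This is only an illustrative reference under the stated semantics, not the operator's actual implementation; the function name `reverse_sequence_sketch` is hypothetical.

```python
import numpy as np


def reverse_sequence_sketch(x, sequence_lens, time_axis=0, batch_axis=1):
    """Hypothetical NumPy reference for the ReverseSequence description."""
    y = x.copy()
    # Views laid out as [batch, time, ...]; writes to y_bt land in y.
    x_bt = np.moveaxis(x, (batch_axis, time_axis), (0, 1))
    y_bt = np.moveaxis(y, (batch_axis, time_axis), (0, 1))
    for i, n in enumerate(sequence_lens):
        # Reverse the first n time steps; later steps keep the copied values.
        y_bt[i, :n] = x_bt[i, :n][::-1]
    return y


# Example 2 from the description: time_axis = 1, batch_axis = 0.
x = np.arange(16, dtype=np.float32).reshape(4, 4)
y = reverse_sequence_sketch(x, [1, 2, 3, 4], time_axis=1, batch_axis=0)
```

Calling the same function with `x.T`, `sequence_lens = [4, 3, 2, 1]`, `time_axis=0`, and `batch_axis=1` reproduces Example 1.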
This version of the operator has been available since version 10 of the default ONNX operator set.
node = onnx.helper.make_node(
"ReverseSequence",
inputs=["x", "sequence_lens"],
outputs=["y"],
time_axis=1,
batch_axis=0,
)
x = np.array(
[
[0.0, 1.0, 2.0, 3.0],
[4.0, 5.0, 6.0, 7.0],
[8.0, 9.0, 10.0, 11.0],
[12.0, 13.0, 14.0, 15.0],
],
dtype=np.float32,
)
sequence_lens = np.array([1, 2, 3, 4], dtype=np.int64)
y = np.array(
[
[0.0, 1.0, 2.0, 3.0],
[5.0, 4.0, 6.0, 7.0],
[10.0, 9.0, 8.0, 11.0],
[15.0, 14.0, 13.0, 12.0],
],
dtype=np.float32,
)
expect(
node,
inputs=[x, sequence_lens],
outputs=[y],
name="test_reversesequence_batch",
)
node = onnx.helper.make_node(
"ReverseSequence",
inputs=["x", "sequence_lens"],
outputs=["y"],
time_axis=0,
batch_axis=1,
)
x = np.array(
[
[0.0, 4.0, 8.0, 12.0],
[1.0, 5.0, 9.0, 13.0],
[2.0, 6.0, 10.0, 14.0],
[3.0, 7.0, 11.0, 15.0],
],
dtype=np.float32,
)
sequence_lens = np.array([4, 3, 2, 1], dtype=np.int64)
y = np.array(
[
[3.0, 6.0, 9.0, 12.0],
[2.0, 5.0, 8.0, 13.0],
[1.0, 4.0, 10.0, 14.0],
[0.0, 7.0, 11.0, 15.0],
],
dtype=np.float32,
)
expect(
node,
inputs=[x, sequence_lens],
outputs=[y],
name="test_reversesequence_time",
)
Region of Interest (RoI) align operation described in the Mask R-CNN paper. RoiAlign consumes an input tensor X and regions of interest (rois) and applies pooling across each RoI; it produces a 4-D tensor of shape (num_rois, C, output_height, output_width).
RoiAlign is proposed to avoid misalignment by removing the quantization steps used when converting from the original image to the feature map and from the feature map to the RoI feature; in each RoI bin, the values at the sampled locations are computed directly through bilinear interpolation.
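To make the bilinear interpolation step concrete, here is a minimal sketch of sampling a single-channel feature map at a fractional location. The helper name `bilinear_sample` and the clamping of coordinates to the map boundary are assumptions of this sketch; the full operator additionally averages (or takes the maximum of) several such samples per bin.

```python
import numpy as np


def bilinear_sample(feature, y, x):
    """Sample a 2-D feature map at the fractional location (y, x)."""
    height, width = feature.shape
    # Clamp to the valid range (an assumption of this sketch).
    y = min(max(y, 0.0), height - 1.0)
    x = min(max(x, 0.0), width - 1.0)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    ly, lx = y - y0, x - x0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1 - lx) * feature[y0, x0] + lx * feature[y0, x1]
    bottom = (1 - lx) * feature[y1, x0] + lx * feature[y1, x1]
    return (1 - ly) * top + ly * bottom


feature = np.array([[0.0, 1.0], [2.0, 3.0]])
center = bilinear_sample(feature, 0.5, 0.5)  # mean of the four corners
```

Because no coordinate is rounded to an integer grid cell, small shifts of an RoI produce small changes in the sampled values.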
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#RoiAlign-10">10</a>, <a href="Changelog.md#RoiAlign-16">16</a>
node = onnx.helper.make_node(
"RoiAlign",
inputs=["X", "rois", "batch_indices"],
outputs=["Y"],
spatial_scale=1.0,
output_height=5,
output_width=5,
sampling_ratio=2,
coordinate_transformation_mode="output_half_pixel",
)
X, batch_indices, rois = get_roi_align_input_values()
# (num_rois, C, output_height, output_width)
Y = np.array(
[
[
[
[0.4664, 0.4466, 0.3405, 0.5688, 0.6068],
[0.3714, 0.4296, 0.3835, 0.5562, 0.3510],
[0.2768, 0.4883, 0.5222, 0.5528, 0.4171],
[0.4713, 0.4844, 0.6904, 0.4920, 0.8774],
[0.6239, 0.7125, 0.6289, 0.3355, 0.3495],
]
],
[
[
[0.3022, 0.4305, 0.4696, 0.3978, 0.5423],
[0.3656, 0.7050, 0.5165, 0.3172, 0.7015],
[0.2912, 0.5059, 0.6476, 0.6235, 0.8299],
[0.5916, 0.7389, 0.7048, 0.8372, 0.8893],
[0.6227, 0.6153, 0.7097, 0.6154, 0.4585],
]
],
[
[
[0.2384, 0.3379, 0.3717, 0.6100, 0.7601],
[0.3767, 0.3785, 0.7147, 0.9243, 0.9727],
[0.5749, 0.5826, 0.5709, 0.7619, 0.8770],
[0.5355, 0.2566, 0.2141, 0.2796, 0.3600],
[0.4365, 0.3504, 0.2887, 0.3661, 0.2349],
]
],
],
dtype=np.float32,
)
expect(
node,
inputs=[X, rois, batch_indices],
outputs=[Y],
name="test_roialign_aligned_false",
)
node = onnx.helper.make_node(
"RoiAlign",
inputs=["X", "rois", "batch_indices"],
outputs=["Y"],
spatial_scale=1.0,
output_height=5,
output_width=5,
sampling_ratio=2,
coordinate_transformation_mode="half_pixel",
)
X, batch_indices, rois = get_roi_align_input_values()
# (num_rois, C, output_height, output_width)
Y = np.array(
[
[
[
[0.5178, 0.3434, 0.3229, 0.4474, 0.6344],
[0.4031, 0.5366, 0.4428, 0.4861, 0.4023],
[0.2512, 0.4002, 0.5155, 0.6954, 0.3465],
[0.3350, 0.4601, 0.5881, 0.3439, 0.6849],
[0.4932, 0.7141, 0.8217, 0.4719, 0.4039],
]
],
[
[
[0.3070, 0.2187, 0.3337, 0.4880, 0.4870],
[0.1871, 0.4914, 0.5561, 0.4192, 0.3686],
[0.1433, 0.4608, 0.5971, 0.5310, 0.4982],
[0.2788, 0.4386, 0.6022, 0.7000, 0.7524],
[0.5774, 0.7024, 0.7251, 0.7338, 0.8163],
]
],
[
[
[0.2393, 0.4075, 0.3379, 0.2525, 0.4743],
[0.3671, 0.2702, 0.4105, 0.6419, 0.8308],
[0.5556, 0.4543, 0.5564, 0.7502, 0.9300],
[0.6626, 0.5617, 0.4813, 0.4954, 0.6663],
[0.6636, 0.3721, 0.2056, 0.1928, 0.2478],
]
],
],
dtype=np.float32,
)
expect(
node,
inputs=[X, rois, batch_indices],
outputs=[Y],
name="test_roialign_aligned_true",
)
X = np.array(
[
[
[
[
0.2764,
0.715,
0.1958,
0.3416,
0.4638,
0.0259,
0.2963,
0.6518,
0.4856,
0.725,
],
[
0.9637,
0.0895,
0.2919,
0.6753,
0.0234,
0.6132,
0.8085,
0.5324,
0.8992,
0.4467,
],
[
0.3265,
0.8479,
0.9698,
0.2471,
0.9336,
0.1878,
0.4766,
0.4308,
0.34,
0.2162,
],
[
0.0206,
0.172,
0.2155,
0.4394,
0.0653,
0.3406,
0.7724,
0.3921,
0.2541,
0.5799,
],
[
0.4062,
0.2194,
0.4473,
0.4687,
0.7109,
0.9327,
0.9815,
0.632,
0.1728,
0.6119,
],
[
0.3097,
0.1283,
0.4984,
0.5068,
0.4279,
0.0173,
0.4388,
0.043,
0.4671,
0.7119,
],
[
0.1011,
0.8477,
0.4726,
0.1777,
0.9923,
0.4042,
0.1869,
0.7795,
0.9946,
0.9689,
],
[
0.1366,
0.3671,
0.7011,
0.6234,
0.9867,
0.5585,
0.6985,
0.5609,
0.8788,
0.9928,
],
[
0.5697,
0.8511,
0.6711,
0.9406,
0.8751,
0.7496,
0.165,
0.1049,
0.1559,
0.2514,
],
[
0.7012,
0.4056,
0.7879,
0.3461,
0.0415,
0.2998,
0.5094,
0.3727,
0.5482,
0.0502,
],
]
]
],
dtype=np.float32,
)
rois = np.array(
[[0.0, 0.0, 9.0, 9.0], [0.0, 5.0, 4.0, 9.0], [5.0, 5.0, 9.0, 9.0]],
dtype=np.float32,
)
batch_indices = np.array([0, 0, 0], dtype=np.int64)
Y = np.array(
[
[
[
[0.3445228, 0.37310338, 0.37865096, 0.446696, 0.37991184],
[0.4133513, 0.5455125, 0.6651902, 0.55805874, 0.27110294],
[0.21223956, 0.40924096, 0.8417618, 0.792561, 0.37196714],
[0.46835402, 0.39741728, 0.8012819, 0.4969306, 0.5495158],
[0.3595896, 0.5196813, 0.5403741, 0.23814403, 0.19992709],
]
],
[
[
[0.30517197, 0.5086199, 0.3189761, 0.4054401, 0.47630402],
[0.50862, 0.8477, 0.37808004, 0.24936005, 0.79384017],
[0.17620805, 0.29368007, 0.44870415, 0.4987201, 0.63148826],
[0.51066005, 0.8511, 0.5368801, 0.9406, 0.70008016],
[0.4487681, 0.51066035, 0.5042561, 0.5643603, 0.42004836],
]
],
[
[
[0.21062402, 0.3510401, 0.37416005, 0.5967599, 0.46507207],
[0.32336006, 0.31180006, 0.6236001, 0.9946, 0.7751202],
[0.35744014, 0.5588001, 0.35897616, 0.7030401, 0.6353923],
[0.5996801, 0.27940005, 0.17948808, 0.35152006, 0.31769615],
[0.3598083, 0.40752012, 0.2385281, 0.43856013, 0.26313624],
]
],
],
dtype=np.float32,
)
node = onnx.helper.make_node(
"RoiAlign",
inputs=["X", "rois", "batch_indices"],
mode="max",
outputs=["Y"],
spatial_scale=1.0,
output_height=5,
output_width=5,
sampling_ratio=2,
coordinate_transformation_mode="output_half_pixel",
)
expect(
node,
inputs=[X, rois, batch_indices],
outputs=[Y],
name="test_roialign_mode_max",
)
RotaryEmbedding is the implementation of rotary positional embeddings (RoPE) based on the paper https://arxiv.org/pdf/2104.09864. The key advantage of RoPE is that it allows the model to understand both the absolute position of a token and the relative distances between tokens. This is achieved through a rotational mechanism where the extent of rotation is computed based on the token's absolute position (position_ids).
The rotational mechanism is defined by sine and cosine functions that represent the rotation angles. For each token in the sequence, its positional embedding is computed by rotating its embedding vector. This is done by splitting the embedding vector into two parts, either its first and second halves or its even- and odd-indexed elements (interleaved), and applying the rotation matrix to each pair of corresponding elements. The rotation matrix is parameterized by the token's position in the sequence. The rotated parts are then recombined in their original layout to form the final positional embedding for each token. The rotated positional embeddings are used in the self-attention mechanism. The rotation ensures that the model captures both absolute and relative positional information.
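The relative-position property can be checked numerically with a small sketch. The operator itself receives cos_cache and sin_cache as inputs and does not prescribe how they are built; the helper `rope_caches`, the frequency base of 10000 (the value used in the cited paper), and the helper `rotate_halves` are assumptions made here for illustration.

```python
import numpy as np


def rope_caches(num_positions, rotary_dim, base=10000.0):
    """Build cos/sin tables of shape [num_positions, rotary_dim / 2]."""
    inv_freq = 1.0 / base ** (np.arange(0, rotary_dim, 2) / rotary_dim)
    angles = np.outer(np.arange(num_positions), inv_freq)
    return np.cos(angles), np.sin(angles)


def rotate_halves(x, cos, sin):
    """Rotate a vector using the non-interleaved (split in halves) layout."""
    x1, x2 = np.split(x, 2, axis=-1)
    return np.concatenate((cos * x1 - sin * x2, sin * x1 + cos * x2), axis=-1)


rng = np.random.default_rng(0)
q = rng.standard_normal(8)
k = rng.standard_normal(8)
cos, sin = rope_caches(num_positions=64, rotary_dim=8)

# Two (query, key) position pairs that share the same relative offset of 3.
score_near = rotate_halves(q, cos[5], sin[5]) @ rotate_halves(k, cos[2], sin[2])
score_far = rotate_halves(q, cos[13], sin[13]) @ rotate_halves(k, cos[10], sin[10])
```

Both scores agree up to floating-point error, which illustrates that the attention score between rotated vectors depends on the distance between the two positions rather than on their absolute values.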
Rotary embeddings are defined using the following algorithm:
def rotary_embedding(
    input: np.ndarray,
    cos_cache: np.ndarray,
    sin_cache: np.ndarray,
    position_ids: np.ndarray | None = None,
    interleaved=None,
    rotary_embedding_dim=None,
    num_heads=None,
) -> np.ndarray:
    original_input_shape = input.shape
    # First ensure input to be processed has shape [batch_size, seq_len, num_heads, head_size]
    if len(input.shape) == 4:
        input = np.transpose(input, (0, 2, 1, 3))
    batch_size = input.shape[0]
    sequence_length = input.shape[1]
    if len(input.shape) == 3:
        hidden_size = input.shape[2]
        assert num_heads != 0
        head_size = int(hidden_size / num_heads)
        new_shape = [batch_size, sequence_length, num_heads, head_size]
        input = np.reshape(input, new_shape)
    assert len(input.shape) == 4
    head_size = input.shape[3]
    # Fully or partially perform rotation on input based on rotary_embedding_dim attribute
    if rotary_embedding_dim is None or rotary_embedding_dim == 0:
        # If rotary_embedding_dim not provided, perform full rotation by using head_size
        rotary_embedding_dim = head_size
    x_rotate = input[:, :, :, :rotary_embedding_dim]
    x_not_rotate = input[:, :, :, rotary_embedding_dim:]
    rotary_embedding_dim_half = int(rotary_embedding_dim / 2)
    # Retrieve sin and cos caches using position ids
    if position_ids is not None:
        cos_cache = cos_cache[
            position_ids
        ]  # Shape: [batch_size, sequence_length, rotary_embedding_dim/2]
        sin_cache = sin_cache[
            position_ids
        ]  # Shape: [batch_size, sequence_length, rotary_embedding_dim/2]
    # Shape: [batch_size, sequence_length, rotary_embedding_dim/2]
    if cos_cache.shape[-1] != rotary_embedding_dim_half:
        raise ValueError(
            f"Last dimension of cos cache ({cos_cache.shape[-1]}) does not match rotary_embedding_dim/2 ({rotary_embedding_dim_half})."
        )
    if sin_cache.shape[-1] != rotary_embedding_dim_half:
        raise ValueError(
            f"Last dimension of sin cache ({sin_cache.shape[-1]}) does not match rotary_embedding_dim/2 ({rotary_embedding_dim_half})."
        )
    cos_cache = np.expand_dims(
        cos_cache, axis=2
    )  # Shape: [batch_size, sequence_length, 1, rotary_embedding_dim/2]
    sin_cache = np.expand_dims(
        sin_cache, axis=2
    )  # Shape: [batch_size, sequence_length, 1, rotary_embedding_dim/2]
    # Either divide the input in halves or interleave (based on interleaved attribute)
    if interleaved:
        x1 = x_rotate[:, :, :, 0::2]
        x2 = x_rotate[:, :, :, 1::2]
    else:
        x1, x2 = np.split(x_rotate, 2, axis=-1)
    # Calculate real and imaginary values
    real = (cos_cache * x1) - (sin_cache * x2)
    imag = (sin_cache * x1) + (cos_cache * x2)
    # Insert rotated embeddings back into the original input
    if interleaved:
        # x_rotate[:, :, :, 0::2] = real
        # x_rotate[:, :, :, 1::2] = imag
        real = np.expand_dims(real, axis=-1)
        imag = np.expand_dims(imag, axis=-1)
        x_rotate_concat = np.concatenate((real, imag), axis=-1)
        x_rotate = np.reshape(x_rotate_concat, x_rotate.shape)
    else:
        x_rotate = np.concatenate((real, imag), axis=-1)
    output = np.concatenate((x_rotate, x_not_rotate), axis=-1)
    if len(original_input_shape) == 3:
        output = np.reshape(output, original_input_shape)
    else:
        output = np.transpose(output, (0, 2, 1, 3))
    return output
This version of the operator has been available since version 23 of the default ONNX operator set.
node = onnx.helper.make_node(
"RotaryEmbedding",
inputs=["input", "cos_cache", "sin_cache", "position_ids"],
outputs=["output"],
)
input_data = np.random.rand(2, 4, 3, 8).astype(np.float32)
position_ids_data = np.random.uniform(0, 50, (2, 3)).astype(np.int64)
sin_cache_data = np.random.rand(50, 4).astype(np.float32)
cos_cache_data = np.random.rand(50, 4).astype(np.float32)
expected_output = rotary_embedding(
input_data, cos_cache_data, sin_cache_data, position_ids=position_ids_data
)
expect(
node,
inputs=[input_data, cos_cache_data, sin_cache_data, position_ids_data],
outputs=[expected_output],
name="test_rotary_embedding",
)
num_heads = 4
node = onnx.helper.make_node(
"RotaryEmbedding",
inputs=["input", "cos_cache", "sin_cache", "position_ids"],
outputs=["output"],
num_heads=num_heads,
)
input_data = np.random.rand(2, 3, 32).astype(np.float32)
position_ids_data = np.random.uniform(0, 50, (2, 3)).astype(np.int64)
sin_cache_data = np.random.rand(50, 4).astype(np.float32)
cos_cache_data = np.random.rand(50, 4).astype(np.float32)
expected_output = rotary_embedding(
input_data,
cos_cache_data,
sin_cache_data,
position_ids=position_ids_data,
num_heads=num_heads,
)
expect(
node,
inputs=[input_data, cos_cache_data, sin_cache_data, position_ids_data],
outputs=[expected_output],
name="test_rotary_embedding_3d_input",
)
node = onnx.helper.make_node(
"RotaryEmbedding",
inputs=["input", "cos_cache", "sin_cache", "position_ids"],
outputs=["output"],
interleaved=1,
)
input_data = np.random.rand(2, 4, 3, 8).astype(np.float32)
position_ids_data = np.random.uniform(0, 50, (2, 3)).astype(np.int64)
sin_cache_data = np.random.rand(50, 4).astype(np.float32)
cos_cache_data = np.random.rand(50, 4).astype(np.float32)
expected_output = rotary_embedding(
input_data,
cos_cache_data,
sin_cache_data,
position_ids=position_ids_data,
interleaved=1,
)
expect(
node,
inputs=[input_data, cos_cache_data, sin_cache_data, position_ids_data],
outputs=[expected_output],
name="test_rotary_embedding_interleaved",
)
node = onnx.helper.make_node(
"RotaryEmbedding",
inputs=["input", "cos_cache", "sin_cache"],
outputs=["output"],
)
input_data = np.random.rand(2, 4, 3, 8).astype(np.float32)
sin_cache_data = np.random.rand(2, 3, 4).astype(np.float32)
cos_cache_data = np.random.rand(2, 3, 4).astype(np.float32)
expected_output = rotary_embedding(input_data, cos_cache_data, sin_cache_data)
expect(
node,
inputs=[input_data, cos_cache_data, sin_cache_data],
outputs=[expected_output],
name="test_rotary_embedding_no_position_ids",
)
node = onnx.helper.make_node(
"RotaryEmbedding",
inputs=["input", "cos_cache", "sin_cache"],
outputs=["output"],
interleaved=1,
)
input_data = np.random.rand(2, 4, 3, 8).astype(np.float32)
sin_cache_data = np.random.rand(2, 3, 4).astype(np.float32)
cos_cache_data = np.random.rand(2, 3, 4).astype(np.float32)
expected_output = rotary_embedding(
input_data,
cos_cache_data,
sin_cache_data,
interleaved=1,
)
expect(
node,
inputs=[input_data, cos_cache_data, sin_cache_data],
outputs=[expected_output],
name="test_rotary_embedding_no_position_ids_interleaved",
)
node = onnx.helper.make_node(
"RotaryEmbedding",
inputs=["input", "cos_cache", "sin_cache"],
outputs=["output"],
rotary_embedding_dim=4,
)
input_data = np.random.rand(2, 4, 3, 8).astype(np.float32)
sin_cache_data = np.random.rand(2, 3, 2).astype(np.float32)
cos_cache_data = np.random.rand(2, 3, 2).astype(np.float32)
expected_output = rotary_embedding(
input_data,
cos_cache_data,
sin_cache_data,
rotary_embedding_dim=4,
)
expect(
node,
inputs=[input_data, cos_cache_data, sin_cache_data],
outputs=[expected_output],
name="test_rotary_embedding_no_position_ids_rotary_dim",
)
node = onnx.helper.make_node(
"RotaryEmbedding",
inputs=["input", "cos_cache", "sin_cache", "position_ids"],
outputs=["output"],
rotary_embedding_dim=4,
interleaved=1,
)
input_data = np.random.rand(2, 4, 3, 8).astype(np.float32)
position_ids_data = np.random.uniform(0, 50, (2, 3)).astype(np.int64)
sin_cache_data = np.random.rand(50, 2).astype(np.float32)
cos_cache_data = np.random.rand(50, 2).astype(np.float32)
expected_output = rotary_embedding(
input_data,
cos_cache_data,
sin_cache_data,
position_ids=position_ids_data,
interleaved=1,
rotary_embedding_dim=4,
)
expect(
node,
inputs=[input_data, cos_cache_data, sin_cache_data, position_ids_data],
outputs=[expected_output],
name="test_rotary_embedding_with_interleaved_rotary_dim",
)
node = onnx.helper.make_node(
"RotaryEmbedding",
inputs=["input", "cos_cache", "sin_cache", "position_ids"],
outputs=["output"],
rotary_embedding_dim=4,
)
input_data = np.random.rand(2, 4, 3, 8).astype(np.float32)
position_ids_data = np.random.uniform(0, 50, (2, 3)).astype(np.int64)
sin_cache_data = np.random.rand(50, 2).astype(np.float32)
cos_cache_data = np.random.rand(50, 2).astype(np.float32)
expected_output = rotary_embedding(
input_data,
cos_cache_data,
sin_cache_data,
position_ids=position_ids_data,
rotary_embedding_dim=4,
)
expect(
node,
inputs=[input_data, cos_cache_data, sin_cache_data, position_ids_data],
outputs=[expected_output],
name="test_rotary_embedding_with_rotary_dim",
)
Round takes one input Tensor and rounds the values, element-wise, meaning it finds the nearest integer for each value. In case of halves, the rule is to round them to the nearest even integer. If input x is integral, +0, -0, NaN, or infinite, x itself is returned. The output tensor has the same shape and type as the input.
Examples:
round([0.9]) = [1.0]
round([2.5]) = [2.0]
round([2.3]) = [2.0]
round([1.5]) = [2.0]
round([-4.5]) = [-4.0]
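The round-half-to-even rule is the same one NumPy's `np.rint` applies, so the examples above can be reproduced with a short sketch (offered as an illustration of the rule, not as the operator's implementation):

```python
import numpy as np

x = np.array([0.9, 2.5, 2.3, 1.5, -4.5], dtype=np.float32)
# Halves go to the nearest even integer: 2.5 -> 2.0, 1.5 -> 2.0, -4.5 -> -4.0.
rounded = np.rint(x)
```

The result keeps the shape and dtype of the input.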
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Round-11">11</a>
node = onnx.helper.make_node(
"Round",
inputs=["x"],
outputs=["y"],
)
x = np.array(
[
0.1,
0.5,
0.9,
1.2,
1.5,
1.8,
2.3,
2.5,
2.7,
-1.1,
-1.5,
-1.9,
-2.2,
-2.5,
-2.8,
]
).astype(np.float32)
# expected output
y = np.array(
[
0.0,
0.0,
1.0,
1.0,
2.0,
2.0,
2.0,
2.0,
3.0,
-1.0,
-2.0,
-2.0,
-2.0,
-2.0,
-3.0,
]
).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_round")
Computes the Short-time Fourier Transform of the signal.
This version of the operator has been available since version 17 of the default ONNX operator set.
signal = np.arange(0, 128, dtype=np.float32).reshape(1, 128, 1)
length = np.array(16).astype(np.int64)
onesided_length = (length >> 1) + 1
step = np.array(8).astype(np.int64)
no_window = "" # optional input, not supplied
node = onnx.helper.make_node(
"STFT",
inputs=["signal", "frame_step", no_window, "frame_length"],
outputs=["output"],
)
nstfts = ((signal.shape[1] - length) // step) + 1
# [batch_size][frames][frame_length][2]
output = np.empty([1, nstfts, onesided_length, 2], dtype=np.float32)
for i in range(nstfts):
    start = i * step
    stop = i * step + length
    complex_out = np.fft.fft(signal[0, start:stop, 0])[0:onesided_length]
    output[0, i] = np.stack((complex_out.real, complex_out.imag), axis=1)
output = output.astype(signal.dtype)
expect(node, inputs=[signal, step, length], outputs=[output], name="test_stft")
node = onnx.helper.make_node(
"STFT",
inputs=["signal", "frame_step", "window"],
outputs=["output"],
)
# Test with window
a0 = 0.5
a1 = 0.5
window = a0 + a1 * np.cos(
2 * np.pi * np.arange(0, length, 1, dtype=np.float32) / length
)
nstfts = 1 + (signal.shape[1] - window.shape[0]) // step
# [batch_size][frames][frame_length][2]
output = np.empty([1, nstfts, onesided_length, 2], dtype=np.float32)
for i in range(nstfts):
    start = i * step
    stop = i * step + length
    complex_out = np.fft.fft(signal[0, start:stop, 0] * window)[
        0:onesided_length
    ]
    output[0, i] = np.stack((complex_out.real, complex_out.imag), axis=1)
window = window.astype(signal.dtype)
output = output.astype(signal.dtype)
expect(
node,
inputs=[signal, step, window],
outputs=[output],
name="test_stft_with_window",
)
Scan can be used to iterate over one or more scan_input tensors, constructing zero or more scan_output tensors. It combines ideas from general recurrences, functional programming constructs such as scan, fold, map, and zip, and is intended to enable generalizations of RNN-like constructs for sequence-to-sequence processing. Other tensors (referred to as state_variables here) can be used to carry a state when iterating from one element to another (similar to hidden-state in RNNs, also referred to as loop-carried dependences in the context of loops). Many common usages involve a single scan_input tensor (where functionality similar to scan, fold and map can be obtained). When more than one scan_input is used, a behavior similar to zip is obtained.
The attribute body must be a graph, specifying the computation to be performed in every iteration. It takes as input the current values of the state_variables and the current iterated element of the scan_inputs. It must return the (updated) values of the state_variables and zero or more scan_output_element tensors. The values of the scan_output_element tensors are concatenated over all the iterations to produce the scan_output values of the scan construct (similar to the concatenated intermediate hidden-state values of RNN-like constructs). All the output tensors (state_variables as well as scan_output_element tensors) are required to have the same shape in each iteration of the loop (a restriction imposed to enable efficient memory allocation).
Note that the iterated element passed to the body subgraph does not have a sequence axis. It will have a rank one less than the rank of the corresponding scan_input.
The scan operation returns the final values of the state_variables as well as the scan_outputs.
The optional attribute scan_input_directions specifies the direction (forward or backward) for each scan input. If this attribute is omitted, all sequences are scanned in the forward direction. A bidirectional scan may be performed by specifying the same tensor input twice in the scan_inputs, once with a forward direction, and once with a backward direction.
The scan_output of the operation is produced by concatenating the scan_output_element values produced by the body in each iteration. The optional attribute scan_output_directions specifies the direction in which scan_output is constructed (by appending or prepending the scan_output_element to scan_output in each iteration) for each scan_output. If this attribute is omitted, the scan_output_element is appended to the scan_output in each iteration.
The optional attribute scan_input_axes specifies the axis to be scanned for each scan_input. If omitted, every scan_input will be scanned in axis 0. For example, if axis 0 is the batch axis and axis 1 is the time axis (to be scanned), specify an axis value of 1. Note that scanning a non-zero axis may be less efficient than scanning axis zero.
The optional attribute scan_output_axes specifies the axis along which the scan_outputs are accumulated for each scan_output. For example, if axis 1 is the time axis (to be scanned) for both inputs and outputs, specify a scan_input axis and scan_output axis value of 1.
Note that because of the ONNX restriction that only the last parameter of an operator can be variadic, the initial-states and scan-inputs are listed together as one input parameter. Similarly, the final-states and scan-outputs are listed together as one output parameter. The attribute num_scan_inputs indicates the number M of scan-inputs.
The behavior of
Scan <
num_scan_inputs = m,
body = loop-body,
scan_input_axes = [axis_1, ..., axis_m]
> (init_1, ..., init_n, scan_1, ..., scan_m)
is equivalent to the following pseudo-code:
// scan_i.shape[axis_i] denotes the (max) sequence-length of scan_i
// scan_i.shape[axis_i] is required to be equal to scan_j.shape[axis_j] for all i,j.
sequence_length = scan_1.shape[axis_1];
// initialize state-variables
st_1 = init_1; ... st_n = init_n;
// initialize scan-output variables: [] denotes an empty tensor
scan_out_1 = []; ...; scan_out_k = [];
// identify number of iterations:
// execute loop
for (int t = 0; t < sequence_length; ++t) {
// generate the scan-input elements: the notation T<axis=k>[t] indicates the sub-tensor
// of rank one less than T obtained by indexing T at position t along axis k.
si_1 = scan_1<axis=axis_1>[t];
... ;
si_m = scan_m<axis=axis_m>[t];
// execute loop-body
st_1, ..., st_n, so_1, ..., so_k = loop-body(st_1, ..., st_n, si_1, ..., si_m)
// accumulate the scan-output elements
scan_out_1 = Concat<axis=0>(scan_out_1, so_1); ... ; scan_out_k = Concat<axis=0>(scan_out_k, so_k);
}
return st_1, ..., st_n, scan_out_1, ..., scan_out_k;
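The pseudo-code above can be mirrored in Python as a rough sketch. The function `scan_sketch` and its signature are hypothetical; the body graph is modelled as a plain Python callable, all scans run forward, and scan outputs are always stacked along axis 0.

```python
import numpy as np


def scan_sketch(body, init_states, scan_inputs, scan_input_axes=None):
    """Hypothetical forward-only model of the Scan loop."""
    num_states = len(init_states)
    axes = scan_input_axes or [0] * len(scan_inputs)
    sequence_length = scan_inputs[0].shape[axes[0]]
    states = list(init_states)
    per_iteration_outputs = []
    for t in range(sequence_length):
        # Each iterated element has rank one less than its scan input.
        elements = [np.take(s, t, axis=a) for s, a in zip(scan_inputs, axes)]
        results = body(*states, *elements)
        states = list(results[:num_states])
        per_iteration_outputs.append(results[num_states:])
    # Concatenate the k scan-output elements over all iterations.
    scan_outputs = [np.stack(o, axis=0) for o in zip(*per_iteration_outputs)]
    return states, scan_outputs


def running_sum(sum_in, next_):
    sum_out = sum_in + next_
    return sum_out, sum_out  # (updated state, scan-output element)


x = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.float32)
(final,), (partial_sums,) = scan_sketch(running_sum, [np.zeros(2, np.float32)], [x])
```

Here `final` holds the last state and `partial_sums` holds the running sums for every step, one row per iteration.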
Sample usage: Encoding RNN using a Scan
The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi, recurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hidden-state %H_0 can be encoded as a ScanLoop. Note that the loop-body is a nested graph, and it directly computes %Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these values are computed in the outer graph, they need to be passed in as extra state_variables.
graph rnn-encoding {
%H_0 = ...
%X = ...
%Y_h, %Y = Scan[body = <graph rnn-cell-1>, num_scan_inputs=1](%H_0, %X)
return %Y, %Y_h
}
graph rnn-cell-1 (
%H_tminus1[FLOAT, tensor]
%X_t[FLOAT, tensor]
) {
%Wi = ...
%Ri = ...
%Wbi = ...
%Rbi = ...
%t1 = X_t * (Wi^T)
%t2 = H_tminus1*(Ri^T)
%t3 = Add(%t1, %t2)
%t4 = Add(%t3, %Wbi)
%t5 = Add(%t4, %Rbi)
%Ht = Tanh(%t5)
%Accumulate = Identity(%Ht)
return %Ht, %Accumulate
}
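For comparison, the computation that this graph encodes can be sketched directly in NumPy. The function name `rnn_via_scan` and the tensor shapes (X as [seq_len, batch, input_size], Wi as [hidden_size, input_size], Ri as [hidden_size, hidden_size]) are assumptions chosen for this sketch.

```python
import numpy as np


def rnn_via_scan(X, H_0, Wi, Ri, Wbi, Rbi):
    """Hypothetical NumPy equivalent of the rnn-encoding graph."""
    H = H_0  # state variable carried across iterations
    outputs = []
    for X_t in X:  # scan over axis 0 of the scan input
        H = np.tanh(X_t @ Wi.T + H @ Ri.T + Wbi + Rbi)
        outputs.append(H)  # the %Accumulate scan-output element
    return H, np.stack(outputs, axis=0)


rng = np.random.default_rng(0)
seq_len, batch, input_size, hidden_size = 3, 2, 4, 5
X = rng.standard_normal((seq_len, batch, input_size))
H_0 = np.zeros((batch, hidden_size))
Wi = rng.standard_normal((hidden_size, input_size))
Ri = rng.standard_normal((hidden_size, hidden_size))
Wbi = np.zeros(hidden_size)
Rbi = np.zeros(hidden_size)
Y_h, Y = rnn_via_scan(X, H_0, Wi, Ri, Wbi, Rbi)
```

`Y_h` corresponds to the final state and `Y` to the scan output, so the last slice of `Y` equals `Y_h`.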
This version of the operator has been available since version 25 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Scan-8">8</a>, <a href="Changelog.md#Scan-9">9</a>, <a href="Changelog.md#Scan-11">11</a>, <a href="Changelog.md#Scan-16">16</a>, <a href="Changelog.md#Scan-19">19</a>, <a href="Changelog.md#Scan-21">21</a>, <a href="Changelog.md#Scan-23">23</a>, <a href="Changelog.md#Scan-24">24</a>
# Given an input sequence [x1, ..., xN], sum up its elements using a scan
# returning the final state (x1+x2+...+xN) as well as the scan_output
# [x1, x1+x2, ..., x1+x2+...+xN]
#
# create graph to represent scan body
sum_in = onnx.helper.make_tensor_value_info(
"sum_in", onnx.TensorProto.FLOAT, [2]
)
next_ = onnx.helper.make_tensor_value_info("next", onnx.TensorProto.FLOAT, [2])
sum_out = onnx.helper.make_tensor_value_info(
"sum_out", onnx.TensorProto.FLOAT, [2]
)
scan_out = onnx.helper.make_tensor_value_info(
"scan_out", onnx.TensorProto.FLOAT, [2]
)
add_node = onnx.helper.make_node(
"Add", inputs=["sum_in", "next"], outputs=["sum_out"]
)
id_node = onnx.helper.make_node(
"Identity", inputs=["sum_out"], outputs=["scan_out"]
)
scan_body = onnx.helper.make_graph(
[add_node, id_node], "scan_body", [sum_in, next_], [sum_out, scan_out]
)
# create scan op node
no_sequence_lens = "" # optional input, not supplied
node = onnx.helper.make_node(
"Scan",
inputs=[no_sequence_lens, "initial", "x"],
outputs=["y", "z"],
num_scan_inputs=1,
body=scan_body,
)
# create inputs for batch-size 1, sequence-length 3, inner dimension 2
initial = np.array([0, 0]).astype(np.float32).reshape((1, 2))
x = np.array([1, 2, 3, 4, 5, 6]).astype(np.float32).reshape((1, 3, 2))
# final state computed = [1 + 3 + 5, 2 + 4 + 6]
y = np.array([9, 12]).astype(np.float32).reshape((1, 2))
# scan-output computed
z = np.array([1, 2, 4, 6, 9, 12]).astype(np.float32).reshape((1, 3, 2))
expect(
node,
inputs=[initial, x],
outputs=[y, z],
name="test_scan_sum",
opset_imports=[onnx.helper.make_opsetid("", 8)],
)
# Given an input sequence [x1, ..., xN], sum up its elements using a scan
# returning the final state (x1+x2+...+xN) as well as the scan_output
# [x1, x1+x2, ..., x1+x2+...+xN]
#
# create graph to represent scan body
sum_in = onnx.helper.make_tensor_value_info(
"sum_in", onnx.TensorProto.FLOAT, [2]
)
next_ = onnx.helper.make_tensor_value_info("next", onnx.TensorProto.FLOAT, [2])
sum_out = onnx.helper.make_tensor_value_info(
"sum_out", onnx.TensorProto.FLOAT, [2]
)
scan_out = onnx.helper.make_tensor_value_info(
"scan_out", onnx.TensorProto.FLOAT, [2]
)
add_node = onnx.helper.make_node(
"Add", inputs=["sum_in", "next"], outputs=["sum_out"]
)
id_node = onnx.helper.make_node(
"Identity", inputs=["sum_out"], outputs=["scan_out"]
)
scan_body = onnx.helper.make_graph(
[add_node, id_node], "scan_body", [sum_in, next_], [sum_out, scan_out]
)
# create scan op node
node = onnx.helper.make_node(
"Scan",
inputs=["initial", "x"],
outputs=["y", "z"],
num_scan_inputs=1,
body=scan_body,
)
# create inputs for sequence-length 3, inner dimension 2
initial = np.array([0, 0]).astype(np.float32).reshape((2,))
x = np.array([1, 2, 3, 4, 5, 6]).astype(np.float32).reshape((3, 2))
# final state computed = [1 + 3 + 5, 2 + 4 + 6]
y = np.array([9, 12]).astype(np.float32).reshape((2,))
# scan-output computed
z = np.array([1, 2, 4, 6, 9, 12]).astype(np.float32).reshape((3, 2))
expect(
node,
inputs=[initial, x],
outputs=[y, z],
name="test_scan9_sum",
opset_imports=[onnx.helper.make_opsetid("", 9)],
)
This operator is deprecated. Please use ScatterElements, which provides the same functionality.
Scatter takes three inputs data, updates, and indices of the same
rank r >= 1 and an optional attribute axis that identifies an axis of data
(by default, the outer-most axis, that is axis 0). The output of the operation
is produced by creating a copy of the input data, and then updating its value
to values specified by updates at specific index positions specified by
indices. Its output shape is the same as the shape of data.
For each entry in updates, the target index in data is obtained by combining
the corresponding entry in indices with the index of the entry itself: the
index-value for dimension = axis is obtained from the value of the corresponding
entry in indices and the index-value for dimension != axis is obtained from the
index of the entry itself.
For instance, in a 2-D tensor case, the update corresponding to the [i][j] entry is performed as below:
output[indices[i][j]][j] = updates[i][j] if axis = 0,
output[i][indices[i][j]] = updates[i][j] if axis = 1,
This operator is the inverse of GatherElements. It is similar to Torch's Scatter operation.
Example 1:
data = [
[0.0, 0.0, 0.0],
[0.0, 0.0, 0.0],
[0.0, 0.0, 0.0],
]
indices = [
[1, 0, 2],
[0, 2, 1],
]
updates = [
[1.0, 1.1, 1.2],
[2.0, 2.1, 2.2],
]
output = [
[2.0, 1.1, 0.0],
[1.0, 0.0, 2.2],
[0.0, 2.1, 1.2],
]
Example 2:
data = [[1.0, 2.0, 3.0, 4.0, 5.0]]
indices = [[1, 3]]
updates = [[1.1, 2.1]]
axis = 1
output = [[1.0, 1.1, 3.0, 2.1, 5.0]]
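The update rule can be written as an explicit loop over the entries of indices. The following is a minimal sketch with a hypothetical helper name, `scatter_sketch`, and is not the reference implementation used by the tests:

```python
import numpy as np


def scatter_sketch(data, indices, updates, axis=0):
    """Hypothetical loop-based model of the Scatter update rule."""
    output = data.copy()
    for idx in np.ndindex(indices.shape):
        target = list(idx)
        # Along `axis` the position comes from indices; elsewhere from idx.
        target[axis] = indices[idx]
        output[tuple(target)] = updates[idx]
    return output


# Example 2 from the description.
data = np.array([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=np.float32)
indices = np.array([[1, 3]], dtype=np.int64)
updates = np.array([[1.1, 2.1]], dtype=np.float32)
output = scatter_sketch(data, indices, updates, axis=1)
```

The same function reproduces Example 1 when called with the 3x3 zero tensor and the default axis of 0.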
This version of the operator has been deprecated since version 11 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Scatter-9">9</a>
axis = 1
node = onnx.helper.make_node(
"Scatter",
inputs=["data", "indices", "updates"],
outputs=["y"],
axis=axis,
)
data = np.array([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=np.float32)
indices = np.array([[1, 3]], dtype=np.int64)
updates = np.array([[1.1, 2.1]], dtype=np.float32)
y = scatter(data, indices, updates, axis=axis)
# print(y) produces
# [[1.0, 1.1, 3.0, 2.1, 5.0]]
expect(
node,
inputs=[data, indices, updates],
outputs=[y],
name="test_scatter_with_axis",
opset_imports=[helper.make_opsetid("", 10)],
)
node = onnx.helper.make_node(
"Scatter",
inputs=["data", "indices", "updates"],
outputs=["y"],
)
data = np.zeros((3, 3), dtype=np.float32)
indices = np.array([[1, 0, 2], [0, 2, 1]], dtype=np.int64)
updates = np.array([[1.0, 1.1, 1.2], [2.0, 2.1, 2.2]], dtype=np.float32)
y = scatter(data, indices, updates)
# print(y) produces
# [[2.0, 1.1, 0.0],
# [1.0, 0.0, 2.2],
# [0.0, 2.1, 1.2]]
expect(
node,
inputs=[data, indices, updates],
outputs=[y],
name="test_scatter_without_axis",
opset_imports=[helper.make_opsetid("", 10)],
)
ScatterElements takes three inputs data, updates, and indices of the same
rank r >= 1 and an optional attribute axis that identifies an axis of data
(by default, the outer-most axis, that is axis 0). The output of the operation
is produced by creating a copy of the input data, and then updating its value
to values specified by updates at specific index positions specified by
indices. Its output shape is the same as the shape of data.
For each entry in updates, the target index in data is obtained by combining
the corresponding entry in indices with the index of the entry itself: the
index-value for dimension = axis is obtained from the value of the corresponding
entry in indices and the index-value for dimension != axis is obtained from the
index of the entry itself.
The optional reduction attribute specifies a reduction operation that combines each value in the updates
tensor with the value already in output at the specified index.
In cases where reduction is set to "none", indices should not have duplicate entries: that is, if idx1 != idx2,
then indices[idx1] != indices[idx2]. For instance, in a 2-D tensor case, the update
corresponding to the [i][j] entry is performed as below:
output[indices[i][j]][j] = updates[i][j] if axis = 0,
output[i][indices[i][j]] = updates[i][j] if axis = 1,
When reduction is set to some reduction function f, the update corresponding to the [i][j] entry is performed as below:
output[indices[i][j]][j] = f(output[indices[i][j]][j], updates[i][j]) if axis = 0,
output[i][indices[i][j]] = f(output[i][indices[i][j]], updates[i][j]) if axis = 1,
where f is +, *, max or min, as specified by reduction.
This operator is the inverse of GatherElements. It is similar to Torch's Scatter operation.
(Opset 18 change): Adds max/min to the set of allowed reduction ops.
Example 1:
data = [
[0.0, 0.0, 0.0],
[0.0, 0.0, 0.0],
[0.0, 0.0, 0.0],
]
indices = [
[1, 0, 2],
[0, 2, 1],
]
updates = [
[1.0, 1.1, 1.2],
[2.0, 2.1, 2.2],
]
output = [
[2.0, 1.1, 0.0],
[1.0, 0.0, 2.2],
[0.0, 2.1, 1.2],
]
Example 2:
data = [[1.0, 2.0, 3.0, 4.0, 5.0]]
indices = [[1, 3]]
updates = [[1.1, 2.1]]
axis = 1
output = [[1.0, 1.1, 3.0, 2.1, 5.0]]
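The update rule, including reduction and negative indices, can be sketched as an explicit loop over every entry of indices. This is a minimal illustration rather than the ONNX reference implementation. The names `scatter_elements_sketch` and `_REDUCERS` are hypothetical, and the sketch assumes a non-negative axis.

```python
import numpy as np

# Hypothetical lookup from the reduction attribute to a binary function f.
_REDUCERS = {
    "none": lambda old, new: new,
    "add": lambda old, new: old + new,
    "mul": lambda old, new: old * new,
    "max": np.maximum,
    "min": np.minimum,
}


def scatter_elements_sketch(data, indices, updates, axis=0, reduction="none"):
    f = _REDUCERS[reduction]
    output = np.copy(data)
    for idx in np.ndindex(indices.shape):
        target = list(idx)  # index-value for dimension != axis
        position = int(indices[idx])
        if position < 0:  # negative indices count from the end
            position += data.shape[axis]
        target[axis] = position  # index-value for dimension == axis
        target = tuple(target)
        output[target] = f(output[target], updates[idx])
    return output


data = np.array([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=np.float32)
updates = np.array([[1.1, 2.1]], dtype=np.float32)
duplicates = np.array([[1, 1]], dtype=np.int64)

added = scatter_elements_sketch(data, duplicates, updates, axis=1, reduction="add")
largest = scatter_elements_sketch(data, duplicates, updates, axis=1, reduction="max")
negative = scatter_elements_sketch(
    data, np.array([[1, -3]], dtype=np.int64), updates, axis=1
)
```

With a reduction other than "none", duplicate indices are well defined because +, *, max and min give the same result in any iteration order (up to floating-point rounding for + and *).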
This version of the operator has been available since version 18 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#ScatterElements-11">11</a>, <a href="Changelog.md#ScatterElements-13">13</a>, <a href="Changelog.md#ScatterElements-16">16</a>
axis = 1
node = onnx.helper.make_node(
"ScatterElements",
inputs=["data", "indices", "updates"],
outputs=["y"],
axis=axis,
)
data = np.array([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=np.float32)
indices = np.array([[1, 3]], dtype=np.int64)
updates = np.array([[1.1, 2.1]], dtype=np.float32)
y = scatter_elements(data, indices, updates, axis)
# print(y) produces
# [[1.0, 1.1, 3.0, 2.1, 5.0]]
expect(
node,
inputs=[data, indices, updates],
outputs=[y],
name="test_scatter_elements_with_axis",
)
axis = 1
node = onnx.helper.make_node(
"ScatterElements",
inputs=["data", "indices", "updates"],
outputs=["y"],
axis=axis,
reduction="add",
)
data = np.array([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=np.float32)
indices = np.array([[1, 1]], dtype=np.int64)
updates = np.array([[1.1, 2.1]], dtype=np.float32)
y = scatter_elements(data, indices, updates, axis, reduction="add")
# print(y) produces
# [[1.0, 5.2, 3.0, 4.0, 5.0]]
expect(
node,
inputs=[data, indices, updates],
outputs=[y],
name="test_scatter_elements_with_duplicate_indices",
)
axis = 1
node = onnx.helper.make_node(
"ScatterElements",
inputs=["data", "indices", "updates"],
outputs=["y"],
axis=axis,
)
data = np.array([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=np.float32)
indices = np.array([[1, -3]], dtype=np.int64)
updates = np.array([[1.1, 2.1]], dtype=np.float32)
y = scatter_elements(data, indices, updates, axis)
# print(y) produces
# [[1.0, 1.1, 2.1, 4.0, 5.0]]
expect(
node,
inputs=[data, indices, updates],
outputs=[y],
name="test_scatter_elements_with_negative_indices",
)
axis = 1
node = onnx.helper.make_node(
"ScatterElements",
inputs=["data", "indices", "updates"],
outputs=["y"],
axis=axis,
reduction="max",
)
data = np.array([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=np.float32)
indices = np.array([[1, 1]], dtype=np.int64)
updates = np.array([[1.1, 2.1]], dtype=np.float32)
y = scatter_elements(data, indices, updates, axis, reduction="max")
# print(y) produces
# [[1.0, 2.1, 3.0, 4.0, 5.0]]
expect(
node,
inputs=[data, indices, updates],
outputs=[y],
name="test_scatter_elements_with_reduction_max",
)
axis = 1
node = onnx.helper.make_node(
"ScatterElements",
inputs=["data", "indices", "updates"],
outputs=["y"],
axis=axis,
reduction="min",
)
data = np.array([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=np.float32)
indices = np.array([[1, 1]], dtype=np.int64)
updates = np.array([[1.1, 2.1]], dtype=np.float32)
y = scatter_elements(data, indices, updates, axis, reduction="min")
# print(y) produces
# [[1.0, 1.1, 3.0, 4.0, 5.0]]
expect(
node,
inputs=[data, indices, updates],
outputs=[y],
name="test_scatter_elements_with_reduction_min",
)
node = onnx.helper.make_node(
"ScatterElements",
inputs=["data", "indices", "updates"],
outputs=["y"],
)
data = np.zeros((3, 3), dtype=np.float32)
indices = np.array([[1, 0, 2], [0, 2, 1]], dtype=np.int64)
updates = np.array([[1.0, 1.1, 1.2], [2.0, 2.1, 2.2]], dtype=np.float32)
y = scatter_elements(data, indices, updates)
# print(y) produces
# [[2.0, 1.1, 0.0],
# [1.0, 0.0, 2.2],
# [0.0, 2.1, 1.2]]
expect(
node,
inputs=[data, indices, updates],
outputs=[y],
name="test_scatter_elements_without_axis",
)
ScatterND takes three inputs data tensor of rank r >= 1, indices tensor of rank q >= 1,
and updates tensor of rank q + r - indices.shape[-1] - 1. The output of the operation
is produced by creating a copy of the input data, and then updating its value to values
specified by updates at specific index positions specified by indices. Its output shape
is the same as the shape of data.
indices is an integer tensor. Let k denote indices.shape[-1], the last dimension in the shape of indices.
indices is treated as a (q-1)-dimensional tensor of k-tuples, where each k-tuple is a partial-index into data.
Hence, k can be a value at most the rank of data. When k equals rank(data), each update entry specifies an
update to a single element of the tensor. When k is less than rank(data) each update entry specifies an
update to a slice of the tensor. Index values are allowed to be negative, as per the usual
convention for counting backwards from the end, but are expected in the valid range.
updates is treated as a (q-1)-dimensional tensor of replacement-slice-values. Thus, the
first (q-1) dimensions of updates.shape must match the first (q-1) dimensions of indices.shape.
The remaining dimensions of updates correspond to the dimensions of the
replacement-slice-values. Each replacement-slice-value is a (r-k) dimensional tensor,
corresponding to the trailing (r-k) dimensions of data. Thus, the shape of updates
must equal indices.shape[0:q-1] ++ data.shape[k:r], where ++ denotes the concatenation
of shapes and both slices exclude their end index.
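The shape rule can be checked with a few lines of Python. The helper below is a hypothetical illustration (`expected_updates_shape` is not an ONNX function). It concatenates the first q-1 dimensions of indices with the trailing r-k dimensions of data.

```python
def expected_updates_shape(data_shape, indices_shape):
    """Shape that `updates` must have for the given data and indices shapes."""
    k = indices_shape[-1]  # length of each partial index
    r = len(data_shape)    # rank of data
    if k > r:
        raise ValueError("indices.shape[-1] must not exceed rank(data)")
    # First (q-1) dims of indices, then the trailing (r-k) dims of data.
    return tuple(indices_shape[:-1]) + tuple(data_shape[k:])


elementwise = expected_updates_shape((8,), (4, 1))     # k == r: single elements
slicewise = expected_updates_shape((4, 4, 4), (2, 1))  # k < r: whole slices
```

In the second case q = 2, r = 3 and k = 1, so the rank of updates is q + r - k - 1 = 3.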
The output is calculated as follows:
output = np.copy(data)
update_indices = indices.shape[:-1]
for idx in np.ndindex(update_indices):
output[tuple(indices[idx])] = updates[idx]
The order of iteration in the above loop is not specified. In particular, indices should not have duplicate entries: that is, if idx1 != idx2, then indices[idx1] != indices[idx2]. This ensures that the output value does not depend on the iteration order.
The optional reduction attribute specifies a reduction operation that combines each value in the updates
tensor with the value already in output at the specified index.
In cases where reduction is set to "none", indices should not have duplicate entries: that is, if idx1 != idx2,
then indices[idx1] != indices[idx2]. This ensures that the output value does not depend on the iteration order.
When reduction is set to some reduction function f, output is calculated as follows:
output = np.copy(data)
update_indices = indices.shape[:-1]
for idx in np.ndindex(update_indices):
output[tuple(indices[idx])] = f(output[tuple(indices[idx])], updates[idx])
where f is +, *, max or min, as specified by reduction.
This operator is the inverse of GatherND.
(Opset 18 change): Adds max/min to the set of allowed reduction ops.
Example 1:
data = [1, 2, 3, 4, 5, 6, 7, 8]
indices = [[4], [3], [1], [7]]
updates = [9, 10, 11, 12]
output = [1, 11, 3, 10, 9, 6, 7, 12]
Example 2:
data = [[[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
[[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
[[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
[[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]]]
indices = [[0], [2]]
updates = [[[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]],
[[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]]]
output = [[[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]],
[[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
[[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]],
[[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]]]
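The two loops in the description can be combined into one runnable function. This is a sketch under the stated rules, not the ONNX reference implementation. `scatter_nd_sketch` and `_REDUCERS` are hypothetical names. It reproduces Example 1 and then shows how "add" accumulates over duplicate indices.

```python
import numpy as np

# Hypothetical lookup from the reduction attribute to a binary function f.
_REDUCERS = {
    "none": lambda old, new: new,
    "add": lambda old, new: old + new,
    "mul": lambda old, new: old * new,
    "max": np.maximum,
    "min": np.minimum,
}


def scatter_nd_sketch(data, indices, updates, reduction="none"):
    f = _REDUCERS[reduction]
    output = np.copy(data)
    # Iterate over the (q-1)-dimensional grid of k-tuples in indices.
    for idx in np.ndindex(indices.shape[:-1]):
        target = tuple(indices[idx])  # partial index into data
        output[target] = f(output[target], updates[idx])
    return output


# Example 1: k == rank(data), so each update replaces a single element.
data = np.array([1, 2, 3, 4, 5, 6, 7, 8])
indices = np.array([[4], [3], [1], [7]])
updates = np.array([9, 10, 11, 12])
replaced = scatter_nd_sketch(data, indices, updates)

# Duplicate indices are only well defined with a reduction.
accumulated = scatter_nd_sketch(
    np.zeros(3, dtype=np.int64),
    np.array([[0], [0], [2]]),
    np.array([1, 2, 3]),
    reduction="add",
)
```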
This version of the operator has been available since version 18 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#ScatterND-11">11</a>, <a href="Changelog.md#ScatterND-13">13</a>, <a href="Changelog.md#ScatterND-16">16</a>
node = onnx.helper.make_node(
"ScatterND",
inputs=["data", "indices", "updates"],
outputs=["y"],
)
data = np.array(
[
[[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
[[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
[[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
[[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
],
dtype=np.float32,
)
indices = np.array([[0], [2]], dtype=np.int64)
updates = np.array(
[
[[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]],
[[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]],
],
dtype=np.float32,
)
# Expecting output as np.array(
# [[[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]],
# [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
# [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]],
# [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]]], dtype=np.float32)
output = scatter_nd_impl(data, indices, updates)
expect(
node,
inputs=[data, indices, updates],
outputs=[output],
name="test_scatternd",
)
node = onnx.helper.make_node(
"ScatterND",
inputs=["data", "indices", "updates"],
outputs=["y"],
reduction="add",
)
data = np.array(
[
[[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
[[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
[[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
[[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
],
dtype=np.float32,
)
indices = np.array([[0], [0]], dtype=np.int64)
updates = np.array(
[
[[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]],
[[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]],
],
dtype=np.float32,
)
# Expecting output as np.array(
# [[[7, 8, 9, 10], [13, 14, 15, 16], [18, 17, 16, 15], [16, 15, 14, 13]],
# [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
# [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
# [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]]], dtype=np.float32)
output = scatter_nd_impl(data, indices, updates, reduction="add")
expect(
node,
inputs=[data, indices, updates],
outputs=[output],
name="test_scatternd_add",
)
node = onnx.helper.make_node(
"ScatterND",
inputs=["data", "indices", "updates"],
outputs=["y"],
reduction="max",
)
data = np.array(
[
[[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
[[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
[[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
[[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
],
dtype=np.float32,
)
indices = np.array([[0], [0]], dtype=np.int64)
updates = np.array(
[
[[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]],
[[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]],
],
dtype=np.float32,
)
# Expecting output as np.array(
# [[[5, 5, 5, 5], [6, 6, 7, 8], [8, 7, 7, 7], [8, 8, 8, 8]],
# [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
# [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
# [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]]], dtype=np.float32)
output = scatter_nd_impl(data, indices, updates, reduction="max")
expect(
node,
inputs=[data, indices, updates],
outputs=[output],
name="test_scatternd_max",
)
node = onnx.helper.make_node(
"ScatterND",
inputs=["data", "indices", "updates"],
outputs=["y"],
reduction="min",
)
data = np.array(
[
[[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
[[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
[[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
[[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
],
dtype=np.float32,
)
indices = np.array([[0], [0]], dtype=np.int64)
updates = np.array(
[
[[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]],
[[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]],
],
dtype=np.float32,
)
# Expecting output as np.array(
# [[[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 3, 2, 1]],
# [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
# [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
# [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]]], dtype=np.float32)
output = scatter_nd_impl(data, indices, updates, reduction="min")
expect(
node,
inputs=[data, indices, updates],
outputs=[output],
name="test_scatternd_min",
)
node = onnx.helper.make_node(
"ScatterND",
inputs=["data", "indices", "updates"],
outputs=["y"],
reduction="mul",
)
data = np.array(
[
[[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
[[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
[[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
[[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
],
dtype=np.float32,
)
indices = np.array([[0], [0]], dtype=np.int64)
updates = np.array(
[
[[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]],
[[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]],
],
dtype=np.float32,
)
# Expecting output as np.array(
# [[[5, 10, 15, 20], [60, 72, 84, 96], [168, 147, 126, 105], [128, 96, 64, 32]],
# [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
# [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
# [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]]], dtype=np.float32)
output = scatter_nd_impl(data, indices, updates, reduction="mul")
expect(
node,
inputs=[data, indices, updates],
outputs=[output],
name="test_scatternd_multiply",
)
Selu takes one input data (Tensor<T>) and produces one output data
(Tensor<T>) where the scaled exponential linear unit function,
y = gamma * (alpha * e^x - alpha) for x <= 0, y = gamma * x for x > 0,
is applied to the tensor elementwise.
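The piecewise definition can be written directly with `np.where`. The function below is a minimal sketch (`selu_sketch` is a hypothetical name). It takes alpha and gamma as explicit arguments and evaluates the exponential only on the non-positive part of the input so that large positive values cannot overflow.

```python
import numpy as np


def selu_sketch(x, alpha, gamma):
    x = np.asarray(x, dtype=np.float32)
    # Branch for x <= 0: gamma * (alpha * e^x - alpha)
    negative_branch = gamma * (alpha * np.exp(np.minimum(x, 0.0)) - alpha)
    # Branch for x > 0: gamma * x
    positive_branch = gamma * x
    return np.where(x > 0, positive_branch, negative_branch)


y = selu_sketch([-1.0, 0.0, 1.0], alpha=2.0, gamma=3.0)
```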
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Selu-1">1</a>, <a href="Changelog.md#Selu-6">6</a>
node = onnx.helper.make_node(
"Selu", inputs=["x"], outputs=["y"], alpha=2.0, gamma=3.0
)
x = np.array([-1, 0, 1]).astype(np.float32)
# expected output [-3.79272318, 0., 3.]
y = (
np.clip(x, 0, np.inf) * 3.0
+ (np.exp(np.clip(x, -np.inf, 0)) - 1) * 2.0 * 3.0
)
expect(node, inputs=[x], outputs=[y], name="test_selu_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = (
np.clip(x, 0, np.inf) * 3.0
+ (np.exp(np.clip(x, -np.inf, 0)) - 1) * 2.0 * 3.0
)
expect(node, inputs=[x], outputs=[y], name="test_selu")
default_alpha = 1.67326319217681884765625
default_gamma = 1.05070102214813232421875
node = onnx.helper.make_node(
"Selu",
inputs=["x"],
outputs=["y"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = (
np.clip(x, 0, np.inf) * default_gamma
+ (np.exp(np.clip(x, -np.inf, 0)) - 1) * default_alpha * default_gamma
)
expect(node, inputs=[x], outputs=[y], name="test_selu_default")
Outputs a tensor copy from the tensor at 'position' in 'input_sequence'.
Accepted range for 'position' is in [-n, n - 1], where n is the number of tensors in 'input_sequence'.
Negative value means counting positions from the back.
This version of the operator has been available since version 11 of the default ONNX operator set.
Construct a tensor sequence containing 'inputs' tensors. All tensors in 'inputs' must have the same data type.
This version of the operator has been available since version 11 of the default ONNX operator set.
Construct an empty tensor sequence, with given data type.
This version of the operator has been available since version 11 of the default ONNX operator set.
Outputs a tensor sequence that removes the tensor at 'position' from 'input_sequence'.
Accepted range for 'position' is in [-n, n - 1], where n is the number of tensors in 'input_sequence'.
Negative value means counting positions from the back.
'position' is optional, by default it erases the last tensor from 'input_sequence'.
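The position rules can be illustrated with a Python list standing in for the tensor sequence. This is a hypothetical sketch (`sequence_erase_sketch` is not an ONNX function), and plain strings stand in for tensors.

```python
def sequence_erase_sketch(input_sequence, position=None):
    n = len(input_sequence)
    if position is None:
        position = n - 1  # default: erase the last tensor
    if not -n <= position <= n - 1:
        raise IndexError(f"position {position} is outside [-{n}, {n - 1}]")
    if position < 0:
        position += n  # negative values count from the back
    # Build a new sequence; the input is left untouched.
    return input_sequence[:position] + input_sequence[position + 1:]


sequence = ["t0", "t1", "t2"]  # stand-ins for three tensors
without_last = sequence_erase_sketch(sequence)
without_first = sequence_erase_sketch(sequence, position=-3)
```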
This version of the operator has been available since version 11 of the default ONNX operator set.
Outputs a tensor sequence that inserts 'tensor' into 'input_sequence' at 'position'.
'tensor' must have the same data type as 'input_sequence'.
Accepted range for 'position' is in [-n, n], where n is the number of tensors in 'input_sequence'.
Negative value means counting positions from the back.
'position' is optional, by default it inserts 'tensor' to the back of 'input_sequence'.
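A Python list can illustrate the position rules. The sketch below is hypothetical (`sequence_insert_sketch` is not an ONNX function) and uses strings as stand-ins for tensors. It assumes that a negative position is converted by adding n, in the same way as Python's `list.insert`. Note that the accepted range includes n, which places the tensor at the back.

```python
def sequence_insert_sketch(input_sequence, tensor, position=None):
    n = len(input_sequence)
    if position is None:
        position = n  # default: insert at the back
    if not -n <= position <= n:
        raise IndexError(f"position {position} is outside [-{n}, {n}]")
    if position < 0:
        position += n  # assumption: negative values count from the back
    # Build a new sequence; the input is left untouched.
    return input_sequence[:position] + [tensor] + input_sequence[position:]


sequence = ["t0", "t1", "t2"]  # stand-ins for three tensors
at_back = sequence_insert_sketch(sequence, "new")
at_front = sequence_insert_sketch(sequence, "new", position=0)
```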
This version of the operator has been available since version 11 of the default ONNX operator set.
test_cases = {
"at_back": [np.array([10, 11, 12]).astype(np.int64)],
"at_front": [np.array([-2, -1, 0]), np.array([0]).astype(np.int64)],
}
sequence = [
np.array([1, 2, 3, 4]).astype(np.int64),
np.array([5, 6, 7]).astype(np.int64),
np.array([8, 9]).astype(np.int64),
]
for test_name, test_inputs in test_cases.items():
tensor = test_inputs[0].astype(np.int64)
if len(test_inputs) > 1:
node = onnx.helper.make_node(
"SequenceInsert",
inputs=["sequence", "tensor", "position"],
outputs=["output_sequence"],
)
position = test_inputs[1]
inserted = sequence_insert_reference_implementation(
sequence, tensor, position
)
expect(
node,
inputs=[sequence, tensor, position],
outputs=[inserted],
name="test_sequence_insert_" + test_name,
)
else:
node = onnx.helper.make_node(
"SequenceInsert",
inputs=["sequence", "tensor"],
outputs=["output_sequence"],
)
inserted = sequence_insert_reference_implementation(sequence, tensor)
expect(
node,
inputs=[sequence, tensor],
outputs=[inserted],
name="test_sequence_insert_" + test_name,
)
Produces a scalar (tensor of empty shape) containing the number of tensors in 'input_sequence'.
This version of the operator has been available since version 11 of the default ONNX operator set.
Applies a sub-graph to each sample in the input sequence(s).
Inputs can be either tensors or sequences, with the exception of the first input, which must be a sequence. The length of the first input sequence will determine the number of samples in the outputs. Any other sequence inputs should have the same number of samples. The number of inputs and outputs should match those of the subgraph.
For each i-th element in the output, a sample will be extracted from the input sequence(s) at the i-th position and the sub-graph will be applied to it. The outputs will contain the outputs of the sub-graph for each sample, in the same order as in the input.
This operator assumes that processing each sample is independent and could be executed in parallel or in any order. Users cannot expect any specific ordering in which each subgraph is computed.
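The mapping behaviour can be modelled in plain Python, with lists as sequences and a callable as the sub-graph. This is a hypothetical sketch (`sequence_map_sketch` is not an ONNX function). Tensor inputs are passed unchanged to every invocation, while sequence inputs are indexed per sample.

```python
import numpy as np


def sequence_map_sketch(body, *inputs):
    first = inputs[0]
    if not isinstance(first, list):
        raise TypeError("the first input must be a sequence")
    n = len(first)  # number of samples in every output sequence
    for inp in inputs[1:]:
        if isinstance(inp, list) and len(inp) != n:
            raise ValueError("all sequence inputs need the same length")
    outputs = None
    for i in range(n):
        # Take the i-th sample of each sequence; reuse tensors as-is.
        args = [inp[i] if isinstance(inp, list) else inp for inp in inputs]
        results = body(*args)
        if not isinstance(results, tuple):
            results = (results,)
        if outputs is None:
            outputs = [[] for _ in results]  # one output sequence per result
        for out_seq, value in zip(outputs, results):
            out_seq.append(value)
    return outputs if outputs is not None else []


x0 = [np.full(2, float(k), dtype=np.float32) for k in range(3)]  # sequence
x1 = np.array([10.0, 20.0], dtype=np.float32)                    # tensor
(y0,) = sequence_map_sketch(lambda a, b: a + b, x0, x1)
```

The loop runs in index order only for simplicity; the operator itself guarantees no evaluation order.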
This version of the operator has been available since version 17 of the default ONNX operator set.
body = onnx.helper.make_graph(
[onnx.helper.make_node("Add", ["in0", "in1"], ["out0"])],
"seq_map_body",
[
onnx.helper.make_tensor_value_info(
"in0", onnx.TensorProto.FLOAT, ["N"]
),
onnx.helper.make_tensor_value_info(
"in1", onnx.TensorProto.FLOAT, ["N"]
),
],
[onnx.helper.make_tensor_value_info("out0", onnx.TensorProto.FLOAT, ["N"])],
)
node = onnx.helper.make_node(
"SequenceMap", inputs=["x0", "x1"], outputs=["y0"], body=body
)
x0 = [np.random.uniform(0.0, 1.0, 10).astype(np.float32) for k in range(3)]
x1 = np.random.uniform(0.0, 1.0, 10).astype(np.float32)
y0 = [x0[i] + x1 for i in range(3)]
input_type_protos = [
onnx.helper.make_sequence_type_proto(
onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
),
onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"]),
]
output_type_protos = [
onnx.helper.make_sequence_type_proto(
onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
),
]
expect(
node,
inputs=[x0, x1],
outputs=[y0],
input_type_protos=input_type_protos,
output_type_protos=output_type_protos,
name="test_sequence_map_add_1_sequence_1_tensor",
)
body = onnx.helper.make_graph(
[onnx.helper.make_node("Add", ["in0", "in1"], ["out0"])],
"seq_map_body",
[
onnx.helper.make_tensor_value_info(
"in0", onnx.TensorProto.FLOAT, ["N"]
),
onnx.helper.make_tensor_value_info(
"in1", onnx.TensorProto.FLOAT, ["N"]
),
],
[onnx.helper.make_tensor_value_info("out0", onnx.TensorProto.FLOAT, ["N"])],
)
node = onnx.helper.make_node(
"SequenceMap", inputs=["x0", "x1"], outputs=["y0"], body=body
)
N = [np.random.randint(1, 10) for _ in range(3)]
x0 = [np.random.uniform(0.0, 1.0, N[k]).astype(np.float32) for k in range(3)]
x1 = [np.random.uniform(0.0, 1.0, N[k]).astype(np.float32) for k in range(3)]
y0 = [x0[k] + x1[k] for k in range(3)]
input_type_protos = [
onnx.helper.make_sequence_type_proto(
onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
),
onnx.helper.make_sequence_type_proto(
onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
),
]
output_type_protos = [
onnx.helper.make_sequence_type_proto(
onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
),
]
expect(
node,
inputs=[x0, x1],
outputs=[y0],
input_type_protos=input_type_protos,
output_type_protos=output_type_protos,
name="test_sequence_map_add_2_sequences",
)
body = onnx.helper.make_graph(
[onnx.helper.make_node("Shape", ["x"], ["shape"])],
"seq_map_body",
[
onnx.helper.make_tensor_value_info(
"x", onnx.TensorProto.FLOAT, ["H", "W", "C"]
)
],
[onnx.helper.make_tensor_value_info("shape", onnx.TensorProto.INT64, [3])],
)
node = onnx.helper.make_node(
"SequenceMap", inputs=["in_seq"], outputs=["shapes"], body=body
)
shapes = [
np.array([40, 30, 3], dtype=np.int64),
np.array([20, 10, 3], dtype=np.int64),
np.array([10, 5, 3], dtype=np.int64),
]
x0 = [np.zeros(shape, dtype=np.float32) for shape in shapes]
input_type_protos = [
onnx.helper.make_sequence_type_proto(
onnx.helper.make_tensor_type_proto(
onnx.TensorProto.FLOAT, ["H", "W", "C"]
)
),
]
output_type_protos = [
onnx.helper.make_sequence_type_proto(
onnx.helper.make_tensor_type_proto(onnx.TensorProto.INT64, [3])
),
]
expect(
node,
inputs=[x0],
outputs=[shapes],
input_type_protos=input_type_protos,
output_type_protos=output_type_protos,
name="test_sequence_map_extract_shapes",
)
body = onnx.helper.make_graph(
[onnx.helper.make_node("Identity", ["in0"], ["out0"])],
"seq_map_body",
[onnx.helper.make_tensor_value_info("in0", onnx.TensorProto.FLOAT, ["N"])],
[onnx.helper.make_tensor_value_info("out0", onnx.TensorProto.FLOAT, ["M"])],
)
node = onnx.helper.make_node(
"SequenceMap", inputs=["x"], outputs=["y"], body=body
)
x = [np.random.uniform(0.0, 1.0, 10).astype(np.float32) for _ in range(3)]
y = x
input_type_protos = [
onnx.helper.make_sequence_type_proto(
onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
),
]
output_type_protos = [
onnx.helper.make_sequence_type_proto(
onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
),
]
expect(
node,
inputs=[x],
outputs=[y],
input_type_protos=input_type_protos,
output_type_protos=output_type_protos,
name="test_sequence_map_identity_1_sequence",
)
body = onnx.helper.make_graph(
[
onnx.helper.make_node("Identity", ["in0"], ["out0"]),
onnx.helper.make_node("Identity", ["in1"], ["out1"]),
],
"seq_map_body",
[
onnx.helper.make_tensor_value_info(
"in0", onnx.TensorProto.FLOAT, ["N"]
),
onnx.helper.make_tensor_value_info(
"in1", onnx.TensorProto.FLOAT, ["M"]
),
],
[
onnx.helper.make_tensor_value_info(
"out0", onnx.TensorProto.FLOAT, ["N"]
),
onnx.helper.make_tensor_value_info(
"out1", onnx.TensorProto.FLOAT, ["M"]
),
],
)
node = onnx.helper.make_node(
"SequenceMap", inputs=["x0", "x1"], outputs=["y0", "y1"], body=body
)
x0 = [
np.random.uniform(0.0, 1.0, np.random.randint(1, 10)).astype(np.float32)
for _ in range(3)
]
x1 = np.random.uniform(0.0, 1.0, np.random.randint(1, 10)).astype(np.float32)
y0 = x0
y1 = [x1 for _ in range(3)]
input_type_protos = [
onnx.helper.make_sequence_type_proto(
onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
),
onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["M"]),
]
output_type_protos = [
onnx.helper.make_sequence_type_proto(
onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
),
onnx.helper.make_sequence_type_proto(
onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["M"])
),
]
expect(
node,
inputs=[x0, x1],
outputs=[y0, y1],
input_type_protos=input_type_protos,
output_type_protos=output_type_protos,
name="test_sequence_map_identity_1_sequence_1_tensor",
)
body = onnx.helper.make_graph(
[
onnx.helper.make_node("Identity", ["in0"], ["out0"]),
onnx.helper.make_node("Identity", ["in1"], ["out1"]),
],
"seq_map_body",
[
onnx.helper.make_tensor_value_info(
"in0", onnx.TensorProto.FLOAT, ["N"]
),
onnx.helper.make_tensor_value_info(
"in1", onnx.TensorProto.FLOAT, ["M"]
),
],
[
onnx.helper.make_tensor_value_info(
"out0", onnx.TensorProto.FLOAT, ["N"]
),
onnx.helper.make_tensor_value_info(
"out1", onnx.TensorProto.FLOAT, ["M"]
),
],
)
node = onnx.helper.make_node(
"SequenceMap", inputs=["x0", "x1"], outputs=["y0", "y1"], body=body
)
x0 = [
np.random.uniform(0.0, 1.0, np.random.randint(1, 10)).astype(np.float32)
for _ in range(3)
]
x1 = [
np.random.uniform(0.0, 1.0, np.random.randint(1, 10)).astype(np.float32)
for _ in range(3)
]
y0 = x0
y1 = x1
input_type_protos = [
onnx.helper.make_sequence_type_proto(
onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
),
onnx.helper.make_sequence_type_proto(
onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["M"])
),
]
output_type_protos = [
onnx.helper.make_sequence_type_proto(
onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
),
onnx.helper.make_sequence_type_proto(
onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["M"])
),
]
expect(
node,
inputs=[x0, x1],
outputs=[y0, y1],
input_type_protos=input_type_protos,
output_type_protos=output_type_protos,
name="test_sequence_map_identity_2_sequences",
)
Takes a tensor as input and outputs a 1-D int64 tensor containing the shape of the input tensor. Optional attributes start and end can be used to compute a slice of the input tensor's shape. If the start axis is omitted, the slice starts from axis 0. The end axis, if specified, is exclusive (and the returned value will not include the size of that axis). If the end axis is omitted, the axes up to the last one will be included. Negative axes indicate counting back from the last axis. Note that axes will be clamped to the range [0, r], where r is the rank of the input tensor, if they are out of range (after adding r in the case of a negative axis). Thus, specifying any end value > r is equivalent to specifying an end value of r, and specifying any start value < -r is equivalent to specifying a start value of 0. If start > end, the result will be an empty shape.
Examples:
Input tensor with shape: [2, 3, 4]
No attributes specified.
Output: [2, 3, 4]
Input tensor with shape: [2, 3, 4]
start: -1
Output: [4]
Input tensor with shape: [2, 3, 4]
end: -1
Output: [2, 3]
Input tensor with shape: [2, 3, 4]
start: 1
end: 2
Output: [3]
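The start/end handling described above (add r to negative axes, then clamp to [0, r]) can be sketched in NumPy. The function name `shape_sketch` is hypothetical. It reproduces the four examples as well as the clamping and start > end cases.

```python
import numpy as np


def shape_sketch(x, start=0, end=None):
    r = x.ndim
    if end is None:
        end = r  # omitted end: include axes up to the last one
    # Negative axes count back from the last axis.
    if start < 0:
        start += r
    if end < 0:
        end += r
    # Clamp both bounds to the range [0, r].
    start = min(max(start, 0), r)
    end = min(max(end, 0), r)
    # A start beyond end yields an empty slice, hence an empty shape.
    return np.array(x.shape[start:end], dtype=np.int64)


x = np.zeros((2, 3, 4), dtype=np.float32)
full = shape_sketch(x)
last = shape_sketch(x, start=-1)
head = shape_sketch(x, end=-1)
middle = shape_sketch(x, start=1, end=2)
clipped = shape_sketch(x, start=-10, end=10)
empty = shape_sketch(x, start=2, end=1)
```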
This version of the operator has been available since version 25 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Shape-1">1</a>, <a href="Changelog.md#Shape-13">13</a>, <a href="Changelog.md#Shape-15">15</a>, <a href="Changelog.md#Shape-19">19</a>, <a href="Changelog.md#Shape-21">21</a>, <a href="Changelog.md#Shape-23">23</a>, <a href="Changelog.md#Shape-24">24</a>
x = np.array(
[
[1, 2, 3],
[4, 5, 6],
]
).astype(np.float32)
test_shape("_example", x) # preserve names of original test cases
x = np.random.randn(3, 4, 5).astype(np.float32)
test_shape("", x) # preserve names of original test cases
test_shape("_start_1", x, start=1)
test_shape("_end_1", x, end=1)
test_shape("_start_negative_1", x, start=-1)
test_shape("_end_negative_1", x, end=-1)
test_shape("_start_1_end_negative_1", x, start=1, end=-1)
test_shape("_start_1_end_2", x, start=1, end=2)
test_shape("_clip_start", x, start=-10)
test_shape("_clip_end", x, end=10)
test_shape("_start_greater_than_end", x, start=2, end=1)
Shrink takes one input data (Tensor<numeric>) and produces one Tensor output, having the same datatype and shape as the input. It has two attributes, lambd and bias. The formula of this operator is: if x < -lambd, y = x + bias; if x > lambd, y = x - bias; otherwise, y = 0.
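The three-way formula maps onto nested `np.where` calls. The sketch below uses a hypothetical name, `shrink_sketch`, and assumes that bias is 0 when it is not given. With bias = 0 the function performs hard shrinkage, and with bias = lambd it performs soft shrinkage.

```python
import numpy as np


def shrink_sketch(x, lambd, bias=0.0):
    # Assumption: bias is 0 when omitted.
    y = np.where(
        x < -lambd,
        x + bias,                                       # x < -lambd
        np.where(x > lambd, x - bias, np.zeros_like(x)),  # x > lambd, else 0
    )
    return y.astype(x.dtype)  # output keeps the input datatype


x = np.arange(-2.0, 2.1, dtype=np.float32)  # [-2, -1, 0, 1, 2]
hard = shrink_sketch(x, lambd=1.5)
soft = shrink_sketch(x, lambd=1.5, bias=1.5)
```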
This version of the operator has been available since version 9 of the default ONNX operator set.
node = onnx.helper.make_node(
"Shrink",
inputs=["x"],
outputs=["y"],
lambd=1.5,
)
X = np.arange(-2.0, 2.1, dtype=np.float32)
Y = np.array([-2, 0, 0, 0, 2], dtype=np.float32)
expect(node, inputs=[X], outputs=[Y], name="test_shrink_hard")
node = onnx.helper.make_node(
"Shrink",
inputs=["x"],
outputs=["y"],
lambd=1.5,
bias=1.5,
)
X = np.arange(-2.0, 2.1, dtype=np.float32)
Y = np.array([-0.5, 0, 0, 0, 0.5], dtype=np.float32)
expect(node, inputs=[X], outputs=[Y], name="test_shrink_soft")
Sigmoid takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the sigmoid function, y = 1 / (1 + exp(-x)), is applied to the tensor elementwise.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Sigmoid-1">1</a>, <a href="Changelog.md#Sigmoid-6">6</a>
node = onnx.helper.make_node(
"Sigmoid",
inputs=["x"],
outputs=["y"],
)
x = np.array([-1, 0, 1]).astype(np.float32)
y = 1.0 / (
1.0 + np.exp(np.negative(x))
) # expected output [0.26894143, 0.5, 0.7310586]
expect(node, inputs=[x], outputs=[y], name="test_sigmoid_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = 1.0 / (1.0 + np.exp(np.negative(x)))
expect(node, inputs=[x], outputs=[y], name="test_sigmoid")
Calculates the sign of the given input tensor element-wise. If input > 0, output 1; if input < 0, output -1; if input == 0, output 0.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Sign-9">9</a>
node = onnx.helper.make_node(
"Sign",
inputs=["x"],
outputs=["y"],
)
x = np.array(range(-5, 6)).astype(np.float32)
y = np.sign(x)
expect(node, inputs=[x], outputs=[y], name="test_sign")
Calculates the sine of the given input tensor, element-wise.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Sin-7">7</a>
node = onnx.helper.make_node(
"Sin",
inputs=["x"],
outputs=["y"],
)
x = np.array([-1, 0, 1]).astype(np.float32)
y = np.sin(x)
expect(node, inputs=[x], outputs=[y], name="test_sin_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.sin(x)
expect(node, inputs=[x], outputs=[y], name="test_sin")
Calculates the hyperbolic sine of the given input tensor element-wise.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Sinh-9">9</a>
node = onnx.helper.make_node(
"Sinh",
inputs=["x"],
outputs=["y"],
)
x = np.array([-1, 0, 1]).astype(np.float32)
y = np.sinh(x) # expected output [-1.17520118, 0., 1.17520118]
expect(node, inputs=[x], outputs=[y], name="test_sinh_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.sinh(x)
expect(node, inputs=[x], outputs=[y], name="test_sinh")
Takes a tensor as input and outputs an int64 scalar equal to the total number of elements of the input tensor.
This version of the operator has been available since version 25 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Size-1">1</a>, <a href="Changelog.md#Size-13">13</a>, <a href="Changelog.md#Size-19">19</a>, <a href="Changelog.md#Size-21">21</a>, <a href="Changelog.md#Size-23">23</a>, <a href="Changelog.md#Size-24">24</a>
node = onnx.helper.make_node(
"Size",
inputs=["x"],
outputs=["y"],
)
x = np.array(
[
[1, 2, 3],
[4, 5, 6],
]
).astype(np.float32)
y = np.array(6).astype(np.int64)
expect(node, inputs=[x], outputs=[y], name="test_size_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.array(x.size).astype(np.int64)
expect(node, inputs=[x], outputs=[y], name="test_size")
Produces a slice of the input tensor along multiple axes. Similar to numpy: https://numpy.org/doc/stable/user/basics.indexing.html?highlight=slice#slicing-and-striding
Slice uses the starts, ends, axes and steps inputs to select a sub-tensor
of its input data tensor.
An effective starts[i], ends[i], and steps[i] must be computed for each i
in [0, ... r-1] where r = rank(input) as follows:
If axes are omitted, they are set to [0, ..., r-1].
If steps are omitted, they are set to [1, ..., 1] of length len(starts).
The effective values are initialized as starts[i] = 0, ends[i] = dims[i], and
steps[i] = 1, where dims are the dimensions of input.
All negative elements of axes are made non-negative by adding r to them, where
r = rank(input).
All negative values in starts[i] and ends[i] have dims[axes[i]] added to them,
where dims are the dimensions of input. The effective starts[axes[i]] is then the
adjusted starts[i], clamped into the range [0, dims[axes[i]]] for positive stepping
and [0, dims[axes[i]]-1] for negative stepping.
The clamping for the adjusted ends[i] depends on the sign of steps[i] and must
accommodate copying 0 through dims[axes[i]] elements, so for positive stepping
ends[axes[i]] is clamped to [0, dims[axes[i]]], while for negative stepping it
is clamped to [-1, dims[axes[i]]-1].
Finally, steps[axes[i]] = steps[i].
For slicing to the end of a dimension with unknown size, it is recommended to pass
in INT_MAX when slicing forward and INT_MIN when slicing backward.
Example 1:
data = [
[1, 2, 3, 4],
[5, 6, 7, 8],
]
axes = [0, 1]
starts = [1, 0]
ends = [2, 3]
steps = [1, 2]
result = [
[5, 7],
]
Example 2:
data = [
[1, 2, 3, 4],
[5, 6, 7, 8],
]
starts = [0, 1]
ends = [-1, 1000]
result = [
[2, 3, 4],
]
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Slice-1">1</a>, <a href="Changelog.md#Slice-10">10</a>, <a href="Changelog.md#Slice-11">11</a>
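The normalization rules and the two examples in the description can be sketched in NumPy. This is a minimal illustration rather than the ONNX reference implementation: the function name `slice_sketch` is hypothetical, and it assumes that `starts`, `ends`, `axes`, and `steps` are plain integer sequences and that every step is non-zero.

```python
import numpy as np


def slice_sketch(data, starts, ends, axes=None, steps=None):
    """Hypothetical helper: normalize Slice inputs, then index with NumPy."""
    r = data.ndim
    # Defaults for the optional inputs, as described in the text.
    if axes is None:
        axes = range(r)
    if steps is None:
        steps = [1] * len(starts)

    # Effective values start out as "take the whole axis".
    index = [slice(None)] * r
    for start, end, axis, step in zip(starts, ends, axes, steps):
        start, end, axis, step = int(start), int(end), int(axis), int(step)
        if step == 0:
            raise ValueError("steps must be non-zero")
        if axis < 0:
            axis += r
        dim = data.shape[axis]
        # Negative starts/ends count from the end of the axis.
        if start < 0:
            start += dim
        if end < 0:
            end += dim
        if step > 0:
            start = min(max(start, 0), dim)
            end = min(max(end, 0), dim)
            index[axis] = slice(start, end, step)
        else:
            start = min(max(start, 0), dim - 1)
            end = min(max(end, -1), dim - 1)
            # An effective end of -1 means "run past element 0". Python
            # spells that as None, because -1 would mean the last element.
            index[axis] = slice(start, None if end < 0 else end, step)
    return data[tuple(index)]


data = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
example_1 = slice_sketch(data, starts=[1, 0], ends=[2, 3], axes=[0, 1], steps=[1, 2])
example_2 = slice_sketch(data, starts=[0, 1], ends=[-1, 1000])
print(example_1.tolist())  # [[5, 7]], as in Example 1
print(example_2.tolist())  # [[2, 3, 4]], as in Example 2
```

Python integers do not overflow, so passing INT_MAX or INT_MIN as an end value simply clamps to the axis boundary in this sketch.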
node = onnx.helper.make_node(
"Slice",
inputs=["x", "starts", "ends", "axes", "steps"],
outputs=["y"],
)
x = np.random.randn(20, 10, 5).astype(np.float32)
y = x[0:3, 0:10]
starts = np.array([0, 0], dtype=np.int64)
ends = np.array([3, 10], dtype=np.int64)
axes = np.array([0, 1], dtype=np.int64)
steps = np.array([1, 1], dtype=np.int64)
expect(
node, inputs=[x, starts, ends, axes, steps], outputs=[y], name="test_slice"
)
node = onnx.helper.make_node(
"Slice",
inputs=["x", "starts", "ends"],
outputs=["y"],
)
x = np.random.randn(20, 10, 5).astype(np.float32)
starts = np.array([0, 0, 3], dtype=np.int64)
ends = np.array([20, 10, 4], dtype=np.int64)
y = x[:, :, 3:4]
expect(
node, inputs=[x, starts, ends], outputs=[y], name="test_slice_default_axes"
)
node = onnx.helper.make_node(
"Slice",
inputs=["x", "starts", "ends", "axes"],
outputs=["y"],
)
x = np.random.randn(20, 10, 5).astype(np.float32)
starts = np.array([0, 0, 3], dtype=np.int64)
ends = np.array([20, 10, 4], dtype=np.int64)
axes = np.array([0, 1, 2], dtype=np.int64)
y = x[:, :, 3:4]
expect(
node,
inputs=[x, starts, ends, axes],
outputs=[y],
name="test_slice_default_steps",
)
node = onnx.helper.make_node(
"Slice",
inputs=["x", "starts", "ends", "axes", "steps"],
outputs=["y"],
)
x = np.random.randn(20, 10, 5).astype(np.float32)
starts = np.array([1], dtype=np.int64)
ends = np.array([1000], dtype=np.int64)
axes = np.array([1], dtype=np.int64)
steps = np.array([1], dtype=np.int64)
y = x[:, 1:1000]
expect(
node,
inputs=[x, starts, ends, axes, steps],
outputs=[y],
name="test_slice_end_out_of_bounds",
)
node = onnx.helper.make_node(
"Slice",
inputs=["x", "starts", "ends", "axes", "steps"],
outputs=["y"],
)
x = np.random.randn(20, 10, 5).astype(np.float32)
starts = np.array([0], dtype=np.int64)
ends = np.array([-1], dtype=np.int64)
axes = np.array([1], dtype=np.int64)
steps = np.array([1], dtype=np.int64)
y = x[:, 0:-1]
expect(
node,
inputs=[x, starts, ends, axes, steps],
outputs=[y],
name="test_slice_neg",
)
node = onnx.helper.make_node(
"Slice",
inputs=["x", "starts", "ends", "axes", "steps"],
outputs=["y"],
)
x = np.random.randn(20, 10, 5).astype(np.float32)
starts = np.array([20, 10, 4], dtype=np.int64)
ends = np.array([0, 0, 1], dtype=np.int64)
axes = np.array([0, 1, 2], dtype=np.int64)
steps = np.array([-1, -3, -2]).astype(np.int64)
y = x[20:0:-1, 10:0:-3, 4:1:-2]
expect(
node,
inputs=[x, starts, ends, axes, steps],
outputs=[y],
name="test_slice_neg_steps",
)
node = onnx.helper.make_node(
"Slice",
inputs=["x", "starts", "ends", "axes"],
outputs=["y"],
)
x = np.random.randn(20, 10, 5).astype(np.float32)
starts = np.array([0, 0, 3], dtype=np.int64)
ends = np.array([20, 10, 4], dtype=np.int64)
axes = np.array([0, -2, -1], dtype=np.int64)
y = x[:, :, 3:4]
expect(
node,
inputs=[x, starts, ends, axes],
outputs=[y],
name="test_slice_negative_axes",
)
node = onnx.helper.make_node(
"Slice",
inputs=["x", "starts", "ends", "axes", "steps"],
outputs=["y"],
)
x = np.random.randn(20, 10, 5).astype(np.float32)
starts = np.array([1000], dtype=np.int64)
ends = np.array([1000], dtype=np.int64)
axes = np.array([1], dtype=np.int64)
steps = np.array([1], dtype=np.int64)
y = x[:, 1000:1000]
expect(
node,
inputs=[x, starts, ends, axes, steps],
outputs=[y],
name="test_slice_start_out_of_bounds",
)
The operator computes the normalized exponential values for the given input:
Softmax(input, axis) = Exp(input) / ReduceSum(Exp(input), axis=axis, keepdims=1)
The "axis" attribute indicates the dimension along which Softmax will be performed. The output tensor has the same shape and contains the Softmax values of the corresponding input.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Softmax-1">1</a>, <a href="Changelog.md#Softmax-11">11</a>
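The test snippets for this operator call a `softmax` helper that is not defined in this section. A minimal NumPy sketch of such a helper, under the assumption that it follows the formula above, might look like the following. Subtracting the per-slice maximum before exponentiating is a standard stabilization step added here; it does not change the result mathematically.

```python
import numpy as np


def softmax(x, axis=-1):
    """Sketch of Softmax(input, axis); assumed, not the library's own helper."""
    # Shift by the maximum along `axis` so that exp() never overflows.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp_x = np.exp(shifted)
    # keepdims=True mirrors ReduceSum(..., keepdims=1) in the formula.
    return exp_x / np.sum(exp_x, axis=axis, keepdims=True)


x = np.array([[0, 1, 2, 3], [10000, 10001, 10002, 10003]]).astype(np.float32)
print(softmax(x))
```

Without the shift, `np.exp(10000.0)` overflows to infinity in float32 and the second row would become NaN. With it, both rows produce the same probabilities because they differ only by a constant offset.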
node = onnx.helper.make_node(
"Softmax",
inputs=["x"],
outputs=["y"],
)
x = np.array([[-1, 0, 1]]).astype(np.float32)
# expected output [[0.09003058, 0.24472848, 0.66524094]]
y = softmax(x, axis=1)
expect(node, inputs=[x], outputs=[y], name="test_softmax_example")
x = np.array([[0, 1, 2, 3], [10000, 10001, 10002, 10003]]).astype(np.float32)
# expected output
# [[0.032058604 0.08714432 0.23688284 0.6439143 ]
# [0.032058604 0.08714432 0.23688284 0.6439143 ]]
y = softmax(x)
node = onnx.helper.make_node(
"Softmax",
inputs=["x"],
outputs=["y"],
)
expect(node, inputs=[x], outputs=[y], name="test_softmax_large_number")
x = np.abs(np.random.randn(3, 4, 5).astype(np.float32))
node = onnx.helper.make_node(
"Softmax",
inputs=["x"],
outputs=["y"],
axis=0,
)
y = softmax(x, axis=0)
expect(node, inputs=[x], outputs=[y], name="test_softmax_axis_0")
node = onnx.helper.make_node(
"Softmax",
inputs=["x"],
outputs=["y"],
axis=1,
)
y = softmax(x, axis=1)
expect(node, inputs=[x], outputs=[y], name="test_softmax_axis_1")
node = onnx.helper.make_node(
"Softmax",
inputs=["x"],
outputs=["y"],
axis=2,
)
y = softmax(x, axis=2)
expect(node, inputs=[x], outputs=[y], name="test_softmax_axis_2")
node = onnx.helper.make_node(
"Softmax",
inputs=["x"],
outputs=["y"],
axis=-1,
)
y = softmax(x, axis=-1)
expect(node, inputs=[x], outputs=[y], name="test_softmax_negative_axis")
# default axis is -1
node = onnx.helper.make_node(
"Softmax",
inputs=["x"],
outputs=["y"],
)
expect(node, inputs=[x], outputs=[y], name="test_softmax_default_axis")
Loss function that measures the softmax cross entropy between 'scores' and 'labels'. This operator first computes a loss tensor whose shape is identical to the labels input. If the input is 2-D with shape (N, C), the loss tensor may be an N-element vector L = (l_1, l_2, ..., l_N). If the input is an N-D tensor with shape (N, C, D1, D2, ..., Dk), the loss tensor L may have (N, D1, D2, ..., Dk) as its shape and L[i][j_1][j_2]...[j_k] denotes a scalar element in L. After L is available, this operator can optionally apply a reduction.
The loss for one sample, l_i, can be calculated as follows:
l[i][d1][d2]...[dk] = -y[i][c][d1][d2]...[dk], where i is the sample index and c is the class index defined below.
or
l[i][d1][d2]...[dk] = -y[i][c][d1][d2]..[dk] * weights[c], if 'weights' is provided.
The loss is zero when the label value equals ignore_index:
l[i][d1][d2]...[dk] = 0, when labels[i][d1][d2]...[dk] = ignore_index
where:
p = Softmax(scores)
y = Log(p)
c = labels[i][d1][d2]...[dk]
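Finally, L is optionally reduced according to the 'reduction' attribute:
If reduction = 'none', the output is L with shape (N, D1, D2, ..., Dk).
If reduction = 'sum', the output is a scalar: Sum(L).
If reduction = 'mean', the output is a scalar: ReduceMean(L), or if 'weights' is provided:
ReduceSum(L) / ReduceSum(W),
where tensor W is of shape (N, D1, D2, ..., Dk) and W[i][d1][d2]...[dk] = weights[labels[i][d1][d2]...[dk]].
This version of the operator has been available since version 13 of the default ONNX operator set.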
Other versions of this operator: <a href="Changelog.md#SoftmaxCrossEntropyLoss-12">12</a>
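The formulas in the description can be sketched in NumPy as follows. The name `softmax_cross_entropy_sketch` is hypothetical and this is not the `softmaxcrossentropy` helper used by the test snippets. One assumption is labeled in the code: positions whose label equals `ignore_index` are given zero weight, so they also drop out of the denominator of the 'mean' reduction.

```python
import numpy as np


def softmax_cross_entropy_sketch(
    scores, labels, weight=None, reduction="mean", ignore_index=None
):
    """Hypothetical sketch of the loss for scores of shape (N, C, D1, ..., Dk)."""
    # y = Log(Softmax(scores)) over the class axis (axis 1), computed stably.
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Mask of ignored positions; their labels are replaced by a valid
    # class (0) only so that the gather below stays in bounds.
    if ignore_index is None:
        ignored = np.zeros(labels.shape, dtype=bool)
    else:
        ignored = labels == ignore_index
    safe_labels = np.where(ignored, 0, labels)

    # Pick y[i][c][d1]...[dk] with c = labels[i][d1]...[dk].
    picked = np.take_along_axis(
        log_prob, np.expand_dims(safe_labels, 1), axis=1
    ).squeeze(1)

    # W[i][d1]...[dk] = weight[c]; all ones when no weight is given.
    # Assumption: ignored positions get weight 0.
    if weight is None:
        class_w = np.ones(scores.shape[1], dtype=scores.dtype)
    else:
        class_w = np.asarray(weight)
    w = np.where(ignored, 0, class_w[safe_labels]).astype(scores.dtype)

    loss = -picked * w
    if reduction == "none":
        return loss
    if reduction == "sum":
        return loss.sum()
    if reduction == "mean":
        return loss.sum() / w.sum()
    raise ValueError(f"unknown reduction: {reduction!r}")


scores = np.array([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]], dtype=np.float32)
labels = np.array([2, 0], dtype=np.int64)
print(softmax_cross_entropy_sketch(scores, labels, reduction="none"))
```

The sketch does not guard against the case where every position is ignored, in which the 'mean' reduction would divide by zero.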
reduction = "mean"
ignore_index = np.int64(-1)
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y", "w"],
outputs=["z"],
reduction=reduction,
ignore_index=ignore_index,
)
N, C, dim1 = 3, 5, 6
np.random.seed(0)
x = np.random.rand(N, C, dim1).astype(np.float32)
labels = np.random.randint(0, high=C, size=(N, dim1)).astype(np.int64)
labels[0][0] = -1
weight = np.random.rand(C).astype(np.float32)
sce = softmaxcrossentropy(
x, labels, weight=weight, reduction=reduction, ignore_index=ignore_index
)
expect(
node,
inputs=[x, labels, weight],
outputs=[sce],
name="test_sce_NCd1_mean_weight_negative_ii",
)
reduction = "mean"
ignore_index = np.int64(-1)
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y", "w"],
outputs=["z", "log_prob"],
reduction=reduction,
ignore_index=ignore_index,
)
N, C, dim1 = 3, 5, 6
np.random.seed(0)
x = np.random.rand(N, C, dim1).astype(np.float32)
labels = np.random.randint(0, high=C, size=(N, dim1)).astype(np.int64)
labels[0][0] = -1
weight = np.random.rand(C).astype(np.float32)
loss, log_prob = softmaxcrossentropy(
x,
labels,
weight=weight,
reduction=reduction,
ignore_index=ignore_index,
get_log_prob=True,
)
expect(
node,
inputs=[x, labels, weight],
outputs=[loss, log_prob],
name="test_sce_NCd1_mean_weight_negative_ii_log_prob",
)
reduction = "none"
ignore_index = np.int64(-5)
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y"],
outputs=["z"],
reduction=reduction,
ignore_index=ignore_index,
)
N, C, dim1, dim2, dim3 = 3, 5, 6, 6, 5
np.random.seed(0)
x = np.random.rand(N, C, dim1, dim2, dim3).astype(np.float32)
labels = np.random.randint(0, high=C, size=(N, dim1, dim2, dim3)).astype(
np.int64
)
labels[0][0][0][0] = -5
sce = softmaxcrossentropy(
x, labels, reduction=reduction, ignore_index=ignore_index
)
expect(
node,
inputs=[x, labels],
outputs=[sce],
name="test_sce_NCd1d2d3_none_no_weight_negative_ii",
)
reduction = "none"
ignore_index = np.int64(-5)
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y"],
outputs=["z", "log_prob"],
reduction=reduction,
ignore_index=ignore_index,
)
N, C, dim1, dim2, dim3 = 3, 5, 6, 6, 5
np.random.seed(0)
x = np.random.rand(N, C, dim1, dim2, dim3).astype(np.float32)
labels = np.random.randint(0, high=C, size=(N, dim1, dim2, dim3)).astype(
np.int64
)
labels[0][0][0][0] = -5
loss, log_prob = softmaxcrossentropy(
x, labels, reduction=reduction, ignore_index=ignore_index, get_log_prob=True
)
expect(
node,
inputs=[x, labels],
outputs=[loss, log_prob],
name="test_sce_NCd1d2d3_none_no_weight_negative_ii_log_prob",
)
reduction = "sum"
ignore_index = np.int64(10)
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y", "w"],
outputs=["z"],
reduction=reduction,
ignore_index=ignore_index,
)
N, C = 3, 5
np.random.seed(0)
x = np.random.rand(N, C).astype(np.float32)
labels = np.random.randint(0, high=C, size=(N)).astype(np.int64)
labels[0] = 10
weight = np.random.rand(C).astype(np.float32)
sce = softmaxcrossentropy(
x, labels, weight=weight, reduction=reduction, ignore_index=ignore_index
)
expect(
node,
inputs=[x, labels, weight],
outputs=[sce],
name="test_sce_NCd1d2d3_sum_weight_high_ii",
)
reduction = "sum"
ignore_index = np.int64(10)
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y", "w"],
outputs=["z", "log_prob"],
reduction=reduction,
ignore_index=ignore_index,
)
N, C = 3, 5
np.random.seed(0)
x = np.random.rand(N, C).astype(np.float32)
labels = np.random.randint(0, high=C, size=(N)).astype(np.int64)
labels[0] = 10
weight = np.random.rand(C).astype(np.float32)
loss, log_prob = softmaxcrossentropy(
x,
labels,
weight=weight,
reduction=reduction,
ignore_index=ignore_index,
get_log_prob=True,
)
expect(
node,
inputs=[x, labels, weight],
outputs=[loss, log_prob],
name="test_sce_NCd1d2d3_sum_weight_high_ii_log_prob",
)
reduction = "mean"
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y", "w"],
outputs=["z"],
reduction=reduction,
)
N, C, dim1, dim2, dim3, dim4, dim5 = 3, 5, 6, 6, 5, 3, 4
np.random.seed(0)
x = np.random.rand(N, C, dim1, dim2, dim3, dim4, dim5).astype(np.float32)
labels = np.random.randint(
0, high=C, size=(N, dim1, dim2, dim3, dim4, dim5)
).astype(np.int64)
weight = np.random.rand(C).astype(np.float32)
sce = softmaxcrossentropy(x, labels, weight=weight, reduction=reduction)
expect(
node,
inputs=[x, labels, weight],
outputs=[sce],
name="test_sce_NCd1d2d3d4d5_mean_weight",
)
reduction = "mean"
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y", "w"],
outputs=["z", "log_prob"],
reduction=reduction,
)
N, C, dim1, dim2, dim3, dim4, dim5 = 3, 5, 6, 6, 5, 3, 4
np.random.seed(0)
x = np.random.rand(N, C, dim1, dim2, dim3, dim4, dim5).astype(np.float32)
labels = np.random.randint(
0, high=C, size=(N, dim1, dim2, dim3, dim4, dim5)
).astype(np.int64)
weight = np.random.rand(C).astype(np.float32)
loss, log_prob = softmaxcrossentropy(
x, labels, weight=weight, reduction=reduction, get_log_prob=True
)
expect(
node,
inputs=[x, labels, weight],
outputs=[loss, log_prob],
name="test_sce_NCd1d2d3d4d5_mean_weight_log_prob",
)
reduction = "none"
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y"],
outputs=["z"],
reduction=reduction,
)
N, C, dim1, dim2, dim3, dim4, dim5 = 3, 5, 6, 6, 5, 3, 4
np.random.seed(0)
x = np.random.rand(N, C, dim1, dim2, dim3, dim4, dim5).astype(np.float32)
labels = np.random.randint(
0, high=C, size=(N, dim1, dim2, dim3, dim4, dim5)
).astype(np.int64)
sce = softmaxcrossentropy(x, labels, reduction=reduction)
expect(
node,
inputs=[x, labels],
outputs=[sce],
name="test_sce_NCd1d2d3d4d5_none_no_weight",
)
reduction = "none"
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y"],
outputs=["z", "log_prob"],
reduction=reduction,
)
N, C, dim1, dim2, dim3, dim4, dim5 = 3, 5, 6, 6, 5, 3, 4
np.random.seed(0)
x = np.random.rand(N, C, dim1, dim2, dim3, dim4, dim5).astype(np.float32)
labels = np.random.randint(
0, high=C, size=(N, dim1, dim2, dim3, dim4, dim5)
).astype(np.int64)
loss, log_prob = softmaxcrossentropy(
x, labels, reduction=reduction, get_log_prob=True
)
expect(
node,
inputs=[x, labels],
outputs=[loss, log_prob],
name="test_sce_NCd1d2d3d4d5_none_no_weight_log_prob",
)
# Define operator attributes.
reduction = "mean"
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y"],
outputs=["z"],
reduction=reduction,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, labels)
# Check results
expect(node, inputs=[x, labels], outputs=[sce], name="test_sce_mean")
# Define operator attributes.
reduction = "mean"
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y"],
outputs=["z"],
reduction=reduction,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2).astype(np.float32)
y = np.random.randint(0, high=5, size=(3, 2)).astype(np.int64)
# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, y)
# Check results
expect(node, inputs=[x, y], outputs=[sce], name="test_sce_mean_3d")
# Define operator attributes.
reduction = "mean"
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y"],
outputs=["z", "log_prob"],
reduction=reduction,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2).astype(np.float32)
y = np.random.randint(0, high=5, size=(3, 2)).astype(np.int64)
# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(x, y, get_log_prob=True)
# Check results
expect(
node,
inputs=[x, y],
outputs=[loss, log_prob],
name="test_sce_mean_3d_log_prob",
)
# Define operator attributes.
reduction = "mean"
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y"],
outputs=["z", "log_prob"],
reduction=reduction,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(x, labels, get_log_prob=True)
# Check results
expect(
node,
inputs=[x, labels],
outputs=[loss, log_prob],
name="test_sce_mean_log_prob",
)
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(2)
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y"],
outputs=["z"],
reduction=reduction,
ignore_index=ignore_index,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
labels[0] = np.int64(2)
# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, labels, ignore_index=ignore_index)
# Check results
expect(
node, inputs=[x, labels], outputs=[sce], name="test_sce_mean_no_weight_ii"
)
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(2)
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y"],
outputs=["z"],
reduction=reduction,
ignore_index=ignore_index,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3, 2)).astype(np.int64)
labels[0][0] = np.int64(2)
# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, labels, ignore_index=ignore_index)
# Check results
expect(
node,
inputs=[x, labels],
outputs=[sce],
name="test_sce_mean_no_weight_ii_3d",
)
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(2)
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y"],
outputs=["z", "log_prob"],
reduction=reduction,
ignore_index=ignore_index,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3, 2)).astype(np.int64)
labels[0][0] = np.int64(2)
# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
x, labels, ignore_index=ignore_index, get_log_prob=True
)
# Check results
expect(
node,
inputs=[x, labels],
outputs=[loss, log_prob],
name="test_sce_mean_no_weight_ii_3d_log_prob",
)
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(2)
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y"],
outputs=["z"],
reduction=reduction,
ignore_index=ignore_index,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2, 7).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3, 2, 7)).astype(np.int64)
labels[0][0][0] = np.int64(2)
# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(
x, labels, reduction=reduction, ignore_index=ignore_index
)
# Check results
expect(
node,
inputs=[x, labels],
outputs=[sce],
name="test_sce_mean_no_weight_ii_4d",
)
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(2)
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y"],
outputs=["z", "log_prob"],
reduction=reduction,
ignore_index=ignore_index,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2, 7).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3, 2, 7)).astype(np.int64)
labels[0][0][0] = np.int64(2)
# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
x, labels, reduction=reduction, ignore_index=ignore_index, get_log_prob=True
)
# Check results
expect(
node,
inputs=[x, labels],
outputs=[loss, log_prob],
name="test_sce_mean_no_weight_ii_4d_log_prob",
)
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(2)
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y"],
outputs=["z", "log_prob"],
reduction=reduction,
ignore_index=ignore_index,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
labels[0] = np.int64(2)
# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
x, labels, ignore_index=ignore_index, get_log_prob=True
)
# Check results
expect(
node,
inputs=[x, labels],
outputs=[loss, log_prob],
name="test_sce_mean_no_weight_ii_log_prob",
)
# Define operator attributes.
reduction = "mean"
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y", "w"],
outputs=["z"],
reduction=reduction,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
weights = np.array([0.9, 0.7, 0.8, 0.9, 0.9], dtype=np.float32)
# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, labels, weight=weights)
# Check results
expect(
node,
inputs=[x, labels, weights],
outputs=[sce],
name="test_sce_mean_weight",
)
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(0)
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y", "w"],
outputs=["z"],
reduction=reduction,
ignore_index=ignore_index,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
labels[0] = np.int64(0)
weights = np.array([0.9, 0.7, 0.8, 0.9, 0.9], dtype=np.float32)
# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, labels, weight=weights, ignore_index=ignore_index)
# Check results
expect(
node,
inputs=[x, labels, weights],
outputs=[sce],
name="test_sce_mean_weight_ii",
)
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(1)
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y", "w"],
outputs=["z"],
reduction=reduction,
ignore_index=ignore_index,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3, 2)).astype(np.int64)
labels[0][0] = np.int64(1)
weights = np.array([0.2, 0.3, 0.6, 0.1, 0.5], dtype=np.float32)
# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, labels, weight=weights, ignore_index=ignore_index)
# Check results
expect(
node,
inputs=[x, labels, weights],
outputs=[sce],
name="test_sce_mean_weight_ii_3d",
)
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(1)
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y", "w"],
outputs=["z", "log_prob"],
reduction=reduction,
ignore_index=ignore_index,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3, 2)).astype(np.int64)
labels[0][0] = np.int64(1)
weights = np.array([0.2, 0.3, 0.6, 0.1, 0.5], dtype=np.float32)
# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
x, labels, weight=weights, ignore_index=ignore_index, get_log_prob=True
)
# Check results
expect(
node,
inputs=[x, labels, weights],
outputs=[loss, log_prob],
name="test_sce_mean_weight_ii_3d_log_prob",
)
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(2)
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y", "w"],
outputs=["z"],
reduction=reduction,
ignore_index=ignore_index,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2, 7).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3, 2, 7)).astype(np.int64)
labels[0][0][0] = np.int64(2)
weights = np.array([0.2, 0.3, 0.6, 0.1, 0.5], dtype=np.float32)
# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(
x, labels, reduction=reduction, weight=weights, ignore_index=ignore_index
)
# Check results
expect(
node,
inputs=[x, labels, weights],
outputs=[sce],
name="test_sce_mean_weight_ii_4d",
)
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(2)
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y", "w"],
outputs=["z", "log_prob"],
reduction=reduction,
ignore_index=ignore_index,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2, 7).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3, 2, 7)).astype(np.int64)
labels[0][0][0] = np.int64(2)
weights = np.array([0.2, 0.3, 0.6, 0.1, 0.5], dtype=np.float32)
# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
x,
labels,
reduction=reduction,
weight=weights,
ignore_index=ignore_index,
get_log_prob=True,
)
# Check results
expect(
node,
inputs=[x, labels, weights],
outputs=[loss, log_prob],
name="test_sce_mean_weight_ii_4d_log_prob",
)
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(0)
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y", "w"],
outputs=["z", "log_prob"],
reduction=reduction,
ignore_index=ignore_index,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
labels[0] = np.int64(0)
weights = np.array([0.9, 0.7, 0.8, 0.9, 0.9], dtype=np.float32)
# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
x, labels, weight=weights, ignore_index=ignore_index, get_log_prob=True
)
# Check results
expect(
node,
inputs=[x, labels, weights],
outputs=[loss, log_prob],
name="test_sce_mean_weight_ii_log_prob",
)
# Define operator attributes.
reduction = "mean"
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y", "w"],
outputs=["z", "log_prob"],
reduction=reduction,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
weights = np.array([0.9, 0.7, 0.8, 0.9, 0.9], dtype=np.float32)
# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
x, labels, weight=weights, get_log_prob=True
)
# Check results
expect(
node,
inputs=[x, labels, weights],
outputs=[loss, log_prob],
name="test_sce_mean_weight_log_prob",
)
# Define operator attributes.
reduction = "none"
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y"],
outputs=["z"],
reduction=reduction,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, labels, reduction="none")
# Check results
expect(node, inputs=[x, labels], outputs=[sce], name="test_sce_none")
# Define operator attributes.
reduction = "none"
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y"],
outputs=["z", "log_prob"],
reduction=reduction,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
x, labels, reduction="none", get_log_prob=True
)
# Check results
expect(
node,
inputs=[x, labels],
outputs=[loss, log_prob],
name="test_sce_none_log_prob",
)
# Define operator attributes.
reduction = "none"
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y", "w"],
outputs=["z"],
reduction=reduction,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
weights = np.array([0.9, 0.7, 0.8, 0.9, 0.9], dtype=np.float32)
# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, labels, weight=weights, reduction="none")
# Check results
expect(
node,
inputs=[x, labels, weights],
outputs=[sce],
name="test_sce_none_weights",
)
# Define operator attributes.
reduction = "none"
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y", "w"],
outputs=["z", "log_prob"],
reduction=reduction,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
weights = np.array([0.9, 0.7, 0.8, 0.9, 0.9], dtype=np.float32)
# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
x, labels, weight=weights, reduction="none", get_log_prob=True
)
# Check results
expect(
node,
inputs=[x, labels, weights],
outputs=[loss, log_prob],
name="test_sce_none_weights_log_prob",
)
# Define operator attributes.
reduction = "sum"
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y"],
outputs=["z"],
reduction=reduction,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, labels, reduction="sum")
# Check results
expect(node, inputs=[x, labels], outputs=[sce], name="test_sce_sum")
# Define operator attributes.
reduction = "sum"
# Create operator.
node = onnx.helper.make_node(
"SoftmaxCrossEntropyLoss",
inputs=["x", "y"],
outputs=["z", "log_prob"],
reduction=reduction,
)
# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
x, labels, reduction="sum", get_log_prob=True
)
# Check results
expect(
node,
inputs=[x, labels],
outputs=[loss, log_prob],
name="test_sce_sum_log_prob",
)
Softplus takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the softplus function, y = ln(exp(x) + 1), is applied to the tensor elementwise.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Softplus-1">1</a>
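Evaluating y = ln(exp(x) + 1) literally overflows for large positive inputs, because exp(x) exceeds the floating-point range before the logarithm is taken. As an illustration only (the function name `softplus_stable` is an assumption, not part of the operator definition), NumPy's `np.logaddexp` computes the same quantity as log(exp(0) + exp(x)) without forming exp(x) directly:

```python
import numpy as np


def softplus_stable(x):
    """Sketch: ln(exp(x) + 1) rewritten as logaddexp(0, x) to avoid overflow."""
    return np.logaddexp(0, x)


x = np.array([-1, 0, 1]).astype(np.float32)
print(softplus_stable(x))  # matches np.log(np.exp(x) + 1) for moderate inputs
print(softplus_stable(np.float32(1000.0)))  # finite, where exp(1000) would overflow
```

For moderate inputs both formulations agree to floating-point precision, so this variant only changes behavior at the extremes.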
node = onnx.helper.make_node(
"Softplus",
inputs=["x"],
outputs=["y"],
)
x = np.array([-1, 0, 1]).astype(np.float32)
y = np.log(
np.exp(x) + 1
) # expected output [0.31326166, 0.69314718, 1.31326163]
expect(node, inputs=[x], outputs=[y], name="test_softplus_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.log(np.exp(x) + 1)
expect(node, inputs=[x], outputs=[y], name="test_softplus")
Calculates the softsign (x/(1+|x|)) of the given input tensor element-wise.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Softsign-1">1</a>
node = onnx.helper.make_node(
"Softsign",
inputs=["x"],
outputs=["y"],
)
x = np.array([-1, 0, 1]).astype(np.float32)
y = np.array([-0.5, 0, 0.5]).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_softsign_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = x / (1 + np.abs(x))
expect(node, inputs=[x], outputs=[y], name="test_softsign")
SpaceToDepth rearranges blocks of spatial data into depth. More specifically, this op outputs a copy of the input tensor where values from the height and width dimensions are moved to the depth dimension.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#SpaceToDepth-1">1</a>
node = onnx.helper.make_node(
"SpaceToDepth",
inputs=["x"],
outputs=["y"],
blocksize=2,
)
# (1, 1, 4, 6) input tensor
x = np.array(
[
[
[
[0, 6, 1, 7, 2, 8],
[12, 18, 13, 19, 14, 20],
[3, 9, 4, 10, 5, 11],
[15, 21, 16, 22, 17, 23],
]
]
]
).astype(np.float32)
# (1, 4, 2, 3) output tensor
y = np.array(
[
[
[[0, 1, 2], [3, 4, 5]],
[[6, 7, 8], [9, 10, 11]],
[[12, 13, 14], [15, 16, 17]],
[[18, 19, 20], [21, 22, 23]],
]
]
).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_spacetodepth_example")
b, c, h, w = shape = (2, 2, 6, 6)
blocksize = 2
node = onnx.helper.make_node(
"SpaceToDepth",
inputs=["x"],
outputs=["y"],
blocksize=blocksize,
)
x = np.random.random_sample(shape).astype(np.float32)
tmp = np.reshape(
x, [b, c, h // blocksize, blocksize, w // blocksize, blocksize]
)
tmp = np.transpose(tmp, [0, 3, 5, 1, 2, 4])
y = np.reshape(tmp, [b, c * (blocksize**2), h // blocksize, w // blocksize])
expect(node, inputs=[x], outputs=[y], name="test_spacetodepth")
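The reshape/transpose recipe used in the random test above can be checked against the hand-written (1, 1, 4, 6) example. The sketch below wraps that recipe in a hypothetical helper `space_to_depth` (an illustration, not an ONNX API) and applies it to the same input:

```python
import numpy as np


def space_to_depth(x, blocksize):
    b, c, h, w = x.shape
    # Split H and W into (block index, offset inside block).
    tmp = x.reshape(b, c, h // blocksize, blocksize, w // blocksize, blocksize)
    # Move the two in-block offsets in front of the channel axis.
    tmp = tmp.transpose(0, 3, 5, 1, 2, 4)
    # Fold the offsets into the depth dimension.
    return tmp.reshape(b, c * blocksize**2, h // blocksize, w // blocksize)


x = np.array(
    [[[[0, 6, 1, 7, 2, 8],
       [12, 18, 13, 19, 14, 20],
       [3, 9, 4, 10, 5, 11],
       [15, 21, 16, 22, 17, 23]]]],
    dtype=np.float32,
)
y = space_to_depth(x, blocksize=2)  # shape (1, 4, 2, 3)
```

Each output channel collects the elements that share the same offset inside their 2x2 block.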
Split a tensor into a list of tensors, along the specified 'axis'.
Either input 'split' or the attribute 'num_outputs' should be specified, but not both.
If the attribute 'num_outputs' is specified, then the tensor is split into equal sized parts.
If the tensor is not evenly splittable into num_outputs, the last chunk will be smaller.
If the input 'split' is specified, it indicates the sizes of each output in the split.
This version of the operator has been available since version 18 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Split-1">1</a>, <a href="Changelog.md#Split-2">2</a>, <a href="Changelog.md#Split-11">11</a>, <a href="Changelog.md#Split-13">13</a>
node_input = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).astype(np.float32)
node = onnx.helper.make_node(
"Split",
inputs=["input"],
outputs=["output_1", "output_2", "output_3"],
axis=0,
)
expected_outputs = [
np.array([1.0, 2.0]).astype(np.float32),
np.array([3.0, 4.0]).astype(np.float32),
np.array([5.0, 6.0]).astype(np.float32),
]
expect(
node,
inputs=[node_input],
outputs=expected_outputs,
name="test_split_equal_parts_1d_opset13",
opset_imports=[onnx.helper.make_opsetid("", 13)],
)
split = np.array([2, 4]).astype(np.int64)
node = onnx.helper.make_node(
"Split",
inputs=["input", "split"],
outputs=["output_1", "output_2"],
axis=0,
)
expected_outputs = [
np.array([1.0, 2.0]).astype(np.float32),
np.array([3.0, 4.0, 5.0, 6.0]).astype(np.float32),
]
expect(
node,
inputs=[node_input, split],
outputs=expected_outputs,
name="test_split_variable_parts_1d_opset13",
opset_imports=[onnx.helper.make_opsetid("", 13)],
)
node_input = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).astype(np.float32)
node = onnx.helper.make_node(
"Split",
inputs=["input"],
outputs=["output_1", "output_2", "output_3"],
axis=0,
num_outputs=3,
)
expected_outputs = [
np.array([1.0, 2.0]).astype(np.float32),
np.array([3.0, 4.0]).astype(np.float32),
np.array([5.0, 6.0]).astype(np.float32),
]
expect(
node,
inputs=[node_input],
outputs=expected_outputs,
name="test_split_equal_parts_1d_opset18",
)
split = np.array([2, 4]).astype(np.int64)
node = onnx.helper.make_node(
"Split",
inputs=["input", "split"],
outputs=["output_1", "output_2"],
axis=0,
)
expected_outputs = [
np.array([1.0, 2.0]).astype(np.float32),
np.array([3.0, 4.0, 5.0, 6.0]).astype(np.float32),
]
expect(
node,
inputs=[node_input, split],
outputs=expected_outputs,
name="test_split_variable_parts_1d_opset18",
)
node_input = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]).astype(np.float32)
# If axis is not specified, split is applied on default axis 0
node = onnx.helper.make_node(
"Split",
inputs=["input"],
outputs=["output_1", "output_2", "output_3", "output_4"],
num_outputs=4,
)
expected_outputs = [
np.array([1.0, 2.0]).astype(np.float32),
np.array([3.0, 4.0]).astype(np.float32),
np.array([5.0, 6.0]).astype(np.float32),
np.array([7.0]).astype(np.float32),
]
expect(
node,
inputs=[node_input],
outputs=expected_outputs,
name="test_split_1d_uneven_split_opset18",
)
node_input = np.array(
[[1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [7.0, 8.0, 9.0, 10.0, 11.0, 12.0]]
).astype(np.float32)
node = onnx.helper.make_node(
"Split", inputs=["input"], outputs=["output_1", "output_2"], axis=1
)
expected_outputs = [
np.array([[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]]).astype(np.float32),
np.array([[4.0, 5.0, 6.0], [10.0, 11.0, 12.0]]).astype(np.float32),
]
expect(
node,
inputs=[node_input],
outputs=expected_outputs,
name="test_split_equal_parts_2d_opset13",
opset_imports=[onnx.helper.make_opsetid("", 13)],
)
split = np.array([2, 4]).astype(np.int64)
node = onnx.helper.make_node(
"Split",
inputs=["input", "split"],
outputs=["output_1", "output_2"],
axis=1,
)
expected_outputs = [
np.array([[1.0, 2.0], [7.0, 8.0]]).astype(np.float32),
np.array([[3.0, 4.0, 5.0, 6.0], [9.0, 10.0, 11.0, 12.0]]).astype(
np.float32
),
]
expect(
node,
inputs=[node_input, split],
outputs=expected_outputs,
name="test_split_variable_parts_2d_opset13",
opset_imports=[onnx.helper.make_opsetid("", 13)],
)
node_input = np.array(
[[1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [7.0, 8.0, 9.0, 10.0, 11.0, 12.0]]
).astype(np.float32)
node = onnx.helper.make_node(
"Split",
inputs=["input"],
outputs=["output_1", "output_2"],
axis=1,
num_outputs=2,
)
expected_outputs = [
np.array([[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]]).astype(np.float32),
np.array([[4.0, 5.0, 6.0], [10.0, 11.0, 12.0]]).astype(np.float32),
]
expect(
node,
inputs=[node_input],
outputs=expected_outputs,
name="test_split_equal_parts_2d",
)
split = np.array([2, 4]).astype(np.int64)
node = onnx.helper.make_node(
"Split",
inputs=["input", "split"],
outputs=["output_1", "output_2"],
axis=1,
)
expected_outputs = [
np.array([[1.0, 2.0], [7.0, 8.0]]).astype(np.float32),
np.array([[3.0, 4.0, 5.0, 6.0], [9.0, 10.0, 11.0, 12.0]]).astype(
np.float32
),
]
expect(
node,
inputs=[node_input, split],
outputs=expected_outputs,
name="test_split_variable_parts_2d_opset18",
)
node_input = np.array(
[
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
[9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0],
]
).astype(np.float32)
node = onnx.helper.make_node(
"Split",
inputs=["input"],
outputs=["output_1", "output_2", "output_3"],
axis=1,
num_outputs=3,
)
expected_outputs = [
np.array([[1.0, 2.0, 3.0], [9.0, 10.0, 11.0]]).astype(np.float32),
np.array([[4.0, 5.0, 6.0], [12.0, 13.0, 14.0]]).astype(np.float32),
np.array([[7.0, 8.0], [15.0, 16.0]]).astype(np.float32),
]
expect(
node,
inputs=[node_input],
outputs=expected_outputs,
name="test_split_2d_uneven_split_opset18",
)
node_input = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).astype(np.float32)
# If axis is not specified, split is applied on default axis 0
node = onnx.helper.make_node(
"Split", inputs=["input"], outputs=["output_1", "output_2", "output_3"]
)
expected_outputs = [
np.array([1.0, 2.0]).astype(np.float32),
np.array([3.0, 4.0]).astype(np.float32),
np.array([5.0, 6.0]).astype(np.float32),
]
expect(
node,
inputs=[node_input],
outputs=expected_outputs,
name="test_split_equal_parts_default_axis_opset13",
opset_imports=[onnx.helper.make_opsetid("", 13)],
)
split = np.array([2, 4]).astype(np.int64)
node = onnx.helper.make_node(
"Split", inputs=["input", "split"], outputs=["output_1", "output_2"]
)
expected_outputs = [
np.array([1.0, 2.0]).astype(np.float32),
np.array([3.0, 4.0, 5.0, 6.0]).astype(np.float32),
]
expect(
node,
inputs=[node_input, split],
outputs=expected_outputs,
name="test_split_variable_parts_default_axis_opset13",
opset_imports=[onnx.helper.make_opsetid("", 13)],
)
node_input = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).astype(np.float32)
# If axis is not specified, split is applied on default axis 0
node = onnx.helper.make_node(
"Split",
inputs=["input"],
outputs=["output_1", "output_2", "output_3"],
num_outputs=3,
)
expected_outputs = [
np.array([1.0, 2.0]).astype(np.float32),
np.array([3.0, 4.0]).astype(np.float32),
np.array([5.0, 6.0]).astype(np.float32),
]
expect(
node,
inputs=[node_input],
outputs=expected_outputs,
name="test_split_equal_parts_default_axis_opset18",
)
split = np.array([2, 4]).astype(np.int64)
node = onnx.helper.make_node(
"Split", inputs=["input", "split"], outputs=["output_1", "output_2"]
)
expected_outputs = [
np.array([1.0, 2.0]).astype(np.float32),
np.array([3.0, 4.0, 5.0, 6.0]).astype(np.float32),
]
expect(
node,
inputs=[node_input, split],
outputs=expected_outputs,
name="test_split_variable_parts_default_axis_opset18",
)
# 1-dimensional tensor with dimension_size=0
node_input = np.array([]).astype(np.float32)
# Split empty tensor to tensors of size zero
split = np.array([0, 0, 0]).astype(np.int64)
node = onnx.helper.make_node(
"Split",
inputs=["input", "split"],
outputs=["output_1", "output_2", "output_3"],
)
expected_outputs = [
np.array([]).astype(np.float32),
np.array([]).astype(np.float32),
np.array([]).astype(np.float32),
]
expect(
node,
inputs=[node_input, split],
outputs=expected_outputs,
name="test_split_zero_size_splits_opset13",
opset_imports=[onnx.helper.make_opsetid("", 13)],
)
# 1-dimensional tensor with dimension_size=0
node_input = np.array([]).astype(np.float32)
# Split empty tensor to tensors of size zero
split = np.array([0, 0, 0]).astype(np.int64)
node = onnx.helper.make_node(
"Split",
inputs=["input", "split"],
outputs=["output_1", "output_2", "output_3"],
)
expected_outputs = [
np.array([]).astype(np.float32),
np.array([]).astype(np.float32),
np.array([]).astype(np.float32),
]
expect(
node,
inputs=[node_input, split],
outputs=expected_outputs,
name="test_split_zero_size_splits_opset18",
)
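The uneven `num_outputs` examples above (7 elements into 4 parts, 8 columns into 3 parts) are consistent with using a chunk size of ceil(dim / num_outputs) and letting the last chunk take the remainder. The sketch below assumes that rule; `split_sizes` and `split_by_num_outputs` are hypothetical helper names, and no validation is done for inputs where the remainder would be negative.

```python
import numpy as np


def split_sizes(dim, num_outputs):
    # Assumed rule: every chunk has ceil(dim / num_outputs) elements,
    # except the last one, which takes whatever is left.
    chunk = -(-dim // num_outputs)  # integer ceil
    sizes = [chunk] * num_outputs
    sizes[-1] = dim - chunk * (num_outputs - 1)
    return sizes


def split_by_num_outputs(x, num_outputs, axis=0):
    sizes = split_sizes(x.shape[axis], num_outputs)
    boundaries = np.cumsum(sizes)[:-1]  # split points between chunks
    return np.split(x, boundaries, axis=axis)


x = np.arange(1.0, 8.0, dtype=np.float32)  # [1, 2, ..., 7]
parts = split_by_num_outputs(x, num_outputs=4)
```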
Split a tensor into a sequence of tensors, along the specified 'axis'.
Lengths of the parts can be specified using the optional argument 'split'.
If the argument 'split' is not specified, a default scalar value of 1 is used as the value of 'split'.
'split' must contain only positive numbers.
'split' is either a scalar (tensor of empty shape), or a 1-D tensor.
If 'split' is a scalar, then 'input' will be split into chunks all of size 'split'
if possible. The last chunk alone may be smaller than 'split' if the 'input' size
along the given axis 'axis' is not divisible by 'split'.
If 'split' is a 1-dimensional tensor, the input tensor is split into 'size(split)' chunks,
with lengths of the parts on 'axis' specified in 'split'. In this scenario, the sum of entries
in 'split' must be equal to the dimension size of input tensor on 'axis'.
This version of the operator has been available since version 24 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#SplitToSequence-11">11</a>
data = np.arange(18).reshape((3, 6)).astype(np.float32)
node = onnx.helper.make_node(
"SplitToSequence",
["data"],
["seq"],
axis=1,
keepdims=0,
)
expected_outputs = [[data[:, i] for i in range(data.shape[1])]]
expect(
node,
inputs=[data],
outputs=expected_outputs,
name="test_split_to_sequence_nokeepdims",
)
data = np.arange(18).reshape((3, 6)).astype(np.float32)
split = np.array(2, dtype=np.int64)
node = onnx.helper.make_node(
"SplitToSequence", ["data", "split"], ["seq"], axis=1
)
expected_outputs = [
[
np.array([[0.0, 1.0], [6.0, 7.0], [12.0, 13.0]], dtype=np.float32),
np.array([[2.0, 3.0], [8.0, 9.0], [14.0, 15.0]], dtype=np.float32),
np.array([[4.0, 5.0], [10.0, 11.0], [16.0, 17.0]], dtype=np.float32),
]
]
expect(
node,
inputs=[data, split],
outputs=expected_outputs,
name="test_split_to_sequence_1",
)
data = np.arange(18).reshape((3, 6)).astype(np.float32)
split = np.array([1, 2], dtype=np.int64)
node = onnx.helper.make_node(
"SplitToSequence", ["data", "split"], ["seq"], axis=0
)
expected_outputs = [
[
data[:1],
data[1:],
]
]
expect(
node,
inputs=[data, split],
outputs=expected_outputs,
name="test_split_to_sequence_2",
)
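The three cases described above (no 'split', scalar 'split', 1-D 'split') can be sketched in NumPy as follows. `split_to_sequence` is a hypothetical helper, and the sketch assumes that `keepdims=0` only takes effect when 'split' is not provided, as in the first example.

```python
import numpy as np


def split_to_sequence(data, split=None, axis=0, keepdims=1):
    dim = data.shape[axis]
    if split is None:
        sizes = [1] * dim                      # default: chunks of size 1
    elif np.ndim(split) == 0:
        size = int(split)
        sizes = [size] * (dim // size)         # full chunks
        if dim % size:
            sizes.append(dim % size)           # smaller last chunk
    else:
        sizes = [int(s) for s in split]        # explicit lengths
    parts = np.split(data, np.cumsum(sizes)[:-1], axis=axis)
    if split is None and not keepdims:
        # Assumption: the split axis is dropped only in the default case.
        parts = [np.squeeze(p, axis=axis) for p in parts]
    return parts


data = np.arange(18).reshape((3, 6)).astype(np.float32)
no_keepdims = split_to_sequence(data, axis=1, keepdims=0)
by_scalar = split_to_sequence(data, np.array(2, dtype=np.int64), axis=1)
by_vector = split_to_sequence(data, np.array([1, 2], dtype=np.int64), axis=0)
```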
Square root takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the square root, y = x^0.5, is applied to the tensor elementwise. If x is negative, then it will return NaN.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Sqrt-1">1</a>, <a href="Changelog.md#Sqrt-6">6</a>
node = onnx.helper.make_node(
"Sqrt",
inputs=["x"],
outputs=["y"],
)
x = np.array([1, 4, 9]).astype(np.float32)
y = np.sqrt(x) # expected output [1., 2., 3.]
expect(node, inputs=[x], outputs=[y], name="test_sqrt_example")
x = np.abs(np.random.randn(3, 4, 5).astype(np.float32))
y = np.sqrt(x)
expect(node, inputs=[x], outputs=[y], name="test_sqrt")
Remove single-dimensional entries from the shape of a tensor.
Takes an input `axes` containing a list of axes to squeeze.
If `axes` is not provided, all single-dimensional entries are removed from
the shape. If a selected axis has a shape entry not equal to one, an error is raised.
This version of the operator has been available since version 25 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Squeeze-1">1</a>, <a href="Changelog.md#Squeeze-11">11</a>, <a href="Changelog.md#Squeeze-13">13</a>, <a href="Changelog.md#Squeeze-21">21</a>, <a href="Changelog.md#Squeeze-23">23</a>, <a href="Changelog.md#Squeeze-24">24</a>
node = onnx.helper.make_node(
"Squeeze",
inputs=["x", "axes"],
outputs=["y"],
)
x = np.random.randn(1, 3, 4, 5).astype(np.float32)
axes = np.array([0], dtype=np.int64)
y = np.squeeze(x, axis=0)
expect(node, inputs=[x, axes], outputs=[y], name="test_squeeze")
node = onnx.helper.make_node(
"Squeeze",
inputs=["x", "axes"],
outputs=["y"],
)
x = np.random.randn(1, 3, 1, 5).astype(np.float32)
axes = np.array([-2], dtype=np.int64)
y = np.squeeze(x, axis=-2)
expect(node, inputs=[x, axes], outputs=[y], name="test_squeeze_negative_axes")
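The three behaviours described above (squeeze everything, squeeze selected axes, error on a non-unit axis) have direct NumPy analogues. This is a sketch of the NumPy behaviour only; the exact error reported by an ONNX runtime may differ.

```python
import numpy as np

x = np.zeros((1, 3, 1, 5), dtype=np.float32)

# No axes given: every dimension of size 1 is removed.
all_squeezed = np.squeeze(x)

# A negative axis counts from the end; only that dimension is removed.
one_squeezed = np.squeeze(x, axis=-2)

# Selecting a dimension whose size is not 1 is an error.
try:
    np.squeeze(x, axis=1)
    raised = False
except ValueError:
    raised = True
```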
StringConcat concatenates string tensors elementwise (with NumPy-style broadcasting support).
This version of the operator has been available since version 20 of the default ONNX operator set.
node = onnx.helper.make_node(
"StringConcat",
inputs=["x", "y"],
outputs=["result"],
)
x = np.array(["abc", "def"]).astype("object")
y = np.array([".com", ".net"]).astype("object")
result = np.array(["abc.com", "def.net"]).astype("object")
expect(node, inputs=[x, y], outputs=[result], name="test_string_concat")
x = np.array(["cat", "dog", "snake"]).astype("object")
y = np.array(["s"]).astype("object")
result = np.array(["cats", "dogs", "snakes"]).astype("object")
expect(
node,
inputs=[x, y],
outputs=[result],
name="test_string_concat_broadcasting",
)
x = np.array("cat").astype("object")
y = np.array("s").astype("object")
result = np.array("cats").astype("object")
expect(
node,
inputs=[x, y],
outputs=[result],
name="test_string_concat_zero_dimensional",
)
x = np.array(["abc", ""]).astype("object")
y = np.array(["", "abc"]).astype("object")
result = np.array(["abc", "abc"]).astype("object")
expect(
node,
inputs=[x, y],
outputs=[result],
name="test_string_concat_empty_string",
)
x = np.array(["的", "中"]).astype("object")
y = np.array(["的", "中"]).astype("object")
result = np.array(["的的", "中中"]).astype("object")
expect(
node,
inputs=[x, y],
outputs=[result],
name="test_string_concat_utf8",
)
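As a rough NumPy analogue of the broadcasting behaviour, adding two arrays of dtype `object` applies Python string concatenation elementwise and follows the usual broadcasting rules. This sketch only illustrates the expected values; it is not how an ONNX runtime implements the operator.

```python
import numpy as np

x = np.array(["cat", "dog", "snake"], dtype=object)
y = np.array(["s"], dtype=object)

# Shape (3,) with shape (1,): y is broadcast against every element of x.
plural = x + y

# Shape (3, 1) with shape (1, 2): the result has shape (3, 2).
suffixes = np.array([".com", ".net"], dtype=object)
grid = x[:, None] + suffixes[None, :]
```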
StringNormalization performs string operations for basic cleaning. This operator has only one input (denoted by X) and only one output (denoted by Y). It first examines the elements in X and removes those specified in the "stopwords" attribute. After removing stop words, the intermediate result can be further lowercased, uppercased, or returned unchanged, depending on the "case_change_action" attribute. This operator only accepts [C]- and [1, C]-tensors. If all elements in X are dropped, the output will be a string tensor holding a single empty string, with shape [1] if the input shape is [C] and shape [1, 1] if the input shape is [1, C].
This version of the operator has been available since version 10 of the default ONNX operator set.
input = np.array(["monday", "tuesday", "wednesday", "thursday"]).astype(object)
output = np.array(["tuesday", "wednesday", "thursday"]).astype(object)
stopwords = ["monday"]
node = onnx.helper.make_node(
"StringNormalizer",
inputs=["x"],
outputs=["y"],
case_change_action="LOWER",
is_case_sensitive=1,
stopwords=stopwords,
)
expect(
node,
inputs=[input],
outputs=[output],
name="test_strnormalizer_export_monday_casesensintive_lower",
)
input = np.array(["monday", "tuesday", "wednesday", "thursday"]).astype(object)
output = np.array(["tuesday", "wednesday", "thursday"]).astype(object)
stopwords = ["monday"]
node = onnx.helper.make_node(
"StringNormalizer",
inputs=["x"],
outputs=["y"],
is_case_sensitive=1,
stopwords=stopwords,
)
expect(
node,
inputs=[input],
outputs=[output],
name="test_strnormalizer_export_monday_casesensintive_nochangecase",
)
input = np.array(["monday", "tuesday", "wednesday", "thursday"]).astype(object)
output = np.array(["TUESDAY", "WEDNESDAY", "THURSDAY"]).astype(object)
stopwords = ["monday"]
node = onnx.helper.make_node(
"StringNormalizer",
inputs=["x"],
outputs=["y"],
case_change_action="UPPER",
is_case_sensitive=1,
stopwords=stopwords,
)
expect(
node,
inputs=[input],
outputs=[output],
name="test_strnormalizer_export_monday_casesensintive_upper",
)
input = np.array(["monday", "monday"]).astype(object)
output = np.array([""]).astype(object)
stopwords = ["monday"]
node = onnx.helper.make_node(
"StringNormalizer",
inputs=["x"],
outputs=["y"],
case_change_action="UPPER",
is_case_sensitive=1,
stopwords=stopwords,
)
expect(
node,
inputs=[input],
outputs=[output],
name="test_strnormalizer_export_monday_empty_output",
)
input = (
np.array(
["Monday", "tuesday", "wednesday", "Monday", "tuesday", "wednesday"]
)
.astype(object)
.reshape([1, 6])
)
# It does upper case cedilla, accented E
# and German umlaut but fails
# with German eszett
output = (
np.array(["TUESDAY", "WEDNESDAY", "TUESDAY", "WEDNESDAY"])
.astype(object)
.reshape([1, 4])
)
stopwords = ["monday"]
node = onnx.helper.make_node(
"StringNormalizer",
inputs=["x"],
outputs=["y"],
case_change_action="UPPER",
stopwords=stopwords,
)
expect(
node,
inputs=[input],
outputs=[output],
name="test_strnormalizer_export_monday_insensintive_upper_twodim",
)
input = np.array(["monday", "tuesday"]).astype(object)
output = input
# No stopwords. This is a NOOP
node = onnx.helper.make_node(
"StringNormalizer",
inputs=["x"],
outputs=["y"],
is_case_sensitive=1,
)
expect(
node,
inputs=[input],
outputs=[output],
name="test_strnormalizer_nostopwords_nochangecase",
)
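The processing order described above (drop stop words, then change case, then substitute a single empty string if nothing is left) can be sketched in plain Python. `string_normalizer` is a hypothetical helper; it assumes the defaults `case_change_action="NONE"` and `is_case_sensitive=0`, and it uses Python's `str.lower`/`str.upper` instead of any locale-aware conversion.

```python
import numpy as np


def string_normalizer(x, stopwords=(), case_change_action="NONE",
                      is_case_sensitive=0):
    flat = x.reshape(-1)
    # Step 1: remove stop words, optionally ignoring case.
    if is_case_sensitive:
        stop = set(stopwords)
        kept = [s for s in flat if s not in stop]
    else:
        stop = {w.lower() for w in stopwords}
        kept = [s for s in flat if s.lower() not in stop]
    # Step 2: apply the requested case change to what is left.
    if case_change_action == "LOWER":
        kept = [s.lower() for s in kept]
    elif case_change_action == "UPPER":
        kept = [s.upper() for s in kept]
    # Step 3: an empty result becomes a single empty string.
    if not kept:
        kept = [""]
    out = np.array(kept, dtype=object)
    return out.reshape(1, -1) if x.ndim == 2 else out


x = np.array(
    ["Monday", "tuesday", "wednesday", "Monday", "tuesday", "wednesday"]
).astype(object).reshape([1, 6])
y = string_normalizer(x, stopwords=["monday"], case_change_action="UPPER")
empty = string_normalizer(
    np.array(["monday", "monday"]).astype(object),
    stopwords=["monday"],
    is_case_sensitive=1,
)
```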
StringSplit splits a string tensor's elements into substrings based on a delimiter attribute and a maxsplit attribute.
The first output of this operator is a tensor of strings representing the substrings from splitting each input string on the delimiter substring. This tensor has one additional rank compared to the input tensor in order to store the substrings for each input element (where the input tensor is not empty). Note that, in order to ensure the same number of elements are present in the final dimension, this tensor will pad empty strings as illustrated in the examples below. Consecutive delimiters are not grouped together and are deemed to delimit empty strings, except if the delimiter is unspecified or is the empty string (""). In the case where the delimiter is unspecified or the empty string, consecutive whitespace characters are regarded as a single separator and leading or trailing whitespace is removed in the output.
The second output tensor represents the number of substrings generated for each input element. maxsplit can be used to limit the number of splits performed: after the maxsplit-th split, if the string is not fully split, the trailing suffix of the input string after the final split point is also added. For elements where fewer splits are possible than specified in maxsplit, it has no effect.
This version of the operator has been available since version 20 of the default ONNX operator set.
node = onnx.helper.make_node(
"StringSplit",
inputs=["x"],
outputs=["substrings", "length"],
delimiter=".",
maxsplit=None,
)
x = np.array(["abc.com", "def.net"]).astype(object)
substrings = np.array([["abc", "com"], ["def", "net"]]).astype(object)
length = np.array([2, 2], dtype=np.int64)
expect(
node,
inputs=[x],
outputs=[substrings, length],
name="test_string_split_basic",
)
node = onnx.helper.make_node(
"StringSplit",
inputs=["x"],
outputs=["substrings", "length"],
delimiter="-",
maxsplit=None,
)
x = np.array(["o-n-n--x-", "o-n----nx"]).astype(object)
substrings = np.array(
[["o", "n", "n", "", "x", ""], ["o", "n", "", "", "", "nx"]]
).astype(object)
length = np.array([6, 6], dtype=np.int64)
expect(
node,
inputs=[x],
outputs=[substrings, length],
name="test_string_split_consecutive_delimiters",
)
for delimiter, test_name in (
    ("", "test_string_split_empty_string_delimiter"),
    (None, "test_string_split_no_delimiter"),
):
    node = onnx.helper.make_node(
        "StringSplit",
        inputs=["x"],
        outputs=["substrings", "length"],
        delimiter=delimiter,
        maxsplit=None,
    )
    x = np.array(
        ["hello world !", " hello world !", " hello world ! "]
    ).astype(object)
    substrings = np.array(
        [
            ["hello", "world", "!"],
            ["hello", "world", "!"],
            ["hello", "world", "!"],
        ]
    ).astype(object)
    length = np.array([3, 3, 3], dtype=np.int64)
    expect(
        node,
        inputs=[x],
        outputs=[substrings, length],
        name=test_name,
    )
node = onnx.helper.make_node(
"StringSplit",
inputs=["x"],
outputs=["substrings", "length"],
delimiter=None,
maxsplit=None,
)
x = np.array([]).astype(object)
substrings = np.array([]).astype(object).reshape(0, 0)
length = np.array([], dtype=np.int64)
expect(
node,
inputs=[x],
outputs=[substrings, length],
name="test_string_split_empty_tensor",
output_type_protos=[
onnx.helper.make_tensor_type_proto(onnx.TensorProto.STRING, (0, None)),
None,
],
)
node = onnx.helper.make_node(
"StringSplit",
inputs=["x"],
outputs=["substrings", "length"],
maxsplit=2,
)
x = np.array(
[["hello world", "def.net"], ["o n n x", "the quick brown fox"]]
).astype(object)
substrings = np.array(
[
[["hello", "world", ""], ["def.net", "", ""]],
[["o", "n", "n x"], ["the", "quick", "brown fox"]],
]
).astype(object)
length = np.array([[2, 1], [3, 3]], np.int64)
expect(
node,
inputs=[x],
outputs=[substrings, length],
name="test_string_split_maxsplit",
)
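The splitting and padding rules described above closely mirror Python's `str.split`. The sketch below is a hypothetical reference helper (`string_split` is not an ONNX API) that treats an unspecified or empty delimiter as whitespace splitting and pads every row with empty strings up to the longest split.

```python
import numpy as np


def string_split(x, delimiter=None, maxsplit=None):
    if delimiter == "":
        delimiter = None  # whitespace splitting, like str.split()
    limit = -1 if maxsplit is None else maxsplit
    pieces = [s.split(delimiter, limit) for s in x.reshape(-1)]
    lengths = np.array([len(p) for p in pieces], dtype=np.int64)
    width = int(lengths.max()) if lengths.size else 0
    # Pad each row with "" so that the last dimension is uniform.
    padded = np.empty((len(pieces), width), dtype=object)
    for i, p in enumerate(pieces):
        padded[i, :] = p + [""] * (width - len(p))
    return padded.reshape(*x.shape, width), lengths.reshape(x.shape)


x = np.array(
    [["hello world", "def.net"], ["o n n x", "the quick brown fox"]]
).astype(object)
substrings, length = string_split(x, maxsplit=2)

dashed = np.array(["o-n-n--x-", "o-n----nx"]).astype(object)
dashed_sub, dashed_len = string_split(dashed, delimiter="-")
```

The output gains one trailing dimension whose size is the largest number of substrings produced by any element.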
Performs element-wise binary subtraction (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
(Opset 14 change): Extend supported types to include uint8, int8, uint16, and int16.
This version of the operator has been available since version 14 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Sub-1">1</a>, <a href="Changelog.md#Sub-6">6</a>, <a href="Changelog.md#Sub-7">7</a>, <a href="Changelog.md#Sub-13">13</a>
node = onnx.helper.make_node(
"Sub",
inputs=["x", "y"],
outputs=["z"],
)
x = np.array([1, 2, 3]).astype(np.float32)
y = np.array([3, 2, 1]).astype(np.float32)
z = x - y # expected output [-2., 0., 2.]
expect(node, inputs=[x, y], outputs=[z], name="test_sub_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(3, 4, 5).astype(np.float32)
z = x - y
expect(node, inputs=[x, y], outputs=[z], name="test_sub")
x = np.random.randint(12, 24, size=(3, 4, 5), dtype=np.int8)
y = np.random.randint(12, size=(3, 4, 5), dtype=np.int8)
z = x - y
expect(node, inputs=[x, y], outputs=[z], name="test_sub_int8")
x = np.random.randint(12, 24, size=(3, 4, 5), dtype=np.int16)
y = np.random.randint(12, size=(3, 4, 5), dtype=np.int16)
z = x - y
expect(node, inputs=[x, y], outputs=[z], name="test_sub_int16")
x = np.random.randint(12, 24, size=(3, 4, 5), dtype=np.uint8)
y = np.random.randint(12, size=(3, 4, 5), dtype=np.uint8)
z = x - y
expect(node, inputs=[x, y], outputs=[z], name="test_sub_uint8")
x = np.random.randint(12, 24, size=(3, 4, 5), dtype=np.uint16)
y = np.random.randint(12, size=(3, 4, 5), dtype=np.uint16)
z = x - y
expect(node, inputs=[x, y], outputs=[z], name="test_sub_uint16")
x = np.random.randint(12, 24, size=(3, 4, 5), dtype=np.uint32)
y = np.random.randint(12, size=(3, 4, 5), dtype=np.uint32)
z = x - y
expect(node, inputs=[x, y], outputs=[z], name="test_sub_uint32")
x = np.random.randint(12, 24, size=(3, 4, 5), dtype=np.uint64)
y = np.random.randint(12, size=(3, 4, 5), dtype=np.uint64)
z = x - y
expect(node, inputs=[x, y], outputs=[z], name="test_sub_uint64")
node = onnx.helper.make_node(
"Sub",
inputs=["x", "y"],
outputs=["z"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(5).astype(np.float32)
z = x - y
expect(node, inputs=[x, y], outputs=[z], name="test_sub_bcast")
Element-wise sum of each of the input tensors (with Numpy-style broadcasting support). All inputs and outputs must have the same data type. This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Sum-1">1</a>, <a href="Changelog.md#Sum-6">6</a>, <a href="Changelog.md#Sum-8">8</a>
data_0 = np.array([3, 0, 2]).astype(np.float32)
data_1 = np.array([1, 3, 4]).astype(np.float32)
data_2 = np.array([2, 6, 6]).astype(np.float32)
result = np.array([6, 9, 12]).astype(np.float32)
node = onnx.helper.make_node(
"Sum",
inputs=["data_0", "data_1", "data_2"],
outputs=["result"],
)
expect(
node,
inputs=[data_0, data_1, data_2],
outputs=[result],
name="test_sum_example",
)
node = onnx.helper.make_node(
"Sum",
inputs=["data_0"],
outputs=["result"],
)
expect(node, inputs=[data_0], outputs=[data_0], name="test_sum_one_input")
result = np.add(data_0, data_1)
node = onnx.helper.make_node(
"Sum",
inputs=["data_0", "data_1"],
outputs=["result"],
)
expect(
node, inputs=[data_0, data_1], outputs=[result], name="test_sum_two_inputs"
)
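Because Sum accepts a variable number of inputs, one way to picture it is as a left fold of NumPy-style addition over the input list. In this sketch `onnx_sum` is a hypothetical helper name, and the second call shows broadcasting between a (3,) and a (2, 1) tensor.

```python
from functools import reduce

import numpy as np


def onnx_sum(*inputs):
    # Fold elementwise addition over all inputs; a single input is
    # returned unchanged.
    return reduce(np.add, inputs)


data_0 = np.array([3, 0, 2], dtype=np.float32)
data_1 = np.array([1, 3, 4], dtype=np.float32)
data_2 = np.array([2, 6, 6], dtype=np.float32)

total = onnx_sum(data_0, data_1, data_2)

column = np.array([[10], [20]], dtype=np.float32)  # shape (2, 1)
broadcast = onnx_sum(data_0, column)               # shape (2, 3)
```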
Swish function takes one input data (Tensor<T>) and produces one output data (Tensor<T>) of the same shape, where $Swish(x) = x * sigmoid(alpha * x)$.
This version of the operator has been available since version 24 of the default ONNX operator set.
node = onnx.helper.make_node(
"Swish",
inputs=["x"],
outputs=["y"],
alpha=1.0, # pass alpha as attribute
)
x = np.array([3, 4, 5], dtype=np.float32)
y = swish(x, alpha=1.0)
expect(
node,
inputs=[x],
outputs=[y],
name="test_swish",
opset_imports=[onnx.helper.make_opsetid("", 24)],
)
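The example above calls a `swish` helper that is not shown in this section. A minimal sketch of such a helper, written directly from the formula Swish(x) = x * sigmoid(alpha * x), might look like the following; the default `alpha=1.0` is an assumption.

```python
import numpy as np


def swish(x, alpha=1.0):
    # x * sigmoid(alpha * x), with sigmoid(z) = 1 / (1 + exp(-z)).
    return x / (1.0 + np.exp(-alpha * x))


x = np.array([3, 4, 5], dtype=np.float32)
y = swish(x, alpha=1.0)
at_zero = swish(np.float32(0.0))
```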
Calculates the tangent of the given input tensor, element-wise.
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Tan-7">7</a>
node = onnx.helper.make_node(
"Tan",
inputs=["x"],
outputs=["y"],
)
x = np.array([-1, 0, 1]).astype(np.float32)
y = np.tan(x)
expect(node, inputs=[x], outputs=[y], name="test_tan_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.tan(x)
expect(node, inputs=[x], outputs=[y], name="test_tan")
Calculates the hyperbolic tangent of the given input tensor element-wise.
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Tanh-1">1</a>, <a href="Changelog.md#Tanh-6">6</a>
node = onnx.helper.make_node(
"Tanh",
inputs=["x"],
outputs=["y"],
)
x = np.array([-1, 0, 1]).astype(np.float32)
y = np.tanh(x) # expected output [-0.76159418, 0., 0.76159418]
expect(node, inputs=[x], outputs=[y], name="test_tanh_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.tanh(x)
expect(node, inputs=[x], outputs=[y], name="test_tanh")
TensorScatter is a generic tensor update operation, motivated by the requirements for KV cache updates for Attention ops commonly found in LLMs. It is a functional operation that models an in-place update to a KV cache buffer.
The past and present cache tensors have the same shape (batch_size, D1, D2, ..., max_sequence_length, ..., Dn), with
the sequence dimension (indicated by the axis attribute) being max_sequence_length, so the sizes of these tensors do
not need to grow between iterations. The update tensor's shape only differs from the cache tensors in the sequence
dimension: (batch_size, D1, D2, ..., sequence_length, ..., Dn), where sequence_length <= max_sequence_length.
The optional write_indices input indicates the write index for each sample in the batch, assumed to be zero
if not provided. When the mode attribute is set to "circular", the write index is modulo max_sequence_length.
The operation can be described using the following pseudocode:
for prefix_idx in np.ndindex(past_cache.shape[:axis]):
    batch_idx = prefix_idx[0]
    for sequence_idx in range(sequence_length):
        cache_idx = (*prefix_idx, write_indices[batch_idx] + sequence_idx)
        if mode == "circular":
            cache_idx = tuple(np.mod(np.asarray(cache_idx), max_sequence_length))
        update_idx = (*prefix_idx, sequence_idx)
        present_cache[cache_idx] = update[update_idx]
During the prefill phase of attention, only the first two inputs are needed. During the decode phase, write_indices
is also needed so that the incoming key or value update can be appended after the last valid token for each sample
in the batch.
This version of the operator has been available since version 24 of the default ONNX operator set.
node = onnx.helper.make_node(
"TensorScatter",
inputs=["past_cache", "update", "write_indices"],
outputs=["present_cache"],
mode="linear",
)
past_cache = np.array(
[
[[[1, 2, 3, 4, 5], [5, 6, 7, 8, 9], [8, 7, 6, 5, 4], [4, 3, 2, 1, 0]]],
[[[1, 2, 3, 4, 5], [5, 6, 7, 8, 9], [8, 7, 6, 5, 4], [4, 3, 2, 1, 0]]],
],
dtype=np.float32,
)
update = np.array(
[
[[[5, 5, 5, 5, 5]]],
[[[1, 1, 1, 1, 1]]],
],
dtype=np.float32,
)
write_indices = np.array([1, 2], dtype=np.int64)
present_cache = np.array(
[
[[[1, 2, 3, 4, 5], [5, 5, 5, 5, 5], [8, 7, 6, 5, 4], [4, 3, 2, 1, 0]]],
[[[1, 2, 3, 4, 5], [5, 6, 7, 8, 9], [1, 1, 1, 1, 1], [4, 3, 2, 1, 0]]],
],
dtype=np.float32,
)
expect(
node,
inputs=[past_cache, update, write_indices],
outputs=[present_cache],
name="test_tensorscatter",
)
node = onnx.helper.make_node(
"TensorScatter",
inputs=["past_cache", "update", "write_indices"],
outputs=["present_cache"],
)
past_cache = np.array(
[
[
[1, 2, 3, 4, 5],
[5, 6, 7, 8, 9],
[8, 7, 6, 5, 4],
[5, 4, 3, 2, 1],
],
[
[1, 2, 3, 4, 5],
[5, 6, 7, 8, 9],
[8, 7, 6, 5, 4],
[5, 4, 3, 2, 1],
],
[
[1, 2, 3, 4, 5],
[5, 6, 7, 8, 9],
[8, 7, 6, 5, 4],
[5, 4, 3, 2, 1],
],
],
dtype=np.float32,
)
update = np.array(
[
[
[4, 4, 4, 4, 4],
[5, 5, 5, 5, 5],
],
[
[6, 6, 6, 6, 6],
[7, 7, 7, 7, 7],
],
[
[2, 2, 2, 2, 2],
[3, 3, 3, 3, 3],
],
],
dtype=np.float32,
)
write_indices = np.array([1, 2, 0], dtype=np.int64)
present_cache = np.array(
[
[
[1, 2, 3, 4, 5],
[4, 4, 4, 4, 4],
[5, 5, 5, 5, 5],
[5, 4, 3, 2, 1],
],
[
[1, 2, 3, 4, 5],
[5, 6, 7, 8, 9],
[6, 6, 6, 6, 6],
[7, 7, 7, 7, 7],
],
[
[2, 2, 2, 2, 2],
[3, 3, 3, 3, 3],
[8, 7, 6, 5, 4],
[5, 4, 3, 2, 1],
],
],
dtype=np.float32,
)
expect(
node,
inputs=[past_cache, update, write_indices],
outputs=[present_cache],
name="test_tensorscatter_3d",
)
node = onnx.helper.make_node(
"TensorScatter",
inputs=["past_cache", "update", "write_indices"],
outputs=["present_cache"],
mode="circular",
)
past_cache = np.array(
[
[[[1, 2, 3, 4, 5], [5, 6, 7, 8, 9], [8, 7, 6, 5, 4], [4, 3, 2, 1, 0]]],
[[[1, 2, 3, 4, 5], [5, 6, 7, 8, 9], [8, 7, 6, 5, 4], [4, 3, 2, 1, 0]]],
],
dtype=np.float32,
)
update = np.array(
[
[
[
[5, 5, 5, 5, 5],
[6, 6, 6, 6, 6],
]
],
[
[
[1, 1, 1, 1, 1],
[2, 2, 2, 2, 2],
]
],
],
dtype=np.float32,
)
write_indices = np.array([1, 3], dtype=np.int64)
present_cache = np.array(
[
[[[1, 2, 3, 4, 5], [5, 5, 5, 5, 5], [6, 6, 6, 6, 6], [4, 3, 2, 1, 0]]],
[[[2, 2, 2, 2, 2], [5, 6, 7, 8, 9], [8, 7, 6, 5, 4], [1, 1, 1, 1, 1]]],
],
dtype=np.float32,
)
expect(
node,
inputs=[past_cache, update, write_indices],
outputs=[present_cache],
name="test_tensorscatter_circular",
)
This transform extracts n-grams from the input sequence and saves them as a vector. The input can be either a 1-D or a 2-D tensor. For 1-D input, the output is the n-gram representation of that input. For 2-D input, the output is also a 2-D tensor whose i-th row is the n-gram representation of the i-th input row. More specifically, if the input shape is [C], the output shape is [max(ngram_indexes) + 1]. If the input shape is [N, C], this operator produces an [N, max(ngram_indexes) + 1]-tensor.
In contrast to standard n-gram extraction, the indexes used to extract an n-gram from the original sequence are not necessarily consecutive. The gap between indexes is controlled by the number of skips. If the number of skips is 2, two tokens are skipped between consecutive items of an n-gram when scanning through the original sequence. For example, assume that the input sequence is [94, 17, 36, 12, 28] and the number of skips is 2. The associated 2-grams are [94, 12] and [17, 28], indexed by [0, 3] and [1, 4] respectively. If the number of skips is 0, the 2-grams generated are [94, 17], [17, 36], [36, 12], [12, 28], indexed by [0, 1], [1, 2], [2, 3], [3, 4] respectively.
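The skip rule can be illustrated with a minimal Python sketch. The helper name `skip_bigrams` is hypothetical and not part of ONNX; it handles only 2-grams with one exact skip count, which is narrower than what the operator does.

```python
def skip_bigrams(seq, skip):
    """Hypothetical helper: 2-grams whose two items are separated by `skip` tokens.

    Returns a list of (bigram, positions) pairs.
    """
    step = skip + 1  # distance between the two positions of a bigram
    return [
        ([seq[i], seq[i + step]], [i, i + step])
        for i in range(len(seq) - step)
    ]


sequence = [94, 17, 36, 12, 28]
with_two_skips = skip_bigrams(sequence, skip=2)
with_no_skips = skip_bigrams(sequence, skip=0)
print(with_two_skips)  # [([94, 12], [0, 3]), ([17, 28], [1, 4])]
print(with_no_skips)
```

With `skip=0` the helper returns the four consecutive bigrams, matching the second case in the text.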
The output vector (denoted by Y) stores the count of each n-gram; Y[ngram_indexes[i]] is the number of times the i-th n-gram is found. The attribute ngram_indexes determines the mapping between index i and the corresponding n-gram's output coordinate. If pool_int64s is [94, 17, 17, 36], ngram_indexes is [1, 0], and ngram_counts is [0, 0], then Y[0] (the first element in Y) and Y[1] (the second element in Y) are the counts of [17, 36] and [94, 17], respectively. An n-gram that cannot be found in pool_strings/pool_int64s is ignored and has no effect on the output. Note that all skips up to S may be considered when generating the n-grams.
The examples above hold when mode is "TF". If mode is "IDF", all counts larger than 1 are truncated to 1 and the i-th element in weights is used to scale (by multiplication) the count of the i-th n-gram in the pool. If mode is "TFIDF", this operator first computes the counts of all n-grams and then scales them by the associated values in the weights attribute.
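The three modes can be sketched with NumPy as follows. The function `apply_mode` and its arguments are hypothetical names used only for illustration: `counts` stands for the raw per-n-gram counts and `weights` for the weights attribute, both assumed to be in the same n-gram order.

```python
import numpy as np


def apply_mode(counts, weights, mode):
    """Hypothetical helper mirroring the TF / IDF / TFIDF description."""
    counts = np.asarray(counts, dtype=np.float32)
    weights = np.asarray(weights, dtype=np.float32)
    if mode == "TF":
        return counts  # raw counts
    if mode == "IDF":
        return np.minimum(counts, 1.0) * weights  # counts truncated to 1, then scaled
    if mode == "TFIDF":
        return counts * weights  # raw counts scaled by weights
    raise ValueError(f"unknown mode: {mode}")


counts = [0, 3, 1]
weights = [0.5, 2.0, 4.0]
tf = apply_mode(counts, weights, "TF")        # [0, 3, 1]
idf = apply_mode(counts, weights, "IDF")      # [0, 2, 4]
tfidf = apply_mode(counts, weights, "TFIDF")  # [0, 6, 4]
```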
Only one of pool_strings and pool_int64s can be set. If pool_int64s is set, the input should be an integer tensor. If pool_strings is set, the input must be a string tensor.
This version of the operator has been available since version 9 of the default ONNX operator set.
input = np.array([[1, 1, 3, 3, 3, 7], [8, 6, 7, 5, 6, 8]]).astype(np.int32)
output = np.array(
[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0]]
).astype(np.float32)
ngram_counts = np.array([0, 4]).astype(np.int64)
ngram_indexes = np.array([0, 1, 2, 3, 4, 5, 6]).astype(np.int64)
pool_int64s = np.array(
[2, 3, 5, 4,  # unigrams
5, 6, 7, 8, 6, 7]  # bigrams
).astype(np.int64)
helper = TfIdfVectorizerHelper(
mode="TF",
min_gram_length=2,
max_gram_length=2,
max_skip_count=0,
ngram_counts=ngram_counts,
ngram_indexes=ngram_indexes,
pool_int64s=pool_int64s,
)
node = helper.make_node_noweights()
expect(
node,
inputs=[input],
outputs=[output],
name="test_tfidfvectorizer_tf_batch_onlybigrams_skip0",
)
input = np.array([[1, 1, 3, 3, 3, 7], [8, 6, 7, 5, 6, 8]]).astype(np.int32)
output = np.array(
[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0]]
).astype(np.float32)
ngram_counts = np.array([0, 4]).astype(np.int64)
ngram_indexes = np.array([0, 1, 2, 3, 4, 5, 6]).astype(np.int64)
pool_int64s = np.array(
[2, 3, 5, 4,  # unigrams
5, 6, 7, 8, 6, 7]  # bigrams
).astype(np.int64)
helper = TfIdfVectorizerHelper(
mode="TF",
min_gram_length=2,
max_gram_length=2,
max_skip_count=5,
ngram_counts=ngram_counts,
ngram_indexes=ngram_indexes,
pool_int64s=pool_int64s,
)
node = helper.make_node_noweights()
expect(
node,
inputs=[input],
outputs=[output],
name="test_tfidfvectorizer_tf_batch_onlybigrams_skip5",
)
input = np.array([[1, 1, 3, 3, 3, 7], [8, 6, 7, 5, 6, 8]]).astype(np.int32)
output = np.array(
[[0.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0]]
).astype(np.float32)
ngram_counts = np.array([0, 4]).astype(np.int64)
ngram_indexes = np.array([0, 1, 2, 3, 4, 5, 6]).astype(np.int64)
pool_int64s = np.array(
[2, 3, 5, 4,  # unigrams
5, 6, 7, 8, 6, 7]  # bigrams
).astype(np.int64)
helper = TfIdfVectorizerHelper(
mode="TF",
min_gram_length=1,
max_gram_length=2,
max_skip_count=5,
ngram_counts=ngram_counts,
ngram_indexes=ngram_indexes,
pool_int64s=pool_int64s,
)
node = helper.make_node_noweights()
expect(
node,
inputs=[input],
outputs=[output],
name="test_tfidfvectorizer_tf_batch_uniandbigrams_skip5",
)
input = np.array([1, 1, 3, 3, 3, 7, 8, 6, 7, 5, 6, 8]).astype(np.int32)
output = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0]).astype(np.float32)
ngram_counts = np.array([0, 4]).astype(np.int64)
ngram_indexes = np.array([0, 1, 2, 3, 4, 5, 6]).astype(np.int64)
pool_int64s = np.array(
[2, 3, 5, 4,  # unigrams
5, 6, 7, 8, 6, 7]  # bigrams
).astype(np.int64)
helper = TfIdfVectorizerHelper(
mode="TF",
min_gram_length=2,
max_gram_length=2,
max_skip_count=0,
ngram_counts=ngram_counts,
ngram_indexes=ngram_indexes,
pool_int64s=pool_int64s,
)
node = helper.make_node_noweights()
expect(
node,
inputs=[input],
outputs=[output],
name="test_tfidfvectorizer_tf_only_bigrams_skip0",
)
input = np.array([1, 1, 3, 3, 3, 7, 8, 6, 7, 5, 6, 8]).astype(np.int32)
output = np.array([1.0, 1.0, 1.0]).astype(np.float32)
ngram_counts = np.array([0, 0]).astype(np.int64)
ngram_indexes = np.array([0, 1, 2]).astype(np.int64)
pool_int64s = np.array(
[5, 6, 7, 8, 6, 7]  # no unigrams; bigrams only
).astype(np.int64)
helper = TfIdfVectorizerHelper(
mode="TF",
min_gram_length=2,
max_gram_length=2,
max_skip_count=0,
ngram_counts=ngram_counts,
ngram_indexes=ngram_indexes,
pool_int64s=pool_int64s,
)
node = helper.make_node_noweights()
expect(
node,
inputs=[input],
outputs=[output],
name="test_tfidfvectorizer_tf_onlybigrams_levelempty",
)
input = np.array([1, 1, 3, 3, 3, 7, 8, 6, 7, 5, 6, 8]).astype(np.int32)
output = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 3.0, 1.0]).astype(np.float32)
ngram_counts = np.array([0, 4]).astype(np.int64)
ngram_indexes = np.array([0, 1, 2, 3, 4, 5, 6]).astype(np.int64)
pool_int64s = np.array(
[2, 3, 5, 4,  # unigrams
5, 6, 7, 8, 6, 7]  # bigrams
).astype(np.int64)
helper = TfIdfVectorizerHelper(
mode="TF",
min_gram_length=2,
max_gram_length=2,
max_skip_count=5,
ngram_counts=ngram_counts,
ngram_indexes=ngram_indexes,
pool_int64s=pool_int64s,
)
node = helper.make_node_noweights()
expect(
node,
inputs=[input],
outputs=[output],
name="test_tfidfvectorizer_tf_onlybigrams_skip5",
)
input = np.array([1, 1, 3, 3, 3, 7, 8, 6, 7, 5, 6, 8]).astype(np.int32)
output = np.array([0.0, 3.0, 1.0, 0.0, 1.0, 3.0, 1.0]).astype(np.float32)
ngram_counts = np.array([0, 4]).astype(np.int64)
ngram_indexes = np.array([0, 1, 2, 3, 4, 5, 6]).astype(np.int64)
pool_int64s = np.array(
[2, 3, 5, 4,  # unigrams
5, 6, 7, 8, 6, 7]  # bigrams
).astype(np.int64)
helper = TfIdfVectorizerHelper(
mode="TF",
min_gram_length=1,
max_gram_length=2,
max_skip_count=5,
ngram_counts=ngram_counts,
ngram_indexes=ngram_indexes,
pool_int64s=pool_int64s,
)
node = helper.make_node_noweights()
expect(
node,
inputs=[input],
outputs=[output],
name="test_tfidfvectorizer_tf_uniandbigrams_skip5",
)
ThresholdedRelu takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the rectified linear function, y = x for x > alpha, y = 0 otherwise, is applied to the tensor elementwise.
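As a compact illustration of the formula, the following NumPy sketch applies it elementwise. The function name `thresholded_relu` is a hypothetical helper, not an ONNX API.

```python
import numpy as np


def thresholded_relu(x, alpha=1.0):
    """Elementwise: y = x where x > alpha, otherwise 0."""
    return np.where(x > alpha, x, 0).astype(x.dtype)


x = np.array([-1.5, 0.0, 1.2, 2.0, 2.2], dtype=np.float32)
y = thresholded_relu(x, alpha=2.0)
# Only the last element exceeds alpha; note that x == alpha also maps to 0.
```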
This version of the operator has been available since version 22 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#ThresholdedRelu-10">10</a>
default_alpha = 1.0
node = onnx.helper.make_node("ThresholdedRelu", inputs=["x"], outputs=["y"])
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, default_alpha, np.inf)
y[y == default_alpha] = 0
expect(node, inputs=[x], outputs=[y], name="test_thresholdedrelu_default")
alpha = 2.0
node = onnx.helper.make_node(
"ThresholdedRelu", inputs=["x"], outputs=["y"], alpha=alpha
)
x = np.array([-1.5, 0.0, 1.2, 2.0, 2.2]).astype(np.float32)
y = np.clip(x, alpha, np.inf) # expected output [0., 0., 0., 0., 2.2]
y[y == alpha] = 0
expect(node, inputs=[x], outputs=[y], name="test_thresholdedrelu_example")
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, alpha, np.inf)
y[y == alpha] = 0
expect(node, inputs=[x], outputs=[y], name="test_thresholdedrelu")
Constructs a tensor by tiling a given tensor.
This is the same as the tile function in NumPy, but without broadcasting.
For example A = [[1, 2], [3, 4]], B = [1, 2], tile(A, B) = [[1, 2, 1, 2], [3, 4, 3, 4]]
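The example above can be reproduced with NumPy. This sketch passes exactly one repeat count per axis of `A`, in line with the "no broadcast" remark; the variable names are chosen here for illustration.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([1, 2], dtype=np.int64)  # one repeat count per axis of A
tiled = np.tile(A, B)
print(tiled.tolist())  # [[1, 2, 1, 2], [3, 4, 3, 4]]
```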
This version of the operator has been available since version 13 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Tile-1">1</a>, <a href="Changelog.md#Tile-6">6</a>
node = onnx.helper.make_node("Tile", inputs=["x", "y"], outputs=["z"])
x = np.random.rand(2, 3, 4, 5).astype(np.float32)
repeats = np.random.randint(low=1, high=10, size=(np.ndim(x),)).astype(np.int64)
z = np.tile(x, repeats)
expect(node, inputs=[x, repeats], outputs=[z], name="test_tile")
node = onnx.helper.make_node("Tile", inputs=["x", "y"], outputs=["z"])
x = np.array([[0, 1], [2, 3]], dtype=np.float32)
repeats = np.array([2, 2], dtype=np.int64)
z = np.array(
[[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 0, 1], [2, 3, 2, 3]], dtype=np.float32
)
expect(node, inputs=[x, repeats], outputs=[z], name="test_tile_precomputed")
Retrieve the top-K largest or smallest elements along a specified axis. Given an input tensor of shape [a_0, a_1, ..., a_{n-1}] and integer argument k, return two outputs:
* Value tensor of shape [a_0, a_1, ..., a_{axis-1}, k, a_{axis+1}, ... a_{n-1}] which contains the values of the top k elements along the specified axis.
* Index tensor of shape [a_0, a_1, ..., a_{axis-1}, k, a_{axis+1}, ... a_{n-1}] which contains the indices of the top k elements (original indices from the input tensor).
If "largest" is 1 (the default value), the k largest elements are returned; if it is 0, the k smallest elements are returned.
If "sorted" is 1 (the default value), the resulting k elements are sorted.
If "sorted" is 0, the order of the returned 'Values' and 'Indices' is undefined.
Given two equivalent values, this operator uses the indices along the axis as a tiebreaker. That is, the element with the lower index will appear first.
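The sorted case and its tiebreaker can be sketched with a stable sort in NumPy. The function `topk_sketch` is a hypothetical helper written for this illustration; it is not the reference implementation used by the tests, and it assumes a signed or floating-point input because it negates the values.

```python
import numpy as np


def topk_sketch(x, k, axis=-1, largest=True):
    """Hypothetical sketch: sorted top-k with lower-index-first tie-breaking."""
    # A stable sort keeps equal keys in index order; negating the keys turns an
    # ascending sort into a descending one without disturbing that order.
    keys = -x if largest else x
    order = np.argsort(keys, axis=axis, kind="stable")
    indices = np.take(order, np.arange(k), axis=axis)
    values = np.take_along_axis(x, indices, axis=axis)
    return values, indices


x = np.array([[0, 1, 2, 3], [4, 5, 6, 7], [2, 2, 1, 1]], dtype=np.float32)
top_values, top_indices = topk_sketch(x, k=3, axis=1, largest=True)
low_values, low_indices = topk_sketch(x, k=3, axis=1, largest=False)
```

For the last row `[2, 2, 1, 1]`, the two equal values `2` come back with indices `0, 1` in that order, and the smallest-first call returns indices `2, 3, 0`.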
This version of the operator has been available since version 24 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#TopK-1">1</a>, <a href="Changelog.md#TopK-10">10</a>, <a href="Changelog.md#TopK-11">11</a>
axis = 1
largest = 1
k = 3
node = onnx.helper.make_node(
"TopK", inputs=["x", "k"], outputs=["values", "indices"], axis=axis
)
X = np.array(
[
[0, 1, 2, 3],
[4, 5, 6, 7],
[8, 9, 10, 11],
],
dtype=np.float32,
)
K = np.array([k], dtype=np.int64)
values_ref, indices_ref = topk_sorted_implementation(X, k, axis, largest)
# print(values_ref)
# [[ 3. 2. 1.]
# [ 7. 6. 5.]
# [11. 10. 9.]]
# print(indices_ref)
# [[3 2 1]
# [3 2 1]
# [3 2 1]]
expect(
node, inputs=[X, K], outputs=[values_ref, indices_ref], name="test_top_k"
)
axis = -1
largest = 1
k = 3
node = onnx.helper.make_node(
"TopK", inputs=["x", "k"], outputs=["values", "indices"], axis=axis
)
X = np.array(
[
[0, 1, 2, 3],
[4, 5, 6, 7],
[8, 9, 10, 11],
],
dtype=np.float32,
)
K = np.array([k], dtype=np.int64)
values_ref, indices_ref = topk_sorted_implementation(X, k, axis, largest)
# print(values_ref)
# [[ 3. 2. 1.]
# [ 7. 6. 5.]
# [11. 10. 9.]]
# print(indices_ref)
# [[3 2 1]
# [3 2 1]
# [3 2 1]]
expect(
node,
inputs=[X, K],
outputs=[values_ref, indices_ref],
name="test_top_k_negative_axis",
)
axis = 0
largest = 0
k = 3
node = onnx.helper.make_node(
"TopK", inputs=["x", "k"], outputs=["values", "indices"], axis=axis
)
X = np.array(
[0, 0, 0, 0],
dtype=np.int64,
)
K = np.array([k], dtype=np.int64)
values_ref, indices_ref = topk_sorted_implementation(X, k, axis, largest)
# print(values_ref)
# [0 0 0]
# print(indices_ref)
# [0 1 2]
expect(
node,
inputs=[X, K],
outputs=[values_ref, indices_ref],
name="test_top_k_same_values",
)
axis = 1
largest = 1
k = 3
node = onnx.helper.make_node(
"TopK", inputs=["x", "k"], outputs=["values", "indices"], axis=axis
)
X = np.array(
[[0, 0, 0, 0], [1, 1, 1, 1], [2, 2, 1, 1]],
dtype=np.int64,
)
K = np.array([k], dtype=np.int64)
values_ref, indices_ref = topk_sorted_implementation(X, k, axis, largest)
# print(values_ref)
# [[0 0 0]
# [1 1 1]
# [2 2 1]]
# print(indices_ref)
# [[0 1 2]
# [0 1 2]
# [0 1 2]]
expect(
node,
inputs=[X, K],
outputs=[values_ref, indices_ref],
name="test_top_k_same_values_2d",
)
axis = 0
largest = 1
k = 3
node = onnx.helper.make_node(
"TopK", inputs=["x", "k"], outputs=["values", "indices"], axis=axis
)
X = np.array(
[0, 0, 0, 0],
dtype=np.int64,
)
K = np.array([k], dtype=np.int64)
values_ref, indices_ref = topk_sorted_implementation(X, k, axis, largest)
# print(values_ref)
# [0 0 0]
# print(indices_ref)
# [0 1 2]
expect(
node,
inputs=[X, K],
outputs=[values_ref, indices_ref],
name="test_top_k_same_values_largest",
)
axis = 1
largest = 0
sorted_ = 1
k = 3
node = onnx.helper.make_node(
"TopK",
inputs=["x", "k"],
outputs=["values", "indices"],
axis=axis,
largest=largest,
sorted=sorted_,
)
X = np.array(
[
[0, 1, 2, 3],
[4, 5, 6, 7],
[11, 10, 9, 8],
],
dtype=np.float32,
)
K = np.array([k], dtype=np.int64)
values_ref, indices_ref = topk_sorted_implementation(X, k, axis, largest)
# print(values_ref)
# [[ 0. 1. 2.]
# [ 4. 5. 6.]
# [ 8. 9. 10.]]
# print(indices_ref)
# [[0 1 2]
# [0 1 2]
# [3 2 1]]
expect(
node,
inputs=[X, K],
outputs=[values_ref, indices_ref],
name="test_top_k_smallest",
)
axis = 1
largest = 1
k = 3
node = onnx.helper.make_node(
"TopK", inputs=["x", "k"], outputs=["values", "indices"], axis=axis
)
X = np.array(
[
[0, 1, 2, 3],
[4, 5, 6, 7],
[8, 9, 10, 11],
],
dtype=np.uint64,
)
K = np.array([k], dtype=np.int64)
values_ref, indices_ref = topk_sorted_implementation(X, k, axis, largest)
# print(values_ref)
# [[ 3 2 1]
# [ 7 6 5]
# [11 10 9]]
# print(indices_ref)
# [[3 2 1]
# [3 2 1]
# [3 2 1]]
expect(
node,
inputs=[X, K],
outputs=[values_ref, indices_ref],
name="test_top_k_uint64",
)
Returns a transpose of the input tensor. (Similar to numpy.transpose).
The optional attribute perm must be a permutation of the dimensions of
the input tensor. Axis i of the output tensor corresponds to the axis
perm[i] of the input tensor.
For example, when perm=(1, 0, 2), given an input tensor of shape (1, 2, 3),
the output shape will be (2, 1, 3).
When perm=(1, 2, 0), given an input tensor of shape (1, 2, 3),
the output shape will be (2, 3, 1).
If the attribute perm is omitted, its default value is (n-1, ..., 0),
where n is the rank of the input tensor.
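Since the operator is described as similar to numpy.transpose, the shape examples above can be checked directly in NumPy. The variable names below are illustrative only.

```python
import numpy as np

data = np.zeros((1, 2, 3), dtype=np.float32)

shape_102 = np.transpose(data, (1, 0, 2)).shape  # (2, 1, 3)
shape_120 = np.transpose(data, (1, 2, 0)).shape  # (2, 3, 1)
shape_default = np.transpose(data).shape  # (3, 2, 1): axes reversed
```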
This version of the operator has been available since version 25 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Transpose-1">1</a>, <a href="Changelog.md#Transpose-13">13</a>, <a href="Changelog.md#Transpose-21">21</a>, <a href="Changelog.md#Transpose-23">23</a>, <a href="Changelog.md#Transpose-24">24</a>
shape = (2, 3, 4)
data = np.random.random_sample(shape).astype(np.float32)
permutations = list(itertools.permutations(np.arange(len(shape))))
for i, permutation in enumerate(permutations):
    node = onnx.helper.make_node(
        "Transpose",
        inputs=["data"],
        outputs=["transposed"],
        perm=permutation,
    )
    transposed = np.transpose(data, permutation)
    expect(
        node,
        inputs=[data],
        outputs=[transposed],
        name=f"test_transpose_all_permutations_{i}",
    )
shape = (2, 3, 4)
data = np.random.random_sample(shape).astype(np.float32)
node = onnx.helper.make_node(
"Transpose", inputs=["data"], outputs=["transposed"]
)
transposed = np.transpose(data)
expect(node, inputs=[data], outputs=[transposed], name="test_transpose_default")
Given a 2-D matrix or batches of 2-D matrices, returns the upper or lower triangular part of the tensor(s). The attribute "upper" determines whether the upper or lower part is retained. If set to true, the upper triangular matrix is retained. Lower triangular matrix is retained otherwise. Default value for the "upper" attribute is true.
Trilu takes one input tensor of shape [*, N, M], where * is zero or more batch dimensions. The upper triangular part consists of the elements on and above the given diagonal (k). The lower triangular part consists of elements on and below the diagonal. All other elements in the matrix are set to zero.
If k = 0, the triangular part on and above/below the main diagonal is retained.
If upper is set to true, a positive k retains the upper triangular matrix excluding the main diagonal and (k-1) diagonals above it. A negative k value retains the main diagonal and |k| diagonals below it.
If upper is set to false, a positive k retains the lower triangular matrix including the main diagonal and k diagonals above it. A negative k value excludes the main diagonal and (|k|-1) diagonals below it.
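The effect of `upper` and `k` can be sketched with `np.triu` and `np.tril`, which operate on the last two axes of their input. The wrapper `trilu_sketch` is a hypothetical helper for illustration, not the reference implementation used by the tests.

```python
import numpy as np


def trilu_sketch(x, k=0, upper=True):
    """Hypothetical sketch built on np.triu / np.tril (applied to the last two axes)."""
    return np.triu(x, k) if upper else np.tril(x, k)


x = np.array(
    [[4, 7, 3, 7, 9],
     [1, 2, 8, 6, 9],
     [9, 4, 1, 8, 7],
     [4, 3, 4, 2, 4]],
    dtype=np.int64,
)
upper_k2 = trilu_sketch(x, k=2, upper=True)      # drops the main diagonal and the one above it
lower_neg1 = trilu_sketch(x, k=-1, upper=False)  # drops the main diagonal as well
```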
This version of the operator has been available since version 14 of the default ONNX operator set.
node = onnx.helper.make_node(
"Trilu",
inputs=["x"],
outputs=["y"],
upper=0,
)
x = np.random.randint(10, size=(4, 5)).astype(np.int64)
# X:
# [[4, 7, 3, 7, 9],
# [1, 2, 8, 6, 9],
# [9, 4, 1, 8, 7],
# [4, 3, 4, 2, 4]]
# expect result:
# [[4, 0, 0, 0, 0],
# [1, 2, 0, 0, 0],
# [9, 4, 1, 0, 0],
# [4, 3, 4, 2, 0]]
y = tril_reference_implementation(x)
expect(node, inputs=[x], outputs=[y], name="test_tril")
node = onnx.helper.make_node(
"Trilu",
inputs=["x", "k"],
outputs=["y"],
upper=0,
)
x = np.random.randint(10, size=(4, 5)).astype(np.int64)
k = np.array(-1).astype(np.int64)
# X:
# [[4, 7, 3, 7, 9],
# [1, 2, 8, 6, 9],
# [9, 4, 1, 8, 7],
# [4, 3, 4, 2, 4]]
# expect result:
# [[0, 0, 0, 0, 0],
# [1, 0, 0, 0, 0],
# [9, 4, 0, 0, 0],
# [4, 3, 4, 0, 0]]
y = tril_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_tril_neg")
node = onnx.helper.make_node(
"Trilu",
inputs=["x"],
outputs=["y"],
upper=0,
)
x = np.random.randint(10, size=(3, 1, 5)).astype(np.int64)
# X:
# [[[6, 2, 4, 1, 6]],
#
# [[8, 3, 8, 7, 0]],
#
# [[2, 2, 9, 5, 9]]]
# expect result:
# [[[6, 0, 0, 0, 0]],
#
# [[8, 0, 0, 0, 0]],
#
# [[2, 0, 0, 0, 0]]]
y = tril_reference_implementation(x)
expect(node, inputs=[x], outputs=[y], name="test_tril_one_row_neg")
node = onnx.helper.make_node(
"Trilu",
inputs=["x", "k"],
outputs=["y"],
upper=0,
)
x = np.random.randint(10, size=(4, 5)).astype(np.int64)
k = np.array(-7).astype(np.int64)
# X:
# [[4, 7, 3, 7, 9],
# [1, 2, 8, 6, 9],
# [9, 4, 1, 8, 7],
# [4, 3, 4, 2, 4]]
# expect result:
# [[0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0]]
y = tril_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_tril_out_neg")
node = onnx.helper.make_node(
"Trilu",
inputs=["x", "k"],
outputs=["y"],
upper=0,
)
x = np.random.randint(10, size=(4, 5)).astype(np.int64)
k = np.array(6).astype(np.int64)
# X:
# [[4, 7, 3, 7, 9],
# [1, 2, 8, 6, 9],
# [9, 4, 1, 8, 7],
# [4, 3, 4, 2, 4]]
# expect result:
# [[4, 7, 3, 7, 9],
# [1, 2, 8, 6, 9],
# [9, 4, 1, 8, 7],
# [4, 3, 4, 2, 4]]
y = tril_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_tril_out_pos")
node = onnx.helper.make_node(
"Trilu",
inputs=["x", "k"],
outputs=["y"],
upper=0,
)
x = np.random.randint(10, size=(4, 5)).astype(np.int64)
k = np.array(2).astype(np.int64)
# X:
# [[4, 7, 3, 7, 9],
# [1, 2, 8, 6, 9],
# [9, 4, 1, 8, 7],
# [4, 3, 4, 2, 4]]
# expect result:
# [[4, 7, 3, 0, 0],
# [1, 2, 8, 6, 0],
# [9, 4, 1, 8, 7],
# [4, 3, 4, 2, 4]]
y = tril_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_tril_pos")
node = onnx.helper.make_node(
"Trilu",
inputs=["x"],
outputs=["y"],
upper=0,
)
x = np.random.randint(10, size=(2, 3, 3)).astype(np.int64)
# X:
# [[[0, 4, 3],
# [2, 0, 9],
# [8, 2, 5]],
#
# [[2, 7, 2],
# [2, 6, 0],
# [2, 6, 5]]]
# expect result:
# [[[0, 0, 0],
# [2, 0, 0],
# [8, 2, 5]],
#
# [[2, 0, 0],
# [2, 6, 0],
# [2, 6, 5]]]
y = tril_reference_implementation(x)
expect(node, inputs=[x], outputs=[y], name="test_tril_square")
node = onnx.helper.make_node(
"Trilu",
inputs=["x", "k"],
outputs=["y"],
upper=0,
)
x = np.random.randint(10, size=(2, 3, 3)).astype(np.int64)
k = np.array(-1).astype(np.int64)
# X:
# [[[0, 4, 3],
# [2, 0, 9],
# [8, 2, 5]],
#
# [[2, 7, 2],
# [2, 6, 0],
# [2, 6, 5]]]
# expect result:
# [[[0, 0, 0],
# [2, 0, 0],
# [8, 2, 0]],
#
# [[0, 0, 0],
# [2, 0, 0],
# [2, 6, 0]]]
y = tril_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_tril_square_neg")
node = onnx.helper.make_node(
"Trilu",
inputs=["x", "k"],
outputs=["y"],
upper=0,
)
x = np.random.randint(10, size=(3, 0, 5)).astype(np.int64)
k = np.array(6).astype(np.int64)
# X:
# []
# expect result:
# []
y = tril_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_tril_zero")
node = onnx.helper.make_node(
"Trilu",
inputs=["x"],
outputs=["y"],
)
x = np.random.randint(10, size=(4, 5)).astype(np.int64)
# X:
# [[4, 7, 3, 7, 9],
# [1, 2, 8, 6, 9],
# [9, 4, 0, 8, 7],
# [4, 3, 4, 2, 4]]
# expect result:
# [[4, 7, 3, 7, 9],
# [0, 2, 8, 6, 9],
# [0, 0, 0, 8, 7],
# [0, 0, 0, 2, 4]]
y = triu_reference_implementation(x)
expect(node, inputs=[x], outputs=[y], name="test_triu")
node = onnx.helper.make_node(
"Trilu",
inputs=["x", "k"],
outputs=["y"],
)
x = np.random.randint(10, size=(4, 5)).astype(np.int64)
k = np.array(-1).astype(np.int64)
# X:
# [[4, 7, 3, 7, 9],
# [1, 2, 8, 6, 9],
# [9, 4, 0, 8, 7],
# [4, 3, 4, 2, 4]]
# expect result:
# [[4, 7, 3, 7, 9],
# [1, 2, 8, 6, 9],
# [0, 4, 0, 8, 7],
# [0, 0, 4, 2, 4]]
y = triu_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_triu_neg")
node = onnx.helper.make_node(
"Trilu",
inputs=["x", "k"],
outputs=["y"],
)
x = np.random.randint(10, size=(3, 1, 5)).astype(np.int64)
k = np.array(1).astype(np.int64)
# X:
# [[[1, 4, 9, 7, 1]],
#
# [[9, 2, 8, 8, 4]],
#
# [[3, 9, 7, 4, 2]]]
# expect result:
# [[[0, 4, 9, 7, 1]],
#
# [[0, 2, 8, 8, 4]],
#
# [[0, 9, 7, 4, 2]]]
y = triu_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_triu_one_row")
node = onnx.helper.make_node(
"Trilu",
inputs=["x", "k"],
outputs=["y"],
)
x = np.random.randint(10, size=(4, 5)).astype(np.int64)
k = np.array(-7).astype(np.int64)
# X:
# [[4, 7, 3, 7, 9],
# [1, 2, 8, 6, 9],
# [9, 4, 0, 8, 7],
# [4, 3, 4, 2, 4]]
# expect result:
# [[4, 7, 3, 7, 9],
# [1, 2, 8, 6, 9],
# [9, 4, 0, 8, 7],
# [4, 3, 4, 2, 4]]
y = triu_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_triu_out_neg_out")
node = onnx.helper.make_node(
"Trilu",
inputs=["x", "k"],
outputs=["y"],
)
x = np.random.randint(10, size=(4, 5)).astype(np.int64)
k = np.array(6).astype(np.int64)
# X:
# [[4, 7, 3, 7, 9],
# [1, 2, 8, 6, 9],
# [9, 4, 0, 8, 7],
# [4, 3, 4, 2, 4]]
# expect result:
# [[0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0]]
y = triu_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_triu_out_pos")
node = onnx.helper.make_node(
"Trilu",
inputs=["x", "k"],
outputs=["y"],
)
x = np.random.randint(10, size=(4, 5)).astype(np.int64)
k = np.array(2).astype(np.int64)
# X:
# [[4, 7, 3, 7, 9],
# [1, 2, 8, 6, 9],
# [9, 4, 0, 8, 7],
# [4, 3, 4, 2, 4]]
# expect result:
# [[0, 0, 3, 7, 9],
# [0, 0, 0, 6, 9],
# [0, 0, 0, 0, 7],
# [0, 0, 0, 0, 0]]
y = triu_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_triu_pos")
node = onnx.helper.make_node(
"Trilu",
inputs=["x"],
outputs=["y"],
)
x = np.random.randint(10, size=(2, 3, 3)).astype(np.int64)
y = triu_reference_implementation(x)
# X:
# [[[4, 6, 9],
# [7, 5, 4],
# [8, 1, 2]],
#
# [[1, 4, 9],
# [9, 6, 3],
# [8, 9, 8]]]
# expect result:
# [[[4, 6, 9],
# [0, 5, 4],
# [0, 0, 2]],
#
# [[1, 4, 9],
# [0, 6, 3],
# [0, 0, 8]]]
expect(node, inputs=[x], outputs=[y], name="test_triu_square")
node = onnx.helper.make_node(
"Trilu",
inputs=["x", "k"],
outputs=["y"],
)
x = np.random.randint(10, size=(2, 3, 3)).astype(np.int64)
k = np.array(-1).astype(np.int64)
# X:
# [[[4, 6, 9],
# [7, 5, 4],
# [8, 1, 2]],
#
# [[1, 4, 9],
# [9, 6, 3],
# [8, 9, 8]]]
# expect result:
# [[[4, 6, 9],
# [7, 5, 4],
# [0, 1, 2]],
#
# [[1, 4, 9],
# [9, 6, 3],
# [0, 9, 8]]]
y = triu_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_triu_square_neg")
node = onnx.helper.make_node(
"Trilu",
inputs=["x", "k"],
outputs=["y"],
)
x = np.random.randint(10, size=(0, 5)).astype(np.int64)
k = np.array(6).astype(np.int64)
# X:
# []
# expect result:
# []
y = triu_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_triu_zero")
Find the unique elements of a tensor. When an optional attribute 'axis' is provided, unique subtensors sliced along the 'axis' are returned. Otherwise the input tensor is flattened and unique values of the flattened tensor are returned.
This operator returns the unique values or sliced unique subtensors of the input tensor and three optional outputs. The first output tensor 'Y' contains all unique values or subtensors of the input. The second optional output tensor 'indices' contains the index in 'X' of the first occurrence of each element of 'Y'. The third optional output tensor 'inverse_indices' contains, for each element of 'X', the index of the corresponding element in 'Y'. The fourth optional output tensor 'counts' contains the count of each element of 'Y' in the input.
Outputs are either sorted in ascending order or optionally in the order of the first occurrence of the values in the input.
https://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html
Example 1:
input_X = [2, 1, 1, 3, 4, 3]
attribute_sorted = 0
attribute_axis = None
output_Y = [2, 1, 3, 4]
output_indices = [0, 1, 3, 4]
output_inverse_indices = [0, 1, 1, 2, 3, 2]
output_counts = [1, 2, 2, 1]
Example 2:
input_X = [[1, 3], [2, 3]]
attribute_sorted = 1
attribute_axis = None
output_Y = [1, 2, 3]
output_indices = [0, 2, 1]
output_inverse_indices = [0, 2, 1, 2]
output_counts = [1, 1, 2]
Example 3:
input_X = [[1, 0, 0], [1, 0, 0], [2, 3, 4]]
attribute_sorted = 1
attribute_axis = 0
output_Y = [[1, 0, 0], [2, 3, 4]]
output_indices = [0, 2]
output_inverse_indices = [0, 0, 1]
output_counts = [2, 1]
Example 4:
input_x = [[[1., 1.], [0., 1.], [2., 1.], [0., 1.]],
[[1., 1.], [0., 1.], [2., 1.], [0., 1.]]]
attribute_sorted = 1
attribute_axis = 1
intermediate data are presented below for better understanding: there are 4 subtensors sliced along axis 1 of input_x (shape = (2, 4, 2)):
A: [[1, 1], [1, 1]],
[[0, 1], [0, 1]],
[[2, 1], [2, 1]],
[[0, 1], [0, 1]].
there are 3 unique subtensors:
[[1, 1], [1, 1]],
[[0, 1], [0, 1]],
[[2, 1], [2, 1]].
sorted unique subtensors:
B: [[0, 1], [0, 1]],
[[1, 1], [1, 1]],
[[2, 1], [2, 1]].
output_Y is constructed from B:
[[[0. 1.], [1. 1.], [2. 1.]],
[[0. 1.], [1. 1.], [2. 1.]]]
output_indices is to map from B to A:
[1, 0, 2]
output_inverse_indices is to map from A to B:
[1, 0, 2, 0]
output_counts:
[2, 1, 1]
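The relationships among the four outputs can be checked with `np.unique`, here on the input of Example 2 (sorted, no axis). This is only a sketch; the names `first_occurrences` and `reconstructed` are introduced here for illustration.

```python
import numpy as np

x = np.array([[1, 3], [2, 3]])  # input_X from Example 2
y, indices, inverse_indices, counts = np.unique(
    x, return_index=True, return_inverse=True, return_counts=True
)
# Some NumPy 2.x releases return inverse indices shaped like the input; flatten them.
inverse_indices = inverse_indices.reshape(-1)
flat = x.reshape(-1)

first_occurrences = flat[indices]   # picking the first occurrences yields y
reconstructed = y[inverse_indices]  # mapping back through y yields the flattened input
```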
This version of the operator has been available since version 11 of the default ONNX operator set.
node_sorted = onnx.helper.make_node(
"Unique",
inputs=["X"],
outputs=["Y", "indices", "inverse_indices", "counts"],
sorted=1,
)
x = np.array([0], dtype=np.int64)
y, indices, inverse_indices, counts = np.unique(x, True, True, True)
indices, inverse_indices, counts = specify_int64(
indices, inverse_indices, counts
)
# behavior changed with numpy >= 2.0
inverse_indices = inverse_indices.reshape(-1)
# print(y)
# [0]
# print(indices)
# [0]
# print(inverse_indices)
# [0]
# print(counts)
# [1]
expect(
node_sorted,
inputs=[x],
outputs=[y, indices, inverse_indices, counts],
name="test_unique_length_1",
)
node_not_sorted = onnx.helper.make_node(
"Unique",
inputs=["X"],
outputs=["Y", "indices", "inverse_indices", "counts"],
sorted=0,
)
# numpy unique does not retain original order (it sorts the output unique values)
# https://github.com/numpy/numpy/issues/8621
# we need to recover unsorted output and indices
x = np.array([2.0, 1.0, 1.0, 3.0, 4.0, 3.0], dtype=np.float32)
y, indices, inverse_indices, counts = np.unique(x, True, True, True)
# prepare index mapping from sorted to unsorted
argsorted_indices = np.argsort(indices)
inverse_indices_map = dict(
zip(argsorted_indices, np.arange(len(argsorted_indices)), strict=True)
)
indices = indices[argsorted_indices]
y = np.take(x, indices, axis=0)
inverse_indices = np.asarray(
[inverse_indices_map[i] for i in inverse_indices], dtype=np.int64
)
counts = counts[argsorted_indices]
indices, inverse_indices, counts = specify_int64(
indices, inverse_indices, counts
)
# print(y)
# [2.0, 1.0, 3.0, 4.0]
# print(indices)
# [0 1 3 4]
# print(inverse_indices)
# [0, 1, 1, 2, 3, 2]
# print(counts)
# [1, 2, 2, 1]
expect(
node_not_sorted,
inputs=[x],
outputs=[y, indices, inverse_indices, counts],
name="test_unique_not_sorted_without_axis",
)
node_sorted = onnx.helper.make_node(
"Unique",
inputs=["X"],
outputs=["Y", "indices", "inverse_indices", "counts"],
sorted=1,
axis=0,
)
x = np.array([[1, 0, 0], [1, 0, 0], [2, 3, 4]], dtype=np.float32)
y, indices, inverse_indices, counts = np.unique(x, True, True, True, axis=0)
indices, inverse_indices, counts = specify_int64(
indices, inverse_indices, counts
)
# behavior changed with numpy >= 2.0
inverse_indices = inverse_indices.reshape(-1)
# print(y)
# [[1. 0. 0.]
# [2. 3. 4.]]
# print(indices)
# [0 2]
# print(inverse_indices)
# [0 0 1]
# print(counts)
# [2 1]
expect(
node_sorted,
inputs=[x],
outputs=[y, indices, inverse_indices, counts],
name="test_unique_sorted_with_axis",
)
node_sorted = onnx.helper.make_node(
"Unique",
inputs=["X"],
outputs=["Y", "indices", "inverse_indices", "counts"],
sorted=1,
axis=1,
)
x = np.array(
[
[[1.0, 1.0], [0.0, 1.0], [2.0, 1.0], [0.0, 1.0]],
[[1.0, 1.0], [0.0, 1.0], [2.0, 1.0], [0.0, 1.0]],
],
dtype=np.float32,
)
y, indices, inverse_indices, counts = np.unique(x, True, True, True, axis=1)
indices, inverse_indices, counts = specify_int64(
indices, inverse_indices, counts
)
# behavior changed with numpy >= 2.0
inverse_indices = inverse_indices.reshape(-1)
# print(y)
# [[[0. 1.]
# [1. 1.]
# [2. 1.]]
# [[0. 1.]
# [1. 1.]
# [2. 1.]]]
# print(indices)
# [1 0 2]
# print(inverse_indices)
# [1 0 2 0]
# print(counts)
# [2 1 1]
expect(
node_sorted,
inputs=[x],
outputs=[y, indices, inverse_indices, counts],
name="test_unique_sorted_with_axis_3d",
)
node_sorted = onnx.helper.make_node(
"Unique",
inputs=["X"],
outputs=["Y", "indices", "inverse_indices", "counts"],
sorted=1,
axis=-1,
)
x = np.array([[1, 0, 0], [1, 0, 0], [2, 3, 3]], dtype=np.float32)
y, indices, inverse_indices, counts = np.unique(x, True, True, True, axis=-1)
indices, inverse_indices, counts = specify_int64(
indices, inverse_indices, counts
)
# behavior changed with numpy >= 2.0
inverse_indices = inverse_indices.reshape(-1)
# print(y)
# [[0. 1.]
# [0. 1.]
# [3. 2.]]
# print(indices)
# [1 0]
# print(inverse_indices)
# [1 0 0]
# print(counts)
# [2 1]
expect(
node_sorted,
inputs=[x],
outputs=[y, indices, inverse_indices, counts],
name="test_unique_sorted_with_negative_axis",
)
node_sorted = onnx.helper.make_node(
"Unique",
inputs=["X"],
outputs=["Y", "indices", "inverse_indices", "counts"],
)
x = np.array([2.0, 1.0, 1.0, 3.0, 4.0, 3.0], dtype=np.float32)
y, indices, inverse_indices, counts = np.unique(x, True, True, True)
indices, inverse_indices, counts = specify_int64(
indices, inverse_indices, counts
)
expect(
node_sorted,
inputs=[x],
outputs=[y, indices, inverse_indices, counts],
name="test_unique_sorted_without_axis",
)
Insert single-dimensional entries to the shape of an input tensor (data).
Takes one required input, axes, which contains a list of dimension indices; the operator inserts a dimension of size 1 at each corresponding index of the output tensor (expanded).
For example, given an input tensor (data) of shape [3, 4, 5], then
Unsqueeze(data, axes=[0, 4]) outputs a tensor (expanded) containing the same data as data but with shape [1, 3, 4, 5, 1].
The input axes should not contain any duplicate entries. It is an error if it contains duplicates.
The rank of the output tensor (output_rank) is the rank of the input tensor (data) plus the number of values in axes.
Each value in axes should be within the (inclusive) range [-output_rank , output_rank - 1].
The order of the values in axes does not matter.
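The axis rules above can be sketched in NumPy. This is a minimal illustration, not the reference implementation: the helper name `unsqueeze_sketch` is hypothetical, and treating two axes that normalize to the same output position as duplicates is an assumption.

```python
import numpy as np


def unsqueeze_sketch(data, axes):
    """Insert size-1 dimensions at the given output positions (illustrative only)."""
    output_rank = data.ndim + len(axes)
    normalized = []
    for axis in axes:
        # Each axis must lie in the inclusive range [-output_rank, output_rank - 1].
        if not -output_rank <= axis <= output_rank - 1:
            raise ValueError(f"axis {axis} is out of range for output rank {output_rank}")
        normalized.append(axis % output_rank)
    # Assumption: axes that map to the same position count as duplicates.
    if len(set(normalized)) != len(normalized):
        raise ValueError("axes must not contain duplicates")
    expanded = data
    # Inserting in ascending order makes the result independent of the input order.
    for axis in sorted(normalized):
        expanded = np.expand_dims(expanded, axis=axis)
    return expanded


data = np.zeros((3, 4, 5), dtype=np.float32)
print(unsqueeze_sketch(data, [0, 4]).shape)  # (1, 3, 4, 5, 1)
```

Sorting the normalized axes first is one simple way to honor the rule that the order of the values in axes is irrelevant.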
This version of the operator has been available since version 25 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Unsqueeze-1">1</a>, <a href="Changelog.md#Unsqueeze-11">11</a>, <a href="Changelog.md#Unsqueeze-13">13</a>, <a href="Changelog.md#Unsqueeze-21">21</a>, <a href="Changelog.md#Unsqueeze-23">23</a>, <a href="Changelog.md#Unsqueeze-24">24</a>
node = onnx.helper.make_node(
"Unsqueeze",
inputs=["x", "axes"],
outputs=["y"],
)
x = np.random.randn(1, 3, 1, 5).astype(np.float32)
axes = np.array([-2]).astype(np.int64)
y = np.expand_dims(x, axis=-2)
expect(node, inputs=[x, axes], outputs=[y], name="test_unsqueeze_negative_axes")
x = np.random.randn(3, 4, 5).astype(np.float32)
for i in range(x.ndim):
    axes = np.array([i]).astype(np.int64)
    node = onnx.helper.make_node(
        "Unsqueeze",
        inputs=["x", "axes"],
        outputs=["y"],
    )
    y = np.expand_dims(x, axis=i)
    expect(
        node,
        inputs=[x, axes],
        outputs=[y],
        name="test_unsqueeze_axis_" + str(i),
    )
x = np.random.randn(3, 4, 5).astype(np.float32)
axes = np.array([2, 4, 5]).astype(np.int64)
node = onnx.helper.make_node(
"Unsqueeze",
inputs=["x", "axes"],
outputs=["y"],
)
y = np.expand_dims(x, axis=2)
y = np.expand_dims(y, axis=4)
y = np.expand_dims(y, axis=5)
expect(node, inputs=[x, axes], outputs=[y], name="test_unsqueeze_three_axes")
x = np.random.randn(3, 4, 5).astype(np.float32)
axes = np.array([1, 4]).astype(np.int64)
node = onnx.helper.make_node(
"Unsqueeze",
inputs=["x", "axes"],
outputs=["y"],
)
y = np.expand_dims(x, axis=1)
y = np.expand_dims(y, axis=4)
expect(node, inputs=[x, axes], outputs=[y], name="test_unsqueeze_two_axes")
x = np.random.randn(3, 4, 5).astype(np.float32)
axes = np.array([5, 4, 2]).astype(np.int64)
node = onnx.helper.make_node(
"Unsqueeze",
inputs=["x", "axes"],
outputs=["y"],
)
y = np.expand_dims(x, axis=2)
y = np.expand_dims(y, axis=4)
y = np.expand_dims(y, axis=5)
expect(node, inputs=[x, axes], outputs=[y], name="test_unsqueeze_unsorted_axes")
Upsample the input tensor. Each dimension value of the output tensor is: output_dimension = floor(input_dimension * scale).
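The output-shape rule can be illustrated with a small NumPy sketch for the "nearest" mode. The helper name `upsample_nearest_sketch` is hypothetical, and mapping output index i to input index floor(i / scale) is an assumption that agrees with the documented example for integer scales; it is not taken from the operator definition.

```python
import numpy as np


def upsample_nearest_sketch(data, scales):
    """Nearest-neighbor upsampling along every axis (illustrative only)."""
    # output_dimension = floor(input_dimension * scale)
    out_shape = [int(np.floor(dim * scale)) for dim, scale in zip(data.shape, scales)]
    result = data
    for axis, (out_dim, scale) in enumerate(zip(out_shape, scales)):
        # Assumed index mapping: output position i reads input position floor(i / scale).
        source = np.floor(np.arange(out_dim) / scale).astype(np.int64)
        source = np.minimum(source, data.shape[axis] - 1)
        result = np.take(result, source, axis=axis)
    return result


data = np.array([[[[1, 2], [3, 4]]]], dtype=np.float32)
scales = np.array([1.0, 1.0, 2.0, 3.0], dtype=np.float32)
output = upsample_nearest_sketch(data, scales)
print(output.shape)  # (1, 1, 4, 6)
```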
This version of the operator has been deprecated since version 10 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Upsample-7">7</a>, <a href="Changelog.md#Upsample-9">9</a>
node = onnx.helper.make_node(
"Upsample",
inputs=["X", "scales"],
outputs=["Y"],
mode="nearest",
)
data = np.array(
[
[
[
[1, 2],
[3, 4],
]
]
],
dtype=np.float32,
)
scales = np.array([1.0, 1.0, 2.0, 3.0], dtype=np.float32)
output = np.array(
[
[
[
[1, 1, 1, 2, 2, 2],
[1, 1, 1, 2, 2, 2],
[3, 3, 3, 4, 4, 4],
[3, 3, 3, 4, 4, 4],
]
]
],
dtype=np.float32,
)
expect(
node,
inputs=[data, scales],
outputs=[output],
name="test_upsample_nearest",
opset_imports=[helper.make_opsetid("", 9)],
)
Return elements, either from X or Y, depending on condition. Where behaves like numpy.where with three parameters.
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
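As a rough illustration of the broadcasting behavior, the following NumPy sketch uses `numpy.where` as a stand-in for the operator; the shapes and values are chosen only for this example.

```python
import numpy as np

# condition has shape (2, 1), x has shape (1, 2), and y is a scalar.
condition = np.array([[True], [False]])
x = np.array([[1, 2]], dtype=np.int64)
y = np.array(0, dtype=np.int64)

# All three inputs are broadcast to the common shape (2, 2).
z = np.where(condition, x, y)
print(z)
# [[1 2]
#  [0 0]]
```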
This version of the operator has been available since version 16 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Where-9">9</a>
node = onnx.helper.make_node(
"Where",
inputs=["condition", "x", "y"],
outputs=["z"],
)
condition = np.array([[1, 0], [1, 1]], dtype=bool)
x = np.array([[1, 2], [3, 4]], dtype=np.int64)
y = np.array([[9, 8], [7, 6]], dtype=np.int64)
z = np.where(condition, x, y) # expected output [[1, 8], [3, 4]]
expect(
node, inputs=[condition, x, y], outputs=[z], name="test_where_long_example"
)
node = onnx.helper.make_node(
"Where",
inputs=["condition", "x", "y"],
outputs=["z"],
)
condition = np.array([[1, 0], [1, 1]], dtype=bool)
x = np.array([[1, 2], [3, 4]], dtype=np.float32)
y = np.array([[9, 8], [7, 6]], dtype=np.float32)
z = np.where(condition, x, y) # expected output [[1, 8], [3, 4]]
expect(node, inputs=[condition, x, y], outputs=[z], name="test_where_example")
Returns the tensor resulting from performing the xor logical operation
elementwise on the input tensors A and B (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
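A small truth-table style sketch, using `numpy.logical_xor` as a stand-in for the operator, shows the elementwise result together with broadcasting; the input values are illustrative.

```python
import numpy as np

a = np.array([[False, False], [True, True]])  # shape (2, 2)
b = np.array([False, True])                   # shape (2,), broadcast across rows

result = np.logical_xor(a, b)
print(result)
# [[False  True]
#  [ True False]]
```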
This version of the operator has been available since version 7 of the default ONNX operator set.
Other versions of this operator: <a href="Changelog.md#Xor-1">1</a>
node = onnx.helper.make_node(
"Xor",
inputs=["x", "y"],
outputs=["xor"],
)
# 2d
x = (np.random.randn(3, 4) > 0).astype(bool)
y = (np.random.randn(3, 4) > 0).astype(bool)
z = np.logical_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_xor2d")
# 3d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
y = (np.random.randn(3, 4, 5) > 0).astype(bool)
z = np.logical_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_xor3d")
# 4d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
y = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
z = np.logical_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_xor4d")
node = onnx.helper.make_node(
"Xor",
inputs=["x", "y"],
outputs=["xor"],
)
# 3d vs 1d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
y = (np.random.randn(5) > 0).astype(bool)
z = np.logical_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_xor_bcast3v1d")
# 3d vs 2d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
y = (np.random.randn(4, 5) > 0).astype(bool)
z = np.logical_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_xor_bcast3v2d")
# 4d vs 2d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
y = (np.random.randn(5, 6) > 0).astype(bool)
z = np.logical_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_xor_bcast4v2d")
# 4d vs 3d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
y = (np.random.randn(4, 5, 6) > 0).astype(bool)
z = np.logical_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_xor_bcast4v3d")
# 4d vs 4d
x = (np.random.randn(1, 4, 1, 6) > 0).astype(bool)
y = (np.random.randn(3, 1, 5, 6) > 0).astype(bool)
z = np.logical_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_xor_bcast4v4d")
Compute one iteration of ADAGRAD, a stochastic gradient based optimization algorithm. This operator can conduct the optimization of multiple tensor variables.
Let's define the behavior of this operator. As you can imagine, ADAGRAD requires
some parameters:
- The initial learning-rate "R".
- The update count "T". That is, the number of training iterations conducted.
- An L2-norm regularization coefficient "norm_coefficient".
- A learning-rate decay factor "decay_factor".
- A small constant "epsilon" to avoid dividing-by-zero.
At each ADAGRAD iteration, the optimized tensors are moved along a direction
computed based on their estimated gradient and accumulated squared gradient. Assume
that only a single tensor "X" is updated by this operator. We need the value of "X",
its gradient "G", and its accumulated squared gradient "H". Therefore, variables in
this operator's input list are sequentially "R", "T", "X", "G", and "H". Other
parameters are given as attributes because they are usually constants. Also, the
corresponding output tensors are the new value of "X" (called "X_new"), and then
the new accumulated squared gradient (called "H_new"). Those outputs are computed
from the given inputs following the pseudo code below.
Let "+", "-", "*", and "/" all be element-wise arithmetic operations with
numpy-style broadcasting support. The pseudo code to compute those outputs is:
// Compute a scalar learning-rate factor. At the first update of X, T is generally
// 0 (0-based update index) or 1 (1-based update index).
r = R / (1 + T * decay_factor);
// Add gradient of 0.5 * norm_coefficient * ||X||_2^2, where ||X||_2 is the 2-norm.
G_regularized = norm_coefficient * X + G;
// Compute new accumulated squared gradient.
H_new = H + G_regularized * G_regularized;
// Compute the adaptive part of per-coordinate learning rate. Note that Sqrt(...)
// computes element-wise square-root.
H_adaptive = Sqrt(H_new) + epsilon
// Compute the new value of "X".
X_new = X - r * G_regularized / H_adaptive;
If this operator is assigned to optimize multiple inputs, for example "X_1" and "X_2", the same
pseudo code can be extended to handle all tensors jointly. More specifically, we can view "X" as a
concatenation of "X_1" and "X_2" (their gradients and accumulated squared gradients are
concatenated too) and then reuse the entire pseudo code.
Note that ADAGRAD was first proposed in http://jmlr.org/papers/volume12/duchi11a/duchi11a.pdf.
In that reference paper, this operator is a special case of the composite mirror
descent update shown in Figure 1.
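The pseudo code above can be transcribed into NumPy roughly as follows. The helper name `adagrad_step` is hypothetical; the `apply_adagrad` helper called by the test examples is not shown in this document and may differ in detail. The last lines check the claim that updating a concatenation of tensors matches updating each tensor separately.

```python
import numpy as np


def adagrad_step(r, t, x, g, h, norm_coefficient, epsilon, decay_factor):
    """One ADAGRAD update, transcribed from the pseudo code (illustrative only)."""
    # Scalar learning-rate factor.
    r_scaled = r / (1 + t * decay_factor)
    # Add the gradient of the L2 regularization term.
    g_regularized = norm_coefficient * x + g
    # New accumulated squared gradient.
    h_new = h + g_regularized * g_regularized
    # Adaptive part of the per-coordinate learning rate.
    h_adaptive = np.sqrt(h_new) + epsilon
    x_new = x - r_scaled * g_regularized / h_adaptive
    return x_new, h_new


params = dict(norm_coefficient=0.001, epsilon=1e-5, decay_factor=0.1)
x1, g1, h1 = np.array([1.0]), np.array([-1.0]), np.array([2.0])
x2, g2, h2 = np.array([1.0, 2.0]), np.array([-1.0, -3.0]), np.array([4.0, 1.0])

x1_new, h1_new = adagrad_step(0.1, 0, x1, g1, h1, **params)
x2_new, h2_new = adagrad_step(0.1, 0, x2, g2, h2, **params)

# Updating the concatenated tensors gives the same result as separate updates.
x_cat_new, h_cat_new = adagrad_step(
    0.1,
    0,
    np.concatenate([x1, x2]),
    np.concatenate([g1, g2]),
    np.concatenate([h1, h2]),
    **params,
)
print(np.allclose(x_cat_new, np.concatenate([x1_new, x2_new])))  # True
```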
This version of the operator has been available since version 1 of the 'ai.onnx.preview.training' operator set.
# Define operator attributes.
norm_coefficient = 0.001
epsilon = 1e-5
decay_factor = 0.1
# Create operator.
node = onnx.helper.make_node(
"Adagrad",
inputs=["R", "T", "X", "G", "H"],
outputs=["X_new", "H_new"],
norm_coefficient=norm_coefficient,
epsilon=epsilon,
decay_factor=decay_factor,
domain=AI_ONNX_PREVIEW_TRAINING_DOMAIN,
)
# Define operator inputs.
r = np.array(0.1, dtype=np.float32) # scalar
t = np.array(0, dtype=np.int64) # scalar
x = np.array([1.0], dtype=np.float32)
g = np.array([-1.0], dtype=np.float32)
h = np.array([2.0], dtype=np.float32)
# Compute expected outputs of Adagrad.
x_new, h_new = apply_adagrad(
r, t, x, g, h, norm_coefficient, epsilon, decay_factor
)
# Check results.
expect(
node,
inputs=[r, t, x, g, h],
outputs=[x_new, h_new],
name="test_adagrad",
opset_imports=[
onnx.helper.make_opsetid(AI_ONNX_PREVIEW_TRAINING_DOMAIN, 1)
],
)
# Define operator attributes.
norm_coefficient = 0.001
epsilon = 1e-5
decay_factor = 0.1
node = onnx.helper.make_node(
"Adagrad",
inputs=["R", "T", "X1", "X2", "G1", "G2", "H1", "H2"],
outputs=["X1_new", "X2_new", "H1_new", "H2_new"],
norm_coefficient=norm_coefficient,
epsilon=epsilon,
decay_factor=decay_factor,
domain=AI_ONNX_PREVIEW_TRAINING_DOMAIN,
)
# Define operator inputs.
r = np.array(0.1, dtype=np.float32) # scalar
t = np.array(0, dtype=np.int64) # scalar
x1 = np.array([1.0], dtype=np.float32)
g1 = np.array([-1.0], dtype=np.float32)
h1 = np.array([2.0], dtype=np.float32)
x2 = np.array([1.0, 2.0], dtype=np.float32)
g2 = np.array([-1.0, -3.0], dtype=np.float32)
h2 = np.array([4.0, 1.0], dtype=np.float32)
# Compute expected outputs of Adagrad.
x1_new, h1_new = apply_adagrad(
r, t, x1, g1, h1, norm_coefficient, epsilon, decay_factor
)
x2_new, h2_new = apply_adagrad(
r, t, x2, g2, h2, norm_coefficient, epsilon, decay_factor
)
# Check results.
expect(
node,
inputs=[r, t, x1, x2, g1, g2, h1, h2],
outputs=[x1_new, x2_new, h1_new, h2_new],
name="test_adagrad_multiple",
opset_imports=[
onnx.helper.make_opsetid(AI_ONNX_PREVIEW_TRAINING_DOMAIN, 1)
],
)
Compute one iteration of Adam, a stochastic gradient based optimization algorithm. This operator can conduct the optimization of multiple tensor variables.
Let's define the behavior of this operator. First of all, Adam requires
some parameters:
- The learning-rate "R".
- The update count "T". That is, the number of training iterations conducted.
- An L2-norm regularization coefficient "norm_coefficient".
- A post-update regularization coefficient "norm_coefficient_post".
- A small constant "epsilon" to avoid dividing-by-zero.
- Two coefficients, "alpha" and "beta".
At each Adam iteration, the optimized tensors are moved along a direction
computed based on their exponentially-averaged historical gradient and
exponentially-averaged historical squared gradient. Assume that only one tensor
"X" is being optimized. The remaining required information is
- the value of "X",
- "X"'s gradient (denoted by "G"),
- "X"'s exponentially-averaged historical gradient (denoted by "V"), and
- "X"'s exponentially-averaged historical squared gradient (denoted by "H").
Some of those parameters are passed into this operator as input tensors and others
are stored as this operator's attributes. Specifically, this operator's input tensor
list is ["R", "T", "X", "G", "V", "H"]. That is, "R" is the first input, "T" is
the second input, and so on. Other parameters are given as attributes because they
are constants. Moreover, the corresponding output tensors are
- the new value of "X" (called "X_new"),
- the new exponentially-averaged historical gradient (denoted by "V_new"), and
- the new exponentially-averaged historical squared gradient (denoted by "H_new").
Those outputs are computed following the pseudo code below.
Let "+", "-", "*", and "/" all be element-wise arithmetic operations with
numpy-style broadcasting support. The pseudo code to compute those outputs is:
// Add gradient of 0.5 * norm_coefficient * ||X||_2^2, where ||X||_2 is the 2-norm.
G_regularized = norm_coefficient * X + G
// Update exponentially-averaged historical gradient.
V_new = alpha * V + (1 - alpha) * G_regularized
// Update exponentially-averaged historical squared gradient.
H_new = beta * H + (1 - beta) * G_regularized * G_regularized
// Compute the element-wise square-root of H_new. V_new will be divided element-wise
// by H_sqrt for a better update direction.
H_sqrt = Sqrt(H_new) + epsilon
// Compute learning-rate. Note that "alpha**T"/"beta**T" is alpha's/beta's T-th power.
R_adjusted = T > 0 ? R * Sqrt(1 - beta**T) / (1 - alpha**T) : R
// Compute new value of "X".
X_new = X - R_adjusted * V_new / H_sqrt
// Post-update regularization.
X_final = (1 - norm_coefficient_post) * X_new
If there are multiple inputs to be optimized, the pseudo code will be applied
independently to each of them.
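As a minimal sketch, the pseudo code above translates to NumPy as shown here. The helper name `adam_step` is hypothetical; the `apply_adam` helper called by the test examples is not shown in this document and may differ in detail. The argument order mirrors how the examples call their helper.

```python
import numpy as np


def adam_step(r, t, x, g, v, h, norm_coefficient, norm_coefficient_post, alpha, beta, epsilon):
    """One Adam update, transcribed from the pseudo code (illustrative only)."""
    # Add the gradient of the L2 regularization term.
    g_regularized = norm_coefficient * x + g
    # Update the exponentially-averaged gradient and squared gradient.
    v_new = alpha * v + (1 - alpha) * g_regularized
    h_new = beta * h + (1 - beta) * g_regularized * g_regularized
    h_sqrt = np.sqrt(h_new) + epsilon
    # Bias-corrected learning rate; no correction is applied when T is 0.
    if t > 0:
        r_adjusted = r * np.sqrt(1 - beta**t) / (1 - alpha**t)
    else:
        r_adjusted = r
    x_new = x - r_adjusted * v_new / h_sqrt
    # Post-update regularization.
    x_final = (1 - norm_coefficient_post) * x_new
    return x_final, v_new, h_new


x = np.array([1.0])
g = np.array([-1.0])
v = np.array([2.0])
h = np.array([0.5])
x_new, v_new, h_new = adam_step(0.1, 0, x, g, v, h, 0.001, 0.0, 0.95, 0.85, 1e-2)
print(x_new, v_new, h_new)
```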
This version of the operator has been available since version 1 of the 'ai.onnx.preview.training' operator set.
# Define operator attributes.
norm_coefficient = 0.001
alpha = 0.95
beta = 0.1
epsilon = 1e-7
# Create operator.
node = onnx.helper.make_node(
"Adam",
inputs=["R", "T", "X", "G", "V", "H"],
outputs=["X_new", "V_new", "H_new"],
norm_coefficient=norm_coefficient,
alpha=alpha,
beta=beta,
epsilon=epsilon,
domain=AI_ONNX_PREVIEW_TRAINING_DOMAIN,
)
# Define operator inputs.
r = np.array(0.1, dtype=np.float32) # scalar
t = np.array(0, dtype=np.int64) # scalar
x = np.array([1.2, 2.8], dtype=np.float32)
g = np.array([-0.94, -2.5], dtype=np.float32)
v = np.array([1.7, 3.6], dtype=np.float32)
h = np.array([0.1, 0.1], dtype=np.float32)
# Compute expected outputs of Adam.
x_new, v_new, h_new = apply_adam(
r, t, x, g, v, h, norm_coefficient, 0.0, alpha, beta, epsilon
)
# Check results.
expect(
node,
inputs=[r, t, x, g, v, h],
outputs=[x_new, v_new, h_new],
name="test_adam",
opset_imports=[
onnx.helper.make_opsetid(AI_ONNX_PREVIEW_TRAINING_DOMAIN, 1)
],
)
# Define operator attributes.
norm_coefficient = 0.001
alpha = 0.95
beta = 0.85
epsilon = 1e-2
node = onnx.helper.make_node(
"Adam",
inputs=["R", "T", "X1", "X2", "G1", "G2", "V1", "V2", "H1", "H2"],
outputs=["X1_new", "X2_new", "V1_new", "V2_new", "H1_new", "H2_new"],
norm_coefficient=norm_coefficient,
alpha=alpha,
beta=beta,
epsilon=epsilon,
domain=AI_ONNX_PREVIEW_TRAINING_DOMAIN,
)
# Define operator inputs.
r = np.array(0.1, dtype=np.float32) # scalar
t = np.array(0, dtype=np.int64) # scalar
x1 = np.array([1.0], dtype=np.float32)
g1 = np.array([-1.0], dtype=np.float32)
v1 = np.array([2.0], dtype=np.float32)
h1 = np.array([0.5], dtype=np.float32)
x2 = np.array([1.0, 2.0], dtype=np.float32)
g2 = np.array([-1.0, -3.0], dtype=np.float32)
v2 = np.array([4.0, 1.0], dtype=np.float32)
h2 = np.array([1.0, 10.0], dtype=np.float32)
# Compute expected outputs of Adam.
x1_new, v1_new, h1_new = apply_adam(
r, t, x1, g1, v1, h1, norm_coefficient, 0.0, alpha, beta, epsilon
)
x2_new, v2_new, h2_new = apply_adam(
r, t, x2, g2, v2, h2, norm_coefficient, 0.0, alpha, beta, epsilon
)
# Check results.
expect(
node,
inputs=[r, t, x1, x2, g1, g2, v1, v2, h1, h2],
outputs=[x1_new, x2_new, v1_new, v2_new, h1_new, h2_new],
name="test_adam_multiple",
opset_imports=[
onnx.helper.make_opsetid(AI_ONNX_PREVIEW_TRAINING_DOMAIN, 1)
],
)
The Gradient operator computes the partial derivatives of a specific tensor w.r.t. some other tensors. This operator is widely used in gradient-based training algorithms. To illustrate its use, let's consider the following computation graph,
X -----.
       |
       v
W --> Conv --> H --> Gemm --> Y
                      ^
                      |
                      Z
where W and Z are trainable tensors. Note that operators' attributes are omitted for the sake of simplicity. Let dY/dW (dY/dZ) be the gradient of Y with respect to W (Z). The user can compute these gradients by inserting a Gradient operator to form the graph shown below.
W --> Conv --> H --> Gemm --> Y
| ^ ^
| | |
| X Z
| | |
| | .----------'
| | | (W/Z/X is the 1st/2nd/3rd input of Gradient as shown in
| | | "xs" followed by "zs")
| v v
'---> Gradient(xs=["W", "Z"], zs=["X"], y="Y")
| |
| '-----------------------------------> dY/dW (1st output of Gradient)
|
'---------------------------------------> dY/dZ (2nd output of Gradient)
By definition, the tensor "y" is a function of independent variables in "xs" and "zs". Since we only compute the gradient of "y" w.r.t. the differentiable variables in "xs", this Gradient only outputs dY/dW and dY/dZ. Note that "H" cannot appear in "xs" or "zs", because "H" is determined by the tensors "W" and "X" and is therefore not an independent variable.
All outputs are optional. For example, a user can assign an empty string to the 1st output name of that Gradient to skip the generation of dY/dW. Note that the concept of optional outputs can also be found in ONNX's RNN, GRU, and LSTM.
The Gradient operator can also compute derivatives with respect to intermediate tensors. For example, the gradient of Y with respect to H can be computed via
W --> Conv --> H --> Gemm --> Y
^ | ^
| | |
X | Z
.-------' |
| .----------'
| | (H/Z is the 1st/2nd input of Gradient as shown in "xs")
v v
Gradient(xs=["H", "Z"], y="Y")
| |
| '-----------------------------------> dY/dH (1st output of Gradient)
|
'---------------------------------------> dY/dZ (2nd output of Gradient)
It is possible to represent higher-order differentiation using Gradient operators. For example, given the following linear model:
W --> Gemm --> Y --> Loss --> O
       ^              ^
       |              |
       X              L
To compute the 2nd order derivative of O with respect to W (denoted by d^2O/dW^2), one can do
W --> Gemm --> Y --> Loss --> O
| ^ ^
| | |
| X .------------L
| | | |
| | | v
+------+-+> Gradient(xs=["X", "W"], zs=["L"], y="O") ---> dO/dX (1st output of Gradient)
| | | |
| | | '---> dO/dW (2nd output of Gradient)
| v v
'---> Gradient(xs=["X", "W"], zs=["L"], y="dO/dW") ---> d(dO/dW)/dX (1st output of
| Gradient)
|
|
'---> d^2O/dW^2 (2nd output of Gradient)
The tensors named in attributes "xs", "zs", and "y" define the differentiated computation graph, and the inputs to Gradient node define the values at which the gradient is computed. We can feed different tensors to the identified graph. For example, one can compute the gradient of Y with respect to H at a specific value of H, H_1, by providing that value as an input to the Gradient node.
W --> Conv --> H --> Gemm --> Y
       ^              ^
       |              |
       X              Z
          Z_1 (2nd input of Gradient)
           |
           v
H_1 --> Gradient(xs=["H", "Z"], y="Y") ---> dY/dH when H = H_1 and Y = Y_1.
           |
           '------------------------------> dY/dZ (2nd output of Gradient)
When the inputs of Gradient are the tensors named in "xs" and "zs", the computation can be optimized. More specifically, intermediate variables in forward pass can be reused if the gradient is computed via reverse-mode auto-differentiation.
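To make the semantics concrete, a Gradient node evaluates the partial derivatives of "y" at the supplied input values. The sketch below approximates those derivatives with central finite differences for the small graph d = (a + b) * a. It is only a numerical sanity check in plain Python, not how a runtime computes gradients, and the helper names `forward` and `numerical_partial` are hypothetical.

```python
def forward(a, b):
    """Forward graph: c = a + b, then d = c * a."""
    c = a + b
    return c * a


def numerical_partial(f, inputs, index, eps=1e-3):
    """Central finite-difference estimate of df/d(inputs[index])."""
    up = list(inputs)
    down = list(inputs)
    up[index] += eps
    down[index] -= eps
    return (f(*up) - f(*down)) / (2 * eps)


a, b = 1.0, 2.0
dd_da = numerical_partial(forward, [a, b], 0)  # analytic value: 2 * a + b
dd_db = numerical_partial(forward, [a, b], 1)  # analytic value: a
print(dd_da, dd_db)
```

The estimates agree with the analytic values 2 * a + b and a at (a, b) = (1, 2), up to floating-point rounding.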
This version of the operator has been available since version 1 of the 'ai.onnx.preview.training' operator set.
add_node = onnx.helper.make_node("Add", ["a", "b"], ["c"], name="my_add")
gradient_node = onnx.helper.make_node(
"Gradient",
["a", "b"],
["dc_da", "dc_db"],
name="my_gradient",
domain=AI_ONNX_PREVIEW_TRAINING_DOMAIN,
xs=["a", "b"],
y="c",
)
a = np.array(1.0).astype(np.float32)
b = np.array(2.0).astype(np.float32)
c = a + b
# dc / da = d(a+b) / da = 1
dc_da = np.array(1).astype(np.float32)
# dc / db = d(a+b) / db = 1
dc_db = np.array(1).astype(np.float32)
graph = onnx.helper.make_graph(
nodes=[add_node, gradient_node],
name="GradientOfAdd",
inputs=[
onnx.helper.make_tensor_value_info("a", onnx.TensorProto.FLOAT, []),
onnx.helper.make_tensor_value_info("b", onnx.TensorProto.FLOAT, []),
],
outputs=[
onnx.helper.make_tensor_value_info("c", onnx.TensorProto.FLOAT, []),
onnx.helper.make_tensor_value_info("dc_da", onnx.TensorProto.FLOAT, []),
onnx.helper.make_tensor_value_info("dc_db", onnx.TensorProto.FLOAT, []),
],
)
opsets = [
onnx.helper.make_operatorsetid(ONNX_DOMAIN, 12),
onnx.helper.make_operatorsetid(AI_ONNX_PREVIEW_TRAINING_DOMAIN, 1),
]
model = onnx.helper.make_model_gen_version(
graph, producer_name="backend-test", opset_imports=opsets
)
expect(
model, inputs=[a, b], outputs=[c, dc_da, dc_db], name="test_gradient_of_add"
)
add_node = onnx.helper.make_node("Add", ["a", "b"], ["c"], name="my_add")
mul_node = onnx.helper.make_node("Mul", ["c", "a"], ["d"], name="my_mul")
gradient_node = onnx.helper.make_node(
"Gradient",
["a", "b"],
["dd_da", "dd_db"],
name="my_gradient",
domain=AI_ONNX_PREVIEW_TRAINING_DOMAIN,
xs=["a", "b"],
y="d",
)
a = np.array(1.0).astype(np.float32)
b = np.array(2.0).astype(np.float32)
c = a + b
# d = a * c = a * (a + b)
d = a * c
# dd / da = d(a*a+a*b) / da = 2 * a + b
dd_da = (2 * a + b).astype(np.float32)
# dd / db = d(a*a+a*b) / db = a
dd_db = a
graph = onnx.helper.make_graph(
nodes=[add_node, mul_node, gradient_node],
name="GradientOfTwoOperators",
inputs=[
onnx.helper.make_tensor_value_info("a", onnx.TensorProto.FLOAT, []),
onnx.helper.make_tensor_value_info("b", onnx.TensorProto.FLOAT, []),
],
outputs=[
onnx.helper.make_tensor_value_info("d", onnx.TensorProto.FLOAT, []),
onnx.helper.make_tensor_value_info("dd_da", onnx.TensorProto.FLOAT, []),
onnx.helper.make_tensor_value_info("dd_db", onnx.TensorProto.FLOAT, []),
],
)
opsets = [
onnx.helper.make_operatorsetid(ONNX_DOMAIN, 12),
onnx.helper.make_operatorsetid(AI_ONNX_PREVIEW_TRAINING_DOMAIN, 1),
]
model = onnx.helper.make_model_gen_version(
graph, producer_name="backend-test", opset_imports=opsets
)
expect(
model,
inputs=[a, b],
outputs=[d, dd_da, dd_db],
name="test_gradient_of_add_and_mul",
)
Compute one iteration of stochastic gradient update with momentum. This operator can conduct the optimization of multiple tensor variables.
Let's define the behavior of this operator. As you can imagine, SG with momentum requires
several parameters:
- The learning-rate "R".
- The update count "T". That is, the number of conducted training iterations. It should
be zero in the first training iteration.
- An L2-norm regularization coefficient "norm_coefficient".
- A decay coefficient of previous accumulated gradient (i.e., momentum) "alpha".
- The scaling coefficient of current gradient "beta".
- An attribute "mode" that selects whether standard momentum or Nesterov's momentum
is used.
For the sake of simplicity, assume that there is only one tensor (called "X") to be optimized.
Other necessary inputs are "X"'s gradient (called "G") and "X"'s momentum (called "V"). This
Momentum operator maps all these inputs to the new value of "X" (called "X_new") and its new
momentum (called "V_new").
This operator supports two different momentum algorithms. Set the attribute "mode" to
"nesterov" if Nesterov's momentum is desired. Otherwise, set the attribute "mode" to
"standard" to use standard momentum. Computation details are described subsequently.
Let "+", "-", "*", and "/" all be element-wise operations with numpy-style broadcasting.
Pseudo code for SG with standard momentum:
// Add gradient of 0.5 * norm_coefficient * ||X||^2, where ||X||^2 is the sum of squared
// values of all elements in X.
G_regularized = norm_coefficient * X + G
// In the first training iteration, beta should always be 1.
beta_adjusted = T > 0 ? beta : 1
// Compute the current momentum based on previous momentum and the current gradient.
V_new = alpha * V + beta_adjusted * G_regularized
// Update X.
X_new = X - R * V_new
Pseudo code for SG with Nesterov's momentum:
// Add gradient of 0.5 * norm_coefficient * ||X||^2, where ||X||^2 is the sum of squared
// values of all elements in X.
G_regularized = norm_coefficient * X + G;
// In the first training iteration, beta should always be 1.
beta_adjusted = T > 0 ? beta : 1
// Compute the current momentum based on previous momentum and the current gradient.
V_new = alpha * V + beta_adjusted * G_regularized;
// Compute final update direction and then update X.
X_new = X - R * (G_regularized + alpha * V_new)
If this operator is assigned to optimize multiple inputs, for example "X_1" and "X_2", the same
pseudo code can be extended to handle all tensors jointly. More specifically, we can view "X" as a
concatenation of "X_1" and "X_2" (their gradients and accumulated gradients, i.e. momentums, are
concatenated too) and then the pseudo code becomes applicable.
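Both variants of the pseudo code above can be written in NumPy roughly as follows. The helper names `momentum_step` and `nesterov_step` are hypothetical; the `apply_momentum` and `apply_nesterov` helpers called by the test examples are not shown in this document and may differ in detail.

```python
import numpy as np


def momentum_step(r, t, x, g, v, norm_coefficient, alpha, beta):
    """One SG update with standard momentum (illustrative only)."""
    g_regularized = norm_coefficient * x + g
    # In the first training iteration (T == 0), the gradient coefficient is 1.
    beta_adjusted = beta if t > 0 else 1
    v_new = alpha * v + beta_adjusted * g_regularized
    x_new = x - r * v_new
    return x_new, v_new


def nesterov_step(r, t, x, g, v, norm_coefficient, alpha, beta):
    """One SG update with Nesterov's momentum (illustrative only)."""
    g_regularized = norm_coefficient * x + g
    beta_adjusted = beta if t > 0 else 1
    v_new = alpha * v + beta_adjusted * g_regularized
    # The final direction combines the current gradient and the new momentum.
    x_new = x - r * (g_regularized + alpha * v_new)
    return x_new, v_new


x = np.array([1.0])
g = np.array([-1.0])
v = np.array([2.0])
x_std, v_std = momentum_step(0.1, 0, x, g, v, 0.001, 0.95, 0.85)
x_nes, v_nes = nesterov_step(0.1, 0, x, g, v, 0.001, 0.95, 0.85)
print(x_std, v_std)
print(x_nes, v_nes)
```

The two variants share the momentum update and differ only in the direction used to update "X".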
This version of the operator has been available since version 1 of the 'ai.onnx.preview.training' operator set.
# Define operator attributes.
norm_coefficient = 0.001
alpha = 0.95
beta = 0.1
# Create operator.
node = onnx.helper.make_node(
"Momentum",
inputs=["R", "T", "X", "G", "V"],
outputs=["X_new", "V_new"],
norm_coefficient=norm_coefficient,
alpha=alpha,
beta=beta,
mode="standard",
domain=AI_ONNX_PREVIEW_TRAINING_DOMAIN,
)
# Define operator inputs.
r = np.array(0.1, dtype=np.float32) # scalar
t = np.array(0, dtype=np.int64) # scalar
x = np.array([1.2, 2.8], dtype=np.float32)
g = np.array([-0.94, -2.5], dtype=np.float32)
v = np.array([1.7, 3.6], dtype=np.float32)
# Compute expected outputs of Momentum.
x_new, v_new = apply_momentum(r, t, x, g, v, norm_coefficient, alpha, beta)
# Check results.
expect(
node,
inputs=[r, t, x, g, v],
outputs=[x_new, v_new],
name="test_momentum",
opset_imports=[
onnx.helper.make_opsetid(AI_ONNX_PREVIEW_TRAINING_DOMAIN, 1)
],
)
# Define operator attributes.
norm_coefficient = 0.001
alpha = 0.95
beta = 0.85
node = onnx.helper.make_node(
"Momentum",
inputs=["R", "T", "X1", "X2", "G1", "G2", "V1", "V2"],
outputs=["X1_new", "X2_new", "V1_new", "V2_new"],
norm_coefficient=norm_coefficient,
alpha=alpha,
beta=beta,
mode="standard",
domain=AI_ONNX_PREVIEW_TRAINING_DOMAIN,
)
# Define operator inputs.
r = np.array(0.1, dtype=np.float32) # scalar
t = np.array(0, dtype=np.int64) # scalar
x1 = np.array([1.0], dtype=np.float32)
g1 = np.array([-1.0], dtype=np.float32)
v1 = np.array([2.0], dtype=np.float32)
x2 = np.array([1.0, 2.0], dtype=np.float32)
g2 = np.array([-1.0, -3.0], dtype=np.float32)
v2 = np.array([4.0, 1.0], dtype=np.float32)
# Compute expected outputs of Momentum.
x1_new, v1_new = apply_momentum(r, t, x1, g1, v1, norm_coefficient, alpha, beta)
x2_new, v2_new = apply_momentum(r, t, x2, g2, v2, norm_coefficient, alpha, beta)
# Check results.
expect(
node,
inputs=[r, t, x1, x2, g1, g2, v1, v2],
outputs=[x1_new, x2_new, v1_new, v2_new],
name="test_momentum_multiple",
opset_imports=[
onnx.helper.make_opsetid(AI_ONNX_PREVIEW_TRAINING_DOMAIN, 1)
],
)
# Define operator attributes.
norm_coefficient = 0.01
alpha = 0.95
beta = 1.0
# Create operator.
node = onnx.helper.make_node(
"Momentum",
inputs=["R", "T", "X", "G", "V"],
outputs=["X_new", "V_new"],
norm_coefficient=norm_coefficient,
alpha=alpha,
beta=beta,
mode="nesterov",
domain=AI_ONNX_PREVIEW_TRAINING_DOMAIN,
)
# Define operator inputs.
r = np.array(0.1, dtype=np.float32) # scalar
t = np.array(0, dtype=np.int64) # scalar
x = np.array([1.2, 2.8], dtype=np.float32)
g = np.array([-0.94, -2.5], dtype=np.float32)
v = np.array([1.7, 3.6], dtype=np.float32)
# Compute expected outputs of Momentum.
x_new, v_new = apply_nesterov(r, t, x, g, v, norm_coefficient, alpha, beta)
# Check results.
expect(
node,
inputs=[r, t, x, g, v],
outputs=[x_new, v_new],
name="test_nesterov_momentum",
opset_imports=[
onnx.helper.make_opsetid(AI_ONNX_PREVIEW_TRAINING_DOMAIN, 1)
],
)