## Operator Changelog
*This file is automatically generated from the
            [def files](/onnx/defs) via [this script](/onnx/defs/gen_doc.py).
            Do not modify it directly; instead, edit the operator definitions.*

# ai.onnx (default)
## Version 1 of the default ONNX operator set
### <a name="ATen-1"></a>**ATen-1**

  Experimental operator that allows ATen operations to be accessed directly from
  Caffe2, enabling quick prototyping when ONNX is missing a standard version of
  an op.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>input</tt> (variadic) : T</dt>
<dd>Arbitrary input</dd>
</dl>

#### Outputs (1 - &#8734;)

<dl>
<dt><tt>output</tt> (variadic) : T</dt>
<dd>Arbitrary output</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(bool), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain output types to bool, int32, int64, float16, float, double tensors.</dd>
</dl>

### <a name="Abs-1"></a>**Abs-1**

  Absolute takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the absolute value, y = abs(x), is applied to
  the tensor elementwise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Add-1"></a>**Add-1**

  Performs element-wise binary addition (with limited broadcast support).
  
  If necessary, the right-hand-side argument will be broadcast to match the
  shape of the left-hand-side argument. When broadcasting is specified, the second
  tensor can either be of element size 1 (including a scalar tensor and any
  tensor with rank equal to or smaller than the first tensor), or have a
  shape that is a contiguous subset of the first tensor's shape. The start of the
  mutually equal shape is specified by the argument "axis"; if it is not set,
  suffix matching is assumed. 1-dim expansion doesn't work yet.
  
  For example, the following tensor shapes are supported (with broadcast=1):
  
    shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar tensor
    shape(A) = (2, 3, 4, 5), shape(B) = (1, 1), i.e. B is a 1-element tensor
    shape(A) = (2, 3, 4, 5), shape(B) = (5,)
    shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
    shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
    shape(A) = (2, 3, 4, 5), shape(B) = (2,), with axis=0
  
  Attribute `broadcast=1` needs to be passed to enable broadcasting.
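
The shape rules above can be sketched in plain Python. This is a minimal illustration rather than the reference implementation: the helper name `legacy_broadcast_ok` is hypothetical, and it assumes that an unset `axis` means suffix matching, as the description states.

```python
def legacy_broadcast_ok(shape_a, shape_b, axis=None):
    """Return True if shape_b may be broadcast to shape_a under the
    axis-based rules described above (hypothetical helper)."""
    if len(shape_b) > len(shape_a):
        return False

    # A tensor holding exactly one element (a scalar included) always qualifies.
    size_b = 1
    for dim in shape_b:
        size_b *= dim
    if size_b == 1:
        return True

    # Without an explicit axis, align shape_b with the trailing dims of shape_a.
    if axis is None:
        axis = len(shape_a) - len(shape_b)
    if axis < 0 or axis + len(shape_b) > len(shape_a):
        return False

    # shape_b must equal a contiguous run of shape_a starting at `axis`.
    return tuple(shape_a[axis:axis + len(shape_b)]) == tuple(shape_b)


a = (2, 3, 4, 5)
print(legacy_broadcast_ok(a, ()))              # scalar tensor
print(legacy_broadcast_ok(a, (4, 5)))          # suffix match
print(legacy_broadcast_ok(a, (3, 4), axis=1))  # explicit axis
print(legacy_broadcast_ok(a, (3, 4)))          # no suffix match
```

The last call returns `False` because `(3, 4)` is not a suffix of `(2, 3, 4, 5)`; it only matches once `axis=1` is given.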

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>If set, defines the broadcast dimensions. See doc for details.</dd>
<dt><tt>broadcast</tt> : int</dt>
<dd>Pass 1 to enable broadcasting</dd>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>First operand, should share the type with the second operand.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Second operand. With broadcasting can be of smaller size than A. If broadcasting is disabled it should be of the same size.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T</dt>
<dd>Result, has same dimensions and type as A</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Affine-1"></a>**Affine-1**

  Affine takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the affine function, y = alpha * x + beta,
  is applied to the tensor elementwise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>alpha</tt> : float</dt>
<dd>Value of alpha</dd>
<dt><tt>beta</tt> : float</dt>
<dd>Value of beta</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>1D input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>1D output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="And-1"></a>**And-1**

  Returns the tensor resulting from performing the `and` logical operation
  elementwise on the input tensors `A` and `B`.
  
  If broadcasting is enabled, the right-hand-side argument will be broadcast
  to match the shape of the left-hand-side argument. See the doc of `Add` for a
  detailed description of the broadcasting rules.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>If set, defines the broadcast dimensions.</dd>
<dt><tt>broadcast</tt> : int</dt>
<dd>Enable broadcasting</dd>
</dl>

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>Left input tensor for the logical operator.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Right input tensor for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(bool)</dt>
<dd>Constrains input to boolean tensor.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrains output to boolean tensor.</dd>
</dl>

### <a name="ArgMax-1"></a>**ArgMax-1**

  Computes the indices of the max elements of the input tensor along the
  provided axis. The resulting tensor has the same rank as the input if keepdims equals 1.
  If keepdims equals 0, then the resulting tensor has the reduced dimension pruned.
  The type of the output tensor is integer.
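
As a rough illustration of the `axis` and `keepdims` behavior, the sketch below handles a 2-D nested list in plain Python. The function name `argmax_2d` is hypothetical, and returning the first index on ties is an assumption; the description above does not specify tie-breaking.

```python
def argmax_2d(data, axis=0, keepdims=1):
    """Indices of the max elements of a 2-D nested list along `axis`."""
    rows, cols = len(data), len(data[0])
    if axis == 0:
        # One index per column: the row that holds the column's maximum.
        idx = [max(range(rows), key=lambda r: data[r][c]) for c in range(cols)]
        return [idx] if keepdims else idx
    if axis == 1:
        # One index per row: the column that holds the row's maximum.
        idx = [max(range(cols), key=lambda c: row[c]) for row in data]
        return [[i] for i in idx] if keepdims else idx
    raise ValueError("axis must be 0 or 1 for a 2-D input")


data = [[1, 9, 3],
        [7, 2, 8]]
print(argmax_2d(data, axis=0, keepdims=1))  # [[1, 0, 1]]
print(argmax_2d(data, axis=1, keepdims=0))  # [1, 2]
```

With `keepdims=1` the result keeps rank 2 (shape 1 x 3 for `axis=0`); with `keepdims=0` the reduced dimension is dropped.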

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>The axis in which to compute the arg indices. Default is 0.</dd>
<dt><tt>keepdims</tt> : int</dt>
<dd>Whether to keep the reduced dimension. The default, 1, means the reduced dimension is kept.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> : tensor(int64)</dt>
<dd>Reduced output tensor with integer data type.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to all numeric tensors.</dd>
</dl>

### <a name="ArgMin-1"></a>**ArgMin-1**

  Computes the indices of the min elements of the input tensor along the
  provided axis. The resulting tensor has the same rank as the input if keepdims equals 1.
  If keepdims equals 0, then the resulting tensor has the reduced dimension pruned.
  The type of the output tensor is integer.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>The axis in which to compute the arg indices. Default is 0.</dd>
<dt><tt>keepdims</tt> : int</dt>
<dd>Whether to keep the reduced dimension. The default, 1, means the reduced dimension is kept.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> : tensor(int64)</dt>
<dd>Reduced output tensor with integer data type.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to all numeric tensors.</dd>
</dl>

### <a name="AveragePool-1"></a>**AveragePool-1**

  AveragePool consumes an input tensor X and applies average pooling across
   the tensor according to kernel sizes, stride sizes, and pad lengths.
   Average pooling consists of computing the average of all values of a
   subset of the input tensor according to the kernel size and downsampling the
   data into the output tensor Y for further processing. The output spatial shape is as follows:
   ```
   output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - kernel_spatial_shape[i]) / strides_spatial_shape[i] + 1)
  
   * pad_shape[i] is sum of pads along axis i
   ```
  
   `auto_pad` is a DEPRECATED attribute. If you are currently using it, the output spatial shape is as follows:
   ```
   VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - kernel_spatial_shape[i] + 1) / strides_spatial_shape[i])
   SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])
   ```
   The pad shape is as follows for `SAME_UPPER` or `SAME_LOWER`:
   ```
   pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + kernel_spatial_shape[i] - input_spatial_shape[i]
   ```
   The output of each pooling window is divided by the number of elements, excluding padding.
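
The output-shape formulas above translate directly into a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, and `pads` is assumed to follow the `[x1_begin, x2_begin, ..., x1_end, x2_end, ...]` layout described for the `pads` attribute.

```python
import math


def pool_output_shape(input_shape, kernel_shape, strides=None, pads=None):
    """Spatial output shape with explicit padding (spatial dims only)."""
    n = len(input_shape)
    strides = strides or [1] * n      # stride defaults to 1 along each axis
    pads = pads or [0] * (2 * n)      # padding defaults to 0 along each axis
    out = []
    for i in range(n):
        pad_total = pads[i] + pads[i + n]  # sum of pads along axis i
        out.append(math.floor(
            (input_shape[i] + pad_total - kernel_shape[i]) / strides[i] + 1))
    return out


def auto_pad_output_shape(input_shape, kernel_shape, strides, auto_pad):
    """Spatial output shape for the deprecated auto_pad modes."""
    if auto_pad == "VALID":
        return [math.ceil((d - k + 1) / s)
                for d, k, s in zip(input_shape, kernel_shape, strides)]
    if auto_pad in ("SAME_UPPER", "SAME_LOWER"):
        return [math.ceil(d / s) for d, s in zip(input_shape, strides)]
    raise ValueError("unknown auto_pad mode: " + auto_pad)


def same_pad_shape(input_shape, kernel_shape, strides, output_shape):
    """Total padding per axis for SAME_UPPER / SAME_LOWER."""
    return [(o - 1) * s + k - d
            for d, k, s, o in zip(input_shape, kernel_shape, strides, output_shape)]


print(pool_output_shape([32, 32], [3, 3], [2, 2], [1, 1, 1, 1]))     # [16, 16]
print(auto_pad_output_shape([32, 32], [3, 3], [2, 2], "VALID"))      # [15, 15]
same = auto_pad_output_shape([32, 32], [3, 3], [2, 2], "SAME_UPPER")
print(same, same_pad_shape([32, 32], [3, 3], [2, 2], same))          # [16, 16] [1, 1]
```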
   

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>auto_pad</tt> : string</dt>
<dd>auto_pad must be either SAME_UPPER, SAME_LOWER or VALID. SAME_UPPER and SAME_LOWER mean pad the input so that the output size matches the input. If the total padding is an odd number, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER. VALID means no padding. DEPRECATION NOTE: auto_pad is only intended to support legacy uses; framework authors are explicitly encouraged to use explicit padding specified in the pads attribute.</dd>
<dt><tt>kernel_shape</tt> : list of ints (required)</dt>
<dd>The size of the kernel along each axis.</dd>
<dt><tt>pads</tt> : list of ints</dt>
<dd>Padding for the beginning and ending along each axis; it can take any value greater than or equal to 0. The value represents the number of pixels added to the beginning and end part of the corresponding axis. The `pads` format should be as follows: [x1_begin, x2_begin...x1_end, x2_end,...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 along the start and end of each axis.</dd>
<dt><tt>strides</tt> : list of ints</dt>
<dd>Stride along each axis. If not present, the stride defaults to 1 along each axis.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size. Optionally, if dimension denotation is in effect, the operation expects the input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE ...].</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output data tensor from average or max pooling across the input tensor. Dimensions will vary based on various kernel, stride, and pad sizes. The floor value of the dimension is used.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="BatchNormalization-1"></a>**BatchNormalization-1**

  Carries out batch normalization as described in the paper
  https://arxiv.org/abs/1502.03167. Depending on the mode in which it is run,
  there are multiple cases for the number of outputs, which we list below:
  
  Output case #1: Y, mean, var, saved_mean, saved_var (training mode)
  Output case #2: Y (test mode)
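
To make the two modes concrete, here is a scalar sketch in Python. The running-statistics update follows the formula given for the `momentum` attribute; the test-mode normalization uses the standard form from the referenced paper, which is an assumption here because this section does not spell it out. The function names are hypothetical.

```python
import math


def update_running_stat(running, batch_stat, momentum=0.9):
    """Training mode: running = running * momentum + batch_stat * (1 - momentum)."""
    return running * momentum + batch_stat * (1.0 - momentum)


def batch_norm_test_mode(x, scale, bias, mean, var, epsilon=1e-5):
    """Test mode for one element of channel c, using that channel's
    scale, bias, estimated mean, and estimated variance (assumed formula)."""
    return scale * (x - mean) / math.sqrt(var + epsilon) + bias


# Running mean moves one tenth of the way toward the batch mean.
print(update_running_stat(1.0, 3.0, momentum=0.9))
# Normalize a single value with per-channel statistics.
print(batch_norm_test_mode(3.0, scale=2.0, bias=1.0, mean=1.0, var=4.0))
```

In test mode only `Y` is produced, which matches output case #2; the update function corresponds to the `mean` and `var` outputs of output case #1.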
      

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints (required)</dt>
<dd>legacy optimization attribute.</dd>
<dt><tt>epsilon</tt> : float</dt>
<dd>The epsilon value to use to avoid division by zero, default is 1e-5f.</dd>
<dt><tt>is_test</tt> : int</dt>
<dd>If set to nonzero, run spatial batch normalization in test mode, default is 0.</dd>
<dt><tt>momentum</tt> : float</dt>
<dd>Factor used in computing the running mean and variance, e.g., running_mean = running_mean * momentum + mean * (1 - momentum). Default is 0.9f.</dd>
<dt><tt>spatial</tt> : int</dt>
<dd>If true, compute the mean and variance across all spatial elements. If false, compute the mean and variance per feature. Default is 1.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>The input 4-dimensional tensor of shape NCHW.</dd>
<dt><tt>scale</tt> : T</dt>
<dd>The scale as a 1-dimensional tensor of size C to be applied to the output.</dd>
<dt><tt>B</tt> : T</dt>
<dd>The bias as a 1-dimensional tensor of size C to be applied to the output.</dd>
<dt><tt>mean</tt> : T</dt>
<dd>The running mean (training) or the estimated mean (testing) as a 1-dimensional tensor of size C.</dd>
<dt><tt>var</tt> : T</dt>
<dd>The running variance (training) or the estimated variance (testing) as a 1-dimensional tensor of size C.</dd>
</dl>

#### Outputs (1 - 5)

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>The output 4-dimensional tensor of the same shape as X.</dd>
<dt><tt>mean</tt> (optional) : T</dt>
<dd>The running mean after the BatchNormalization operator. Must be in-place with the input mean. Should not be used for testing.</dd>
<dt><tt>var</tt> (optional) : T</dt>
<dd>The running variance after the BatchNormalization operator. Must be in-place with the input var. Should not be used for testing.</dd>
<dt><tt>saved_mean</tt> (optional) : T</dt>
<dd>Saved mean used during training to speed up gradient computation. Should not be used for testing.</dd>
<dt><tt>saved_var</tt> (optional) : T</dt>
<dd>Saved variance used during training to speed up gradient computation. Should not be used for testing.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Cast-1"></a>**Cast-1**

  The operator casts the elements of a given input tensor to a data type
  specified by the 'to' argument and returns an output tensor of the same size in
  the converted type. The 'to' argument must be one of the data types specified
  in the 'DataType' enum field in the TensorProto message.
  NOTE: Casting to and from strings is not supported yet.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>to</tt> : string (required)</dt>
<dd>The data type to which the elements of the input tensor are cast. Must strictly be one of the types from the DataType enum in TensorProto.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T1</dt>
<dd>Input tensor to be cast.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T2</dt>
<dd>Output tensor with the same shape as input with type specified by the 'to' argument</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(float16), tensor(float), tensor(double), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(bool)</dt>
<dd>Constrain input types. Casting from string and complex types is not supported.</dd>
<dt><tt>T2</tt> : tensor(float16), tensor(float), tensor(double), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(bool)</dt>
<dd>Constrain output types. Casting to string and complex types is not supported.</dd>
</dl>

### <a name="Ceil-1"></a>**Ceil-1**

  Ceil takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the ceil function, y = ceil(x), is applied to
  the tensor elementwise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Clip-1"></a>**Clip-1**

  Clip operator limits the given input within an interval. The interval is
  specified with arguments 'min' and 'max'. They default to
  numeric_limits::lowest() and numeric_limits::max() respectively.
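
A minimal Python sketch of the clipping rule follows. The function name `clip` and the use of `sys.float_info.max` as a stand-in for `numeric_limits::lowest()` / `numeric_limits::max()` are assumptions made for illustration; the actual bounds depend on the tensor's element type.

```python
import sys


def clip(values, min_value=-sys.float_info.max, max_value=sys.float_info.max):
    """Limit every element of `values` to the interval [min_value, max_value]."""
    return [min(max(v, min_value), max_value) for v in values]


print(clip([-2.0, 0.5, 3.0], min_value=0.0, max_value=1.0))  # [0.0, 0.5, 1.0]
print(clip([-2.0, 0.5, 3.0]))                                # unchanged with default bounds
```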

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
<dt><tt>max</tt> : float</dt>
<dd>Maximum value, above which element is replaced by max</dd>
<dt><tt>min</tt> : float</dt>
<dd>Minimum value, under which element is replaced by min</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor whose elements are to be clipped</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Output tensor with clipped input elements</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Concat-1"></a>**Concat-1**

  Concatenate a list of tensors into a single tensor.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>Which axis to concat on. Default value is 1.</dd>
</dl>

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>inputs</tt> (variadic) : T</dt>
<dd>List of tensors for concatenation</dd>
</dl>

#### Outputs

<dl>
<dt><tt>concat_result</tt> : T</dt>
<dd>Concatenated tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain output types to float tensors.</dd>
</dl>

### <a name="Constant-1"></a>**Constant-1**

  A constant tensor.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>value</tt> : tensor (required)</dt>
<dd>The value for the elements of the output tensor.</dd>
</dl>

#### Inputs


#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Output tensor containing the same value of the provided tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="ConstantFill-1"></a>**ConstantFill-1**

  The operator fills the elements of the output tensor with a constant value
  specified by the 'value' attribute.
  
  The data type is specified by the 'dtype' attribute. The 'dtype' attribute must
  be one of the data types specified in the 'DataType' enum field in the
  TensorProto message. If the 'dtype' attribute is not provided, the data type of
  'value' is used.
  
  The output tensor shape is specified by the 'shape' attribute. If the number of
  inputs is 1, the shape will be identical to that of the input at run time with
  optional additional dimensions appended at the end as specified by the 'extra_shape'
  attribute. In that case the 'shape' attribute should not be set.
  
  If input_as_shape is set to true, then the input should be a 1D tensor
  containing the desired output shape (the dimensions specified in extra_shape
  will also be appended).
  
  NOTE: Currently, it supports data type of float, int32, int64, and bool.
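
The three ways of determining the output shape can be sketched as follows. This is an illustrative reading of the rules above, not the operator's implementation: `constant_fill`, `input_dims` (the shape of the optional input), and `input_data` (its 1D contents) are hypothetical names.

```python
def constant_fill(value=0.0, shape=None, input_dims=None, input_data=None,
                  extra_shape=(), input_as_shape=False):
    """Return (output_shape, flat_values) for a ConstantFill-like operation."""
    has_input = input_dims is not None
    if not has_input:
        # No input: the 'shape' attribute alone defines the output shape.
        if extra_shape:
            raise ValueError("extra_shape requires an input")
        out_shape = list(shape or [])
    else:
        if shape is not None:
            raise ValueError("shape cannot be combined with an input")
        # Either the input's contents or the input's own shape is the base.
        base = list(input_data) if input_as_shape else list(input_dims)
        out_shape = base + list(extra_shape)

    count = 1
    for dim in out_shape:
        count *= dim
    return out_shape, [value] * count


print(constant_fill(1.5, shape=[2, 3])[0])                         # [2, 3]
print(constant_fill(0.0, input_dims=[4, 2], extra_shape=[3])[0])   # [4, 2, 3]
print(constant_fill(7.0, input_dims=[2], input_data=[5, 6],
                    input_as_shape=True)[0])                       # [5, 6]
```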

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>dtype</tt> : int</dt>
<dd>The data type for the elements of the output tensor. Must strictly be one of the types from the DataType enum in TensorProto.</dd>
<dt><tt>extra_shape</tt> : list of ints</dt>
<dd>The additional dimensions appended at the end of the shape indicated by the input blob. Cannot set the extra_shape argument when there is no input blob.</dd>
<dt><tt>input_as_shape</tt> : int</dt>
<dd>1D tensor containing the desired output shape.  First input must be in CPU context.</dd>
<dt><tt>shape</tt> : list of ints</dt>
<dd>The shape of the output tensor. Cannot set the shape argument and pass in an input at the same time.</dd>
<dt><tt>value</tt> : float</dt>
<dd>The value for the elements of the output tensor. Default is 0.</dd>
</dl>

#### Inputs (0 - 1)

<dl>
<dt><tt>input</tt> (optional) : T1</dt>
<dd>Input tensor (optional) to provide shape information.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T2</dt>
<dd>Output tensor of constant values specified by the 'value' argument; its type is specified by the 'dtype' argument.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(float), tensor(int32), tensor(int64), tensor(bool)</dt>
<dd>Constrain input types to float, int32, int64, bool tensors.</dd>
<dt><tt>T2</tt> : tensor(float), tensor(int32), tensor(int64), tensor(bool)</dt>
<dd>Constrain output types to float, int32, int64, bool tensors.</dd>
</dl>

### <a name="Conv-1"></a>**Conv-1**

  The convolution operator consumes an input tensor and a filter, and
  computes the output.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>auto_pad</tt> : string</dt>
<dd>auto_pad must be either SAME_UPPER, SAME_LOWER or VALID. SAME_UPPER and SAME_LOWER mean pad the input so that the output size matches the input. If the total padding is an odd number, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER. VALID means no padding. DEPRECATION NOTE: auto_pad is only intended to support legacy uses; framework authors are explicitly encouraged to use explicit padding specified in the pads attribute.</dd>
<dt><tt>dilations</tt> : list of ints</dt>
<dd>dilation value along each axis of the filter. If not present, the dilation defaults to 1 along each axis.</dd>
<dt><tt>group</tt> : int</dt>
<dd>number of groups input channels and output channels are divided into, default is 1.</dd>
<dt><tt>kernel_shape</tt> : list of ints</dt>
<dd>The shape of the convolution kernel. If not present, should be inferred from input W.</dd>
<dt><tt>pads</tt> : list of ints</dt>
<dd>Padding for the beginning and ending along each axis; it can take any value greater than or equal to 0. The value represents the number of pixels added to the beginning and end part of the corresponding axis. The `pads` format should be as follows: [x1_begin, x2_begin...x1_end, x2_end,...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 along the start and end of each axis.</dd>
<dt><tt>strides</tt> : list of ints</dt>
<dd>Stride along each axis. If not present, the stride defaults to 1 along each axis.</dd>
</dl>

#### Inputs (2 - 3)

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input data tensor from previous layer; has size (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and width. Note that this is for the 2D image. Otherwise the size is (N x C x D1 x D2 ... x Dn). Optionally, if dimension denotation is in effect, the operation expects input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE ...].</dd>
<dt><tt>W</tt> : T</dt>
<dd>The weight tensor that will be used in the convolutions; has size (M x C x kH x kW), where C is the number of channels, and kH and kW are the height and width of the kernel, and M is the number of feature maps. For more than 2 dimensions, the kernel shape will be (M x C x k1 x k2 x ... x kn), where (k1 x k2 x ... kn) is the dimension of the kernel. Optionally, if dimension denotation is in effect, the operation expects the weight tensor to arrive with the dimension denotation of [FILTER_IN_CHANNEL, FILTER_OUT_CHANNEL, FILTER_SPATIAL, FILTER_SPATIAL ...].</dd>
<dt><tt>B</tt> (optional) : T</dt>
<dd>Optional 1D bias to be added to the convolution, has size of M.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output data tensor that contains the result of the convolution. The output dimensions are functions of the kernel size, stride size, and pad lengths.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="ConvTranspose-1"></a>**ConvTranspose-1**

  The convolution transpose operator consumes an input tensor and a filter,
  and computes the output. 
  
  If the pads parameter is provided, the shape of the output is calculated via the following equation:
  
    output_shape[i] = stride[i] * (input_size[i] - 1) + output_padding[i] + kernel_shape[i] - pads[start_i] - pads[end_i]
  
  output_shape can also be explicitly specified, in which case pads values are auto generated using these equations:
  
    total_padding[i] = stride[i] * (input_size[i] - 1) + output_padding[i] + kernel_shape[i] - output_shape[i]
    If (auto_pad != SAME_UPPER): pads[start_i] = total_padding[i]/2; pads[end_i] = total_padding[i] - (total_padding[i]/2)
    Else: pads[start_i] = total_padding[i] - (total_padding[i]/2); pads[end_i] = (total_padding[i]/2).
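
The equations above can be checked with a short Python sketch. It rests on two assumptions that the text leaves implicit: `pads[start_i]` and `pads[end_i]` are taken to be positions `i` and `i + n` of the `pads` list (matching the `[x1_begin, ..., x1_end, ...]` layout), and `total_padding[i]/2` is treated as integer division. The function names are hypothetical.

```python
def conv_transpose_output_shape(input_size, kernel_shape, strides, pads,
                                output_padding):
    """Spatial output shape when pads are given explicitly."""
    n = len(input_size)
    return [strides[i] * (input_size[i] - 1) + output_padding[i]
            + kernel_shape[i] - pads[i] - pads[i + n]
            for i in range(n)]


def conv_transpose_pads(input_size, kernel_shape, strides, output_shape,
                        output_padding, auto_pad="VALID"):
    """Pads generated when output_shape is given explicitly."""
    begins, ends = [], []
    for i in range(len(input_size)):
        total = (strides[i] * (input_size[i] - 1) + output_padding[i]
                 + kernel_shape[i] - output_shape[i])
        half = total // 2  # assumed integer division
        if auto_pad != "SAME_UPPER":
            begins.append(half)
            ends.append(total - half)
        else:
            begins.append(total - half)
            ends.append(half)
    return begins + ends


shape = conv_transpose_output_shape([3, 3], [3, 3], [2, 2], [0, 0, 0, 0], [0, 0])
print(shape)  # [7, 7]
pads = conv_transpose_pads([3, 3], [3, 3], [2, 2], [6, 6], [0, 0])
print(pads)   # [0, 0, 1, 1]
```

Feeding the generated pads back into `conv_transpose_output_shape` reproduces the requested `[6, 6]`, which is a quick consistency check on the two equations.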
  
      

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>auto_pad</tt> : string</dt>
<dd>auto_pad must be either SAME_UPPER, SAME_LOWER or VALID. SAME_UPPER and SAME_LOWER mean pad the input so that the output size matches the input. If the total padding is an odd number, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER. VALID means no padding. DEPRECATION NOTE: auto_pad is only intended to support legacy uses; framework authors are explicitly encouraged to use explicit padding specified in the pads attribute.</dd>
<dt><tt>dilations</tt> : list of ints</dt>
<dd>dilation value along each axis of the filter. If not present, the dilation defaults to 1 along each axis.</dd>
<dt><tt>group</tt> : int</dt>
<dd>number of groups input channels and output channels are divided into, default is 1.</dd>
<dt><tt>kernel_shape</tt> : list of ints</dt>
<dd>The shape of the convolution kernel. If not present, should be inferred from input W.</dd>
<dt><tt>output_padding</tt> : list of ints</dt>
<dd>The zero-padding added to one side of the output. This is also called adjs/adjustment in some frameworks.</dd>
<dt><tt>output_shape</tt> : list of ints</dt>
<dd>The shape of the output can be explicitly set, which will cause pads values to be auto generated. If output_shape is specified, pads values are ignored. See the operator description for the equations used to generate pads.</dd>
<dt><tt>pads</tt> : list of ints</dt>
<dd>Padding for the beginning and ending along each axis; it can take any value greater than or equal to 0. The value represents the number of pixels added to the beginning and end part of the corresponding axis. The `pads` format should be as follows: [x1_begin, x2_begin...x1_end, x2_end,...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 along the start and end of each axis.</dd>
<dt><tt>strides</tt> : list of ints</dt>
<dd>Stride along each axis. If not present, the stride defaults to 1 along each axis.</dd>
</dl>

#### Inputs (2 - 3)

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input data tensor from previous layer; has size (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and width. Note that this is for the 2D image. Otherwise the size is (N x D1 x D2 ... x Dn).</dd>
<dt><tt>W</tt> : T</dt>
<dd>The weight tensor that will be used in the convolutions; has size (C x M x kH x kW), where C is the number of channels, and kH and kW are the height and width of the kernel, and M is the number of feature maps. For more than 2 dimensions, the weight shape will be (C x M x k1 x k2 x ... x kn), where (k1 x k2 x ... x kn) is the dimension of the kernel.</dd>
<dt><tt>B</tt> (optional) : T</dt>
<dd>Optional 1D bias to be added to the convolution, has size of C.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output data tensor that contains the result of the convolution. The output dimensions are functions of the kernel size, stride size, and pad lengths.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Crop-1"></a>**Crop-1**

  Crop an image to the specified spatial dimensions. If scale is given,
  then optionally start the crop offset by the left/top border amounts.
  If scale is not provided, crop the borders as provided.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>border</tt> : list of ints</dt>
<dd>A 1-D list of values (leftBorder, topBorder, rightBorder, bottomBorder).</dd>
<dt><tt>scale</tt> : list of ints</dt>
<dd>A 1-D list of values (height, width).</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor of shape [N,C,H,W]</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Result, has same type as input, with H and W dimensions reduced.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="DepthToSpace-1"></a>**DepthToSpace-1**

  DepthToSpace rearranges (permutes) data from depth into blocks of spatial data.
  This is the reverse transformation of SpaceToDepth. More specifically, this op outputs a copy of
  the input tensor where values from the depth dimension are moved in spatial blocks to the height
  and width dimensions.
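
The effect on the tensor shape can be sketched in Python. The helper name `depth_to_space_shape` is hypothetical, and the divisibility check is an assumption implied by the `C/(blocksize * blocksize)` term in the output shape; the sketch computes only the shape, not the element ordering.

```python
def depth_to_space_shape(input_shape, blocksize):
    """Map [N, C, H, W] to [N, C/(blocksize*blocksize), H*blocksize, W*blocksize]."""
    n, c, h, w = input_shape
    block_area = blocksize * blocksize
    if c % block_area != 0:
        raise ValueError("C must be divisible by blocksize * blocksize")
    return (n, c // block_area, h * blocksize, w * blocksize)


print(depth_to_space_shape((1, 8, 2, 3), blocksize=2))  # (1, 2, 4, 6)
```

The total element count is preserved: 1 x 8 x 2 x 3 and 1 x 2 x 4 x 6 both hold 48 values.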

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>blocksize</tt> : int (required)</dt>
<dd>Blocks of [blocksize, blocksize] are moved.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor of [N,C,H,W], where N is the batch axis, C is the channel or depth, H is the height and W is the width.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Output tensor of [N, C/(blocksize * blocksize), H * blocksize, W * blocksize].</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input types to float tensors.</dd>
</dl>

### <a name="Div-1"></a>**Div-1**

  Performs element-wise binary division (with limited broadcast support).
  
  If necessary, the right-hand-side argument will be broadcast to match the
  shape of the left-hand-side argument. When broadcasting is specified, the second
  tensor can either be of element size 1 (including a scalar tensor and any
  tensor with rank equal to or smaller than the first tensor), or have a
  shape that is a contiguous subset of the first tensor's shape. The start of the
  mutually equal shape is specified by the argument "axis"; if it is not set,
  suffix matching is assumed. 1-dim expansion doesn't work yet.
  
  For example, the following tensor shapes are supported (with broadcast=1):
  
    shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar tensor
    shape(A) = (2, 3, 4, 5), shape(B) = (1, 1), i.e. B is a 1-element tensor
    shape(A) = (2, 3, 4, 5), shape(B) = (5,)
    shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
    shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
    shape(A) = (2, 3, 4, 5), shape(B) = (2,), with axis=0
  
  Attribute `broadcast=1` needs to be passed to enable broadcasting.
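
The axis-based broadcasting rule can be illustrated with a small sketch over flat, row-major lists. The function name `div_broadcast` and the flat-list-plus-shape representation are assumptions made for illustration; negative `axis` values are not handled.

```python
from itertools import product


def div_broadcast(a, shape_a, b, shape_b, axis=None):
    """Divide flat row-major list `a` by `b`, broadcasting `b` over `a`."""
    size_b = 1
    for d in shape_b:
        size_b *= d
    if size_b == 1:
        # Scalar or 1-element tensor: the same divisor applies everywhere.
        return [x / b[0] for x in a]
    # Suffix matching when axis is not set, otherwise start at `axis`.
    start = len(shape_a) - len(shape_b) if axis is None else axis
    if tuple(shape_a[start:start + len(shape_b)]) != tuple(shape_b):
        raise ValueError("shape(B) must be a contiguous subset of shape(A)")
    out = []
    # product() yields multi-indices in row-major order, matching `a`.
    for flat, idx in enumerate(product(*(range(d) for d in shape_a))):
        j = 0
        for k, d in enumerate(shape_b):
            j = j * d + idx[start + k]
        out.append(a[flat] / b[j])
    return out


a = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]               # shape (2, 3)
print(div_broadcast(a, (2, 3), [2.0, 4.0, 6.0], (3,)))       # suffix matching
print(div_broadcast(a, (2, 3), [2.0, 4.0], (2,), axis=0))    # aligned at axis 0
```

With suffix matching each row of `A` is divided by `B` element by element; with `axis=0` each row of `A` is divided by a single entry of `B`.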

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>If set, defines the broadcast dimensions. See doc for details.</dd>
<dt><tt>broadcast</tt> : int</dt>
<dd>Pass 1 to enable broadcasting</dd>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>First operand, should share the type with the second operand.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Second operand. With broadcasting can be of smaller size than A. If broadcasting is disabled it should be of the same size.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T</dt>
<dd>Result, has same dimensions and type as A</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Dropout-1"></a>**Dropout-1**</a>

  Dropout takes one input data (Tensor<float>) and produces two Tensor outputs,
  output (Tensor<float>) and mask (Tensor<bool>). Depending on whether it is in
  test mode or not, the output will either be a random dropout or a simple
  copy of the input. Note that our implementation of Dropout does scaling in
  the training phase, so during testing nothing needs to be done.
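
A minimal sketch of the two modes follows. The scaling factor `1 / (1 - ratio)` applied to kept elements is an assumption (the text only says that scaling happens during training), and the `seed` parameter is added purely to make the illustration reproducible.

```python
import random


def dropout(data, ratio=0.5, is_test=0, seed=0):
    """Return (output, mask) for a flat list of floats."""
    if is_test:
        # Test mode: plain copy of the input, mask is not filled.
        return list(data), None
    rng = random.Random(seed)
    keep = 1.0 - ratio
    # True marks an element that survives dropout.
    mask = [rng.random() >= ratio for _ in data]
    # Assumed scaling: kept elements are divided by the keep probability.
    output = [x / keep if m else 0.0 for x, m in zip(data, mask)]
    return output, mask


out, mask = dropout([1.0, 2.0, 3.0], is_test=1)
print(out, mask)  # [1.0, 2.0, 3.0] None
```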

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
<dt><tt>is_test</tt> : int</dt>
<dd>(int, default 0) if nonzero, run dropout in test mode where the output is simply Y = X.</dd>
<dt><tt>ratio</tt> : float</dt>
<dd>(float, default 0.5) the ratio of random dropout</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>The input data as Tensor.</dd>
</dl>

#### Outputs (1 - 2)

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The output.</dd>
<dt><tt>mask</tt> (optional) : T</dt>
<dd>The output mask. If is_test is nonzero, this output is not filled.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Elu-1"></a>**Elu-1**</a>

  Elu takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the function `f(x) = alpha * (exp(x) - 1.) for x <
  0`, `f(x) = x for x >= 0`, is applied to the tensor elementwise.
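
The piecewise definition translates directly into code. This is a sketch on a flat list of floats; the helper name `elu` is chosen for illustration.

```python
import math


def elu(xs, alpha=1.0):
    """Apply f(x) = x for x >= 0, alpha * (exp(x) - 1) for x < 0."""
    return [x if x >= 0 else alpha * (math.exp(x) - 1.0) for x in xs]


print(elu([0.0, 2.0]))  # [0.0, 2.0]
print(elu([-1.0]))      # a single negative value, roughly -0.632
```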
  

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>alpha</tt> : float</dt>
<dd>Coefficient of ELU; defaults to 1.0.</dd>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>1D input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>1D output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Equal-1"></a>**Equal-1**</a>

  Returns the tensor resulting from performing the `equal` logical operation
  elementwise on the input tensors `A` and `B`.
  
  If broadcasting is enabled, the right-hand-side argument will be broadcast
  to match the shape of the left-hand-side argument. See the doc of `Add` for a
  detailed description of the broadcasting rules.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>If set, defines the broadcast dimensions.</dd>
<dt><tt>broadcast</tt> : int</dt>
<dd>Enable broadcasting</dd>
</dl>

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>Left input tensor for the logical operator.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Right input tensor for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(bool), tensor(int32), tensor(int64)</dt>
<dd>Constrains input to integral tensors.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrains output to boolean tensor.</dd>
</dl>

### <a name="Exp-1"></a>**Exp-1**</a>

  Calculates the exponential of the given input tensor, element-wise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The exponential of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Flatten-1"></a>**Flatten-1**</a>

  Flattens the input tensor into a 2D matrix. If input tensor has shape
  (d_0, d_1, ... d_n) then the output will have shape
  (d_0 X d_1 ... d_(axis-1), d_axis X d_(axis+1) ... X d_n).
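
The output shape rule can be checked with a short sketch. The helper names `flatten_shape` and `flatten` are hypothetical, and the data is represented as a flat row-major list.

```python
import math


def flatten_shape(shape, axis=1):
    """Return the 2D output shape for an input of the given shape."""
    if not 0 <= axis <= len(shape):
        raise ValueError("axis must be in the range [0, rank]")
    # math.prod of an empty slice is 1, which covers axis = 0 and axis = rank.
    return (math.prod(shape[:axis]), math.prod(shape[axis:]))


def flatten(values, shape, axis=1):
    """Regroup a flat row-major list into the 2D output as nested lists."""
    rows, cols = flatten_shape(shape, axis)
    return [values[r * cols:(r + 1) * cols] for r in range(rows)]


print(flatten_shape((2, 3, 4, 5), axis=1))  # (2, 60)
print(flatten_shape((2, 3, 4, 5), axis=0))  # (1, 120)
print(flatten(list(range(6)), (1, 2, 3), axis=2))  # [[0, 1, 2], [3, 4, 5]]
```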

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>(Default to 1) Indicates up to which input dimension (exclusive) the input should be flattened into the outer dimension of the output. The value for axis must be in the range [0, R], where R is the rank of the input tensor. When axis = 0, the shape of the output tensor is (1, (d_0 X d_1 ... d_n)), where the shape of the input tensor is (d_0, d_1, ... d_n).</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>A tensor of rank >= axis.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>A 2D tensor with the contents of the input tensor, with input dimensions up to axis flattened to the outer dimension of the output and remaining input dimensions flattened into the inner dimension of the output.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Floor-1"></a>**Floor-1**</a>

  Floor takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the floor function, y = floor(x), is applied to
  the tensor elementwise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="GRU-1"></a>**GRU-1**</a>

  Computes a one-layer GRU. This operator is usually supported via some custom
  implementation such as CuDNN.
  
  Notations:
  
  `X` - input tensor
  
  `z` - update gate
  
  `r` - reset gate
  
  `h` - hidden gate
  
  `t` - time step (t-1 means previous time step)
  
  `W[zrh]` - W parameter weight matrix for update, reset, and hidden gates
  
  `R[zrh]` - R recurrence weight matrix for update, reset, and hidden gates
  
  `Wb[zrh]` - W bias vectors for update, reset, and hidden gates
  
  `Rb[zrh]` - R bias vectors for update, reset, and hidden gates
  
  `WB[zrh]` - W parameter weight matrix for backward update, reset, and hidden gates
  
  `RB[zrh]` - R recurrence weight matrix for backward update, reset, and hidden gates
  
  `WBb[zrh]` - W bias vectors for backward update, reset, and hidden gates
  
  `RBb[zrh]` - R bias vectors for backward update, reset, and hidden gates
  
  `H` - Hidden state
  
  `num_directions` - 2 if direction == bidirectional else 1
  
  Activation functions:
  
    Relu(x)                - max(0, x)
  
    Tanh(x)                - (1 - e^{-2x})/(1 + e^{-2x})
  
    Sigmoid(x)             - 1/(1 + e^{-x})
  
    (NOTE: Below are optional)
  
    Affine(x)              - alpha*x + beta
  
    LeakyRelu(x)           - x if x >= 0 else alpha * x
  
    ThresholdedRelu(x)     - x if x >= alpha else 0
  
    ScaledTanh(x)          - alpha*Tanh(beta*x)
  
    HardSigmoid(x)         - min(max(alpha*x + beta, 0), 1)
  
    Elu(x)                 - x if x >= 0 else alpha*(e^x - 1)
  
    Softsign(x)            - x/(1 + |x|)
  
    Softplus(x)            - log(1 + e^x)
  
  Equations (Default: f=Sigmoid, g=Tanh):
  
    - zt = f(Xt*(Wz^T) + Ht-1*Rz + Wbz + Rbz)
  
    - rt = f(Xt*(Wr^T) + Ht-1*Rr + Wbr + Rbr)
  
    - ht = g(Xt*(Wh^T) + (rt (.) Ht-1)*Rh + Rbh + Wbh) # default, when linear_before_reset = 0
  
    - ht = g(Xt*(Wh^T) + (rt (.) (Ht-1*Rh + Rbh)) + Wbh) # when linear_before_reset != 0
  
    - Ht = (1 - zt) (.) ht + zt (.) Ht-1
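
A single forward time step of these equations can be sketched in plain Python with the default activations. The dictionary layout keyed by gate name, the helper names, and reading `Ht-1*R` as a product with the transpose of `R` (mirroring `Xt*(W^T)`) are assumptions made for illustration. Clipping, custom activations, and the backward direction are omitted.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(m, v):
    """Row vector v times the transpose of m: one dot product per row of m."""
    return [sum(mi * vi for mi, vi in zip(row, v)) for row in m]


def gru_step(x_t, h_prev, W, R, Wb, Rb, linear_before_reset=0):
    """W, R, Wb, Rb are dicts keyed by gate name 'z', 'r', 'h' (assumed layout)."""
    n = len(h_prev)
    xz, xr, xh = (matvec(W[g], x_t) for g in "zrh")
    hz, hr = matvec(R["z"], h_prev), matvec(R["r"], h_prev)
    z = [sigmoid(xz[i] + hz[i] + Wb["z"][i] + Rb["z"][i]) for i in range(n)]
    r = [sigmoid(xr[i] + hr[i] + Wb["r"][i] + Rb["r"][i]) for i in range(n)]
    if linear_before_reset:
        # Reset gate applied after the recurrent linear transform.
        hh = matvec(R["h"], h_prev)
        rec = [r[i] * (hh[i] + Rb["h"][i]) for i in range(n)]
    else:
        # Default: reset gate applied to the previous hidden state first.
        hh = matvec(R["h"], [r[i] * h_prev[i] for i in range(n)])
        rec = [hh[i] + Rb["h"][i] for i in range(n)]
    h_cand = [math.tanh(xh[i] + rec[i] + Wb["h"][i]) for i in range(n)]
    return [(1.0 - z[i]) * h_cand[i] + z[i] * h_prev[i] for i in range(n)]


zeros_m = {g: [[0.0, 0.0], [0.0, 0.0]] for g in "zrh"}
zeros_b = {g: [0.0, 0.0] for g in "zrh"}
# With all-zero parameters both gates are 0.5 and the candidate is 0,
# so the new state is half of the previous one.
print(gru_step([1.0, 1.0], [2.0, -4.0], zeros_m, zeros_m, zeros_b, zeros_b))  # [1.0, -2.0]
```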

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>activation_alpha</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM.</dd>
<dt><tt>activation_beta</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM.</dd>
<dt><tt>activations</tt> : list of strings</dt>
<dd>A list of 2 (or 4 if bidirectional) activation functions for update, reset, and hidden gates. The activation functions must be one of the activation functions specified above. Optional: See the equations for default if not specified.</dd>
<dt><tt>clip</tt> : float</dt>
<dd>Cell clip threshold. Clipping bounds the elements of a tensor in the range of [-threshold, +threshold] and is applied to the input of activations. No clip if not specified.</dd>
<dt><tt>direction</tt> : string</dt>
<dd>Specify if the RNN is forward, reverse, or bidirectional. Must be one of forward (default), reverse, or bidirectional.</dd>
<dt><tt>hidden_size</tt> : int</dt>
<dd>Number of neurons in the hidden layer</dd>
<dt><tt>output_sequence</tt> : int</dt>
<dd>The sequence output for the hidden is optional if 0. Default 0.</dd>
</dl>

#### Inputs (3 - 6)

<dl>
<dt><tt>X</tt> : T</dt>
<dd>The input sequences packed (and potentially padded) into one 3-D tensor with the shape of `[seq_length, batch_size, input_size]`.</dd>
<dt><tt>W</tt> : T</dt>
<dd>The weight tensor for the gates. Concatenation of `W[zrh]` and `WB[zrh]` (if bidirectional) along dimension 0. This tensor has shape `[num_directions, 3*hidden_size, input_size]`.</dd>
<dt><tt>R</tt> : T</dt>
<dd>The recurrence weight tensor. Concatenation of `R[zrh]` and `RB[zrh]` (if bidirectional) along dimension 0. This tensor has shape `[num_directions, 3*hidden_size, hidden_size]`.</dd>
<dt><tt>B</tt> (optional) : T</dt>
<dd>The bias tensor for the gates. Concatenation of `[Wb[zrh], Rb[zrh]]` and `[WBb[zrh], RBb[zrh]]` (if bidirectional) along dimension 0. This tensor has shape `[num_directions, 6*hidden_size]`. Optional: If not specified - assumed to be 0</dd>
<dt><tt>sequence_lens</tt> (optional) : T1</dt>
<dd>Optional tensor specifying lengths of the sequences in a batch. If not specified - assumed all sequences in the batch to have length `seq_length`. It has shape `[batch_size]`.</dd>
<dt><tt>initial_h</tt> (optional) : T</dt>
<dd>Optional initial value of the hidden. If not specified - assumed to be 0. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (optional) : T</dt>
<dd>A tensor that concats all the intermediate output values of the hidden. It has shape `[seq_length, num_directions, batch_size, hidden_size]`. It is optional if `output_sequence` is 0.</dd>
<dt><tt>Y_h</tt> : T</dt>
<dd>The last output value of the hidden. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
<dt><tt>T1</tt> : tensor(int32)</dt>
<dd>Constrain sequence_lens to integer tensor.</dd>
</dl>

### <a name="GRUUnit-1"></a>**GRUUnit-1**</a>

  GRUUnit computes the activations of a standard GRU,
  in a sequence-length aware fashion.
  Concretely, given the (fused) inputs X (TxNxD), the previous hidden
  state (NxD), and the sequence lengths (N), computes the GRU
  activations, avoiding computation if the input is invalid (as in, the
  value at X[t][n] >= seqLengths[n]).

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>drop_states</tt> : int</dt>
<dd>Bool to determine if hidden state is zeroed or passed along for timesteps past the given sequence_length.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>hidden_prev</tt> : T</dt>
<dd>The previous GRU hidden state.</dd>
<dt><tt>gates</tt> : T</dt>
<dd>Unactivated gate outputs from forget, update, and output gates, pre-activation.</dd>
<dt><tt>seq_lengths</tt> : T</dt>
<dd>Array of sequence lengths.  len(seq_lengths) should equal batch size N.</dd>
<dt><tt>t</tt> : T</dt>
<dd>The timestep for this operation.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>hidden</tt> : T</dt>
<dd>The new GRU hidden state calculated by this op.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Gather-1"></a>**Gather-1**</a>

  Given `data` tensor of rank r >= 1, and `indices` tensor of rank q, gathers
  entries of the axis dimension of `data` (by default the outermost one, axis=0) indexed by `indices`, and concatenates
  them in an output tensor of rank q + (r - 1).
  Example 1:
    data = [
        [1.0, 1.2],
        [2.3, 3.4],
        [4.5, 5.7],
    ]
    indices = [
        [0, 1],
        [1, 2],
    ]
    output = [
        [
            [1.0, 1.2],
            [2.3, 3.4],
        ],
        [
            [2.3, 3.4],
            [4.5, 5.7],
        ],
    ]
  Example 2:
    data = [
        [1.0, 1.2, 1.9],
        [2.3, 3.4, 3.9],
        [4.5, 5.7, 5.9],
    ]
    indices = [
        [0, 2],
    ]
    axis = 1
    output = [
        [[1.0, 1.9]],
        [[2.3, 3.9]],
        [[4.5, 5.9]],
    ]
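
The gather rule can be sketched recursively on nested lists. The helper name `gather` is hypothetical, and negative `axis` values are not handled in this sketch.

```python
def gather(data, indices, axis=0):
    """Gather entries of `data` along `axis` using nested-list `indices`."""
    if axis > 0:
        # Descend into the leading dimensions until the gather axis is reached.
        return [gather(sub, indices, axis - 1) for sub in data]
    if isinstance(indices, int):
        return data[indices]
    # The shape of `indices` replaces the gathered dimension in the output.
    return [gather(data, i, 0) for i in indices]


data = [[1.0, 1.2], [2.3, 3.4], [4.5, 5.7]]
print(gather(data, [[0, 1], [1, 2]]))
# [[[1.0, 1.2], [2.3, 3.4]], [[2.3, 3.4], [4.5, 5.7]]]
print(gather([[10, 20, 30]], [2, 0], axis=1))  # [[30, 10]]
```

The first call reproduces Example 1. The second call uses a rank-2 `data` and a rank-1 `indices`, so the output has rank 1 + (2 - 1) = 2.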

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>Which axis to gather on, defaults to 0. Negative value means counting dimensions from the back. Accepted range in [-r, r-1]</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>Tensor of rank r >= 1.</dd>
<dt><tt>indices</tt> : Tind</dt>
<dd>Tensor of int32/int64 indices, of any rank q.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Tensor of rank q + (r - 1).</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool)</dt>
<dd>Constrain input and output types to any tensor type.</dd>
<dt><tt>Tind</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain indices to integer types</dd>
</dl>

### <a name="Gemm-1"></a>**Gemm-1**</a>

  General Matrix multiplication:
  https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms#Level_3
  Compute Y = alpha * A * B + beta * C, where input tensor A has
  dimension (M X K), input tensor B has dimension (K X N), input tensor C and
  output tensor Y have dimension (M X N).
  If attribute broadcast is non-zero, input tensor C will be broadcast to match
  the dimension requirement. A will be transposed before doing the computation
  if attribute transA is non-zero, same for B and transB.
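
The computation can be sketched on nested lists as follows. The Python defaults for `alpha` and `beta` are chosen for illustration only, the helper names are hypothetical, and broadcasting of C is left out, so C must already be (M X N).

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def gemm(a, b, c, alpha=1.0, beta=1.0, trans_a=0, trans_b=0):
    """Compute alpha * A * B + beta * C on nested lists."""
    if trans_a:
        a = transpose(a)
    if trans_b:
        b = transpose(b)
    m, k, n = len(a), len(b), len(b[0])
    return [[alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
             for j in range(n)] for i in range(m)]


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = [[1.0, 1.0], [1.0, 1.0]]
print(gemm(a, b, c, alpha=2.0, beta=3.0))  # [[41.0, 47.0], [89.0, 103.0]]
```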

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>alpha</tt> : float</dt>
<dd>Scalar multiplier for the product of input tensors A * B</dd>
<dt><tt>beta</tt> : float</dt>
<dd>Scalar multiplier for input tensor C</dd>
<dt><tt>broadcast</tt> : int</dt>
<dd>Whether C should be broadcast</dd>
<dt><tt>transA</tt> : int</dt>
<dd>Whether A should be transposed</dd>
<dt><tt>transB</tt> : int</dt>
<dd>Whether B should be transposed</dd>
</dl>

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>Input tensor A</dd>
<dt><tt>B</tt> : T</dt>
<dd>Input tensor B</dd>
<dt><tt>C</tt> : T</dt>
<dd>Input tensor C, can be inplace.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="GivenTensorFill-1"></a>**GivenTensorFill-1**</a>

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>extra_shape</tt> : list of ints</dt>
<dd></dd>
<dt><tt>input_as_shape</tt> : int</dt>
<dd></dd>
<dt><tt>shape</tt> : list of ints</dt>
<dd></dd>
<dt><tt>values</tt> : list of floats</dt>
<dd></dd>
</dl>

#### Inputs (0 - 1)

<dl>
<dt><tt>shape</tt> (optional) : T</dt>
<dd>The shape of filled tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>The filled tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="GlobalAveragePool-1"></a>**GlobalAveragePool-1**</a>

  GlobalAveragePool consumes an input tensor X and applies average pooling across
   the values in the same channel. This is equivalent to AveragePool with kernel size
   equal to the spatial dimension of input tensor.
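
For the image case, the pooling reduces each (H x W) channel to its mean. A minimal sketch on nested lists, with a hypothetical helper name:

```python
def global_average_pool(x):
    """x is a nested list shaped [N][C][H][W]; returns [N][C][1][1]."""
    return [[[[sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))]]
             for ch in sample] for sample in x]


x = [[[[1.0, 2.0], [3.0, 4.0]],
      [[0.0, 0.0], [0.0, 8.0]]]]
print(global_average_pool(x))  # [[[[2.5]], [[2.0]]]]
```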

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output data tensor from pooling across the input tensor. Dimensions will be N x C x 1 x 1</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="GlobalLpPool-1"></a>**GlobalLpPool-1**</a>

  GlobalLpPool consumes an input tensor X and applies Lp pooling across
   the values in the same channel. This is equivalent to LpPool with kernel size
   equal to the spatial dimension of input tensor.
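
A sketch for the image case, assuming that the pooled value is the usual Lp norm of the channel, i.e. `(sum |x|^p)^(1/p)`; the helper name is hypothetical.

```python
def global_lp_pool(x, p=2.0):
    """x is a nested list shaped [N][C][H][W]; returns [N][C][1][1]."""
    return [[[[sum(abs(v) ** p for row in ch for v in row) ** (1.0 / p)]]
             for ch in sample] for sample in x]


# One sample, one channel holding the values 3 and -4.
y = global_lp_pool([[[[3.0, -4.0]]]], p=2.0)
print(y)  # the L2 norm of (3, -4), i.e. approximately 5
```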

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>p</tt> : float</dt>
<dd>p value of the Lp norm used to pool over the input data, default is 2.0.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output data tensor from pooling across the input tensor. Dimensions will be N x C x 1 x 1</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="GlobalMaxPool-1"></a>**GlobalMaxPool-1**</a>

  GlobalMaxPool consumes an input tensor X and applies max pooling across
   the values in the same channel. This is equivalent to MaxPool with kernel size
   equal to the spatial dimension of input tensor.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output data tensor from pooling across the input tensor. Dimensions will be N x C x 1 x 1</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Greater-1"></a>**Greater-1**</a>

  Returns the tensor resulting from performing the `greater` logical operation
  elementwise on the input tensors `A` and `B`.
  
  If broadcasting is enabled, the right-hand-side argument will be broadcast
  to match the shape of the left-hand-side argument. See the doc of `Add` for a
  detailed description of the broadcasting rules.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>If set, defines the broadcast dimensions.</dd>
<dt><tt>broadcast</tt> : int</dt>
<dd>Enable broadcasting</dd>
</dl>

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>Left input tensor for the logical operator.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Right input tensor for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrains input to float tensors.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrains output to boolean tensor.</dd>
</dl>

### <a name="HardSigmoid-1"></a>**HardSigmoid-1**</a>

  HardSigmoid takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the HardSigmoid function, y = max(0, min(1, alpha * x + beta)),
  is applied to the tensor elementwise.
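
The clamp can be written in one line. This sketch works on a flat list of floats and uses the default `alpha` and `beta` listed under Attributes; the helper name is hypothetical.

```python
def hard_sigmoid(xs, alpha=0.2, beta=0.5):
    """Apply y = max(0, min(1, alpha * x + beta)) to each element."""
    return [max(0.0, min(1.0, alpha * x + beta)) for x in xs]


print(hard_sigmoid([-5.0, 0.0, 5.0]))  # [0.0, 0.5, 1.0]
```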

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>alpha</tt> : float</dt>
<dd>Value of alpha; defaults to 0.2.</dd>
<dt><tt>beta</tt> : float</dt>
<dd>Value of beta; defaults to 0.5.</dd>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Hardmax-1"></a>**Hardmax-1**</a>

  The operator computes the hardmax (1 for the first maximum value, and 0 for all others) values for each layer in the batch
   of the given input. The input is a 2-D tensor (Tensor<float>) of size
  (batch_size x input_feature_dimensions). The output tensor has the same shape
  and contains the hardmax values of the corresponding input.
  
  X does not need to explicitly be a 2D vector; rather, it will be
  coerced into one. For an arbitrary n-dimensional tensor
  X \in [a_0, a_1, ..., a_{k-1}, a_k, ..., a_{n-1}], where k is
  the axis provided, X will be coerced into a 2-dimensional tensor with
  dimensions [a_0 * ... * a_{k-1}, a_k * ... * a_{n-1}]. For the default
  case where axis=1, this means the X tensor will be coerced into a 2D tensor
  of dimensions [a_0, a_1 * ... * a_{n-1}], where a_0 is often the batch size.
  In this situation, we must have a_0 = N and a_1 * ... * a_{n-1} = D.
  Each of these dimensions must be matched correctly, or else the operator
  will throw errors.
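
The coercion to 2D and the first-maximum rule can be sketched together. The input is assumed to be a flat row-major list accompanied by its shape, and the helper name `hardmax` is hypothetical.

```python
import math


def hardmax(values, shape, axis=1):
    """Return a flat list with 1.0 at the first maximum of each coerced row."""
    rows = math.prod(shape[:axis])   # a_0 * ... * a_{k-1}
    cols = math.prod(shape[axis:])   # a_k * ... * a_{n-1}
    out = [0.0] * (rows * cols)
    for r in range(rows):
        row = values[r * cols:(r + 1) * cols]
        # list.index returns the first occurrence, so ties go to the first maximum.
        out[r * cols + row.index(max(row))] = 1.0
    return out


values = [1.0, 3.0, 3.0, 5.0, 2.0, 0.0]
print(hardmax(values, (2, 3), axis=1))  # [0.0, 1.0, 0.0, 1.0, 0.0, 0.0]
print(hardmax(values, (2, 3), axis=0))  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
```

With `axis=0` the whole tensor is treated as one row, so only a single element is set to 1.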

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>(int) default to 1; describes the axis of the inputs when coerced to 2D; defaults to one because the 0th axis most likely describes the batch_size</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>The input tensor that's coerced into a 2D matrix of size (NxD) as described above.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The output values with the same shape as input tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Identity-1"></a>**Identity-1**</a>

  Identity operator

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Tensor to copy input into.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool)</dt>
<dd>Constrain input and output types to all tensor types.</dd>
</dl>

### <a name="If-1"></a>**If-1**</a>

  If conditional

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>else_branch</tt> : graph (required)</dt>
<dd>Graph to run if condition is false. Has N outputs: values you wish to be live-out to the enclosing scope. The number of outputs must match the number of outputs in the then_branch.</dd>
<dt><tt>then_branch</tt> : graph (required)</dt>
<dd>Graph to run if condition is true. Has N outputs: values you wish to be live-out to the enclosing scope. The number of outputs must match the number of outputs in the else_branch.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>cond</tt> : B</dt>
<dd>Condition for the if</dd>
</dl>

#### Outputs (1 - &#8734;)

<dl>
<dt><tt>outputs</tt> (variadic) : V</dt>
<dd>Values that are live-out to the enclosing scope.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>V</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool)</dt>
<dd>All Tensor types</dd>
<dt><tt>B</tt> : tensor(bool)</dt>
<dd>Only bool</dd>
</dl>

### <a name="ImageScaler-1"></a>**ImageScaler-1**</a>

  Scale and bias the input image. Bias values are stored in
  the same ordering as the image pixel format.
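
A minimal sketch, assuming the scale is applied first and the per-channel bias is added afterwards (`output = input * scale + bias[c]`); the text above does not state the order, and the helper name is hypothetical.

```python
def image_scaler(x, scale=1.0, bias=None):
    """x is a nested list shaped [N][C][H][W]; bias holds one value per channel."""
    out = []
    for sample in x:
        channels = []
        for c, ch in enumerate(sample):
            b = bias[c] if bias is not None else 0.0
            # Assumed order: multiply by scale, then add the channel bias.
            channels.append([[v * scale + b for v in row] for row in ch])
        out.append(channels)
    return out


x = [[[[1.0, 2.0]], [[3.0, 4.0]]]]  # N=1, C=2, H=1, W=2
print(image_scaler(x, scale=2.0, bias=[0.5, -1.0]))  # [[[[2.5, 4.5]], [[5.0, 7.0]]]]
```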

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>bias</tt> : list of floats</dt>
<dd>Bias applied to each channel, same size as C.</dd>
<dt><tt>scale</tt> : float</dt>
<dd>(float, default 1.0) the scale to apply.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor of shape [N,C,H,W]</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Result, has same shape and type as input</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="InstanceNormalization-1"></a>**InstanceNormalization-1**</a>

  Carries out instance normalization as described in the paper
  https://arxiv.org/abs/1607.08022.
  
  y = scale * (x - mean) / sqrt(variance + epsilon) + B,
  where mean and variance are computed per instance per channel.
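
The formula can be sketched on nested lists as follows. The use of the population variance (dividing by the number of elements) is an assumption, and the helper name is hypothetical.

```python
import math


def instance_norm(x, scale, bias, epsilon=1e-5):
    """x is shaped [N][C][H][W]; scale and bias hold one value per channel."""
    out = []
    for sample in x:
        normalized = []
        for c, ch in enumerate(sample):
            vals = [v for row in ch for v in row]
            mean = sum(vals) / len(vals)
            # Assumed: population variance over the H x W values of this channel.
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            inv_std = 1.0 / math.sqrt(var + epsilon)
            normalized.append([[scale[c] * (v - mean) * inv_std + bias[c]
                                for v in row] for row in ch])
        out.append(normalized)
    return out


# One instance, one channel with values 1 and 3: mean 2, variance 1.
y = instance_norm([[[[1.0, 3.0]]]], scale=[2.0], bias=[1.0])
print(y)  # close to [[[[-1.0, 3.0]]]] up to the effect of epsilon
```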
  

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
<dt><tt>epsilon</tt> : float</dt>
<dd>The epsilon value to use to avoid division by zero, default is 1e-5f.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>The input 4-dimensional tensor of shape NCHW.</dd>
<dt><tt>scale</tt> : T</dt>
<dd>The input 1-dimensional scale tensor of size C.</dd>
<dt><tt>B</tt> : T</dt>
<dd>The input 1-dimensional bias tensor of size C.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The output 4-dimensional tensor of the same shape as input.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="LRN-1"></a>**LRN-1**</a>

  Local Response Normalization proposed in the [AlexNet paper](https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf).
  It normalizes over local input regions.
  The local region is defined across the channels. For an element X[n, c, d1, ..., dk] in a tensor
  of shape (N x C x D1 x D2 x ... x Dk), its region is
  {X[n, i, d1, ..., dk] | max(0, c - floor((size - 1) / 2)) <= i <= min(C - 1, c + ceil((size - 1) / 2))}.
  
  square_sum[n, c, d1, ..., dk] = sum(X[n, i, d1, ..., dk] ^ 2),
  where max(0, c - floor((size - 1) / 2)) <= i <= min(C - 1, c + ceil((size - 1) / 2)).
  
  Y[n, c, d1, ..., dk] = X[n, c, d1, ..., dk] / (bias + alpha / size * square_sum[n, c, d1, ..., dk] ) ^ beta
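
Because the region only spans channels, the formula can be illustrated on the channel vector at one fixed position `(n, d1, ..., dk)`. The helper name `lrn_channels` is hypothetical and the Python defaults mirror the attribute defaults listed below.

```python
import math


def lrn_channels(x, size, alpha=1e-4, beta=0.75, bias=1.0):
    """x holds the C channel values at one batch/spatial position."""
    c_total = len(x)
    lo_off = (size - 1) // 2             # floor((size - 1) / 2)
    hi_off = math.ceil((size - 1) / 2)   # ceil((size - 1) / 2)
    out = []
    for c in range(c_total):
        lo = max(0, c - lo_off)
        hi = min(c_total - 1, c + hi_off)
        square_sum = sum(x[i] ** 2 for i in range(lo, hi + 1))
        out.append(x[c] / (bias + alpha / size * square_sum) ** beta)
    return out


# alpha / size = 1 and beta = 1 keep the arithmetic easy to follow:
# the denominators are 1 + 5, 1 + 14 and 1 + 13.
print(lrn_channels([1.0, 2.0, 3.0], size=3, alpha=3.0, beta=1.0, bias=1.0))
```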

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>alpha</tt> : float</dt>
<dd>Scaling parameter, default is 1e-4f.</dd>
<dt><tt>beta</tt> : float</dt>
<dd>The exponent, default is 0.75f</dd>
<dt><tt>bias</tt> : float</dt>
<dd>Default to 1.0f</dd>
<dt><tt>size</tt> : int (required)</dt>
<dd>The number of channels to sum over</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size. Optionally, if dimension denotation is in effect, the operation expects the input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE ...].</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor, which has the same shape and type as the input tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="LSTM-1"></a>**LSTM-1**</a>

  Computes a one-layer LSTM. This operator is usually supported via some
  custom implementation such as CuDNN.
  
  Notations:
  
  `X` - input tensor
  
  `i` - input gate
  
  `o` - output gate
  
  `f` - forget gate
  
  `c` - cell gate
  
  `t` - time step (t-1 means previous time step)
  
  `W[iofc]` - W parameter weight matrix for input, output, forget, and cell gates
  
  `R[iofc]` - R recurrence weight matrix for input, output, forget, and cell gates
  
  `Wb[iofc]` - W bias vectors for input, output, forget, and cell gates
  
  `Rb[iofc]` - R bias vectors for input, output, forget, and cell gates
  
  `P[iof]`  - P peephole weight vector for input, output, and forget gates
  
  `WB[iofc]` - W parameter weight matrix for backward input, output, forget, and cell gates
  
  `RB[iofc]` - R recurrence weight matrix for backward input, output, forget, and cell gates
  
  `WBb[iofc]` - W bias vectors for backward input, output, forget, and cell gates
  
  `RBb[iofc]` - R bias vectors for backward input, output, forget, and cell gates
  
  `PB[iof]`  - P peephole weight vector for backward input, output, and forget gates
  
  `H` - Hidden state
  
  `num_directions` - 2 if direction == bidirectional else 1
  
  Activation functions:
  
    Relu(x)                - max(0, x)
  
    Tanh(x)                - (1 - e^{-2x})/(1 + e^{-2x})
  
    Sigmoid(x)             - 1/(1 + e^{-x})
  
    (NOTE: Below are optional)
  
    Affine(x)              - alpha*x + beta
  
    LeakyRelu(x)           - x if x >= 0 else alpha * x
  
    ThresholdedRelu(x)     - x if x >= alpha else 0
  
    ScaledTanh(x)          - alpha*Tanh(beta*x)
  
    HardSigmoid(x)         - min(max(alpha*x + beta, 0), 1)
  
    Elu(x)                 - x if x >= 0 else alpha*(e^x - 1)
  
    Softsign(x)            - x/(1 + |x|)
  
    Softplus(x)            - log(1 + e^x)
  
  Equations (Default: f=Sigmoid, g=Tanh, h=Tanh):
  
    - it = f(Xt*(Wi^T) + Ht-1*(Ri^T) + Pi (.) Ct-1 + Wbi + Rbi)
  
    - ft = f(Xt*(Wf^T) + Ht-1*(Rf^T) + Pf (.) Ct-1 + Wbf + Rbf)
  
    - ct = g(Xt*(Wc^T) + Ht-1*(Rc^T) + Wbc + Rbc)
  
    - Ct = ft (.) Ct-1 + it (.) ct
  
    - ot = f(Xt*(Wo^T) + Ht-1*(Ro^T) + Po (.) Ct + Wbo + Rbo)
  
    - Ht = ot (.) h(Ct)
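
As an illustration of the equations, here is a minimal sketch of one forward time step for a single batch element, using the default activations (f=Sigmoid, g=Tanh, h=Tanh). The names `lstm_step` and `matvec` and the per-gate dictionary layout are hypothetical. The sketch assumes each `W` block has shape `[hidden_size, input_size]` and each `R` block has shape `[hidden_size, hidden_size]`, with rows indexing hidden units, so both are applied as a product with the transposed block. Clipping, `input_forget`, and `sequence_lens` are left out.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(M, v):
    # M has shape [rows, cols]; computes v * (M^T), one dot product per row of M.
    return [sum(m * a for m, a in zip(row, v)) for row in M]

def lstm_step(x_t, h_prev, c_prev, W, R, Wb, Rb, P=None):
    """W, R, Wb, Rb: dicts keyed by gate letter 'i', 'o', 'f', 'c' (hypothetical layout).
    P: optional dict of peephole vectors keyed by 'i', 'o', 'f'; zeros if omitted."""
    H = len(h_prev)
    P = P or {g: [0.0] * H for g in "iof"}

    def pre(g):
        # Xt*(Wg^T) + Ht-1*(Rg^T) + Wbg + Rbg
        wx, rh = matvec(W[g], x_t), matvec(R[g], h_prev)
        return [wx[k] + rh[k] + Wb[g][k] + Rb[g][k] for k in range(H)]

    pi, pf, pc, po = pre("i"), pre("f"), pre("c"), pre("o")
    i_t = [sigmoid(pi[k] + P["i"][k] * c_prev[k]) for k in range(H)]
    f_t = [sigmoid(pf[k] + P["f"][k] * c_prev[k]) for k in range(H)]
    c_t = [math.tanh(pc[k]) for k in range(H)]                      # candidate cell value
    C_t = [f_t[k] * c_prev[k] + i_t[k] * c_t[k] for k in range(H)]
    o_t = [sigmoid(po[k] + P["o"][k] * C_t[k]) for k in range(H)]   # peephole uses the new Ct
    H_t = [o_t[k] * math.tanh(C_t[k]) for k in range(H)]
    return H_t, C_t
```

Running the step over `seq_length` inputs while feeding `H_t` and `C_t` back in would produce the per-step hidden values and the final hidden and cell values for one direction.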

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>activation_alpha</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM. Default values are the same as those of the corresponding ONNX operators. For example, with LeakyRelu, the default alpha is 0.01.</dd>
<dt><tt>activation_beta</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM. Default values are the same as those of the corresponding ONNX operators.</dd>
<dt><tt>activations</tt> : list of strings</dt>
<dd>A list of 3 (or 6 if bidirectional) activation functions for input, output, forget, cell, and hidden. The activation functions must be one of the activation functions specified above. Optional: See the equations for default if not specified.</dd>
<dt><tt>clip</tt> : float</dt>
<dd>Cell clip threshold. Clipping bounds the elements of a tensor in the range of [-threshold, +threshold] and is applied to the input of activations. No clip if not specified.</dd>
<dt><tt>direction</tt> : string</dt>
<dd>Specify if the RNN is forward, reverse, or bidirectional. Must be one of forward (default), reverse, or bidirectional.</dd>
<dt><tt>hidden_size</tt> : int</dt>
<dd>Number of neurons in the hidden layer</dd>
<dt><tt>input_forget</tt> : int</dt>
<dd>Couple the input and forget gates if 1, default 0.</dd>
<dt><tt>output_sequence</tt> : int</dt>
<dd>If 0, the sequence output of the hidden states is optional. Default is 0.</dd>
</dl>

#### Inputs (3 - 8)

<dl>
<dt><tt>X</tt> : T</dt>
<dd>The input sequences packed (and potentially padded) into one 3-D tensor with the shape of `[seq_length, batch_size, input_size]`.</dd>
<dt><tt>W</tt> : T</dt>
<dd>The weight tensor for the gates. Concatenation of `W[iofc]` and `WB[iofc]` (if bidirectional) along dimension 0. The tensor has shape `[num_directions, 4*hidden_size, input_size]`.</dd>
<dt><tt>R</tt> : T</dt>
<dd>The recurrence weight tensor. Concatenation of `R[iofc]` and `RB[iofc]` (if bidirectional) along dimension 0. This tensor has shape `[num_directions, 4*hidden_size, hidden_size]`.</dd>
<dt><tt>B</tt> (optional) : T</dt>
<dd>The bias tensor for the gates. Concatenation of `[Wb[iofc], Rb[iofc]]`, and `[WBb[iofc], RBb[iofc]]` (if bidirectional) along dimension 0. This tensor has shape `[num_directions, 8*hidden_size]`. Optional: If not specified - assumed to be 0.</dd>
<dt><tt>sequence_lens</tt> (optional) : T1</dt>
<dd>Optional tensor specifying lengths of the sequences in a batch. If not specified - assumed all sequences in the batch to have length `seq_length`. It has shape `[batch_size]`.</dd>
<dt><tt>initial_h</tt> (optional) : T</dt>
<dd>Optional initial value of the hidden. If not specified - assumed to be 0. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
<dt><tt>initial_c</tt> (optional) : T</dt>
<dd>Optional initial value of the cell. If not specified - assumed to be 0. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
<dt><tt>P</tt> (optional) : T</dt>
<dd>The weight tensor for peepholes. Concatenation of `P[iof]` and `PB[iof]` (if bidirectional) along dimension 0. It has shape `[num_directions, 3*hidden_size]`. Optional: If not specified - assumed to be 0.</dd>
</dl>

#### Outputs (0 - 3)

<dl>
<dt><tt>Y</tt> (optional) : T</dt>
<dd>A tensor that concatenates all the intermediate output values of the hidden state. It has shape `[seq_length, num_directions, batch_size, hidden_size]`. It is optional if `output_sequence` is 0.</dd>
<dt><tt>Y_h</tt> (optional) : T</dt>
<dd>The last output value of the hidden. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
<dt><tt>Y_c</tt> (optional) : T</dt>
<dd>The last output value of the cell. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
<dt><tt>T1</tt> : tensor(int32)</dt>
<dd>Constrain sequence_lens to integer tensor.</dd>
</dl>

### <a name="LeakyRelu-1"></a>**LeakyRelu-1**</a>

  LeakyRelu takes input data (Tensor<T>) and an argument alpha, and produces one
  output data (Tensor<T>) where the function `f(x) = alpha * x for x < 0`,
  `f(x) = x for x >= 0`, is applied to the data tensor elementwise.
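
A minimal sketch of the piecewise definition, applied elementwise to a flat list of values. The function name `leaky_relu` is hypothetical; the default `alpha` follows the attribute description.

```python
def leaky_relu(xs, alpha=0.01):
    # f(x) = alpha * x for x < 0, f(x) = x for x >= 0
    return [x if x >= 0 else alpha * x for x in xs]
```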

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>alpha</tt> : float</dt>
<dd>Coefficient of leakage. Defaults to 0.01.</dd>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Less-1"></a>**Less-1**</a>

  Returns the tensor resulting from performing the `less` logical operation
  elementwise on the input tensors `A` and `B`.
  
  If broadcasting is enabled, the right-hand-side argument will be broadcasted
  to match the shape of left-hand-side argument. See the doc of `Add` for a
  detailed description of the broadcasting rules.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>If set, defines the broadcast dimensions.</dd>
<dt><tt>broadcast</tt> : int</dt>
<dd>Enable broadcasting</dd>
</dl>

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>Left input tensor for the logical operator.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Right input tensor for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrains input to float tensors.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrains output to boolean tensor.</dd>
</dl>

### <a name="Log-1"></a>**Log-1**</a>

  Calculates the natural log of the given input tensor, element-wise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The natural log of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="LogSoftmax-1"></a>**LogSoftmax-1**</a>

  The operator computes the logsoftmax (log of softmax) values for each layer in the batch
   of the given input. The input is a 2-D tensor (Tensor<float>) of size
  (batch_size x input_feature_dimensions). The output tensor has the same shape
  and contains the logsoftmax values of the corresponding input.
  
  X does not need to explicitly be a 2D tensor; rather, it will be
  coerced into one. For an arbitrary n-dimensional tensor X of shape
  [a_0, a_1, ..., a_{k-1}, a_k, ..., a_{n-1}], where k is
  the axis provided, X will be coerced into a 2-dimensional tensor with
  dimensions [a_0 * ... * a_{k-1}, a_k * ... * a_{n-1}]. For the default
  case where axis=1, this means the X tensor will be coerced into a 2D tensor
  of dimensions [a_0, a_1 * ... * a_{n-1}], where a_0 is often the batch size.
  In this situation, we must have a_0 = N and a_1 * ... * a_{n-1} = D.
  Each of these dimensions must be matched correctly, or else the operator
  will throw errors.
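
The coercion to 2D and the row-wise computation can be sketched as follows. The sketch assumes the tensor is given as a flat row-major list plus its shape; the function name `log_softmax` and this calling convention are hypothetical. Subtracting the row maximum is a common numerical-stability choice and is not stated in the operator description.

```python
import math

def log_softmax(flat, shape, axis=1):
    """flat: row-major values of a tensor with the given shape."""
    rows = math.prod(shape[:axis])   # a_0 * ... * a_{k-1}
    cols = math.prod(shape[axis:])   # a_k * ... * a_{n-1}
    out = []
    for r in range(rows):
        row = flat[r * cols:(r + 1) * cols]
        m = max(row)  # shift by the row maximum before exponentiating
        log_sum = m + math.log(sum(math.exp(v - m) for v in row))
        out.extend(v - log_sum for v in row)
    return out  # same row-major layout, hence same shape, as the input
```

For example, a tensor of shape (2, 1, 2) with `axis=1` is treated as 2 rows of 2 values each.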

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>(int) default to 1; describes the axis of the inputs when coerced to 2D; defaults to one because the 0th axis most likely describes the batch_size</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>The input tensor that's coerced into a 2D matrix of size (NxD) as described above.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The output values with the same shape as input tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Loop-1"></a>**Loop-1**</a>

  Generic Looping construct. This loop has multiple termination conditions:
  
  1) Trip count. Iteration count specified at runtime. Set by
     specifying the input M. Optional. Set to empty string to omit.
     Note that a static trip count (specified at graph construction time) can be
     specified by passing in a constant node for input M.
  2) Loop termination condition. This is an input to the op that determines
     whether to run the first iteration and also a loop-carried dependency for
     the body graph. The body graph must yield a value for the condition variable,
     whether this input is provided or not.
  
  This table summarizes the operating modes of this operator with equivalent
  C-style code:
  
      Operator inputs defined as (max_trip_count, condition_var).
  
      input ("", ""):
          for (int i=0; ; ++i) {
            cond = ... // Note this value is ignored, but is required in the body
          }
  
      input ("", cond) // Note this is analogous to a while loop
          bool cond = ...;
          for (int i=0; cond; ++i) {
            cond = ...;
          }
  
      input ("", 1) // Note this is analogous to a do-while loop
          bool cond = true;
          for (int i=0; cond; ++i) {
            cond = ...;
          }
  
      input (trip_count, "") // Note this is analogous to a for loop
          int trip_count = ...;
          for (int i=0; i < trip_count; ++i) {
            cond = ...; // ignored
          }
  
      input (trip_count, cond)
          int trip_count = ...;
          bool cond = ...;
          for (int i=0; i < trip_count && cond; ++i) {
            cond = ...;
          }
  
  
  *Sample usage - cond as well as trip count*
  
      graph predict-net {
        %a = Constant[value = <Scalar Tensor [3]>]()
        %b = Constant[value = <Scalar Tensor [6]>]()
        %keepgoing = Constant[value = <Scalar Tensor [1]>]()
        %max_trip_count = Constant[value = <Scalar Tensor [10]>]()
        %keepgoing_out, %b_out, %user_defined_vals = Loop[body = <graph body-net>](%max_trip_count, %keepgoing, %b)
        return
      }
  
      graph body-net (
        %i[INT32, scalar]
        %keepgoing[BOOL, scalar]
        %b[INT32, scalar]
      ) {
        %my_local = Add(%a, %b)
        %b_out = Sub(%a, %b)
        %keepgoing_out = Greater(%my_local, %b_out)
        %user_defined_vals = Add(%b, %b)
        return %keepgoing_out, %b_out, %user_defined_vals
      }
  
  *Sample equivalent C code*
  
      {
        /* User-defined code (enclosing scope) */
        int a = 3, b = 6;
        bool keepgoing = true; // Analogous to input cond
        /* End user-defined code */
  
        /* Implicitly-defined code */
        const int max_trip_count = 10; // Analogous to input M
        int user_defined_vals[]; // Imagine this is resizable
        /* End implicitly-defined code */
        for (int i=0; i < max_trip_count && keepgoing; ++i) {
          /* User-defined code (loop body) */
          int my_local = a + b; // Reading values in the enclosing scope is fine
          b = a - b; // writes fine if we specify b as a loop-carried dependency
          keepgoing = my_local > b; // keepgoing is a loop-carried dependency
          user_defined_vals[i] = b + b;
          /* End user-defined code */
        }
        // my_local = 123; // Can't do this. my_local was defined in the body
  
        // These below values are live-out from the loop and therefore accessible
        b_out; user_defined_vals; keepgoing_out;
      }
  
  There are several things of note in this code snippet:
  
  1) Values from the enclosing scope (i.e. variable a here) are in scope and can
     be referenced in the inputs of the loop.
  2) Any variables which you wish to make available in the enclosing scope (i.e.
     the variables b and keepgoing) must be declared as either loop-carried
     dependencies (both at the op inputs and output and at the body net input and
     output) or scan_outputs.
  3) Values created in the body cannot be accessed in the enclosing scope.
  
  Note that the semantics of this op support "diagonal" or "wavefront" execution.
  (See Step 3 here for an example:
  https://devblogs.nvidia.com/optimizing-recurrent-neural-networks-cudnn-5/).
  Frontends should emit multi-layer RNNs as a series of Loop operators (with
  time being the inner looping dimension), with each successive layer consuming
  the scan_outputs from the previous layer, possibly going through several
  point-wise operators (e.g. dropout, residual connections, linear layer).
  Concretely, the (possibly transformed) scan_outputs are referenced by the
  subsequent layer as a LoopIndexTensor operating on a value in scope, not
  necessarily a loop-carried dependency. Backends can recognize this pattern and
  are permitted to schedule the execution of the multi-layer network in a
  pipelined/"wavefront" fashion.
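
The operating modes and the sample graph can be mimicked with a small Python driver. This is only a sketch of the semantics described in this section, not a reference implementation: `run_loop` is a hypothetical name, `None` stands in for an omitted ("") input, and the body follows the `body-net` graph (so `user_defined_vals` is computed from the incoming `b`).

```python
def run_loop(body, max_trip_count, cond, v_initial):
    """body(i, cond, *carried) -> (cond_out, carried_out..., scan_values...)."""
    carried = list(v_initial)
    n = len(carried)
    scan_outputs = None              # created on the first iteration
    keep_going = True if cond is None else cond
    i = 0
    while (max_trip_count is None or i < max_trip_count) and keep_going:
        results = body(i, keep_going, *carried)
        cond_out = results[0]
        carried = list(results[1:1 + n])
        scans = results[1 + n:]
        if scan_outputs is None:
            scan_outputs = [[] for _ in scans]
        for acc, value in zip(scan_outputs, scans):
            acc.append(value)        # one entry per iteration
        if cond is not None:         # the body's condition is ignored when cond is omitted
            keep_going = cond_out
        i += 1
    # Final loop-carried values, then scan outputs (none if the body never ran).
    return carried + (scan_outputs or [])

a = 3  # value from the enclosing scope, readable inside the body

def body(i, keepgoing, b):
    my_local = a + b
    b_out = a - b
    keepgoing_out = my_local > b_out
    user_defined_vals = b + b
    return keepgoing_out, b_out, user_defined_vals

b_final, user_defined_vals = run_loop(body, 10, True, [6])
```

With these inputs the condition becomes false in the second iteration, so the loop stops well before the maximum trip count of 10.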
  

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>body</tt> : graph (required)</dt>
<dd>The graph run each iteration. It has 2+N inputs: (iteration_num, condition, loop carried dependencies...). It has 1+N+K outputs: (condition, loop carried dependencies..., scan_outputs...). Each scan_output is created by concatenating the value of the specified output value at the end of each iteration of the loop. It is an error if the dimensions of these values change across loop iterations.</dd>
</dl>

#### Inputs (3 - &#8734;)

<dl>
<dt><tt>M</tt> : I</dt>
<dd>A maximum trip-count for the loop specified at runtime. Optional. Pass empty string to skip.</dd>
<dt><tt>cond</tt> : B</dt>
<dd>A boolean termination condition. Pass empty string to skip.</dd>
<dt><tt>v_initial</tt> (variadic) : V</dt>
<dd>The initial values of any loop-carried dependencies (values that change across loop iterations)</dd>
</dl>

#### Outputs (1 - &#8734;)

<dl>
<dt><tt>v_final_and_scan_outputs</tt> (variadic) : V</dt>
<dd>Final N loop carried dependency values then K scan_outputs</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>V</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool)</dt>
<dd>All Tensor types</dd>
<dt><tt>I</tt> : int64</dt>
<dd>Only int64</dd>
<dt><tt>B</tt> : bool</dt>
<dd>Only bool</dd>
</dl>

### <a name="LoopIndexTensor-1"></a>**LoopIndexTensor-1**</a>

  This is a special operator only valid inside the loop that supports the common case behavior of accessing the correct element of the input sequence in an RNN. This operator MUST be directly given the passed-in iteration number to the body of a Loop graph. This signals to back-ends that this is a direct indexing operation, with no transforms applied to the index.
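
The indexing behavior can be illustrated on a nested-list tensor. The name `loop_index_tensor` is hypothetical, and the default `axis=0` is an assumption of this sketch, since the attribute description does not state a default.

```python
def loop_index_tensor(t, loop_idx, axis=0):
    """Select index `loop_idx` along `axis` of a nested-list tensor, dropping that axis."""
    if axis == 0:
        return t[loop_idx]
    return [loop_index_tensor(sub, loop_idx, axis - 1) for sub in t]
```

Inside a Loop body, `loop_idx` would be the iteration number passed in by the loop, so each iteration reads one slice of the input sequence.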

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>Axis on which to index</dd>
</dl>

#### Inputs

<dl>
<dt><tt>T</tt> : T</dt>
<dd>Tensor to be indexed (has N dimensions)</dd>
<dt><tt>loop_idx</tt> : I</dt>
<dd>Loop index provided as input to the body graph</dd>
</dl>

#### Outputs

<dl>
<dt><tt>O</tt> : T</dt>
<dd>Tensor of N - 1 dims that is a sub tensor of T</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool)</dt>
<dd>All Tensor types</dd>
<dt><tt>I</tt> : int32</dt>
<dd>Indices</dd>
</dl>

### <a name="LpNormalization-1"></a>**LpNormalization-1**</a>

  Given a matrix, apply Lp-normalization along the provided axis.
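
A minimal sketch for a 2-D matrix given as a list of rows, assuming the usual definition of Lp-normalization (each vector along the axis is divided by its Lp norm). The name `lp_normalize` is hypothetical, and the handling of zero-norm vectors is not specified here, so the sketch simply lets the division fail.

```python
import math

def lp_normalize(matrix, axis=-1, p=2):
    if p not in (1, 2):
        raise ValueError("only p = 1 or p = 2 is supported")

    def norm(vec):
        if p == 1:
            return sum(abs(v) for v in vec)
        return math.sqrt(sum(v * v for v in vec))

    if axis in (-1, 1):    # normalize each row
        return [[v / norm(row) for v in row] for row in matrix]
    if axis in (0, -2):    # normalize each column
        col_norms = [norm(col) for col in zip(*matrix)]
        return [[v / col_norms[j] for j, v in enumerate(row)] for row in matrix]
    raise ValueError("axis out of range for a 2-D matrix")
```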

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>(int64, default -1) the axis on which to apply normalization; -1 means the last axis.</dd>
<dt><tt>p</tt> : int</dt>
<dd>(int64, default 2) the order of the normalization, only 1 or 2 are supported.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input matrix</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Matrix after normalization</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="LpPool-1"></a>**LpPool-1**</a>

  LpPool consumes an input tensor X and applies Lp pooling across
   the tensor according to kernel sizes, stride sizes, and pad lengths.
   Lp pooling consists of computing the Lp norm on all values of a subset
   of the input tensor according to the kernel size and downsampling the
   data into the output tensor Y for further processing.
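
To make the windowed Lp norm concrete, here is a sketch restricted to one spatial axis with no padding. The name `lp_pool_1d` is hypothetical, and the floor-based output length is an assumption borrowed from the explicit-padding formula given for MaxPool.

```python
def lp_pool_1d(x, kernel, stride=1, p=2.0):
    """Lp-pool a 1-D list: each output is (sum |x_i|^p)^(1/p) over one window."""
    out_len = (len(x) - kernel) // stride + 1
    out = []
    for s in range(out_len):
        window = x[s * stride:s * stride + kernel]
        out.append(sum(abs(v) ** p for v in window) ** (1.0 / p))
    return out
```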

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>auto_pad</tt> : string</dt>
<dd>auto_pad must be either SAME_UPPER, SAME_LOWER or VALID. SAME_UPPER and SAME_LOWER mean the input is padded so that the output size matches the input. If the total padding is odd, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER. VALID means no padding. DEPRECATION NOTE: auto_pad is only intended to support legacy uses; framework authors are explicitly encouraged to use explicit padding specified in the pads attribute.</dd>
<dt><tt>kernel_shape</tt> : list of ints</dt>
<dd>The size of the kernel along each axis.</dd>
<dt><tt>p</tt> : float</dt>
<dd>p value of the Lp norm used to pool over the input data, default is 2.0.</dd>
<dt><tt>pads</tt> : list of ints</dt>
<dd>Padding for the beginning and ending along each axis; it can take any value greater than or equal to 0. Each value represents the number of pixels added to the beginning or end of the corresponding axis. The `pads` format should be as follows: [x1_begin, x2_begin, ..., x1_end, x2_end, ...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`. This attribute cannot be used simultaneously with the auto_pad attribute.</dd>
<dt><tt>strides</tt> : list of ints</dt>
<dd>Stride along each axis.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output data tensor from Lp pooling across the input tensor. Dimensions will vary based on various kernel, stride, and pad sizes.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="MatMul-1"></a>**MatMul-1**</a>

  Matrix product that behaves like numpy.matmul: https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.matmul.html
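
A sketch of the most common cases on nested lists: a plain 2-D product, and stacks of matrices with identical leading (batch) dimensions, which are multiplied pairwise. The names `matmul`, `matmul2d`, and `depth` are hypothetical; numpy.matmul's broadcasting of batch dimensions and its promotion of 1-D arguments are deliberately not covered.

```python
def matmul2d(a, b):
    # (m x k) times (k x n): dot every row of a with every column of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def depth(t):
    # Number of nesting levels, i.e. the rank of a nested-list tensor.
    d = 0
    while isinstance(t, list):
        t, d = t[0], d + 1
    return d

def matmul(a, b):
    da, db = depth(a), depth(b)
    if da != db or da < 2:
        raise ValueError("this sketch only handles equal ranks >= 2")
    if da == 2:
        return matmul2d(a, b)
    # Leading dimensions index a stack of matrices; multiply them pairwise.
    return [matmul(x, y) for x, y in zip(a, b)]
```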

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>N-dimensional matrix A</dd>
<dt><tt>B</tt> : T</dt>
<dd>N-dimensional matrix B</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Matrix multiply results from A * B</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Max-1"></a>**Max-1**</a>

  Element-wise max of each of the input tensors. All inputs and outputs must
  have the same shape and data type.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>data_0</tt> (variadic) : T</dt>
<dd>List of tensors for Max.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>max</tt> : T</dt>
<dd>Output tensor. Same dimension as inputs.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="MaxPool-1"></a>**MaxPool-1**</a>

  MaxPool consumes an input tensor X and applies max pooling across
   the tensor according to kernel sizes, stride sizes, and pad lengths.
   Max pooling consists of computing the max on all values of a
   subset of the input tensor according to the kernel size and downsampling the
   data into the output tensor Y for further processing. The output spatial shape is computed as follows:
   ```
   output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - kernel_spatial_shape[i]) / strides_spatial_shape[i] + 1)
  
   * pad_shape[i] is sum of pads along axis i
   ```
  
   `auto_pad` is a DEPRECATED attribute. If you are using it currently, the output spatial shape is computed as follows:
   ```
   VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - kernel_spatial_shape[i] + 1) / strides_spatial_shape[i])
   SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])
   ```
   For `SAME_UPPER` or `SAME_LOWER`, the pad shape is computed as follows:
   ```
   pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + kernel_spatial_shape[i] - input_spatial_shape[i]
   ```
   The output of each pooling window is the maximum of its elements, excluding padding.
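
The shape formulas for one spatial axis can be written out as small helpers. The function names are hypothetical. Splitting the total padding into begin and end parts follows the `auto_pad` attribute description (extra padding at the end for SAME_UPPER, at the beginning for SAME_LOWER); clamping a negative total to 0 is an assumption of this sketch.

```python
import math

def pool_output_size(in_size, kernel, stride=1, pad_begin=0, pad_end=0):
    # floor((input + pad_shape - kernel) / stride + 1), with pad_shape = sum of pads
    pad_shape = pad_begin + pad_end
    return (in_size + pad_shape - kernel) // stride + 1

def auto_pad_output(in_size, kernel, stride, auto_pad):
    """Return (output_size, pad_begin, pad_end) for one axis under the legacy auto_pad modes."""
    if auto_pad == "VALID":
        return math.ceil((in_size - kernel + 1) / stride), 0, 0
    out = math.ceil(in_size / stride)
    pad_shape = max((out - 1) * stride + kernel - in_size, 0)  # clamp: assumption
    small, large = pad_shape // 2, pad_shape - pad_shape // 2
    if auto_pad == "SAME_UPPER":
        return out, small, large   # extra padding goes at the end
    if auto_pad == "SAME_LOWER":
        return out, large, small   # extra padding goes at the beginning
    raise ValueError("auto_pad must be SAME_UPPER, SAME_LOWER or VALID")
```

For an axis of size 32 with kernel 3 and stride 2, SAME_UPPER yields pads (0, 1), and feeding those pads into `pool_output_size` gives the same output size of 16.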
   

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>auto_pad</tt> : string</dt>
<dd>auto_pad must be either SAME_UPPER, SAME_LOWER or VALID. SAME_UPPER and SAME_LOWER mean the input is padded so that the output size matches the input. If the total padding is odd, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER. VALID means no padding. DEPRECATION NOTE: auto_pad is only intended to support legacy uses; framework authors are explicitly encouraged to use explicit padding specified in the pads attribute.</dd>
<dt><tt>kernel_shape</tt> : list of ints (required)</dt>
<dd>The size of the kernel along each axis.</dd>
<dt><tt>pads</tt> : list of ints</dt>
<dd>Padding for the beginning and ending along each axis; it can take any value greater than or equal to 0. Each value represents the number of pixels added to the beginning or end of the corresponding axis. The `pads` format should be as follows: [x1_begin, x2_begin, ..., x1_end, x2_end, ...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 along the start and end of each axis.</dd>
<dt><tt>strides</tt> : list of ints</dt>
<dd>Stride along each axis. If not present, the stride defaults to 1 along each axis.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size. Optionally, if dimension denotation is in effect, the operation expects the input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE ...].</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output data tensor from max pooling across the input tensor. Dimensions will vary based on various kernel, stride, and pad sizes. The floor value of the dimension is used.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="MaxRoiPool-1"></a>**MaxRoiPool-1**</a>

  ROI max pool consumes an input tensor X and regions of interest (RoIs) to
   apply max pooling across each RoI, to produce output 4-D tensor of shape
   (num_rois, channels, pooled_shape[0], pooled_shape[1]).

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>pooled_shape</tt> : list of ints (required)</dt>
<dd>ROI pool output shape (height, width).</dd>
<dt><tt>spatial_scale</tt> : float</dt>
<dd>Multiplicative spatial scale factor to translate ROI coordinates from their input scale to the scale used when pooling, default is 1.0f.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data.</dd>
<dt><tt>rois</tt> : T</dt>
<dd>RoIs (Regions of Interest) to pool over. Should be a 2-D tensor of shape (num_rois, 5) given as [[batch_id, x1, y1, x2, y2], ...].</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>RoI pooled output 4-D tensor of shape (num_rois, channels, pooled_shape[0], pooled_shape[1]).</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Mean-1"></a>**Mean-1**</a>

  Element-wise mean of each of the input tensors. All inputs and outputs must
  have the same shape and data type.
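
A minimal sketch for any number of equally sized inputs, each given as a flat list. The name `elementwise_mean` is hypothetical.

```python
def elementwise_mean(*tensors):
    # All inputs must have the same shape; here that means the same length.
    if len({len(t) for t in tensors}) != 1:
        raise ValueError("all inputs must have the same shape")
    n = len(tensors)
    return [sum(values) / n for values in zip(*tensors)]
```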

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>data_0</tt> (variadic) : T</dt>
<dd>List of tensors for Mean.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>mean</tt> : T</dt>
<dd>Output tensor. Same dimension as inputs.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="MeanVarianceNormalization-1"></a>**MeanVarianceNormalization-1**</a>

  Perform mean variance normalization.
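
One plausible reading of the two attributes, sketched on a nested list of shape [N][C][H][W]. Several details are assumptions not stated in this section: statistics are computed per sample, the variance is the population variance, no epsilon is added to the standard deviation, and with `across_channels=0` each channel is normalized separately. The name `mvn` is hypothetical.

```python
import math

def mvn(x, across_channels=0, normalize_variance=1):
    out = []
    for sample in x:  # sample has shape [C][H][W]
        # Channels that share one mean/variance: all of them, or one at a time.
        groups = [sample] if across_channels else [[ch] for ch in sample]
        new_sample = []
        for group in groups:
            values = [v for ch in group for row in ch for v in row]
            mean = sum(values) / len(values)
            std = 1.0
            if normalize_variance:
                std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
            new_sample.extend(
                [[(v - mean) / std for v in row] for row in ch] for ch in group
            )
        out.append(new_sample)
    return out
```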

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>across_channels</tt> : int</dt>
<dd>If 1, mean and variance are computed across channels. Default is 0.</dd>
<dt><tt>normalize_variance</tt> : int</dt>
<dd>If 0, normalize the mean only. Default is 1.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor of shape [N,C,H,W]</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Result, has same shape and type as input</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Min-1"></a>**Min-1**</a>

  Element-wise min of each of the input tensors. All inputs and outputs must
  have the same shape and data type.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>data_0</tt> (variadic) : T</dt>
<dd>List of tensors for Min.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>min</tt> : T</dt>
<dd>Output tensor. Same dimension as inputs.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Mul-1"></a>**Mul-1**</a>

  Performs element-wise binary multiplication (with limited broadcast support).
  
  If necessary the right-hand-side argument will be broadcasted to match the
  shape of left-hand-side argument. When broadcasting is specified, the second
  tensor can either be of element size 1 (including a scalar tensor and any
  tensor with rank equal to or smaller than the first tensor), or having its
  shape as a contiguous subset of the first tensor's shape. The start of the
  mutually equal shape is specified by the argument "axis", and if it is not set,
  suffix matching is assumed. 1-dim expansion doesn't work yet.
  
  For example, the following tensor shapes are supported (with broadcast=1):
  
    shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar tensor
    shape(A) = (2, 3, 4, 5), shape(B) = (1, 1), i.e. B is a 1-element tensor
    shape(A) = (2, 3, 4, 5), shape(B) = (5,)
    shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
    shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
    shape(A) = (2, 3, 4, 5), shape(B) = (2), with axis=0
  
  Attribute `broadcast=1` needs to be passed to enable broadcasting.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>If set, defines the broadcast dimensions. See doc for details.</dd>
<dt><tt>broadcast</tt> : int</dt>
<dd>Pass 1 to enable broadcasting</dd>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>First operand, should share the type with the second operand.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Second operand. With broadcasting can be of smaller size than A. If broadcasting is disabled it should be of the same size.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T</dt>
<dd>Result, has same dimensions and type as A</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>
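
The limited broadcast rule described for Mul-1 can be checked on shapes alone. The sketch below is one possible reading of that rule, not the ONNX implementation; the function name `can_broadcast` is hypothetical, and shapes are written as Python tuples (so the scalar shape is `()`).

```python
from math import prod

def can_broadcast(shape_a, shape_b, axis=None):
    """Check whether B's shape is accepted next to A's under the limited rule."""
    if prod(shape_b) == 1:
        # Single-element B (scalar included) with rank no larger than A's.
        return len(shape_b) <= len(shape_a)
    if axis is None:
        # Suffix matching: align B with the trailing dimensions of A.
        axis = len(shape_a) - len(shape_b)
    if axis < 0:
        return False
    # B must equal a contiguous run of A's dimensions starting at `axis`.
    return tuple(shape_a[axis:axis + len(shape_b)]) == tuple(shape_b)

A = (2, 3, 4, 5)
print(can_broadcast(A, ()))              # True, scalar
print(can_broadcast(A, (1, 1)))          # True, one element
print(can_broadcast(A, (4, 5)))          # True, suffix match
print(can_broadcast(A, (3, 4), axis=1))  # True
print(can_broadcast(A, (3, 4)))          # False, needs axis=1
print(can_broadcast(A, (1, 5)))          # False, no 1-dim expansion
```

The last call shows the "1-dim expansion doesn't work yet" restriction: a shape such as `(1, 5)` is rejected because it is not a literal contiguous subset of `(2, 3, 4, 5)`.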

### <a name="Neg-1"></a>**Neg-1**</a>

  Neg takes one input tensor (Tensor<T>) and produces one output tensor
  (Tensor<T>) in which the sign of each element is flipped; y = -x is applied
  to the tensor elementwise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Not-1"></a>**Not-1**</a>

  Returns the negation of the input tensor element-wise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(bool)</dt>
<dd>Constrains input/output to boolean tensors.</dd>
</dl>

### <a name="Or-1"></a>**Or-1**</a>

  Returns the tensor resulting from performing the `or` logical operation
  elementwise on the input tensors `A` and `B`.
  
  If broadcasting is enabled, the right-hand-side argument will be broadcast
  to match the shape of the left-hand-side argument. See the doc of `Add` for a
  detailed description of the broadcasting rules.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>If set, defines the broadcast dimensions.</dd>
<dt><tt>broadcast</tt> : int</dt>
<dd>Enable broadcasting</dd>
</dl>

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>Left input tensor for the logical operator.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Right input tensor for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(bool)</dt>
<dd>Constrains input to boolean tensor.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrains output to boolean tensor.</dd>
</dl>

### <a name="PRelu-1"></a>**PRelu-1**</a>

  PRelu takes input data (Tensor<T>) and a slope tensor as input, and produces one
  output data (Tensor<T>) where the function `f(x) = slope * x for x < 0`,
  `f(x) = x for x >= 0`, is applied to the data tensor elementwise.
  

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
<dt><tt>slope</tt> : T</dt>
<dd>Slope tensor. If `slope` is of size 1, the value is shared across different channels.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>
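
A minimal sketch of the PRelu function on a flat list of values follows. The helper `prelu` is hypothetical; it assumes `slope` either has one element (shared by every input value) or has the same length as `x`, which is a simplification of how a runtime would map slopes to channels.

```python
def prelu(x, slope):
    """Apply f(x) = slope * x for x < 0, f(x) = x for x >= 0 to a flat list."""
    if len(slope) == 1:
        # A single slope value is shared by every element.
        slope = slope * len(x)
    return [s * v if v < 0 else v for v, s in zip(x, slope)]

shared = prelu([-2.0, -0.5, 0.0, 3.0], [0.25])
per_element = prelu([-2.0, 4.0, -1.0], [0.5, 0.5, 2.0])
print(shared)       # [-0.5, -0.125, 0.0, 3.0]
print(per_element)  # [-1.0, 4.0, -2.0]
```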

### <a name="Pad-1"></a>**Pad-1**</a>

  Pads the `data` tensor according to the given `paddings`, `mode`, and `value`.
  
  Example:
    Insert zero padding at the beginning of the second dimension.
  
    data = [
        [1.0, 1.2],
        [2.3, 3.4],
        [4.5, 5.7],
    ]
    paddings = [0, 2, 0, 0]
  
    output = [
        [0.0, 0.0, 1.0, 1.2],
        [0.0, 0.0, 2.3, 3.4],
        [0.0, 0.0, 4.5, 5.7],
    ]

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>mode</tt> : string</dt>
<dd>Three modes: constant (default), reflect, edge</dd>
<dt><tt>paddings</tt> : list of ints (required)</dt>
<dd>List of integers indicating the number of padding elements to add at the beginning and end of each axis; for 2D input this is the number of pixels. The length of `paddings` should be double the input's rank. The format is [x1_begin, x2_begin, ..., x1_end, x2_end, ...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`.</dd>
<dt><tt>value</tt> : float</dt>
<dd>One float, indicates the value to be filled, default is 0</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>Input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Tensor after padding.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>
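
The following sketch shows constant-mode padding of a 2-D input using nested Python lists. It reads `paddings` in the layout given in the attribute description, `[x1_begin, x2_begin, x1_end, x2_end]`; that reading, and the helper name `pad_2d_constant`, are assumptions of this example, and the `reflect` and `edge` modes are not covered.

```python
def pad_2d_constant(data, paddings, value=0.0):
    """Constant-mode padding of a 2-D nested list.

    `paddings` is read as [x1_begin, x2_begin, x1_end, x2_end].
    """
    row_begin, col_begin, row_end, col_end = paddings
    width = col_begin + len(data[0]) + col_end
    # Rows added before the data along the first axis.
    padded = [[value] * width for _ in range(row_begin)]
    for row in data:
        # Pad each existing row along the second axis.
        padded.append([value] * col_begin + list(row) + [value] * col_end)
    # Rows added after the data along the first axis.
    padded.extend([value] * width for _ in range(row_end))
    return padded

data = [[1.0, 1.2], [2.3, 3.4], [4.5, 5.7]]
output = pad_2d_constant(data, [0, 2, 0, 0])
for row in output:
    print(row)
```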

### <a name="ParametricSoftplus-1"></a>**ParametricSoftplus-1**</a>

  ParametricSoftplus takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the softplus function, y = alpha * ln(exp(beta * x) + 1), is applied to
  the tensor elementwise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>alpha</tt> : float</dt>
<dd>Value of alpha</dd>
<dt><tt>beta</tt> : float</dt>
<dd>Value of beta</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>1D input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>1D output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>
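
The formula y = alpha * ln(exp(beta * x) + 1) can be evaluated directly, as in the sketch below. The function name `parametric_softplus` is hypothetical, and because no default values for `alpha` and `beta` are stated here, both are passed explicitly.

```python
import math

def parametric_softplus(x, alpha, beta):
    """Apply y = alpha * ln(exp(beta * x) + 1) to each element of a flat list."""
    # math.exp overflows for very large beta * v; acceptable for a small illustration.
    return [alpha * math.log(math.exp(beta * v) + 1.0) for v in x]

y = parametric_softplus([-1.0, 0.0, 2.0], alpha=2.0, beta=0.5)
print(y)
```

At x = 0 the result is alpha * ln(2), independent of beta, which is a quick sanity check for any implementation.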

### <a name="Pow-1"></a>**Pow-1**</a>

  Pow takes input data (Tensor<T>) and exponent Tensor, and
  produces one output data (Tensor<T>) where the function `f(x) = x^exponent`,
  is applied to the data tensor elementwise.
  
  If necessary, the right-hand-side argument is broadcast to match the shape
  of the left-hand-side argument. When broadcasting is enabled, the second
  tensor must either have exactly one element (a scalar tensor, or any
  one-element tensor with rank equal to or smaller than the first tensor), or
  have a shape that is a contiguous subset of the first tensor's shape. The
  position where the matching shape starts is given by the argument "axis"; if
  it is not set, suffix matching is assumed. 1-dim expansion doesn't work yet.
  
  For example, the following tensor shapes are supported (with broadcast=1):
  
    shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar tensor
    shape(A) = (2, 3, 4, 5), shape(B) = (1, 1), i.e. B is a 1-element tensor
    shape(A) = (2, 3, 4, 5), shape(B) = (5,)
    shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
    shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
    shape(A) = (2, 3, 4, 5), shape(B) = (2), with axis=0
  
  Attribute `broadcast=1` needs to be passed to enable broadcasting.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>If set, defines the broadcast dimensions. See doc for details.</dd>
<dt><tt>broadcast</tt> : int</dt>
<dd>Pass 1 to enable broadcasting</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor of any shape, base of the exponent.</dd>
<dt><tt>Y</tt> : T</dt>
<dd>Input tensor of any shape broadcastable to X shape, the exponent component.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Z</tt> : T</dt>
<dd>Output tensor (same size as X)</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="RNN-1"></a>**RNN-1**</a>

  Computes a one-layer simple RNN. This operator is usually supported
  via some custom implementation such as CuDNN.
  
  Notations:
  
  `X` - input tensor
  
  `i` - input gate
  
  `t` - time step (t-1 means previous time step)
  
  `Wi` - W parameter weight matrix for input gate
  
  `Ri` - R recurrence weight matrix for input gate
  
  `Wbi` - W parameter bias vector for input gate
  
  `Rbi` - R parameter bias vector for input gate
  
  `WBi` - W parameter weight matrix for backward input gate
  
  `RBi` - R recurrence weight matrix for backward input gate
  
  `WBbi` - W parameter bias vector for backward input gate
  
  `RBbi` - R parameter bias vector for backward input gate
  
  `H` - Hidden state
  
  `num_directions` - 2 if direction == bidirectional else 1
  
  Activation functions:
  
    Relu(x)                - max(0, x)
  
    Tanh(x)                - (1 - e^{-2x})/(1 + e^{-2x})
  
    Sigmoid(x)             - 1/(1 + e^{-x})
  
    (NOTE: Below are optional)
  
    Affine(x)              - alpha*x + beta
  
    LeakyRelu(x)           - x if x >= 0 else alpha * x
  
    ThresholdedRelu(x)     - x if x >= alpha else 0
  
    ScaledTanh(x)          - alpha*Tanh(beta*x)
  
    HardSigmoid(x)         - min(max(alpha*x + beta, 0), 1)
  
    Elu(x)                 - x if x >= 0 else alpha*(e^x - 1)
  
    Softsign(x)            - x/(1 + |x|)
  
    Softplus(x)            - log(1 + e^x)
  
  Equations (Default: f=Tanh):
  
    - Ht = f(Xt*(Wi^T) + Ht-1*Ri + Wbi + Rbi)

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>activation_alpha</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM. Default values are the same as those of the corresponding ONNX operators. For example, with LeakyRelu, the default alpha is 0.01.</dd>
<dt><tt>activation_beta</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM. Default values are the same as those of the corresponding ONNX operators.</dd>
<dt><tt>activations</tt> : list of strings</dt>
<dd>One (or two if bidirectional) activation function for input gate. The activation function must be one of the activation functions specified above. Optional: Default `Tanh` if not specified.</dd>
<dt><tt>clip</tt> : float</dt>
<dd>Cell clip threshold. Clipping bounds the elements of a tensor in the range of [-threshold, +threshold] and is applied to the input of activations. No clip if not specified.</dd>
<dt><tt>direction</tt> : string</dt>
<dd>Specify if the RNN is forward, reverse, or bidirectional. Must be one of forward (default), reverse, or bidirectional.</dd>
<dt><tt>hidden_size</tt> : int</dt>
<dd>Number of neurons in the hidden layer</dd>
<dt><tt>output_sequence</tt> : int</dt>
<dd>If 0, the sequence output for the hidden state is optional. Default is 0.</dd>
</dl>

#### Inputs (3 - 6)

<dl>
<dt><tt>X</tt> : T</dt>
<dd>The input sequences packed (and potentially padded) into one 3-D tensor with the shape of `[seq_length, batch_size, input_size]`.</dd>
<dt><tt>W</tt> : T</dt>
<dd>The weight tensor for input gate. Concatenation of `Wi` and `WBi` (if bidirectional). The tensor has shape `[num_directions, hidden_size, input_size]`.</dd>
<dt><tt>R</tt> : T</dt>
<dd>The recurrence weight tensor. Concatenation of `Ri` and `RBi` (if bidirectional). The tensor has shape `[num_directions, hidden_size, hidden_size]`.</dd>
<dt><tt>B</tt> (optional) : T</dt>
<dd>The bias tensor for input gate. Concatenation of `[Wbi, Rbi]` and `[WBbi, RBbi]` (if bidirectional). The tensor has shape `[num_directions, 2*hidden_size]`. Optional: If not specified - assumed to be 0.</dd>
<dt><tt>sequence_lens</tt> (optional) : T1</dt>
<dd>Optional tensor specifying lengths of the sequences in a batch. If not specified, all sequences in the batch are assumed to have length `seq_length`. It has shape `[batch_size]`.</dd>
<dt><tt>initial_h</tt> (optional) : T</dt>
<dd>Optional initial value of the hidden state. If not specified, it is assumed to be 0. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
</dl>

#### Outputs (0 - 2)

<dl>
<dt><tt>Y</tt> (optional) : T</dt>
<dd>A tensor that concatenates all the intermediate output values of the hidden state. It has shape `[seq_length, num_directions, batch_size, hidden_size]`. It is optional if `output_sequence` is 0.</dd>
<dt><tt>Y_h</tt> (optional) : T</dt>
<dd>The last output value of the hidden state. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
<dt><tt>T1</tt> : tensor(int32)</dt>
<dd>Constrain seq_lens to integer tensor.</dd>
</dl>
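
The recurrence `Ht = f(Xt*(Wi^T) + Ht-1*Ri + Wbi + Rbi)` can be traced with a small forward-only sketch. It follows the equation exactly as written in this section, uses nested Python lists in place of tensors, and omits the `num_directions` axis, clipping, and `sequence_lens`; the helper names `matmul`, `transpose`, and `rnn_forward` are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def rnn_forward(X, Wi, Ri, Wbi, Rbi, h0, f=math.tanh):
    """Run Ht = f(Xt*(Wi^T) + Ht-1*Ri + Wbi + Rbi) over every time step.

    X: [seq_length][batch_size][input_size], Wi: [hidden_size][input_size],
    Ri: [hidden_size][hidden_size], Wbi/Rbi: [hidden_size],
    h0: [batch_size][hidden_size].
    """
    h = h0
    outputs = []
    wi_t = transpose(Wi)
    for x_t in X:
        xw = matmul(x_t, wi_t)   # [batch_size][hidden_size]
        hr = matmul(h, Ri)       # [batch_size][hidden_size]
        h = [
            [f(a + b + wb + rb) for a, b, wb, rb in zip(xw_row, hr_row, Wbi, Rbi)]
            for xw_row, hr_row in zip(xw, hr)
        ]
        outputs.append(h)
    return outputs, h            # all hidden states, last hidden state

# seq_length=2, batch_size=1, input_size=2, hidden_size=1
X = [[[1.0, 2.0]], [[0.5, -1.0]]]
Y, Y_h = rnn_forward(X, Wi=[[0.5, 0.25]], Ri=[[0.1]], Wbi=[0.0], Rbi=[0.0], h0=[[0.0]])
print(Y, Y_h)
```

In this tiny case the first hidden value is tanh(1.0) and the second is tanh(0.1 * tanh(1.0)), so the effect of the recurrence weight is easy to verify by hand.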

### <a name="RandomNormal-1"></a>**RandomNormal-1**</a>

  Generate a tensor with random values drawn from a normal distribution. The shape
  of the tensor is specified by the `shape` argument and the parameters of the normal distribution
  are specified by `mean` and `scale`.
  
  The data type is specified by the 'dtype' argument. The 'dtype' argument must
  be one of the data types specified in the 'DataType' enum field in the
  TensorProto message.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>dtype</tt> : int</dt>
<dd>The data type for the elements of the output tensor. Default is TensorProto::FLOAT.</dd>
<dt><tt>mean</tt> : float</dt>
<dd>The mean of the normal distribution. If not specified, default is 0.</dd>
<dt><tt>scale</tt> : float</dt>
<dd>The standard deviation of the normal distribution. If not specified, default is 1.</dd>
<dt><tt>seed</tt> : float</dt>
<dd>(Optional) Seed for the random generator; if not specified, one is generated automatically.</dd>
<dt><tt>shape</tt> : list of ints (required)</dt>
<dd>The shape of the output tensor.</dd>
</dl>

#### Inputs


#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Output tensor of random values drawn from normal distribution</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain output types to float tensors.</dd>
</dl>
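
To make the roles of `shape`, `mean`, `scale`, and `seed` concrete, here is a sketch built on Python's standard `random` module. It is only an illustration: `random_normal` is a hypothetical helper, the `dtype` attribute is ignored (values are plain Python floats), and the actual numbers produced by an ONNX runtime for a given seed will differ.

```python
import random

def random_normal(shape, mean=0.0, scale=1.0, seed=None):
    """Build a nested list of the given shape filled with normal samples."""
    rng = random.Random(seed)  # seed=None lets Python pick a seed

    def build(dims):
        if not dims:
            return rng.gauss(mean, scale)
        return [build(dims[1:]) for _ in range(dims[0])]

    return build(list(shape))

sample = random_normal([2, 3], mean=0.0, scale=1.0, seed=1.0)
print(len(sample), len(sample[0]))  # 2 3
```

Passing the same seed twice yields the same values, which is the property the `seed` attribute is meant to provide.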

### <a name="RandomNormalLike-1"></a>**RandomNormalLike-1**</a>

  Generate a tensor with random values drawn from a normal distribution. 
  The shape of the output tensor is copied from the shape of the input tensor, 
  and the parameters of the normal distribution are specified by `mean` and `scale`.
  
  The data type is specified by the 'dtype' argument, or copied from the input tensor if not provided. 
  The 'dtype' argument must be one of the data types specified in the 'DataType' enum field in the
  TensorProto message, and be valid as an output type.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>dtype</tt> : int</dt>
<dd>(Optional) The data type for the elements of the output tensor. If not specified, the data type of the input tensor is used.</dd>
<dt><tt>mean</tt> : float</dt>
<dd>The mean of the normal distribution. If not specified, default is 0.</dd>
<dt><tt>scale</tt> : float</dt>
<dd>The standard deviation of the normal distribution. If not specified, default is 1.</dd>
<dt><tt>seed</tt> : float</dt>
<dd>(Optional) Seed for the random generator; if not specified, one is generated automatically.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T1</dt>
<dd>Input tensor to copy shape and optionally type information from.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T2</dt>
<dd>Output tensor of random values drawn from normal distribution</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool)</dt>
<dd>Constrain to any tensor type. If the dtype attribute is not provided this must be a valid output type.</dd>
<dt><tt>T2</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain output types to float tensors.</dd>
</dl>

### <a name="RandomUniform-1"></a>**RandomUniform-1**</a>

  Generate a tensor with random values drawn from a uniform distribution. The shape
  of the tensor is specified by the `shape` argument and the range by `low` and `high`.
  
  The data type is specified by the 'dtype' argument. The 'dtype' argument must
  be one of the data types specified in the 'DataType' enum field in the
  TensorProto message.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>dtype</tt> : int</dt>
<dd>The data type for the elements of the output tensor. If not specified, default is TensorProto::FLOAT.</dd>
<dt><tt>high</tt> : float</dt>
<dd>Upper boundary of the output values. If not specified, default is 1.</dd>
<dt><tt>low</tt> : float</dt>
<dd>Lower boundary of the output values. If not specified, default is 0.</dd>
<dt><tt>seed</tt> : float</dt>
<dd>(Optional) Seed for the random generator; if not specified, one is generated automatically.</dd>
<dt><tt>shape</tt> : list of ints (required)</dt>
<dd>The shape of the output tensor.</dd>
</dl>

#### Inputs


#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Output tensor of random values drawn from uniform distribution</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain output types to float tensors.</dd>
</dl>

### <a name="RandomUniformLike-1"></a>**RandomUniformLike-1**</a>

  Generate a tensor with random values drawn from a uniform distribution. 
  The shape of the output tensor is copied from the shape of the input tensor, 
  and the parameters of the uniform distribution are specified by `low` and `high`.
  
  The data type is specified by the 'dtype' argument, or copied from the input tensor if not provided. 
  The 'dtype' argument must be one of the data types specified in the 'DataType' enum field in the
  TensorProto message and be valid as an output type.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>dtype</tt> : int</dt>
<dd>(Optional) The data type for the elements of the output tensor. If not specified, the data type of the input tensor is used.</dd>
<dt><tt>high</tt> : float</dt>
<dd>Upper boundary of the output values. If not specified, default is 1.</dd>
<dt><tt>low</tt> : float</dt>
<dd>Lower boundary of the output values. If not specified, default is 0.</dd>
<dt><tt>seed</tt> : float</dt>
<dd>(Optional) Seed for the random generator; if not specified, one is generated automatically.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T1</dt>
<dd>Input tensor to copy shape and optionally type information from.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T2</dt>
<dd>Output tensor of random values drawn from uniform distribution</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool)</dt>
<dd>Constrain to any tensor type. If the dtype attribute is not provided this must be a valid output type.</dd>
<dt><tt>T2</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain output types to float tensors.</dd>
</dl>
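
The "Like" variant takes its output shape from an existing tensor rather than from a `shape` attribute. The sketch below mirrors that idea with nested Python lists; `random_uniform_like` is a hypothetical helper, the `dtype` handling is left out, and the generated values are not what any particular runtime would produce.

```python
import random

def random_uniform_like(input_tensor, low=0.0, high=1.0, seed=None):
    """Return a nested list shaped like `input_tensor` with uniform samples."""
    rng = random.Random(seed)

    def fill(node):
        if isinstance(node, list):
            # Keep the structure, replace only the leaves.
            return [fill(child) for child in node]
        return rng.uniform(low, high)

    return fill(input_tensor)

template = [[1, 2, 3], [4, 5, 6]]
output = random_uniform_like(template, low=-1.0, high=1.0, seed=7.0)
print(len(output), len(output[0]))  # 2 3
```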

### <a name="Reciprocal-1"></a>**Reciprocal-1**</a>

  Reciprocal takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the reciprocal, y = 1/x, is applied to
  the tensor elementwise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="ReduceL1-1"></a>**ReduceL1-1**</a>

  Computes the L1 norm of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
  the resulting tensor has the reduced dimensions pruned.
  
  The above behavior is similar to numpy, with the exception that numpy defaults keepdims to
  False instead of True.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axes</tt> : list of ints</dt>
<dd>A list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor.</dd>
<dt><tt>keepdims</tt> : int</dt>
<dd>Whether to keep the reduced dimension. The default, 1, keeps the reduced dimension.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to high-precision numeric tensors.</dd>
</dl>
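
The effect of `keepdims` on the output shape is easiest to see on a small 2-D input. The sketch below reduces along a single axis only and uses nested lists; `reduce_l1_2d` is a hypothetical helper, not the ONNX implementation.

```python
def reduce_l1_2d(data, axis, keepdims=1):
    """L1 norm (sum of absolute values) of a 2-D nested list along one axis."""
    lines = zip(*data) if axis == 0 else data
    norms = [sum(abs(v) for v in line) for line in lines]
    if not keepdims:
        return norms  # reduced dimension pruned: rank 1
    # Reduced dimension kept with size 1: rank stays 2.
    return [norms] if axis == 0 else [[n] for n in norms]

data = [[1.0, -2.0], [-3.0, 4.0]]
print(reduce_l1_2d(data, axis=0))              # [[4.0, 6.0]]
print(reduce_l1_2d(data, axis=1))              # [[3.0], [7.0]]
print(reduce_l1_2d(data, axis=1, keepdims=0))  # [3.0, 7.0]
```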

### <a name="ReduceL2-1"></a>**ReduceL2-1**</a>

  Computes the L2 norm of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
  the resulting tensor has the reduced dimensions pruned.
  
  The above behavior is similar to numpy, with the exception that numpy defaults keepdims to
  False instead of True.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axes</tt> : list of ints</dt>
<dd>A list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor.</dd>
<dt><tt>keepdims</tt> : int</dt>
<dd>Whether to keep the reduced dimension. The default, 1, keeps the reduced dimension.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to high-precision numeric tensors.</dd>
</dl>
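
When `axes` is omitted, the reduction runs over every dimension. The sketch below shows that default for a 2-D input, taking the L2 norm as the square root of the sum of squares; `reduce_l2_all` is a hypothetical helper operating on nested lists.

```python
import math

def reduce_l2_all(data, keepdims=1):
    """L2 norm over all elements of a 2-D nested list."""
    total = math.sqrt(sum(v * v for row in data for v in row))
    # keepdims=1 keeps both reduced dimensions with size 1.
    return [[total]] if keepdims else total

data = [[3.0, 4.0], [0.0, 12.0]]
print(reduce_l2_all(data))              # [[13.0]]
print(reduce_l2_all(data, keepdims=0))  # 13.0
```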

### <a name="ReduceLogSum-1"></a>**ReduceLogSum-1**</a>

  Computes the log of the sum of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
  the resulting tensor has the reduced dimensions pruned.
  
  The above behavior is similar to numpy, with the exception that numpy defaults keepdims to
  False instead of True.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axes</tt> : list of ints</dt>
<dd>A list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor.</dd>
<dt><tt>keepdims</tt> : int</dt>
<dd>Whether to keep the reduced dimension. The default, 1, keeps the reduced dimension.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to high-precision numeric tensors.</dd>
</dl>
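
The name can be read two ways, so a short sketch may help: the reduction here is taken to be the logarithm of the sum, not the sum of logarithms. `reduce_log_sum` is a hypothetical helper working on a flat list of positive values.

```python
import math

def reduce_log_sum(values):
    """log(sum(values)) over a flat list of positive numbers."""
    return math.log(sum(values))

values = [1.0, 3.0]
log_of_sum = reduce_log_sum(values)               # log(4.0)
sum_of_logs = sum(math.log(v) for v in values)    # log(3.0), a different quantity
print(log_of_sum, sum_of_logs)
```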

### <a name="ReduceLogSumExp-1"></a>**ReduceLogSumExp-1**</a>

  Computes the log of the sum of exponentials of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
  the resulting tensor has the reduced dimensions pruned.
  
  The above behavior is similar to numpy, with the exception that numpy defaults keepdims to
  False instead of True.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axes</tt> : list of ints</dt>
<dd>A list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor.</dd>
<dt><tt>keepdims</tt> : int</dt>
<dd>Whether to keep the reduced dimension. The default, 1, keeps the reduced dimension.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to high-precision numeric tensors.</dd>
</dl>
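
Evaluating log(sum(exp(x))) naively overflows for large inputs. A common way around this, sketched below, is to subtract the maximum before exponentiating and add it back afterwards; this is a general numerical technique and not necessarily what any given ONNX backend does. `reduce_log_sum_exp` is a hypothetical helper over a flat list.

```python
import math

def reduce_log_sum_exp(values):
    """log(sum(exp(v))) over a flat list, shifted by the max for stability."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

small = reduce_log_sum_exp([0.0, 0.0])        # log(2)
large = reduce_log_sum_exp([1000.0, 1000.0])  # 1000 + log(2), no overflow
print(small, large)
```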

### <a name="ReduceMax-1"></a>**ReduceMax-1**</a>

  Computes the max of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
  the resulting tensor has the reduced dimensions pruned.
  
  The above behavior is similar to numpy, with the exception that numpy defaults keepdims to
  False instead of True.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axes</tt> : list of ints</dt>
<dd>A list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor.</dd>
<dt><tt>keepdims</tt> : int</dt>
<dd>Whether to keep the reduced dimension. The default, 1, keeps the reduced dimension.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to high-precision numeric tensors.</dd>
</dl>

### <a name="ReduceMean-1"></a>**ReduceMean-1**</a>

  Computes the mean of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
  the resulting tensor has the reduced dimensions pruned.
  
  The above behavior is similar to numpy, with the exception that numpy defaults keepdims to
  False instead of True.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axes</tt> : list of ints</dt>
<dd>A list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor.</dd>
<dt><tt>keepdims</tt> : int</dt>
<dd>Whether to keep the reduced dimension. The default, 1, keeps the reduced dimension.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to high-precision numeric tensors.</dd>
</dl>

### <a name="ReduceMin-1"></a>**ReduceMin-1**</a>

  Computes the min of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
  the resulting tensor has the reduced dimensions pruned.
  
  The above behavior is similar to numpy, with the exception that numpy defaults keepdims to
  False instead of True.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axes</tt> : list of ints</dt>
<dd>A list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor.</dd>
<dt><tt>keepdims</tt> : int</dt>
<dd>Whether to keep the reduced dimension. The default, 1, keeps the reduced dimension.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to high-precision numeric tensors.</dd>
</dl>

### <a name="ReduceProd-1"></a>**ReduceProd-1**</a>

  Computes the product of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
  the resulting tensor has the reduced dimensions pruned.
  
  The above behavior is similar to numpy, with the exception that numpy defaults keepdims to
  False instead of True.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axes</tt> : list of ints</dt>
<dd>A list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor.</dd>
<dt><tt>keepdims</tt> : int</dt>
<dd>Whether to keep the reduced dimension. The default, 1, keeps the reduced dimension.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to high-precision numeric tensors.</dd>
</dl>

### <a name="ReduceSum-1"></a>**ReduceSum-1**</a>

  Computes the sum of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
  the resulting tensor has the reduced dimensions pruned.
  
  The above behavior is similar to numpy, with the exception that numpy defaults keepdims to
  False instead of True.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axes</tt> : list of ints</dt>
<dd>A list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor.</dd>
<dt><tt>keepdims</tt> : int</dt>
<dd>Keep the reduced dimension or not; the default 1 means keep the reduced dimension.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to high-precision numeric tensors.</dd>
</dl>

### <a name="ReduceSumSquare-1"></a>**ReduceSumSquare-1**</a>

  Computes the sum of squares of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then
  the resulting tensor has the reduced dimensions pruned.
  
  The above behavior is similar to numpy, with the exception that numpy defaults keepdims to
  False instead of True.
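
A minimal sketch of the reduction, assuming "sum square" means the sum of the squared elements. The helper `reduce_sum_square_rows` is hypothetical and reduces only the last axis of a 2-D nested list.

```python
def reduce_sum_square_rows(data, keepdims=1):
    """Sum of squared elements along the last axis of a 2-D nested list."""
    reduced = [sum(v * v for v in row) for row in data]
    # keepdims=1 keeps the reduced axis with size 1.
    return [[v] for v in reduced] if keepdims else reduced


data = [[1, 2], [3, 4]]
kept = reduce_sum_square_rows(data)                # [[5], [25]]
pruned = reduce_sum_square_rows(data, keepdims=0)  # [5, 25]
```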

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axes</tt> : list of ints</dt>
<dd>A list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor.</dd>
<dt><tt>keepdims</tt> : int</dt>
<dd>Keep the reduced dimension or not; the default 1 means keep the reduced dimension.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to high-precision numeric tensors.</dd>
</dl>

### <a name="Relu-1"></a>**Relu-1**</a>

  Relu takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the rectified linear function, y = max(0, x), is applied to
  the tensor elementwise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Reshape-1"></a>**Reshape-1**</a>

  Reshape the input tensor similar to numpy.reshape.
  It takes a tensor as input and an argument `shape`. It outputs the reshaped tensor.
  At most one dimension of the new shape can be -1. In this case, the value is
  inferred from the size of the tensor and the remaining dimensions. A dimension
  could also be 0, in which case the actual dimension value is unchanged (i.e. taken
  from the input tensor).
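
The handling of `0` and `-1` entries can be sketched as a shape computation. `resolve_shape` is a hypothetical helper, not an ONNX API, and the error checks are assumptions about what a reasonable implementation would reject.

```python
from math import prod


def resolve_shape(input_shape, shape):
    """Resolve 0 and -1 entries of `shape` against `input_shape`."""
    # A 0 copies the input dimension at the same position.
    out = [input_shape[i] if d == 0 else d for i, d in enumerate(shape)]
    if out.count(-1) > 1:
        raise ValueError("at most one dimension can be -1")
    if -1 in out:
        # The -1 entry is inferred from the remaining element count.
        known = prod(d for d in out if d != -1)
        out[out.index(-1)] = prod(input_shape) // known
    if prod(out) != prod(input_shape):
        raise ValueError("element count mismatch")
    return out


flattened = resolve_shape([2, 3, 4], [0, -1])     # [2, 12]
regrouped = resolve_shape([2, 3, 4], [4, -1, 2])  # [4, 3, 2]
```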

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
<dt><tt>shape</tt> : list of ints</dt>
<dd>New shape</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reshaped</tt> : T</dt>
<dd>Reshaped data.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Scale-1"></a>**Scale-1**</a>

  Scale takes one input data (Tensor<float>) and produces one output data
  (Tensor<float>) whose value is the input data tensor scaled element-wise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>scale</tt> : float</dt>
<dd>(float, default 1.0) the scale to apply.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input data to be scaled</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Output data after scaling</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="ScaledTanh-1"></a>**ScaledTanh-1**</a>

  Calculates the scaled hyperbolic tangent of the given input tensor element-wise,
  alpha * tanh(beta * x).

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>alpha</tt> : float</dt>
<dd>Scaling value applied to the tanh output.</dd>
<dt><tt>beta</tt> : float</dt>
<dd>Scaling value applied to the input before tanh.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The scaled hyperbolic tangent values of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Selu-1"></a>**Selu-1**</a>

  Selu takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the scaled exponential linear unit function,
  `y = gamma * (alpha * e^x - alpha) for x <= 0`, `y = gamma * x for x > 0`,
  is applied to the tensor elementwise.
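
The piecewise formula above can be written for a single scalar as follows. This is only a sketch: `selu` is a hypothetical helper, and its default arguments are taken from the attribute defaults listed for this operator.

```python
import math


def selu(x, alpha=1.6732, gamma=1.0507):
    """Scaled exponential linear unit for one scalar value."""
    if x > 0:
        return gamma * x
    # For x <= 0: gamma * (alpha * e^x - alpha)
    return gamma * (alpha * math.exp(x) - alpha)


positive = selu(1.0)   # gamma * 1.0
negative = selu(-1.0)  # roughly -1.1113
```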

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>alpha</tt> : float</dt>
<dd>Coefficient of SELU default to 1.6732.</dd>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
<dt><tt>gamma</tt> : float</dt>
<dd>Coefficient of SELU default to 1.0507.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Shape-1"></a>**Shape-1**</a>

  Takes a tensor as input and outputs a 1D int64 tensor containing the shape of the input tensor.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>shape</tt> : T1</dt>
<dd>Shape of the input tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(uint8), tensor(uint16), tensor(bool)</dt>
<dd>Input tensor can be of arbitrary type.</dd>
<dt><tt>T1</tt> : tensor(int64)</dt>
<dd>Constrains output to int64 tensor.</dd>
</dl>

### <a name="Sigmoid-1"></a>**Sigmoid-1**</a>

  Sigmoid takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the sigmoid function, y = 1 / (1 + exp(-x)), is applied to the
  tensor elementwise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Size-1"></a>**Size-1**</a>

  Takes a tensor as input and outputs an int64 scalar equal to the total number of elements of the input tensor.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>size</tt> : T1</dt>
<dd>Total number of elements of the input tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(uint8), tensor(uint16), tensor(bool)</dt>
<dd>Input tensor can be of arbitrary type.</dd>
<dt><tt>T1</tt> : tensor(int64)</dt>
<dd>Constrains output to an int64 tensor, which should be a scalar.</dd>
</dl>

### <a name="Slice-1"></a>**Slice-1**</a>

  Produces a slice of the input tensor along multiple axes. Similar to numpy:
  https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
  Slice uses the `axes`, `starts` and `ends` attributes to specify the start and end
  index for each axis in the list of axes, and uses this information to
  slice the input `data` tensor. If a negative value is passed for any of the
  start or end indices, it represents the number of elements before the end of that
  dimension. If the value passed to start or end is larger than `n` (the
  number of elements in this dimension), it represents `n`. For slicing to the
  end of a dimension with unknown size, it is recommended to pass in `INT_MAX`.
  If `axes` are omitted, they are set to `[0, ..., ndim-1]`.
  Example 1:
    data = [
        [1, 2, 3, 4],
        [5, 6, 7, 8],
    ]
    axes = [0, 1]
    starts = [1, 0]
    ends = [2, 3]
    result = [
        [5, 6, 7],
    ]
  Example 2:
    data = [
        [1, 2, 3, 4],
        [5, 6, 7, 8],
    ]
    starts = [0, 1]
    ends = [-1, 1000]
    result = [
        [2, 3, 4],
    ]
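
The index rules above (negative values count from the end, oversized values are clamped to `n`) can be sketched for a 2-D nested list. `clamp_index` and `slice_2d` are hypothetical helpers; the sketch reproduces both examples above.

```python
def clamp_index(i, n):
    """Map a possibly negative or oversized index into the range [0, n]."""
    if i < 0:
        i += n  # negative values count from the end of the dimension
    return max(0, min(i, n))


def slice_2d(data, starts, ends, axes=None):
    """Slice a 2-D nested list using Slice-style attributes."""
    axes = list(range(len(starts))) if axes is None else axes
    shape = [len(data), len(data[0])]
    bounds = [(0, shape[0]), (0, shape[1])]  # untouched axes keep their full range
    for axis, start, end in zip(axes, starts, ends):
        n = shape[axis]
        bounds[axis] = (clamp_index(start, n), clamp_index(end, n))
    (r0, r1), (c0, c1) = bounds
    return [row[c0:c1] for row in data[r0:r1]]


data = [[1, 2, 3, 4], [5, 6, 7, 8]]
example_1 = slice_2d(data, starts=[1, 0], ends=[2, 3], axes=[0, 1])  # [[5, 6, 7]]
example_2 = slice_2d(data, starts=[0, 1], ends=[-1, 1000])           # [[2, 3, 4]]
```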

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axes</tt> : list of ints</dt>
<dd>Axes that `starts` and `ends` apply to. It's optional. If not present, will be treated as [0, 1, ..., len(`starts`) - 1].</dd>
<dt><tt>ends</tt> : list of ints (required)</dt>
<dd>Ending indices (exclusive) of corresponding axis in `axes`</dd>
<dt><tt>starts</tt> : list of ints (required)</dt>
<dd>Starting indices of corresponding axis in `axes`</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>Tensor of data to extract slices from.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Sliced data tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Softmax-1"></a>**Softmax-1**</a>

  The operator computes the softmax (normalized exponential) values for each layer in the batch
   of the given input. The input is a 2-D tensor (Tensor<float>) of size
  (batch_size x input_feature_dimensions). The output tensor has the same shape
  and contains the softmax values of the corresponding input.
  
  X does not need to explicitly be a 2D vector; rather, it will be
  coerced into one. For an arbitrary n-dimensional tensor
  X \in [a_0, a_1, ..., a_{k-1}, a_k, ..., a_{n-1}] and k is
  the axis provided, then X will be coerced into a 2-dimensional tensor with
  dimensions [a_0 * ... * a_{k-1}, a_k * ... * a_{n-1}]. For the default
  case where axis=1, this means the X tensor will be coerced into a 2D tensor
  of dimensions [a_0, a_1 * ... * a_{n-1}], where a_0 is often the batch size.
  In this situation, we must have a_0 = N and a_1 * ... * a_{n-1} = D.
  Each of these dimensions must be matched correctly, or else the operator
  will throw errors.
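
The coercion to 2-D and the per-row computation can be sketched separately. Both helpers below are hypothetical; subtracting the row maximum is a common numerical-stability step that does not change the result and is not something this document prescribes.

```python
import math
from math import prod


def coerced_2d_shape(shape, axis=1):
    """Shape of the 2-D view: [a_0 * ... * a_{k-1}, a_k * ... * a_{n-1}]."""
    return [prod(shape[:axis]), prod(shape[axis:])]


def softmax_row(row):
    """Softmax over one row of the coerced 2-D tensor."""
    m = max(row)  # stability shift (an assumption, see lead-in)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


default_view = coerced_2d_shape([2, 3, 4, 5])        # [2, 60]
axis_2_view = coerced_2d_shape([2, 3, 4, 5], axis=2)  # [6, 20]
uniform = softmax_row([0.0, 0.0])                     # [0.5, 0.5]
```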

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>(int) default to 1; describes the axis of the inputs when coerced to 2D; defaults to one because the 0th axis most likely describes the batch_size</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>The input tensor that's coerced into a 2D matrix of size (NxD) as described above.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The output values with the same shape as input tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Softplus-1"></a>**Softplus-1**</a>

  Softplus takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the softplus function, y = ln(exp(x) + 1), is applied to
  the tensor elementwise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>1D input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>1D output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Softsign-1"></a>**Softsign-1**</a>

  Calculates the softsign (x/(1+|x|)) of the given input tensor element-wise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The softsign (x/(1+|x|)) values of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="SpaceToDepth-1"></a>**SpaceToDepth-1**</a>

  SpaceToDepth rearranges blocks of spatial data into depth. More specifically,
  this op outputs a copy of the input tensor where values from the height and width dimensions
  are moved to the depth dimension.
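
The output shape listed for this operator follows directly from `blocksize`. The sketch below computes only the shape; `space_to_depth_shape` is a hypothetical helper, and requiring H and W to be divisible by `blocksize` is an assumption.

```python
def space_to_depth_shape(shape, blocksize):
    """Output shape [N, C * b * b, H / b, W / b] for an [N, C, H, W] input."""
    n, c, h, w = shape
    if h % blocksize or w % blocksize:
        # Assumption: spatial sizes must divide evenly into blocks.
        raise ValueError("H and W must be divisible by blocksize")
    return [n, c * blocksize * blocksize, h // blocksize, w // blocksize]


out_shape = space_to_depth_shape([1, 3, 4, 6], blocksize=2)  # [1, 12, 2, 3]
```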

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>blocksize</tt> : int (required)</dt>
<dd>Blocks of [blocksize, blocksize] are moved.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor of [N,C,H,W], where N is the batch axis, C is the channel or depth, H is the height and W is the width.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Output tensor of [N, C * blocksize * blocksize, H/blocksize, W/blocksize].</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input types to float tensors.</dd>
</dl>

### <a name="Split-1"></a>**Split-1**</a>

  Split a tensor into a list of tensors, along the specified
  'axis'. The lengths of the split can be specified using argument 'split' or
  optional second input blob to the operator. Otherwise, the tensor is split
  into equal-sized parts.
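
A sketch of both modes on a 1-D list: explicit lengths, and equal-sized parts when no lengths are given. `split_1d` and its `num_outputs` parameter are hypothetical; requiring the length to divide evenly in the equal-split case is an assumption.

```python
def split_1d(values, split=None, num_outputs=None):
    """Split a 1-D list by explicit lengths, or into equal-sized parts."""
    if split is None:
        if len(values) % num_outputs != 0:
            raise ValueError("cannot split into equal-sized parts")
        split = [len(values) // num_outputs] * num_outputs
    if sum(split) != len(values):
        raise ValueError("split lengths must sum to the axis length")
    outputs, start = [], 0
    for length in split:
        outputs.append(values[start:start + length])
        start += length
    return outputs


values = [1, 2, 3, 4, 5, 6]
by_lengths = split_1d(values, split=[2, 4])  # [[1, 2], [3, 4, 5, 6]]
equal = split_1d(values, num_outputs=3)      # [[1, 2], [3, 4], [5, 6]]
```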

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>Which axis to split on</dd>
<dt><tt>split</tt> : list of ints</dt>
<dd>length of each output</dd>
</dl>

#### Inputs (1 - 2)

<dl>
<dt><tt>input</tt> : T</dt>
<dd>The tensor to split</dd>
<dt><tt>split</tt> (optional) : T</dt>
<dd>Optional list of output lengths (see also arg 'split')</dd>
</dl>

#### Outputs (1 - &#8734;)

<dl>
<dt><tt>outputs...</tt> (variadic) : T</dt>
<dd>One or more outputs forming list of tensors after splitting</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input types to float tensors.</dd>
</dl>

### <a name="Sqrt-1"></a>**Sqrt-1**</a>

  Square root takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the square root, y = x^0.5, is applied to
  the tensor elementwise. If x is negative, then it will return NaN.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Squeeze-1"></a>**Squeeze-1**</a>

  Remove single-dimensional entries from the shape of a tensor.
  Takes a parameter `axes` with a list of axes to squeeze.
  If `axes` is not provided, all the single dimensions will be removed from
  the shape. If an axis is selected with shape entry not equal to one, an error is raised.
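
The effect on the shape can be sketched as follows. `squeeze_shape` is a hypothetical helper operating on a shape list rather than on tensor data.

```python
def squeeze_shape(shape, axes=None):
    """Shape after removing size-1 dimensions."""
    if axes is None:
        # No axes given: drop every dimension of size 1.
        return [d for d in shape if d != 1]
    for axis in axes:
        if shape[axis] != 1:
            raise ValueError("cannot squeeze axis %d with size %d" % (axis, shape[axis]))
    return [d for i, d in enumerate(shape) if i not in axes]


all_removed = squeeze_shape([1, 3, 1, 5])           # [3, 5]
one_removed = squeeze_shape([1, 3, 1, 5], axes=[0])  # [3, 1, 5]
```

Calling `squeeze_shape([1, 3, 1, 5], axes=[1])` raises, mirroring the error case described above.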

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axes</tt> : list of ints</dt>
<dd>List of non-negative integers indicating the dimensions to squeeze.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>Tensors with at least max(dims) dimensions.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>squeezed</tt> : T</dt>
<dd>Reshaped tensor with same data as input.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool)</dt>
<dd>Constrain input and output types to any tensor type.</dd>
</dl>

### <a name="Sub-1"></a>**Sub-1**</a>

  Performs element-wise binary subtraction (with limited broadcast support).
  
  If necessary the right-hand-side argument will be broadcast to match the
  shape of left-hand-side argument. When broadcasting is specified, the second
  tensor can either be of element size 1 (including a scalar tensor and any
  tensor with rank equal to or smaller than the first tensor), or having its
  shape as a contiguous subset of the first tensor's shape. The starting of the
  mutually equal shape is specified by the argument "axis", and if it is not set,
  suffix matching is assumed. 1-dim expansion doesn't work yet.
  
  For example, the following tensor shapes are supported (with broadcast=1):
  
    shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar tensor
    shape(A) = (2, 3, 4, 5), shape(B) = (1, 1), i.e. B is a 1-element tensor
    shape(A) = (2, 3, 4, 5), shape(B) = (5,)
    shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
    shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
    shape(A) = (2, 3, 4, 5), shape(B) = (2), with axis=0
  
  Attribute `broadcast=1` needs to be passed to enable broadcasting.
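
The shape rule above can be sketched as a compatibility check. `can_broadcast` is a hypothetical helper that encodes one reading of the rule: single-element tensors always fit, otherwise the shape of B must match a contiguous run of A's shape starting at `axis` (suffix matching when `axis` is not set).

```python
from math import prod


def can_broadcast(shape_a, shape_b, axis=None):
    """Check whether B's shape fits A's shape under the limited broadcast rule."""
    if prod(shape_b) == 1 and len(shape_b) <= len(shape_a):
        return True  # scalar or single-element tensor
    if axis is None:
        axis = len(shape_a) - len(shape_b)  # suffix matching
    if axis < 0:
        return False
    return list(shape_a[axis:axis + len(shape_b)]) == list(shape_b)


a = (2, 3, 4, 5)
supported = [
    can_broadcast(a, ()),
    can_broadcast(a, (1, 1)),
    can_broadcast(a, (5,)),
    can_broadcast(a, (4, 5)),
    can_broadcast(a, (3, 4), axis=1),
    can_broadcast(a, (2,), axis=0),
]
```

Every entry in `supported` is `True`, while `can_broadcast(a, (3, 4))` without `axis` is `False` because suffix matching compares against `(4, 5)`.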

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>If set, defines the broadcast dimensions. See doc for details.</dd>
<dt><tt>broadcast</tt> : int</dt>
<dd>Pass 1 to enable broadcasting</dd>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>First operand, should share the type with the second operand.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Second operand. With broadcasting can be of smaller size than A. If broadcasting is disabled it should be of the same size.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T</dt>
<dd>Result, has same dimensions and type as A</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Sum-1"></a>**Sum-1**</a>

  Element-wise sum of each of the input tensors. All inputs and outputs must
  have the same shape and data type.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>data_0</tt> (variadic) : T</dt>
<dd>List of tensors for Sum.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>sum</tt> : T</dt>
<dd>Output tensor. Same dimension as inputs.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Tanh-1"></a>**Tanh-1**</a>

  Calculates the hyperbolic tangent of the given input tensor element-wise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>consumed_inputs</tt> : list of ints</dt>
<dd>legacy optimization attribute.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>1-D input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The hyperbolic tangent values of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="ThresholdedRelu-1"></a>**ThresholdedRelu-1**</a>

  ThresholdedRelu takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the rectified linear function, y = x for x > alpha, y = 0 otherwise,
  is applied to the tensor elementwise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>alpha</tt> : float</dt>
<dd>Threshold value</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Tile-1"></a>**Tile-1**</a>

  Repeat the elements of a tensor along an axis.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor of any shape.</dd>
<dt><tt>tiles</tt> : T1</dt>
<dd>Number of repeated copies to make of the input tensor.</dd>
<dt><tt>axis</tt> : T1</dt>
<dd>Axis along which to repeat.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Output tensor of the same type and rank as the input, with the size along the given axis multiplied by the number of tiles.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input types to float tensors.</dd>
<dt><tt>T1</tt> : tensor(int64)</dt>
<dd>Constrain tiles and axis's type to int64 tensors.</dd>
</dl>

### <a name="TopK-1"></a>**TopK-1**</a>

  Retrieve the top-K elements along a specified axis. Given an input tensor of
  shape [a_1, a_2, ..., a_n, r] and integer argument k, return two outputs:
    -Value tensor of shape [a_1, a_2, ..., a_{axis-1}, k, a_{axis+1}, ... a_n]
      which contains the values of the top k elements along the specified axis
    -Index tensor of shape [a_1, a_2, ..., a_{axis-1}, k, a_{axis+1}, ... a_n] which
     contains the indices of the top k elements (original indices from the input
     tensor).
  
  Given two equivalent values, this operator uses the indices along the axis as
  a tiebreaker. That is, the element with the lower index will appear first.
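
The tiebreaking rule can be seen in a 1-D sketch, assuming "top" means the largest values. `top_k_1d` is a hypothetical helper; it relies on Python's stable sort so that equal values keep ascending index order.

```python
def top_k_1d(values, k):
    """Return (values, indices) of the k largest elements of a 1-D list."""
    # sorted() is stable, so among equal values the lower index comes first.
    order = sorted(range(len(values)), key=lambda i: -values[i])[:k]
    return [values[i] for i in order], order


top_values, top_indices = top_k_1d([1.0, 3.0, 2.0, 3.0], k=2)
# top_values == [3.0, 3.0], top_indices == [1, 3]
```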

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>Dimension on which to do the sort. Default -1, which indicates the last axis</dd>
<dt><tt>k</tt> : int (required)</dt>
<dd>Number of top elements to retrieve</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Tensor of shape [a_1, a_2, ..., a_n, r]</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Values</tt> : T</dt>
<dd>Tensor of shape [a_1, a_2, ..., a_{axis-1}, k, a_{axis+1}, ... a_n] containing top K values from the input tensor</dd>
<dt><tt>Indices</tt> : I</dt>
<dd>Tensor of shape [a_1, a_2, ..., a_{axis-1}, k, a_{axis+1}, ... a_n] containing the corresponding input tensor indices for the top K values.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
<dt><tt>I</tt> : tensor(int64)</dt>
<dd>Constrain index tensor to int64</dd>
</dl>

### <a name="Transpose-1"></a>**Transpose-1**</a>

  Transpose the input tensor similar to numpy.transpose. For example, when
  perm=(1, 0, 2), given an input tensor of shape (1, 2, 3), the output shape
  will be (2, 1, 3).
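
The shape example above, along with the default of reversing the dimensions when `perm` is not given, can be sketched as follows. `transposed_shape` is a hypothetical helper that computes only the output shape.

```python
def transposed_shape(shape, perm=None):
    """Output shape of a transpose; `perm` defaults to reversing the axes."""
    if perm is None:
        perm = list(range(len(shape)))[::-1]
    return [shape[p] for p in perm]


permuted = transposed_shape((1, 2, 3), perm=(1, 0, 2))  # [2, 1, 3]
reversed_dims = transposed_shape((1, 2, 3))             # [3, 2, 1]
```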

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>perm</tt> : list of ints</dt>
<dd>A list of integers. By default, reverse the dimensions, otherwise permute the axes according to the values given.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>transposed</tt> : T</dt>
<dd>Transposed output.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Unsqueeze-1"></a>**Unsqueeze-1**</a>

  Insert single-dimensional entries to the shape of a tensor.
  Takes one required argument `axes`, a list of dimensions that will be inserted.
  Dimension indices in `axes` are as seen in the output tensor. For example:
    Given a tensor with shape [3, 4, 5],
    Unsqueeze(tensor, axes=[0, 4]) has shape [1, 3, 4, 5, 1]
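
Because the indices in `axes` refer to positions in the output tensor, the output shape can be built by walking the output positions. `unsqueeze_shape` is a hypothetical helper used only to illustrate this.

```python
def unsqueeze_shape(shape, axes):
    """Output shape after inserting size-1 dimensions at output positions `axes`."""
    out_rank = len(shape) + len(axes)
    dims = iter(shape)
    # Positions listed in `axes` become 1; the rest take input dims in order.
    return [1 if i in axes else next(dims) for i in range(out_rank)]


expanded = unsqueeze_shape([3, 4, 5], axes=[0, 4])  # [1, 3, 4, 5, 1]
```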

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axes</tt> : list of ints (required)</dt>
<dd>List of non-negative integers indicating the dimensions to be inserted</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>Original tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>expanded</tt> : T</dt>
<dd>Reshaped tensor with same data as input.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool)</dt>
<dd>Constrain input and output types to any tensor type.</dd>
</dl>

### <a name="Upsample-1"></a>**Upsample-1**</a>

  Upsample the input tensor.
  The width and height of the output tensor are:
    output_width = floor(input_width * width_scale),
    output_height = floor(input_height * height_scale).
  Example:
    Given `data` tensor, width_scale, height_scale, mode,
    Upsample the input 4-D tensor in nearest mode:
    data = [[[
        [1, 2],
        [3, 4]
    ]]]
    width_scale = 2
    height_scale = 2
    mode = "nearest"
    output = [[[
        [1, 1, 2, 2],
        [1, 1, 2, 2],
        [3, 3, 4, 4],
        [3, 3, 4, 4]
    ]]]
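
The nearest-mode example above can be reproduced for a single 2-D plane. `upsample_nearest_2d` is a hypothetical helper; mapping each output position back with `int(i / scale)` is an assumption, since the text only specifies the output size.

```python
from math import floor


def upsample_nearest_2d(plane, height_scale, width_scale):
    """Nearest-neighbor upsampling of one H x W plane (nested list)."""
    in_h, in_w = len(plane), len(plane[0])
    out_h = floor(in_h * height_scale)
    out_w = floor(in_w * width_scale)
    return [
        [
            plane[min(int(y / height_scale), in_h - 1)][min(int(x / width_scale), in_w - 1)]
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


output = upsample_nearest_2d([[1, 2], [3, 4]], height_scale=2, width_scale=2)
```

For the 4-D case this would be applied to each (N, C) plane independently.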

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>height_scale</tt> : float (required)</dt>
<dd>The scale along height dimension. It takes value greater than or equal to 1.</dd>
<dt><tt>mode</tt> : string</dt>
<dd>Two interpolation modes: nearest(default), bilinear</dd>
<dt><tt>width_scale</tt> : float (required)</dt>
<dd>The scale along width dimension. It takes value greater than or equal to 1.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>4-D tensor, [N,C,H,W]</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>4-D tensor after resizing, [N,C,H,W]</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(bool), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain output types to bool, int32, int64, float16, float, double tensors.</dd>
</dl>

### <a name="Xor-1"></a>**Xor-1**</a>

  Returns the tensor resulting from performing the `xor` logical operation
  elementwise on the input tensors `A` and `B`.
  
  If broadcasting is enabled, the right-hand-side argument will be broadcast
  to match the shape of left-hand-side argument. See the doc of `Add` for a
  detailed description of the broadcasting rules.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>If set, defines the broadcast dimensions.</dd>
<dt><tt>broadcast</tt> : int</dt>
<dd>Enable broadcasting</dd>
</dl>

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>Left input tensor for the logical operator.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Right input tensor for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(bool)</dt>
<dd>Constrains input to boolean tensor.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrains output to boolean tensor.</dd>
</dl>

## Version 2 of the default ONNX operator set
### <a name="GlobalLpPool-2"></a>**GlobalLpPool-2**</a>

  GlobalLpPool consumes an input tensor X and applies Lp pooling across
   the values in the same channel. This is equivalent to LpPool with kernel size
   equal to the spatial dimension of input tensor.
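
For a single channel, global Lp pooling reduces all spatial values to one Lp norm. The sketch below assumes the usual definition of the Lp norm, (sum |x|^p)^(1/p); `global_lp_pool` is a hypothetical helper.

```python
def global_lp_pool(channel, p=2):
    """Lp norm over all spatial values of one channel (2-D nested list)."""
    return sum(abs(v) ** p for row in channel for v in row) ** (1.0 / p)


channel = [[3.0, 4.0], [0.0, 0.0]]
l2 = global_lp_pool(channel)       # 5.0
l1 = global_lp_pool(channel, p=1)  # 7.0
```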

#### Version

This version of the operator has been available since version 2 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>p</tt> : int</dt>
<dd>p value of the Lp norm used to pool over the input data, default is 2.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output data tensor from pooling across the input tensor. Dimensions will be N x C x 1 x 1</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="LpPool-2"></a>**LpPool-2**</a>

  LpPool consumes an input tensor X and applies Lp pooling across
   the tensor according to kernel sizes, stride sizes, and pad lengths.
   Lp pooling consists of computing the Lp norm on all values of a subset
   of the input tensor according to the kernel size and downsampling the
   data into the output tensor Y for further processing.

#### Version

This version of the operator has been available since version 2 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>auto_pad</tt> : string</dt>
<dd>auto_pad must be either SAME_UPPER, SAME_LOWER or VALID. SAME_UPPER and SAME_LOWER mean pad the input so that the output size matches the input. In case of an odd number, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER. VALID means no padding. DEPRECATION NOTE: auto_pad is only intended to support legacy uses; framework authors are explicitly encouraged to use explicit padding specified in the pads attribute.</dd>
<dt><tt>kernel_shape</tt> : list of ints (required)</dt>
<dd>The size of the kernel along each axis.</dd>
<dt><tt>p</tt> : int</dt>
<dd>p value of the Lp norm used to pool over the input data, default is 2.</dd>
<dt><tt>pads</tt> : list of ints</dt>
<dd>Padding for the beginning and ending along each axis; it can take any value greater than or equal to 0. The value represents the number of pixels added to the beginning and end part of the corresponding axis. The `pads` format should be as follows: [x1_begin, x2_begin...x1_end, x2_end,...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 along the start and end of each axis.</dd>
<dt><tt>strides</tt> : list of ints</dt>
<dd>Stride along each axis. If not present, the stride defaults to 1 along each axis.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output data tensor from Lp pooling across the input tensor. Dimensions will vary based on various kernel, stride, and pad sizes.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Pad-2"></a>**Pad-2**</a>

  Given a `data` tensor, `pads`, `mode`, and `value`, Pad produces a padded copy of `data`.
  Example:
    Insert two 0-valued pads at the beginning of the second dimension.
    data = [
        [1.0, 1.2],
        [2.3, 3.4],
        [4.5, 5.7],
    ]
    pads = [0, 2, 0, 0]
    output = [
        [0.0, 0.0, 1.0, 1.2],
        [0.0, 0.0, 2.3, 3.4],
        [0.0, 0.0, 4.5, 5.7],
    ]
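
The example above can be reproduced with a short sketch of constant-mode padding on a 2-D list of rows. The helper name `pad_2d_constant` is hypothetical, and the sketch assumes non-negative `pads` only; the `reflect` and `edge` modes and negative pads are not covered.

```python
def pad_2d_constant(data, pads, value=0.0):
    """Constant-mode padding of a 2-D list of rows.

    pads = [x1_begin, x2_begin, x1_end, x2_end]; non-negative values only.
    """
    rows_begin, cols_begin, rows_end, cols_end = pads
    width = cols_begin + len(data[0]) + cols_end
    # Rows added before the data along the first axis.
    padded = [[value] * width for _ in range(rows_begin)]
    # Existing rows, padded along the second axis.
    for row in data:
        padded.append([value] * cols_begin + list(row) + [value] * cols_end)
    # Rows added after the data along the first axis.
    padded.extend([value] * width for _ in range(rows_end))
    return padded


data = [[1.0, 1.2], [2.3, 3.4], [4.5, 5.7]]
for row in pad_2d_constant(data, [0, 2, 0, 0]):
    print(row)
```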

#### Version

This version of the operator has been available since version 2 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>mode</tt> : string</dt>
<dd>Three modes: constant(default), reflect, edge</dd>
<dt><tt>pads</tt> : list of ints (required)</dt>
<dd>List of integers indicating the number of padding elements to add or remove (if negative) at the beginning and end of each axis. For 2D it is the number of pixels. The length of `pads` should be twice the input's rank. The `pads` format should be as follows: [x1_begin, x2_begin...x1_end, x2_end,...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`.</dd>
<dt><tt>value</tt> : float</dt>
<dd>One float, indicates the value to be filled, default is 0</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>Input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Tensor after padding.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Split-2"></a>**Split-2**</a>

  Split a tensor into a list of tensors, along the specified
  'axis'. Lengths of the parts can be specified using argument 'split'.
  Otherwise, the tensor is split to equal sized parts.
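
As an illustration of the two cases above, here is a sketch that splits a 1-D list. The name `split_list`, the `num_outputs` parameter, and the choice to raise an error when the length does not divide evenly are assumptions of this example.

```python
def split_list(values, split=None, num_outputs=2):
    """Split a 1-D list into parts along its only axis."""
    if split is None:
        # No explicit lengths: fall back to equal-sized parts.
        if len(values) % num_outputs != 0:
            raise ValueError("length must divide evenly when 'split' is omitted")
        split = [len(values) // num_outputs] * num_outputs
    if sum(split) != len(values):
        raise ValueError("'split' lengths must sum to the axis length")
    parts, start = [], 0
    for length in split:
        parts.append(values[start:start + length])
        start += length
    return parts


print(split_list([1, 2, 3, 4, 5, 6], split=[2, 4]))
print(split_list([1, 2, 3, 4, 5, 6], num_outputs=3))
```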

#### Version

This version of the operator has been available since version 2 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>Which axis to split on (defaults to 0)</dd>
<dt><tt>split</tt> : list of ints</dt>
<dd>length of each output</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>The tensor to split</dd>
</dl>

#### Outputs (1 - &#8734;)

<dl>
<dt><tt>outputs</tt> (variadic) : T</dt>
<dd>One or more outputs forming list of tensors after splitting</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input types to float tensors.</dd>
</dl>

## Version 3 of the default ONNX operator set
### <a name="GRU-3"></a>**GRU-3**</a>

  Computes a one-layer GRU. This operator is usually supported via some custom
  implementation such as CuDNN.
  
  Notations:
  
  `X` - input tensor
  
  `z` - update gate
  
  `r` - reset gate
  
  `h` - hidden gate
  
  `t` - time step (t-1 means previous time step)
  
  `W[zrh]` - W parameter weight matrix for update, reset, and hidden gates
  
  `R[zrh]` - R recurrence weight matrix for update, reset, and hidden gates
  
  `Wb[zrh]` - W bias vectors for update, reset, and hidden gates
  
  `Rb[zrh]` - R bias vectors for update, reset, and hidden gates
  
  `WB[zrh]` - W parameter weight matrix for backward update, reset, and hidden gates
  
  `RB[zrh]` - R recurrence weight matrix for backward update, reset, and hidden gates
  
  `WBb[zrh]` - W bias vectors for backward update, reset, and hidden gates
  
  `RBb[zrh]` - R bias vectors for backward update, reset, and hidden gates
  
  `H` - Hidden state
  
  `num_directions` - 2 if direction == bidirectional else 1
  
  Activation functions:
  
    Relu(x)                - max(0, x)
  
    Tanh(x)                - (1 - e^{-2x})/(1 + e^{-2x})
  
    Sigmoid(x)             - 1/(1 + e^{-x})
  
    (NOTE: Below are optional)
  
    Affine(x)              - alpha*x + beta
  
    LeakyRelu(x)           - x if x >= 0 else alpha * x
  
    ThresholdedRelu(x)     - x if x >= alpha else 0
  
    ScaledTanh(x)          - alpha*Tanh(beta*x)
  
    HardSigmoid(x)         - min(max(alpha*x + beta, 0), 1)
  
    Elu(x)                 - x if x >= 0 else alpha*(e^x - 1)
  
    Softsign(x)            - x/(1 + |x|)
  
    Softplus(x)            - log(1 + e^x)
  
  Equations (Default: f=Sigmoid, g=Tanh):
  
    - zt = f(Xt*(Wz^T) + Ht-1*Rz + Wbz + Rbz)
  
    - rt = f(Xt*(Wr^T) + Ht-1*Rr + Wbr + Rbr)
  
    - ht = g(Xt*(Wh^T) + (rt (.) Ht-1)*Rh + Rbh + Wbh) # default, when linear_before_reset = 0
  
    - ht = g(Xt*(Wh^T) + (rt (.) (Ht-1*Rh + Rbh)) + Wbh) # when linear_before_reset != 0
  
    - Ht = (1 - zt) (.) ht + zt (.) Ht-1
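
The equations above can be traced with a sketch of a single forward time step. To keep it short it assumes `input_size = hidden_size = 1`, so every weight is a scalar and the transposes reduce to plain multiplication. The function name `gru_step` and the dictionary layout of the weights are choices of this example only, and the default activations (f=Sigmoid, g=Tanh) are hard-coded.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x_t, h_prev, w, r, wb, rb, linear_before_reset=0):
    """One GRU time step for scalar input and scalar hidden state.

    w, r, wb, rb are dicts keyed by gate name: 'z', 'r', 'h'.
    """
    z_t = sigmoid(x_t * w["z"] + h_prev * r["z"] + wb["z"] + rb["z"])
    r_t = sigmoid(x_t * w["r"] + h_prev * r["r"] + wb["r"] + rb["r"])
    if linear_before_reset:
        # Apply the recurrent linear transformation first, then the reset gate.
        h_cand = math.tanh(x_t * w["h"] + r_t * (h_prev * r["h"] + rb["h"]) + wb["h"])
    else:
        # Default: apply the reset gate to the previous state first.
        h_cand = math.tanh(x_t * w["h"] + (r_t * h_prev) * r["h"] + rb["h"] + wb["h"])
    return (1.0 - z_t) * h_cand + z_t * h_prev


zeros = {"z": 0.0, "r": 0.0, "h": 0.0}
print(gru_step(1.0, 0.8, zeros, zeros, zeros, zeros))
```

With all weights and biases at zero, both gates evaluate to 0.5 and the candidate state is 0, so the new state is half of the previous one. The two `linear_before_reset` variants differ only in how the recurrent bias `Rbh` interacts with the reset gate.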

#### Version

This version of the operator has been available since version 3 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>activation_alpha</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM. Default values are the same as those of the corresponding ONNX operators. For example, with LeakyRelu, the default alpha is 0.01.</dd>
<dt><tt>activation_beta</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM. Default values are the same as of corresponding ONNX operators.</dd>
<dt><tt>activations</tt> : list of strings</dt>
<dd>A list of 2 (or 4 if bidirectional) activation functions for update, reset, and hidden gates. The activation functions must be one of the activation functions specified above. Optional: See the equations for default if not specified.</dd>
<dt><tt>clip</tt> : float</dt>
<dd>Cell clip threshold. Clipping bounds the elements of a tensor in the range of [-threshold, +threshold] and is applied to the input of activations. No clip if not specified.</dd>
<dt><tt>direction</tt> : string</dt>
<dd>Specify if the RNN is forward, reverse, or bidirectional. Must be one of forward (default), reverse, or bidirectional.</dd>
<dt><tt>hidden_size</tt> : int</dt>
<dd>Number of neurons in the hidden layer</dd>
<dt><tt>linear_before_reset</tt> : int</dt>
<dd>When computing the output of the hidden gate, apply the linear transformation before multiplying by the output of the reset gate.</dd>
<dt><tt>output_sequence</tt> : int</dt>
<dd>The sequence output for the hidden is optional if 0. Default 0.</dd>
</dl>

#### Inputs (3 - 6)

<dl>
<dt><tt>X</tt> : T</dt>
<dd>The input sequences packed (and potentially padded) into one 3-D tensor with the shape of `[seq_length, batch_size, input_size]`.</dd>
<dt><tt>W</tt> : T</dt>
<dd>The weight tensor for the gates. Concatenation of `W[zrh]` and `WB[zrh]` (if bidirectional) along dimension 0. This tensor has shape `[num_directions, 3*hidden_size, input_size]`.</dd>
<dt><tt>R</tt> : T</dt>
<dd>The recurrence weight tensor. Concatenation of `R[zrh]` and `RB[zrh]` (if bidirectional) along dimension 0. This tensor has shape `[num_directions, 3*hidden_size, hidden_size]`.</dd>
<dt><tt>B</tt> (optional) : T</dt>
<dd>The bias tensor for the gates. Concatenation of `[Wb[zrh], Rb[zrh]]` and `[WBb[zrh], RBb[zrh]]` (if bidirectional) along dimension 0. This tensor has shape `[num_directions, 6*hidden_size]`. Optional: If not specified - assumed to be 0</dd>
<dt><tt>sequence_lens</tt> (optional) : T1</dt>
<dd>Optional tensor specifying lengths of the sequences in a batch. If not specified - assumed all sequences in the batch to have length `seq_length`. It has shape `[batch_size]`.</dd>
<dt><tt>initial_h</tt> (optional) : T</dt>
<dd>Optional initial value of the hidden. If not specified - assumed to be 0. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
</dl>

#### Outputs (0 - 2)

<dl>
<dt><tt>Y</tt> (optional) : T</dt>
<dd>A tensor that concats all the intermediate output values of the hidden. It has shape `[seq_length, num_directions, batch_size, hidden_size]`. It is optional if `output_sequence` is 0.</dd>
<dt><tt>Y_h</tt> (optional) : T</dt>
<dd>The last output value of the hidden. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
<dt><tt>T1</tt> : tensor(int32)</dt>
<dd>Constrain seq_lens to integer tensor.</dd>
</dl>

## Version 4 of the default ONNX operator set
### <a name="Concat-4"></a>**Concat-4**</a>

  Concatenate a list of tensors into a single tensor

#### Version

This version of the operator has been available since version 4 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int (required)</dt>
<dd>Which axis to concat on</dd>
</dl>

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>inputs</tt> (variadic) : T</dt>
<dd>List of tensors for concatenation</dd>
</dl>

#### Outputs

<dl>
<dt><tt>concat_result</tt> : T</dt>
<dd>Concatenated tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool)</dt>
<dd>Constrain output types to any tensor type.</dd>
</dl>

## Version 5 of the default ONNX operator set
### <a name="Reshape-5"></a>**Reshape-5**</a>

  Reshape the input tensor similar to numpy.reshape.
  First input is the data tensor, second input is a shape tensor which specifies the output shape. It outputs the reshaped tensor.
  At most one dimension of the new shape can be -1. In this case, the value is
  inferred from the size of the tensor and the remaining dimensions. A dimension
  could also be 0, in which case the actual dimension value is unchanged (i.e. taken
  from the input tensor).
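
The handling of `-1` and `0` described above can be sketched as a small shape resolver. The function name `resolve_reshape` and its error messages are assumptions of this example; it works on shapes only and does not move any data.

```python
def resolve_reshape(input_shape, shape):
    """Resolve 0 and -1 entries of `shape` against `input_shape`."""
    if list(shape).count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    resolved = []
    for i, dim in enumerate(shape):
        if dim == 0:
            if i >= len(input_shape):
                raise ValueError("0 refers to a dimension the input does not have")
            dim = input_shape[i]  # copy the dimension from the input
        resolved.append(dim)
    total = 1
    for dim in input_shape:
        total *= dim
    known = 1
    for dim in resolved:
        if dim != -1:
            known *= dim
    if -1 in resolved:
        if known == 0 or total % known != 0:
            raise ValueError("cannot infer the -1 dimension")
        # Infer the missing dimension from the remaining element count.
        resolved[resolved.index(-1)] = total // known
    elif known != total:
        raise ValueError("element count must be preserved")
    return resolved


print(resolve_reshape([2, 3, 4], [0, -1]))  # [2, 12]
```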

#### Version

This version of the operator has been available since version 5 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>An input tensor.</dd>
<dt><tt>shape</tt> : tensor(int64)</dt>
<dd>Specified shape for output.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reshaped</tt> : T</dt>
<dd>Reshaped data.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

## Version 6 of the default ONNX operator set
### <a name="Abs-6"></a>**Abs-6**</a>

  Absolute takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the absolute value, y = abs(x), is applied to
  the tensor elementwise.

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to all numeric tensors.</dd>
</dl>

### <a name="Add-6"></a>**Add-6**</a>

  Performs element-wise binary addition (with limited broadcast support).
  
  If necessary the right-hand-side argument will be broadcasted to match the
  shape of left-hand-side argument. When broadcasting is specified, the second
  tensor can either be of element size 1 (including a scalar tensor and any
  tensor with rank equal to or smaller than the first tensor), or having its
  shape as a contiguous subset of the first tensor's shape. The starting of the
  mutually equal shape is specified by the argument "axis", and if it is not set,
  suffix matching is assumed. 1-dim expansion doesn't work yet.
  
  For example, the following tensor shapes are supported (with broadcast=1):
  
    shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar tensor
    shape(A) = (2, 3, 4, 5), shape(B) = (1, 1), i.e. B is a 1-element tensor
    shape(A) = (2, 3, 4, 5), shape(B) = (5,)
    shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
    shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
    shape(A) = (2, 3, 4, 5), shape(B) = (2,), with axis=0
  
  Attribute `broadcast=1` needs to be passed to enable broadcasting.
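
The shape rule above can be expressed as a small compatibility check. This is a sketch of the rule as worded in this section, not of any runtime's implementation; the function name `legacy_broadcast_ok` is hypothetical.

```python
def legacy_broadcast_ok(shape_a, shape_b, axis=None):
    """Return True if shape_b can be broadcast to shape_a under the rule above."""
    size_b = 1
    for dim in shape_b:
        size_b *= dim
    # A single-element B (scalar or all-ones shape) is always accepted.
    if size_b == 1 and len(shape_b) <= len(shape_a):
        return True
    # Without an explicit axis, B must match the trailing dimensions of A.
    if axis is None:
        axis = len(shape_a) - len(shape_b)
    if axis < 0:
        return False
    # B's shape must be a contiguous run of A's shape starting at `axis`.
    return tuple(shape_a[axis:axis + len(shape_b)]) == tuple(shape_b)


a = (2, 3, 4, 5)
print(legacy_broadcast_ok(a, (4, 5)))          # True
print(legacy_broadcast_ok(a, (3, 4), axis=1))  # True
print(legacy_broadcast_ok(a, (3, 4)))          # False
```

The last call fails because, without `axis`, suffix matching compares `(3, 4)` against the trailing dimensions `(4, 5)`.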

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>If set, defines the broadcast dimensions. See doc for details.</dd>
<dt><tt>broadcast</tt> : int</dt>
<dd>Pass 1 to enable broadcasting</dd>
</dl>

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>First operand, should share the type with the second operand.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Second operand. With broadcasting can be of smaller size than A. If broadcasting is disabled it should be of the same size.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T</dt>
<dd>Result, has same dimensions and type as A</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to high-precision numeric tensors.</dd>
</dl>

### <a name="BatchNormalization-6"></a>**BatchNormalization-6**</a>

  Carries out batch normalization as described in the paper
  https://arxiv.org/abs/1502.03167. Depending on the mode in which it is run,
  there are multiple cases for the number of outputs, which we list below:
  
  Output case #1: Y, mean, var, saved_mean, saved_var (training mode)
  Output case #2: Y (test mode)

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>epsilon</tt> : float</dt>
<dd>The epsilon value to use to avoid division by zero, default is 1e-5f.</dd>
<dt><tt>is_test</tt> : int</dt>
<dd>If set to nonzero, run spatial batch normalization in test mode, default is 0.</dd>
<dt><tt>momentum</tt> : float</dt>
<dd>Factor used in computing the running mean and variance, e.g., running_mean = running_mean * momentum + mean * (1 - momentum). Default is 0.9f.</dd>
<dt><tt>spatial</tt> : int</dt>
<dd>If true, compute the mean and variance across all spatial elements. If false, compute the mean and variance per feature. Default is 1.</dd>
</dl>
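
The running-statistics update given for the `momentum` attribute can be sketched as follows, applied per channel. The function name `update_running_stat` is hypothetical; the same formula is assumed here to apply to the running variance as well as the running mean.

```python
def update_running_stat(running, batch_stat, momentum=0.9):
    """Blend per-channel running statistics with the current batch statistics."""
    return [r * momentum + s * (1.0 - momentum) for r, s in zip(running, batch_stat)]


running_mean = update_running_stat([1.0, 0.0], [0.0, 0.0])
print(running_mean)
```

A momentum close to 1 keeps most of the previous running value, so the running statistics change slowly from batch to batch.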

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size.</dd>
<dt><tt>scale</tt> : T</dt>
<dd>The scale as a 1-dimensional tensor of size C to be applied to the output.</dd>
<dt><tt>B</tt> : T</dt>
<dd>The bias as a 1-dimensional tensor of size C to be applied to the output.</dd>
<dt><tt>mean</tt> : T</dt>
<dd>The running mean (training) or the estimated mean (testing) as a 1-dimensional tensor of size C.</dd>
<dt><tt>var</tt> : T</dt>
<dd>The running variance (training) or the estimated variance (testing) as a 1-dimensional tensor of size C.</dd>
</dl>

#### Outputs (1 - 5)

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>The output tensor of the same shape as X.</dd>
<dt><tt>mean</tt> (optional) : T</dt>
<dd>The running mean after the BatchNormalization operator. Must be in-place with the input mean. Should not be used for testing.</dd>
<dt><tt>var</tt> (optional) : T</dt>
<dd>The running variance after the BatchNormalization operator. Must be in-place with the input var. Should not be used for testing.</dd>
<dt><tt>saved_mean</tt> (optional) : T</dt>
<dd>Saved mean used during training to speed up gradient computation. Should not be used for testing.</dd>
<dt><tt>saved_var</tt> (optional) : T</dt>
<dd>Saved variance used during training to speed up gradient computation. Should not be used for testing.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Cast-6"></a>**Cast-6**</a>

  The operator casts the elements of a given input tensor to a data type
  specified by the 'to' argument and returns an output tensor of the same size in
  the converted type. The 'to' argument must be one of the data types specified
  in the 'DataType' enum field in the TensorProto message.
  NOTE: Casting to and from strings is not supported yet.

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>to</tt> : int (required)</dt>
<dd>The data type to which the elements of the input tensor are cast. Must be one of the types from the DataType enum in TensorProto.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T1</dt>
<dd>Input tensor to be cast.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T2</dt>
<dd>Output tensor with the same shape as input with type specified by the 'to' argument</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(float16), tensor(float), tensor(double), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(bool)</dt>
<dd>Constrain input types. Casting from strings and complex types is not supported.</dd>
<dt><tt>T2</tt> : tensor(float16), tensor(float), tensor(double), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(bool)</dt>
<dd>Constrain output types. Casting to strings and complex types is not supported.</dd>
</dl>

### <a name="Ceil-6"></a>**Ceil-6**</a>

  Ceil takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the ceil function, y = ceil(x), is applied to
  the tensor elementwise.

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Clip-6"></a>**Clip-6**</a>

  Clip operator limits the given input within an interval. The interval is
  specified with arguments 'min' and 'max'. They default to
  numeric_limits::lowest() and numeric_limits::max() respectively.
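
A minimal sketch of the clipping rule on a flat list of values. Python floats are doubles, so `sys.float_info.max` and its negation stand in here for `numeric_limits::max()` and `numeric_limits::lowest()`; that mapping and the function name `clip` are assumptions of this example.

```python
import sys


def clip(values, min_value=-sys.float_info.max, max_value=sys.float_info.max):
    """Limit every element of `values` to the interval [min_value, max_value]."""
    return [min(max(v, min_value), max_value) for v in values]


print(clip([-2.0, 0.5, 3.0], min_value=-1.0, max_value=1.0))  # [-1.0, 0.5, 1.0]
```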

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>max</tt> : float</dt>
<dd>Maximum value, above which element is replaced by max</dd>
<dt><tt>min</tt> : float</dt>
<dd>Minimum value, under which element is replaced by min</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor whose elements are to be clipped</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Output tensor with clipped input elements</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Div-6"></a>**Div-6**</a>

  Performs element-wise binary division (with limited broadcast support).
  
  If necessary the right-hand-side argument will be broadcasted to match the
  shape of left-hand-side argument. When broadcasting is specified, the second
  tensor can either be of element size 1 (including a scalar tensor and any
  tensor with rank equal to or smaller than the first tensor), or having its
  shape as a contiguous subset of the first tensor's shape. The starting of the
  mutually equal shape is specified by the argument "axis", and if it is not set,
  suffix matching is assumed. 1-dim expansion doesn't work yet.
  
  For example, the following tensor shapes are supported (with broadcast=1):
  
    shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar tensor
    shape(A) = (2, 3, 4, 5), shape(B) = (1, 1), i.e. B is a 1-element tensor
    shape(A) = (2, 3, 4, 5), shape(B) = (5,)
    shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
    shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
    shape(A) = (2, 3, 4, 5), shape(B) = (2,), with axis=0
  
  Attribute `broadcast=1` needs to be passed to enable broadcasting.

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>If set, defines the broadcast dimensions. See doc for details.</dd>
<dt><tt>broadcast</tt> : int</dt>
<dd>Pass 1 to enable broadcasting</dd>
</dl>

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>First operand, should share the type with the second operand.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Second operand. With broadcasting can be of smaller size than A. If broadcasting is disabled it should be of the same size.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T</dt>
<dd>Result, has same dimensions and type as A</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to high-precision numeric tensors.</dd>
</dl>

### <a name="Dropout-6"></a>**Dropout-6**</a>

  Dropout takes one input data (Tensor<float>) and produces two Tensor outputs,
  output (Tensor<float>) and mask (Tensor<bool>). Depending on whether it is in
  test mode or not, the output Y will either be a random dropout, or a simple
  copy of the input. Note that our implementation of Dropout does scaling in
  the training phase, so during testing nothing needs to be done.
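
The two modes can be sketched as below. The text above says only that scaling happens in the training phase; the factor `1 / (1 - ratio)` used here is the common "inverted dropout" convention and is an assumption of this example, as are the function name `dropout` and the `seed` parameter.

```python
import random


def dropout(values, ratio=0.5, is_test=0, seed=0):
    """Return (output, mask); in test mode the output is a copy and mask is None."""
    if is_test:
        return list(values), None
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must be in [0, 1) for this sketch")
    rng = random.Random(seed)
    # True marks an element that is kept.
    mask = [rng.random() >= ratio for _ in values]
    scale = 1.0 / (1.0 - ratio)  # assumed training-time scaling
    output = [v * scale if keep else 0.0 for v, keep in zip(values, mask)]
    return output, mask


test_out, test_mask = dropout([1.0, 2.0, 3.0], is_test=1)
train_out, train_mask = dropout([1.0] * 8, ratio=0.5)
print(test_out, train_out)
```

Because the kept elements are scaled up during training, the test-mode path can return the input unchanged.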

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>is_test</tt> : int</dt>
<dd>(int, default 0) if nonzero, run dropout in test mode where the output is simply Y = X.</dd>
<dt><tt>ratio</tt> : float</dt>
<dd>(float, default 0.5) the ratio of random dropout</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>The input data as Tensor.</dd>
</dl>

#### Outputs (1 - 2)

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The output.</dd>
<dt><tt>mask</tt> (optional) : T</dt>
<dd>The output mask. If is_test is nonzero, this output is not filled.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Elu-6"></a>**Elu-6**</a>

  Elu takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the function `f(x) = alpha * (exp(x) - 1.) for x <
  0`, `f(x) = x for x >= 0`, is applied to the tensor elementwise.
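
The piecewise definition above, written as a short sketch over a flat list of values (the function name `elu` is a choice of this example):

```python
import math


def elu(values, alpha=1.0):
    """Apply f(x) = x for x >= 0, alpha * (exp(x) - 1) for x < 0, elementwise."""
    return [x if x >= 0 else alpha * (math.exp(x) - 1.0) for x in values]


print(elu([-1.0, 0.0, 2.0]))
```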
  

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>alpha</tt> : float</dt>
<dd>Coefficient of ELU, defaults to 1.0.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>1D input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>1D output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Exp-6"></a>**Exp-6**</a>

  Calculates the exponential of the given input tensor, element-wise.

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The exponential of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Floor-6"></a>**Floor-6**</a>

  Floor takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the floor function, y = floor(x), is applied to
  the tensor elementwise.

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Gemm-6"></a>**Gemm-6**</a>

  General Matrix multiplication:
  https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms#Level_3
  Compute Y = alpha * A * B + beta * C, where input tensor A has
  dimension (M X K), input tensor B has dimension (K X N), input tensor C and
  output tensor Y have dimension (M X N).
  If attribute broadcast is non-zero, input tensor C will be broadcasted to match
  the dimension requirement. A will be transposed before doing the computation
  if attribute transA is non-zero, same for B and transB.
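
A sketch of the computation on nested lists. It assumes C already has shape (M X N), so broadcasting of C is not covered. The function name `gemm`, the snake_case parameter names, and the defaults `alpha=1.0`, `beta=1.0` are assumptions of this example.

```python
def gemm(a, b, c, alpha=1.0, beta=1.0, trans_a=0, trans_b=0):
    """Compute alpha * A * B + beta * C for matrices given as lists of rows."""

    def transpose(m):
        return [list(col) for col in zip(*m)]

    if trans_a:
        a = transpose(a)
    if trans_b:
        b = transpose(b)
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [
            alpha * sum(a[i][k] * b[k][j] for k in range(inner)) + beta * c[i][j]
            for j in range(cols)
        ]
        for i in range(rows)
    ]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[1, 1], [1, 1]]
print(gemm(A, B, C))
```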

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>alpha</tt> : float</dt>
<dd>Scalar multiplier for the product of input tensors A * B</dd>
<dt><tt>beta</tt> : float</dt>
<dd>Scalar multiplier for input tensor C</dd>
<dt><tt>broadcast</tt> : int</dt>
<dd>Whether C should be broadcasted</dd>
<dt><tt>transA</tt> : int</dt>
<dd>Whether A should be transposed</dd>
<dt><tt>transB</tt> : int</dt>
<dd>Whether B should be transposed</dd>
</dl>

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>Input tensor A</dd>
<dt><tt>B</tt> : T</dt>
<dd>Input tensor B</dd>
<dt><tt>C</tt> : T</dt>
<dd>Input tensor C</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="HardSigmoid-6"></a>**HardSigmoid-6**</a>

  HardSigmoid takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the HardSigmoid function, y = max(0, min(1, alpha * x + beta)),
  is applied to the tensor elementwise.
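
The HardSigmoid formula above, as a sketch over a flat list of values. The defaults mirror the attribute defaults listed for this operator; the function name `hard_sigmoid` is a choice of this example.

```python
def hard_sigmoid(values, alpha=0.2, beta=0.5):
    """Apply y = max(0, min(1, alpha * x + beta)) elementwise."""
    return [max(0.0, min(1.0, alpha * x + beta)) for x in values]


print(hard_sigmoid([-10.0, 0.0, 10.0]))  # [0.0, 0.5, 1.0]
```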

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>alpha</tt> : float</dt>
<dd>Value of alpha, defaults to 0.2.</dd>
<dt><tt>beta</tt> : float</dt>
<dd>Value of beta, defaults to 0.5.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="InstanceNormalization-6"></a>**InstanceNormalization-6**</a>

  Carries out instance normalization as described in the paper
  https://arxiv.org/abs/1607.08022.
  
  y = scale * (x - mean) / sqrt(variance + epsilon) + B,
  where mean and variance are computed per instance per channel.
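
The formula above can be sketched for the values of one channel of one instance, flattened into a list. The function name `instance_norm_channel` is hypothetical, and the sketch assumes the population variance (division by the number of elements).

```python
import math


def instance_norm_channel(values, scale, bias, epsilon=1e-5):
    """Normalize the values of a single (instance, channel) slice."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return [scale * (v - mean) / math.sqrt(variance + epsilon) + bias for v in values]


print(instance_norm_channel([1.0, 3.0], scale=1.0, bias=0.0))
```

For a full (N x C x H x W) tensor, the same function would be applied once per (n, c) pair, with `scale` and `bias` taken from position `c` of the `scale` and `B` inputs.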
  

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>epsilon</tt> : float</dt>
<dd>The epsilon value to use to avoid division by zero, default is 1e-5f.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size.</dd>
<dt><tt>scale</tt> : T</dt>
<dd>The input 1-dimensional scale tensor of size C.</dd>
<dt><tt>B</tt> : T</dt>
<dd>The input 1-dimensional bias tensor of size C.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The output tensor of the same shape as input.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="LeakyRelu-6"></a>**LeakyRelu-6**</a>

  LeakyRelu takes input data (Tensor<T>) and an argument alpha, and produces one
  output data (Tensor<T>) where the function `f(x) = alpha * x for x < 0`,
  `f(x) = x for x >= 0`, is applied to the data tensor elementwise.
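
The LeakyRelu definition above, as a sketch over a flat list of values. The default `alpha=0.01` matches the attribute default listed for this operator; the function name `leaky_relu` is a choice of this example.

```python
def leaky_relu(values, alpha=0.01):
    """Apply f(x) = x for x >= 0, alpha * x for x < 0, elementwise."""
    return [x if x >= 0 else alpha * x for x in values]


print(leaky_relu([-2.0, 0.0, 3.0], alpha=0.1))
```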

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>alpha</tt> : float</dt>
<dd>Coefficient of leakage, defaults to 0.01.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Log-6"></a>**Log-6**</a>

  Calculates the natural log of the given input tensor, element-wise.

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The natural log of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Max-6"></a>**Max-6**</a>

  Element-wise max of each of the input tensors. All inputs and outputs must
  have the same shape and data type.

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>data_0</tt> (variadic) : T</dt>
<dd>List of tensors for Max.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>max</tt> : T</dt>
<dd>Output tensor. Same dimension as inputs.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Mean-6"></a>**Mean-6**</a>

  Element-wise mean of each of the input tensors. All inputs and outputs must
  have the same shape and data type.
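
A sketch of the element-wise mean over a variadic list of equally shaped inputs, here flat lists. The function name `elementwise_mean` and the error raised on mismatched lengths are assumptions of this example. Max and Min follow the same pattern with `max(column)` and `min(column)` in place of the average.

```python
def elementwise_mean(*tensors):
    """Average any number of equally sized flat lists position by position."""
    if not tensors:
        raise ValueError("at least one input is required")
    if any(len(t) != len(tensors[0]) for t in tensors):
        raise ValueError("all inputs must have the same shape")
    return [sum(column) / len(tensors) for column in zip(*tensors)]


print(elementwise_mean([1.0, 2.0], [3.0, 6.0]))  # [2.0, 4.0]
```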

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>data_0</tt> (variadic) : T</dt>
<dd>List of tensors for Mean.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>mean</tt> : T</dt>
<dd>Output tensor. Same dimension as inputs.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Min-6"></a>**Min-6**</a>

  Element-wise min of each of the input tensors. All inputs and outputs must
  have the same shape and data type.

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>data_0</tt> (variadic) : T</dt>
<dd>List of tensors for Min.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>min</tt> : T</dt>
<dd>Output tensor. Same dimension as inputs.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Mul-6"></a>**Mul-6**</a>

  Performs element-wise binary multiplication (with limited broadcast support).
  
  If necessary, the right-hand-side argument will be broadcast to match the
  shape of the left-hand-side argument. When broadcasting is specified, the second
  tensor can either have element size 1 (including a scalar tensor and any
  tensor with rank equal to or smaller than the first tensor), or have a
  shape that is a contiguous subset of the first tensor's shape. The start of the
  mutually equal shape is specified by the argument "axis"; if it is not set,
  suffix matching is assumed. 1-dim expansion doesn't work yet.
  
  For example, the following tensor shapes are supported (with broadcast=1):
  
    shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar tensor
    shape(A) = (2, 3, 4, 5), shape(B) = (1, 1), i.e. B is a 1-element tensor
    shape(A) = (2, 3, 4, 5), shape(B) = (5,)
    shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
    shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
    shape(A) = (2, 3, 4, 5), shape(B) = (2), with axis=0
  
  Attribute `broadcast=1` needs to be passed to enable broadcasting.
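
The legacy broadcasting rule above can be sketched in NumPy as follows. This is a minimal sketch, not a reference implementation: the helper name `legacy_broadcast_mul` is hypothetical, only non-negative `axis` values are handled, and B is aligned with A by appending trailing dimensions of size 1.

```python
import numpy as np

def legacy_broadcast_mul(a, b, axis=None):
    """Multiply a by b, broadcasting b against a starting at `axis`."""
    a = np.asarray(a)
    b = np.asarray(b)
    if b.size == 1:
        # A single-element B (scalar or shape like (1, 1)) acts as a scalar.
        return a * b.reshape(())
    if axis is None:
        # Suffix matching: B lines up with the trailing dimensions of A.
        axis = a.ndim - b.ndim
    if a.shape[axis:axis + b.ndim] != b.shape:
        raise ValueError("shape(B) must be a contiguous subset of shape(A)")
    # Append 1-sized dimensions so B aligns with A from `axis` onward.
    trailing = a.ndim - axis - b.ndim
    return a * b.reshape(b.shape + (1,) * trailing)

a = np.ones((2, 3, 4, 5), dtype=np.float32)
out_suffix = legacy_broadcast_mul(a, np.full((4, 5), 2.0, dtype=np.float32))
out_axis1 = legacy_broadcast_mul(a, np.full((3, 4), 3.0, dtype=np.float32), axis=1)
out_axis0 = legacy_broadcast_mul(a, np.full((2,), 4.0, dtype=np.float32), axis=0)
```

In every case the result keeps the shape of A, which matches the description of output `C`.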

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>If set, defines the broadcast dimensions. See doc for details.</dd>
<dt><tt>broadcast</tt> : int</dt>
<dd>Pass 1 to enable broadcasting</dd>
</dl>

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>First operand, should share the type with the second operand.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Second operand. With broadcasting can be of smaller size than A. If broadcasting is disabled it should be of the same size.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T</dt>
<dd>Result, has same dimensions and type as A</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to high-precision numeric tensors.</dd>
</dl>

### <a name="Neg-6"></a>**Neg-6**</a>

  Neg takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the sign of each element is flipped, i.e. y = -x is applied to
  the tensor elementwise.

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float), tensor(int32), tensor(int8), tensor(int16), tensor(int64), tensor(float16), tensor(double)</dt>
<dd>Constrain input and output types to signed numeric tensors.</dd>
</dl>

### <a name="PRelu-6"></a>**PRelu-6**</a>

  PRelu takes input data (Tensor<T>) and slope tensor as input, and produces one
  output data (Tensor<T>) where the function `f(x) = slope * x for x < 0`,
  `f(x) = x for x >= 0`, is applied to the data tensor elementwise.
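
A minimal NumPy sketch of the piecewise function, assuming a single shared slope value (the size-1 case). The helper name `prelu` is hypothetical.

```python
import numpy as np

def prelu(x, slope):
    """Apply f(x) = slope * x for x < 0 and f(x) = x for x >= 0, element-wise."""
    x = np.asarray(x)
    return np.where(x < 0, slope * x, x)

x = np.array([-2.0, -0.5, 0.0, 3.0], dtype=np.float32)
y = prelu(x, np.float32(0.25))
```

Negative entries are scaled by the slope while non-negative entries pass through unchanged.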
  

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
<dt><tt>slope</tt> : T</dt>
<dd>Slope tensor. If `Slope` is of size 1, the value is shared across different channels.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Reciprocal-6"></a>**Reciprocal-6**</a>

  Reciprocal takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the reciprocal, y = 1/x, is applied to
  the tensor elementwise.

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Relu-6"></a>**Relu-6**</a>

  Relu takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the rectified linear function, y = max(0, x), is applied to
  the tensor elementwise.

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Selu-6"></a>**Selu-6**</a>

  Selu takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the scaled exponential linear unit function,
  `y = gamma * (alpha * e^x - alpha) for x <= 0`, `y = gamma * x for x > 0`,
  is applied to the tensor elementwise.
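
For illustration, the two branches can be written in plain Python. The function name `selu` is hypothetical; the constants are the float32 default values listed under Attributes for this operator.

```python
import math

# Default coefficients as listed in the attribute descriptions.
ALPHA = 1.67326319217681884765625
GAMMA = 1.05070102214813232421875

def selu(x, alpha=ALPHA, gamma=GAMMA):
    """Scaled exponential linear unit for a single scalar value."""
    if x <= 0:
        return gamma * (alpha * math.exp(x) - alpha)
    return gamma * x

# Element-wise application over a flat list of values.
selu_values = [selu(v) for v in (-1.0, 0.0, 2.0)]
```

The two branches meet at x = 0, where both evaluate to 0.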

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>alpha</tt> : float</dt>
<dd>Coefficient of SELU; defaults to 1.67326319217681884765625 (i.e., float32 approximation of 1.6732632423543772848170429916717).</dd>
<dt><tt>gamma</tt> : float</dt>
<dd>Coefficient of SELU; defaults to 1.05070102214813232421875 (i.e., float32 approximation of 1.0507009873554804934193349852946).</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Sigmoid-6"></a>**Sigmoid-6**</a>

  Sigmoid takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the sigmoid function, y = 1 / (1 + exp(-x)), is applied to the
  tensor elementwise.

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Sqrt-6"></a>**Sqrt-6**</a>

  Square root takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the square root, y = x^0.5, is applied to
  the tensor elementwise. If x is negative, then it will return NaN.

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Sub-6"></a>**Sub-6**</a>

  Performs element-wise binary subtraction (with limited broadcast support).
  
  If necessary, the right-hand-side argument will be broadcast to match the
  shape of the left-hand-side argument. When broadcasting is specified, the second
  tensor can either have element size 1 (including a scalar tensor and any
  tensor with rank equal to or smaller than the first tensor), or have a
  shape that is a contiguous subset of the first tensor's shape. The start of the
  mutually equal shape is specified by the argument "axis"; if it is not set,
  suffix matching is assumed. 1-dim expansion doesn't work yet.
  
  For example, the following tensor shapes are supported (with broadcast=1):
  
    shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar tensor
    shape(A) = (2, 3, 4, 5), shape(B) = (1, 1), i.e. B is a 1-element tensor
    shape(A) = (2, 3, 4, 5), shape(B) = (5,)
    shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
    shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
    shape(A) = (2, 3, 4, 5), shape(B) = (2), with axis=0
  
  Attribute `broadcast=1` needs to be passed to enable broadcasting.

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>If set, defines the broadcast dimensions. See doc for details.</dd>
<dt><tt>broadcast</tt> : int</dt>
<dd>Pass 1 to enable broadcasting</dd>
</dl>

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>First operand, should share the type with the second operand.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Second operand. With broadcasting can be of smaller size than A. If broadcasting is disabled it should be of the same size.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T</dt>
<dd>Result, has same dimensions and type as A</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to high-precision numeric tensors.</dd>
</dl>

### <a name="Sum-6"></a>**Sum-6**</a>

  Element-wise sum of each of the input tensors. All inputs and outputs must
  have the same shape and data type.

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>data_0</tt> (variadic) : T</dt>
<dd>List of tensors for Sum.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>sum</tt> : T</dt>
<dd>Output tensor. Same dimension as inputs.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Tanh-6"></a>**Tanh-6**</a>

  Calculates the hyperbolic tangent of the given input tensor element-wise.

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The hyperbolic tangent values of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Tile-6"></a>**Tile-6**</a>

  Constructs a tensor by tiling a given tensor.
  This is the same as the function `tile` in Numpy, but without broadcasting.
  For example, A = [[1, 2], [3, 4]], B = [1, 2], tile(A, B) = [[1, 2, 1, 2], [3, 4, 3, 4]]
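
The example above can be reproduced with `numpy.tile`. In this sketch the wrapper name `onnx_tile` is hypothetical; it adds the length check that distinguishes this operator from NumPy's more permissive behavior.

```python
import numpy as np

def onnx_tile(x, repeats):
    """Tile x, requiring exactly one repeat count per input dimension."""
    x = np.asarray(x)
    repeats = np.asarray(repeats, dtype=np.int64)
    if repeats.ndim != 1 or repeats.shape[0] != x.ndim:
        # numpy.tile would broadcast here; this operator does not.
        raise ValueError("repeats must be 1D with one entry per input dimension")
    return np.tile(x, tuple(int(r) for r in repeats))

a = np.array([[1, 2], [3, 4]], dtype=np.float32)
tiled = onnx_tile(a, [1, 2])
```

Each output dimension equals the input dimension multiplied by the matching entry of `repeats`.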

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor of any shape.</dd>
<dt><tt>repeats</tt> : T1</dt>
<dd>1D int64 tensor of the same length as input's dimension number; it specifies the number of repeated copies along each of input's dimensions.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Output tensor of the same dimension and type as tensor input. output_dim[i] = input_dim[i] * repeats[i]</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output's types to float tensors.</dd>
<dt><tt>T1</tt> : tensor(int64)</dt>
<dd>Constrain repeat's type to int64 tensors.</dd>
</dl>

## Version 7 of the default ONNX operator set
### <a name="Acos-7"></a>**Acos-7**</a>

  Calculates the arccosine (inverse of cosine) of the given input tensor, element-wise.

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The arccosine of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Add-7"></a>**Add-7**</a>

  Performs element-wise binary addition (with Numpy-style broadcasting support).
  
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).
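
Because the broadcasting is described as Numpy-style, a short NumPy snippet can illustrate the resulting shapes. The tensors below are made-up examples.

```python
import numpy as np

# B has lower rank and a 1-sized dimension; it is expanded to match A.
a = np.ones((2, 3, 4, 5), dtype=np.float32)
b = np.ones((3, 1, 5), dtype=np.float32)
c = a + b

# Multidirectional: both operands are expanded to a common shape.
left = np.ones((4, 1), dtype=np.float32)
right = np.ones((1, 5), dtype=np.float32)
d = left + right
```

Unlike the opset-6 rule, neither operand is required to already have the output shape.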

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>First operand.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Second operand.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T</dt>
<dd>Result, has same element type as two inputs</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to high-precision numeric tensors.</dd>
</dl>

### <a name="And-7"></a>**And-7**</a>

  Returns the tensor resulting from performing the `and` logical operation
  elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).
  
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>First input operand for the logical operator.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Second input operand for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(bool)</dt>
<dd>Constrains input to boolean tensor.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrains output to boolean tensor.</dd>
</dl>

### <a name="Asin-7"></a>**Asin-7**</a>

  Calculates the arcsine (inverse of sine) of the given input tensor, element-wise.

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The arcsine of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Atan-7"></a>**Atan-7**</a>

  Calculates the arctangent (inverse of tangent) of the given input tensor, element-wise.

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The arctangent of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="AveragePool-7"></a>**AveragePool-7**</a>

  AveragePool consumes an input tensor X and applies average pooling across
   the tensor according to kernel sizes, stride sizes, and pad lengths.
   Average pooling consists of computing the average of all values in a
   subset of the input tensor according to the kernel size and downsampling the
   data into the output tensor Y for further processing. The output spatial shape is computed as follows:
   ```
   output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - kernel_spatial_shape[i]) / strides_spatial_shape[i] + 1)
  
   * pad_shape[i] is sum of pads along axis i
   ```
  
   `auto_pad` is a DEPRECATED attribute. If you are currently using it, the output spatial shape is computed as follows:
   ```
   VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - kernel_spatial_shape[i] + 1) / strides_spatial_shape[i])
   SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])
   ```
   The pad shape for `SAME_UPPER` or `SAME_LOWER` is then:
   ```
   pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + kernel_spatial_shape[i] - input_spatial_shape[i]
   ```
   The output of each pooling window is divided by the number of elements (excluding padding when attribute count_include_pad is zero).
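
The shape formulas above translate directly into plain Python. This is a sketch under stated assumptions: the function names are hypothetical, and the shape arguments hold spatial dimensions only (for example H and W, without N and C).

```python
import math

def explicit_pad_output_shape(input_shape, kernel_shape, strides, pads):
    """floor((in + pad_shape - kernel) / stride + 1) for each spatial axis."""
    n = len(kernel_shape)
    out = []
    for i in range(n):
        pad_shape = pads[i] + pads[i + n]  # begin + end padding on axis i
        out.append(math.floor(
            (input_shape[i] + pad_shape - kernel_shape[i]) / strides[i] + 1))
    return out

def auto_pad_output_shape(input_shape, kernel_shape, strides, auto_pad):
    """Output spatial shape for the deprecated auto_pad modes."""
    out = []
    for size, k, s in zip(input_shape, kernel_shape, strides):
        if auto_pad == "VALID":
            out.append(math.ceil((size - k + 1) / s))
        elif auto_pad in ("SAME_UPPER", "SAME_LOWER"):
            out.append(math.ceil(size / s))
        else:
            raise ValueError("unknown auto_pad mode: " + auto_pad)
    return out

def same_pad_shape(input_shape, kernel_shape, strides):
    """Total padding per axis implied by SAME_UPPER or SAME_LOWER."""
    out = auto_pad_output_shape(input_shape, kernel_shape, strides, "SAME_UPPER")
    return [(o - 1) * s + k - size
            for o, s, k, size in zip(out, strides, kernel_shape, input_shape)]

explicit_shape = explicit_pad_output_shape([32, 32], [3, 3], [2, 2], [1, 1, 1, 1])
valid_shape = auto_pad_output_shape([32, 32], [3, 3], [2, 2], "VALID")
same_shape = auto_pad_output_shape([32, 32], [3, 3], [2, 2], "SAME_UPPER")
same_pads = same_pad_shape([32, 32], [3, 3], [2, 2])
```

For a 32 x 32 input with a 3 x 3 kernel and stride 2, the `SAME_*` modes need one pixel of total padding per axis, placed at the end for `SAME_UPPER` and at the beginning for `SAME_LOWER`.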
   

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>auto_pad</tt> : string</dt>
<dd>auto_pad must be either SAME_UPPER, SAME_LOWER or VALID. SAME_UPPER and SAME_LOWER mean pad the input so that the output size matches the input. If the padding amount is odd, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER. VALID means no padding. DEPRECATION NOTE: auto_pad is only intended to support legacy uses; framework authors are explicitly encouraged to use explicit padding specified in the pads attribute.</dd>
<dt><tt>count_include_pad</tt> : int</dt>
<dd>Whether to include pad pixels when calculating values for the edges.</dd>
<dt><tt>kernel_shape</tt> : list of ints (required)</dt>
<dd>The size of the kernel along each axis.</dd>
<dt><tt>pads</tt> : list of ints</dt>
<dd>Padding for the beginning and ending along each axis; it can take any value greater than or equal to 0. Each value represents the number of pixels added to the beginning or end of the corresponding axis. The `pads` format should be as follows: [x1_begin, x2_begin...x1_end, x2_end,...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 along the start and end of each axis.</dd>
<dt><tt>strides</tt> : list of ints</dt>
<dd>Stride along each axis. If not present, the stride defaults to 1 along each axis.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size. Optionally, if dimension denotation is in effect, the operation expects the input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE ...].</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output data tensor from average or max pooling across the input tensor. Dimensions will vary based on various kernel, stride, and pad sizes. Floor value of the dimension is used</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="BatchNormalization-7"></a>**BatchNormalization-7**</a>

  Carries out batch normalization as described in the paper
  https://arxiv.org/abs/1502.03167. Depending on the mode in which it is run,
  there are multiple cases for the number of outputs, which we list below:
  
  Output case #1: Y, mean, var, saved_mean, saved_var (training mode)
  Output case #2: Y (test mode)
  
  This operator has **optional** inputs/outputs. See [the doc](IR.md) for more details about the representation of optional arguments. An empty string may be used in the place of an actual argument's name to indicate a missing argument. Trailing optional arguments (those not followed by an argument that is present) may also be simply omitted.
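
A rough NumPy sketch of the test-mode computation (output case #2) is given below. It assumes the standard normalization formula from the referenced paper, `spatial=1`, and an input laid out as (N x C x D1 ... Dn). The helper names are hypothetical, and the running-statistics update follows the expression given for the `momentum` attribute.

```python
import numpy as np

def batchnorm_inference(x, scale, b, mean, var, epsilon=1e-5):
    """Normalize x per channel with fixed statistics, then scale and shift."""
    # Reshape the size-C vectors to (1, C, 1, ..., 1) so they broadcast over x.
    shape = (1, -1) + (1,) * (x.ndim - 2)
    x_hat = (x - mean.reshape(shape)) / np.sqrt(var.reshape(shape) + epsilon)
    return scale.reshape(shape) * x_hat + b.reshape(shape)

def update_running(running, batch_stat, momentum=0.9):
    """running = running * momentum + batch_stat * (1 - momentum)."""
    return running * momentum + batch_stat * (1 - momentum)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 3, 4, 4))
mean = x.mean(axis=(0, 2, 3))
var = x.var(axis=(0, 2, 3))
y = batchnorm_inference(x, np.ones(3), np.zeros(3), mean, var)
new_running_mean = update_running(np.zeros(3), np.ones(3))
```

With unit scale and zero bias, each channel of `y` has approximately zero mean and unit variance when the supplied statistics match the data.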

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>epsilon</tt> : float</dt>
<dd>The epsilon value to use to avoid division by zero, default is 1e-5f.</dd>
<dt><tt>momentum</tt> : float</dt>
<dd>Factor used in computing the running mean and variance, e.g., running_mean = running_mean * momentum + mean * (1 - momentum). Default is 0.9f.</dd>
<dt><tt>spatial</tt> : int</dt>
<dd>If true, compute the mean and variance across all spatial elements. If false, compute the mean and variance per feature. Default is 1.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size.</dd>
<dt><tt>scale</tt> : T</dt>
<dd>The scale as a 1-dimensional tensor of size C to be applied to the output.</dd>
<dt><tt>B</tt> : T</dt>
<dd>The bias as a 1-dimensional tensor of size C to be applied to the output.</dd>
<dt><tt>mean</tt> : T</dt>
<dd>The running mean (training) or the estimated mean (testing) as a 1-dimensional tensor of size C.</dd>
<dt><tt>var</tt> : T</dt>
<dd>The running variance (training) or the estimated variance (testing) as a 1-dimensional tensor of size C.</dd>
</dl>

#### Outputs (1 - 5)

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>The output tensor of the same shape as X.</dd>
<dt><tt>mean</tt> (optional) : T</dt>
<dd>The running mean after the BatchNormalization operator.</dd>
<dt><tt>var</tt> (optional) : T</dt>
<dd>The running variance after the BatchNormalization operator.</dd>
<dt><tt>saved_mean</tt> (optional) : T</dt>
<dd>Saved mean used during training to speed up gradient computation.</dd>
<dt><tt>saved_var</tt> (optional) : T</dt>
<dd>Saved variance used during training to speed up gradient computation.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Cos-7"></a>**Cos-7**</a>

  Calculates the cosine of the given input tensor, element-wise.

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The cosine of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Div-7"></a>**Div-7**</a>

  Performs element-wise binary division (with Numpy-style broadcasting support).
  
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>First operand.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Second operand.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T</dt>
<dd>Result, has same element type as two inputs</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to high-precision numeric tensors.</dd>
</dl>

### <a name="Dropout-7"></a>**Dropout-7**</a>

  Dropout takes one input data (Tensor<float>) and produces two Tensor outputs,
  output (Tensor<float>) and mask (Tensor<bool>). Depending on whether it is in
  test mode or not, the output Y will either be a random dropout, or a simple
  copy of the input. Note that our implementation of Dropout does scaling in
  the training phase, so during testing nothing needs to be done.
  This operator has **optional** inputs/outputs. See [the doc](IR.md) for more details about the representation of optional arguments. An empty string may be used in the place of an actual argument's name to indicate a missing argument. Trailing optional arguments (those not followed by an argument that is present) may also be simply omitted.
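
The two modes can be sketched in NumPy as follows. The description only says that scaling happens in the training phase; this sketch assumes the common "inverted dropout" convention of scaling kept values by `1 / (1 - ratio)` with `ratio < 1`. The function names and the `rng` parameter are hypothetical.

```python
import numpy as np

def dropout_train(data, ratio=0.5, rng=None):
    """Randomly zero elements and rescale the rest (assumed convention)."""
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(data.shape) >= ratio  # True where the value is kept
    scale = 1.0 / (1.0 - ratio)             # assumption: inverted dropout
    return (data * mask * scale).astype(data.dtype), mask

def dropout_test(data):
    """Test mode: a simple copy of the input."""
    return data.copy()

data = np.ones((4, 5), dtype=np.float32)
out, mask = dropout_train(data, ratio=0.5, rng=np.random.default_rng(0))
copied = dropout_test(data)
```

Because the rescaling is done during training, the test-mode path needs no adjustment.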

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>ratio</tt> : float</dt>
<dd>(float, default 0.5) the ratio of random dropout</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> : T</dt>
<dd>The input data as Tensor.</dd>
</dl>

#### Outputs (1 - 2)

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The output.</dd>
<dt><tt>mask</tt> (optional) : T</dt>
<dd>The output mask.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Equal-7"></a>**Equal-7**</a>

  Returns the tensor resulting from performing the `equal` logical operation
  elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).
  
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>First input operand for the logical operator.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Second input operand for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(bool), tensor(int32), tensor(int64)</dt>
<dd>Constrains input to integral tensors.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrains output to boolean tensor.</dd>
</dl>

### <a name="GRU-7"></a>**GRU-7**</a>

  Computes a one-layer GRU. This operator is usually supported via some custom
  implementation such as CuDNN.
  
  Notations:
  
  `X` - input tensor
  
  `z` - update gate
  
  `r` - reset gate
  
  `h` - hidden gate
  
  `t` - time step (t-1 means previous time step)
  
  `W[zrh]` - W parameter weight matrix for update, reset, and hidden gates
  
  `R[zrh]` - R recurrence weight matrix for update, reset, and hidden gates
  
  `Wb[zrh]` - W bias vectors for update, reset, and hidden gates
  
  `Rb[zrh]` - R bias vectors for update, reset, and hidden gates
  
  `WB[zrh]` - W parameter weight matrix for backward update, reset, and hidden gates
  
  `RB[zrh]` - R recurrence weight matrix for backward update, reset, and hidden gates
  
  `WBb[zrh]` - W bias vectors for backward update, reset, and hidden gates
  
  `RBb[zrh]` - R bias vectors for backward update, reset, and hidden gates
  
  `H` - Hidden state
  
  `num_directions` - 2 if direction == bidirectional else 1
  
  Activation functions:
  
    Relu(x)                - max(0, x)
  
    Tanh(x)                - (1 - e^{-2x})/(1 + e^{-2x})
  
    Sigmoid(x)             - 1/(1 + e^{-x})
  
    (NOTE: Below are optional)
  
    Affine(x)              - alpha*x + beta
  
    LeakyRelu(x)           - x if x >= 0 else alpha * x
  
    ThresholdedRelu(x)     - x if x >= alpha else 0
  
    ScaledTanh(x)          - alpha*Tanh(beta*x)
  
    HardSigmoid(x)         - min(max(alpha*x + beta, 0), 1)
  
    Elu(x)                 - x if x >= 0 else alpha*(e^x - 1)
  
    Softsign(x)            - x/(1 + |x|)
  
    Softplus(x)            - log(1 + e^x)
  
  Equations (Default: f=Sigmoid, g=Tanh):
  
    - zt = f(Xt*(Wz^T) + Ht-1*(Rz^T) + Wbz + Rbz)
  
    - rt = f(Xt*(Wr^T) + Ht-1*(Rr^T) + Wbr + Rbr)
  
    - ht = g(Xt*(Wh^T) + (rt (.) Ht-1)*(Rh^T) + Rbh + Wbh) # default, when linear_before_reset = 0
  
    - ht = g(Xt*(Wh^T) + (rt (.) (Ht-1*(Rh^T) + Rbh)) + Wbh) # when linear_before_reset != 0
  
    - Ht = (1 - zt) (.) ht + zt (.) Ht-1
  
  This operator has **optional** inputs/outputs. See [the doc](IR.md) for more details about the representation of optional arguments. An empty string may be used in the place of an actual argument's name to indicate a missing argument. Trailing optional arguments (those not followed by an argument that is present) may also be simply omitted.
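
One time step of the equations can be sketched in NumPy. This is a minimal sketch under several assumptions: a single forward direction, the default activations (f=Sigmoid, g=Tanh), no clipping, and gates stacked in the order z, r, h along the first axis, as the `W[zrh]` notation suggests. The function name `gru_step` is hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, W, R, Wb, Rb, linear_before_reset=0):
    """Compute Ht from Xt and Ht-1.

    x_t: [batch_size, input_size], h_prev: [batch_size, hidden_size],
    W: [3*hidden_size, input_size], R: [3*hidden_size, hidden_size],
    Wb, Rb: [3*hidden_size].
    """
    Wz, Wr, Wh = np.split(W, 3, axis=0)
    Rz, Rr, Rh = np.split(R, 3, axis=0)
    Wbz, Wbr, Wbh = np.split(Wb, 3)
    Rbz, Rbr, Rbh = np.split(Rb, 3)
    z = sigmoid(x_t @ Wz.T + h_prev @ Rz.T + Wbz + Rbz)  # update gate
    r = sigmoid(x_t @ Wr.T + h_prev @ Rr.T + Wbr + Rbr)  # reset gate
    if linear_before_reset:
        # Recurrent linear transform first, then the reset gate.
        h = np.tanh(x_t @ Wh.T + r * (h_prev @ Rh.T + Rbh) + Wbh)
    else:
        # Reset gate applied to Ht-1 before the recurrent transform.
        h = np.tanh(x_t @ Wh.T + (r * h_prev) @ Rh.T + Rbh + Wbh)
    return (1 - z) * h + z * h_prev

batch_size, input_size, hidden_size = 2, 3, 4
rng = np.random.default_rng(0)
x_t = rng.normal(size=(batch_size, input_size))
W = rng.normal(size=(3 * hidden_size, input_size))
R = rng.normal(size=(3 * hidden_size, hidden_size))
Wb = rng.normal(size=3 * hidden_size)
Rb = rng.normal(size=3 * hidden_size)
h0 = np.zeros((batch_size, hidden_size))
h1 = gru_step(x_t, h0, W, R, Wb, Rb)
h1_lbr = gru_step(x_t, h0, W, R, Wb, Rb, linear_before_reset=1)
```

A full operator would loop this step over `seq_length`, optionally run a second pass in reverse, and stack the per-step results into `Y`.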

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>activation_alpha</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM. Default values are the same as of corresponding ONNX operators. For example with LeakyRelu, the default alpha is 0.01.</dd>
<dt><tt>activation_beta</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM. Default values are the same as of corresponding ONNX operators.</dd>
<dt><tt>activations</tt> : list of strings</dt>
<dd>A list of 2 (or 4 if bidirectional) activation functions for update, reset, and hidden gates. The activation functions must be one of the activation functions specified above. Optional: See the equations for default if not specified.</dd>
<dt><tt>clip</tt> : float</dt>
<dd>Cell clip threshold. Clipping bounds the elements of a tensor in the range of [-threshold, +threshold] and is applied to the input of activations. No clip if not specified.</dd>
<dt><tt>direction</tt> : string</dt>
<dd>Specify if the RNN is forward, reverse, or bidirectional. Must be one of forward (default), reverse, or bidirectional.</dd>
<dt><tt>hidden_size</tt> : int</dt>
<dd>Number of neurons in the hidden layer</dd>
<dt><tt>linear_before_reset</tt> : int</dt>
<dd>When computing the output of the hidden gate, apply the linear transformation before multiplying by the output of the reset gate.</dd>
</dl>

#### Inputs (3 - 6)

<dl>
<dt><tt>X</tt> : T</dt>
<dd>The input sequences packed (and potentially padded) into one 3-D tensor with the shape of `[seq_length, batch_size, input_size]`.</dd>
<dt><tt>W</tt> : T</dt>
<dd>The weight tensor for the gates. Concatenation of `W[zrh]` and `WB[zrh]` (if bidirectional) along dimension 0. This tensor has shape `[num_directions, 3*hidden_size, input_size]`.</dd>
<dt><tt>R</tt> : T</dt>
<dd>The recurrence weight tensor. Concatenation of `R[zrh]` and `RB[zrh]` (if bidirectional) along dimension 0. This tensor has shape `[num_directions, 3*hidden_size, hidden_size]`.</dd>
<dt><tt>B</tt> (optional) : T</dt>
<dd>The bias tensor for the gates. Concatenation of `[Wb[zrh], Rb[zrh]]` and `[WBb[zrh], RBb[zrh]]` (if bidirectional) along dimension 0. This tensor has shape `[num_directions, 6*hidden_size]`. Optional: If not specified - assumed to be 0</dd>
<dt><tt>sequence_lens</tt> (optional) : T1</dt>
<dd>Optional tensor specifying lengths of the sequences in a batch. If not specified, all sequences in the batch are assumed to have length `seq_length`. It has shape `[batch_size]`.</dd>
<dt><tt>initial_h</tt> (optional) : T</dt>
<dd>Optional initial value of the hidden state. If not specified, it is assumed to be 0. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
</dl>

#### Outputs (0 - 2)

<dl>
<dt><tt>Y</tt> (optional) : T</dt>
<dd>A tensor that concatenates all the intermediate output values of the hidden state. It has shape `[seq_length, num_directions, batch_size, hidden_size]`. </dd>
<dt><tt>Y_h</tt> (optional) : T</dt>
<dd>The last output value of the hidden state. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
<dt><tt>T1</tt> : tensor(int32)</dt>
<dd>Constrain sequence_lens to integer tensor.</dd>
</dl>

### <a name="Gemm-7"></a>**Gemm-7**</a>

  General Matrix multiplication:
  https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms#Level_3
  
  A' = transpose(A) if transA else A
  
  B' = transpose(B) if transB else B
  
  Compute Y = alpha * A' * B' + beta * C, where input tensor A has shape (M, K) or (K, M),
  input tensor B has shape (K, N) or (N, K), input tensor C is broadcastable to shape (M, N),
  and output tensor Y has shape (M, N). A is transposed before the
  computation if attribute transA is non-zero; the same applies to B and transB.
  This operator supports **unidirectional broadcasting** (tensor C should be unidirectional broadcastable to tensor A * B); for more details please check [the doc](Broadcasting.md).
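
The computation above can be sketched in plain Python as follows. This is only an illustration, not a reference implementation: the function names are hypothetical, the attribute defaults used here (`alpha=1.0`, `beta=1.0`, no transposition) are assumptions, and C is restricted to a scalar, a length-N row, or a 2-D matrix whose dimensions are either 1 or match (M, N).

```python
def transpose(m):
    """Return the transpose of a row-major 2-D list."""
    return [list(col) for col in zip(*m)]


def gemm(a, b, c, alpha=1.0, beta=1.0, trans_a=0, trans_b=0):
    """Compute Y = alpha * A' * B' + beta * C on nested lists.

    Defaults are assumptions made for this sketch.
    """
    a2 = transpose(a) if trans_a else a   # A' has shape (M, K)
    b2 = transpose(b) if trans_b else b   # B' has shape (K, N)
    m, k, n = len(a2), len(b2), len(b2[0])
    assert len(a2[0]) == k, "inner dimensions of A' and B' must match"

    def c_at(i, j):
        """Read C as if it had been broadcast to shape (M, N)."""
        if not isinstance(c, list):        # scalar
            return c
        if not isinstance(c[0], list):     # shape (N,)
            return c[j]
        row = c[i if len(c) > 1 else 0]    # shape (M, N), (1, N) or (M, 1)
        return row[j if len(row) > 1 else 0]

    return [
        [alpha * sum(a2[i][p] * b2[p][j] for p in range(k)) + beta * c_at(i, j)
         for j in range(n)]
        for i in range(m)
    ]


A = [[1, 2], [3, 4], [5, 6]]   # shape (K, M) = (3, 2), so transA is set
B = [[1, 0], [0, 1], [1, 1]]   # shape (K, N) = (3, 2)
Y = gemm(A, B, [10, 20], alpha=0.5, trans_a=1)   # Y has shape (M, N) = (2, 2)
```

Here C has shape (N,) and is reused for every row of the product, which is the unidirectional broadcast case.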

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>alpha</tt> : float</dt>
<dd>Scalar multiplier for the product of input tensors A * B</dd>
<dt><tt>beta</tt> : float</dt>
<dd>Scalar multiplier for input tensor C</dd>
<dt><tt>transA</tt> : int</dt>
<dd>Whether A should be transposed</dd>
<dt><tt>transB</tt> : int</dt>
<dd>Whether B should be transposed</dd>
</dl>

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>Input tensor A. The shape of A should be (M, K) if transA is 0, or (K, M) if transA is non-zero.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Input tensor B. The shape of B should be (K, N) if transB is 0, or (N, K) if transB is non-zero.</dd>
<dt><tt>C</tt> : T</dt>
<dd>Input tensor C. The shape of C should be unidirectional broadcastable to (M, N).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor of shape (M, N).</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Greater-7"></a>**Greater-7**</a>

  Returns the tensor resulting from performing the `greater` logical operation
  elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).
  
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).
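
As an illustration of the shape rule behind multidirectional broadcasting, the sketch below computes the shape of the result from the shapes of `A` and `B`, and then applies `greater` for the simple case of a 2-D `A` and a 1-D `B`. The helper names are hypothetical and the code is a minimal sketch, not the operator's implementation.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Result shape under NumPy-style broadcasting, or ValueError if incompatible."""
    result = []
    # Align shapes from the trailing dimension; missing leading dims act as 1.
    for da, db in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if da != db and da != 1 and db != 1:
            raise ValueError(f"shapes {shape_a} and {shape_b} are not broadcastable")
        result.append(db if da == 1 else da)
    return tuple(reversed(result))


def greater_rows(a, b):
    """Greater for A of shape (M, N) and B of shape (N,): B is reused for every row."""
    return [[x > y for x, y in zip(row, b)] for row in a]


shape = broadcast_shape((2, 3, 4), (3, 1))               # (2, 3, 4)
mask = greater_rows([[1.0, 5.0], [3.0, 2.0]], [2.0, 2.0])  # boolean tensor of shape (2, 2)
```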

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>First input operand for the logical operator.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Second input operand for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrains input to float tensors.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrains output to boolean tensor.</dd>
</dl>

### <a name="LSTM-7"></a>**LSTM-7**</a>

  Computes a one-layer LSTM. This operator is usually supported via some
  custom implementation such as CuDNN.
  
  Notations:
  
  `X` - input tensor
  
  `i` - input gate
  
  `o` - output gate
  
  `f` - forget gate
  
  `c` - cell gate
  
  `t` - time step (t-1 means previous time step)
  
  `W[iofc]` - W parameter weight matrix for input, output, forget, and cell gates
  
  `R[iofc]` - R recurrence weight matrix for input, output, forget, and cell gates
  
  `Wb[iofc]` - W bias vectors for input, output, forget, and cell gates
  
  `Rb[iofc]` - R bias vectors for input, output, forget, and cell gates
  
  `P[iof]`  - P peephole weight vector for input, output, and forget gates
  
  `WB[iofc]` - W parameter weight matrix for backward input, output, forget, and cell gates
  
  `RB[iofc]` - R recurrence weight matrix for backward input, output, forget, and cell gates
  
  `WBb[iofc]` - W bias vectors for backward input, output, forget, and cell gates
  
  `RBb[iofc]` - R bias vectors for backward input, output, forget, and cell gates
  
  `PB[iof]`  - P peephole weight vector for backward input, output, and forget gates
  
  `H` - Hidden state
  
  `num_directions` - 2 if direction == bidirectional else 1
  
  Activation functions:
  
    Relu(x)                - max(0, x)
  
    Tanh(x)                - (1 - e^{-2x})/(1 + e^{-2x})
  
    Sigmoid(x)             - 1/(1 + e^{-x})
  
    (NOTE: Below are optional)
  
    Affine(x)              - alpha*x + beta
  
    LeakyRelu(x)           - x if x >= 0 else alpha * x
  
    ThresholdedRelu(x)     - x if x >= alpha else 0
  
    ScaledTanh(x)          - alpha*Tanh(beta*x)
  
    HardSigmoid(x)         - min(max(alpha*x + beta, 0), 1)
  
    Elu(x)                 - x if x >= 0 else alpha*(e^x - 1)
  
    Softsign(x)            - x/(1 + |x|)
  
    Softplus(x)            - log(1 + e^x)
  
  Equations (Default: f=Sigmoid, g=Tanh, h=Tanh):
  
    - it = f(Xt*(Wi^T) + Ht-1*(Ri^T) + Pi (.) Ct-1 + Wbi + Rbi)
  
    - ft = f(Xt*(Wf^T) + Ht-1*(Rf^T) + Pf (.) Ct-1 + Wbf + Rbf)
  
    - ct = g(Xt*(Wc^T) + Ht-1*(Rc^T) + Wbc + Rbc)
  
    - Ct = ft (.) Ct-1 + it (.) ct
  
    - ot = f(Xt*(Wo^T) + Ht-1*(Ro^T) + Po (.) Ct + Wbo + Rbo)
  
    - Ht = ot (.) h(Ct)
  
  This operator has **optional** inputs/outputs. See [the doc](IR.md) for more details about the representation of optional arguments. An empty string may be used in the place of an actual argument's name to indicate a missing argument. Trailing optional arguments (those not followed by an argument that is present) may also be simply omitted.
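
The default equations of this operator (f=Sigmoid, g=Tanh, h=Tanh) can be sketched for a single time step, a single batch element, and the forward direction as shown below. This is a minimal sketch under stated assumptions: the function names and the per-gate dictionaries are hypothetical conveniences, and the `clip` and `input_forget` attributes are not modeled.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    """Multiply a [rows][cols] matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def lstm_step(x_t, h_prev, c_prev, W, R, Wb, Rb, P=None):
    """One LSTM step.

    W, R, Wb, Rb are dicts keyed by gate ('i', 'o', 'f', 'c');
    P is a dict keyed by 'i', 'o', 'f' and defaults to zero peepholes.
    """
    n = len(h_prev)
    if P is None:
        P = {g: [0.0] * n for g in "iof"}

    def pre(g):
        # Xt*(Wg^T) + Ht-1*(Rg^T) + Wbg + Rbg
        wx, rh = matvec(W[g], x_t), matvec(R[g], h_prev)
        return [wx[j] + rh[j] + Wb[g][j] + Rb[g][j] for j in range(n)]

    pi, pf, pc, po = pre("i"), pre("f"), pre("c"), pre("o")
    i_t = [sigmoid(pi[j] + P["i"][j] * c_prev[j]) for j in range(n)]
    f_t = [sigmoid(pf[j] + P["f"][j] * c_prev[j]) for j in range(n)]
    c_t = [math.tanh(v) for v in pc]                       # candidate cell value
    C_t = [f_t[j] * c_prev[j] + i_t[j] * c_t[j] for j in range(n)]
    o_t = [sigmoid(po[j] + P["o"][j] * C_t[j]) for j in range(n)]  # peephole uses Ct
    H_t = [o_t[j] * math.tanh(C_t[j]) for j in range(n)]
    return H_t, C_t
```

Note that the output gate peephole reads the updated cell state `Ct`, while the input and forget gates read `Ct-1`, exactly as in the equations.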

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>activation_alpha</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM. Default values are the same as those of the corresponding ONNX operators. For example, with LeakyRelu, the default alpha is 0.01.</dd>
<dt><tt>activation_beta</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM. Default values are the same as those of the corresponding ONNX operators.</dd>
<dt><tt>activations</tt> : list of strings</dt>
<dd>A list of 3 (or 6 if bidirectional) activation functions for input, output, forget, cell, and hidden. The activation functions must be one of the activation functions specified above. Optional: See the equations for default if not specified.</dd>
<dt><tt>clip</tt> : float</dt>
<dd>Cell clip threshold. Clipping bounds the elements of a tensor in the range of [-threshold, +threshold] and is applied to the input of activations. No clip if not specified.</dd>
<dt><tt>direction</tt> : string</dt>
<dd>Specify if the RNN is forward, reverse, or bidirectional. Must be one of forward (default), reverse, or bidirectional.</dd>
<dt><tt>hidden_size</tt> : int</dt>
<dd>Number of neurons in the hidden layer</dd>
<dt><tt>input_forget</tt> : int</dt>
<dd>Couple the input and forget gates if 1, default 0.</dd>
</dl>

#### Inputs (3 - 8)

<dl>
<dt><tt>X</tt> : T</dt>
<dd>The input sequences packed (and potentially padded) into one 3-D tensor with the shape of `[seq_length, batch_size, input_size]`.</dd>
<dt><tt>W</tt> : T</dt>
<dd>The weight tensor for the gates. Concatenation of `W[iofc]` and `WB[iofc]` (if bidirectional) along dimension 0. The tensor has shape `[num_directions, 4*hidden_size, input_size]`.</dd>
<dt><tt>R</tt> : T</dt>
<dd>The recurrence weight tensor. Concatenation of `R[iofc]` and `RB[iofc]` (if bidirectional) along dimension 0. This tensor has shape `[num_directions, 4*hidden_size, hidden_size]`.</dd>
<dt><tt>B</tt> (optional) : T</dt>
<dd>The bias tensor for the gates. Concatenation of `[Wb[iofc], Rb[iofc]]` and `[WBb[iofc], RBb[iofc]]` (if bidirectional) along dimension 0. This tensor has shape `[num_directions, 8*hidden_size]`. Optional: if not specified, it is assumed to be 0.</dd>
<dt><tt>sequence_lens</tt> (optional) : T1</dt>
<dd>Optional tensor specifying lengths of the sequences in a batch. If not specified, all sequences in the batch are assumed to have length `seq_length`. It has shape `[batch_size]`.</dd>
<dt><tt>initial_h</tt> (optional) : T</dt>
<dd>Optional initial value of the hidden state. If not specified, it is assumed to be 0. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
<dt><tt>initial_c</tt> (optional) : T</dt>
<dd>Optional initial value of the cell state. If not specified, it is assumed to be 0. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
<dt><tt>P</tt> (optional) : T</dt>
<dd>The weight tensor for peepholes. Concatenation of `P[iof]` and `PB[iof]` (if bidirectional) along dimension 0. It has shape `[num_directions, 3*hidden_size]`. Optional: If not specified - assumed to be 0.</dd>
</dl>
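
To make the packed layouts above more concrete, the following sketch splits one direction of `W` (or `R`) and `B` into per-gate blocks. It assumes the gate blocks are stacked in the order written in the descriptions (`i`, `o`, `f`, `c`), with the `W` biases preceding the `R` biases in `B`; the helper names are hypothetical.

```python
def split_gates(w_dir, hidden_size, gates="iofc"):
    """Split the leading axis of size len(gates)*hidden_size into per-gate blocks."""
    assert len(w_dir) == len(gates) * hidden_size
    return {
        g: w_dir[k * hidden_size:(k + 1) * hidden_size]
        for k, g in enumerate(gates)
    }


def split_bias(b_dir, hidden_size, gates="iofc"):
    """Split one direction of B into (Wb, Rb), each a dict of per-gate vectors."""
    half = len(gates) * hidden_size
    assert len(b_dir) == 2 * half
    return (
        split_gates(b_dir[:half], hidden_size, gates),
        split_gates(b_dir[half:], hidden_size, gates),
    )


# One direction of W with hidden_size = 2 and input_size = 1: 8 rows in total.
w_forward = [[float(r)] for r in range(8)]
blocks = split_gates(w_forward, hidden_size=2)   # blocks["f"] holds rows 4 and 5
```

The same helper applies to the peephole tensor `P` with `gates="iof"`.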

#### Outputs (0 - 3)

<dl>
<dt><tt>Y</tt> (optional) : T</dt>
<dd>A tensor that concatenates all the intermediate output values of the hidden state. It has shape `[seq_length, num_directions, batch_size, hidden_size]`.</dd>
<dt><tt>Y_h</tt> (optional) : T</dt>
<dd>The last output value of the hidden state. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
<dt><tt>Y_c</tt> (optional) : T</dt>
<dd>The last output value of the cell state. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
<dt><tt>T1</tt> : tensor(int32)</dt>
<dd>Constrain sequence_lens to integer tensor.</dd>
</dl>

### <a name="Less-7"></a>**Less-7**</a>

  Returns the tensor resulting from performing the `less` logical operation
  elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).
  
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>First input operand for the logical operator.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Second input operand for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrains input to float tensors.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrains output to boolean tensor.</dd>
</dl>

### <a name="Mul-7"></a>**Mul-7**</a>

  Performs element-wise binary multiplication (with Numpy-style broadcasting support).
  
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).
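
A general element-wise application with multidirectional broadcasting can be sketched on regular nested Python lists as below. The helpers `depth` and `broadcast_apply` are hypothetical names introduced for illustration, and the sketch assumes non-empty, regularly shaped inputs.

```python
import operator


def depth(t):
    """Number of nested list levels, i.e. the rank of a regular nested list."""
    d = 0
    while isinstance(t, list):
        d += 1
        t = t[0]
    return d


def broadcast_apply(op, a, b):
    """Apply a binary op element-wise with NumPy-style broadcasting."""
    da, db = depth(a), depth(b)
    if da == 0 and db == 0:
        return op(a, b)
    if da < db:                      # a gains a leading dimension of size 1
        return [broadcast_apply(op, a, bi) for bi in b]
    if db < da:                      # b gains a leading dimension of size 1
        return [broadcast_apply(op, ai, b) for ai in a]
    if len(a) == len(b):
        return [broadcast_apply(op, ai, bi) for ai, bi in zip(a, b)]
    if len(a) == 1:                  # stretch a along this dimension
        return [broadcast_apply(op, a[0], bi) for bi in b]
    if len(b) == 1:                  # stretch b along this dimension
        return [broadcast_apply(op, ai, b[0]) for ai in a]
    raise ValueError("operands are not broadcastable")


# Shapes (2, 1) and (1, 2) broadcast to (2, 2).
C = broadcast_apply(operator.mul, [[1], [2]], [[10, 20]])
```

Because both operands may be stretched, neither input needs to have the shape of the result.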

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>First operand.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Second operand.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T</dt>
<dd>Result; has the same element type as the two inputs.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to high-precision numeric tensors.</dd>
</dl>

### <a name="Multinomial-7"></a>**Multinomial-7**</a>

  Generate a tensor of samples from a multinomial distribution according to the probabilities
  of each of the possible outcomes.
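
The sampling described above can be sketched with the Python standard library as follows. This is an illustrative sketch only: it treats each input row as unnormalized log-probabilities, converts them to weights with a numerically stable exponential, and draws with `random.choices`. The random generator and its seeding are assumptions of this sketch and will not reproduce the samples of any particular runtime.

```python
import math
import random


def multinomial(logits, sample_size, seed=None):
    """Draw sample_size class indices per row of a [batch_size, class_size] input."""
    rng = random.Random(seed)
    output = []
    for row in logits:
        m = max(row)
        # exp(v - max) is proportional to the class probability and cannot overflow.
        weights = [math.exp(v - m) for v in row]
        output.append(rng.choices(range(len(row)), weights=weights, k=sample_size))
    return output   # shape [batch_size, sample_size]


samples = multinomial([[0.0, 1.0, 2.0], [2.0, 1.0, 0.0]], sample_size=5, seed=7)
```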

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>dtype</tt> : int</dt>
<dd>(Optional) The data type for the elements of the output tensor. If not specified, int32 is used.</dd>
<dt><tt>sample_size</tt> : int</dt>
<dd>Number of times to sample.</dd>
<dt><tt>seed</tt> : float</dt>
<dd>(Optional) Seed for the random generator. If not specified, one is generated automatically.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T1</dt>
<dd>Input tensor with shape [batch_size, class_size], where class_size is the number of all possible outcomes. Each value along the axis zero represents the unnormalized log-probability of each corresponding outcome in a batch.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T2</dt>
<dd>Output tensor with shape [batch_size, sample_size], where sample_size is the number of times to sample. Each value along the axis zero represents the outcome of the corresponding sample in a batch.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input types to float tensors.</dd>
<dt><tt>T2</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain output types to integral tensors.</dd>
</dl>

### <a name="Or-7"></a>**Or-7**</a>

  Returns the tensor resulting from performing the `or` logical operation
  elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).
  
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>First input operand for the logical operator.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Second input operand for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(bool)</dt>
<dd>Constrains input to boolean tensor.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrains output to boolean tensor.</dd>
</dl>

### <a name="PRelu-7"></a>**PRelu-7**</a>

  PRelu takes input data (Tensor<T>) and slope tensor as input, and produces one
  output data (Tensor<T>) where the function `f(x) = slope * x for x < 0`,
  `f(x) = x for x >= 0`, is applied to the data tensor elementwise.
  This operator supports **unidirectional broadcasting** (tensor slope should be unidirectional broadcastable to input tensor X); for more details please check [the doc](Broadcasting.md).
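
For example, with an input `X` of shape (rows, channels) and a slope of shape (channels,), the slope vector is reused for every row. The function name `prelu_2d` is hypothetical and the code is only a sketch of this one broadcasting case.

```python
def prelu_2d(x, slope):
    """PRelu for X of shape (rows, channels) and slope of shape (channels,)."""
    return [
        [xi if xi >= 0 else si * xi for xi, si in zip(row, slope)]
        for row in x
    ]


Y = prelu_2d([[-1.0, 2.0], [-3.0, -4.0]], [0.5, 0.25])
# Negative entries are scaled by their column's slope; Y keeps the shape of X.
```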

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input tensor</dd>
<dt><tt>slope</tt> : T</dt>
<dd>Slope tensor. The shape of slope can be smaller than that of the first input X; if so, its shape must be unidirectional broadcastable to X.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output tensor (same size as X)</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Pow-7"></a>**Pow-7**</a>

  Pow takes input data (Tensor<T>) and an exponent tensor, and
  produces one output data (Tensor<T>) where the function `f(x) = x^exponent`
  is applied to the data tensor elementwise.
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>First operand, base of the exponent.</dd>
<dt><tt>Y</tt> : T</dt>
<dd>Second operand, power of the exponent.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Z</tt> : T</dt>
<dd>Output tensor (same size as X)</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="RNN-7"></a>**RNN-7**</a>

  Computes a one-layer simple RNN. This operator is usually supported
  via some custom implementation such as CuDNN.
  
  Notations:
  
  `X` - input tensor
  
  `i` - input gate
  
  `t` - time step (t-1 means previous time step)
  
  `Wi` - W parameter weight matrix for input gate
  
  `Ri` - R recurrence weight matrix for input gate
  
  `Wbi` - W parameter bias vector for input gate
  
  `Rbi` - R parameter bias vector for input gate
  
  `WBi` - W parameter weight matrix for backward input gate
  
  `RBi` - R recurrence weight matrix for backward input gate
  
  `WBbi` - W parameter bias vector for backward input gate
  
  `RBbi` - R parameter bias vector for backward input gate
  
  `H` - Hidden state
  
  `num_directions` - 2 if direction == bidirectional else 1
  
  Activation functions:
  
    Relu(x)                - max(0, x)
  
    Tanh(x)                - (1 - e^{-2x})/(1 + e^{-2x})
  
    Sigmoid(x)             - 1/(1 + e^{-x})
  
    (NOTE: Below are optional)
  
    Affine(x)              - alpha*x + beta
  
    LeakyRelu(x)           - x if x >= 0 else alpha * x
  
    ThresholdedRelu(x)     - x if x >= alpha else 0
  
    ScaledTanh(x)          - alpha*Tanh(beta*x)
  
    HardSigmoid(x)         - min(max(alpha*x + beta, 0), 1)
  
    Elu(x)                 - x if x >= 0 else alpha*(e^x - 1)
  
    Softsign(x)            - x/(1 + |x|)
  
    Softplus(x)            - log(1 + e^x)
  
  Equations (Default: f=Tanh):
  
    - Ht = f(Xt*(Wi^T) + Ht-1*(Ri^T) + Wbi + Rbi)
  
  This operator has **optional** inputs/outputs. See [the doc](IR.md) for more details about the representation of optional arguments. An empty string may be used in the place of an actual argument's name to indicate a missing argument. Trailing optional arguments (those not followed by an argument that is present) may also be simply omitted.
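
The default recurrence (f=Tanh) can be sketched over a whole sequence for one batch element in the forward direction as shown below. The function name and argument layout are hypothetical, and the `clip` attribute, `sequence_lens`, and the reverse direction are not modeled.

```python
import math


def rnn_forward(X, W, R, Wb, Rb, h0=None):
    """Simple RNN over X of shape [seq_length][input_size].

    W is [hidden_size][input_size], R is [hidden_size][hidden_size],
    Wb and Rb are [hidden_size]. Returns (Y, Y_h).
    """
    hidden_size = len(W)
    h = list(h0) if h0 is not None else [0.0] * hidden_size
    Y = []
    for x_t in X:
        # Ht = tanh(Xt*(Wi^T) + Ht-1*(Ri^T) + Wbi + Rbi)
        h = [
            math.tanh(
                sum(w * x for w, x in zip(W[j], x_t))
                + sum(r * hp for r, hp in zip(R[j], h))
                + Wb[j]
                + Rb[j]
            )
            for j in range(hidden_size)
        ]
        Y.append(h)
    return Y, h   # all hidden states, and the last one


Y, Y_h = rnn_forward([[1.0], [0.0]], W=[[1.0]], R=[[1.0]], Wb=[0.0], Rb=[0.0])
```

In this usage, `Y` collects the hidden state at every time step and `Y_h` is its last entry.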

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>activation_alpha</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM. Default values are the same as those of the corresponding ONNX operators. For example, with LeakyRelu, the default alpha is 0.01.</dd>
<dt><tt>activation_beta</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM. Default values are the same as those of the corresponding ONNX operators.</dd>
<dt><tt>activations</tt> : list of strings</dt>
<dd>One (or two if bidirectional) activation function for input gate. The activation function must be one of the activation functions specified above. Optional: Default `Tanh` if not specified.</dd>
<dt><tt>clip</tt> : float</dt>
<dd>Cell clip threshold. Clipping bounds the elements of a tensor in the range of [-threshold, +threshold] and is applied to the input of activations. No clip if not specified.</dd>
<dt><tt>direction</tt> : string</dt>
<dd>Specify if the RNN is forward, reverse, or bidirectional. Must be one of forward (default), reverse, or bidirectional.</dd>
<dt><tt>hidden_size</tt> : int</dt>
<dd>Number of neurons in the hidden layer</dd>
</dl>

#### Inputs (3 - 6)

<dl>
<dt><tt>X</tt> : T</dt>
<dd>The input sequences packed (and potentially padded) into one 3-D tensor with the shape of `[seq_length, batch_size, input_size]`.</dd>
<dt><tt>W</tt> : T</dt>
<dd>The weight tensor for input gate. Concatenation of `Wi` and `WBi` (if bidirectional). The tensor has shape `[num_directions, hidden_size, input_size]`.</dd>
<dt><tt>R</tt> : T</dt>
<dd>The recurrence weight tensor. Concatenation of `Ri` and `RBi` (if bidirectional). The tensor has shape `[num_directions, hidden_size, hidden_size]`.</dd>
<dt><tt>B</tt> (optional) : T</dt>
<dd>The bias tensor for input gate. Concatenation of `[Wbi, Rbi]` and `[WBbi, RBbi]` (if bidirectional). The tensor has shape `[num_directions, 2*hidden_size]`. Optional: If not specified - assumed to be 0.</dd>
<dt><tt>sequence_lens</tt> (optional) : T1</dt>
<dd>Optional tensor specifying lengths of the sequences in a batch. If not specified, all sequences in the batch are assumed to have length `seq_length`. It has shape `[batch_size]`.</dd>
<dt><tt>initial_h</tt> (optional) : T</dt>
<dd>Optional initial value of the hidden state. If not specified, it is assumed to be 0. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
</dl>
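
The convention for missing optional inputs (an empty string in place of the name, with trailing missing arguments dropped) can be sketched as follows. The helper `node_input_names` and the value names in the usage are hypothetical; only the input order is taken from the list above.

```python
RNN_INPUT_ORDER = ["X", "W", "R", "B", "sequence_lens", "initial_h"]


def node_input_names(provided, order=RNN_INPUT_ORDER):
    """Build a node's input name list from a {formal name: value name} mapping.

    Missing inputs become "", and trailing missing inputs are omitted.
    """
    names = [provided.get(formal, "") for formal in order]
    while names and names[-1] == "":
        names.pop()
    return names


# B and sequence_lens are skipped but initial_h is present, so placeholders are needed.
with_h0 = node_input_names({"X": "x", "W": "w", "R": "r", "initial_h": "h0"})
# Only trailing optional inputs are missing, so they are simply dropped.
minimal = node_input_names({"X": "x", "W": "w", "R": "r"})
```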

#### Outputs (0 - 2)

<dl>
<dt><tt>Y</tt> (optional) : T</dt>
<dd>A tensor that concatenates all the intermediate output values of the hidden state. It has shape `[seq_length, num_directions, batch_size, hidden_size]`.</dd>
<dt><tt>Y_h</tt> (optional) : T</dt>
<dd>The last output value of the hidden state. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
<dt><tt>T1</tt> : tensor(int32)</dt>
<dd>Constrain sequence_lens to integer tensor.</dd>
</dl>

### <a name="Sin-7"></a>**Sin-7**</a>

  Calculates the sine of the given input tensor, element-wise.

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The sine of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Sub-7"></a>**Sub-7**</a>

  Performs element-wise binary subtraction (with Numpy-style broadcasting support).
  
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>First operand.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Second operand.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T</dt>
<dd>Result; has the same element type as the two inputs.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to high-precision numeric tensors.</dd>
</dl>

### <a name="Tan-7"></a>**Tan-7**</a>

  Calculates the tangent of the given input tensor, element-wise.

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>The tangent of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Upsample-7"></a>**Upsample-7**</a>

  Upsample the input tensor.
  Each dimension value of the output tensor is:
    output_dimension = floor(input_dimension * scale).
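
The output shape rule can be sketched as below, together with a nearest-mode upsample of a 2-D tensor. The shape formula comes from the description above; the index mapping used for nearest mode (source index = floor(output index / scale)) is an assumption of this sketch, and the function names are hypothetical.

```python
import math


def output_dims(input_dims, scales):
    """output_dimension = floor(input_dimension * scale), one scale per dimension."""
    assert len(input_dims) == len(scales)
    return tuple(math.floor(d * s) for d, s in zip(input_dims, scales))


def upsample_nearest_2d(x, scales):
    """Nearest-mode upsample of a 2-D nested list (index mapping is assumed)."""
    in_h, in_w = len(x), len(x[0])
    out_h, out_w = output_dims((in_h, in_w), scales)
    return [
        [
            x[min(int(i / scales[0]), in_h - 1)][min(int(j / scales[1]), in_w - 1)]
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


dims = output_dims((1, 3, 5, 5), (1.0, 1.0, 2.0, 1.5))   # (1, 3, 10, 7)
Y = upsample_nearest_2d([[1, 2], [3, 4]], (2.0, 3.0))    # shape (4, 6)
```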

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>mode</tt> : string</dt>
<dd>Two interpolation modes: nearest (default) and linear (including bilinear, trilinear, etc.).</dd>
<dt><tt>scales</tt> : list of floats (required)</dt>
<dd>The scale array along each dimension. Each value must be greater than or equal to 1. The number of elements of 'scales' should be the same as the rank of input 'X'.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : T</dt>
<dd>N-D tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>N-D tensor after resizing</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool)</dt>
<dd>Constrain input/output types to all tensor types.</dd>
</dl>

### <a name="Xor-7"></a>**Xor-7**</a>

  Returns the tensor resulting from performing the `xor` logical operation
  elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).
  
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>A</tt> : T</dt>
<dd>First input operand for the logical operator.</dd>
<dt><tt>B</tt> : T</dt>
<dd>Second input operand for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(bool)</dt>
<dd>Constrains input to boolean tensor.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrains output to boolean tensor.</dd>
</dl>

## Version 8 of the default ONNX operator set
### <a name="Max-8"></a>**Max-8**</a>

  Element-wise max of each of the input tensors (with Numpy-style broadcasting support).
  All inputs and outputs must have the same data type.
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 8 of the default ONNX operator set.

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>data_0</tt> (variadic) : T</dt>
<dd>List of tensors for max.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>max</tt> : T</dt>
<dd>Output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Mean-8"></a>**Mean-8**</a>

  Element-wise mean of each of the input tensors (with Numpy-style broadcasting support).
  All inputs and outputs must have the same data type.
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).
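
A variadic element-wise mean can be sketched for 1-D inputs as follows, where an input of length 1 is broadcast against the longer inputs. The function name is hypothetical and higher-rank broadcasting is left out to keep the sketch short.

```python
def mean_variadic(*tensors):
    """Element-wise mean of any number of 1-D lists; length-1 inputs are broadcast."""
    assert tensors, "at least one input is required"
    n = max(len(t) for t in tensors)
    for t in tensors:
        assert len(t) in (1, n), "inputs are not broadcastable"
    return [
        sum(t[i if len(t) > 1 else 0] for t in tensors) / len(tensors)
        for i in range(n)
    ]


mean = mean_variadic([1.0, 2.0, 3.0], [3.0], [2.0, 4.0, 6.0])
```

The divisor is the number of input tensors, not the number of elements in any one of them.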

#### Version

This version of the operator has been available since version 8 of the default ONNX operator set.

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>data_0</tt> (variadic) : T</dt>
<dd>List of tensors for mean.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>mean</tt> : T</dt>
<dd>Output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Min-8"></a>**Min-8**</a>

  Element-wise min of each of the input tensors (with Numpy-style broadcasting support).
  All inputs and outputs must have the same data type.
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 8 of the default ONNX operator set.

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>data_0</tt> (variadic) : T</dt>
<dd>List of tensors for min.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>min</tt> : T</dt>
<dd>Output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

### <a name="Sum-8"></a>**Sum-8**</a>

  Element-wise sum of each of the input tensors (with Numpy-style broadcasting support).
  All inputs and outputs must have the same data type.
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 8 of the default ONNX operator set.

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>data_0</tt> (variadic) : T</dt>
<dd>List of tensors for sum.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>sum</tt> : T</dt>
<dd>Output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>

