Genred

Summary

This section contains the full API documentation for PyTorch generic reductions, with full support for PyTorch’s torch.autograd engine.

Genred

Creates a new generic operation.

Genred.__init__

Instantiate a new generic operation.

Genred.__call__

Apply the routine to arbitrary torch Tensors.

Syntax

class pykeops.torch.Genred

Creates a new generic operation.

This is KeOps’ main function, whose usage is documented in the user-guide, the gallery of examples and the high-level tutorials. Taking as input a handful of strings and integers that specify a custom Map-Reduce operation, it returns a C++ wrapper that can be called just like any other PyTorch function.

Note

Genred() is fully compatible with PyTorch’s autograd engine: you can backprop through the KeOps __call__() just as if it were a vanilla PyTorch operation (except for the Min and Max reduction types; see the reductions documentation).

Example

>>> my_conv = Genred('Exp(-SqNorm2(x - y))',  # formula
...                  ['x = Vi(3)',            # 1st input: dim-3 vector per line
...                   'y = Vj(3)'],           # 2nd input: dim-3 vector per column
...                  reduction_op='Sum',      # we also support LogSumExp, Min, etc.
...                  axis=1)                  # reduce along the lines of the kernel matrix
>>> # Apply it to 2d arrays x and y with 3 columns and a (huge) number of lines
>>> x = torch.randn(1000000, 3, requires_grad=True).cuda()
>>> y = torch.randn(2000000, 3).cuda()
>>> a = my_conv(x, y)  # a_i = sum_j exp(-|x_i-y_j|^2)
>>> print(a.shape)
torch.Size([1000000, 1])
>>> [g_x] = torch.autograd.grad((a ** 2).sum(), [x])  # KeOps supports autograd!
>>> print(g_x.shape)
torch.Size([1000000, 3])
__init__(formula, aliases, reduction_op='Sum', axis=0, dtype='float32', opt_arg=None, formula2=None, cuda_type=None)

Instantiate a new generic operation.

Note

Genred relies on C++ or CUDA kernels that are compiled on-the-fly, and stored in a cache directory as shared libraries (“.so” files) for later use.

Parameters
  • formula (string) – The scalar- or vector-valued expression that should be computed and reduced. The correct syntax is described in the documentation of the available mathematical operations.

  • aliases (list of strings) –

    A list of identifiers of the form "AL = TYPE(DIM)" that specify the categories and dimensions of the input variables. Here:

    • AL is an alphanumerical alias, used in the formula.

    • TYPE is a category. One of:

      • Vi: indexation by \(i\) along axis 0.

      • Vj: indexation by \(j\) along axis 1.

      • Pm: no indexation, the input tensor is a vector and not a 2d array.

    • DIM is an integer, the dimension of the current variable.

    As described below, __call__() will expect as input Tensors whose shape are compatible with aliases.
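To make the three categories concrete, here is a minimal NumPy sketch (illustrative only, not the KeOps implementation) of the dense computation behind a Genred call with aliases ["p = Pm(1)", "x = Vi(3)", "y = Vj(3)"]; note that KeOps performs this reduction without ever materializing the (M, N) kernel matrix that the sketch builds explicitly:

```python
import numpy as np

# Dense equivalent of Genred('p * Exp(-SqNorm2(x - y))',
#                            ['p = Pm(1)', 'x = Vi(3)', 'y = Vj(3)'],
#                            reduction_op='Sum', axis=1)
M, N = 5, 7
x = np.random.randn(M, 3)   # Vi(3): one dim-3 vector per line, indexed by i
y = np.random.randn(N, 3)   # Vj(3): one dim-3 vector per line, indexed by j
p = np.array([0.5])         # Pm(1): a plain 1d parameter vector, no indexation

# (M, N) kernel matrix, then reduction along axis 1 (the "j" axis):
K = np.exp(-((x[:, None, :] - y[None, :, :]) ** 2).sum(-1))
a = (p * K).sum(axis=1, keepdims=True)  # (M, 1) output, one value per index i
```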

Keyword Arguments
  • reduction_op (string, default = "Sum") – Specifies the reduction operation that is applied to reduce the values of formula(x_i, y_j, ...) along axis 0 or axis 1. The supported values are listed in the Reductions documentation.

  • axis (int, default = 0) –

    Specifies the dimension of the “kernel matrix” that is reduced by our routine. The supported values are:

    • axis = 0: reduction with respect to \(i\), outputs a Vj or “\(j\)” variable.

    • axis = 1: reduction with respect to \(j\), outputs a Vi or “\(i\)” variable.

  • dtype (string, default = "float32") –

    Specifies the numerical dtype of the input and output arrays. The supported values are:

    • dtype = "float32" or "float".

    • dtype = "float64" or "double".

  • opt_arg (int, default = None) – If reduction_op is in ["KMin", "ArgKMin", "KMin_ArgKMin"], this argument allows you to specify the number K of neighbors to consider.
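For instance, with reduction_op = "ArgKMin", axis = 1 and opt_arg = K, the routine returns for every point x_i the indices of its K nearest neighbors among the y_j. A hedged dense NumPy equivalent, for illustration only:

```python
import numpy as np

# Dense sketch of Genred('SqDist(x, y)', ['x = Vi(2)', 'y = Vj(2)'],
#                        reduction_op='ArgKMin', axis=1, opt_arg=K):
M, N, K = 5, 20, 3
x = np.random.randn(M, 2)
y = np.random.randn(N, 2)

D2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)  # (M, N) squared distances
knn = np.argsort(D2, axis=1)[:, :K]                  # (M, K) neighbor indices
```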

__call__(*args, backend='auto', device_id=-1, ranges=None)

Apply the routine to arbitrary torch Tensors.

Warning

Even for variables of size 1 (e.g. \(a_i\in\mathbb{R}\) for \(i\in[0,M)\)), KeOps expects inputs to be formatted as 2d Tensors of size (M,dim). In practice, a.view(-1,1) should be used to turn a vector of weights into a list of scalar values.
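A minimal illustration of the required layout, written with NumPy’s reshape for testability (the PyTorch equivalent is b.view(-1, 1), as mentioned above):

```python
import numpy as np

# A vector of M scalar weights must be passed as an (M, 1) array, not an (M,) one.
M = 10
b = np.random.rand(M)   # shape (M,): NOT accepted as a Vi(1) or Vj(1) variable
b = b.reshape(-1, 1)    # shape (M, 1): the 2d layout KeOps expects
```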

Parameters

*args (2d Tensors (variables Vi(..), Vj(..)) and 1d Tensors (parameters Pm(..))) –

The input numerical arrays, which should all have the same dtype, be contiguous and be stored on the same device. KeOps expects one array per alias, with the following compatibility rules:

  • All Vi(Dim_k) variables are encoded as 2d-tensors with Dim_k columns and the same number of lines \(M\).

  • All Vj(Dim_k) variables are encoded as 2d-tensors with Dim_k columns and the same number of lines \(N\).

  • All Pm(Dim_k) variables are encoded as 1d-tensors (vectors) of size Dim_k.
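The compatibility rules above can be summarized by a small shape-checking sketch; check_shapes is a hypothetical helper written for this documentation, not part of the KeOps API:

```python
import re

def check_shapes(aliases, shapes):
    """Hypothetical helper: check array shapes against 'AL = TYPE(DIM)' aliases.

    Vi variables must share a common number of lines M, Vj variables a common N,
    and Pm variables must be 1d vectors of the declared dimension.
    """
    M = N = None
    for alias, shape in zip(aliases, shapes):
        name, cat, dim = re.match(r"(\w+) = (Vi|Vj|Pm)\((\d+)\)", alias).groups()
        dim = int(dim)
        if cat == "Pm":
            assert shape == (dim,), f"{name}: expected a ({dim},) vector"
        else:
            assert len(shape) == 2 and shape[1] == dim, f"{name}: expected (?, {dim})"
            if cat == "Vi":
                assert M is None or shape[0] == M, f"{name}: inconsistent M"
                M = shape[0]
            else:
                assert N is None or shape[0] == N, f"{name}: inconsistent N"
                N = shape[0]
    return M, N
```

For example, check_shapes(['x = Vi(3)', 'y = Vj(3)', 'p = Pm(1)'], [(1000, 3), (2000, 3), (1,)]) accepts the inputs and returns (1000, 2000).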

Keyword Arguments
  • backend (string) –

    Specifies the map-reduce scheme. The supported values are:

    • "auto" (default): let KeOps decide which backend is best suited to your data, based on the tensors’ shapes. "GPU_1D" will be chosen in most cases.

    • "CPU": use a simple C++ for loop on a single CPU core.

    • "GPU_1D": use a simple multithreading scheme on the GPU - basically, one thread per value of the output index.

    • "GPU_2D": use a more sophisticated 2D parallelization scheme on the GPU.

    • "GPU": let KeOps decide which one of the "GPU_1D" or the "GPU_2D" scheme will run faster on the given input.

  • device_id (int, default=-1) – Specifies the GPU that should be used to perform the computation; a negative value lets your system choose the default GPU. This parameter is only useful if your system has access to several GPUs.

  • ranges (6-tuple of IntTensors, None by default) –

    Ranges of integers that specify a block-sparse reduction scheme with Mc clusters along axis 0 and Nc clusters along axis 1. If None (default), we simply loop over all indices \(i\in[0,M)\) and \(j\in[0,N)\).

    The first three ranges will be used if axis = 1 (reduction along the axis of “\(j\) variables”), and to compute gradients with respect to Vi(..) variables:

    • ranges_i, (Mc,2) IntTensor - slice indices \([\operatorname{start}^I_k,\operatorname{end}^I_k)\) in \([0,M]\) that specify our Mc blocks along the axis 0 of “\(i\) variables”.

    • slices_i, (Mc,) IntTensor - consecutive slice indices \([\operatorname{end}^S_1, ..., \operatorname{end}^S_{M_c}]\) that specify Mc ranges \([\operatorname{start}^S_k,\operatorname{end}^S_k)\) in redranges_j, with \(\operatorname{start}^S_k = \operatorname{end}^S_{k-1}\). The first 0 is implicit, meaning that \(\operatorname{start}^S_0 = 0\), and we typically expect that slices_i[-1] == len(redranges_j).

    • redranges_j, (Mcc,2) IntTensor - slice indices \([\operatorname{start}^J_l,\operatorname{end}^J_l)\) in \([0,N]\) that specify reduction ranges along the axis 1 of “\(j\) variables”.

    If axis = 1, these integer arrays allow us to say that for k in range(Mc), the output values for indices i in range( ranges_i[k,0], ranges_i[k,1] ) should be computed using a Map-Reduce scheme over indices j in Union( range( redranges_j[l, 0], redranges_j[l, 1] )) for l in range( slices_i[k-1], slices_i[k] ).

    Likewise, the last three ranges will be used if axis = 0 (reduction along the axis of “\(i\) variables”), and to compute gradients with respect to Vj(..) variables:

    • ranges_j, (Nc,2) IntTensor - slice indices \([\operatorname{start}^J_k,\operatorname{end}^J_k)\) in \([0,N]\) that specify our Nc blocks along the axis 1 of “\(j\) variables”.

    • slices_j, (Nc,) IntTensor - consecutive slice indices \([\operatorname{end}^S_1, ..., \operatorname{end}^S_{N_c}]\) that specify Nc ranges \([\operatorname{start}^S_k,\operatorname{end}^S_k)\) in redranges_i, with \(\operatorname{start}^S_k = \operatorname{end}^S_{k-1}\). The first 0 is implicit, meaning that \(\operatorname{start}^S_0 = 0\), and we typically expect that slices_j[-1] == len(redranges_i).

    • redranges_i, (Ncc,2) IntTensor - slice indices \([\operatorname{start}^I_l,\operatorname{end}^I_l)\) in \([0,M]\) that specify reduction ranges along the axis 0 of “\(i\) variables”.

    If axis = 0, these integer arrays allow us to say that for k in range(Nc), the output values for indices j in range( ranges_j[k,0], ranges_j[k,1] ) should be computed using a Map-Reduce scheme over indices i in Union( range( redranges_i[l, 0], redranges_i[l, 1] )) for l in range( slices_j[k-1], slices_j[k] ).
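The axis = 1 rule can be sketched in pure NumPy; ranged_sum below is a hypothetical brute-force reference for the Sum reduction, written for this documentation rather than taken from the KeOps implementation:

```python
import numpy as np

def ranged_sum(F, ranges_i, slices_i, redranges_j, M):
    """Brute-force reference for the block-sparse Sum reduction along axis 1.

    F is a dense (M, N) matrix of formula values; out[i] only accumulates
    the j-ranges attached to the i-block that contains index i.
    """
    out = np.zeros(M)
    for k, (i0, i1) in enumerate(ranges_i):
        s0 = 0 if k == 0 else slices_i[k - 1]   # the implicit leading 0
        s1 = slices_i[k]
        for j0, j1 in redranges_j[s0:s1]:
            out[i0:i1] += F[i0:i1, j0:j1].sum(axis=1)
    return out

# Two i-blocks, each reduced over its own j-range:
F = np.ones((4, 6))
out = ranged_sum(F, ranges_i=[(0, 2), (2, 4)], slices_i=[1, 2],
                 redranges_j=[(0, 3), (3, 6)], M=4)
```

Here lines 0-1 of the output are reduced over columns 0-2 only, and lines 2-3 over columns 3-5, so every entry of out equals 3.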

Returns

The output of the reduction, stored on the same device as the input Tensors. The output of a Genred call is always a 2d-tensor with \(M\) or \(N\) lines (if axis = 1 or axis = 0, respectively) and a number of columns that is inferred from the formula.

Return type

(M,D) or (N,D) Tensor