This section contains the full API documention for the conjugate gradient solver for Generic linear systems, with full support of PyTorch’s torch.autograd engine.


Creates a new conjugate gradient solver.


Instantiate a new KernelSolve operation.


Apply the routine on arbitrary torch Tensors.


class pykeops.torch.KernelSolve[source]

Creates a new conjugate gradient solver.

Supporting the same generic syntax as torch.Genred, this module allows you to solve generic optimization problems of the form:

\[\begin{split}& & a^{\star} & =\operatorname*{argmin}_a \tfrac 1 2 \langle a,( \alpha \operatorname{Id}+K_{xx}) a\rangle - \langle a,b \rangle, \\\\ &\text{i.e.}\quad & a^{\star} & = (\alpha \operatorname{Id} + K_{xx})^{-1} b,\end{split}\]

where \(K_{xx}\) is a symmetric, positive definite linear operator and \(\alpha\) is a nonnegative regularization parameter.


KernelSolve is fully compatible with PyTorch’s autograd engine: you can backprop through the KernelSolve __call__() just as if it was a vanilla PyTorch operation.


>>> formula = "Exp(-Norm2(x - y)) * a"  # Exponential kernel
>>> aliases =  ["x = Vi(3)",  # 1st input: target points, one dim-3 vector per line
...             "y = Vj(3)",  # 2nd input: source points, one dim-3 vector per column
...             "a = Vj(2)"]  # 3rd input: source signal, one dim-2 vector per column
>>> K = Genred(formula, aliases, axis = 1)  # Reduce formula along the lines of the kernel matrix
>>> K_inv = KernelSolve(formula, aliases, "a",  # The formula above is linear wrt. 'a'
...                     axis = 1)
>>> # Generate some random data:
>>> x = torch.randn(10000, 3, requires_grad=True).cuda()  # Sampling locations
>>> b = torch.randn(10000, 2).cuda()                      # Random observed signal
>>> a = K_inv(x, x, b, alpha = .1)  # Linear solve: a_i = (.1*Id + K(x,x)) \ b
>>> print(a.shape)
torch.Size([10000, 2])
>>> # Mean squared error:
>>> print( ((( .1 * a + K(x,x,a) - b)**2 ).sqrt().sum() / len(x) ).item() )
>>> [g_x] = torch.autograd.grad((a ** 2).sum(), [x])  # KernelSolve supports autograd!
>>> print(g_x.shape)
torch.Size([10000, 3])
__init__(formula, aliases, varinvalias, axis=0, dtype='float32', cuda_type=None, use_double_acc=False, use_BlockRed='auto', use_Kahan=False)[source]

Instantiate a new KernelSolve operation.


KernelSolve relies on CUDA kernels that are compiled on-the-fly and stored in a cache directory as shared libraries (“.so” files) for later use.

  • formula (string) – The scalar- or vector-valued expression that should be computed and reduced. The correct syntax is described in the documentation, using appropriate mathematical operations.

  • aliases (list of strings) –

    A list of identifiers of the form "AL = TYPE(DIM)" that specify the categories and dimensions of the input variables. Here:

    • AL is an alphanumerical alias, used in the formula.

    • TYPE is a category. One of:

      • Vi: indexation by \(i\) along axis 0.

      • Vj: indexation by \(j\) along axis 1.

      • Pm: no indexation, the input tensor is a vector and not a 2d array.

    • DIM is an integer, the dimension of the current variable.

    As described below, __call__() will expect input Tensors whose shape are compatible with aliases.

  • varinvalias (string) – The alphanumerical alias of the variable with respect to which we shall perform our conjugate gradient descent. formula is supposed to be linear with respect to varinvalias, but may be more sophisticated than a mere "K(x,y) * {varinvalias}".

Keyword Arguments
  • alpha (float, default = 1e-10) – Non-negative ridge regularization parameter, added to the diagonal of the Kernel matrix \(K_{xx}\).

  • axis (int, default = 0) –

    Specifies the dimension of the kernel matrix \(K_{x_ix_j}\) that is reduced by our routine. The supported values are:

    • axis = 0: reduction with respect to \(i\), outputs a Vj or “\(j\)” variable.

    • axis = 1: reduction with respect to \(j\), outputs a Vi or “\(i\)” variable.

  • dtype (string, default = "float32") –

    Specifies the numerical dtype of the input and output arrays. The supported values are:

    • dtype = "float32" or "float".

    • dtype = "float64" or "double".

__call__(*args, backend='auto', device_id=-1, alpha=1e-10, eps=1e-06, ranges=None)[source]

Apply the routine on arbitrary torch Tensors.


Even for variables of size 1 (e.g. \(a_i\in\mathbb{R}\) for \(i\in[0,M)\)), KeOps expects inputs to be formatted as 2d Tensors of size (M,dim). In practice, a.view(-1,1) should be used to turn a vector of weights into a list of scalar values.


*args (2d Tensors (variables Vi(..), Vj(..)) and 1d Tensors (parameters Pm(..))) –

The input numerical arrays, which should all have the same dtype, be contiguous and be stored on the same device. KeOps expects one array per alias, with the following compatibility rules:

  • All Vi(Dim_k) variables are encoded as 2d-tensors with Dim_k columns and the same number of lines \(M\).

  • All Vj(Dim_k) variables are encoded as 2d-tensors with Dim_k columns and the same number of lines \(N\).

  • All Pm(Dim_k) variables are encoded as 1d-tensors (vectors) of size Dim_k.

Keyword Arguments
  • backend (string) – Specifies the map-reduce scheme, as detailed in the documentation of the torch.Genred module.

  • device_id (int, default=-1) – Specifies the GPU that should be used to perform the computation; a negative value lets your system choose the default GPU. This parameter is only useful if your system has access to several GPUs.

  • ranges (6-uple of IntTensors, None by default) –

    Ranges of integers that specify a block-sparse reduction scheme with Mc clusters along axis 0 and Nc clusters along axis 1, as detailed in the documentation of the torch.Genred module.

    If None (default), we simply use a dense Kernel matrix as we loop over all indices \(i\in[0,M)\) and \(j\in[0,N)\).


The solution of the optimization problem, stored on the same device as the input Tensors. The output of a KernelSolve call is always a 2d-tensor with \(M\) or \(N\) lines (if axis = 1 or axis = 0, respectively) and a number of columns that is inferred from the formula.

Return type

(M,D) or (N,D) Tensor