Note
Click here to download the full example code
Solving positive definite linear systems¶
This benchmark compares the performances of KeOps versus Numpy and Pytorch on a inverse matrix operation. It uses the functions torch.KernelSolve
(see also here) and numpy.KernelSolve
(see also here).
In a nutshell, given \(x \in\mathbb R^{N\times D}\) and \(b \in \mathbb R^{N\times D_v}\), we compute \(a \in \mathbb R^{N\times D_v}\) so that
where \(K_{x,x} = \Big[\exp(-\|x_i -x_j\|^2 / \sigma^2)\Big]_{i,j=1}^N\). The method is based on a conjugate gradient scheme. The benchmark tests various values of \(N \in [10, \cdots,10^6]\).
Note
In this demo, we implement the linear operator \(K_xx\) using a bruteforce implementation and do not leverage any multiscale or low-rank (Nystroem/multipole) decomposition of the Kernel matrix. First support for these approximation schemes is scheduled for May-June 2021. Going further, advanced strategies and solvers are now available through the GPyTorch and Falkon libraries, which rely on a KeOps backend whenever relevant.
Setup¶
Standard imports:
import numpy as np
import torch
from matplotlib import pyplot as plt
from scipy.sparse import diags
from scipy.sparse.linalg import aslinearoperator, cg
from scipy.sparse.linalg.interface import IdentityOperator
from pykeops.numpy import KernelSolve as KernelSolve_np, LazyTensor
from pykeops.torch import KernelSolve
from pykeops.torch.utils import squared_distances as sqdist_torch
from pykeops.numpy.utils import squared_distances as sqdist_np
from benchmark_utils import flatten, random_normal, unit_tensor, full_benchmark
use_cuda = torch.cuda.is_available()
Benchmark specifications:
D = 3 # Let's do this in 3D
Dv = 1 # Dimension of the vectors (= number of linear problems to solve)
# Numbers of samples that we'll loop upon:
problem_sizes = flatten(
[[1 * 10 ** k, 2 * 10 ** k, 5 * 10 ** k] for k in [1, 2, 3, 4, 5]] + [[10 ** 6]]
)
D = 3 # We work with 3D points
Dv = 1 # and solve one problem at a time.
Create some random input data:
def generate_samples(N, device="cuda", lang="torch", **kwargs):
"""Generates a point cloud x, a scalar signal b of size N and two regularization parameters.
Args:
N (int): number of point.
device (str, optional): "cuda", "cpu", etc. Defaults to "cuda".
lang (str, optional): "torch", "numpy", etc. Defaults to "torch".
Returns:
3-uple of arrays: x, y, b
"""
randn = random_normal(device=device, lang=lang)
ones = unit_tensor(device=device, lang=lang)
x = randn((N, D))
b = randn((N, Dv))
gamma = ones((1,)) / (2 * 0.01 ** 2) # kernel bandwidth
alpha = ones((1,)) * 0.8 # regularization
return x, b, gamma, alpha
KeOps kernel¶
Define a Gaussian RBF kernel:
formula = "Exp(- g * SqDist(x,y)) * a"
aliases = [
"x = Vi(" + str(D) + ")", # First arg: i-variable of size D
"y = Vj(" + str(D) + ")", # Second arg: j-variable of size D
"a = Vj(" + str(Dv) + ")", # Third arg: j-variable of size Dv
"g = Pm(1)",
] # Fourth arg: scalar parameter
Note
This operator uses a conjugate gradient solver and assumes
that formula defines a symmetric, positive and definite
linear reduction with respect to the alias "a"
specified trough the third argument.
Define the Kernel solver, with a ridge regularization alpha:
def Kinv_keops(x, b, gamma, alpha, **kwargs):
Kinv = KernelSolve(formula, aliases, "a", axis=1)
res = Kinv(x, x, b, gamma, alpha=alpha)
return res
def Kinv_keops_numpy(x, b, gamma, alpha, **kwargs):
Kinv = KernelSolve_np(formula, aliases, "a", axis=1, dtype="float32")
res = Kinv(x, x, b, gamma, alpha=alpha)
return res
def Kinv_scipy(x, b, gamma, alpha, **kwargs):
x_i = LazyTensor(np.sqrt(gamma) * x[:, None, :])
y_j = LazyTensor(np.sqrt(gamma) * x[None, :, :])
K_ij = (-((x_i - y_j) ** 2).sum(2)).exp()
A = aslinearoperator(diags(alpha * np.ones(x.shape[0]))) + aslinearoperator(K_ij)
A.dtype = np.dtype("float32")
res = cg(A, b)
return res
Define the same Kernel solver, using a tensorized implementation:
def Kinv_pytorch(x, b, gamma, alpha, **kwargs):
K_xx = alpha * torch.eye(x.shape[0], device=x.get_device()) + torch.exp(
-gamma * sqdist_torch(x, x)
)
res = torch.solve(b, K_xx)[0]
return res
def Kinv_numpy(x, b, gamma, alpha, **kwargs):
K_xx = alpha * np.eye(x.shape[0]) + np.exp(-gamma * sqdist_np(x, x))
res = np.linalg.solve(K_xx, b)
return res
Run the benchmark¶
routines = [
(Kinv_numpy, "NumPy", {"lang": "numpy"}),
(Kinv_pytorch, "PyTorch", {}),
(Kinv_keops_numpy, "NumPy + KeOps", {"lang": "numpy"}),
(Kinv_keops, "PyTorch + KeOps", {}),
(Kinv_scipy, "Scipy + KeOps", {"lang": "numpy"}),
]
full_benchmark(
"Inverse radial kernel matrix",
routines,
generate_samples,
problem_sizes=problem_sizes,
)
plt.show()
Total running time of the script: ( 0 minutes 0.000 seconds)