# LazyTensors¶

## Overview¶

The **high-level** interface of KeOps is the `LazyTensor`

and `LazyTensor`

wrappers, which allows users to perform **efficient, semi-symbolic computations** on very large NumPy arrays or PyTorch tensors respectively. As displayed on this website’s front page, this new tensor type may be used with **very little overhead**:

```
# Create two arrays with 3 columns and a (huge) number of lines, on the GPU
import torch
x = torch.randn(1000000, 3, requires_grad=True).cuda()
y = torch.randn(2000000, 3).cuda()
# Turn our Tensors into KeOps symbolic variables:
from pykeops.torch import LazyTensor
x_i = LazyTensor( x[:,None,:] ) # x_i.shape = (1e6, 1, 3)
y_j = LazyTensor( y[None,:,:] ) # y_j.shape = ( 1, 2e6,3)
# We can now perform large-scale computations, without memory overflows:
D_ij = ((x_i - y_j)**2).sum(dim=2) # Symbolic (1e6,2e6,1) matrix of squared distances
K_ij = (- D_ij).exp() # Symbolic (1e6,2e6,1) Gaussian kernel matrix
# And come back to vanilla PyTorch Tensors or NumPy arrays using
# reduction operations such as .sum(), .logsumexp() or .argmin().
# Here, the kernel density estimation a_i = sum_j exp(-|x_i-y_j|^2)
# is computed using a CUDA online map-reduce routine that has a linear
# memory footprint and outperforms standard PyTorch implementations
# by two orders of magnitude.
a_i = K_ij.sum(dim=1) # Genuine torch.cuda.FloatTensor, a_i.shape = (1e6, 1),
g_x = torch.autograd.grad((a_i ** 2).sum(), [x]) # KeOps supports autograd!
```

## Documentation¶

Starting with the KeOps 101 tutorial,
most examples in our gallery
rely on `LazyTensor`

or `LazyTensors`

:
going through this collection of **real-life demos** is probably
the best way of getting familiar with the KeOps user interface.

Going further, please refer to the `LazyTensor`

or `LazyTensor`

API for an exhaustive list of all supported operations.