Formulas and syntax

KeOps lets us define any reduction operation of the form

\[\alpha_i = \operatorname*{Reduction}_j \big[ F(x^0_{\iota_0}, \ldots, x^{n-1}_{\iota_{n-1}}) \big]\]

or

\[\beta_j = \operatorname*{Reduction}_i \big[ F(x^0_{\iota_0}, \ldots, x^{n-1}_{\iota_{n-1}}) \big]\]

where \(F\) is a symbolic formula, the \(x^k_{\iota_k}\)’s are vector variables and \(\text{Reduction}\) is a Sum, LogSumExp or any other standard operation (see Reductions for the full list of supported reductions).

We now describe the symbolic syntax that can be used through all KeOps bindings.

Variables: category, index and dimension

At a low level, every variable \(x^k_{\iota_k}\) is specified by its category \(\iota_k\in\{i,j,\emptyset\}\) (meaning that the variable is indexed by \(i\), by \(j\), or is a fixed parameter across indices), its positional index \(k\) and its dimension \(d_k\).

In practice, the category \(\iota_k\) is given through a keyword:

  • Vi : variable indexed by \(i\)

  • Vj : variable indexed by \(j\)

  • Pm : parameter

followed by a \((k, d_k)\) or (index, dimension) pair of integers. For instance, Vi(2,4) specifies a variable indexed by \(i\), given as the third (\(k=2\)) input in the function call, and representing a vector of dimension \(d_k=4\).

N.B.: Using the same index k for two variables with different dimensions or categories is not allowed and will be rejected by the compiler.
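For instance, here is a minimal sketch of how these placeholders can appear in a formula string (the formula itself is illustrative; SqNorm2 and Sum_Reduction are described in the Math operators and Reductions sections below):

    # Positional syntax: each placeholder encodes (index, dimension).
    #   Pm(0,1) : parameter,  1st argument, dimension 1
    #   Vi(1,3) : i-variable, 2nd argument, dimension 3
    #   Vj(2,3) : j-variable, 3rd argument, dimension 3
    F = "Sum_Reduction(Exp(-Pm(0,1) * SqNorm2(Vi(1,3) - Vj(2,3))), 1)"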

Reserved words

  • Ind : index sequence

followed by a sequence \((i_0, i_1, \ldots)\) of integers. For instance, Ind(2,4,2,5,12) can be used as a parameter for some operations (such as TensorDot below).

Math operators

To define formulas with KeOps, you can use simple arithmetic:

  • f * g : scalar-vector multiplication (if f is scalar) or element-wise vector-vector multiplication

  • f + g : addition of two vectors

  • f - g : difference between two vectors, or unary minus

  • f / g : element-wise division (f may be scalar; f / g is equivalent to f * Inv(g))

  • (f | g) : scalar product between vectors

  • f == g : element-wise equality test (1 if f == g, 0 otherwise)

  • f != g : element-wise inequality test (1 if f != g, 0 otherwise)

  • f < g : element-wise less-than test (1 if f < g, 0 otherwise)

  • f > g : element-wise greater-than test (1 if f > g, 0 otherwise)

  • f <= g : element-wise less-than-or-equal test (1 if f <= g, 0 otherwise)

  • f >= g : element-wise greater-than-or-equal test (1 if f >= g, 0 otherwise)
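These operators combine freely inside a formula string. The line below is an illustrative sketch (with hypothetical aliases x = Vi(0,3), y = Vj(1,3) and t = Pm(2,1)) that thresholds a scalar product:

    # ((x | y) > t) evaluates to 1 or 0, so the product keeps the scalar
    # product (x | y) whenever it exceeds the threshold t, and 0 otherwise.
    F = "((x | y) > t) * (x | y)"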

Elementary functions:

  • Minus(f) : element-wise opposite of f

  • Inv(f) : element-wise inverse 1 / f

  • Exp(f) : element-wise exponential function

  • Log(f) : element-wise natural logarithm

  • XLogX(f) : computes f * log(f) element-wise (with value 0 at 0)

  • Sin(f) : element-wise sine function

  • SinXDivX(f) : computes sin(f) / f element-wise (with value 1 at 0)

  • Asin(f) : element-wise arc-sine function

  • Cos(f) : element-wise cosine function

  • Acos(f) : element-wise arc-cosine function

  • Atan(f) : element-wise arc-tangent function

  • Atan2(f, g) : element-wise two-argument arc-tangent function

  • Pow(f, P) : P-th power of f (element-wise), where P is a fixed integer

  • Powf(f, g) : power operation, alias for Exp(g * Log(f))

  • Square(f) : element-wise square, faster than Pow(f, 2)

  • Sqrt(f) : element-wise square root, faster than Powf(f, 0.5)

  • Rsqrt(f) : element-wise inverse square root, faster than Powf(f, -0.5)

  • Abs(f) : element-wise absolute value

  • Sign(f) : element-wise sign function (-1 if f < 0, 0 if f = 0, 1 if f > 0)

  • Step(f) : element-wise step function (0 if f < 0, 1 if f >= 0)

  • ReLU(f) : element-wise ReLU function (0 if f < 0, f if f >= 0)

  • Clamp(f, a, b) : element-wise clamping function (a if f < a, f if a <= f <= b, b if f > b)

  • ClampInt(f, a, b) : element-wise clamping function, with a and b fixed integers

  • IfElse(f, g, h) : element-wise conditional (g if f >= 0, h if f < 0)

  • Mod(f, m, off) : element-wise modulo with offset, computed as f - m * floor((f - off) / m) (see the sketch after this list)

  • Round(f, d) : element-wise rounding to d decimal places
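As a sanity check of the Mod convention above, here is a short NumPy sketch of the intended semantics (not KeOps code):

    import numpy as np

    def mod_with_offset(f, m, off):
        # Element-wise modulo with offset: maps f into [off, off + m),
        # following the definition f - m * floor((f - off) / m).
        return f - m * np.floor((f - off) / m)

    mod_with_offset(np.array([-1.0, 5.5]), 2.0, 0.0)  # -> array([1. , 1.5])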

Simple vector operations:

  • SqNorm2(f) : squared L2 norm, same as (f | f)

  • Norm2(f) : L2 norm, same as Sqrt((f | f))

  • Normalize(f) : normalized vector, same as Rsqrt(SqNorm2(f)) * f

  • SqDist(f, g) : squared L2 distance, same as SqNorm2(f - g)

KeOps also provides generic squared Euclidean norms, with support for scalar, diagonal and full (symmetric) weight matrices. If f is a vector of size N then, depending on the size of s, WeightedSqNorm(s, f) may refer to:

  • a weighted L2 norm \(s[0]\cdot\sum_{0\leqslant i < N} f[i]^2\) if s is a vector of size 1;

  • a separable norm \(\sum_{0\leqslant i < N} s[i]\cdot f[i]^2\) if s is a vector of size N;

  • a full anisotropic norm \(\sum_{0\leqslant i,j < N} s[iN+j]\,f[i]\,f[j]\) if s is a vector of size N * N such that s[i*N+j] = s[j*N+i] (i.e. s stores a symmetric matrix).

The two related operations are:

  • WeightedSqNorm(s, f) : generic squared Euclidean norm (see the NumPy sketch below)

  • WeightedSqDist(s, f, g) : generic squared distance, same as WeightedSqNorm(s, f - g)
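The three cases can be summarized by a NumPy sketch of the intended semantics (not KeOps code):

    import numpy as np

    def weighted_sqnorm(s, f):
        # Semantics of WeightedSqNorm(s, f) for a vector f of size N,
        # depending on the size of the weight vector s.
        N = f.size
        if s.size == 1:              # isotropic : s[0] * sum_i f[i]**2
            return s[0] * np.sum(f ** 2)
        if s.size == N:              # diagonal  : sum_i s[i] * f[i]**2
            return np.sum(s * f ** 2)
        S = s.reshape(N, N)          # full      : f^T S f, with S symmetric
        return f @ S @ f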

Operations involving complex numbers:

  • ComplexReal(f) : real part of a complex vector (vectorized)

  • ComplexImag(f) : imaginary part of a complex vector (vectorized)

  • Real2Complex(f) : converts a real vector to a complex vector with zero imaginary part (f + 0 * i)

  • Imag2Complex(f) : converts a real vector to a complex vector with zero real part (0 + i * f)

  • Conj(f) : complex conjugate (vectorized)

  • ComplexAbs(f) : absolute value, or modulus, of a complex vector (vectorized)

  • ComplexSquareAbs(f) : squared modulus of a complex vector (vectorized)

  • ComplexAngle(f) : angle of a complex vector (vectorized)

  • ComplexSum(f) : sum of the entries of a complex vector

  • ComplexSumT(f, dim) : adjoint operation of ComplexSum; replicates the complex scalar f dim times

  • ComplexMult(f, g) : complex multiplication of f and g (vectorized; see the sketch below)

  • ComplexScal(f, g) : multiplication of a complex scalar f with a complex vector g

  • ComplexRealScal(f, g) : multiplication of a real scalar f with a complex vector g

  • ComplexDivide(f, g) : complex division of f by g (vectorized)
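As an illustration, assuming that a complex vector of dimension n is encoded as a real vector of dimension 2n with interleaved real and imaginary parts (an assumption to check against your binding's documentation), ComplexMult behaves like the following NumPy sketch:

    import numpy as np

    def complex_mult(f, g):
        # Sketch of ComplexMult(f, g), ASSUMING the interleaved
        # (real, imag) storage layout described above.
        a = f[0::2] + 1j * f[1::2]
        b = g[0::2] + 1j * g[1::2]
        c = a * b
        out = np.empty_like(f)
        out[0::2], out[1::2] = c.real, c.imag
        return out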

Constants and padding/concatenation operations:

  • IntCst(N) : integer constant N

  • IntInv(N) : alias for Inv(IntCst(N)), i.e. 1/N

  • Zero(N) : vector of zeros of size N

  • Sum(f) : sum of the elements of vector f

  • Max(f) : maximum of the elements of vector f

  • Min(f) : minimum of the elements of vector f

  • ArgMax(f) : argmax of the elements of vector f

  • ArgMin(f) : argmin of the elements of vector f

  • Elem(f, M) : extracts the M-th element of vector f

  • ElemT(f, N, M) : inserts the scalar value f at position M in a vector of zeros of length N

  • Extract(f, M, D) : extracts a sub-vector from vector f (M is the starting index, D the dimension of the sub-vector)

  • ExtractT(f, M, D) : inserts vector f in a larger vector of zeros (M is the starting index, D the dimension of the output; see the sketch after this list)

  • Concat(f, g) : concatenation of vectors f and g

  • OneHot(f, D) : encodes a (rounded) scalar value as a one-hot vector of dimension D (see the sketch after this list)
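The padding operations are easiest to grasp through their NumPy equivalents; the sketch below (not KeOps code) mirrors the definitions of ExtractT and OneHot given above:

    import numpy as np

    def extract_t(f, M, D):
        # ExtractT(f, M, D): insert the vector f in a vector of zeros
        # of dimension D, starting at index M.
        out = np.zeros(D)
        out[M:M + f.size] = f
        return out

    def one_hot(f, D):
        # OneHot(f, D): encode a rounded scalar value as a one-hot
        # vector of dimension D.
        out = np.zeros(D)
        out[int(round(float(f)))] = 1.0
        return out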

Elementary tensor algebra:

  • MatVecMult(f, g) : matrix-vector product f x g, where f is a vector interpreted as a matrix (column-major) and g is a vector

  • VecMatMult(f, g) : vector-matrix product f x g, where f is a vector and g is a vector interpreted as a matrix (column-major)

  • TensorProd(f, g) : tensor product f x g^T, where f and g are vectors of sizes M and N; the output has size MN

  • Kron(f, g, dimf, dimg) : Kronecker product (similar in spirit to NumPy's kron)

  • TensorDot(f, g, dimf, dimg, contf, contg) : tensordot product f : g (similar in spirit to NumPy's tensordot): f and g are tensors whose sizes are listed in the index sequences dimf and dimg, and they are contracted along the dimensions listed in the index sequences contf and contg (see the NumPy analogue below). The MatVecMult, VecMatMult, Kron and TensorProd operations are special cases of TensorDot.
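Since TensorDot is documented as being similar in spirit to NumPy's tensordot, the contraction pattern of a call such as TensorDot(f, g, Ind(2,3), Ind(3,4), Ind(1), Ind(0)) can be illustrated as follows (a sketch of the intended contraction only, not of KeOps internals; the exact index conventions should be checked against the binding's documentation):

    import numpy as np

    # f is a 2x3 tensor and g a 3x4 tensor (stored as flat KeOps vectors of
    # sizes 6 and 12); they are contracted along f's axis 1 and g's axis 0.
    f = np.random.randn(2, 3)
    g = np.random.randn(3, 4)
    out = np.tensordot(f, g, axes=([1], [0]))  # shape (2, 4): size 8 as a flat vector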

Symbolic gradients and linear operators:

  • Grad(f, x, e) : gradient of f with respect to the variable x, with e as the "grad_input" to backpropagate (see the sketch after this list)

  • Diff(f, x, e) : differential of f with respect to the variable x, with e as the "diff_input" to propagate

  • GradMatrix(f, v) : gradient matrix (i.e. the transpose of the Jacobian matrix)

  • Divergence(f, x) : divergence of f with respect to the variable x

  • Laplacian(f, x) : Laplacian of f with respect to the variable x

  • TraceOperator(f, x) : trace of f with respect to the variable x (f must be a linear function of x)

  • AdjointOperator(f, x) : adjoint of f with respect to the variable x (f must be a linear function of x)
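For instance, a gradient formula can be spelled out explicitly. The sketch below uses hypothetical aliases x = Vi(0,3), y = Vj(1,3) and a gradient seed e = Vi(2,1), whose dimension (1) matches that of the scalar formula being differentiated:

    # Grad(SqDist(x, y), x, e) backpropagates the seed e through the
    # squared distance, yielding a vector with the same dimension as x.
    F = "Sum_Reduction(Grad(SqDist(x, y), x, e), 1)"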

Other:

  • SoftDTW_SqDist(f, g, c) : Soft-DTW distance (based on the squared Euclidean distance) between f and g, with smoothness parameter c

Reductions

The operations that can be used to reduce an array are listed below. For each reduction, we give its code name, its arguments, its mathematical expression (written for a reduction over \(j\)) and, where relevant, additional remarks.

Sum
  arguments: f
  expression: \(\sum_j f_{ij}\)

Max_SumShiftExp
  arguments: f (scalar)
  expression: \((m_i,s_i)\) with \(\left\{\begin{array}{l}m_i=\max_j f_{ij}\\s_i=\sum_j\exp(f_{ij}-m_i)\end{array}\right.\)
  remarks: core KeOps reduction for LogSumExp; the gradient is a pseudo-gradient and should not be used by itself

LogSumExp
  arguments: f (scalar)
  expression: \(\log\left(\sum_j\exp(f_{ij})\right)\)
  remarks: only in Python bindings

Max_SumShiftExpWeight
  arguments: f (scalar), g
  expression: \((m_i,s_i)\) with \(\left\{\begin{array}{l}m_i=\max_j f_{ij}\\s_i=\sum_j\exp(f_{ij}-m_i)g_{ij}\end{array}\right.\)
  remarks: core KeOps reduction for LogSumExpWeight and SumSoftMaxWeight; the gradient is a pseudo-gradient and should not be used by itself

LogSumExpWeight
  arguments: f (scalar), g
  expression: \(\log\left(\sum_j\exp(f_{ij})g_{ij}\right)\)
  remarks: only in Python bindings

SumSoftMaxWeight
  arguments: f (scalar), g
  expression: \(\left(\sum_j\exp(f_{ij})g_{ij}\right)/\left(\sum_j\exp(f_{ij})\right)\)
  remarks: only in Python bindings

Min
  arguments: f
  expression: \(\min_j f_{ij}\)
  remarks: no gradient

ArgMin
  arguments: f
  expression: \(\text{argmin}_j f_{ij}\)
  remarks: gradient returns zeros

Min_ArgMin
  arguments: f
  expression: \(\left(\min_j f_{ij}, \text{argmin}_j f_{ij}\right)\)
  remarks: no gradient

Max
  arguments: f
  expression: \(\max_j f_{ij}\)
  remarks: no gradient

ArgMax
  arguments: f
  expression: \(\text{argmax}_j f_{ij}\)
  remarks: gradient returns zeros

Max_ArgMax
  arguments: f
  expression: \(\left(\max_j f_{ij}, \text{argmax}_j f_{ij}\right)\)
  remarks: no gradient

KMin
  arguments: f, K (int)
  expression: \(\left[\min_j f_{ij}, \ldots, \min^{(K)}_j f_{ij}\right]\), where \(\min^{(k)}\) denotes the k-th smallest value
  remarks: no gradient

ArgKMin
  arguments: f, K (int)
  expression: \(\left[\text{argmin}_j f_{ij}, \ldots, \text{argmin}^{(K)}_j f_{ij}\right]\)
  remarks: gradient returns zeros

KMin_ArgKMin
  arguments: f, K (int)
  expression: \(\left([\min^{(1\ldots K)}_j f_{ij}], [\text{argmin}^{(1\ldots K)}_j f_{ij}]\right)\)
  remarks: no gradient

N.B.: All these reductions, except Max_SumShiftExp and LogSumExp, are vectorized: whenever the input f or g is vector-valued, the output is vector-valued as well, with the reduction applied element-wise to each component.

N.B.: All reductions accept an additional optional argument that specifies whether the reduction is performed over the j or the i index (see C++ API for KeOps and Generic reductions).
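As an illustration with the Python bindings (a sketch assuming pykeops is installed; the array sizes are arbitrary), the LogSumExp reduction of a Gaussian log-kernel reads:

    import numpy as np
    from pykeops.numpy import Genred

    # out[i] = log( sum_j exp( -|x_i - y_j|^2 ) ), computed with the
    # numerically stable Max_SumShiftExp scheme under the hood.
    routine = Genred("-SqDist(x, y)",
                     ["x = Vi(3)", "y = Vj(3)"],
                     reduction_op="LogSumExp", axis=1)
    x = np.random.randn(100, 3)
    y = np.random.randn(200, 3)
    out = routine(x, y)  # shape (100, 1)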

An example

Assume we want to compute the sum

\[F(p,x,y,a) = \left(\sum_{j=1}^N (p - a_j)^2 \exp(x_i^u + y_j^u) \right)_{\substack{i=1,\ldots,M \\ u=1,2,3}} \in \mathbb{R}^{M\times 3}\]

where:

  • \(p \in \mathbb R\) is a parameter,

  • \(x \in \mathbb R^{M\times 3}\) is an x-variable indexed by \(i\),

  • \(y \in \mathbb R^{N\times 3}\) is a y-variable indexed by \(j\),

  • \(a \in \mathbb R^N\) is a y-variable indexed by \(j\).

Using the variable placeholders presented above and the mathematical operations listed in Math operators, we can define F as a symbolic string

F = "Sum_Reduction( Square( Pm(0,1) - Vj(3,1) )  *  Exp( Vi(1,3) + Vj(2,3) ), 1 )"

in which + and - denote the usual addition of vectors, Exp is the (element-wise) exponential function and * denotes scalar-vector multiplication. The second argument 1 of the Sum_Reduction operator indicates that the summation is performed with respect to the \(j\) index: a 0 would have been associated with an \(i\)-reduction.

Note that in all bindings, variables can be defined through aliases. In this example, we may write p=Pm(0,1), x=Vi(1,3), y=Vj(2,3), a=Vj(3,1) and thus give F through a much friendlier expression:

F = "Sum_Reduction( Square(p - a) * Exp(x + y), 1 )"