array_api_extra.cov

array_api_extra.cov(m, /, *, xp=None)

Estimate a covariance matrix (or a stack of covariance matrices).

Covariance indicates the level to which two variables vary together. If we examine N-dimensional samples, \(X = [x_1, x_2, \ldots, x_N]^T\), each with M observations, then element \(C_{ij}\) of the \(N \times N\) covariance matrix is the covariance of \(x_i\) and \(x_j\). The element \(C_{ii}\) is the variance of \(x_i\).
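As an illustration of the definition, the following pure-Python sketch builds the matrix element by element. The helper name cov_from_definition is hypothetical and not part of array_api_extra, and the division by M - 1 (the unbiased estimator, which is also the numpy.cov default) is an assumption inferred from the values shown in the Examples section.

```python
def cov_from_definition(rows):
    """Covariance matrix for N variables given as N rows of M observations.

    Hypothetical helper for illustration only; assumes M - 1 normalisation.
    """
    n_obs = len(rows[0])
    # Mean of each variable (one mean per row)
    means = [sum(row) / n_obs for row in rows]
    # Subtract each variable's mean from its observations
    centered = [[v - mu for v in row] for row, mu in zip(rows, means)]
    # C[i][j] = sum over observations of (x_i - mean_i) * (x_j - mean_j) / (M - 1)
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (n_obs - 1) for cj in centered]
        for ci in centered
    ]


# Two variables that move in opposite directions
C = cov_from_definition([[0, 1, 2], [2, 1, 0]])
print(C)
```

The diagonal entries C[0][0] and C[1][1] are the variances of the two variables, and the off-diagonal entries are their covariance.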

With the exception of supporting batch input, this provides a subset of the functionality of numpy.cov.

Parameters:

m (array) – An array of shape (..., N, M) whose innermost two dimensions contain M observations of N variables. That is, each row of m represents a variable, and each column a single observation of all those variables.
xp (array_namespace, optional) – The standard-compatible namespace for m. Default: infer.

Returns:

An array having shape (..., N, N) whose innermost two dimensions represent the covariance matrix of the variables.

Return type:

array

Examples

>>> import array_api_strict as xp
>>> import array_api_extra as xpx

Consider two variables, \(x_0\) and \(x_1\), which correlate perfectly, but in opposite directions:

>>> x = xp.asarray([[0, 2], [1, 1], [2, 0]]).T
>>> x
Array([[0, 1, 2],
       [2, 1, 0]], dtype=array_api_strict.int64)

Note how \(x_0\) increases while \(x_1\) decreases. The covariance matrix shows this clearly:

>>> xpx.cov(x, xp=xp)
Array([[ 1., -1.],
       [-1.,  1.]], dtype=array_api_strict.float64)

Note that element \(C_{0,1}\), the covariance of \(x_0\) and \(x_1\), is negative, which reflects their inverse relationship.

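Further, note how x and y are combined into a single array, and how the diagonal of the result matches the variance obtained for each of them individually: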

>>> x = xp.asarray([-2.1, -1,  4.3])
>>> y = xp.asarray([3,  1.1,  0.12])
>>> X = xp.stack((x, y), axis=0)
>>> xpx.cov(X, xp=xp)
Array([[11.71      , -4.286     ],
       [-4.286     ,  2.14413333]], dtype=array_api_strict.float64)

>>> xpx.cov(x, xp=xp)
Array(11.71, dtype=array_api_strict.float64)

>>> xpx.cov(y, xp=xp)
Array(2.14413333, dtype=array_api_strict.float64)

Input with more than two dimensions is treated as a stack of two-dimensional input.

>>> stack = xp.stack((X, 2*X))
>>> xpx.cov(stack)
Array([[[ 11.71      ,  -4.286     ],
        [ -4.286     ,   2.14413333]],

       [[ 46.84      , -17.144     ],
        [-17.144     ,   8.57653333]]], dtype=array_api_strict.float64)
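
The stacking rule can be cross-checked against numpy.cov by applying it to one two-dimensional slice at a time. This is a minimal sketch that assumes NumPy is installed. It reuses the x and y values from the example above, and the name per_slice is introduced here for illustration only.

```python
import numpy as np

x = np.asarray([-2.1, -1.0, 4.3])
y = np.asarray([3.0, 1.1, 0.12])
X = np.stack((x, y), axis=0)      # shape (2, 3): N = 2 variables, M = 3 observations
stack = np.stack((X, 2 * X))      # shape (2, 2, 3): a batch of two inputs

# Compute one covariance matrix per two-dimensional slice, then re-stack them
per_slice = np.stack([np.cov(m) for m in stack])
print(per_slice.shape)
print(per_slice)
```

Because the second slice is 2*X, its covariance matrix is four times that of the first slice, which agrees with the batched output shown above.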