Quasi-arithmetic mean

Generalization of means

In mathematics and statistics, the quasi-arithmetic mean, also called the generalised f-mean or Kolmogorov-Nagumo-de Finetti mean,[1] is a generalisation of the more familiar means, such as the arithmetic mean and the geometric mean, using a function $f$. It is also called the Kolmogorov mean after the Soviet mathematician Andrey Kolmogorov. It is a broader generalization than the regular generalized mean.

Definition

Let $f$ be a function that maps an interval $I$ of the real line onto another interval $J \equiv f(I)$ of the real numbers, and suppose that $f$ is both continuous and injective (one-to-one).

(We require $f$ to be injective on $I$ in order for an inverse function $f^{-1}$ to exist. We require $I$ and $J$ to both be intervals in order to ensure that an average of any finite (or infinite) subset of values within $J$ will always correspond to a value in $I$.)

Subject to those requirements, the f-mean of $n$ numbers $x_1, \ldots, x_n \in I$ is defined to be

$$M_f(x_1, \dots, x_n) \equiv f^{-1}\!\left( \frac{1}{n} \Bigl( f(x_1) + \cdots + f(x_n) \Bigr) \right),$$

or equivalently

$$M_f(\vec{x}) = f^{-1}\!\left( \frac{1}{n} \sum_{k=1}^{n} f(x_k) \right).$$

A consequence of $f$ being defined over some selected interval $I$ and mapping onto another interval $J$ is that $\frac{1}{n}\left( f(x_1) + \cdots + f(x_n) \right)$ must also lie within $J$. And because $J$ is the domain of $f^{-1}$, in turn $f^{-1}$ must produce a value inside the same interval the values originally came from, $I$.

Because $f$ is injective and continuous, it necessarily follows that $f$ is a strictly monotonic function, and therefore that the f-mean is neither larger than the largest number of the tuple $X \equiv (x_1, \ldots, x_n)$ nor smaller than the smallest number contained in $X$; hence it lies between the smallest and largest values of the original sample.
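The definition can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper name `quasi_arithmetic_mean` and the convention of passing `f` and its inverse `f_inv` explicitly are assumptions made here for the example.

```python
import math

def quasi_arithmetic_mean(f, f_inv, xs):
    """Return f_inv(mean of f(x) over xs); xs must be non-empty and lie in I."""
    xs = list(xs)
    if not xs:
        raise ValueError("need at least one value")
    return f_inv(sum(f(x) for x in xs) / len(xs))

# Example: f = log on I = (0, inf), whose inverse is exp.
sample = [1.0, 4.0, 16.0]
m = quasi_arithmetic_mean(math.log, math.exp, sample)

print(m)                                   # close to 4, up to rounding
print(min(sample) <= m <= max(sample))     # the mean stays inside the sample range
```

Passing the inverse explicitly avoids numerical root-finding; the caller is responsible for supplying a pair `f`, `f_inv` that really are inverse to each other on the chosen interval.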

Examples

  • If $I = \mathbb{R}$, the real line, and $f(x) = x$ (or indeed any linear function $x \mapsto a \cdot x + b$ with $a \neq 0$ and any $b$), then the f-mean corresponds to the arithmetic mean.
  • If $I = \mathbb{R}^{+}$, the strictly positive real numbers, and $f(x) = \log(x)$, then the f-mean corresponds to the geometric mean. (The result is the same for any logarithm; it does not depend on the base of the logarithm, as long as that base is strictly positive but not 1.)
  • If $I = \mathbb{R}^{+}$ and $f(x) = \frac{1}{x}$, then the f-mean corresponds to the harmonic mean.
  • If $I = \mathbb{R}^{+}$ and $f(x) = x^{p}$ with $p \neq 0$, then the f-mean corresponds to the power mean with exponent $p$ (e.g., for $p = 2$ one gets the root mean square (RMS)).
  • If $I = \mathbb{R}$ and $f(x) = \exp(x)$, then the f-mean is the mean in the log semiring, which is a constant-shifted version of the LogSumExp (LSE) function (which is the logarithmic sum): $M_f(x_1, \ldots, x_n) = \operatorname{LSE}(x_1, \ldots, x_n) - \log(n)$. (The $-\log(n)$ in the expression corresponds to dividing by $n$, since logarithmic division is linear subtraction.) The LogSumExp function is a smooth maximum: it is a smooth approximation to the maximum function.
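The examples in this list can be checked numerically. The sketch below uses a hypothetical helper `quasi_arithmetic_mean` (not a library function) and compares each f-mean with the familiar closed form; the sample values and the particular linear function and logarithm base are arbitrary choices.

```python
import math

def quasi_arithmetic_mean(f, f_inv, xs):
    """f_inv applied to the arithmetic mean of f over xs."""
    xs = list(xs)
    return f_inv(sum(f(x) for x in xs) / len(xs))

xs = [1.0, 2.0, 4.0]
n = len(xs)

# Any linear f with nonzero slope gives the arithmetic mean.
arithmetic = quasi_arithmetic_mean(lambda x: 3.0 * x + 7.0,
                                   lambda y: (y - 7.0) / 3.0, xs)
# Any logarithm (base 2 here) gives the geometric mean.
geometric = quasi_arithmetic_mean(math.log2, lambda y: 2.0 ** y, xs)
# f(x) = 1/x is its own inverse and gives the harmonic mean.
harmonic = quasi_arithmetic_mean(lambda x: 1.0 / x, lambda y: 1.0 / y, xs)
# f(x) = x**2 gives the root mean square.
rms = quasi_arithmetic_mean(lambda x: x ** 2, math.sqrt, xs)
# f(x) = exp(x) gives LogSumExp shifted by -log(n).
log_mean_exp = quasi_arithmetic_mean(math.exp, math.log, xs)
lse = math.log(sum(math.exp(x) for x in xs))

print(arithmetic, sum(xs) / n)
print(geometric, math.prod(xs) ** (1.0 / n))
print(harmonic, n / sum(1.0 / x for x in xs))
print(rms, math.sqrt(sum(x * x for x in xs) / n))
print(log_mean_exp, lse - math.log(n))
```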

Properties

The following properties hold for $M_f$ for any single function $f$:

Symmetry: The value of $M_f$ is unchanged if its arguments are permuted.

Idempotency: For all $x$, the repeated average $M_f(x, \dots, x) = x$.

Monotonicity: $M_f$ is monotonic in each of its arguments (since $f$ is monotonic).

Continuity: $M_f$ is continuous in each of its arguments (since $f$ is continuous).
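Symmetry, idempotency and monotonicity can be spot-checked numerically. The following sketch uses the harmonic mean ($f(x) = 1/x$) as an arbitrary example; the helper name `qam` and the sample values are illustrative assumptions, and a numerical check on one sample is of course not a proof.

```python
import itertools
import math

def qam(f, f_inv, xs):
    """Quasi-arithmetic mean: f_inv of the average of f over xs."""
    xs = list(xs)
    return f_inv(sum(f(x) for x in xs) / len(xs))

def f(x):
    return 1.0 / x      # generator of the harmonic mean on (0, inf)

f_inv = f               # 1/x is its own inverse

xs = (1.0, 2.0, 5.0)
base = qam(f, f_inv, xs)

# Symmetry: every permutation of the arguments gives the same value.
symmetric = all(math.isclose(qam(f, f_inv, p), base)
                for p in itertools.permutations(xs))

# Idempotency: averaging copies of one value returns that value.
idempotent = math.isclose(qam(f, f_inv, [3.0] * 4), 3.0)

# Monotonicity: increasing one argument does not decrease the mean.
bumped = qam(f, f_inv, (1.0, 2.5, 5.0))
monotone = bumped >= base

print(symmetric, idempotent, monotone)
```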

Replacement: Subsets of elements can be averaged a priori, without altering the mean, given that the multiplicity of elements is maintained. With $m \equiv M_f(x_1, \ldots, x_k)$ it holds:

$$M_f(x_1, \dots, x_k, x_{k+1}, \ldots, x_n) = M_f(\underbrace{m, \ldots, m}_{k\text{ times}}, x_{k+1}, \ldots, x_n).$$
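As a quick numerical illustration of the replacement property, the sketch below takes the geometric mean ($f = \log$), replaces the first $k$ values by $k$ copies of their own f-mean, and compares the results. The helper name `qam` and the data are assumptions chosen for the example.

```python
import math

def qam(f, f_inv, xs):
    """Quasi-arithmetic mean: f_inv of the average of f over xs."""
    xs = list(xs)
    return f_inv(sum(f(x) for x in xs) / len(xs))

xs = [1.0, 4.0, 2.0, 8.0, 3.0]
k = 2

# m is the f-mean of the first k elements.
m = qam(math.log, math.exp, xs[:k])

# Replace those k elements by k copies of m (multiplicity is preserved).
replaced = [m] * k + xs[k:]

full = qam(math.log, math.exp, xs)
after_replacement = qam(math.log, math.exp, replaced)
print(full, after_replacement)   # equal up to rounding
```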

Partitioning: The computation of the mean can be split into computations of equal-sized sub-blocks:

$$M_f(x_1, \dots, x_{n \cdot k}) = M_f\Bigl( M_f(x_1, \ldots, x_k),\; M_f(x_{k+1}, \ldots, x_{2 \cdot k}),\; \dots,\; M_f(x_{(n-1) \cdot k + 1}, \ldots, x_{n \cdot k}) \Bigr).$$
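The partitioning property can be illustrated by splitting a sample into equal-sized blocks, averaging each block, and then averaging the block means. The sketch uses the root mean square ($f(x) = x^2$ on the positive reals); the helper name `qam` and the data are illustrative assumptions.

```python
import math

def qam(f, f_inv, xs):
    """Quasi-arithmetic mean: f_inv of the average of f over xs."""
    xs = list(xs)
    return f_inv(sum(f(x) for x in xs) / len(xs))

def square(x):
    return x * x

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # n * k values with n = 3 blocks of size k = 2
k = 2

# Mean of each consecutive block of k elements.
block_means = [qam(square, math.sqrt, xs[i:i + k])
               for i in range(0, len(xs), k)]

direct = qam(square, math.sqrt, xs)
via_blocks = qam(square, math.sqrt, block_means)
print(direct, via_blocks)   # equal up to rounding
```

The blocks must have equal size; with unequal blocks the outer mean would need weights proportional to the block lengths.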

Self-distributivity: For any quasi-arithmetic (q.a.) mean $M_{\mathsf{qa}}$ of two variables:

$$M_{\mathsf{qa}}\bigl( x,\; M_{\mathsf{qa}}(y, z) \bigr) = M_{\mathsf{qa}}\bigl( M_{\mathsf{qa}}(x, y),\; M_{\mathsf{qa}}(x, z) \bigr).$$

Mediality: For any quasi-arithmetic mean $M_{\mathsf{qa}}$ of two variables:

$$M_{\mathsf{qa}}\bigl( M_{\mathsf{qa}}(x, y),\; M_{\mathsf{qa}}(z, w) \bigr) = M_{\mathsf{qa}}\bigl( M_{\mathsf{qa}}(x, z),\; M_{\mathsf{qa}}(y, w) \bigr).$$

Balancing: For any quasi-arithmetic mean $M_{\mathsf{qa}}$ of two variables:

$$M_{\mathsf{qa}}\Bigl( M_{\mathsf{qa}}\bigl( x,\; M_{\mathsf{qa}}(x, y) \bigr),\; M_{\mathsf{qa}}\bigl( y,\; M_{\mathsf{qa}}(x, y) \bigr) \Bigr) = M_{\mathsf{qa}}(x, y).$$
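The three two-variable identities (self-distributivity, mediality and balancing) can be checked numerically for a particular generator. The sketch below uses $f = \exp$ as an arbitrary choice; the function name `M` and the test values are assumptions for the example.

```python
import math

def M(x, y, f=math.exp, f_inv=math.log):
    """Two-variable quasi-arithmetic mean generated by f."""
    return f_inv((f(x) + f(y)) / 2.0)

x, y, z, w = 0.3, 1.7, -0.4, 2.5

# Self-distributivity: M(x, M(y, z)) == M(M(x, y), M(x, z))
self_distributive = math.isclose(M(x, M(y, z)), M(M(x, y), M(x, z)))

# Mediality: M(M(x, y), M(z, w)) == M(M(x, z), M(y, w))
medial = math.isclose(M(M(x, y), M(z, w)), M(M(x, z), M(y, w)))

# Balancing: M(M(x, M(x, y)), M(y, M(x, y))) == M(x, y)
balanced = math.isclose(M(M(x, M(x, y)), M(y, M(x, y))), M(x, y))

print(self_distributive, medial, balanced)
```

Each identity follows from expanding both sides in terms of $f(x), f(y), \ldots$: for instance, both sides of the self-distributivity equation reduce to $f^{-1}\bigl(\tfrac12 f(x) + \tfrac14 f(y) + \tfrac14 f(z)\bigr)$.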

Scale-invariance: The quasi-arithmetic mean is invariant with respect to offsets and non-trivial scaling of the generating function $f$: for any $p(t) \equiv a + b \cdot q(t)$, with $a$ and $b \neq 0$ constants and $q$ a continuous injective function generating a quasi-arithmetic mean, $M_p(x)$ and $M_q(x)$ are always the same. In mathematical notation:

$$p(t) = a + b \cdot q(t) \;\; \forall t, \text{ for some constants } a \text{ and } b \neq 0 \quad \Rightarrow \quad M_p(x) = M_q(x) \;\; \forall x.$$
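A short numerical check of this invariance: take $q = \log$ and $p(t) = a + b \cdot q(t)$, whose inverse is $p^{-1}(y) = q^{-1}\bigl((y - a)/b\bigr)$, and compare the two means. The constants, the data and the helper name `qam` are arbitrary assumptions for the illustration.

```python
import math

def qam(f, f_inv, xs):
    """Quasi-arithmetic mean: f_inv of the average of f over xs."""
    xs = list(xs)
    return f_inv(sum(f(x) for x in xs) / len(xs))

a, b = 5.0, -2.0    # any offset a and any nonzero scale b

def q(t):
    return math.log(t)

def q_inv(y):
    return math.exp(y)

def p(t):
    return a + b * q(t)

def p_inv(y):
    # Undo the affine map first, then apply the inverse of q.
    return q_inv((y - a) / b)

xs = [0.5, 2.0, 3.0, 10.0]
mean_q = qam(q, q_inv, xs)
mean_p = qam(p, p_inv, xs)
print(mean_q, mean_p)   # equal up to rounding
```

The affine map cancels because averaging commutes with it: the mean of $a + b\,q(x_k)$ is $a + b$ times the mean of $q(x_k)$.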

Central limit theorem: Under certain regularity conditions, and for a sufficiently large sample,

$$z \equiv \sqrt{n}\, \Bigl[ M_f(X_1, \ldots, X_n) - \operatorname{\mathbb{E}}_X\bigl( M_f(X_1, \ldots, X_n) \bigr) \Bigr]$$

is approximately normally distributed.[2] A similar result is available for Bajraktarević means and deviation means, which are generalizations of quasi-arithmetic means.[3][4]
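The statement can be explored with a small simulation. The sketch below is illustrative only and rests on several assumptions not taken from the text above: the data are i.i.d. Uniform(1, 2), the generator is $f = \log$ (geometric mean), the empirical average of the simulated means stands in for the expectation, and the spread of $z$ is compared with the standard delta-method value $\sqrt{\operatorname{Var} f(X)}\,/\,|f'(\mu_f)|$ where $\mu_f = f^{-1}(\mathbb{E} f(X))$.

```python
import math
import random
import statistics

def qam(f, f_inv, xs):
    """Quasi-arithmetic mean: f_inv of the average of f over xs."""
    xs = list(xs)
    return f_inv(sum(f(x) for x in xs) / len(xs))

rng = random.Random(0)      # fixed seed so the run is reproducible
n, reps = 200, 2000

# Geometric means of `reps` independent samples of size n from Uniform(1, 2).
means = [qam(math.log, math.exp, [rng.uniform(1.0, 2.0) for _ in range(n)])
         for _ in range(reps)]

center = statistics.fmean(means)             # stand-in for E[M_f]
z = [math.sqrt(n) * (m - center) for m in means]
z_sd = statistics.pstdev(z)

# Delta-method prediction for Uniform(1, 2) and f = log (closed-form integrals).
e_log = 2.0 * math.log(2.0) - 1.0                                 # E[log X]
e_log_sq = 2.0 * math.log(2.0) ** 2 - 4.0 * math.log(2.0) + 2.0   # E[(log X)^2]
predicted_sd = math.sqrt(e_log_sq - e_log ** 2) * math.exp(e_log)

# For a normal distribution about 95% of values fall within 1.96 sd.
within = sum(abs(v) <= 1.96 * z_sd for v in z) / reps

print(z_sd, predicted_sd, within)
```

A histogram or normal probability plot of `z` would give a fuller picture than the two summary numbers printed here.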

Characterization

There are several different sets of properties that characterize the quasi-arithmetic mean (i.e., each function that satisfies these properties is an f-mean for some function f).

  • Mediality is essentially sufficient to characterize quasi-arithmetic means.[5]: chapter 17 
  • Self-distributivity is essentially sufficient to characterize quasi-arithmetic means.[5]: chapter 17 
  • Replacement: Kolmogorov proved that the five properties of symmetry, fixed-point, monotonicity, continuity, and replacement fully characterize the quasi-arithmetic means.[6]
  • Continuity is superfluous in the characterization of two-variable quasi-arithmetic means. See [10] for the details.
  • Balancing: An interesting problem is whether this condition (together with the symmetry, fixed-point, monotonicity and continuity properties) implies that the mean is quasi-arithmetic. Georg Aumann showed in the 1930s that the answer is no in general,[7] but that if one additionally assumes $M$ to be an analytic function then the answer is positive.[8]

Homogeneity

Means are usually homogeneous, but for most functions $f$, the f-mean is not. Indeed, the only homogeneous quasi-arithmetic means are the power means (including the geometric mean); see Hardy–Littlewood–Pólya, page 68.

The homogeneity property can be achieved by normalizing the input values by some (homogeneous) mean $C$:

$$M_{f,C}\,x = Cx \cdot f^{-1}\!\left( \frac{f\left(\frac{x_1}{Cx}\right) + \cdots + f\left(\frac{x_n}{Cx}\right)}{n} \right)$$

However this modification may violate monotonicity and the partitioning property of the mean.
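To make the homogeneity discussion concrete, the sketch below compares the plain f-mean for $f = \exp$, which is not homogeneous, with the normalized version using the arithmetic mean as $C$. The helper names (`qam`, `normalized_mean`, `arithmetic`) and the data are assumptions chosen for this illustration.

```python
import math

def qam(f, f_inv, xs):
    """Quasi-arithmetic mean: f_inv of the average of f over xs."""
    xs = list(xs)
    return f_inv(sum(f(x) for x in xs) / len(xs))

def arithmetic(xs):
    """A homogeneous mean C used for normalization."""
    return sum(xs) / len(xs)

def normalized_mean(f, f_inv, c, xs):
    """C(x) times the f-mean of the inputs divided by C(x)."""
    scale = c(xs)
    return scale * qam(f, f_inv, [x / scale for x in xs])

xs = [1.0, 2.0, 3.0]
t = 10.0
scaled = [t * x for x in xs]

plain = qam(math.exp, math.log, xs)
plain_scaled = qam(math.exp, math.log, scaled)

norm = normalized_mean(math.exp, math.log, arithmetic, xs)
norm_scaled = normalized_mean(math.exp, math.log, arithmetic, scaled)

print(plain_scaled, t * plain)   # differ: the exp-mean is not homogeneous
print(norm_scaled, t * norm)     # agree: the normalized mean is homogeneous
```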

Generalizations

Consider a Legendre-type strictly convex function $F$. Then the gradient map $\nabla F$ is globally invertible and the weighted multivariate quasi-arithmetic mean[9] is defined by

$$M_{\nabla F}(\theta_1, \ldots, \theta_n; w) = (\nabla F)^{-1}\!\left( \sum_{i=1}^{n} w_i \nabla F(\theta_i) \right),$$

where $w$ is a normalized weight vector ($w_i = \frac{1}{n}$ by default for a balanced average). From the convex duality, we get a dual quasi-arithmetic mean $M_{\nabla F^{*}}$ associated to the quasi-arithmetic mean $M_{\nabla F}$. For example, take $F(X) = -\log\det(X)$ for $X$ a symmetric positive-definite matrix. The pair of matrix quasi-arithmetic means yields the matrix harmonic mean:

$$M_{\nabla F}(\theta_1, \theta_2) = 2\left(\theta_1^{-1} + \theta_2^{-1}\right)^{-1}.$$
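The matrix example can be sketched for 2×2 symmetric positive-definite matrices in pure Python. It relies on the standard fact that the gradient of $F(X) = -\log\det(X)$ at a symmetric $X$ is $-X^{-1}$, so the gradient map is its own inverse. All function names below (`inv2`, `grad_F`, `matrix_qam`) are hypothetical helpers written for this illustration.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def grad_F(x):
    """Gradient of F(X) = -log det X at symmetric X, namely -X^{-1}."""
    return [[-v for v in row] for row in inv2(x)]

# The map X -> -X^{-1} is an involution, so it is its own inverse.
grad_F_inv = grad_F

def weighted_sum(weights, mats):
    """Entrywise weighted sum of 2x2 matrices."""
    return [[sum(w * m[i][j] for w, m in zip(weights, mats))
             for j in range(2)] for i in range(2)]

def matrix_qam(thetas, weights=None):
    """Weighted quasi-arithmetic mean generated by grad_F."""
    n = len(thetas)
    if weights is None:
        weights = [1.0 / n] * n          # balanced average by default
    return grad_F_inv(weighted_sum(weights, [grad_F(t) for t in thetas]))

theta1 = [[1.0, 0.0], [0.0, 2.0]]
theta2 = [[3.0, 0.0], [0.0, 6.0]]
mean = matrix_qam([theta1, theta2])
print(mean)   # approximately diag(1.5, 3.0), the entrywise harmonic means
```

For diagonal inputs the result reduces to the scalar harmonic mean on each diagonal entry, which gives an easy sanity check.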

References

  • Andrey Kolmogorov (1930) "On the Notion of Mean", in "Mathematics and Mechanics" (Kluwer 1991) — pp. 144–146.
  • Andrey Kolmogorov (1930) Sur la notion de la moyenne. Atti Accad. Naz. Lincei 12, pp. 388–391.
  • John Bibby (1974) "Axiomatisations of the average and a further generalisation of monotonic sequences," Glasgow Mathematical Journal, vol. 15, pp. 63–65.
  • Hardy, G. H.; Littlewood, J. E.; Pólya, G. (1952) Inequalities. 2nd ed. Cambridge Univ. Press, Cambridge, 1952.
  • B. De Finetti (1931) "Sul concetto di media", Giornale dell'Istituto Italiano degli Attuari, vol. 3, pp. 369–396.
  1. ^ Nielsen, Frank; Nock, Richard (June 2017). "Generalizing skew Jensen divergences and Bregman divergences with comparative convexity". IEEE Signal Processing Letters. 24 (8): 2. arXiv:1702.04877. Bibcode:2017ISPL...24.1123N. doi:10.1109/LSP.2017.2712195. S2CID 31899023.
  2. ^ de Carvalho, Miguel (2016). "Mean, what do you mean?". The American Statistician. 70 (3): 764‒776. doi:10.1080/00031305.2016.1148632. hdl:20.500.11820/fd7a8991-69a4-4fe5-876f-abcd2957a88c. S2CID 219595024 – via zenodo.org.
  3. ^ Barczy, Mátyás; Burai, Pál (April 2022). "Limit theorems for Bajraktarević and Cauchy quotient means of independent identically distributed random variables". Aequationes Mathematicae. 96 (2): 279–305. arXiv:1909.02968. doi:10.1007/s00010-021-00813-x. ISSN 1420-8903 – via Springer.com.
  4. ^ Barczy, Mátyás; Páles, Zsolt (September 2023). "Limit theorems for deviation means of independent and identically distributed random variables". Journal of Theoretical Probability. 36 (3): 1626–1666. arXiv:2112.05183. doi:10.1007/s10959-022-01225-6. ISSN 1572-9230 – via Springer.com.
  5. ^ a b Aczél, J.; Dhombres, J. G. (1989). Functional equations in several variables. With applications to mathematics, information theory and to the natural and social sciences. Encyclopedia of Mathematics and its Applications, 31. Cambridge: Cambridge Univ. Press.
  6. ^ Grudkin, Anton (2019). "Characterization of the quasi-arithmetic mean". Math stackexchange.
  7. ^ Aumann, Georg (1937). "Vollkommene Funktionalmittel und gewisse Kegelschnitteigenschaften". Journal für die reine und angewandte Mathematik. 1937 (176): 49–55. doi:10.1515/crll.1937.176.49. S2CID 115392661.
  8. ^ Aumann, Georg (1934). "Grundlegung der Theorie der analytischen Analytische Mittelwerte". Sitzungsberichte der Bayerischen Akademie der Wissenschaften: 45–81.
  9. ^ Nielsen, Frank (2023). "Beyond scalar quasi-arithmetic means: Quasi-arithmetic averages and quasi-arithmetic mixtures in information geometry". arXiv:2301.10980 [cs.IT].