# Hellinger divergence between two homogeneous Poisson processes

Hellinger divergence/distance/metric/dissimilarity measures how different two probability measures are. I am interested in comparing spike train observations, hence I apply the Hellinger divergence to *probability measures over all possible spike trains*. The squared Hellinger divergence is defined as

$$H^2(P, Q) = \int \left(\sqrt{dP} - \sqrt{dQ}\right)^2,$$

where $P$ and $Q$ are probability measures.

In the case of the simplest point process, the homogeneous Poisson process, the spike train space as well as the probability measure can be partitioned according to the number of events (action potentials) in each spike train. Since the location statistic (the probability given the total number of events) is exactly the same for all homogeneous Poisson processes, the integral in the Hellinger divergence can be written simply as the following summation,

$$H^2(P, Q) = \sum_{n=0}^{\infty} \left(\sqrt{P_n} - \sqrt{Q_n}\right)^2,$$

where $P_n$ and $Q_n$ are the probabilities that $n$ events occur. Let $\lambda_P$ and $\lambda_Q$ be the corresponding mean numbers of events, then $P_n = \frac{\lambda_P^n e^{-\lambda_P}}{n!}$ and similarly for $Q_n$; they follow the Poisson distribution. Substituting and rearranging the terms, we obtain,

$$H^2(P, Q) = 2\left(1 - \exp\left(-\frac{\left(\sqrt{\lambda_P} - \sqrt{\lambda_Q}\right)^2}{2}\right)\right).$$
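As a sanity check on the rearrangement, the closed form can be compared numerically with a truncated version of the sum over event counts. This is a minimal Python sketch, assuming the closed form $H^2 = 2\left(1 - \exp\left(-(\sqrt{\lambda_P} - \sqrt{\lambda_Q})^2 / 2\right)\right)$ and the definition with range 0 to 2; the function names and the truncation point `n_max` are choices made here for illustration.

```python
import math


def hellinger2_closed_form(lam_p, lam_q):
    """Squared Hellinger divergence between Poisson(lam_p) and Poisson(lam_q)."""
    diff = math.sqrt(lam_p) - math.sqrt(lam_q)
    return 2.0 * (1.0 - math.exp(-0.5 * diff ** 2))


def poisson_log_pmf(n, lam):
    """log P(N = n) for a Poisson count with mean lam > 0."""
    return n * math.log(lam) - lam - math.lgamma(n + 1)


def hellinger2_by_summation(lam_p, lam_q, n_max=500):
    """Sum of (sqrt(P_n) - sqrt(Q_n))^2 over n = 0..n_max (truncated series)."""
    total = 0.0
    for n in range(n_max + 1):
        p_n = math.exp(poisson_log_pmf(n, lam_p))
        q_n = math.exp(poisson_log_pmf(n, lam_q))
        total += (math.sqrt(p_n) - math.sqrt(q_n)) ** 2
    return total


# Compare both computations for a few pairs of mean event counts.
for lam_p, lam_q in [(1.0, 1.0), (2.0, 5.0), (10.0, 30.0)]:
    closed = hellinger2_closed_form(lam_p, lam_q)
    summed = hellinger2_by_summation(lam_p, lam_q)
    print(lam_p, lam_q, closed, summed)
```

The truncation at `n_max=500` is only adequate while both means are far below 500; for larger means the upper limit would need to grow accordingly.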

This is essentially the same as the divergence between two Poisson distributions. Let’s plot how it looks (using Matlab).
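The surface can also be tabulated over a small grid of mean event counts. The sketch below is in Python rather than the Matlab used for the original figure, and assumes the closed form $H^2 = 2\left(1 - \exp\left(-(\sqrt{\lambda_P} - \sqrt{\lambda_Q})^2 / 2\right)\right)$; the grid values in `rates` are arbitrary illustrative choices.

```python
import math


def hellinger2(lam_p, lam_q):
    """Squared Hellinger divergence between two Poisson counts (range 0 to 2)."""
    diff = math.sqrt(lam_p) - math.sqrt(lam_q)
    return 2.0 * (1.0 - math.exp(-0.5 * diff ** 2))


# Illustrative grid of mean event counts for both processes.
rates = [1, 5, 10, 20, 40, 80]
grid = [[hellinger2(lp, lq) for lq in rates] for lp in rates]

# Print a table: rows index lam_P, columns index lam_Q.
print("lam_P\\lam_Q" + "".join(f"{lq:>8}" for lq in rates))
for lp, row in zip(rates, grid):
    print(f"{lp:>11}" + "".join(f"{v:8.3f}" for v in row))
```

The nested list `grid` can be handed to any plotting routine to draw the surface or a heat map.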

The squared Hellinger divergence, as defined earlier, ranges from 0 to 2. From the figure above, we can see that it is zero when the rates coincide and tends to 2 as the difference increases. I recently proposed a nonparametric estimator for this measure, and it is currently under review. I would like to empirically verify the asymptotic consistency of the estimator with this result.

Note that the previously proposed mCI-based distance between point processes [Paiva et al. 2009], related to van Rossum’s distance between spike trains, has a quadratic form. It applies only to Poisson processes when no realization is given, which is a major limitation. The Hellinger divergence can be applied to arbitrary point processes.

Around the point where $P = Q$, most divergences show quadratic behavior. Therefore, for optimization purposes, any divergence will do fine near the optimum. The question is how they behave when $P$ and $Q$ are not close to each other.
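For the homogeneous Poisson case, the local quadratic behavior can be made explicit. The expansion below is a sketch, assuming the closed form $H^2 = 2\left(1 - \exp\left(-(\sqrt{\lambda_P} - \sqrt{\lambda_Q})^2 / 2\right)\right)$ and writing $\lambda_Q = \lambda$, $\lambda_P = \lambda + \delta$ with $|\delta| \ll \lambda$ (the symbol $\delta$ is notation introduced here):

$$\sqrt{\lambda + \delta} - \sqrt{\lambda} \approx \frac{\delta}{2\sqrt{\lambda}}, \qquad H^2(P, Q) \approx 2 \cdot \frac{1}{2}\left(\frac{\delta}{2\sqrt{\lambda}}\right)^2 = \frac{\delta^2}{4\lambda}.$$

The second step uses $1 - e^{-x} \approx x$ for small $x$. So near $\delta = 0$ the divergence is quadratic in the difference of mean event counts, whereas for large differences it saturates at 2 instead of growing without bound.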