 # Einstein, Hurst, and network congestion policy

Posted in Tutorial on Sep 23, 2012.

We begin our story with good old Albert Einstein and his considerations of Brownian motion. In 1905, Einstein submitted four papers for publication that addressed, in turn, the photoelectric effect, Brownian motion, special relativity, and mass-energy equivalence. He could have been swept from the earth in 1906 and he’d still be rightly famous for his work of 1905. Over 100 years later, folks are still grappling with the meaning of his work in that year. You can read what Albert had to say about Brownian motion for yourselves.

Let’s turn to the bit on Brownian motion for now. Lucretius wrote of Brownian motion around 60 BC:

Observe what happens when sunbeams are admitted into a building and shed light on its shadowy places. You will see a multitude of tiny particles mingling in a multitude of ways… their dancing is an actual indication of underlying movements of matter that are hidden from our sight… It originates with the atoms which move of themselves [i.e., spontaneously]. Then those small compound bodies that are least removed from the impetus of the atoms are set in motion by the impact of their invisible blows and in turn cannon against slightly larger bodies. So the movement mounts up from the atoms and gradually emerges to the level of our senses, so that those bodies are in motion that we see in sunbeams, moved by blows that remain invisible.

The essence of Einstein’s solution to the problem of Brownian motion was to propose that the small particles (of dust or whatever) were in thermodynamic equilibrium with the molecules of the gas in which they were suspended. Of course, this model was open to generalization to a variety of other situations; e.g., electrons or holes in a semiconductor, thermal electrons “evaporating” from a hot cathode in a vacuum, and so on. What is of interest in our present discussion is that the mean displacement $\langle x\rangle$ of a particle (strictly, its root-mean-square displacement) turns out to grow as the square root of the time: $\langle x\rangle \propto \sqrt{t}$.

This result is quite general. It depends only on the idea that the particle under consideration is subject to a sequence of impulses that are distributed as a set of uncorrelated Gaussian random values with a specified mean and standard deviation. One can even get pretty far in Brownian-like motions by dropping the Gaussian assumption so long as one keeps the uncorrelated constraint.

This dependence of distance moved on the square root of time derives from the principle of equipartition of energy. This principle states that, in thermodynamic equilibrium, each possible independent way in which the elements of a system can move has exactly the same amount of kinetic energy on average. This amount of energy is $k_BT/2$, where $T$ is the absolute temperature in kelvins and $k_B$ is the Boltzmann constant. We can then focus on a single particle, without any consideration of the rest, so that its average kinetic energy will be $m\langle v^2\rangle/2 = m\left(\langle v_x^2\rangle + \langle v_y^2\rangle + \langle v_z^2\rangle\right)/2 = 3k_BT/2$.

Because this particle can move in 3 independent directions in our 3-D space, it has an energy of $3k_BT/2$ on average. This gives a root-mean-square velocity of $v_{\text{rms}}=\sqrt{\frac{3k_BT}{m}}$.
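As a quick numerical sanity check on the equipartition result, the sketch below evaluates $v_{\text{rms}}$ for an illustrative case (the nitrogen-molecule mass and the temperature are example values I'm supplying, not numbers from the discussion above):

```python
# Equipartition check: v_rms = sqrt(3 * k_B * T / m).
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def v_rms(mass_kg: float, temperature_k: float) -> float:
    """Root-mean-square velocity of a particle in thermal equilibrium."""
    return math.sqrt(3.0 * K_B * temperature_k / mass_kg)

# Illustrative case: a nitrogen molecule (m ≈ 4.65e-26 kg) at 300 K.
print(v_rms(4.65e-26, 300.0))  # ≈ 517 m/s
```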

A generalization of Brownian motion is the Wiener process, named after Norbert Wiener. A Wiener process has three properties consistent with our view of Brownian motion so far; viz.,

1. $W_t = 0$ at $t = 0$
2. $W_t$ is almost surely continuous everywhere
3. any increment $W_t - W_s$, where $t > s$, is normally distributed with mean zero and variance $t - s$

It is this property, that an increment of a Wiener process has mean zero and a variance linear in time, that yields a standard deviation (or expected magnitude of the increment) proportional to the square root of time, $\sqrt{t}$.
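A short Monte Carlo sketch can confirm this scaling; the step size, horizon, and sample count below are arbitrary choices of mine:

```python
# Monte Carlo sketch: the standard deviation of W_t grows as sqrt(t).
import math
import random

def wiener_endpoint(n_steps: int, dt: float, rng: random.Random) -> float:
    """W_t as a sum of independent N(0, dt) increments, with t = n_steps * dt."""
    return sum(rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps))

rng = random.Random(1)
# Horizon t = 4.0, reached in 400 steps of dt = 0.01.
samples = [wiener_endpoint(400, 0.01, rng) for _ in range(5000)]
mean = sum(samples) / len(samples)
std = math.sqrt(sum((x - mean) ** 2 for x in samples) / len(samples))
print(std)  # close to sqrt(t) = 2
```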

Now hold that thought.

We turn our focus now to Harold Edwin Hurst, a British hydrologist who worked on the flow of the Nile, and whose work was ultimately important in the construction of the Aswan High Dam. Hurst’s obituary tells the story of how he came to be fascinated by the actual flood data on the Nile, records of which had been kept in Egypt for millennia.

By the time of Hurst’s studies, the concept of the Wiener process was already well understood. So there was an expectation that, if the annual flow from a river were a normal process with some mean and standard deviation, then the volume stored behind a dam would behave as a kind of Wiener process once the cumulative annual mean was subtracted out. This suggested that the fluctuations in the volume within the dam would be proportional to the square root of the time of storage. However, when Hurst actually analyzed the data for the Nile, he discovered instead that the annual increments varied as $t^H$, where $H = 0.72$ is now called the Hurst exponent. Hurst did his calculations using rescaled range analysis, which naturally yields an exponent of exactly 0.5 for uncorrelated increments in the observations.
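For the curious, here is a minimal sketch of rescaled range (R/S) analysis applied to uncorrelated Gaussian data. It is deliberately simplified: real analyses apply small-sample corrections, so estimates from short windows tend to land slightly above 0.5 even for white noise.

```python
# Minimal rescaled-range (R/S) sketch: estimate a Hurst exponent as the
# slope of log(mean R/S) versus log(window size).
import math
import random

def rescaled_range(x: list) -> float:
    """R/S statistic for one window of observations."""
    n = len(x)
    mean = sum(x) / n
    # Cumulative deviations from the window mean.
    z, running = [], 0.0
    for v in x:
        running += v - mean
        z.append(running)
    r = max(z) - min(z)
    s = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    return r / s

def hurst_estimate(series: list, window_sizes=(8, 16, 32, 64, 128)) -> float:
    """Least-squares slope of log(mean R/S) against log(window size)."""
    log_n, log_rs = [], []
    for n in window_sizes:
        chunks = [series[i:i + n] for i in range(0, len(series) - n + 1, n)]
        rs = sum(rescaled_range(c) for c in chunks) / len(chunks)
        log_n.append(math.log(n))
        log_rs.append(math.log(rs))
    k = len(log_n)
    mx, my = sum(log_n) / k, sum(log_rs) / k
    num = sum((a - mx) * (b - my) for a, b in zip(log_n, log_rs))
    den = sum((a - mx) ** 2 for a in log_n)
    return num / den

rng = random.Random(7)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
print(hurst_estimate(white_noise))  # near 0.5 for uncorrelated increments
```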

That was not what the data revealed though, and Hurst became so captivated by the results that he began to investigate related phenomena. A natural extension of annual river flows was annual rainfall data; and a proxy would be tree ring values. Hurst looked at data for these and again found an exponent of about 0.72. In fact, it is now commonly accepted that many natural phenomena have Hurst exponents of about this value.

What does this all mean? For an intuitive starting point, consider a simple particle being struck repeatedly in exactly the same direction. In this case, the increments in position all add up directly, because they are perfectly correlated; this is true even if the magnitude of each impulse is itself random. Now the change in position goes directly with the time: $\langle{x}\rangle = v t$. If instead we whack our particle in the uncorrelated random fashion of Brownian motion, we get $\langle{x}\rangle = v_{\text{rms}} \sqrt{t}$. Now suppose that we allow some degree of correlation between each whack and the next. We don’t line them all up completely, but if the last increment was positive, say, then the likelihood is better than even that the next increment will also be positive. Now we get what Hurst observed: $\langle{x}\rangle = v_{\text{rms}} t^H$, where $0.5 < H \leqslant 1$.

This implies that the Nile data represent a “feast or famine” correlation. In other words, if the Nile was high one year, chances were that it was high the next, and vice versa. Just to close this loop, if we arrange that the whacks our particle receives are anti-correlated, then we get a Hurst exponent in the range $0 \leqslant H < 0.5$.

In communications theory, we often describe noise as AWGN, meaning additive white Gaussian noise. We also make related assumptions about such noise forming a Wiener process, and this is of value where an AWGN component is integrated or summed. We employ a similar intellectual structure in modeling the load offered to a network. For example, we might assume that the number of packets arriving at a router or terminal is a Gaussian or Poisson process, in which case we might expect that the net load would be a Wiener process.
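To illustrate the effect of correlated whacks, the sketch below compares the RMS spread of a plain ±1 walk with one whose steps tend to repeat. (A caveat: a short-memory walk like this still settles to $H = 0.5$ at long times; sustained $H > 0.5$ requires long-range dependence, as in fractional Brownian motion. The point here is only that positive correlation widens the spread.)

```python
# Sketch: positively correlated steps spread a walker farther than
# uncorrelated ones over the same number of steps.
import math
import random

def rms_displacement(persistence: float, n_steps: int, n_walks: int,
                     rng: random.Random) -> float:
    """RMS final position of ±1 walks in which each step repeats the
    previous one with probability `persistence` (0.5 = uncorrelated)."""
    total = 0.0
    for _ in range(n_walks):
        pos, step = 0, 1
        for _ in range(n_steps):
            if rng.random() >= persistence:
                step = -step
            pos += step
        total += pos * pos
    return math.sqrt(total / n_walks)

rng = random.Random(3)
uncorrelated = rms_displacement(0.5, 1000, 1000, rng)  # ≈ sqrt(1000) ≈ 32
persistent = rms_displacement(0.8, 1000, 1000, rng)    # noticeably larger
print(uncorrelated, persistent)
```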

In short, the problem of buffer design in a network is completely analogous to the design of a dam on a river. The question then arises: what if the actual offered load is some correlated “feast or famine” process, as in the case of the Nile? Well, of course it is; that’s what busy hour is all about, isn’t it?

This concept leads to the notion of runs in a random process. Suppose we have a data buffer and we can clear that buffer at an output rate, $O$. We are interested in the likelihood of a run of $N$ intervals in which the input rate, $I$, is greater than $O$. If we had some model for this likelihood, we could then provide the buffer with some amount of memory so that the probability of an overflow was less than some design criterion. It should be apparent that this is exactly the problem that Hurst faced in analyzing a storage dam on the Nile.
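As a rough sketch of this design question, the following simulation estimates how often a buffer of a given depth overflows when the drain rate only slightly exceeds the mean arrival rate. All parameters here are illustrative assumptions of mine, not measurements.

```python
# Monte Carlo sketch of buffer sizing: arrivals are (truncated) Gaussian,
# drain is fixed, and we count how often the backlog ever exceeds the
# buffer depth within a busy period.
import random

def overflow_probability(mean_in: float, std_in: float, out_rate: float,
                         buffer_depth: float, n_intervals: int,
                         n_trials: int, rng: random.Random) -> float:
    """Fraction of trials in which the backlog ever exceeds buffer_depth."""
    overflows = 0
    for _ in range(n_trials):
        backlog = 0.0
        for _ in range(n_intervals):
            arrivals = max(0.0, rng.gauss(mean_in, std_in))
            backlog = max(0.0, backlog + arrivals - out_rate)
            if backlog > buffer_depth:
                overflows += 1
                break
    return overflows / n_trials

rng = random.Random(11)
# Drain rate only slightly above the mean arrival rate: a run of heavy
# intervals can still fill a 30-unit buffer within the busy period.
p = overflow_probability(10.0, 4.0, 11.0, 30.0, 200, 2000, rng)
print(p)
```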

Holding this thought for a moment, let’s turn our attention to the notion of congestion policy in networks. A traditional IP router on the Internet implements a very simple congestion policy; namely, any arriving packet that it cannot deliver immediately is dropped. In other words, its buffer depth is 0. Telephone switches have two different possible approaches to congestion; calls that are blocked because there are no available trunks can either be held (with some form of busy signal back) or they can be dropped (perhaps with a “your call cannot be completed at this time” alert).

If output bandwidth is considered to be a network resource, then the network element that provides access to it is a kind of server. It would be very natural to consider what service policies might optimize access to network bandwidth. In a data network with offered load having a Hurst exponent of $H \approx 0.72$, an extra strain is introduced since we now recognize the increased likelihood of correlated runs in traffic. Furthermore, if the network in question is designed to serve critical messaging for either the public or first responders or both, then it may be apparent from the context that one of the key drivers for traffic correlation will be the existence of a state of emergency.

Let’s consider a simplified version of this problem. Say that a buffer has an initial amount of memory, $M$. In each interval of time, the amount of memory left can go up by 1 unit or down by 1 unit. Suppose to begin with that the probability that we lose a unit of memory is $p$ and the probability that we gain a unit of memory is $q = (1 - p)$. Because there is a finite amount of memory, we can never have more than $M$. So in the case that the amount of memory remaining is $M$ and the increment is +1, the result remains $M$.

This problem has already been solved, with one addition that would be magical in our context: imagine for a moment that, if the increment is +1 when the remaining memory is $M$, the result is $M+1$. This problem is known as “gambler’s ruin”. In this variation, a gambler has a bankroll, $M$, and he is playing against a bank with value $B$. Each interval represents a bet that can go for or against the gambler. The question is: what is the likelihood that the gambler will eventually be ruined? For the fair case, $p = q = 1/2$, it turns out that the answer is $q_{\text{ruin}} = 1 - \frac{M}{M+B} = \frac{B}{M+B}$.
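A quick Monte Carlo check of the fair-game result, with illustrative parameters; the simulated ruin frequency should land near $B/(M+B)$:

```python
# Monte Carlo check of the fair gambler's ruin probability, which should
# approach B / (M + B).
import random

def ruin_probability(m: int, b: int, n_trials: int,
                     rng: random.Random) -> float:
    """Fraction of fair games in which the gambler (bankroll m) is
    ruined before winning the bank's entire stake b."""
    ruined = 0
    for _ in range(n_trials):
        fortune = m
        while 0 < fortune < m + b:
            fortune += 1 if rng.random() < 0.5 else -1
        if fortune == 0:
            ruined += 1
    return ruined / n_trials

rng = random.Random(5)
p_ruin = ruin_probability(3, 9, 20000, rng)
print(p_ruin)  # theory: 9 / (3 + 9) = 0.75
```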

In effect, our buffer is playing against a house with extremely large capital, $B$. If $M \ll B$, then ruin is almost certain. In the example of the buffer, the bank is the entire set of sources that can send a message. If all sources become correlated and deliver traffic in sequence, then the buffer must surely overflow unless $M > B$. In our buffer situation, the result must be worse than the last equation gives, since we cannot create buffer memory in “playing” against the “bank” of traffic sources. An additional factor against the buffer is that the solution to the gambler’s ruin problem assumes no correlation between plays; in other words, the sequence of increments yields a Wiener process. We have asserted that the likelihood of a run against our “gambler” (or buffer) is worse than what would happen for uncorrelated increments.

In subsequent posts, I’ll dig into these aspects of paging networks more deeply. Stay tuned. 