Noise: white, pink, brown, black, and buffers

Posted by on Nov 4, 2012 in Tutorial | 0 comments

This post is a continuation of the previous one on network congestion.

A white noise is one in which each sample is uncorrelated with any other. A white Gaussian noise is one in which the samples are distributed according to a normal curve with a stable mean and variance. The power spectrum of a white noise process is uniformly flat at all frequencies. One way to think about this property is to say that there is equal power in equal frequency bands.

Another “brand” of noise is pink noise. The power spectrum of a pink noise falls off as 1/f and so this form of noise is often called “1/f noise”. As in the case of white noise, there are equal power frequency bands; however, in this case, the equal power bands are constant on a logarithmic scale. Pink noise often arise naturally due to what are called “relaxation processes”. In a typical relaxation process, some ensemble of elements (e.g., particles, dipoles, or whatever), is in an excited state and can “relax” from that state with an exponentially distributed time. Given just one such process, the power spectrum turns out to be that of a low pass filter with a cut-off frequency of $1/{\tau}$. If there are several such relaxation processes active in a system, and if these are exponentially related to one another, the result is a pink noise. A way to simulate this kind of behavior can be done with three (or more) dice. The output of the process is the sum of the faces of the three dice. The first die is thrown at each stage. The second die is thrown at each second stage. The third die is thrown at each fourth stage. If there is a fourth die, it would be cast at each eighth stage; and so on. The average value is 10.5, which is naturally 3 times 3.5. The progression of relaxation times is simulated by the different degrees of persistence in the values of the dice.

A brown noise arises from the integration of white noise. In this case, the power spectrum falls off as $1/f^2$. Beyond brown noise lies black noise. One example of a black noise would be a white noise that had been integrated twice, for a power spectrum of $1/f^4$.

So, in general, we could consider any noise spectrum of the form $f^{-\beta}$ where $\beta$ is between 0 or greater. The specific cases we’ve considered so far at $\beta = 0$ for white noise, $\beta = -1$ for pink noise, $\beta = -2$ for brown noise, and $\beta > 2$ for black noise. In our previous post, we introduced the Hurst exponent, $H$. It turns out that there is a simple relationship between the spectral exponent $\beta$ and the Hurst exponent, $H$; namely, $\beta = 2H + 1$. So, for white noise with $\beta = 0$, $H = 1/2$ and the result is a random walk. For the common Hurst exponent of $H = 0.72$, $\beta = 2.44$; and the result is a black noise.

For these brown and black noises, there are long range dependencies (LRD) between increasingly distant samples. Stated another way, the greater the value of $\beta$, the greater the likelihood of runs of positive or negative values. When such processes are discovered in queuing systems, the usual assumptions concerning convergence towards large-scale behaviors break down.

The classic paper that presented this to an engineering audience was Leland, W., M. Taqqu, W. Willinger, and D. Wilson, “On the Self-Similar Nature of Ethernet Traffic (Extended Version),” IEEE/ACM Transactions on Networking, 2(1), pp. 1-13, 1994. I can recall my own amazement upon reading this paper for the first time. Perhaps the best representation of the concepts is the following figure, taken from a follow-up paper, Willinger, W. and V. Paxson, “Where Mathematics Meets the Internet,” Notices of the American Mathematical Society, Vol. 45, No. 8, pp. 961-970, 1998.

Difference between classic and fractal traffic

In the case of the “fractal” traffic, the image looks equally bursty at any time scale, while the “classic” traffic becomes increasingly smooth as the observation window is increased. This is an expression of the fact that the Gaussian probability distribution that underlies the “classic” traffic models has a preferred scale; namely, its mean value. As the observation time is increased well beyond the mean arrival rate of traffic, plus or minus the standard deviation, then the fluctuations around the mean become less and less apparent.

On the other hand, with “fractal” traffic, there are fluctuations at all scales; and there is no benefit in attempting to “average them out”. Of course, that was the whole idea of a buffer in the first place wasn’t it? The point of the buffer was to hold, say, 100 samples of the traffic; and assuming that it was Poisson distributed, then the fluctuations would be about ±10. More generally, for a Poisson process, a buffer that was $K$ deep would have fluctuations of about $K^{1/2}$. As buffer depth is increased, the relative fluctuations and the likelihood of buffer overflow is reduced.

This just doesn’t happen with “fractal” traffic. An excellent introductory article on this form of traffic is “A Short Tutorial on Fractals and Internet Traffic” by Thomas B. Fowler. In the same publication, there is an excellent follow-up article, “A Method for Analyzing Congestion in Pareto and Related Queues“, by Martin Fischer and Carl Harris. [Note that these two documents require Microsoft Word or some application that can open Word documents.]

If offered load is not Poisson-distributed, then what is it? As discussed in the articles I just referred to, the distributions tend to have “long tails”, which is to say that they do not fall off as rapidly as the normal distribution (or other distributions of the exponential family) for high values. More generally, these distributions are referred to as power-law distributions. A power law has for its mathematical form $K x^{-\alpha}$ where $K$ and $\alpha$ are constants, at least for some part of the domain $x$.

One class of such distributions are the Pareto distributions; however, there are a broader class of these as well. The Pareto Type I distribution is given as follows:

 \bar{F}(x) = \Pr (X>x) = \begin{cases} \left(\frac{x_m}{x}\right){}^{\alpha } & \text{for } x\geq x_m \\ 1 & \text{for } xhad to converge to nice, neat normal distributions. It turns out that the normal distribution is only one of a class of so-called Levy alpha-stable distributions for which a law of large numbers (central limit theorem) applies. Those deeply interested in the mathematics can read through the Wikipedia article that I just linked to and follow the references there.

The main point for my purposes here is to point out that these more general distributions have four parameters, one for "shift" or central location (think "mean"), one for asymmetry (think "skew"), one for spread (think "variance"), and one for asymptotic behavior (think "tail"). The normal, Lévy, Pareto, Cauchy and other distributions are all specific cases of the Lévy alpha-stable distribution. Also, each variant obeys a central limit theorem in the sense that the sum of random variables that are distributed as any member of this class is also distributed as the same distribution. Said another way, the sum of a number of Paretian random variables is also a Paretian random variable just as surely as the sum of a number of Gaussian random variables is a Gaussian.

I hope that this fact helps persuade you, gentle reader, that simply adding up some millions of subscribers, any group of which might obey some anomalous probability distribution, does not yield a normal curve due to a law of large numbers. Quite the contrary: add up millions of power-tail distributions and viola, you have a power-tail distribution.

Back to buffers... Because of the general applicability of these observations, consideration of Pareto-distributed traffic on queues has become important over the past several years. In particular, a Pareto/D/1/k model has been a frequent basis for study and simulation. A reasonable representative of this class of work is a paper entitled "Evaluation of Pareto/D/1/k queue by simulation" by Mirtchev and Goleva. The general result in this and similar studies is that the probability of congestion is often one or more orders of magnitude larger than for Poisson distributed traffic under similar assumptions of mean load.

And this brings me back to the broad concept of a congestion policy and how paging systems differ from other forms of data network. If we wrap our minds around the notion that any system given Paretian load will experience congestion at a significant rate from time to time, and especially under situations of emergency (hurricanes, floods, fire, terrorism, and et cetera), should a carrier that has accepted messages for delivery throw them in the bit bucket or retry them. In paging systems, especially in two-way paging, there is a dogged policy of message retry until delivery is confirmed. Under conditions of network congestion for any reason, messages are retained in input terminals and linked to the subscriber's account. Various agents within the network accept messages for delivery to the subscriber and either return success or failure to the account. Undelivered messages are retried by policy on both time-driven and event-driven bases.

One aspect of message delivery retry procedures that is of high importance in wireless systems is their impact on battery life at the mobile. TCP can make very strong demands on mobile battery because of its demands on end-to-end packet transmission and acknowledgement; that is, in a typical Internet environment, the onus for message confirmation lies with the sender. The intermediate network elements are free to discard packets due to congestion (or whatever) and push the responsibility for delivery back to the end points. In contrast, in a paging network, once a message is accepted, the sender is free to disconnect. In this connectionless model, the network assumes responsibility for message delivery and also assumes a certain responsibility for the efficiency of delivery; that is, not to arbitrarily consume the battery life of either sender or recipient in order to achieve delivery.

I'll continue with further thoughts on message delivery policy in the face of congestion issues, et al, in subsequent posts. Suffice it to say that I believe that any network that purports to provide service for first responders, emergency situations, or any critical response, must be designed around certain positive principles that do not just dismiss network congestion issues but adapt to them dynamically.