So, the set is equivalent to the operation; or said in plain language, the noun is the verb and the verb is the noun. In English, we could say that a given number, say, “5”, was a count or a counting, a sum or a summing up, an addition or an adding up. Likewise, we can count down: “10”, “9”, “8”, “7”, “6”, “5”, “4”, “3”, “2”, “1”, “0”, “Liftoff, we have liftoff!”, which is, of course, subtraction, the way I’ve written it down here. In practice with rocket ship liftoffs, it would actually be addition because time before liftoff is negative. The ground crew at the launch site would start that countdown sequence by saying something like “It’s t minus 12 seconds and counting, minus 11, 10, …” and drop the “minus” along the way. The whole “countdown” is conceptually backwards because they’re actually counting up in the usual way. If liftoff is at $t = 0$, then events after liftoff, like booster separation, occur at positive time into the mission.

My point here is that inverting the sense of the operation, in this case, counting up to counting down, addition to subtraction, delivers a new set of numbers by extension: we get the negative natural numbers from the natural numbers by a simple inversion of the operation. Putting the whole set together yields the integers: we can count up *ad infinitum* from 0 and we can count down *ad infinitum* from 0 in the integers.

Given that we have a set of items that we have counted, say, 100 sheep, we can divide the set into subsets; for example, 60 black sheep and 40 white sheep. We might have 100 tosses of a fair coin and get 55 heads and 45 tails. We might have 2 dozen pieces of fruit and have 1 dozen apples and 1 dozen oranges. And so on. This gives us the notion of a ratio: 60 out of 100 sheep are black, 55 out of 100 tosses are heads, 1 out of 2 pieces of fruit are apples…. It is common notation to write this down as “60/100”, “55/100”, “1/2”.

Further, we can reduce these ratios to more basic ratios. If 60 out of 100 sheep are black, then we can create 10 equivalent subsets of 10 sheep each with 6 black and 4 white sheep. We can also create 20 equivalent subsets of 5 sheep in which we have 3 black and 2 white sheep. This practical operation is intellectually equivalent to the abstract operation of finding an irreducible fraction representation for the overall set and its subsets.
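Reducing a ratio to its irreducible form is just a matter of dividing out the greatest common divisor. Here is a minimal sketch of that operation (mine, not from the original discussion; Python for concreteness):

```python
from math import gcd

def reduce_ratio(a, b):
    """Reduce a ratio a:b to its irreducible form by dividing out the gcd."""
    g = gcd(a, b)
    return a // g, b // g

# 60 black out of 100 sheep reduces to 3 out of 5:
# 20 subsets of 5 sheep, each with 3 black and 2 white.
print(reduce_ratio(60, 100))  # (3, 5)
```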

This operation of dividing a set into subsets is just the inverse of creating a larger set by creating new copies of an initial set. We could start with a set of 5 elements: 3 black and 2 white sheep, and count up 20 copies of this. Or, we can begin with a set of 100 sheep of which 60 are black and 40 are white, and divide it into 20 equivalent subsets. Either way. Same thing. Division and multiplication.

To be somewhat more specific, the form of division that I started with is *partitioning*. In partitioning, we begin with a set of some specific size, *a*, and partition it into some number, *b*, of subsets of equal size. The other form of division is called quotative; we begin with a set of some specific size, *a*, and form subgroups of a smaller size, *c*: the number of subgroups, *b*, is called the quotient of *a* and *c*. A set does not have to be partitioned into subsets of equal size, but this is the model that yields division. In quotative division, it is possible to have a remainder; that is, we may divide *a* into *b* subsets of size *c*, and have a non-zero remainder subset of size less than *c* “left over”.
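The two forms of division can be sketched directly; `partitive` and `quotative` are my own illustrative names, and Python’s `divmod` conveniently returns the quotient together with the “left over” remainder:

```python
def partitive(a, b):
    """Partition a set of size a into b equal subsets; return the subset size."""
    return a // b

def quotative(a, c):
    """Form subsets of size c from a set of size a; return (quotient, remainder)."""
    return divmod(a, c)

print(partitive(100, 20))   # 5: each of 20 subsets has 5 sheep
print(quotative(17, 5))     # (3, 2): three subsets of 5, with 2 left over
```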

Of course, we can multiply with natural numbers and integers. In that case, we treat all the elements as equivalent: sheep, coin faces, fruit, whatever: either 2 copies of 3 things gives 6 things or 3 copies of 2 things gives 6 things.

This concept of nouns and verbs together yields algebras. For example, a very basic form of algebraic structure is called a *magma*, denoted $(M, \cdot)$. A magma has a single binary operation, “$\cdot$”, such that for any two elements of $M$, $a, b \in M$, then also $a \cdot b \in M$. This single property of a magma under the binary operation “$\cdot$” is called *closure*. But I have argued here that there may be an even more primitive unary version of the binary operator “$\cdot$”, call it “$+$”, which is a generator of $M$, such that every element, $m \in M$, may be created by successive applications of “$+$” on some single initiating null element that may or may not be in $M$.

This is rather like the ladder operators of quantum mechanics or the annihilation and creation operators of quantum field theory. Ladder operators move a quantum system up and down the spectrum of eigenvalues, and hence, up and down the spectrum of discrete energy or momentum values. Annihilation and creation operators decrease or increase the number of particles, aka resonances, of the field. Creation is a fundamental operation on the field that increases the count of discrete resonances by 1. This immediately assumes that the resonances in a quantum field are countable, by definition; and likewise for the spectrum of eigenvalues of a quantum mechanical system.

Setting aside the question of whether or not every binary operation for any magma is associated with a more primitive unary operation of set generation, we can state that a magma is a “group-like” algebraic structure. If we add the associative property to a magma, we have a structure called a *semigroup*; that is, $\forall \, a, b, c \in S$, $(a \cdot b) \cdot c = a \cdot (b \cdot c)$. If the semigroup has an identity element “0” such that $a \cdot 0 = 0 \cdot a = a$, then we have a *monoid*. Finally, if we include an operational inverse in the monoid, we have a *group*; that is, if $\forall \, a \in G$, $\exists \, a^{-1} \in G$ such that $a \cdot a^{-1} = a^{-1} \cdot a = 0$. In English, we’d say that for all elements of the monoid, there is an inverse element such that the result of the group operation on the element and its inverse yields the identity element.

I have written some of the last paragraph with a loose assumption of a commutative property for the identity and the inverse. With somewhat more rigor, one can distinguish between a left identity and a right identity, and between a right inverse and a left inverse. If we accept that the operation is commutative at the level of the semigroup, we have a *semilattice*. And, if we maintain commutativity all the way to the group, then the group is *Abelian*.

There is another pathway between a magma and a group. If the magma supports division, then it is a *quasigroup*. If we add an identity element to the quasigroup, we get a *loop*. Finally, we add the associative property to the loop, and we again have a group. The first path goes magma -> *associativity* -> semigroup -> *identity* -> monoid -> *invertibility* -> group. The second path goes magma -> *divisibility* -> quasigroup -> *identity* -> loop -> *associativity* -> group. A quick look at the two paths suggests a relationship between divisibility and invertibility; and that is the case.
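The distinction can be probed mechanically on small finite sets. The sketch below (mine, not from the text) checks the divisibility property, namely that the Cayley table is a Latin square, and checks associativity; subtraction mod 5 turns out to be a quasigroup that is not a semigroup, while addition mod 5 satisfies both:

```python
def is_quasigroup(op, elems):
    # Divisibility: every row and column of the Cayley table is a permutation,
    # so a*x = b and y*a = b always have unique solutions.
    s = set(elems)
    rows = all({op(a, x) for x in elems} == s for a in elems)
    cols = all({op(y, a) for y in elems} == s for a in elems)
    return rows and cols

def is_associative(op, elems):
    return all(op(op(a, b), c) == op(a, op(b, c))
               for a in elems for b in elems for c in elems)

Z5 = range(5)
add = lambda a, b: (a + b) % 5
sub = lambda a, b: (a - b) % 5
print(is_quasigroup(sub, Z5), is_associative(sub, Z5))  # True False
print(is_quasigroup(add, Z5), is_associative(add, Z5))  # True True
```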

A quasigroup is a set, $Q$, with a binary operation “$\cdot$”, such that

$\forall \, a, b \in Q, \;\; \exists \, x, y \in Q : \; a \cdot x = b \;\;\mathrm{and}\;\; y \cdot a = b$

which can be read as: for any two elements of $Q$, call them $a$ and $b$, there are two other elements, call them $x$ and $y$, such that $a \cdot x = b$ and $y \cdot a = b$. Commutativity is not assumed, and we see both left and right operations. If commutation did hold, then $x = y$; however, do not assume that as yet.

Since $x$ and $y$ exist, by the definition of the quasigroup, we can introduce a notation for them: $x = a \backslash b$ and $y = b / a$. These are left and right division, respectively. This begins to give us the concept of an identity. We can define a left multiplication operator, $L_a$, as

$L_a(x) = a \cdot x$

and a right multiplication operator, $R_a$, as

$R_a(x) = x \cdot a$

And since we are assured of division in the quasigroup, we can also define inverse operators $L_a^{-1}$ and $R_a^{-1}$ as

$L_a^{-1}(x) = a \backslash x$

and

$R_a^{-1}(x) = x / a$

Hence, $L_a^{-1}(L_a(x)) = x$. These operations correspond, for example, to $a \backslash (a \cdot x) = x$. So, left multiplication and division by any quasigroup element define an identity, and so on.

Here we see that the definition of an identity element in the algebraic structure is equivalent to the existence of an inverse operation. In other words, the introduction of the inverse operation is logically equivalent to the existence of an identity element.
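A concrete, non-commutative example is subtraction mod 7, which forms a quasigroup. The sketch below (my own illustration) defines the left multiplication operator $L_a$ and left division, and confirms that one undoes the other:

```python
# Quasigroup (Z_7, -): the operation is a . x = a - x (mod 7).
n = 7
L = lambda a, x: (a - x) % n        # left multiplication L_a(x) = a . x
Ldiv = lambda a, b: (a - b) % n     # left division a \ b solves a . x = b

a = 3
# L_a^{-1}(L_a(x)) = x for every x: division inverts multiplication.
assert all(Ldiv(a, L(a, x)) == x for x in range(n))
print("left division inverts left multiplication for a =", a)
```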

Anyway, back to the rational numbers… Writing them in the form “$p/q$”, with $p$ and $q$ integers, suggests that all the rational numbers could be laid out in the form of a table, as follows:

This table has the numerators of each fraction in the rows and the denominators in the columns. Equivalence classes of fractions are indicated by color; for example, 1/1 = 2/2 = 3/3 and so on. The arrows show a counting plan that demonstrates that the rational numbers are countable. The plan weaves diagonally through the table and moves directly along the first row and column when it gets to them. Since the elements of the table are countable, it is apparent that a natural number can be associated with each entry in the table according to a simple rule. In this way, it can be seen, however unexpected, that there are no more rational numbers than there are natural numbers.
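The weaving counting plan can be written out directly. This sketch (mine; it skips reducible fractions rather than counting every member of each equivalence class) walks the anti-diagonals of the table and yields each positive rational exactly once:

```python
from math import gcd

def rationals():
    """Walk the numerator/denominator table along anti-diagonals,
    yielding each positive rational once (skipping reducible duplicates)."""
    d = 2
    while True:
        for p in range(1, d):
            q = d - p
            if gcd(p, q) == 1:
                yield (p, q)
        d += 1

gen = rationals()
print([next(gen) for _ in range(6)])
# [(1, 1), (1, 2), (2, 1), (1, 3), (3, 1), (1, 4)]
```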

One could extend this table in a four-fold manner by allowing the numerators and denominators to be negative. The table shown above would then be the upper right (+,+) quadrant, in the usual convention for a Cartesian coordinate system. This would introduce another set of equivalence classes; for example, the (+,+) “1” class (that is, 1/1, 2/2, 3/3, …) would also be equivalent to the (-,-) “1” class (-1/-1, -2/-2, -3/-3,…). Doing this would not impact the countability of the entire set in any way; we could simply count each of the four corresponding elements in each of the four quadrants at each of the steps shown in the table above. The entire set is as countable as before.

Hence, a rational number is an ordered pair of integers. Each ordered pair has a magnitude and a sign. Ordered pairs that have the same magnitude and sign are equivalent. Of course, the magnitude of the rational number is not equivalent to the magnitude of the 2-vector in the Cartesian plane that would be associated with the four-fold table just described. For example, $(1, 2)$ would have a very different magnitude than $(100, 200)$, yet the magnitudes of $1/2$ and $100/200$ would be identical. These are different concepts. What should stand apparent so far is the extent to which the rationals are composite constructs. Of course, the natural numbers are composite constructs as well; and it should come as no surprise that more complex sets of numbers are even more composite. The natural numbers are essentially repetitions of some basic operation; namely, “++++” or “1 + 1 + 1 + 1” or “0 + 1 + 1 + 1 + 1”. To call this natural number “4” simply obfuscates its origin. Likewise, to make it seem in any way mysterious that “3 + 1 = 1 + 3 = 2 + 2 = 4” is an equal obfuscation. Once we introduce an inverse operation, “-”, then every natural number becomes a member of an equivalence class; for example, “++++ = +-+-++++ = +++--+++ = …” *ad infinitum*.

In terms of the algebraic structures discussed above, the rationals are certainly a group under addition; since they commute, they are an Abelian group. The non-zero rationals are also an Abelian group under multiplication. This brings us to the concept of a ring, an Abelian group under addition in which multiplication is associative, and also distributive over addition; that is,

$a \times (b + c) = (a \times b) + (a \times c)$

and

$(a + b) \times c = (a \times c) + (b \times c)$
The idea of a ring leads on to the idea of a field, which is a ring in which all the non-zero elements have a multiplicative inverse. Another way to consider the concept of a field is as a ring whose non-zero elements are an Abelian group under multiplication, as just mentioned above. Hence, the rational numbers are also a ring and a field.
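Python’s `fractions.Fraction` type models the rationals exactly, so the ring and field properties can be spot-checked directly; the particular values here are arbitrary:

```python
from fractions import Fraction

a, b, c = Fraction(3, 4), Fraction(-2, 5), Fraction(7, 6)

# Ring: multiplication distributes over addition, on both sides.
assert a * (b + c) == a * b + a * c
assert (a + b) * c == a * c + b * c

# Field: every non-zero element has a multiplicative inverse.
assert b * (1 / b) == 1
print("the rationals behave as a field")
```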

But the main point of this post is that, yet again, we encounter a set of numbers that are composite; the rational numbers are composed of two integers through the operation of forming a ratio. Since the integers themselves are composite, the rational numbers are doubly composite. Perhaps the only number that isn’t composite is $0$, which is arguably no number at all.

In the early days of the demon, the notion was that he could decide which molecules had greater or lesser energy, and with this knowledge, decrease the entropy of the system by increasing the temperature of one side of the container and decreasing that of the other. By doing this without any expenditure of work, he violates the second law of thermodynamics. This operation is shown in the first few images in the gallery below. In the next-to-last images, the demon is shown unmixing a mixture of gas molecules, and again, decreasing the entropy of his system without doing work. Finally, in a cycle due to Szilard, the problem is reduced to a single molecule in a box. The demon first detects which half of the box contains the molecule. He then introduces a diaphragm that separates the box into two halves. Knowing which half contains the molecule, the demon cleverly attaches a mechanism to the diaphragm so that it will lift a weight by virtue of the pressure against the diaphragm exerted by the molecule with its random kinetic energy at temperature, T. Work is done by the molecule. The diaphragm is then removed and the cycle begins again.

In the context of my last post on numbers, it should be immediately apparent that any of these cases involve counting. This is especially apparent in the model due to Szilard. Over the history of the analysis of the demon, it became apparent that this simple counting problem faced by the demon involved an amount of information whose associated energy is $k_B T \ln 2$, where $k_B$ is the Boltzmann constant and $T$ is the temperature; in other words, 1 bit.

From the point of view of classical physics, the problem in the Szilard model can be stated as either “the molecule is in the right half of the box” or “the molecule is in the left half of the box”. Without a measurement, there is a symmetry to the situation, because the truth of either proposition is unknown. The act of measurement breaks the symmetry by defining a truth value for these propositions: one is true, the other false. This symmetry breaking is not spontaneous as such; it is the consequence of an act by the demon. The result of the measurement is also a count: either “1” on the right and “0” on the left, or vice versa. Whatever the count is, one bit of information is obtained by making the count.

Furthermore, knowledge of the count allows work to be done. The acquisition of information yields work and hence energy. In the early analysis of this model, it was postulated that the balance between the work done per cycle and the entropy increase originated in the process of acquiring the 1 bit of information. However, a more complete study of the problem, in the context of digital computation, revealed that the detailed balance between the work per cycle and the entropy increase of the system is actually achieved in the erasure of the information obtained at the measurement stage. In other words, if we were to construct a digital system to execute Szilard’s model cyclically, we would have to include a 1-bit memory storage element in our digital demon. At the end of the cycle, in preparation for the measurement step of the following cycle, we have to erase the information of the previous cycle. At the time this erasure is done, one bit of information is lost and entropy, equal to the work done, increases in detailed balance.
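This erasure cost is the Landauer limit, $k_B T \ln 2$ per bit. A quick back-of-the-envelope computation at room temperature (my own sketch):

```python
from math import log

k_B = 1.380649e-23       # Boltzmann constant, J/K
T = 300.0                # room temperature, K

# Minimum energy dissipated in erasing one bit of information.
energy_per_bit = k_B * T * log(2)
print(f"{energy_per_bit:.3e} J")   # roughly 2.87e-21 J
```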

We do not have to erase that bit of information; we could store a history of the experiment. We could transmit a copy to a colleague. In this way, we could delay the increase in entropy, perhaps indefinitely. The work done is balanced by the information stored.

This model is a classical one. It involves the notion that the position and momentum of the molecule can both be estimated simultaneously in such a manner as to achieve work. In quantum mechanics, this becomes somewhat more difficult. The act of localizing the molecule to one-half of the box allows the uncertainty in the momentum to increase such that trapping it on the side where it was found becomes just about impossible. W H Zurek analyzed this problem and published a result, “Maxwell’s demon, Szilard’s engine and quantum measurements” *Frontiers of Nonequilibrium Statistical Physics*, ed. G T Moore and M O Scully (Plenum Press, New York, 1984, pp 151-61). Zurek’s analysis proceeds by accounting for the quantum states of the demon. He finds, in a manner consistent with what I mentioned earlier, that whatever the details of the quantum mechanical system under consideration, the demon must be reset from a final, entangled, measurement state back to a “ready to measure” state. This reset of the demon, again, removes one bit of information from the system.

In short, counting gathers information; preparing a new count loses information. I have been focused on the case of counting a single item. What about counting up to 2 or 3? This question can be answered in terms of a successor function. We simply ask, “is there another one?” We can start at “0”, and ask, is there another one? We may count “1”. We then ask, is there another one? We may count “2”. We then ask, is there another one? We may count “3”, and so on. We are done when the answer to our question is finally “no”.

This raises an interesting question about the amount of information obtained at each of these recursive steps. Is it 1 bit per step? The answer, of course, is given by the negative logarithm of the probability of the count at each step. In the Szilard model, the problem is such that there is only one molecule, and by construction, the answer to the count on the right or left side is equally probable. If we were, instead, counting all of the sheep on the Earth, getting the first 100 sheep counted is no news whatsoever. The count does not get interesting until we have a few hundred million on the books. (According to Wikipedia, the UN FAO estimates something over 1 billion sheep. There is some *a priori* information to prime your Bayesian thinking with.) In short, the amount of information depends on what we are counting. If it were decimal numeric representations for the natural numbers, there is no information in counting the next one at all, no matter how large the number is. It is guaranteed that there is a representation for the next number.
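The dependence on probability is just the surprisal, $-\log_2 p$. A small sketch (mine) for the cases mentioned:

```python
from math import log2

def surprisal(p):
    """Information, in bits, obtained from an event of probability p."""
    return 0.0 if p == 1 else -log2(p)

print(surprisal(0.5))    # 1.0 bit: the Szilard molecule, left or right
print(surprisal(1.0))    # 0.0 bits: the next numeral is guaranteed to exist
print(surprisal(1/8))    # 3.0 bits: rarer observations are more informative
```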

It might be worth reviewing our units. From thermodynamics, we have an equation of the form

$T \, dS = p \, dV$

where $dS$ is a differential change in entropy, $p$ is pressure, and $dV$ is a differential change in volume. In the present context, we are equating a certain amount of work to a change in entropy of $k_B \ln 2$ at a given temperature, $T$. Our unit of information as 1 bit comes from the use of base-2 logarithms instead of the natural logarithms commonly used in the equations of physics like this. It would be a trivial matter to re-express Boltzmann’s constant in a way consistent with base-2 logarithms, or equally, to divide all of our units through by the constant and express energy in terms of bit-degrees. That is, the expression above shows us that the units of energy are equivalent to degrees (in units of $k_B$) times bits. Dividing by $c^2$ would yield units of mass in terms of bits. Interesting? Conversely, one might measure information in terms of Joules per degree instead of bits, multiplying through by $k_B \ln 2$.

Back to counting and state… In statistical thermodynamics, one generally computes a partition function, $Z$, also known as a “sum over states”, that constitutes the total number of particles in all possible states of the system. Estimates of the probability of the occupancy of a particular state or group of states are simply obtained by dividing the occupancy number of the state or group of states by the partition function, $Z$. To mention that this is a counting operation and that it goes directly to an estimate of entropy is virtually trivial. Like a Maxwell demon, we may sample the energy of specific molecules in a gas at some temperature. That we obtain less information in finding one near the mean energy than in finding one at three times the mean goes directly to the expectation of the logarithm of the corresponding probabilities; that is, the information.
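A sum over states can be sketched in a few lines; the energies and temperature below are arbitrary illustrative values in units where $k_B = 1$:

```python
from math import exp

def boltzmann_probs(energies, kT):
    """Occupancy probabilities from a sum over states (partition function)."""
    weights = [exp(-E / kT) for E in energies]
    Z = sum(weights)
    return [w / Z for w in weights]

probs = boltzmann_probs([0.0, 1.0, 2.0], kT=1.0)
print(probs)        # the ground state is the most probable
print(sum(probs))   # the probabilities sum to 1
```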

Here is a simple exercise in Planck units. The Planck mass is $2.18 \times 10^{-8}$ kg, which is equivalent to a Planck energy of $1.96 \times 10^{9}$ Joules. The Planck temperature is a whopping $1.42 \times 10^{32}$ Kelvin. That would give us a Planck entropy of $1.38 \times 10^{-23}$ Joules per Kelvin. But this is exactly Boltzmann’s constant. The Planck entropy is precisely 1 in natural units.
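The arithmetic is easy to check; this sketch (mine) uses CODATA-style values for the constants:

```python
m_P = 2.176434e-8       # Planck mass, kg
c = 2.99792458e8        # speed of light, m/s
T_P = 1.416784e32       # Planck temperature, K
k_B = 1.380649e-23      # Boltzmann constant, J/K

E_P = m_P * c**2        # Planck energy, about 1.96e9 J
S_P = E_P / T_P         # "Planck entropy", energy over temperature
print(S_P / k_B)        # about 1.0: the Planck entropy is Boltzmann's constant
```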

Fascinating…

Anyway, between spending nearly a month in the hospital, and then being signed up for physiotherapy on my leg a couple of times a week, I didn’t find another summer job that year. So, instead of lying around completely idle, I decided to pick up some books from the university library to get ready for my next courses. Since I was taking some logic classes, I started off with Russell & Whitehead’s Principia Mathematica. Among other things, I followed that up with Wittgenstein’s Tractatus Logico-Philosophicus. A little light summer reading. I’d sit in the back yard and let the sun bake my purple knee back to health, per the doctor’s orders, sipping on lemonade and boning up on logic.

The aim of the Principia was to derive all of mathematics from pure logic. I believe that the current view is that it was a massive failure. Kurt Gödel was especially critical of the notation, the lack of formalism, and errors. Eventually, Gödel published his incompleteness theorems, which pretty much demolished the axiomatic approach that Russell & Whitehead had attempted. For his part, Wittgenstein first criticized the Principia for how it handled infinities (infinite sets or lists), and later criticized a revised version (which had been modified to handle his criticisms) for not being able to handle large numbers effectively.

Principia derived the notion of number from that of set theory. In spite of being attacked by the likes of Gödel and Wittgenstein, this set-theoretical approach was what they taught me in high school mathematics. In Principia, a set is simply a by-product of a true statement of the predicate calculus. For example, I might assert that the statement “there exists at least one $x$ such that the color of $x$ is blue” is true. For the Principia, this tautologically implies that there is a set of blue objects, call it $B$, and $x \in B$, meaning that $x$ is a member of the set $B$.

Principia derived a set “1” in this manner. Here is how Principia derives the arithmetic for “1 + 1 = 2”:

$\ast 54.43 \colon\; \vdash \colon \alpha, \beta \in 1 \; . \supset \colon \alpha \cap \beta = \Lambda \; . \equiv . \; \alpha \cup \beta \in 2$

You can read the statement as “it is true that for $\alpha$ and $\beta$ members of the set “1”, if the intersection of $\alpha$ and $\beta$ is the empty set (the symbol $\Lambda$ sort of looks like a capital lambda, but isn’t), then the set union of $\alpha$ and $\beta$ is a member of the set “2”.

If you think for a moment about the cardinality of a set, which is basically the number of its members, you are already using natural numbers to count. So that concept of set theory presupposes a set of natural numbers and a counting operation. In Principia, the set “1” is defined, more or less, from the predicate calculus proposition “there exists $x$ such that $x$ is one thing”, which is a bit like the ontological proof of the Deity in which existence is assumed to behave like an attribute. If we think of a set as being like a bucket that contains objects, there is a strong distinction between saying ‘the bucket is empty’ versus saying ‘the bucket does not exist’. The set “1” in Principia might have an infinite number of members. Likewise, does the set “0” have an infinite number of members, that is, is it a bucket full of empty containers, or is it something that doesn’t exist? Go back to the theorem from Principia that is reproduced above and you’ll see that there is a condition on $\alpha$ and $\beta$ that their intersection is the null set. Clearly, the authors want to make sure they’re going to get a set of cardinality “2” in the union. The more you look at it, the more it seems to fall apart.

There is a view, stemming from Russell’s paradox, that this approach to identifying sets with definable collections (naïve set theory) is the heart of the problem. It is possible to define all kinds of self-contradictory properties, such as lists of all lists that do not include themselves that should include themselves, and so on. The fix comes in what is called Zermelo-Fraenkel set theory with the axiom of choice (aka ZFC). In ZFC, there is a recursive definition of the natural numbers.

One begins with the empty set, {}, which presumably is a bucket with nothing in it. We then, recursively, put something in the bucket: $S(n) = \{n\}$. Hence “1” = {{}}, “2” = {{{}}}, and so on. The underlying idea is that of an initial condition and a successor. Whether the initial condition is “0” or “1” is a matter of dispute in the context of natural numbers; but that “2” is the successor of “1” is a matter of no dispute. The most natural and obvious criticism of this proposal is that it is just counting in the context of the symbols of set theory. To quote the author(s) of the Wikipedia article on set theory and number:

- Zero is defined to be the number of things satisfying a condition which is satisfied in no case. It is not clear that a great deal of progress has been made.
- It would be quite a challenge to enumerate the instances where Russell (or anyone else reading the definition out loud) refers to “an object” or “the class”, phrases which are incomprehensible if one does not know that the speaker is speaking of one thing and one thing only.
- The use of the concept of a relation, of any sort, presupposes the concept of two. For the idea of a relation is incomprehensible without the idea of two terms; that they must be two and only two.
- Wittgenstein’s “frills-tacked on comment”. It is not at all clear how one would interpret the definitions at hand if one could not count.
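Setting the critiques aside, the recursive construction itself is easy to render in code. This sketch (mine) uses Python frozensets for the Zermelo-style numerals; note that the `depth` function recovering a numeral’s value is itself just counting, which is exactly the circularity the critics point at:

```python
# Zermelo-style numerals: "0" = {} and successor(n) = {n}.
zero = frozenset()
succ = lambda n: frozenset([n])

def depth(s):
    """Recover the count by asking, recursively: is there another bucket?"""
    return 0 if not s else 1 + depth(next(iter(s)))

two = succ(succ(zero))   # {{{}}}, i.e. "2"
print(depth(two))        # 2
```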

To my humble point of view, this notion of “0” and “1”, necessary to get the set-theoretic concept of number and counting going, is very closely related to the notion of state in classical physics. In classical physics, a system can be in one state only. A paradigmatic example might be a coin on a table-top that can be in either the “heads” state or the “tails” state. Now, in more modern physics, certain conditions of symmetry are allowed; for example, the condition in which the coin is tossed and still flying through the air has a symmetry between “heads” and “tails”. But this condition of symmetry requires a certain amount of energy, on the one hand, and is unstable, on the other. Sooner or later, the coin falls, the symmetry is broken, and one or the other condition arises. If we were tallying up “heads” and “tails” in an experiment, this is when we would add “1” to the appropriate column in our tally sheet.

“State” and “count” or “measure” are in many ways synonymous. We might count sheep, let’s say, or black sheep, or sheep in the west pasture or black sheep in the west pasture. But to count any of these kinds of sheep, we have to be able to either find them, or at a minimum, imagine finding them. I do not mean in the pastime of counting sheep to fall asleep, rather, I mean estimating the size of a large population where obtaining a complete count is impractical. In this, I am trying to draw a distinction between, say, counting the three sheep immediately in front of me versus counting all of the sheep in the universe. Even counting all of the sheep on the Earth is a problem fraught with potential error. I don’t want to count dead sheep or unborn sheep. But if the radius of the Earth is around 6400 km, a signal from the far side has to travel at least $6400\pi$ km, or roughly 20,000 km, to reach me. A radio signal to give me a count of sheep from there will take almost 1/10 of a second to reach me, and that ignores any signaling protocols to reduce errors and guarantee message arrival and accuracy. And some sheep will die and some will be born in that amount of time. So, the state of being a live sheep is not a stable one for this count. It is rather like the symmetric condition of the coin in the air: it can and will fall and become “heads” or “tails”, but I can pick it up again and toss it another time. So it goes with the sheep: the unborn one is rather like the coin still in the air, it could become a live sheep any time now. The live sheep is rather like the coin on the table: it can be picked up any time now and tossed again too.

This brings us to the kind of state transitions that classical physics allows. We cannot have ambiguous state transitions; that is, we cannot allow a sequence in which one state might transition into two possible outcome states. And by time reversal, we cannot have the situation in which two possible preceding states both transition into one following state. Paths of states must be clearly delineated. Classical physics does not allow for this idea of symmetry breaking. For the coin toss, the classical physicist would have to argue that, in principle, the final condition of the coin is predictable from the exact details of the toss. This could be true, but there are so many tunable parameters and so many errors in measurement that, for any toss, getting the answer right is (a) difficult and (b) pointless. Having a good model for a coin toss comes down to having a random number generator that yields two distinct states with equal probability.

One of my philosophy tutors from back in the day took this business of considering state into the example of a cat. We all know what a cat is, don’t we? It’s rather like my example of counting up sheep. But where is the boundary of the cat? What is part of the cat? What about the air it’s breathing? Is that part of the cat when the air flows into its lungs? Or when the oxygen gets into its blood-stream? When precisely does the air become part of the cat, if ever? What about the air mixed up in all of its hair? There is a lot of hair! Those air molecules might have been trapped in there for years now; they may have a better claim at being part of the cat than the air it’s breathing, which comes and goes. There are many questions about what’s part of the cat. Or part of us. Almost every molecule in our bodies is replaced every 7 years or so. Our energy store is even more limited, which is why we’ll starve in a relatively short time. And yet we maintain a notion of self decade after decade. My memories of seeing the skin peel off my knee are still “fresh” as it were, even though, apparently, whatever material part of me that information is encoded in has changed many times over. “I” remain continuous: a single thing: a “1”. There is just one of me.

This question of state in classical physics is a big problem in the transition to quantum mechanics and quantum field theory. The notion of state in quantum theory is very different than in classical physics. In classical physics, a system is in one and only one possible state. In quantum physics, a system is in a linear superposition of possible states, each of which has some probability, at least until a measurement is made that forces a specific condition. To give an example, say we have an electron which can have its spin up or down. If I take an electron and measure its spin, I might get up or I might get down. This seems rather like the state of the coin: I toss it and get heads or tails. Like the coin, if I measure the electron’s spin again, after measuring up, I’ll continue to get up. However, suppose I now check to see whether its spin is to the right or left. I will either get right or left; it’s a 50:50 deal, just like the previous test for up or down. If I measure right or left again, I’ll get whatever I got the first time. My measurement has frozen in the state I first found. But now, let’s say I go back and try to measure if the electron’s spin is up or down again. Now, it is once more a 50:50 shot as to whether I get up or down. My measurement of spin in the right-left axis has unfrozen the spin in the up-down axis. Similar tricks apply to photon polarization states.
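This sequence of spin measurements can be mimicked with a toy model (my own sketch, not a real quantum simulation): after a measurement, the spin is definite along the measured axis and completely undetermined along the perpendicular one:

```python
import random

class Spin:
    """Toy model of sequential spin measurements on one electron."""
    def __init__(self):
        self.axis, self.value = None, None

    def measure(self, axis):
        # Measuring a perpendicular axis unfreezes the old result:
        # the outcome is a fresh 50:50 draw, which is then frozen in turn.
        if axis != self.axis:
            self.axis = axis
            self.value = random.choice([+1, -1])
        return self.value

e = Spin()
first = e.measure("z")
assert e.measure("z") == first   # repeating the same axis agrees
e.measure("x")                   # a right-or-left measurement intervenes
e.measure("z")                   # up-or-down is once more a 50:50 shot
```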

The main point is that in quantum mechanics, the state of a quantum system is in a kind of limbo until a measurement is done. More elegantly put, perhaps, the state of the system is indeterminate until it is entangled with the state of a measurement apparatus. Typically, the state of a quantum system is given by a countable set of eigenvalues that correspond to a set of eigenvectors. The eigenvalues represent energy states and the eigenvectors represent the wave function for the system in the corresponding energy condition.

This is, arguably, a question that goes straight to the heart of what constitutes a proper concept of numbering in quantum theory versus that in classical mechanics. If integers, or reals, or vectors, or tensors, are the proper numbers for classical mechanics, then what is the right number for quantum mechanics?

Some very intelligent people have argued that number theory should properly precede set theory, and any attempt to derive numbers from sets has put the cart before the horse. I am rather inclined to agree. However, I am also struck by the strong relationship between the concept of state, classically, and the concept of number. This ties in nicely with the “naïve” set theory model that a set is defined by some statement of the predicate calculus. Assuming that a statement of the predicate calculus has a truth value, the classical mechanics notion of state tells us something very similar about a physical system; for example, it is true that the energy of the system is 10 Joules or that the speed is 299,792,458 meters per second. Classical mechanics is about true statements concerning reality. Quantum mechanics stumbles with this notion: is it true that the electron spin is up or down? No, it depends: it always depends. And often enough, the answer is that the spin is both up and down. The law of the excluded middle completely fails in quantum mechanics. It is not that we do not know what the spin is: it is that the spin is both up and down equally. To bother at all with the truth value of the electron spin being up or down misses the point. We need a way to express the equality of the truth values of both up/down, right/left, back/front. The truth is all of the options, or as a physicist would say, the spectrum. Is the entire spectrum a number? More of this later…

Meanwhile, let’s begin with “0” and “1” (or {} & {{}} ) and see where we get.

In my last post, I was going over the three kinds of space that could be broadly consistent with a homogeneous distribution of matter; this leading to a criterion of uniform curvature. To review briefly, this yields either a universe of positive energy and negative curvature in the form of a 3-hyperbola or a universe of zero energy and zero curvature in the form of a Euclidean 3-space or a universe of negative total energy and positive curvature in the form of a 3-sphere.

In this post, I want to introduce the metrics for these spaces and to show how these metrics would impact the measurement of angles subtended by distant objects. Along the way, we’ll see a few other interesting features.

But first, I’m going to use polar or spherical coordinates for this procedure. There is a simple reason: it makes eminent sense to imagine ourselves at the center of our universe since that is our (non-Copernican) point of view. We look out from where we are placed along radial lines. If we can distinguish what we would see at some given distance along such a line, depending on the curvature of our universe, then this may be a way to tell what that curvature is. Also, I’m going to start with the metrics of a space and then work to the metrics of the three versions of space-time.

For a simple, flat Euclidean space in two dimensions, we get a metric of the form

$ds^2 = dx^2 + dy^2$

and of course, if we work in three dimensions, we’d have

$ds^2 = dx^2 + dy^2 + dz^2$

In polar coordinates, this metric becomes

$ds^2 = dr^2 + r^2\,d\theta^2$

where $\theta$ is the polar angle. This corresponds to the construction of a circle, aka a 1-sphere, at a distance $r$ from the origin. If I consider the metric on a unit 1-sphere, it is just $d\theta^2$. For reference, I want to define this metric as $ds_{S^1}^2$. I can rewrite this metric for Euclidean space as

$ds^2 = dr^2 + r^2\,ds_{S^1}^2$

In other words, what this is telling us is that as I look out into a Euclidean 2-D space I see circles of radius $r$ at a distance of $r$ away from me. Of course, this makes all the sense in the world, in a flat space. What else would the radius of those circles be, if not $r$? So, our flat 2-D space can be thought of as being constructed out of a densely nested set of concentric circles expanding out forever.

By extension, we can go to a Euclidean 3-space. Now, we will look out and, at a distance $r$, we have the surface of a 2-sphere of radius $r$ at each distance away from our view point at the origin of coordinates. To consider this construction, we first have to get the metric for the 2-sphere, and we are going to do this using the metric on the 1-sphere (aka the circle).

We made a flat Euclidean 2-space by filling it with circles of radius $r$ at each distance $r$. We get a 2-sphere, instead, by creating a finite space through a sequence of circles of radius $\sin r$ at a distance $r$. Frankly, this mathematical construct makes more sense if we either fix $\theta$ and only let $r$ run from $0$ to $\pi$, or fix $r$ and let $\theta$ run from $0$ to $2\pi$. The idea of the construct is simple enough though: as we look out on the surface of the 2-sphere at some distance $r$, we have a circle (1-sphere) of radius $\sin r$. The following picture may make this clearer:

[You can click on the thumbnail for a larger version.] For a world constrained to live on the surface of the 2-sphere, distance $r$ away from the origin, which I’ve chosen to be the “West Pole” as it were, is measured like a radian distance on the surface. I’ve shown the size of the maximum circle at a distance of $r = \pi/2$, and the radius of that 1-sphere is $\sin(\pi/2) = 1$. In this way, the notion of distance and parametric angle are rather merged together, subject to the constraint of a unit 2-sphere.

This construct is different from the case of the flat space, in which the nested 1-spheres have a radius that grows linearly with distance; instead, the nested 1-spheres grow almost linearly to begin with (since $\sin r \approx r$ for small $r$) but then they reach a maximum and finally begin to shrink back to $0$. If a 2-dimensional astronomer lived on such a world, he might confuse his world for a flat one, because locally there might be little difference if he could see out only over a small fraction of his world. This would be very much like our ancestors back in the Middle Ages who thought that the Earth was flat because they could only see an extremely small fraction of the entire surface.

With the idea that distance on the unit 2-sphere is equivalent to a radian angle, the metric on the 2-sphere is

$ds^2 = dr^2 + \sin^2 r\,ds_{S^1}^2$

Note the difference between this metric and that of the one for the Euclidean space; namely, the 1-sphere part is multiplied not by $r^2$, but by $\sin^2 r$. This metric can be called $ds_{S^2}^2$ in a notation consistent with labeling an n-sphere as $S^n$. We’ll use this construction, with $ds_{S^2}^2$ as the angular part, to develop metrics in 3-dimensional spaces soon enough.
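We can check the key feature of this metric numerically: summing up the angular part at fixed $r$ gives a circle of circumference $2\pi\sin r$, which grows, peaks at the equator, and shrinks again. A minimal sketch (the function name is mine):

```python
import math

# Numerically sum the angular part of the 2-sphere metric
# ds^2 = dr^2 + sin(r)^2 dtheta^2 around a full circle at fixed r.
# r is the radian distance from the "West Pole".
def circumference_at_distance(r, n=100000):
    dtheta = 2 * math.pi / n
    return sum(math.sin(r) * dtheta for _ in range(n))

print(circumference_at_distance(0.01))         # nearly flat: ~ 2*pi*0.01
print(circumference_at_distance(math.pi / 2))  # the equator, the largest circle: ~ 2*pi
print(circumference_at_distance(3.0))          # past the equator, circles shrink again
```

The integrand is constant at fixed $r$, so the sum is trivial, but it makes the geometry concrete: circles of latitude grow until $r = \pi/2$ and then contract.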

Holding this model, we can construct a hyperbolic plane simply enough. Just as we look out into a 2-sphere and see a nested sequence of 1-spheres that have a radius of $\sin r$ at a distance of $r$, for the hyperbolic plane, we look out and see a 1-sphere of radius $\sinh r$ at a distance of $r$. If we designate the unit hyperbolic plane as $H^2$ then its metric is just

$ds_{H^2}^2 = dr^2 + \sinh^2 r\,ds_{S^1}^2$

Again, a 2-dimensional astronomer on such a surface might not realize any difference between this and a Euclidean surface, since $\sinh r \approx r$ for small $r$. So if he could see only out to a small fraction of the total hyperbolic plane, he might think that it was flat.

Now, we can extend these concepts to 3-dimensional spaces that are either flat, spherical or hyperbolic. For the flat 3-dimensional space (call it $E^3$), we have

$ds_{E^3}^2 = dr^2 + r^2\,ds_{S^2}^2$

For the 3-sphere

$ds_{S^3}^2 = dr^2 + \sin^2 r\,ds_{S^2}^2$

And for the 3-hyperbola

$ds_{H^3}^2 = dr^2 + \sinh^2 r\,ds_{S^2}^2$
In each case, we look out to a distance $r$ and see objects at that distance on the surface of a 2-sphere. This is very much how our 3-dimensional world looks to us, in fact. We look out a few meters, and we see objects on a sphere of radius that many meters. We look out as far as our sun, and see the sun as if it were on a sphere of one astronomical unit in radius. So far, it seems as if our universe were flat. The question is, when we look out a distance of about 13.8 billion light-years, what are we seeing?

If our space were a 3-sphere, when we looked out a sufficient distance away, we would see a smaller set of 2-spheres. From a sufficiently great distance, we would, in effect, see the single point at the “East Pole” in every direction in the sky. Is this why we see the Cosmic Microwave Background Radiation everywhere? Yes and no… we have to add time into the picture to get to this view. So, hold your horses.

Likewise, if our space were a 3-sphere, as we looked off into the distance, we would see a decreasing number of galaxies becoming increasingly magnified in size, assuming that the density of stuff, including galaxies, were uniform. This is different from a flat 3-dimensional space with a uniform density of galaxies, in which, as we looked out to some great distance $r$, we would find a sphere of radius $r$ and count up a certain number of galaxies consistent with the overall density of them. But in a 3-sphere, we would eventually find some distance at which the number of galaxies began to shrink, and then approach $0$.
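To make this counting argument concrete: with a uniform density of galaxies, the expected count on a thin shell at distance $r$ scales with the area of the 2-sphere at that distance. A quick sketch, assuming a unit curvature radius (the function name is mine):

```python
import math

# Area of the 2-sphere at distance r in each unit-curvature geometry.
# With uniform galaxy density, the expected count on a thin shell at
# distance r scales with this area.
def shell_area(r, curvature):
    if curvature == 'flat':
        return 4 * math.pi * r ** 2
    if curvature == 'spherical':
        return 4 * math.pi * math.sin(r) ** 2
    return 4 * math.pi * math.sinh(r) ** 2  # hyperbolic

for r in (0.1, 1.0, 2.0, 3.0):
    print(r, [round(shell_area(r, c), 2) for c in ('flat', 'spherical', 'hyperbolic')])
```

In the spherical case the shell area (and thus the galaxy count) turns over and shrinks past $r = \pi/2$, while the hyperbolic shells grow much faster than the flat ones.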

Conversely, if our universe were a 3-hyperbola of some form, we would look off into the distance and see a vastly increasing number of galaxies; their number would grow at an exponential rate in a way that would apparently exceed their local density. This is because the size of each more distant 2-sphere is expanding exponentially, since $\sinh r \approx e^r/2$ for large $r$.

Of course, as 3-dimensional astronomers, we might be fooled into thinking that our universe was Euclidean if we could see only some small fraction of the total. How might that work? Well, imagine that our universe were expanding in time such that objects further than some distance had a velocity that exceeded that of light, so that they vanished behind a cosmic horizon. Imagine that this had been going on for some time so that the total radius of the universe were, say, $10\,R_H$, where $R_H$ is the distance out to that horizon. In this way, the total volume of the universe would be roughly 1000 times more than what was visible to us. Of course, that volume would depend on the shape of the universe, including what was invisible and behind the cosmic horizon; but you get the idea. Of course, this is just the situation we find ourselves in. $R_H$ is about 14 billion light-years and we can only extrapolate as to what is behind that horizon; just like a geographer in the Middle Ages would have had to extrapolate beyond his horizon on the Earth.

Let’s go back to the simplest possible view:

Here are our three candidate spaces as forms of a line. The flat line has no curvature and goes on forever. The circular line bends around in such a way as to “bite its tail” and form a finite set. The hyperbolic line goes on forever but has a bend; and someone moving on this line, in effect, finds that bend moving along with them. One is always, locally, at that bend. Look at this, think about this. Let it sink in.

Meanwhile, imagine that we could find distant objects, say galaxies, for which we knew at least an average size. Say they were of length $L$ across at some distance $r$. We can use our three metrics to consider what angle they might subtend in our sky, depending on the curvature of our space. Our metrics are constructed in such a way that they have a radial component and then an angular component; and for a distant object like a galaxy, the radial component is nil. The integral of $ds$ across the object is just $L$, and we can consider that equivalent to a small angle $\delta\theta$. For each of our metrics then, we get

$\delta\theta = L/r$

for flat space, or

$\delta\theta = L/\sin r$

for positively curved space, or

$\delta\theta = L/\sinh r$

for negatively curved space. So, for a flat space, the angle subtended is inversely proportional to the distance. For a positively curved space, the angle is inversely proportional to the sine of the distance; and for a negatively curved space, it is inversely proportional to the sinh of the distance. Since $\sin r$ is always less than $r$, the angles are larger than in flat space. When $\sin r$ begins to collapse back to $0$, we’d find very distant galaxies would begin to look very large indeed. Conversely, in a hyperbolic space, $\sinh r$ is always greater than $r$, so the angles are smaller than in flat space. Hence, if we could look far enough into cosmological distances, we could tell something about curvature in this way. So, we can count galaxies at some distance based on a notion of uniform density, and see differences. We can also measure the angles subtended, and find differences depending on the kind of space.
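A small numerical comparison of the three subtended-angle formulas, with a hypothetical object size $L$ given in units of the curvature radius:

```python
import math

# Angle subtended by an object of proper length L at distance r, in units
# of the curvature radius. L = 0.01 is a made-up size for illustration.
L = 0.01

def angle(r, curvature):
    if curvature == 'flat':
        return L / r
    if curvature == 'spherical':
        return L / math.sin(r)
    return L / math.sinh(r)  # hyperbolic

for r in (0.5, 1.5, 3.0):
    flat, sph, hyp = (angle(r, c) for c in ('flat', 'spherical', 'hyperbolic'))
    # spherical > flat > hyperbolic at every distance
    print(f"r={r}: flat={flat:.5f} spherical={sph:.5f} hyperbolic={hyp:.5f}")
```

At $r = 3$ (close to the antipode of a unit 3-sphere) the spherical angle has blown up by a factor of roughly 20 over the flat one, while the hyperbolic angle has shrunk by a factor of about 3.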

So far, we’ve been looking at metrics for unit n-spheres, etc. We can scale these to any size by adding in a scale factor $a$, where, consistent with our previous models of expanding or contracting universes based on the Friedmann equations, we’ll let the scale factor be a function of time, $a(t)$. This is easy to do by just multiplying the metric on the unit space by $a^2(t)$. We get for flat space ($E^3$)

$ds^2 = a^2(t)\left[dr^2 + r^2\,ds_{S^2}^2\right]$

For the 3-sphere ($S^3$)

$ds^2 = a^2(t)\left[dr^2 + \sin^2 r\,ds_{S^2}^2\right]$

And for the 3-hyperbola ($H^3$)

$ds^2 = a^2(t)\left[dr^2 + \sinh^2 r\,ds_{S^2}^2\right]$

This gives us time-dependent metrics for our three classes of space. As the scale factor is increased, the curvatures decrease for both the positively and negatively curved spaces; the curvature is inversely proportional to the square of the scale factor. On the grand scale of the entire universe, the curvatures become very small and very difficult to measure, especially if we factor in the notion that much of the universe is hidden behind a cosmic horizon beyond the Hubble distance.

It is now very straightforward to construct three metrics for three space-times based upon our three candidate spaces. They are just obtained by including a factor of $-c^2\,dt^2$ into $ds^2$. I’m going to work with units in which $c = 1$, so these space-time metrics are simply, for flat space ($E^3$)

$ds^2 = -dt^2 + a^2(t)\left[dr^2 + r^2\,ds_{S^2}^2\right]$

For the 3-sphere ($S^3$)

$ds^2 = -dt^2 + a^2(t)\left[dr^2 + \sin^2 r\,ds_{S^2}^2\right]$

And for the 3-hyperbola ($H^3$)

$ds^2 = -dt^2 + a^2(t)\left[dr^2 + \sinh^2 r\,ds_{S^2}^2\right]$

All nice and compact. These are the space-time geometries that are strong candidates for our own universe, assuming homogeneous distributions of matter and energy. We can work out, for example, the paths of light in these universes, since these paths would be null paths; that is, $ds^2 = 0$ for light.

Likewise, we can figure out the space-like distances between any two objects, such as galaxies, at some time. We could place our comoving cosmological lattice on any of these structures, and find that each will be perfectly consistent with the Hubble law. In short, everything I’ve worked out so far concerning expansion of the scale factor as a consequence of the presence of matter or radiation applies to each of the three models directly.

A note on my title: you have to pronounce the function $\sinh$ as “shine” as opposed to “sinch”. Get it? Sinhy spheres = shiny spheres. Sinhy means they’re expanding exponentially. Good pun, eh?

Next, I’ll go on and work with these metrics in the context of general relativity and what that says about cosmology.

There are various sorts of objects that have non-uniform curvature; for example, an American football or a rugby ball. These are more or less ellipsoids. The curvature is greater at the ends than in the middle. There are surfaces like the cone where the curvature is concentrated at the tip. There are surfaces like a torus where the curvature depends upon the path you follow over the surface.

However, none of these surfaces can apply to a universe that is homogeneous: the curvature would have to be the same everywhere (at least at a grand scale, as I say). This raises only three interesting alternatives: positive, zero, or negative curvature. Positive curvature corresponds to a closed and finite universe with a topology like a 3-sphere. This implies a negative total energy, in the context of the Friedmann equations that I’ve been working with in the last few posts. A zero curvature corresponds to ordinary, flat Euclidean space; and this would imply a null total energy. Finally, there is the case of negative curvature: this implies a hyperbolic and open universe with positive total energy.

What can we say about these different sorts of space? In a previous post, I went over the development of some flat and positively curved n-dimensional spaces. In that post, I neglected the development of any hyperbolic n-dimensional spaces. However, the model for a hyperbolic surface is mathematically very similar to that for a spherical one. The essence of the difference is easiest to see in the 2-dimensional case:

$x^2 + y^2 = 1$

is the unit 1-sphere or circle, while

$x^2 - y^2 = 1$

is the unit 1-hyperbola.

One very interesting way to look at these curves is in terms of conic sections. If you have Wolfram’s CDF player, you can see a couple of demonstrations of conic sections below. These will allow you to play with the double cone in terms of various sections through it in three dimensions to see circles, ellipses, parabolae, hyperbolae, lines and points. You will also be able to get a feel for the two dimensional results. It is well worth installing the CDF player to be able to see this for yourself.

[WolframCDF source=”http://face-paging.com/wp-content/uploads/2013/04/ConicSectionsTheDoubleCone.cdf” CDFwidth=”685″ CDFheight=”700″ altimage=”http://face-paging.com/wp-content/uploads/2013/04/ConicSectionsTheDoubleCone.cdf”]

[WolframCDF source=”http://face-paging.com/wp-content/uploads/2013/04/ConicSectionsPolarEquations.cdf” CDFwidth=”685″ CDFheight=”700″ altimage=”http://face-paging.com/wp-content/uploads/2013/04/ConicSectionsPolarEquations.cdf”]

One exceptional way to tell the difference between being in a space that is either flat, positively or negatively curved has to do with the angles subtended by distant objects. For example, in a flat space, an object of length $L$ at a distance $d$ is at the base of an isosceles triangle of height $d$ and base $L$. Knowing this, we can work out that the angle will be approximately $\theta \approx L/d$ when $L \ll d$. That’s in a flat Euclidean space.

However, angles and triangles don’t quite work out this way in curved spaces. Let’s approach this idea through the concept of a stereographic projection, which can map the surface of an n-dimensional object onto an (n-1)-dimensional plane. To keep things very simple, I’ll begin by mapping a 2-dimensional circle and hyperbola onto a line.

Begin with a hyperbola:

In this figure, you can almost see the conic-section quality of the hyperbola, as I have included the asymptotes that the hyperbola is, in effect, sliced out of. Now, imagine that we have a small line segment of length $L$ on the surface of the hyperbola right where it intersects the abscissa. We could project this line segment onto the line shown in the figure, and for $L$ small, we would simply find that the length projected is about $L$, by simple trigonometry. But this wouldn’t be true for line segments of length $L$ placed anywhere else along the surface of the hyperbola. In fact, as we move the line segment further and further away from the origin, the projection becomes increasingly small.
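This shrinking projection is easy to verify numerically. Assuming, for the sake of the sketch, that we project rays from the origin onto the vertical line $x = 1$ through the vertex of the unit hyperbola $x^2 - y^2 = 1$:

```python
import math

# Parameterize the unit hyperbola x^2 - y^2 = 1 as (cosh u, sinh u) and
# project each point along a ray from the origin onto the line x = 1
# (an assumed placement of the projection line for this sketch).
def project(u):
    x, y = math.cosh(u), math.sinh(u)
    return y / x  # the ray from the origin meets x = 1 at height tanh(u)

# Equal steps in the parameter u project onto ever-smaller intervals
# as we move away from the vertex at (1, 0):
prev = project(0.0)
for u in (1.0, 2.0, 3.0, 4.0):
    cur = project(u)
    print(f"u={u}: height {cur:.4f}, interval {cur - prev:.4f}")
    prev = cur
```

The projected height is $\tanh u$, which saturates at 1: the whole infinite hyperbola projects into a finite strip, so equally spaced pieces of it occupy smaller and smaller projected intervals.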

If we take this hyperbola and turn it into a surface of revolution, we obtain a hyperbolic plane as the surface. This last figure continues to work as a means of projecting the hyperbolic plane onto a Euclidean plane stereographically. We could now imagine tiling the hyperbolic plane with some figure, such as a triangle or square, that would cover the surface. We would then get a projection of the tiles onto our plane of projection. If you have the CDF player installed, then this next demonstration will show you the result of such a tiling in a stereographic projection.

[WolframCDF source=”http://face-paging.com/wp-content/uploads/2013/04/TilingTheHyperbolicPlaneWithRegularPolygons.cdf” CDFwidth=”685″ CDFheight=”700″ altimage=”http://face-paging.com/wp-content/uploads/2013/04/TilingTheHyperbolicPlaneWithRegularPolygons.cdf”]

If you set the number of sides for the polygon at 3 (a triangle), the number of polygons at each vertex at 7 (the minimum), and hit the “Centralize” button, you’ll get a tiling with triangles that is relatively simple to visualize. Note the important feature that the projection of the triangles becomes smaller and smaller as the distance from the origin increases. Note that on the surface of the hyperbolic plane itself, each triangle is of equal size. More distant triangles occupy less and less area in the plane of projection.

The opposite picture arises for the projection of a circle onto our plane. The following very simple figure shows a unit circle with a projection line beyond its far side. Now, assume a small line segment of length $L$ on the surface of the circle near where it crosses the abscissa farthest from the origin. We project from the origin through this line segment, and again, using simple trigonometry, we expect a projected line segment of length about $L$. Again, as we consider other line segments of the same length on the surface of the circle, we’ll get larger and larger projections as we bring these closer and closer to the origin.

By turning the circle into a surface of rotation, we get a 2-sphere covering a globe, and we can now project a tiling on the 2-sphere onto the plane. If you, again, have the CDF player, the following demonstration will show the result of such a tiling (almost). The demonstration projects Platonic solids onto the plane, rather than a tiling of the sphere; but if you select the dodecahedron, you will see the projection of what is almost a tiling by pentagons on the 2-sphere. What is important to note is that, in strong contrast to the projection of the hyperbolic plane, we see that the projected areas of the tiles increase dramatically as their location approaches the “North Pole”, or the origin of the projection.

[WolframCDF source=”http://face-paging.com/wp-content/uploads/2013/04/StereographicProjectionOfPlatonicSolids.cdf” CDFwidth=”685″ CDFheight=”700″ altimage=”http://face-paging.com/wp-content/uploads/2013/04/StereographicProjectionOfPlatonicSolids.cdf”]

These observations go together with the concepts of an ordinary triangle (in Euclidean space), a hyperbolic triangle, and a spherical triangle. If you look at these topics in the links I’ve provided, you will immediately see that the hyperbolic triangles appear concave while the spherical triangles appear convex. Also, the sum of the angles in a hyperbolic triangle is less than $\pi$ while that for a spherical triangle is greater than $\pi$. This angular deficit or excess goes back to the curvature of the space that the triangle is embedded in. This idea is generalized as a Schwarz triangle.

By measuring the angle subtended by an object of known size at a known distance, it would therefore be possible to estimate the curvature of the space. For example, imagine that you are an astronomer and you know that some distant object, say a galaxy, is precisely 10 light-years across and is 1 million light-years away. From this, you can estimate the angle it should subtend in our sky by simple Euclidean geometry. If you measure the actual angle, you will either get what you expected, or a number that is lower or one that is greater.

If you were a cosmologist interested in the broad structure of the universe, you could do this same experiment for objects all over the sky. You could then come to a conclusion as to whether or not the universe was closed and finite (a sphere of positive curvature and negative total energy), open and flat (Euclidean and with 0 total energy), or open and infinite (a hyperbola with negative curvature and positive total energy).

So, is there anything in the sky that is at a known distance of cosmological scale and that has a known length that we could use to decide this question? The answer is “yes”: fluctuations in the Cosmic Microwave Background Radiation.

For a variety of reasons that I’ll explore in future posts, we can expect these fluctuations to be about 13.8 billion light-years away and to be about 370,000 light-years across. By measuring the power spectrum of these fluctuations across the sky, one can arrive at a density in terms of angle:

The first peak is what is relevant for establishing curvature, and it is consistent with a flat universe. There are some issues though. It has been suggested that the higher angular variations are more consistent with a Poincaré dodecahedron than with a flat universe. Another possibility, with negative curvature, is a Picard horn.

This is an area of active research. More later…

TTFN

$\frac{1}{2}mv^2 - \frac{GMm}{r} = E$

where I am continuing with the same notation that I set up in that previous post: $m$ is the mass of some test galaxy under consideration, $M$ is the total mass within a sphere of radius $r$ from the origin of coordinates, $G$ is Newton’s gravitational constant, $v$ is the velocity of the galaxy, and $E$ is the total energy. Assuming all is well with the universe and that energy is conserved, this is a constant. It doesn’t matter right now whether it’s greater than, less than, or equal to $0$; it’s just some number.

Some rearrangement of this last equation yields

$v^2 = \frac{2GM}{r} + \frac{2E}{m}$

in which the constant $2E/m$ involves the total energy and the mass of the test galaxy. By substituting the matter density $\rho$ for $M$ in this equation (via $M = \frac{4}{3}\pi r^3 \rho$), and then by recalling that $r = a x$ for a comoving coordinate $x$, so that $v = \dot{a} x$, we can find

$\dot{a}^2 = \frac{8\pi G \rho}{3}\,a^2 + \mathrm{const}$

in which I’ve just accounted for the volume of a sphere of radius $r$. [You may want to go back to that previous derivation to get a feel for this equation.]

A quick glance at this equation reveals some simple facts. Assuming that the terms on the RHS are positive, then the LHS must also be positive (it is, after all, the square of some real number), and this would imply a universe that would continue to expand forever, if it was expanding at any given point in time. The expansion might slow down or pick up, but it would have to stay positive.

I already considered the case in which the constant pertaining to the total energy was $0$. In the very early stages of the universe, assume that $a$ was small. In this case, the matter term, which goes as $1/a$, will trump the constant term; and so this case looks the same as the one in which the total energy is $0$.

The complete solution to this differential equation is rather nasty in the extreme. Here is Mathematica’s representation of the solution:

What this is saying is that the small piece in square brackets at the end of the string, having to do with $a$, is the argument to a certain inverse function. In that piece, $C[1]$ is a constant of integration. The $\#1$ is the $a$ in the formulation of the differential equation. That argument should be substituted in wherever you see the $\#1$. However, the actual function is not what is shown: it is, instead, the inverse of this function. Very nasty. Far better to do a numerical solution for particular values, if you wanted to get a sense for what is going on.

However, we can make sense of the other extreme without too much effort. At the other end, if $a$ becomes very large, then the $1/a$ term becomes less relevant than the constant term. So, we could write as an approximation to this case

$\dot{a} = \mathrm{const}$

where this constant is the square root of the previous one. This tells us that $a \propto t$ and the universe expands uniformly. So, this limit is quite simple also. Apparently, the solution begins out looking like something proportional to $t^{2/3}$ and crosses over to something just proportional to $t$ somewhere along the way.
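We can watch this crossover happen with a crude numerical integration. The constants $C$ and $k$ below are made up; only the shape of the solution matters:

```python
import math

# Crude Euler integration of adot^2 = C/a + k: a matter term plus a
# positive total-energy term, with made-up constants.
C, k = 1.0, 1.0
a, t, dt = 0.01, 0.0, 2e-4
samples = {}
while t < 20.0:
    a += math.sqrt(C / a + k) * dt
    t += dt
    for target in (0.02, 0.2, 10.0, 20.0):
        if target not in samples and t >= target:
            samples[target] = a

def exponent(t1, t2):
    """Effective scaling exponent d(log a)/d(log t) between two times."""
    return math.log(samples[t2] / samples[t1]) / math.log(t2 / t1)

print("early exponent:", exponent(0.02, 0.2))  # near 2/3
print("late exponent:", exponent(10.0, 20.0))  # approaching 1
```

The measured exponent drifts from roughly $2/3$ at early times toward $1$ at late times, exactly the crossover described above.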

If we let the total energy be less than $0$, we get instead

$\dot{a}^2 = \frac{8\pi G \rho}{3}\,a^2 - |\mathrm{const}|$

In this case, it is possible for the sign of $\dot{a}$ to change and for the expansion to reverse itself and become a contraction. This is just like the case of a particle in a gravitational field that begins with less than the escape velocity. Just as in the previous model, the universe begins to expand proportional to $t^{2/3}$; but at some point, the velocity of the scale factor reverses itself, and in the limit the scale factor becomes proportional to $(t_f - t)^{2/3}$ as it collapses toward some final time $t_f$.

This set of solutions represents what is called the “matter-dominated universe” because all that we have accounted for so far is a uniform density of matter. This matter is assumed to be very slowly moving (non-relativistic) and more or less fixed to a comoving cosmological lattice. We now need to consider energy, in the form of photons.

We can see that we have been playing with a form of Einstein’s Field Equations in the Friedmann equations that we’ve been pushing around here lately. If I recast the Friedmann equation into

$\left(\frac{\dot{a}}{a}\right)^2 + \frac{k}{a^2} = \frac{8\pi G \rho}{3}$

we see this structure more plainly. The first term on the LHS will be part of a metric tensor. The second term on the LHS is a curvature. I have introduced a common convention of calling this constant $k$, and another convention that the sign of $k$ is negative if the total energy is positive. More of this later. The term on the RHS is an energy-mass density, or basically the 00 term of the stress-energy tensor. So far, I’ve just accounted for, as I say, a matter density.

But we derived this equation from energy considerations alone, not curvature of space or space-time. In that model, the explanation involved a summation of terms involving kinetic and potential energy. The potential energy term has become a matter density on the RHS here. The total energy term is now cast in the role of the curvature and metric tensor term. The kinetic energy is now standing in for the Ricci tensor. We see a unification between the Ricci tensor for the curvature of space-time and a kinetic energy of space. Since the elements of the Ricci tensor may be either positive or negative, we begin to form the notion of a kinetic energy of space that may also be either positive or negative. Of course, this kinetic energy term originally pertained to the motion of a given “test galaxy” in the usual form of $\frac{1}{2}mv^2$; but if a uniform density of such objects acquires a motion because of the time dependence of the metric of space-time, the explanation of the source of the kinetic energy must account for this effect.

However, if we recall good old $E = mc^2$, or equivalently $m = E/c^2$, then we can introduce an effective mass density that corresponds to some arbitrary energy density of photons into this Friedmann equation. To get to thinking about this, go back to our cosmological lattice:

The volume of a unit cube in this lattice is $a^3$. Recall that the energy of a photon is $E = h\nu = hc/\lambda$, where $h$ is Planck’s constant, $\nu$ is the frequency, $c$ is the speed of light, and $\lambda$ is the wavelength. If we have a wave in some medium and we gradually change the dimensions of the medium, then the wavelength of the wave will change proportionately. This implies that the energy of photons is inversely proportional to the scale factor: $E \propto 1/a$. This effect is due to the fact that $a$ is a component of the metric of space. As the space stretches, so does the wavelength of each photon. Note that this is not due to any standing-wave aspect of the photons.
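The scaling $E \propto 1/a$ is easy to illustrate: stretch the wavelength by the factor the scale factor has grown, and the photon energy drops by the same factor. A minimal sketch with an arbitrary 500 nm emission wavelength:

```python
h = 6.626e-34  # Planck's constant, J*s
c = 2.998e8    # speed of light, m/s

def photon_energy(wavelength):
    return h * c / wavelength  # E = h*nu = h*c/lambda

# A photon emitted at 500 nm, observed after the universe has grown by
# various stretch factors a_now/a_then: wavelength up, energy down as 1/a.
lam_emit = 500e-9
for stretch in (1, 2, 4):
    print(stretch, photon_energy(lam_emit * stretch))
```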

Here we find that there is an essential difference between the total energy in a cube of the grid for ordinary matter and for photons. For ordinary matter, the total mass within a volume does not depend on the scale factor. For photons, the energy decreases with increasing $a$. So, for matter, we had that the density $\rho_m \propto 1/a^3$. For photons, the energy density has one more factor of $1/a$: we find that the density is $\rho_\gamma \propto 1/a^4$. This is a consequence of a property called adiabatic invariance, which states that certain physical parameters will remain constant if some other quantity is very slowly varied. In this case, what is being varied is the metric of space-time; that is, the scale factor $a$. The parameter that is remaining constant is the number of nodes in the wave character of the photon, or wave-packet of light, if you prefer. Since, by Fourier analysis, any complex waveform can be seen as a linear superposition of component waves, we can simply think about this behavior in terms of the properties of infinitely long uniform waves. As the size of the universe expands, the number of zero-crossings of these waves is constant; hence, the wavelengths increase. If the wavelengths of all of the component waves increase, then so will the size of any wave-packet.

For a radiation-dominated universe then, we can redo the Friedmann equation in the form

$\dot{a}^2 = \frac{\mathrm{const}}{a^2}$

in the specific case where the total energy is $0$. The solution to this differential equation, ignoring various constants, yields that $a \propto \sqrt{t}$. So, in a radiation-dominated universe (at a total energy of $0$ anyway) the universe expands as $\sqrt{t}$.
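We can verify that $a(t) \propto \sqrt{t}$ really solves $\dot{a}^2 = \mathrm{const}/a^2$ by differentiating numerically; the constant here is arbitrary:

```python
import math

Cr = 3.0  # arbitrary constant in adot^2 = Cr/a^2

def a(t):
    """Claimed radiation-dominated solution, a ~ sqrt(t)."""
    return (2 * math.sqrt(Cr) * t) ** 0.5

def adot(t, eps=1e-6):
    return (a(t + eps) - a(t - eps)) / (2 * eps)  # central-difference derivative

t = 5.0
lhs = adot(t) ** 2
rhs = Cr / a(t) ** 2
print(lhs, rhs)  # the two sides agree
```

Analytically, both sides reduce to $\sqrt{C}/(2t)$, so the agreement holds at every $t > 0$, not just the sample point.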

To compare then, at a net energy of $0$, a matter-dominated universe would expand with a scaling exponent of $2/3$ and a radiation-dominated universe would expand with a scaling exponent of $1/2$. These two relations are not wildly different, but the difference in the scaling exponents would yield significant and measurable differences over long intervals of time.

Going back to what I said earlier concerning kinetic energy of matter being affected by the time dependence of the metric of space-time, in a general relativity point of view, the same comments apply here. We begin to see this balance between the time-dependence of the Ricci tensor, which corresponds to (or yields) variations in the kinetic energy of matter or radiation, and the densities of these elements of the stress-energy tensor. As we continue to work our way through these equations, we’ll see different forms of energy in the form of pressures and tensions introduced by matter, radiation and what amounts to the cosmological constant.

One way to think about the case of light energy is to begin with the idea of a box of dimensions equal to one cube of the comoving cosmological lattice, but with reflecting walls. In such a hypothetical case, there would be a pressure on the walls of the box due to the impulse-momentum theorem as photons reflect off of the walls. For each photon striking a wall, there is a photon reflected back. In the real case of imaginary boxes, the result is the same in a homogeneous universe: for each photon leaving the box, there is another one entering. The net effect of a pressure is identical: the momentum leaving equals the momentum entering.

Let’s now roll both matter and radiation together in a Friedmann equation, assuming a net energy of $0$:

$$\left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G}{3}\left(\rho_m + \rho_r\right) = \frac{C_m}{a^3} + \frac{C_r}{a^4}$$

For small $a$ the radiation term dominates and for large $a$ the matter term dominates. We could envisage a universe in its early stages being dominated by radiation and then, as time progresses, matter becoming more significant. In this model, the early universe would expand as $t^{1/2}$ and then later, it would expand as $t^{2/3}$. This claim makes a fairly big assumption: that there is no conversion between radiation and matter, or that any such conversion is consistent with the claim. The claim is therefore in need of some verification; but hold that thought for now.
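
We can sketch this two-era behavior numerically. Below, I assume a combined Friedmann equation of the form $\dot a^2 = C_m/a + C_r/a^2$ with illustrative constants $C_m = C_r = 1$ (not calibrated to any real cosmology) and track the local scaling exponent $d\ln a/d\ln t$:

```python
import math

# Sketch: integrate (da/dt)^2 = C_m/a + C_r/a^2 and watch the local
# scaling exponent d(ln a)/d(ln t) move from ~1/2 to ~2/3.
# C_m and C_r are illustrative, not fitted to observation.
C_m, C_r = 1.0, 1.0

def adot(a):
    return math.sqrt(C_m / a + C_r / a ** 2)

# Start deep in the radiation era; t0 chosen consistent with the
# pure-radiation solution a^2/2 = sqrt(C_r) * t.
a = 1e-6
t = a * a / (2.0 * math.sqrt(C_r))

def local_exponent(a, t):
    # d(ln a)/d(ln t) = (t/a) * da/dt
    return t * adot(a) / a

exponents = {}
while a < 1e6:
    if 1e-5 < a < 1.1e-5 and "early" not in exponents:
        exponents["early"] = local_exponent(a, t)
    dt = 1e-3 * a / adot(a)          # adaptive step: ~0.1% growth per step
    a_mid = a + 0.5 * dt * adot(a)   # midpoint (RK2) step
    a += dt * adot(a_mid)
    t += dt
exponents["late"] = local_exponent(a, t)

print(exponents)  # early ~ 1/2 (radiation), late ~ 2/3 (matter)
```

The exponent drifts from about $1/2$ deep in the radiation era to about $2/3$ once the matter term takes over, which is exactly the crossover described above.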

The matter term comprises all the heavier particles that are essentially at rest with respect to our cosmological lattice. These include the ordinary matter particles with which we are all familiar: anything made out of quarks (like protons and neutrons) and at least some of the leptons (the electron and positron). However, there is another component to this, at least in theory; this component is so-called dark matter.

The evidence for dark matter is varied. One of the most fundamental pieces of this evidence comes from the rotational speeds of galaxies. From Kepler’s third law, we can write for the orbital angular frequency (in radians per second)

$$\omega = \sqrt{\frac{GM}{r^3}}$$

where $r$ is the distance of the orbiting object from the large mass $M$ at the focus of the ellipse. Notice that the angular frequency is proportional to the square root of the large mass, $M$. For a galaxy, this implies that stars will orbit the galactic center in a way that depends on the total mass. By adding up the mass of all of the stars and dust, it is possible to get an estimate of what the mass should be. Of course, by the shell theorem, the relevant mass is just that within a sphere of radius $r$. For surveys of hundreds of galaxies, it turns out that the orbital frequencies are higher, by almost an order of magnitude, than what would be calculated from just the visible matter. Other evidence for dark matter comes from gravitational lensing effects that exceed what would be expected on the basis of visible matter alone.

Assuming that we are not wrong in our assumptions about Newtonian mechanics and the general theory of relativity, this evidence suggests some other form of matter, not accounted for in the standard model of particle physics, that has mass and does not interact with photons. A working assumption is some subatomic particle that has, as yet, not been discovered, probably because it does not interact with quarks or leptons; that is, with ordinary observable matter.

There is some more direct evidence as recently as this month (April 2013) from the Alpha Magnetic Spectrometer in terms of a high count of positrons (anti-electrons) that could be explained by the annihilation of dark matter particles in space.

Another set of experiments designed to search for dark matter involves the placement of detectors deep in underground mines around the world. This placement is intended to shield the detectors from cosmic rays, the high-energy particles (mostly protons and nuclei) arriving from outer space. Assuming that dark matter interacts only weakly with ordinary matter, it is to be expected that it would penetrate into these deep mines far more readily than even the highest-energy cosmic rays. By accumulating a sufficient number of detection events, it is hoped that a unique signature for dark matter particles would become evident.

A variety of candidates have been proposed for dark matter, one being the lightest of the so-called neutralinos, a particle predicted by supersymmetry. There are other candidates broadly known as WIMPs (weakly interacting massive particles). I’ll come back to these ideas in later posts.

At present, dark matter is estimated to be around 85% of the matter in the universe, with only 15% being the ordinary matter that we are familiar with.

It is worth touching on the fact that the Hubble law, $v = H_0 d$, tells us that the more distant an object is from us, the greater its velocity relative to us. This velocity can exceed the speed of light, $c$, and this could be taken as a violation of special relativity, which states that nothing can move faster than light. It also seems to raise some vexing questions about the proper form for an expression of kinetic energy; that is, $\frac{1}{2}mv^2$ or $(\gamma - 1)mc^2$.

First note that those distant galaxies that are receding with a velocity greater than $c$ are no longer visible to us: they have passed a kind of horizon not unlike a Rindler horizon. So these distant objects cannot send us any energy or information. On their side of the horizon, everything seems perfectly normal locally, just as it does to us on our side of the horizon. We have no sense of hyper-speed motion, because locally we are not traveling at speeds exceeding $c$.

So, recession at great distances at relative velocities that exceed $c$ is perfectly feasible, with the addition of a horizon through which information cannot flow. Of course, that horizon, from our point of view, is very far away: about 14 billion light-years. Someone on a planet 7 billion light-years away from us would presumably see their horizon in a different place than we do, but it would still be about 14 billion light-years away from them. This suggests that the universe could be much larger than what we can see from our vantage point. More of this later, too.
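
As a quick check of that 14-billion-light-year figure, the Hubble law $v = H_0 d$ puts the $v = c$ horizon at $d = c/H_0$. Taking $H_0 \approx 70$ km/s/Mpc (an assumed round value, roughly the figure behind these posts):

```python
# Sketch: distance at which the Hubble-law recession velocity reaches c.
# H0 = 70 km/s/Mpc is an assumed round value.
c_km_s = 299_792.458
H0 = 70.0                        # km/s per Mpc (assumed)
d_horizon_mpc = c_km_s / H0      # ~ 4283 Mpc
mly_per_mpc = 3.2616             # million light-years per megaparsec
d_horizon_gly = d_horizon_mpc * mly_per_mpc / 1000.0
print(round(d_horizon_gly, 1))   # ~ 14.0 billion light-years
```
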

The picture that I have drawn so far is, in many ways, false. At present, we do not believe that the universe is either expanding at some rate like $t^{1/2}$ or $t^{2/3}$, or collapsing back after a previous phase of expansion. Rather, we find that the universe is expanding at an accelerating rate. Something has been missed in the equations I’ve presented so far, since nothing in them would provide for an accelerating expansion.

If the picture I’ve painted so far were accurate, then the velocity of recession of distant objects would gradually reduce, and perhaps even reverse itself. In this case, objects that are presently beyond the horizon would become visible again as their velocity slowed to less than $c$. However, our current understanding suggests that this is not the case, and that galaxies that are not yet beyond the horizon will gather speed and disappear from our point of view. And this will continue into the foreseeable future until perhaps only our galaxy or local cluster remains visible.

However, Houston, we have a problem. We still have nothing in our equations that comes close to what is actually observed. On to this problem in my next post.

This is consistent with an assumed symmetry of the problem; namely, that on a grand cosmological scale the density of stuff is uniform everywhere. In this way, it doesn’t matter what location is chosen for the center of mass in the initial problem. Anywhere is as good as anywhere else. If we did have a problem with a specific center of mass, then our test mass (our chosen galaxy) would be attracted towards that center. But in the case of this uniform distribution of matter, what does it mean to say that our galaxy is attracted towards the center?

Of course, the answer is quite simple. It means just what it says. If anywhere is as good a center of mass as any other place, then everything is collapsing together. Or expanding apart; that depends on the conditions of the problem, as we’ll see.

There are a few different ways to think about this. One way is to postulate, perhaps as in general relativity, that space is a manifold that has properties such as curvature, a metric, and so on. In this view, the metric of space is dependent upon the matter and energy within it; and the notion that space would have a “scale factor” that would depend upon the density of stuff within it is not at all surprising. After all, that’s just what Einstein’s Field Equations are about: matter and energy curve space-time, and space-time supports matter and energy.

Another way to think about this is to assume that space is nothing but measurable relationships between matter and energy. To say that space is expanding (or contracting) is just another way of saying that the measurable distances between objects that are not bound by strong forces are increasing (or decreasing).

Either way that floats your balloon is equally valid, as far as I can tell. I personally like the model in which space and space-time have certain properties derived from a metric that is itself an inherent attribute of the manifold. But that’s just me. It makes thinking about certain objective computations more natural.

You should understand that this cosmological expansion (or contraction) does not mean that our galaxy is getting bigger or that our solar system is getting bigger or that our Earth is getting bigger or that we are getting bigger or that protons are getting bigger or whatever. All of these elements are bound by forces that are much greater than whatever is driving the expansion of the universe, which kicks in only at scales above about $0.004\,R_H$, where $R_H$ is the Hubble distance (about 14 billion light-years). At scales below this value, material is collapsing, galactic clusters are aggregating, stars are forming, galaxies are colliding, dinner is cooking, neutrons are decaying, and so on.

Now, on to weirder forms of matter and energy…

I want to expand on this idea in the present post. To change the focus slightly, let us imagine that we are given a sequence of numbers extracted from a Gaussian probability distribution with mean $\mu$ and variance $\sigma^2$: $x_1, x_2, x_3, \ldots$. Assume that these are independent of one another and each identically distributed. In other words, there is no correlation between samples. If we establish a sampling histogram of these numbers, then we should find, after accumulating a sufficient number of samples, that there will be some finite number of samples at any arbitrary distance from the origin. It might take a while, but in principle, this distribution is non-zero on any finite segment $\Delta x$ of the real-number line.

Let’s now take pairs of numbers from this set, $(x_i, x_j)$. We now have a probability distribution in a plane defined by these pairs of numbers. Again, with a sufficient accumulation of samples, we should find that a sampling histogram will eventually fill the entire plane; that is, there is a finite non-zero probability of finding some sample in any arbitrary element of area of the plane, $\Delta x\,\Delta y$.

We can go up another notch by taking ordered triples, $(x_i, x_j, x_k)$. We now have a Gaussian distribution in 3 dimensions; but the same logic applies. Accumulate enough samples and any volume element, $\Delta x\,\Delta y\,\Delta z$, no matter how far away from the origin, will eventually include a sample of this random process, because there is a finite and non-zero probability of the occurrence of whatever combination of 3 random numbers might exist in that volume element.

We can continue to do this up to any arbitrary dimensionality that we choose. However we segment the n-dimensional space in which we are embedding the random process, we will eventually find an element of the sample set within that segment. To make matters simple, we divide the n-dimensional space into n-dimensional boxes of some length $\epsilon$ on a side. Each “hyper-box” has volume $\epsilon^n$. If we count up boxes on the basis of just whether or not they contain at least one element of the random set, without any concern for the density of elements in any box, then every possible box, with any choice of scale $\epsilon$, will be counted eventually. It might take a while, and longer in higher dimensions; but in principle a space of any arbitrary dimension will be filled by a truly random process of the sort I described.

The picture would be quite different for a uniform random process with equal likelihood on the interval $[0, 1]$. Going up in dimensionality in a similar fashion would simply fill a hyper-cube of length $1$ on a side. We could never get an n-tuple with a component outside that interval out of such a procedure, since any individual component could never be greater than $1$.
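
A quick simulation makes the contrast concrete: independent Gaussian coordinates occasionally land far from the origin, while uniform-$[0,1]$ coordinates can never leave the unit hypercube. (The 3-standard-deviation threshold and the sample count are arbitrary choices.)

```python
import random

# Sketch: i.i.d. Gaussian coordinates will eventually reach boxes
# arbitrarily far from the origin; i.i.d. uniform-[0,1] coordinates
# can never leave the unit hypercube. Thresholds are illustrative.
random.seed(1)

n_dim, n_samples = 3, 200_000
far_gauss = 0
far_uniform = 0
for _ in range(n_samples):
    g = [random.gauss(0.0, 1.0) for _ in range(n_dim)]
    u = [random.random() for _ in range(n_dim)]
    if any(abs(c) > 3.0 for c in g):
        far_gauss += 1        # rare but nonzero for a Gaussian
    if any(abs(c) > 1.0 for c in u):
        far_uniform += 1      # impossible by construction

print(far_gauss, far_uniform)  # some positive count, then 0
```
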

Now consider the following system of differential equations:

$$\begin{aligned}\dot x &= \sigma(y - x)\\ \dot y &= x(\rho - z) - y\\ \dot z &= xy - \beta z\end{aligned}$$

which is known as the Lorenz system. For $\sigma = 10$, $\rho = 28$, and $\beta = 8/3$, this system of equations yields a solution that goes onto a *dynamical attractor* independent of the initial conditions. The concept of a dynamical attractor is not wildly complicated in its essence. For example, a damped harmonic oscillator has an attractor that spirals into a state of zero energy. An undamped harmonic oscillator has an attractor that involves the continuous exchange of kinetic and potential energy between two state variables. This Lorenz attractor can be diagrammed in the 3-dimensional space of its state variables $(x, y, z)$.

Here is a gallery of different views of 10,000 points of a numerical solution of this set of equations with the parameters as specified in the last paragraph [Click thumbnails for a larger view]:

The correlation dimension of this system is about $2.05$. One can almost see that topologically it appears to be very nearly comprised of two sheets of spiral flows. That the dimension is slightly greater than $2$ can be seen from the slight depth of each sheet.

Suppose now that we generate an “observer” of the internal state of the Lorenz system by simply adding up the state variables at any time; that is, we define

$$s(t) = x(t) + y(t) + z(t)$$

for such a solution sequence as I’ve graphed above. Suppose we accumulated a long sequence of such samples and then we repeated our process of embedding them in spaces of higher and higher dimension. Would we find that this sequence filled the embedding space in the same way a truly random Gaussian process does? Would we find it filling up just some sub-set of the space like the uniform probability process? Or would we discover some other behavior?

The answer is that the correlation dimension of this observer sequence will turn out to be the same as that of the original state variables. This was first noted by Grassberger and Procaccia in 1982 (see **Measuring the Strangeness of Strange Attractors**, Physica D 9 (1983) 189-208). We could actually have taken almost any linear combination of the state variables for this exercise, including any one of the state variables by itself.
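
A rough sketch of that procedure applied to this observer: integrate the Lorenz system, record $s = x + y + z$, delay-embed the sequence in 3 dimensions, and estimate the slope of the correlation sum $C(r)$ between two radii. The delay, radii, and sample counts here are my own illustrative choices, not those of the original paper:

```python
import math

# Sketch of the Grassberger-Procaccia correlation sum applied to the
# scalar observer s = x + y + z of a Lorenz trajectory. All tuning
# values (delay, radii, counts) are illustrative assumptions.
SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0

def deriv(s):
    x, y, z = s
    return (SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z)

def rk4_step(s, dt):
    def add(a, b, h):
        return tuple(ai + h * bi for ai, bi in zip(a, b))
    k1 = deriv(s)
    k2 = deriv(add(s, k1, dt / 2))
    k3 = deriv(add(s, k2, dt / 2))
    k4 = deriv(add(s, k3, dt))
    return tuple(si + dt / 6 * (p + 2 * q + 2 * r + w)
                 for si, p, q, r, w in zip(s, k1, k2, k3, k4))

# Integrate, discard a transient, record the observer every 5 steps.
state, dt = (1.0, 1.0, 1.0), 0.01
obs = []
for i in range(40_000):
    state = rk4_step(state, dt)
    if i >= 2_000 and i % 5 == 0:
        obs.append(sum(state))

# Delay embedding: vectors (s_n, s_{n+tau}, s_{n+2*tau}).
tau = 3
emb = [(obs[i], obs[i + tau], obs[i + 2 * tau])
       for i in range(len(obs) - 2 * tau)][:2_000]

def corr_sums(r_small, r_big):
    # Fraction of point pairs closer than each radius, with a small
    # Theiler window to exclude pairs that are close only in time.
    c_small = c_big = total = 0
    for i in range(len(emb)):
        for j in range(i + 10, len(emb)):
            d = math.dist(emb[i], emb[j])
            if d < r_small:
                c_small += 1
            if d < r_big:
                c_big += 1
            total += 1
    return c_small / total, c_big / total

c1, c2 = corr_sums(2.0, 8.0)
slope = math.log(c2 / c1) / math.log(8.0 / 2.0)
print(round(slope, 2))  # roughly 2, near the attractor's correlation dimension
```

This is a crude two-point slope, not a proper fit over a scaling region, so it should only be expected to land in the neighborhood of the published value.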

This makes the observation about the scaling exponent of the correlation dimension of matter in the universe at lengths below about $0.004\,R_H$ very interesting. It identifies an almost sheet-like structure of galaxies, galactic clusters, and super-clusters, which “dissolves” into a uniform (random) distribution of mass at longer scales.

A reasonably compelling explanation for this difference in the distribution of observable matter is the effect of gravitation, since the Big Bang, on relatively small initial fluctuations in energy density. A process known to yield fractal dimensions in this range is diffusion-limited aggregation (DLA). One working hypothesis is that the correlation dimension of observable matter in the universe at distances below the clustering scale reflects a DLA process, in which diffusion of matter under a local force of gravity aggregates towards initial areas of higher density, and that this has led to the observed patterns of galaxies and their clusters and super-clusters. At scales above this clustering distance, a combination of other effects must have prevented DLA from working, at least so far in time.

Interesting, no?

Later, in developing a couple of simple forms of the Friedmann equations, we encountered something of the form $\dot a^2 \propto 1/a$. This raises a question: what determines the numerical value of the scale factor $a$?

The answer is simple. The scale factor depends on the scale of the comoving cosmological grid that’s being used. Since a proper distance is the product of the scale factor and a comoving coordinate distance, $D = aR$, changing the size of the lattice must have an inverse effect on the corresponding scale factor, assuming that the proper distance, $D$, is a given.

Hence, it is only ratios of the scale factor that can be invariant. For example, independent of the choice of grid, the value of $a(t_2)/a(t_1)$, that is, the ratio of the scale factor at two different times, $t_1$ and $t_2$, would be an invariant as to the choice of the scale of the lattice. Likewise, in the Hubble constant, $H = \dot a/a$, we divide the time derivative of the scale factor by the scale factor itself in order to get an invariant measure of the rate of change of the scale factor. The same is true of the acceleration, $\ddot a/a$, as expressed in the Friedmann equations.

If we expected fluctuations in $a$, we would write these in the form $\delta a/a$ as well. This is not surprising either; fluctuations in some parameter are often expressed in this way. Then we can consider a fluctuation of, say, 1% as being greater than one of, say, 0.01%, or of 1 part per million, by some invariant factor.

A similar question arises with regard to scaling the density of, say, matter, in our comoving cosmological grid. We introduced a density that was invariant to the grid, call it $\nu$; but this value will again depend upon the scale of the lattice. We could not call this a *proper density*, if such a term were used. Instead, we employ $\rho$ as a measure of the density of matter relative to a proper volume comprised of a sphere of a radius of given proper length, $D$.

In this way, when we wrote a simple form of the Friedmann equations for a universe with a net energy of $0$,

$$\left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G}{3}\,\rho,$$

we had invariants on both sides of the equation. This equation expresses an exact balance between a kinetic energy term on the LHS and a potential energy term on the RHS. Not too surprisingly, it has the form of Einstein’s field equations; and the kinetic energy term on the LHS also has the character of a metric of space that is in balance with a 00-component of a tensor on the RHS.

However, this form of the equation contains an implicit occurrence of the scale factor in the density term, which we have to make explicit in order to solve the equation: $\rho \propto 1/a^3$. Assuming that we are working with a density of ordinary matter, made up of protons, neutrons, and electrons, then we can just about assume that the mass per comoving cell is some constant. Protons are stable and slow, and we can hang their numbers on our cosmological grid of some grand scale. Neutrons are not quite so stable, though: on their own they decay into protons and electrons (and antineutrinos) with a mean lifetime of about 15 minutes, a half-life of about 10 minutes. The problem is not so much getting out another proton from a neutron decay, since the quantity of matter is roughly equivalent; it is getting out the occasional photon.
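
As a small aside on those numbers: the measured quantity is the neutron’s mean lifetime $\tau$ of roughly 880 seconds, and the half-life follows as $\tau \ln 2$:

```python
import math

# Sketch: free-neutron decay N(t) = N0 * exp(-t/tau).
# tau ~ 880 s is the approximate measured mean lifetime.
tau = 880.0
half_life = tau * math.log(2)
print(half_life / 60.0)  # ~ 10.2 minutes
```
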

This puts us in mind of the fact that we have so far been considering only the density of ordinary matter in our equation for density, and we are well aware that we should really be adding up all of the energy present on the RHS of our last equation. This is true even if we want an exact null balance between that kinetic energy term on the LHS and a matter-energy term on the RHS.

In subsequent posts, we’ll begin to account for these other terms that have to be considered in a complete treatment …

So, where are we starting and going? Let our test mass or chunk of matter be a galaxy. We go back to our cosmological lattice and imagine that our galaxy under study is at some location on the lattice with respect to its center of coordinates. In the general vicinity of the center of coordinates and our test galaxy is some uniform distribution of matter.

By extending this lattice as necessary, we can establish that the distance between the origin and our test galaxy is just $D = a\sqrt{x^2 + y^2 + z^2}$, where $a$ is the scale factor and $(x, y, z)$ are the comoving coordinates of our space. If I just define $R = \sqrt{x^2 + y^2 + z^2}$, then I can set $D = aR$, and $R$ is a constant. Then the velocity of this test galaxy is $V = \dot a R$ and the acceleration is $A = \ddot a R$. [Note that I’m using capitals for $V$ and $A$ to avoid confusion with the scale factor $a$.]

Our picture so far is identical to the one in my last post, except that I have changed the scale significantly. The test chunk of matter is now a galaxy and the distances are on the order of tens or hundreds of megaparsecs.

So far, we’ve just made some simple definitions of terms. Now, let’s assume that we can apply the shell theorem with a center of mass at the center of coordinates (hold your nose!). We assume that the total mass within a sphere of radius $D$ centered at the origin of coordinates is $M$. If we define the mass of the test galaxy as $m$, then the force on it will be

$$F = -\frac{GMm}{D^2}$$

where the minus sign implies that the force is attractive. By dividing out the mass of the galaxy, $m$, we can get an expression for the acceleration, $A$; setting this equal to $\ddot a R$, and substituting $D = aR$, we find

$$\ddot a R = -\frac{GM}{a^2 R^2}$$

Then, dividing both sides by $R$, we get

$$\ddot a = -\frac{GM}{a^2 R^3}$$

But the volume of the sphere centered at the origin of coordinates, with radius out to our test galaxy, is just $V_s = \frac{4}{3}\pi (aR)^3$, and the density of matter in the sphere is $\rho = M/V_s$. So we can rewrite this last equation as

$$\frac{\ddot a}{a} = -\frac{4\pi G}{3}\,\rho$$

Here is an interesting result. Assuming that we have a homogeneous universe with a uniform density at some grand scale, the actual scale becomes irrelevant; it has been removed from the equation. We could have picked any point for an origin of coordinates and any test galaxy at any distance from the origin, and we’d have come to the same result.

What does it mean that there would be an attractive force towards any arbitrary location? We could have picked the origin of coordinates to be at any arbitrary location in the previous derivation. Any location could be the center of mass. The result is a differential equation for the scale factor, $a$, which tells us that the acceleration of $a$ can only be $0$ if the universe has no matter in it. Since the scale factor is a component of a metric of space, this equation tells us that this metric cannot be static if space contains matter.

There are other issues pertinent to our working scale. As we saw in previous posts, for scales of the order of planets, solar systems and galaxies, where the distribution of matter is not uniform and where we can pick out the center of mass of a system otherwise isolated by distance from more remote objects, the equations of motion yield harmonics, Kepler orbits, Lissajous orbits, and other such classical motions. As we begin to work up to scales of the order of many galaxies, we encounter something completely different, which has been called variously a clustering hierarchy or a fractal [see, for example, **The Fractal Galaxy Distribution**, by P.J.E. Peebles, in *Fractals in Physics*, ed. J. Feder & A. Aharony, North-Holland (1990), p. 273-278, *aka* Physica D 38]. At these intermediate scales, the distribution of matter appears to be a fractal whose correlation function scales with a power of -1.77±0.04 [see H. Totsuji & T. Kihara, Publ. Astron. Soc. Japan 21 (1969) p. 221, and P.J.E. Peebles, Astrophys. J. 189 (1974) L51]. This scaling appears to hold up to about twice a clustering length of roughly $0.002\,R_H$, where $R_H$ is the Hubble distance, estimated at about 14 billion light-years. So, the cut-off for fractal scaling is around 56 million light-years. For scales larger than this, the distribution of matter becomes homogeneous, and we can begin to think about a uniform density of matter.

So, earlier I pointed out that there is a preferred frame of reference, one that is stationary with respect to the cosmic microwave background radiation. Now, we find a preferred scale, one that is large enough that the distribution of matter is homogeneous.

If we consider this uniform distribution of matter with respect to our comoving cosmological lattice, we can think of a comoving density $\nu$ such that $\nu = \rho\,(a\ell)^3$, where $\ell$ is the length of a side of a cube of the comoving lattice. This gives us that $\rho = \nu/(a\ell)^3$. We can then substitute for $\rho$ in our last equation to get

$$\frac{\ddot a}{a} = -\frac{4\pi G}{3}\,\frac{\nu}{\ell^3 a^3}$$

This is one form of the Friedmann equations, about which we are going to have much to add in this and subsequent posts. While Friedmann derived this form from the trace of the Einstein field equations, there has been nothing fancier than good old Newtonian mechanics in this derivation so far.

Assuming that the density must be positive, this last equation suggests that the acceleration of the scale factor should be negative; but whether the scale factor is actually increasing or decreasing depends upon initial conditions that we have not worked out yet. That is, we have not considered $\dot a$ as yet. So, the scale factor could be increasing but at a decreasing rate, just as an object thrown from the Earth might leave the planet if it starts with something greater than the escape velocity.

Nonetheless, this equation is a model of a decelerating universe; but in point of fact, our universe appears to be expanding in an accelerating manner. So, something is missing. Let’s go back and start over using energy conservation for our test galaxy:

$$E = \frac{1}{2}mV^2 - \frac{GMm}{D}$$

The total energy can be less than, equal to, or greater than $0$ depending upon whether the kinetic energy is less than, equal to, or greater than the gravitational potential energy. The escape velocity is the critical point, when the total energy is $0$ and $V = \sqrt{2GM/D}$. By analogy, the entire universe is similarly situated: if the total energy is $0$, it expands at a decreasing rate. If the total energy is greater than 0, it keeps on going. If the total energy is less than $0$, it eventually contracts.

Substituting for $V$ and $D$ in terms of the scale factor $a$ and $R$, we get

$$E = \frac{1}{2}m\dot a^2 R^2 - \frac{GMm}{aR}$$

Assume for now that the total energy is $0$; then, after some rearranging,

$$\dot a^2 = \frac{2GM}{R^3}\,\frac{1}{a}$$

We’ll come back to the cases where the total energy is not $0$ later. Recognizing that the volume of our sphere is $\frac{4}{3}\pi (aR)^3$, we can rewrite this as

$$\left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G}{3}\,\rho$$

Recall that $H = \dot a/a$ is the Hubble non-constant. This is another form of the Friedmann equation, derived from energetics. Substituting for $\rho$ in terms of the comoving density $\nu = \rho\,(a\ell)^3$,

$$\dot a^2 = \frac{8\pi G}{3}\,\frac{\nu}{\ell^3}\,\frac{1}{a}$$

There are several ways to solve this differential equation, and without belaboring the point, the result is

$$a(t) = \left(\frac{t}{t_0}\right)^{2/3}$$

assuming that $a(t_0) = 1$ for $t_0$ being the present time. In other words, the scale factor increases as the 2/3rd power of the time. Note that the actual value of $a$ depends upon our choice of scale, so the overall constant is somewhat fungible.
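
It’s easy to verify numerically that $a(t) = (t/t_0)^{2/3}$ satisfies an equation of the form $\dot a^2 = C/a$; with $t_0 = 1$ the constant works out to $C = 4/9$. Here is a sketch using a central-difference derivative:

```python
# Numerical sanity check (not a derivation): with t0 = 1, the solution
# a(t) = t**(2/3) should satisfy adot**2 = C / a with C = 4/9.
def a(t):
    return t ** (2.0 / 3.0)

def adot(t, h=1e-6):
    # central-difference estimate of da/dt
    return (a(t + h) - a(t - h)) / (2 * h)

C = 4.0 / 9.0
for t in (0.5, 1.0, 2.0, 5.0):
    print(t, adot(t) ** 2, C / a(t))  # the two columns should agree closely
```
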

Such a universe, with a scale factor of this form, would look as follows, where I’ve just set the constant multiplying $t^{2/3}$ to be $1$:

[Click on the image for a larger version.] With the constants set up so that $a = 1$ at $t = 1$, the units of time are the age of the universe, about 14 billion years. So, in this version of the universe, it simply keeps on expanding, although at a constantly reducing rate.

In coming posts, I’ll start to include various forms of energy as well as matter into the picture. Each component will add its own unique element to this dynamic picture of cosmological space and time.
