>>As noted by Toby Bartels I did not say that QM is statistics, just that
>>the idea of the collapse of the wave function plays the same role in QM
>>as conditional probability does in statistics.
>How could it do that?
OK, first of all, I'm going to use the Heisenberg picture,
according to which the state doesn't evolve with time
but instead each moment in time has its own set of observable operators.
One consequence of this is that every change in the state
will be a collapse (or something that might be called "collapse").
To begin with, I'll look at a classical statistical system.
This is described by a Hamiltonian, which we assume has been solved.
(This is a practical impossibility in most statistical systems;
mathematically, however, exact solutions exist.)
This space of solutions forms a topological manifold X
(there's further structure on X, but I won't need it).
If the system consists of a bunch of particles,
X can have as coordinates the positions and momenta
of each of the particles at some given time.
That is, specifying all of these values at once determines
the values of all the positions and momenta at any other time.
If the system includes fields, X is infinite dimensional,
and a typical coordinate is the value of some field at some place and time
or the value of the field's first derivative.
Now, any function of X can be thought of as an observable.
In anticipation of quantum mechanics,
we allow not only real functions but also complex functions.
For example, if l is an observable which refers to a certain length,
then to observe il you measure the length l with a ruler,
write down the value of the mark on the ruler, and then write "i" afterwards.
The only things that we can reasonably expect to observe directly, however,
are *continuous* functions of X, so we'll restrict attention to those.
The continuous functions form a space with a lot of structure;
what I need is the structure of a * algebra:
one can add functions, multiply them, multiply them by complex numbers,
and take their complex conjugates (or adjoints).
There's one more structure I'm going to need: the norm.
If a function's magnitude has a supremum on X,
then that supremum is the function's norm.
The space of functions whose norm is defined is C_b (X).
Note that C_b (X) is closed under the algebraic operations;
in fact, it has the structure of a commutative C* algebra.
Essentially, all observables actually belong to C_b (X).
For example, consider a free particle in R^3.
We think of its position along the x axis at a given time as an observable,
but no actual measuring apparatus can measure every possible position.
A real measuring apparatus has boundaries,
and this is reflected in the boundedness condition of C_b (X).
(Nevertheless, you can recover any continuous function on X
as a limit of functions in C_b (X).)
Incidentally, if A is *any* commutative C* algebra,
then A is precisely equivalent to C_b (X) for some topological space X.
Furthermore, X is unique up to homeomorphism
if you require it to be compact (which is always possible);
this unique compact X is called "Sigma (A)".
Of course, X is not compact in many real physical situations,
and Sigma (A) is not always a manifold;
but, if you finesse it, you can study classical mechanics
from a point of view where you *start* from a commutative C* algebra.
Everything I do below will work in such an arbitrary case.
However, we don't actually need to take that point of view here.
(BTW, Sigma (C_b (X)) is the Stone-Cech compactification of X.)
Now, the actual situation of the system is some point in X.
But let's suppose you're ignorant of the actual state.
What you then have instead is a probability distribution on X.
This induces a probability distribution for each f in C_b (X),
one which gives the probabilities of the possible values of f.
Now, it's reasonable to expect that the probability distribution is Borel,
meaning that it's sensible to speak of
the probability that the actual state is in any given closed set.
If this is the case, then the expectation value of f
is guaranteed to exist for all f in C_b (X).
Furthermore, you can recover the probability distribution on f
by examining the expectation values of f, f^2, f^3, and so on.
Expectation values of products encode information about correlations.
So, you can describe the probability distribution
by specifying the expectation values of all functions in C_b (X).
This information really is enough to recover the entire distribution,
because you can examine functions which vanish
except on arbitrarily small neighbourhoods of any point x in X
to determine the probability that x is the actual state of affairs.
Now, there are certain conditions that the expectation values must satisfy.
Let E be the function from C_b (X) to the complex numbers C
which maps every function to its expectation value.
If f and g are functions in C_b (X), E(f + g) must be E(f) + E(g).
If c is a complex number, E(cf) must be c E(f).
The function 1 which is 1 at all points in X must have E(1) = 1.
E(f) is always a nonnegative real number whenever f is positive
(meaning f takes only nonnegative reals as values).
In summary, E is a positive linear functional of norm 1.
Conversely, any function satisfying these properties
corresponds to some probability measure on X;
let the probability that the actual state of affairs is in the set A be
the infimum of E(f) as f ranges over all positive functions
which are at least 1 everywhere in the set A.
So, call such a function E a "state of knowledge".
(Essentially, a state of knowledge is a probability measure on X,
but we'll find it useful to examine it from the point of view of E.)
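Here is a minimal sketch of a state of knowledge on a finite space, to make the conditions on E concrete. The three-point space, the probabilities, and the observables f and g are all made up for illustration, and rational values are used only so that the equalities can be checked exactly (complex values work the same way).

```python
from fractions import Fraction

# Hypothetical three-point space X with a made-up probability distribution.
prob = {"x": Fraction(1, 2), "y": Fraction(1, 3), "z": Fraction(1, 6)}

def E(f):
    """Expectation value of the observable f, a function on X."""
    return sum(p * f(pt) for pt, p in prob.items())

def one(pt):
    """The constant function 1."""
    return 1

# Two made-up observables, given by their values at each point.
f = {"x": 2, "y": -1, "z": 4}.get
g = {"x": 0, "y": 3, "z": 1}.get
c = Fraction(-5, 7)

additive = E(lambda pt: f(pt) + g(pt)) == E(f) + E(g)
homogeneous = E(lambda pt: c * f(pt)) == c * E(f)
normalized = E(one) == 1
positive = E(lambda pt: abs(f(pt))) >= 0

def probability(A):
    """Recover P(A) from E alone, using the indicator function of A.

    On a finite discrete X the indicator is the smallest positive
    function that is at least 1 on A, so it attains the infimum.
    """
    return E(lambda pt: 1 if pt in A else 0)

print(additive, homogeneous, normalized, positive)
print(probability({"x", "y"}))
```

On a finite space continuity is automatic, so every function counts as an observable; in the general case the infimum really is needed, because indicator functions are usually not continuous.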
Note that a state of knowledge is a different kind of state
from a state of the real physical system (which is a point in X).
However, each point x in X gives rise to a state of knowledge delta_x,
defined by delta_x (f) = f(x) for every f in C_b (X).
This state of knowledge corresponds to
knowing beyond any doubt that x is really the case.
It is a *pure* state in the sense that no increase in knowledge is possible.
You can combine two states E and F to form a new state aE + bF,
where (aE + bF)(f) = a E(f) + b F(f), if a >= 0, b >= 0, and a + b = 1.
(Note that a and b add like ordinary probabilities,
necessary for the requirement that (aE + bF)(1) = 1.
Also, like probabilities, they must be between 0 and 1,
necessary for the requirement that (aE + bF)(|f|) >= 0.
In fact, if E and F are pure states,
corresponding to say the points x and y,
then aE + bF is a state of knowledge in which
the situation is x with probability a and y with probability b.)
Such a combination is called a "mixture".
You can also form mixtures of more than two states.
Another way to define a pure state is that it
isn't a mixture in any fashion of different states.
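Pure states and mixtures can be sketched as functions that eat an observable and return a number. The names delta and mix, and the observable called heights, are hypothetical helpers invented for this sketch.

```python
from fractions import Fraction

def delta(x):
    """Pure state: certainty that the system is at the point x."""
    return lambda f: f(x)

def mix(a, E, b, F):
    """The mixture aE + bF, defined only for a, b >= 0 with a + b = 1."""
    if a < 0 or b < 0 or a + b != 1:
        raise ValueError("weights must be nonnegative and sum to 1")
    return lambda f: a * E(f) + b * F(f)

# Made-up observable on the two-point space {x, y}.
heights = {"x": 10, "y": 40}
f = heights.get

def one(pt):
    return 1

# The situation is x with probability 1/4 and y with probability 3/4.
M = mix(Fraction(1, 4), delta("x"), Fraction(3, 4), delta("y"))

print(delta("x")(f), M(f), M(one))
```

The check inside mix is exactly the pair of requirements in the text: the weights add to 1 so that the mixture sends 1 to 1, and they are nonnegative so that the mixture stays positive.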
Now, suppose you have one state of knowledge E,
and you're given some new information, a measurement of some kind.
Now your knowledge is different, and you have a new state of knowledge.
If you've been talking about probability distributions on X,
then you'll change the distribution to
the appropriate conditional probability distribution,
depending on the new information.
From the point of view of states of knowledge,
the state E is replaced with a new state F that includes the new knowledge.
Suppose the knowledge you gain is that
the actual state of affairs belongs to the measurable set A.
Then, for a positive observable f in C_b (X),
F(f) is the infimum of E(g) as g runs over
all positive functions in C_b (X) which equal f on A,
divided by the result of this process for f = 1
(which is the division by the probability of A in conditional probability).
Arbitrary functions can be recovered
as linear combinations of positive functions.
For example, if E is the state a delta_x + b delta_y,
for points x and y in X and a + b = 1,
you observe the observable f to be 0, and f(x) = 0 but f(y) = 1,
then your state of knowledge changes to F = delta_x.
These changes are sometimes called "collapse"
in anticipation of their generalization to quantum systems,
but others reserve that word for a special kind of collapse
that only occurs in quantum mechanical situations.
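The two-point example can be run as ordinary conditional probability. This is a sketch on a finite space, where the infimum is attained by the function that equals f on A and 0 elsewhere, so the update reduces to restricting and renormalizing. The weights 1/3 and 2/3 are made-up values for a and b.

```python
from fractions import Fraction

def condition(prob, A):
    """Condition a distribution on a finite X on the event A (a set of points)."""
    p_A = sum(p for x, p in prob.items() if x in A)
    if p_A == 0:
        raise ValueError("cannot condition on an event of probability 0")
    return {x: (p / p_A if x in A else Fraction(0)) for x, p in prob.items()}

def expectation(prob, f):
    """Expectation value of f under the distribution prob."""
    return sum(p * f(x) for x, p in prob.items())

# E = a delta_x + b delta_y with made-up weights a = 1/3, b = 2/3.
E = {"x": Fraction(1, 3), "y": Fraction(2, 3)}
f = {"x": 0, "y": 1}.get

# Observing f = 0 tells you the actual state lies in A = {points where f is 0}.
A = {x for x in E if f(x) == 0}
F = condition(E, A)

print(F, expectation(E, f), expectation(F, f))
```

After the update all the weight sits on x, which is the state delta_x from the text.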
Note that this change in the state E
is not an evolution according to laws of physics.
That the world changes with time is reflected in
that x(t_0) and x(t_1) have always been different observables.
The laws of physics are simply equations relating these observables.
But the change in the *state* reflects a change in your knowledge
and is intimately connected with you as an observer.
Also, note that you can collapse only from a mixed (impure) state
to another mixed state or a pure state;
you can never move between pure states or go back to a mixed state.
(Of course, if you *forget* information,
then your state may remix, but that's a different phenomenon.)
OK, having done classical statistics, let's move on to quantum statistics.
From an abstract point of view, the new twist in quantum mechanics
is that the C* algebra of observables no longer commutes.
(Incidentally, this means that not every element of the C* algebra
can actually be regarded as an observable.
That's OK; even in the classical case,
they didn't *all* have to be observables, although that was allowed,
but what matters is that every observable is in the C* algebra somewhere.
Traditionally, one accepts only Hermitean elements (f = f*) as observables,
although, mathematically, normal elements (ff* = f*f) are good enough.)
A typical example of a C* algebra is the space B(H)
of bounded (or continuous) linear operators of a Hilbert space H.
In fact, this example is more than typical;
just as a commutative C* algebra can always be realized
as the algebra of bounded continuous functions on some topological space,
so can an arbitrary C* algebra always be realized
as a subalgebra of the algebra of bounded linear operators on some Hilbert space.
For example, C_b (X) (which happens to be commutative, of course)
acts by multiplication on L^2(X) if you define the Lebesgue space L^2(X)
with respect to a sufficiently well behaved measure.
Let's take the simple -- and standard -- case
where the C* algebra is the *entire* space B(H) for a Hilbert space H,
not just a subalgebra thereof.
Define a state analogously to the classical case
as a positive linear functional of norm 1.
(You may wonder how to determine whether an element of B(H) is positive.
There is a very general mapping of such notions to arbitrary C* algebras;
in the case of positivity and B(H), the requirement is
that the element T be Hermitean with only nonnegative eigenvalues,
or equivalently that <psi|T|psi> be a nonnegative real for all |psi> in H.)
You can form mixtures of states just as in the classical case.
As in the classical case, E(T) is the expectation value of T
corresponding to the state of knowledge given by E,
and you can find the probability distribution of T
by examining E(T), E(T^2), E(T^3), and so on.
However, the existence of situations where E(TU) != E(UT)
shows that there can be no probability distribution on all of reality
which will account for all the correlations between the observables.
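For a concrete case where E(TU) != E(UT), here is a sketch on a two-dimensional Hilbert space using two Pauli matrices. It uses the expectation value <psi|T|psi> in a unit vector as the state; the choice of matrices and of the vector is mine, purely for illustration.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expect(psi, T):
    """<psi|T|psi> for a 2-component vector psi and a 2x2 matrix T."""
    return sum(psi[i].conjugate() * T[i][j] * psi[j]
               for i in range(2) for j in range(2))

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
psi = [1 + 0j, 0j]          # unit vector

e_TU = expect(psi, matmul(sigma_x, sigma_y))
e_UT = expect(psi, matmul(sigma_y, sigma_x))

print(e_TU, e_UT)
```

The two orderings give different (here even non-real) numbers, which no joint probability distribution for the two observables could produce, since for ordinary random variables the product does not depend on the order.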
One quite common example of a state is given by a vector |psi> in H.
Define the state E_|psi> by E_|psi> (T) = <psi|T|psi>.
This satisfies the requirement E(1) = 1 iff |psi> has unit length.
E_|psi> = E_|phi> iff |psi> and |phi> are equal up to a phase.
If you care to work it out, you can calculate that
states corresponding in this way to unit vectors
are pure in the sense that they're not mixtures of different states.
Note that a superposition of pure states is just another pure state,
where a superposition of E_|psi> and E_|phi> for <psi|phi> = 0
is any state of the form E_(a|psi> + b|phi>) for |a|^2 + |b|^2 = 1.
This is because a|psi> + b|phi> is every bit as good
a unit vector in H as |psi> and |phi> are.
In particular, the superposition E_(a|psi> + b|phi>)
is *not* the mixture a E_|psi> + b E_|phi>.
(Heck, you don't even have a + b = 1 if |a|^2 + |b|^2 = 1,
except in the degenerate cases where one of a and b is 0.)
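To see the difference between a superposition and a mixture numerically, here is a sketch with two orthogonal unit vectors in a two-dimensional space. The values a = 0.6 and b = 0.8 are made up, and the mixture compared against is the one with weights |a|^2 and |b|^2, which is the natural candidate since those do add up to 1.

```python
def expect(psi, T):
    """<psi|T|psi> for a 2-component vector psi and a 2x2 matrix T."""
    return sum(psi[i].conjugate() * T[i][j] * psi[j]
               for i in range(2) for j in range(2))

psi = [1 + 0j, 0j]
phi = [0j, 1 + 0j]          # orthogonal to psi
a, b = 0.6, 0.8             # |a|^2 + |b|^2 = 1, but a + b = 1.4

superposed = [a * psi[k] + b * phi[k] for k in range(2)]

def E_super(T):
    """The pure state E_(a|psi> + b|phi>)."""
    return expect(superposed, T)

def E_mix(T):
    """The mixture |a|^2 E_|psi> + |b|^2 E_|phi>."""
    return a**2 * expect(psi, T) + b**2 * expect(phi, T)

sigma_x = [[0, 1], [1, 0]]
sigma_z = [[1, 0], [0, -1]]

print(E_super(sigma_z), E_mix(sigma_z))   # these agree
print(E_super(sigma_x), E_mix(sigma_x))   # these do not
```

The two states agree on sigma_z, for which psi and phi are eigenvectors, but disagree on sigma_x, so they really are different states.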
Also, if the C* algebra is only a subalgebra of B(H),
you may find that the unit vectors no longer correspond to pure states.
(In fact, if you represent C_b (X) as a subalgebra of B(L^2(X)),
unit vectors in L^2(X) in general *never* correspond to pure states.)
In nonstatistical quantum physics, one uses only pure states.
Another example of a state, heavily used in
statistical quantum physics, is the so called "density matrix".
A bounded linear operator D is a density operator iff
its trace exists and equals 1 and it's positive as a member of B(H).
These are just the conditions necessary for E_D (T) := trace (DT)
to be defined and satisfy the requirements of a state.
Unlike with unit vectors, adding density operators
corresponds to forming mixtures, not superpositions.
Also unlike unit vectors, different density operators lead to different states.
But unit vectors can actually be seen as a special case of density operators;
the vector |psi> corresponds to the operator |psi><psi|,
which is a density operator iff |psi> has unit length,
and |psi><psi| = |phi><phi| iff |psi> and |phi> differ by a phase.
A density operator is a pure state iff it corresponds to some unit vector;
also, any density operator can be decomposed into mixtures of pure states.
Thus, you can think of a density operator as describing
several possible pure states, each with some probability.
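A small sketch of density operators on a two-dimensional space, assuming 2x2 matrices stored as nested lists. It checks that a unit vector and a phase-shifted copy give the same operator |psi><psi|, and that adding density operators with weights gives a mixture. The vector, the phase 0.7, and the helper names are made up.

```python
import cmath

def outer(psi):
    """The operator |psi><psi| as a 2x2 matrix."""
    return [[psi[i] * psi[j].conjugate() for j in range(2)] for i in range(2)]

def trace_prod(D, T):
    """trace(DT), i.e. the state E_D applied to T."""
    return sum(D[i][k] * T[k][i] for i in range(2) for k in range(2))

def close(A, B, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices up to rounding."""
    return all(abs(A[i][j] - B[i][j]) < tol
               for i in range(2) for j in range(2))

def mixture(w, D1, D2):
    """The density operator w D1 + (1 - w) D2 for 0 <= w <= 1."""
    return [[w * D1[i][j] + (1 - w) * D2[i][j] for j in range(2)]
            for i in range(2)]

psi = [0.6 + 0j, 0.8 + 0j]
phi = [cmath.exp(0.7j) * c for c in psi]   # same vector up to a phase

D_psi, D_phi = outer(psi), outer(phi)
same_operator = close(D_psi, D_phi)
tr = D_psi[0][0] + D_psi[1][1]

sigma_x = [[0, 1], [1, 0]]
exp_pure = trace_prod(D_psi, sigma_x)

D_mix = mixture(0.36, outer([1 + 0j, 0j]), outer([0j, 1 + 0j]))
exp_mix = trace_prod(D_mix, sigma_x)

print(same_operator, tr, exp_pure, exp_mix)
```

Here D_psi is the superposition of the two basis vectors and D_mix is their mixture with the same weights; the off-diagonal entries of the matrix are what tell them apart.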
Density operators are just an aside, to show how
the standard machinery of quantum statistical physics fits into this framework.
What you need to know now is how to change your state after a measurement.
Let's suppose you observe the observable T to have value c.
I'll need a definition: a state E is an eigenstate of the observable T
with eigenvalue c iff E(T^n) = c^n for every natural number n.
(Note that, in the case E = E_|psi>,
this means |psi> is an eigenvector of T with eigenvalue c.)
If you're in an eigenstate of T with eigenvalue c,
you're guaranteed to measure T to be c.
Conversely, if you measure T to be c,
then your state ought to become an eigenstate of T with eigenvalue c.
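The eigenstate condition E(T^n) = c^n can be tested directly for vector states. This sketch only checks finitely many powers (max_power is a parameter I made up as a stand-in for "every natural number n"), on a two-dimensional space with T = sigma_z.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expect(psi, T):
    """<psi|T|psi> for a 2-component vector psi and a 2x2 matrix T."""
    return sum(psi[i].conjugate() * T[i][j] * psi[j]
               for i in range(2) for j in range(2))

def is_eigenstate(psi, T, c, max_power=6, tol=1e-12):
    """Check E(T^n) = c^n for n = 1, ..., max_power, with E = E_|psi>."""
    power = [[1, 0], [0, 1]]
    for n in range(1, max_power + 1):
        power = matmul(power, T)
        if abs(expect(psi, power) - c**n) > tol:
            return False
    return True

sigma_z = [[1, 0], [0, -1]]
up = [1 + 0j, 0j]                 # eigenvector of sigma_z, eigenvalue 1
tilted = [0.6 + 0j, 0.8 + 0j]     # not an eigenvector

mean = expect(tilted, sigma_z).real

print(is_eigenstate(up, sigma_z, 1))
print(is_eigenstate(tilted, sigma_z, mean))
```

For the tilted vector the n = 1 condition holds by construction, but E(T^2) = 1 is not the square of the mean, so the test fails at n = 2: the outcome has nonzero spread.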
Specifically, if you start with the state E and measure T to have the value c,
then the new state F is defined so that, for any positive observable U,
F(U) is the infimum of E(V) as V ranges over all positive observables
such that G(U) = G(V) whenever G is an eigenstate of T with eigenvalue c,
divided by the result of this process on U = 1 (to normalize);
again, nonpositive observables are just linear combinations of positive ones.
(Note that if the C* algebra is C_b (X) for some space X,
then this is equivalent to the procedure I gave in the classical case.)
If E is the mixture aF + bG, F is an eigenstate of T with eigenvalue c,
and G is an eigenstate of T with a different eigenvalue,
then this procedure will change E to F.
This is classical collapse; it's just conditional probability,
if you interpret E as being F with probability a and G with probability b.
If E is the superposition E_(a|psi> + b|phi>),
|psi> is an eigenvector of T with eigenvalue c,
and |phi> is an eigenvector of T with a different eigenvalue,
then this procedure changes E to E_|psi>.
This is pure quantum collapse; the term "collapse" is often reserved for this.
Both of these are special cases of the more general form of collapse.
Thus, traditional quantum collapse is analogous to conditional probability;
they're both special cases of the same general phenomenon,
the updating of states to reflect new information.
(Note that, unlike in the classical case,
collapse can take you from one pure state to another,
which shows that pure states, while providing *maximum* information,
can't be interpreted as providing *all* the information.
In other words, there must always be some uncertainty.)
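The two special cases of collapse can be compared side by side. A loud caveat: this sketch does not implement the infimum construction given above. It swaps in the standard projection (Lueders) rule D -> PDP / trace(PD) on density matrices, where P projects onto the eigenspace of the observed value, and I am only claiming that it reproduces the two outcomes described in the text. The weights 0.6 and 0.8 are made up, T is sigma_z, and the observed value is c = 1.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def outer(psi):
    """The operator |psi><psi| as a 2x2 matrix."""
    return [[psi[i] * psi[j].conjugate() for j in range(2)] for i in range(2)]

def collapse(D, P):
    """Projection (Lueders) update: D -> P D P / trace(P D)."""
    PDP = matmul(matmul(P, D), P)
    p = (PDP[0][0] + PDP[1][1]).real   # probability of the observed value
    if p == 0:
        raise ValueError("the observed value had probability 0")
    return [[PDP[i][j] / p for j in range(2)] for i in range(2)]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices up to rounding."""
    return all(abs(A[i][j] - B[i][j]) < tol
               for i in range(2) for j in range(2))

up, down = [1 + 0j, 0j], [0j, 1 + 0j]   # eigenvectors of sigma_z
P_up = [[1, 0], [0, 0]]                 # projector for eigenvalue c = 1
a, b = 0.6, 0.8

# Classical case: the mixture with weights |a|^2 and |b|^2.
D_mixture = [[a * a * outer(up)[i][j] + b * b * outer(down)[i][j]
              for j in range(2)] for i in range(2)]
# Quantum case: the superposition a|up> + b|down>.
D_superposition = outer([a * up[k] + b * down[k] for k in range(2)])

after_mixture = collapse(D_mixture, P_up)
after_superposition = collapse(D_superposition, P_up)

print(close(after_mixture, outer(up)), close(after_superposition, outer(up)))
```

Both starting states end up at |up><up|. The same update rule handles the mixture, where it is plain conditional probability, and the superposition, where it is what usually gets called collapse.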