Probability – 2

In a comment on my previous post arguing that probability is arbitrary, Stephen Bourque wrote:

Probability is an empirical measurement of an ensemble of events. It means: Given a set of N independent events, the probability of a specific event is, to a degree of certainty, the number of times the specific event occurred divided by the total N, as N becomes large. By “a degree of certainty,” it is meant simply that the uncertainty in the measurement can be made smaller and smaller by increasing N. (Since this is an inductive process, it has the characteristics of induction, including the requirement of objectively determining when N is large enough to achieve certainty of the probability measure.)

Let me work out the math to calculate the degree of certainty. Consider a coin tossed N times. Suppose that M tosses resulted in a ‘heads’ (H) outcome. To simplify the math (by keeping it in the discrete domain), suppose I know that the coin has been designed so that its “true” probability of heads on a single toss, ‘r’, is either ‘p’ or ‘q’. Let $H_{M,N}$ denote the event of obtaining M heads from N tosses. Let P(A/B) denote the conditional probability of A given B.

Using Bayes’ theorem,
$$P(r = p \,/\, H_{M,N}) = \frac{P(H_{M,N} \,/\, r = p)\, P(r = p)}{P(H_{M,N} \,/\, r = p)\, P(r = p) + P(H_{M,N} \,/\, r = q)\, P(r = q)}$$

with

$$P(H_{M,N} \,/\, r = p) = {}^{N}C_{M}\, p^{M} (1-p)^{N-M}$$

and

$$P(H_{M,N} \,/\, r = q) = {}^{N}C_{M}\, q^{M} (1-q)^{N-M}$$

If one knows P(r = p), the probability of the true probability being p, one can calculate $P(r = p \,/\, H_{M,N})$, the degree of certainty for the probability estimate r = p given the empirical data. The problem is that to calculate the degree of certainty of a probability estimate based on empirical data, one needs another probability number. To take a concrete example, suppose I know that my coin has a ‘true’ probability of heads of either 0.3 or 0.4 for a single toss. I toss the coin 100 times and get 33 heads, so that N = 100, M = 33, p = 0.3, q = 0.4. If I take P(r = 0.3) to be 0.5 (and hence P(r = 0.4) to be 0.5 as well), then the degree of certainty that r = 0.3 works out to be 69.7%. The problem is that the value of 0.5 for P(r = 0.3) is still arbitrary. It has no basis in empirical data.
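The arithmetic in the concrete example can be checked with a short script. This is only a minimal sketch of the two-hypothesis Bayes calculation described above; the function name posterior_p and its parameter names are mine, chosen for illustration.

```python
from math import comb

def posterior_p(n, m, p, q, prior_p):
    """Degree of certainty that r = p, given m heads in n tosses,
    for a coin whose true heads probability is either p or q."""
    # Binomial likelihoods of the data under each hypothesis.
    like_p = comb(n, m) * p**m * (1 - p)**(n - m)
    like_q = comb(n, m) * q**m * (1 - q)**(n - m)
    prior_q = 1 - prior_p  # only two hypotheses, so the priors sum to 1
    return like_p * prior_p / (like_p * prior_p + like_q * prior_q)

certainty = posterior_p(n=100, m=33, p=0.3, q=0.4, prior_p=0.5)
print(f"{certainty:.1%}")  # 69.7%, the figure quoted in the text
```

Calling posterior_p with a different prior_p (say 0.2 or 0.8) gives a different degree of certainty from the same 100 tosses, which is the dependence on the prior that the paragraph above points to.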

One can extend this to the continuous domain, where r may take any value between 0 and 1. To get a degree of certainty measure, one will need a prior probability distribution for the “true” probability and this distribution will have to be arbitrary. Just as I used a value of 0.5 in my concrete example, one may take this distribution to be the uniform distribution. I have not worked out the math for this case, but it should be easy to do so.
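As a rough sketch of what the continuous case might look like: if the prior on r is assumed to be uniform on [0, 1], the standard result is that the posterior after M heads in N tosses is a Beta distribution with parameters M + 1 and N - M + 1. A degree of certainty then has to be stated for an interval of r rather than a single value. The function name, the choice of interval and the simple midpoint-rule integration below are all illustrative assumptions, not something worked out in the post.

```python
from math import exp, lgamma, log

def posterior_mass(n, m, lo, hi, steps=20000):
    """Posterior probability that lo <= r <= hi, given m heads in n tosses,
    ASSUMING a uniform prior on r (itself an arbitrary choice)."""
    a, b = m + 1, n - m + 1  # posterior is Beta(a, b)
    # Log of the normalising constant 1 / B(a, b), kept in logs to avoid overflow.
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        r = lo + (i + 0.5) * width  # midpoint of the i-th slice, never exactly 0 or 1
        total += exp(log_norm + (a - 1) * log(r) + (b - 1) * log(1 - r))
    return total * width

# Certainty that r lies within 0.05 of the observed frequency 0.33.
print(posterior_mass(100, 33, 0.28, 0.38))
print(posterior_mass(1000, 330, 0.28, 0.38))
```

The second call uses ten times as many tosses with the same proportion of heads, and the posterior mass in the interval is larger. A different prior would give different numbers for the same data.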

Anyway, it turns out that as one increases the values of N and M proportionately, the degree of certainty for the probability estimate r = M / N approaches 100% very fast, irrespective of the arbitrarily chosen prior probabilities (provided the prior does not rule the value out altogether). Practically, this is a very useful feature and this is what Stephen refers to when he writes that the uncertainty in the measurement can be made smaller and smaller by increasing N. But does it change the epistemological status of probability calculations? I don’t think so. As long as N is finite – that is, always – the degree of certainty is arbitrary. At some level, probability calculations always depend on an arbitrary choice of equal likelihood. To see this, just consider Bayes’ theorem above. It uses a weighted average where the weights are prior (or unconditional) probabilities. These unconditional probabilities are usually themselves estimated with other empirical data. Regardless, the calculation of an average assumes an equality of significance of the numbers being averaged. My position is that this assumption of equality is an arbitrary assumption. By using more and more empirical data, one can drive this assumption deeper and deeper, but unless one develops a physical theory – a cause and effect relationship – one cannot get rid of it.
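The convergence can be illustrated numerically with the same two-hypothesis coin (p = 0.3, q = 0.4) and a fixed proportion of 33% heads. The sketch below is only an illustration: the function name and the particular priors and values of N are my own choices, and the calculation is done with log-odds (the binomial coefficient cancels) so that large N does not underflow.

```python
from math import exp, log

def posterior_p(n, m, p, q, prior_p):
    """Degree of certainty that r = p, given m heads in n tosses,
    for a coin whose true heads probability is either p or q."""
    # Log posterior odds of r = p against r = q.
    log_odds = (log(prior_p) - log(1 - prior_p)
                + m * (log(p) - log(q))
                + (n - m) * (log(1 - p) - log(1 - q)))
    # Convert odds to a probability without overflowing exp().
    if log_odds >= 0:
        return 1 / (1 + exp(-log_odds))
    e = exp(log_odds)
    return e / (1 + e)

for prior in (0.1, 0.5, 0.9):          # arbitrary prior values for P(r = 0.3)
    row = [posterior_p(n, 33 * n // 100, 0.3, 0.4, prior)
           for n in (100, 1000, 10000)]
    print(prior, [f"{x:.4f}" for x in row])
```

At N = 100 the three priors give noticeably different degrees of certainty; by N = 10000 all three are very close to 1. For any finite N, though, the number still depends on the prior that was fed in.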

Probability

I have struggled with the concept of probability for a long time. Not with the maths but with the meaning. Does a probability number really mean anything at all? And if so, what? Recently, I have reached a definite position on this. Here it is.

In a metaphysical sense, it seems clear that the probability number is meaningless. Or, more accurately, that it is not a property of the event in question at all (I am using the word ‘event’ loosely to refer to anything for which a probability may be calculated). An event either occurs or does not occur. No fractions are possible. Probability therefore must be a measure of a person’s state of knowledge of the factors that determine the event in question. That is, probability is an epistemological concept rather than a metaphysical one. It originates because of the need to make choices in the face of incomplete knowledge. This is clear since probability is used not just for future events but also for past ones. A classic example of this is the use of medical tests in conjunction with statistical analyses to arrive at a probability of a patient’s having a particular disease. In reality, either the person has the disease or not. The probability assigned to the possibility of disease is merely a tool used to decide whether further investigation is warranted.

This seems to suggest that probability is subjective rather than objective. But the precise math used to calculate probabilities suggests otherwise. Is probability subjective or objective? To answer this, it would be useful to look at what the words subjective and objective mean. In a comment on an old post, Burgess Laughlin wrote (and I agree):

“Objective,” in my philosophy (Objectivism), has two meanings. First, in metaphysics, it means existing independent of consciousness. The redundant phrase “objective reality” captures this meaning. Second, in epistemology, “objective” refers to knowledge that is drawn (inferred) logically from facts of reality. (See “Objectivity,” The Ayn Rand Lexicon.)

Subjective, as I use the word, refers to judgements or responses that cannot be traced back to facts of reality or the thought processes of the subject (emotions are a typical example).

Probability is not objective in the metaphysical sense. In fact, without consciousness, it would not exist at all. It is also not objective in the epistemological sense since it arises only in cases where the subject does not have complete knowledge of the facts of reality. And the fact that there are precise mathematical rules to calculate probabilities means that probability is not subjective either. If probability is neither objective nor subjective, what is it? Consider the case of a coin being tossed. Lacking any knowledge of the composition and weight distribution of the coin, the velocity with which it was tossed, the composition of air, the nature of the ground, etc., the probability of a particular side showing up is taken to be 0.5. Where did this number come from? It is quite clear that this choice is purely arbitrary. The entire math of probability is based on a simple principle applied consistently: given multiple possibilities and a complete lack of quantitative knowledge of relevant causes, each possibility has an equal probability. Clearly this is arbitrary, but it is the best one can do. And applied consistently, it provides a very precise framework for quantifying a lack of knowledge. It allows quantification of that which we do not even know!

Anyone who is familiar with Rand’s philosophy should note that my use of ‘arbitrary’ is different from (though related to and inspired by) Rand’s use of the word in the classification of the epistemological status of statements as true, false or arbitrary. To apply the principle of equal likelihood, one already needs to have identified all the possibilities. This means that probability cannot be applied to arbitrary (in Rand’s sense) assertions. What about the truth status of a statement involving probability? Such a statement can be demonstrated to be true (or false) subject to the equal likelihood principle. Without the principle, it is arbitrary (in Rand’s sense).

I think this classification – objective, subjective and arbitrary – might be useful in several other areas of math as well. For example (I need to think more about this though), it can be applied to Euclid’s axioms (in geometry). These axioms could be described as arbitrary and theorems could then be considered as true subject to the axioms.

In my next post, I will try to relate my position to statistics and randomness.