Rationality

In common usage, people tend to use the words rational and logical somewhat interchangeably. The purpose of this post is to distinguish between them.

Logic:

Logic is the set of rules that allows me to evaluate an argument independent of its content, purely from its structure. Just as I use grammar to parse a sentence and determine the relationships between the words in the sentence, I use logic to parse an argument and determine the relationships between the statements in the argument. Just as a grammatical sentence may be meaningless (Colorless green ideas sleep furiously), a logical argument may be meaningless or irrelevant.

However, the analogy with grammar only goes so far. There are many different grammars and all of them are equally valid within the context of their application – a given language. Any consistently applied way of meaningfully combining words in a sentence forms a grammar. Grammar is a matter of convention. The same is not true of logic. The word itself has no plural. This is a striking fact. Think about it. It indicates that man cannot even conceive of a plural for logic. There can be no such thing as my logic vs your logic. Logic is the structure of coherent thought. It is a part of the mental apparatus that man is born with. It is implicit in the capacity to think.

By implicit, I mean that I cannot choose to think illogically (though I may make mistakes). To identify mistakes in thinking, the implicit rules of logic need to be made explicit by identifying them. This is a science. Like all sciences, the science of logic presupposes several things. In particular, it presupposes man’s ability to use logic (implicitly). Whether the word logic refers to the implicit set of rules or to the science of identifying them depends on context. In this post, I am going to use the word logic to refer to the implicit set of rules.

Reason:

Reason is the faculty of understanding and integrating sensory material into knowledge. Reason does not work automatically. To reason, man has to consciously choose to think and to direct his thoughts to achieve understanding. By directing thoughts, I mean preventing thoughts from wandering by staying focussed. Reasoning involves the use of logic. It also involves several other techniques. “Reason employs methods. Reason can use sense-perception, integration, differentiation, reduction, induction, deduction, philosophical detection, and so forth in any combination as a chosen method in solving a particular problem.” [Burgess Laughlin in a comment on an old post] Deduction obviously uses logic. I believe induction does too, but the science of inductive logic is nowhere near as well developed as that of deductive logic. Sense-perception, integration and differentiation do not use logic (Note: integration and differentiation here refer to grasping the similarities and differences between various things). Reason, then, is not simply the faculty of using logic.

Rationality:

“Rationality is man’s basic virtue, the source of all his other virtues… The virtue of rationality means the recognition and acceptance of reason as one’s only source of knowledge, one’s only judge of values and one’s only guide to action.” [Ayn Rand, in The Virtue of Selfishness]

In the discussion that motivated this post, a colleague argued that if the use of reason does not guarantee correct decisions, it cannot be one’s only guide to action. A gut feeling or intuition might sometimes be a better guide to action. There are two separate issues here – the fact that the use of reason cannot guarantee correct decisions and the claim that intuition can be an alternative guide to action.

Consider intuition first. Intuition is an involuntary, automatic evaluation of the available choices. There is no conscious awareness of the reasons for the evaluation. Intuition is a learnt response from previous experience. As such, intuition is extremely helpful in any decision-making process. However, the fact remains that in every voluntary decision – the sort of decision where there is enough time to reason – intuition is only one of the inputs to the use of reason. As long as I make a decision consciously and deliberately, reason remains the only guide to action. The only alternative is to evade the responsibility of a choice. Relying upon intuition is not irrational in itself. I might decide that I do not have sufficient knowledge to reach a decision and choose to rely upon intuition instead. As long as I identify the lack of knowledge, my decision is fully rational. Identifying the lack of knowledge (and hopefully doing something about it) will allow me to learn from the new experience and improve my intuitions for future use. Blindly relying on intuition – by default instead of by choice – will weaken my intuition in the long run. Intuition is one of the most valuable tools for decision making, but it needs to be carefully cultivated by the use of reason for it to be good or useful.

It is important to stress that rationality (in the context of making a decision) involves the use of all my knowledge to the best of my ability. In particular, this includes knowledge of the time available, the relevance of prior experience and any known gaps in knowledge. It is this last aspect of rationality – the use of known gaps in knowledge – that is the motivation for the field of probability. Probability is about quantifying uncertainty by making use of all known information and postulating equal likelihood where no information is available. The consistent use of the equal likelihood postulate is at the heart of probability theory and it is what gives probability its precise mathematical characteristics. In modeling an outcome for an uncertain event, I start with a uniform distribution (every outcome is equally likely) and use available information to transform it into a more appropriate distribution. The parameters of the transformation represent a quantitative use of known information. The shape of the final distribution represents a qualitative use of the known information.
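This uniform-to-target transformation can be sketched with inverse-transform sampling. A minimal illustration; the normal target and its mean and standard deviation are assumptions made purely for the example, not anything specific to the discussion:

```python
import random
from statistics import NormalDist, fmean, stdev

random.seed(0)
# Start from the "no information" uniform distribution on [0, 1)...
uniform_draws = [random.random() for _ in range(10_000)]

# ...and transform it using available information. The assumed mean and
# standard deviation play the role of the transformation's parameters;
# the choice of a normal shape is the qualitative use of information.
target = NormalDist(mu=5.0, sigma=2.0)
samples = [target.inv_cdf(u) for u in uniform_draws]

print(fmean(samples), stdev(samples))  # close to 5.0 and 2.0
```

The sample mean and standard deviation land close to the assumed parameters, illustrating how the uniform draws carry no information of their own and the transformation supplies all of it.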

With this brief treatment of probability, I can now address the obvious fact that the use of reason cannot guarantee correct decisions. Consider an example. I have historical data for the exchange rate between a pair of currencies. I also have market quoted prices for various financial instruments involving the currency pair. To model the exchange rate at some future time with a probability distribution, I can use the historical data to establish the shape of the distribution and the market quoted prices to obtain the parameters of the distribution. If I had more information (say a model for other parameters that affect the exchange rate), I could incorporate that too. A decision based on such a model would be a rational decision. On the other hand, I could say that since the model does not guarantee success, I will simply use a uniform distribution (Ouch!! That is not even possible since the range for the exchange rate is unbounded. Let me simply restrict the range to an intuitive upper bound) with the argument that the uniform distribution might actually turn out to be better. Yes, it might turn out to be better, but the argument that it should be used is still invalid (Consequentialism is invalid and I am not going to argue this). Not all decisions can be formulated with precise mathematics like this, but the principle is the same. It is always better to use all my knowledge to the best of my ability.

Another aspect of the original discussion remains unaddressed – the claim that rationality is subjective. Since this post has already got long enough, I will just stress here that there is a difference between context-dependent and subjective.


Probability – 2

In a comment on my previous post, which argued that probability is arbitrary, Stephen Bourque wrote:

Probability is an empirical measurement of an ensemble of events. It means: Given a set of N independent events, the probability of a specific event is, to a degree of certainty, the number of times the specific event occurred divided by the total N, as N becomes large. By “a degree of certainty,” it is meant simply that the uncertainty in the measurement can be made smaller and smaller by increasing N. (Since this is an inductive process, it has the characteristics of induction, including the requirement of objectively determining when N is large enough to achieve certainty of the probability measure.)

Let me work out the math to calculate the degree of certainty. Consider a coin tossed N times. Suppose that M tosses resulted in a ‘heads’ (H) outcome. To simplify the math (by keeping it in the discrete domain), suppose I know that the coin has been designed so that its “true” single-toss heads probability ‘r’ is either ‘p’ or ‘q’. Let H_{M,N} denote the event of obtaining M heads from N tosses. Let P(A/B) denote the conditional probability of A given B.

Using Bayes’ theorem,

P(r = p / H_{M,N}) = P(H_{M,N} / r = p) P(r = p) /

[P(H_{M,N} / r = p) P(r = p) + P(H_{M,N} / r = q) P(r = q)]

with

P(H_{M,N} / r = p) = C(N, M) p^M (1 - p)^(N - M)

and

P(H_{M,N} / r = q) = C(N, M) q^M (1 - q)^(N - M)

where C(N, M) denotes the binomial coefficient “N choose M”.

If one knows P(r = p), the probability of the true probability being p, one can calculate P(r = p / H_{M,N}), the degree of certainty for the probability estimate of r = p given the empirical data. The problem is that to calculate the degree of certainty of a probability estimate based on empirical data, one needs another probability number. To take a concrete example, suppose I know that my coin has a ‘true’ probability of either 0.3 or 0.4 for a single toss. I toss the coin 100 times and get 33 heads, so that N = 100, M = 33, p = 0.3, q = 0.4. If I take P(r = 0.3) to be 0.5, then the degree of certainty works out to be 69.7%. The problem is that the value of 0.5 for P(r = 0.3) is still arbitrary. It has no basis in empirical data.
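The 69.7% figure can be reproduced directly. A minimal sketch, using Python's `math.comb` for the binomial coefficient (the coefficients actually cancel in the ratio, but are kept for fidelity to the formulas):

```python
from math import comb

N, M = 100, 33        # 100 tosses, 33 heads
p, q = 0.3, 0.4       # the two candidate "true" single-toss probabilities
prior = 0.5           # the arbitrary prior P(r = p)

def likelihood(r):
    """P(H_{M,N} / r): binomial probability of M heads in N tosses."""
    return comb(N, M) * r**M * (1 - r)**(N - M)

# Bayes' theorem: prior-weighted likelihoods in the denominator
posterior = (likelihood(p) * prior /
             (likelihood(p) * prior + likelihood(q) * (1 - prior)))
print(f"{posterior:.1%}")  # prints 69.7%
```

Changing `prior` changes the answer, which is exactly the problem: the degree of certainty inherits the arbitrariness of the prior.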

One can extend this to the continuous domain, where r may take any value between 0 and 1. To get a degree of certainty measure, one will need a prior probability distribution for the “true” probability and this distribution will have to be arbitrary. Just as I used a value of 0.5 in my concrete example, one may take this distribution to be the uniform distribution. I have not worked out the math for this case, but it should be easy to do so.
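Working out the continuous case: with a uniform prior on r over [0, 1], the posterior after observing M heads in N tosses is the Beta(M + 1, N - M + 1) distribution (a standard result). A sketch that computes a "degree of certainty" as the posterior probability that r lies within ±0.05 of the estimate M/N; the band width is an assumption made for illustration:

```python
from math import lgamma, log, exp

def beta_pdf(r, a, b):
    """Density of the Beta(a, b) distribution, computed in log space."""
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)
    return exp(log_norm + (a - 1) * log(r) + (b - 1) * log(1 - r))

def certainty(N, M, width=0.05, steps=10_000):
    """Posterior probability that r lies within +/- width of M/N,
    under the Beta(M + 1, N - M + 1) posterior (uniform prior),
    by midpoint-rule numerical integration."""
    lo, hi = M / N - width, M / N + width
    dr = (hi - lo) / steps
    return dr * sum(beta_pdf(lo + (i + 0.5) * dr, M + 1, N - M + 1)
                    for i in range(steps))

print(certainty(100, 33))    # moderate certainty
print(certainty(1000, 330))  # far higher certainty for the same heads ratio
```

The uniform prior plays the same role here as the arbitrary value of 0.5 in the discrete example.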

Anyway, it turns out that as one increases the values of N and M proportionately, the degree of certainty for the probability estimate r = M / N, rises to 100% very fast irrespective of the arbitrarily chosen prior probabilities. Practically, this is a very useful feature and this is what Stephen refers to when he writes that the uncertainty in the measurement can be made smaller and smaller by increasing N. But does it change the epistemological status of probability calculations? I don’t think so. As long as N is finite – that is, always – the degree of certainty is arbitrary. At some level, probability calculations always depend on an arbitrary choice of equal likelihood. To see this, just consider Bayes’ theorem above. It uses a weighted average where the weights are prior (or unconditional) probabilities. These unconditional probabilities are usually themselves estimated with other empirical data. Regardless, the calculation of an average assumes an equality of significance of the numbers being averaged. My position is that this assumption of equality is an arbitrary assumption. By using more and more empirical data, one can drive this assumption deeper and deeper, but unless one develops a physical theory – a cause and effect relationship – one cannot get rid of it.
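The rapid convergence, irrespective of the arbitrary prior, can be checked numerically. A sketch reusing the two-hypothesis setup above, with deliberately spread-out priors:

```python
from math import comb

def degree_of_certainty(N, M, p=0.3, q=0.4, prior_p=0.5):
    """P(r = p / H_{M,N}) via Bayes' theorem, for a given prior P(r = p)."""
    like_p = comb(N, M) * p**M * (1 - p)**(N - M)
    like_q = comb(N, M) * q**M * (1 - q)**(N - M)
    return like_p * prior_p / (like_p * prior_p + like_q * (1 - prior_p))

# Same 33% heads ratio at increasing N, with very different arbitrary priors:
for prior in (0.1, 0.5, 0.9):
    print(prior,
          round(degree_of_certainty(100, 33, prior_p=prior), 3),
          round(degree_of_certainty(1000, 330, prior_p=prior), 3))
```

At N = 100 the answer still depends heavily on the prior; by N = 1000 the three priors give nearly identical answers. But for any finite N the dependence never vanishes entirely, which is the point being made.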

Probability

I have struggled with the concept of probability for a long time. Not with the maths but with the meaning. Does a probability number really mean anything at all? And if so, what? Recently, I have reached a definite position on this. Here it is.

In a metaphysical sense, it seems clear that the probability number is meaningless. Or, more accurately, that it is not a property of the event in question at all (I am using the word ‘event’ loosely to refer to anything for which a probability may be calculated). An event either occurs or does not occur. No fractions are possible. Probability therefore must be a measure of a person’s state of knowledge of the factors that determine the event in question. That is, probability is an epistemological concept rather than a metaphysical one. It originates because of the need to make choices in the face of incomplete knowledge. This is clear since probability is used not just for future events but also for past ones. A classic example of this is the use of medical tests in conjunction with statistical analyses to arrive at a probability of a patient’s having a particular disease. In reality, either the person has the disease or not. The probability assigned to the possibility of disease is merely a tool used to decide whether further investigation is warranted.

This seems to suggest that probability is subjective rather than objective. But the precise math used to calculate probabilities suggests otherwise. Is probability subjective or objective? To answer this, it would be useful to look at what the words subjective and objective mean. In a comment on an old post, Burgess Laughlin wrote (and I agree):

“Objective,” in my philosophy (Objectivism), has two meanings. First, in metaphysics, it means existing independent of consciousness. The redundant phrase “objective reality” captures this meaning. Second, in epistemology, “objective” refers to knowledge that is drawn (inferred) logically from facts of reality. (See “Objectivity,” The Ayn Rand Lexicon.)

Subjective, as I use the word, refers to judgements or responses that cannot be traced back to facts of reality or the thought processes of the subject (A typical example is emotions).

Probability is not objective in the metaphysical sense. In fact, without consciousness, it would not exist at all. It is also not objective in the epistemological sense since it arises only in cases where the subject does not have complete knowledge of the facts of reality. And the fact that there are precise mathematical rules to calculate probabilities means that probability is not subjective either. If probability is neither objective nor subjective, what is it? Consider the case of a coin being tossed. Lacking any knowledge of the composition and weight distribution of the coin, the velocity with which it was tossed, the composition of air, the nature of the ground etc, the probability of a particular side showing up is taken to be 0.5. Where did this number come from? It is quite clear that this choice is purely arbitrary. The entire math of probability is based on a simple principle applied consistently. Given multiple possibilities and a complete lack of quantitative knowledge of relevant causes, each possibility has an equal probability. Clearly this is arbitrary, but it is the best one can do. And applied consistently, it provides a very precise framework for quantifying a lack of knowledge. It allows quantification of that which we do not even know!

Anyone who is familiar with Rand’s philosophy should note that my use of ‘arbitrary’ is different from (though related to and inspired by) Rand’s use of the word in the classification of the epistemological status of statements as true, false or arbitrary. To apply the principle of equal likelihood, one already needs to have identified all the possibilities. This means that probability cannot be applied to arbitrary (in Rand’s sense) assertions. What about the truth status of a statement involving probability? Such a statement can be demonstrated to be true (or false) subject to the equal likelihood principle. Without the principle, it is arbitrary (in Rand’s sense).

I think this classification – objective, subjective and arbitrary – might be useful in several other areas of math as well. For example (I need to think more about this though), it can be applied to Euclid’s axioms (in geometry). These axioms could be described as arbitrary and theorems could then be considered as true subject to the axioms.

In my next post, I will try to relate my position to statistics and randomness.

Book Review: Fooled by Randomness

I chanced upon Fooled by Randomness – The Hidden Role of Chance in Life and in the Markets by Nassim Nicholas Taleb at a friend’s place and took the time to read it. Having a bit of a financial background – I work in a company that did some financial modeling before I joined it – I had heard of Taleb and was curious. Besides, I want to understand probability better than I currently do – I mean philosophically, not mathematically – and the title was attractive.

The book is divided into three parts. Part I starts off with a long and rather boring story of two traders – a rash, ignorant and over-confident John and a conservative Nero. John succeeds for a time – purely through luck – makes a lot of money and then blows up – market slang for losing more money than you thought possible. Nero remains risk-averse and makes a steady amount but suffers snubs from people like John before being vindicated. The story is included primarily to show how large a role randomness plays in the markets. Taleb also comments on the fact that Nero suffered emotionally from the snubs of people who made more money than him even though he always knew himself to be the better trader. Taleb says this shows that the rational mind cannot prevent us from experiencing irrational emotions.

Taleb then discusses an “accounting method” by which a dentist is much richer than a lottery winner. If one were to consider all the “paths” the dentist’s life could take, there would not be much variation in the money he makes, and the “average” would be close to what he makes in any particular “path”. If one considers all the paths the lottery winner’s life could take, the average would be much lower than the money he makes on the winning path. This notion should seem familiar to anyone with a knowledge of Monte Carlo simulations, but I had not seen anyone put it so explicitly.

Taleb then goes on to discuss the difference between noise and significant information and how noise can affect perceptions over short timescales. He also discusses the dangers of fitting models to historical data. This is interrupted by an unexpected attack on Hegel’s pseudo-scientific philosophy that draws on Alan Sokal’s famous hoax. Taleb then talks of rare events, how their existence makes the difference between the median and the mean important, and how most people, including statisticians, often unwittingly ignore this difference.
He then talks briefly about Bacon, Hume and Popper in relation to the problem of induction and the difficulty of induction in the presence of rare events.

Part II deals with various biases in the perception and evaluation of events and outcomes in areas where randomness plays a major role. He draws on work by Kahneman and Tversky – which I am not even remotely familiar with – to claim that in dealing with uncertainty, our minds adopt certain heuristics/biases that are blind to reason (Prospect theory, Affect heuristic, Hindsight bias, Belief in the law of small numbers, Two systems of reasoning and Overconfidence). While it is easy to see how a person with no understanding of probability theory could be misled in the many examples Taleb gives, it is difficult to believe that people trained in probability would also be misled.

Part III deals with Taleb’s interpretation of stoicism as the solution to living in a world with so much uncertainty. Taleb writes that we should accept that we are incapable of making our emotions rational and attempt to behave with dignity in all circumstances. He writes that stoicism should not mean a stiff upper lip and a banishment of emotions but an acceptance of emotions and the uncertainties of life, with the focus being on the process rather than the outcome. This part is titled Wax in my ears, in a reference to the story of Odysseus and the Sirens. Taleb writes that he knows that he is not as great as Odysseus and instead of tying himself to the mast, he chooses to have wax in his ears. That is, he chooses to accept that his emotions will always be fooled by randomness and that the only solution is to avoid situations where he might encounter such emotions (by not listening to the news or not tracking asset prices on a moment-by-moment basis, etc.).

Overall, several anecdotes in the book are mildly entertaining, but intellectually, there is very little that I gained from the book. I agree with a lot of Taleb’s views on the role of luck in the markets and the inadequacy or even meaninglessness of most financial models, but I had already reached these views before reading Taleb and frankly I don’t think they merit a significant part of a book. These views can be easily expressed in a few pages – perhaps I will write a post myself. Taleb does not provide any definition of probability – something that I had hoped for – apart from the following excerpt. Taleb’s style is quite disconnected and the numerous back and forward references are irritating, especially since the references are hardly convincing. For example, in the following excerpt he refers to something in Chapter 3, but there is no convincing argument there, not even a hint.

Ask your local mathematician to define probability, he would most probably show you how to compute it. As we saw in Chapter 3 on probabilistic introspection, probability is not about the odds, but about the belief in the existence of an alternative outcome, cause, or motive. Recall that mathematics is a tool to meditate, not compute. Again, let us go back to the elders for more guidance – for probabilities were always considered by them as nothing beyond a subjective, and fluid, measure of beliefs.

The only thing that I got from the book is a reminder that I need to formulate more completely a proper alternative to Popper’s scepticism.
