Saturday, August 18, 2012

The role of evidence in science

In Another Question For Atheists, John Barron asks
For the sake of argument let’s grant that the quantity and quality of evidence, according to your standard, brought the probability of whether God exists to 50%. In other words, it was as equally probable that God exists as doesn’t according to your own criteria. Presuming agnosticism is not an option, would you choose theism or atheism?, and equally as important, why?

Barron does not understand the role of evidence in the scientific method. The scientific method does not consist of using the evidence to assign probabilities to all the competing hypotheses, and then "believing" the most likely.

The second component, "believing" the most likely hypothesis, is easier to address. A probabilistic statement cannot be reduced to a statement of truth or falsity: it is always a fallacy to say, "the probability that X is true is p; therefore, X is true," for any value of p, even 99.999999999%. The best you can ever say is, "The probability that X is true is p." Even this statement is oversimplified; generally even the simplest probabilistic statement has at least two dimensions: "The probability that X is between a and b is p." The fallacy of equivocation between probability and truth is at the heart of the lottery "paradox".

The first component is a little harder to address. In science, we do not just create any old hypotheses and try to assign probabilities to them. Rather, we create two very special kinds of hypotheses. The first is the null hypothesis, which hypothesizes that two variables are "related" only by chance; they are independent of each other. The second is the alternative hypothesis, that the two variables are correlated in reality; they are not "related" only by chance.

If, for example, we want to investigate the relationship between the amount of food eaten and the amount of weight gained, we formulate two hypotheses:

  • H0: The amount of food eaten and the amount of weight gained are related only by chance.
  • Ha: The amount of food eaten and the amount of weight gained are correlated in reality.

We do an experiment, measuring the amount of food eaten and the amount of weight gained, and we do magic statistics to calculate a p-value. It is extremely important to understand what the p-value means. The p-value does not represent the probability that either the null or the alternate hypothesis is true. Instead, the p-value answers the question: if the null hypothesis were true, what is the probability that we would observe values at least as extreme as the measured ones by chance? The smaller the p-value, the stronger the grounds for rejecting the null hypothesis. (If the test statistic is instead converted to a one-sided probability, both very low (near 0) and very high (near 1) values represent the unlikely "tails" of the underlying distribution.) If the null hypothesis is true, p-values are spread evenly between 0 and 1, so a very small one turns up only rarely.
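One way to make the "magic statistics" concrete is a permutation test, sketched below. This is only an illustration, not the method any particular study would use: the food and weight numbers are made up, and the helper names `pearson_r` and `permutation_p_value` are hypothetical. Shuffling the weight column simulates a world where the null hypothesis is true, and the p-value is the share of shuffles that look at least as correlated as the real data.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(xs, ys, n_shuffles=10_000, seed=0):
    """Estimate P(|r| at least as large as observed | null hypothesis is true)."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        # Shuffling breaks any real link between the two variables,
        # so each shuffle is one draw from the "related only by chance" world.
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)


# Made-up measurements: daily calories eaten and kilograms gained per month.
food = [1800, 2000, 2200, 2400, 2600, 2800, 3000, 3200]
gain = [-0.4, -0.1, 0.0, 0.3, 0.2, 0.6, 0.7, 1.0]

p = permutation_p_value(food, gain)
print(f"r = {pearson_r(food, gain):.3f}, p-value = {p:.4f}")
```

Note what the number is not: it says nothing directly about how probable H0 or Ha is. It only says how often chance alone would produce data this extreme.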

There are a lot of different ways you have to set up the analysis of any experiment or observation to get meaningful p-values, and of course the above analysis applies only to hypotheses that can be expressed quantitatively. The underlying philosophy, however, can be applied to qualitative hypotheses. First, the statement must be reducible to a null hypothesis and a mutually exclusive and exhaustive alternative hypothesis. Second, there must be some potential observation that is in some sense "unlikely" were the null hypothesis true. If we actually observe the unlikely potential observation, we have grounds for rejecting the null hypothesis in favor of the alternative hypothesis. If you cannot even qualitatively or abstractly represent an idea in this manner, it is not, in this view, a meaningful statement about the world.

Thus, "the probability that God exists is 50%" is not the sort of probability that's meaningful for deciding the question of whether God exists. Instead, I would want to see probabilities like "If God did not exist, the probability of observing X is less than 0.5%; since we do observe X, we have grounds for rejecting the null hypothesis and accepting that God exists."

Keep in mind, however, that testing the null hypothesis like this is only half of the criteria. The null-hypothesis methodology underdetermines the mechanisms of correlation, even when applied piecemeal to the intermediate steps between two variables. Therefore, we also apply Occam's Razor: the alternate hypothesis must also be the simplest way of negating the null hypothesis.

The Fine-Tuning Argument for the existence of God is a good example of a hypothesis that is well-formed by the null/alternate hypothesis criteria, but fails Occam's Razor.

  • H0: The physical constants of the universe are a product of chance.
  • Ha: The physical constants of the universe were "fine-tuned" to allow life to exist.

There is no controversy that as best we presently understand physics, the p-value for the observation that the physical constants of the universe allow life to exist is very low. Precisely how low is a matter of controversy, but we can assume for the sake of argument [see below] that it is "statistically significant" to at least the 99.9% confidence level, which would be accepted as sufficient evidence to reject the null hypothesis in even the most rigorous study.

However, the alternate hypothesis is not exhaustive (alternate hypotheses are never exhaustive). For example, a low probability, no matter how low, does not entail the impossibility of that event. The "alternate" hypothesis that we just "got lucky" is simpler than another alternate that requires equal or greater luck. And indeed the probability that by chance we got a god who wanted this particular universe is actually lower than the probability that by chance we got this particular universe, because the population of all possible gods exceeds the population of all possible universes governed by physical law.

There are other reasons to reject the Fine Tuning argument, but it is at least meaningfully formed according to the null/alternate hypothesis method.

ETA: Thinking about the issue more carefully, it's actually difficult to draw any solid statistical inferences about physical constants, even if we assume they are in some sense, perhaps metaphysical, randomly distributed. The problem is that we have a sample of size one. The least restrictive assumptions are that each physical constant is randomly distributed, and the physical constants are all mutually independent. Even on those assumptions, the best estimate of the mean, median, and mode of each constant is the value we actually observe. Furthermore, we have no way of estimating what kind of random distribution the (possibly metaphysical) population of each constant follows: the normal distribution is only one kind of random distribution. And even if we assume a normal distribution, a sample size of one gives us no way at all of estimating the variance of the population: the estimate of the variance from one sample divides by zero. So talking about the probability of this particular universe in the population of all possible universes requires assumptions that can be neither theoretically nor empirically justified.
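The sample-of-one problem can be seen directly in a few lines. This is a minimal sketch: `sample_variance` is a hypothetical helper written out by hand so the division is visible, and the single value (roughly the inverse fine-structure constant) is just a stand-in for any one observed constant.

```python
import statistics


def sample_variance(values):
    """Unbiased sample variance: squared deviations divided by (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


# One universe, one observed value of the constant.
one_universe = [137.036]

# The best estimate of the center is simply the value we observed.
print("estimated mean:", statistics.mean(one_universe))

# The spread cannot be estimated at all: n - 1 is zero.
try:
    sample_variance(one_universe)
except ZeroDivisionError as err:
    print("variance undefined:", err)

# The standard library refuses for the same reason.
try:
    statistics.variance(one_universe)
except statistics.StatisticsError as err:
    print("variance undefined:", err)
```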

The best we can say is that given there are 20 independent, normally distributed physical constants, the probability is 0.95^20 ≈ 0.358 that all the constants we observe in this actual universe are within about two (unknown) standard deviations of the population mean, which is insufficient evidence at even the loosest confidence level to reject the null hypothesis that this universe is unusual. (There would have to be 77 normally distributed physical constants for more than 98% of all possible universes to have even one constant outside of about two standard deviations of the mean.)
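The arithmetic here is easy to check. The sketch below assumes, as the paragraph does, that the constants are independent and that "about two standard deviations" covers 95% of each distribution; the function names are illustrative.

```python
def prob_all_within(n_constants, coverage=0.95):
    """Probability that every one of n independent constants falls inside
    the central `coverage` region of its own distribution."""
    return coverage ** n_constants


def min_constants_for(target, coverage=0.95):
    """Smallest number of constants for which the chance that at least one
    falls outside the central region exceeds `target`."""
    n = 1
    while 1 - coverage ** n <= target:
        n += 1
    return n


print(round(prob_all_within(20), 3))  # 0.358
print(min_constants_for(0.98))        # 77
```

With 20 constants, nearly two universes in three would have at least one constant outside the two-standard-deviation band, so having one there is nothing out of the ordinary.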

However, under all possible assumptions where the same assumptions govern the population of all (perhaps metaphysical) universes and all (perhaps metaphysical) gods, the probability that this particular universe occurred by chance is always higher than the probability that a god who created this possible universe occurred by chance.

1 comment:

  1. In Bayesian analysis, it makes more sense to make a claim like "The probability that X is true is 50%". But it's still not a good statement to make, because it's meaningless without stating your priors.

    And ideally, you should not commit to any single set of prior probabilities, you should try out several prior probabilities to determine if your result is robust. A result of 50% is inherently non-robust, and should not be trusted.

