Three years after I wrote a post on being "bad" at math, I've improved considerably.
Calc III (multivariate calculus) was traumatic and a bit of a slog. I wasn't yet very good with algebra, and I have a hard time visualizing three-dimensional space. But I did take a few more math classes after that, and graduated with a minor in math. More importantly, I started working as a math tutor and later as a math instructor, which improved my algebra enormously. I snuck the "discipline" in by way of my interest in teaching.
Indeed, my ability has increased and my interest has been restored sufficiently that I started a Master's in Applied Mathematics (along with the Economics Master's) last fall. (I have an interesting opportunity for a full time teaching position, so in the fall, I'm going to switch back to just the Econ Master's and do the Math Master's later.)
My view of math hasn't changed that much. It's just a way of constructing and examining certain kinds of patterns in certain kinds of ways. Many of those patterns are easily translatable to the real world; others are surprisingly applicable to the real world, and others seem (at present) to have no possible application to the real world. I'm just getting better at seeing the patterns.
Showing posts with label logic and mathematics. Show all posts
Monday, March 20, 2017
Wednesday, December 04, 2013
I'm "bad" at math
I have a confession to make: I'm "bad" at math. My math professors seem to disagree; I've received A's in all my math classes so far, including my math-heavy economics classes. I'm probably not going to get an A in my latest math class, but that has more to do with the fact that my personal life is very weird right now, and that I'm not going to pursue math as math any further. (I'll still do math in economics, though.)
The thing is, I'm "bad" at math, but I'm good at doing things I'm "bad" at. I should probably explain what I mean here.
To be "good" at something is to fully internalize the fundamental mental tools of a discipline to the point where the conscious mind can simply take the tools for granted.
For example, I'm "good" at expository writing. Although I'm always making refinements and improvements, I have fully internalized the fundamental tools of grammar, punctuation, and spelling, as well as paragraph and larger-unit organization. I don't have to consciously think about any of these elements; most of the major cognitive work has been moved to my subconscious. When I have an idea, it just "appears" in my conscious mind in properly constructed sentences and paragraphs. My subconscious is not perfect, and I do of course still have to think consciously about writing, but 90% of the work happens in my subconscious. Internalizing these low-level tools is not sufficient to be "good" at writing, and it is possible to write well without these tools, but internalizing the tools makes writing well consistently and frequently easier and more enjoyable. I can spend almost all of my time thinking about the subject matter, rather than the presentation, and when I think about presentation, I can focus on "higher-level" tasks rather than struggling to make sure each sentence is grammatically correct. Similarly, I'm "good" at computer programming, and I've internalized the fundamental syntax and organization of computer programming, freeing my mind to think about "higher level" work.
(Note that my dry, abstract, and somewhat dense style is by design. I write what I like to read.)
My facility with writing is not a matter of "talent" or innate ability, except to the extent, I think, that I "innately" enjoy reading and writing. Basically, because I enjoy the subject matter, I enjoy practicing to gain these low-level skills. I don't have to "force" myself to read or write, and I don't have to "force" myself to write computer programs.
In contrast, I'm not "good" at math because I haven't internalized the fundamental mental tool of mathematics, which is ordinary algebra. I consciously know algebra, but I haven't internalized it. I could, I suppose, but unlike writing and computer programming, I don't innately enjoy algebra. When I get a difficult algebraic problem, I have to force myself to solve it, and if I can use a crutch, like a computer-aided algebra system, I will do so without hesitation.
I don't fully agree with Doron Zeilberger; I don't think mathematicians should let computers do all of the algebra (including the "algebra" of integrals), precisely because so much of higher math seems to involve "creative" algebra: seeing the "hidden" algebraic relationships necessary to solve complex problems. Seeing these "hidden" relationships requires internalizing the low-level algebraic mechanics. In writing, "seeing" how to express a complex thought requires internalizing the low-level mechanics of grammar. (Again, you can express a complex thought without internalizing the low-level mechanics of grammar, but it's much more difficult to do so: you can't just "see" the correct expression.)
I'm good at doing things I'm bad at because 90% of most interesting endeavors is seeing patterns, and I'm "good" at seeing patterns, precisely because I enjoy looking for patterns and practice a lot. I can do most anything that isn't pure "muscle memory," precisely because I'm good at picking up on the patterns within a field. But if I don't enjoy actually acquiring the muscle memory, my progress is limited: I won't practice. I'm not a big fan of "discipline," practicing things I don't enjoy doing for the sake of internalizing the fundamentals. I'd rather spend my time practicing things I actually do enjoy.
I've just finished Calculus III (multivariate calculus), and I'm done with math as math. Calc III, at least as I've been taught, is 1% generally interesting patterns, 2% patterns interesting to physicists, and 97% grinding out algebra. I'm not really complaining; Calc III is the gateway to a math degree (and most STEM degrees), and intensively practicing algebra enough to internalize it is absolutely necessary. You need to either really enjoy doing algebra, or have enough discipline to practice it anyway. But I have neither enjoyment nor sufficient discipline to continue.
I wouldn't change the expectation that math majors practice algebra continuously; they need it. Still, there are a couple of things I wouldn't mind seeing in math instruction.
First, it would be awesome to have a math track for people who don't do math as math, focusing on the higher-level patterns in math, which are (even if the underlying algebra is tedious) amazingly interesting, beautiful, and incredibly useful. A lot of different fields, including economics (and, to some extent, political science), can use a lot of higher math without having to actually grok the math as math.
The second thing I'd like to change is how math is taught in economics. Speaking just as a political scientist, not to mention a radical revolutionary communist, I find the pretense that economics is not a normative discipline ridiculous. To uphold the pretense, economics has retreated into math; "economists" just prove mathematical theorems that they suspect might have some tenuous relevance to how people produce, distribute, and consume goods and services. I have been advised many times that if I want to get a Ph.D. in economics, it's nearly useless to study undergraduate economics; top grad schools would rather have candidates who are great mathematicians who know little to nothing about economics than people with a deep understanding of economics with less than the most excellent mathematical skill. I have just enough discipline to master enough math to get a Master's in economics, but I can think of few endeavors I would find more boring and pointless than to do what passes for economics at the Ph.D. level. If I'm offending any of my current or future professors or advisors, oh well; they will have to console themselves that they are at least ensuring that a future political scientist and theorist will not be completely ignorant of the structure of capitalist economics.
I'm not saying that economics should not use a lot of math. Math is an extremely useful language for talking about the world. I am saying, however, that unlike mathematicians, and perhaps unlike physical scientists, economists do not need to be "good" at math. Being "good" at math is, I think, useful for purely descriptive fields, but economics is normative (and anyone who tells you differently is trying to sell you something). All the problems in economics that I find interesting are not about finding new ways of describing the world in rigorous mathematics, they are about looking at how our social relations interact and intersect with real economic behavior (producing, distributing, and consuming real goods and services). (Hence I'm more-or-less a "Marxist.") Math is useful, but not fundamental. It's more important to be "good" at economics, to internalize thinking about real economic behavior, than to be "good" at math.
But the world isn't as I wish it to be. Fortunately, there's enough wiggle room in the system that I can educate myself in what I want to learn while still doing what academia wants me to do to gain the credentials that I need.
ETA: I've since improved my math.
Sunday, December 09, 2012
Saturday, January 14, 2012
Can we deduce supply and demand curves?
Can we deduce the supply and demand curves in terms of opportunity cost, assuming only declining marginal utility of consumption? (I.e., without assuming increasing marginal labor cost of production.)
I think it can be done (and perhaps it already has been), but I haven't seen it done, and I don't think I yet have all the right mathematical tools to derive it. Perhaps a reader with better math than mine could help?
Declining marginal utility of consumption means basically that to obtain the first widget, which takes three hours* to produce, I might forego the last doodad, which takes one hour to produce. To obtain the second widget I will not, however, forego the second-to-last doodad, but I might forego the last thingamabob, which takes two hours to produce.
*of abstract labor time
Given fixed marginal labor time of supply (it takes x hours to produce one more of any good at any quantity) but declining marginal utility of consumption, what is the overall equilibrium price of each good in an economy?
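Here is a purely hypothetical numerical sketch of the setup above. Every number and good name is invented: each good has a constant (fixed) labor cost per unit, marginal utility declines with quantity, and a consumer with a fixed budget of labor hours buys greedily by marginal utility per labor hour. This is just a simple heuristic illustration of the question, not a derivation of the equilibrium.

```python
# Hypothetical sketch: fixed marginal labor cost per good, declining
# marginal utility, and a consumer allocating a fixed budget of labor
# hours. All numbers are invented for illustration.
import heapq

labor_cost = {"widget": 3.0, "doodad": 1.0, "thingamabob": 2.0}

def marginal_utility(good, n):
    """Utility of the n-th unit of a good (n = 1, 2, ...); the numbers
    are made up, chosen only so that marginal utility declines."""
    base = {"widget": 12.0, "doodad": 3.0, "thingamabob": 7.0}[good]
    return base / n

def allocate(hours):
    """Greedily buy whichever next unit has the highest marginal
    utility per labor hour, until no affordable unit remains."""
    heap = [(-marginal_utility(g, 1) / labor_cost[g], g, 1)
            for g in labor_cost]
    heapq.heapify(heap)
    bought = {g: 0 for g in labor_cost}
    while heap:
        neg_mu_per_hr, g, n = heapq.heappop(heap)
        if labor_cost[g] > hours:
            continue                      # can't afford another unit
        hours -= labor_cost[g]
        bought[g] = n
        heapq.heappush(heap,
                       (-marginal_utility(g, n + 1) / labor_cost[g], g, n + 1))
    return bought

print(allocate(12.0))  # {'widget': 2, 'doodad': 2, 'thingamabob': 2}
```

At the stopping point, the marginal utilities per hour of the next units are roughly equalized across goods, which is the opportunity-cost logic of the widget/doodad/thingamabob trade-offs described above.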
Monday, March 14, 2011
Friday, January 14, 2011
Simple, statistical and emergent properties
Consider John Conway's Game of Life. This game is important because it gives us a conceptual handle on simple, statistical and emergent properties.
Briefly, we have a grid of cells, and each cell in the grid has a state: It can be "alive" or "dead" (or on/off, 1/0, etc.) The state of a cell is a simple property.
There are also statistical properties of a grid, such as the number of alive cells, the mean or median number of alive cells in some set of distinct arbitrarily-defined subgroups of cells (e.g. the mean or median number of alive cells in distinct squares of nine or sixteen cells, or the mean number of consecutive alive cells, etc.) We can define these sorts of statistical properties in terms of computability: a property is statistical if it can be computed for a finite grid in time polynomial in the size of the grid.
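For instance, here is a short Python sketch of two such statistical properties (the list-of-lists 0/1 representation of the grid is my own choice); each is computable in a single pass over the cells, i.e. in time polynomial in the grid size:

```python
# Sketch: statistical properties of a grid, each computable in time
# polynomial in the number of cells. The grid is represented as a
# list of lists of 0/1 (an assumed representation).

def live_count(grid):
    """Number of alive cells: one O(s) pass over the s cells."""
    return sum(cell for row in grid for cell in row)

def mean_live_per_3x3_block(grid):
    """Mean number of alive cells over distinct 3x3 blocks (grid
    dimensions assumed divisible by 3, for simplicity)."""
    rows, cols = len(grid), len(grid[0])
    blocks = [sum(grid[r + i][c + j] for i in range(3) for j in range(3))
              for r in range(0, rows, 3)
              for c in range(0, cols, 3)]
    return sum(blocks) / len(blocks)

grid = [[0, 1, 0],
        [1, 1, 1],
        [0, 1, 0]]
print(live_count(grid))               # 5
print(mean_live_per_3x3_block(grid))  # 5.0 (only one block here)
```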
The Game of Life also specifies transformation rules: Given a grid G0 of finite size, with some specific set of alive and dead cells, there is exactly one grid G1 of the same size as G0 that results from the transformation rules applied to grid G0. It might or might not happen to be the case that G0 and G1 are identical. If G1 is not identical to G0, then we can apply the transformation rules to G1 to determine grid G2.
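The transformation rules are easy to state in code. Here is a minimal Python sketch (the list-of-lists 0/1 grid representation is my own choice), with cells outside the grid treated as permanently dead so that the result has the same size as the input:

```python
# Sketch of the size-preserving transformation rules: a dead cell with
# exactly 3 live neighbors becomes alive; a live cell with 2 or 3 live
# neighbors stays alive; every other cell is dead in the next grid.

def step(grid):
    """Apply the rules once; cells outside the grid count as dead, so
    the result has the same size as the input."""
    rows, cols = len(grid), len(grid[0])
    def live_neighbors(r, c):
        return sum(grid[i][j]
                   for i in range(max(0, r - 1), min(rows, r + 2))
                   for j in range(max(0, c - 1), min(cols, c + 2))
                   if (i, j) != (r, c))
    return [[1 if live_neighbors(r, c) == 3
             or (live_neighbors(r, c) == 2 and grid[r][c]) else 0
             for c in range(cols)]
            for r in range(rows)]

# A blinker: a vertical bar of three live cells flips to a horizontal
# bar and back, so applying the rules twice returns the original grid.
g0 = [[0, 1, 0],
      [0, 1, 0],
      [0, 1, 0]]
g1 = step(g0)
print(g1)              # [[0, 0, 0], [1, 1, 1], [0, 0, 0]]
print(step(g1) == g0)  # True
```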
If we preserve the grid size during transformations, then we know that for every grid G, there is either a final grid Gn such that Gn is identical to Gn-1, or there is a final period of grids Gn,m (with count m - n) such that Gm+1 is identical to Gn. (A final grid is the same as a final period of grids of count 1.) The maximum final period must be less than 2^s, where s is the size of the grid. (It has to be strictly less because we know that some grids are part of periods smaller than 2^s.) It should be clear that the final period of grids is determined exclusively by the initial grid G: for every grid G, there is exactly one final period of grids Gy,z. Thus the final period is ontologically reducible to the initial grid. (Note that a final grid is not necessarily epistemically reducible to the initial grid: there are many initial grids that produce the same final period of grids.)
However, we cannot generally compute the final period of grid G in polynomial time. For size-preserving transformation rules, we can generally compute the final period only in exponential time, i.e. 2^c, where c is less than the grid size s. We'll classify these properties as emergent.
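A brute-force sketch of computing the final period (again assuming a list-of-lists grid representation): since there are only 2^s possible grids of size s, iterating the rules must eventually revisit a state, and the gap between the two visits of the first revisited state is the final period. In the worst case this takes on the order of 2^s steps, which is the exponential cost just described:

```python
# Sketch: find the final period of a fixed-size grid by iterating the
# transformation rules until a state repeats. With 2^s possible grids
# of size s, a repeat is guaranteed within 2^s steps.

def step(grid):
    rows, cols = len(grid), len(grid[0])
    def live_neighbors(r, c):
        return sum(grid[i][j]
                   for i in range(max(0, r - 1), min(rows, r + 2))
                   for j in range(max(0, c - 1), min(cols, c + 2))
                   if (i, j) != (r, c))
    return tuple(tuple(1 if live_neighbors(r, c) == 3
                       or (live_neighbors(r, c) == 2 and grid[r][c]) else 0
                       for c in range(cols))
                 for r in range(rows))

def final_period(grid):
    state = tuple(tuple(row) for row in grid)
    seen = {}               # state -> step index of its first visit
    t = 0
    while state not in seen:
        seen[state] = t
        state = step(state)
        t += 1
    return t - seen[state]  # length of the final period

blinker = [[0, 1, 0],
           [0, 1, 0],
           [0, 1, 0]]
print(final_period(blinker))  # 2
```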
Things get even more interesting when we do not preserve grid size. Conceptually, we can compute Gn+1 by first adding an extra margin of dead cells around Gn, applying the transformation rules, and then trimming the result to the smallest grid containing all the live cells. Alternatively, we can start with an enumeration of a finite number of alive cells, each with a position of finite size, and apply the transformation rules, which will result in another finite number of alive cells (which may be more or less than the original number), each with a position of finite size. Again, it seems clear that the result (whatever it happens to be) is strictly ontologically reducible to the initial state: one initial state will produce exactly one result.
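The enumeration-based approach can be sketched by representing the state as nothing but the finite set of live-cell coordinates (again, my own choice of representation), so the "grid" is free to grow or shrink:

```python
# Sketch of the unbounded version: the state is the finite set of
# coordinates of live cells, so grid size is not preserved.
from collections import Counter

def step(live):
    """One transformation: count, for every cell adjacent to a live
    cell, how many live neighbors it has, then apply the rules."""
    counts = Counter((x + dx, y + dy)
                     for (x, y) in live
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

# A glider: five live cells that reproduce themselves, translated
# diagonally by one cell, every four steps (abstract periodicity).
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
state = glider
for _ in range(4):
    state = step(state)
print(state == {(x + 1, y + 1) for (x, y) in glider})  # True
```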
The result can be strict periodicity, such as a block or boat with count 1, or a blinker with count 2; or abstract periodicity, such as a block plus a glider. But since the grid size is not constant, we are not even assured that there is always any periodicity, strict or abstract. So not only can we not generally bound the computation needed to determine the final period, we can't even generally tell in finite time whether some finite initial state has any final period at all. And yet we have not at all removed the conditions of strict ontological reducibility.
The Game of Life, and the notions of simple, statistical, and emergent properties, are not just interesting in and of themselves; they have some interesting philosophical implications for both atheism and economics.
Monday, August 09, 2010
P != NP
A paper purports to prove that P != NP. Any of my mathematical readers care to comment?
(via Bruce Schneier)
Saturday, February 13, 2010
Evidentiary and deductive reasoning
Evidentiary and deductive reasoning are two related but substantively different modes of reasoning.
Deductive reasoning is the reasoning mathematicians typically use, at least when they are creating proofs. We use deductive reasoning when we take one or more statements as axiomatic*, i.e. "true" a priori or by definition, and we serially apply a specific, finite set of mechanical inference or transformation rules to those statements, one rule at a time. A set of axioms and inference rules comprises a formal system. By definition, the theorems, i.e. any and every statement generated in this manner, regardless of the order in which the inference rules were applied, are also "true". Douglas Hofstadter goes into the deductive process in great detail in his book, Gödel, Escher, Bach: An Eternal Golden Braid. Simple deductive systems typically use propositional calculus or first-order logic as the inference rules, so we typically distinguish different systems by their axioms. Start with Euclid's axioms and you have plane geometry; start with Peano's axioms and you have natural arithmetic.
*We can also use an axiom schema, a rule for producing axioms. We can, however, consider an axiom schema as a simple formal system with no loss of generality.
Using deductive reasoning, I can write a simple computer program to print out true theorems of any deductive system faster than a roomful of mathematicians. The inference rules are mechanical and deterministic: each inference rule produces exactly one output for any given input. Therefore I can write a computer program that takes the first axiom, applies the first inference rule to that axiom to generate a theorem, and prints the theorem. The program then applies the second inference rule to the axiom and prints out that theorem. Once we've applied each inference rule to the first axiom, we apply each inference rule to the second axiom, and so forth. We then repeat the process of applying each inference rule to the theorems generated in the first round. If we have an infinite amount of memory (to remember all the theorems we've generated) and an infinite amount of time, we will print every theorem of the formal system.
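As a concrete sketch, here is that brute-force enumeration applied to the MIU system from Gödel, Escher, Bach (axiom "MI"; the choice of example system is mine). One wrinkle relative to the idealized description above: MIU's rules 3 and 4 can apply at several positions in a string, so a rule can produce more than one output; the program simply generates all of them, breadth-first.

```python
# Sketch: brute-force enumeration of the theorems of a formal system,
# using Hofstadter's MIU system. Axiom: "MI". Rules:
#   1. xI  -> xIU   (append U to a string ending in I)
#   2. Mx  -> Mxx   (double everything after the M)
#   3. III -> U     (at any position)
#   4. UU  -> ""    (delete any UU)
from collections import deque

def derivations(s):
    if s.endswith("I"):
        yield s + "U"                      # rule 1
    yield s + s[1:]                        # rule 2
    for i in range(len(s) - 2):
        if s[i:i + 3] == "III":
            yield s[:i] + "U" + s[i + 3:]  # rule 3
    for i in range(len(s) - 1):
        if s[i:i + 2] == "UU":
            yield s[:i] + s[i + 2:]        # rule 4

def enumerate_theorems(limit):
    """Breadth-first search from the axiom; returns the first `limit`
    distinct theorems in order of discovery."""
    seen, queue, out = {"MI"}, deque(["MI"]), []
    while queue and len(out) < limit:
        s = queue.popleft()
        out.append(s)
        for t in derivations(s):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return out

print(enumerate_theorems(6))
# ['MI', 'MIU', 'MII', 'MIUIU', 'MIIU', 'MIIII']
```

The breadth-first queue is what makes the enumeration exhaustive: with unbounded memory and time, every theorem eventually reaches the front of the queue.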
But of course we don't have infinite memory and time. In fact, with this brute-force method we will quickly exhaust even a universe-scale computer before we ever get to an "interesting" theorem, such as the theorem of arithmetic that there are infinitely many prime numbers. We might never even get to "1+1=2"! Cleverness in deductive reasoning consists of finding the chain of inference rules that leads to "interesting" theorems. (Indeed, two extremely clever people, Alfred North Whitehead and Bertrand Russell, required 362 pages to lay the groundwork to prove that 1+1=2, and did not complete the proof until 86 pages into the second volume. We would require Knuth's up-arrow notation to describe the number of universes required to find this proof by brute force.)
The deductive method poses some deep and interesting philosophical problems, but if we use simple enough inference rules, we always know with absolute certainty that our theorems are "true"... or at least they are as "true" as our axioms. (Philosophers typically more-or-less understand and use first-order logic, which is known to be consistent, and known to be insufficiently powerful to express all "interesting" conjectures. Mathematicians, I suspect, roll their eyes in tolerant amusement when philosophers get all excited about the self-referential weirdnesses in more powerful systems.)
But we don't always know, or cannot arbitrarily specify, a set of axioms and inference rules; all we know are the "theorems". This is basically the situation we're in regarding our experience: our experiences are like theorems, and our goal is to discover the inference rules (basic and abstract natural laws) and/or the starting premises (what happened in the past) that connect these experiences. In these cases, because we do not have well-defined and pre-specified axioms and inference rules, we must use evidentiary reasoning. The experiences or "theorems" are the evidence, and we want to discover the axioms (or at least other theorems) and inference rules that connect and explain that evidence.
(Philosophers made a valiant effort to put science on a purely deductive footing with Naive Empiricism (a.k.a. Logical Positivism): our observations are axioms, we use the "universal" a priori rules of logic as our inference rules, and attempt to deduce the underlying natural laws and earlier conditions using this formal system. Unfortunately, it didn't work, for a lot of reasons.)
We still use deduction in evidentiary reasoning, because we want to express the connections and explanations with the same sort of mechanistic, deterministic rigor that characterizes deductive reasoning. But in evidentiary reasoning, deduction is only a part of the process; it's not helpful to say that the deductive theorems are just as "true" as the axioms, because we're in doubt about the axioms and inference rules themselves.
We find it convenient to separate evidentiary reasoning into two primary modes. The first mode is to discover inference rules. A convenient and efficient way to discover inference rules is to use experimental science: very precisely observe (or experience) what's "true" at one point in time, wait, then observe what's true a little later, and propose inference rules ("laws of nature") that would rigorously explain the transformation. The controlled experiment refines this process even further, since it's very difficult to actually observe everything that's true at any point in time. Instead we create two situations that are as alike as possible in all but one element, and then a little later observe what's true about those situations, and propose inference rules to rigorously explain the difference in the outcomes in terms of the difference in the initial conditions.
The second mode is to discover the initial or preceding conditions when we can observe only the resulting conditions. A convenient and efficient way to discover preceding conditions is historical science: take the inference rules we have discovered from experimental science and propose initial conditions that those inference rules would have transformed into what we presently observe.
Evidentiary reasoning appears much more difficult that deductive reasoning, at least to do consciously. In every literate culture, we see the development of mathematics follow almost instantly on the heels of literacy. It took Western European culture, however, nearly two thousand years of literacy and mathematics to develop and codify evidentiary reasoning, and (AFAIK) no other culture independently developed and codified evidentiary reasoning and used it on a large scale.
On the other hand, perhaps paradoxically, evidentiary reasoning does not require consciousness or codification. Biological evolution itself is an "evidentiary" process: we try out different "formal systems" (biological arrangements of brains) at random; organisms with brains that fail to accurately model reality do not survive to reproduce and are selected against.
With simple enough inference rules (which do give us considerable power) we can be rigorously certain not only that all of our deductions do correctly follow from our axioms, but also that our inference rules never produce a contradiction (eliminating half the possible statements as non-theorems, statements that cannot be generated from the axioms and inference rules) and that all possible statements are definitely theorems or non-theorems. Philosophy typically uses propositional calculus (provably consistent and complete) or first-order logic (consistent and semicomplete). Higher-order logic, however, confuses most philosophers.
Evidentiary reasoning also does not give us the kind of confidence we can get from deductive reasoning. We have only a finite amount of evidence (our actual observations and experience), but there are an infinite number of possible formal systems that would account for that evidence (i.e. the facts in evidence are theorems of the formal system). Furthermore, it might be the case that there is no formal system that accounts for the evidence. It might be the case, for example, that the universe is infinite and truly random, in which case a set of observations and experiences that looks like the workings of every underlying set of natural laws modeled by a formal system will occur at one point or another.
Therefore we have to apply additional formal criteria to evidentiary reasoning for it to have any utility. The additional criteria are simplicity and falsifiability. The criteria of simplicity specifies that if more than one formal system accounts for the evidence, we prefer the formal system with the fewest axioms and inference rules. (A corollary of the simplicity criterion is that two formal systems with the same theorems are equivalent.) But the simplicity criterion isn't enough, otherwise we would prefer the simplest "degenerate" explanation that all statements are true: obviously all statements about evidence follow from this explanation. The criterion of falsifiability specifies that only formal systems where statements that contradict true statements about observation or experience are non-theorems are interesting.
Note that simplicity is not a criterion of deductive reasoning: the most complicated proof in the world (such as the four color theorem or Fermat's last theorem) are just as good as the most elegant, compact proof. The criterion of falsifiability has an analog in the deductive criterion of non-contradiction, but it's more trivial: it specifies that exactly half of all decidable statements are theorems and the other half non-theorems (i.e. if X is a theorem, then not-X is a non-theorem, and vice-versa. There are some interesting exceptions to this rule, sadly beyond the scope of this post.)
Although related, deductive and evidentiary reasoning work in "opposite" directions. Deduction asks the question: what interesting statements are theorems of this formal system? Evidentiary reasoning asks the opposite question: in what formal system are these interesting statements theorems?
Deductive reasoning is the reasoning mathematicians typically use, at least when they are creating proofs. We use deductive reasoning when we take one or more statements as axiomatic*, i.e. "true" a priori or by definition, and serially apply a specific, finite set of mechanical inference or transformation rules to those statements, one rule at a time. A set of axioms and inference rules comprises a formal system. By definition, the theorems, i.e. any and every statement generated in this manner, regardless of the order in which the inference rules were applied, are also "true". Douglas Hofstadter goes into the deductive process in great detail in his book, Gödel, Escher, Bach: An Eternal Golden Braid. Simple deductive systems typically use propositional calculus or first-order logic as the inference rules, so we typically distinguish different systems by their axioms. Start with Euclid's axioms and you have plane geometry; start with Peano's axioms and you have natural arithmetic.
*We can also use an axiom schema, a rule for producing axioms. We can, however, consider an axiom schema as a simple formal system with no loss of generality.
Using deductive reasoning, I can write a simple computer program to print out true theorems of any deductive system faster than a roomful of mathematicians. The inference rules are mechanical and deterministic: each inference rule produces exactly one output for any given input. Therefore I can write a computer program that takes the first axiom, and applies the first inference rule on that axiom to generate a theorem and prints the theorem. The program then applies the second inference rule to the axiom and prints out that theorem. Once we've applied each inference rule to the first axiom, we apply each inference rule to the second axiom, and so forth. We then repeat the process of applying each inference rule to the theorems generated in the first round. If we have an infinite amount of memory (to remember all the theorems we've generated) and an infinite amount of time, we will print every theorem of the formal system.
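The program described above can be sketched in a few lines of Python. As the formal system, this sketch uses the MIU system from Gödel, Escher, Bach (axiom "MI" plus four string-rewriting inference rules); a breadth-first enumeration guarantees that every theorem is eventually printed, given unlimited time and memory:

```python
from collections import deque

def miu_successors(s):
    """Apply each MIU inference rule to s, yielding every immediate theorem."""
    if s.endswith("I"):              # Rule 1: xI -> xIU
        yield s + "U"
    if s.startswith("M"):            # Rule 2: Mx -> Mxx
        yield s + s[1:]
    for i in range(len(s) - 2):      # Rule 3: replace any III with U
        if s[i:i + 3] == "III":
            yield s[:i] + "U" + s[i + 3:]
    for i in range(len(s) - 1):      # Rule 4: delete any UU
        if s[i:i + 2] == "UU":
            yield s[:i] + s[i + 2:]

def enumerate_theorems(axiom="MI", limit=10):
    """Breadth-first enumeration of theorems: apply every rule to the
    axiom, then to each generated theorem in turn, remembering what
    we've already seen."""
    seen, queue, out = {axiom}, deque([axiom]), [axiom]
    while queue and len(out) < limit:
        for t in miu_successors(queue.popleft()):
            if t not in seen:
                seen.add(t)
                queue.append(t)
                out.append(t)
    return out[:limit]

# The first few theorems, starting "MI", "MIU", "MII", ...
print(enumerate_theorems())
```

Note that "MU" never appears in the output, no matter how long the program runs: the famous MU puzzle from Hofstadter's book.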
But of course we don't have infinite memory and time. In fact, with this brute-force method we will quickly exhaust even a universe-scale computer before we ever get to an "interesting" theorem, such as the theorem of arithmetic that there are infinitely many prime numbers. We might never even get to "1+1=2"! Cleverness in deductive reasoning consists of finding the chain of inference rules that leads to "interesting" theorems. (Indeed, two extremely clever people, Alfred North Whitehead and Bertrand Russell, required 362 pages to lay the groundwork to prove that 1+1=2, and did not complete the proof until 86 pages into the second volume. We would need Knuth's up-arrow notation to describe the number of universes required to find this proof by brute force.)
The deductive method poses some deep and interesting philosophical problems, but if we use simple enough inference rules, we always know with absolute certainty that our theorems are "true"... or at least they are as "true" as our axioms. (Philosophers typically more-or-less understand and use first-order logic, which is known to be consistent, and known to be insufficiently powerful to express all "interesting" conjectures. Mathematicians, I suspect, roll their eyes in tolerant amusement when philosophers get all excited about the self-referential weirdnesses in more powerful systems.)
But we don't always know, or cannot arbitrarily specify, a set of axioms and inference rules; all we know are the "theorems". This is basically the situation we're in regarding our experience: our experiences are like theorems, and our goal is to discover the inference rules (basic and abstract natural laws) and/or the starting premises (what happened in the past) that connect these experiences. In these cases, because we do not have well-defined and pre-specified axioms and inference rules, we must use evidentiary reasoning. The experiences or "theorems" are the evidence, and we want to discover the axioms (or at least other theorems) and inference rules that connect and explain that evidence.
(Philosophers made a valiant effort to put science on a purely deductive footing with Naive Empiricism (a.k.a. Logical Positivism): our observations are axioms, we use the "universal" a priori rules of logic as our inference rules, and attempt to deduce the underlying natural laws and earlier conditions using this formal system. Unfortunately, it didn't work, for a lot of reasons.)
We still use deduction in evidentiary reasoning, because we want to express the connections and explanations with the same sort of mechanistic, deterministic rigor that characterizes deductive reasoning. But in evidentiary reasoning, deduction is only a part of the process; it's not helpful to say that the deductive theorems are just as "true" as the axioms, because we're in doubt about the axioms and inference rules themselves.
We find it convenient to separate evidentiary reasoning into two primary modes. The first mode is to discover inference rules. A convenient and efficient way to discover inference rules is to use experimental science: very precisely observe (or experience) what's "true" at one point in time, wait, then observe what's true a little later, and propose inference rules ("laws of nature") that would rigorously explain the transformation. The controlled experiment refines this process even further, since it's very difficult to actually observe everything that's true at any point in time. Instead we create two situations that are as alike as possible in all but one element, and then a little later observe what's true about those situations, and propose inference rules to rigorously explain the difference in the outcomes in terms of the difference in the initial conditions.
The second mode is to discover the initial or preceding conditions when we can observe only the resulting conditions. A convenient and efficient way to discover preceding conditions is historical science: take the inference rules we have discovered from experimental science and propose initial conditions that those inference rules would have transformed into what we presently observe.
Evidentiary reasoning appears much more difficult than deductive reasoning, at least to do consciously. In every literate culture, we see the development of mathematics follow almost instantly on the heels of literacy. It took Western European culture, however, nearly two thousand years of literacy and mathematics to develop and codify evidentiary reasoning, and (AFAIK) no other culture independently developed and codified evidentiary reasoning and used it on a large scale.
On the other hand, perhaps paradoxically, evidentiary reasoning does not require consciousness or codification. Biological evolution itself is an "evidentiary" process: we try out different "formal systems" (biological arrangements of brains) at random; organisms with brains that fail to accurately model reality do not survive to reproduce and are selected against.
With simple enough inference rules (which do give us considerable power) we can be rigorously certain not only that all of our deductions do correctly follow from our axioms, but also that our inference rules never produce a contradiction (eliminating half the possible statements as non-theorems, statements that cannot be generated from the axioms and inference rules) and that all possible statements are definitely theorems or non-theorems. Philosophy typically uses propositional calculus (provably consistent and complete) or first-order logic (consistent and complete, though only semidecidable). Higher-order logic, however, confuses most philosophers.
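For propositional calculus, that certainty can be demonstrated mechanically: a brute-force truth table settles any formula one way or the other. A minimal sketch, with formulas encoded as Python functions of their variables:

```python
from itertools import product

def is_tautology(formula, num_vars):
    """Brute-force decision procedure for propositional calculus:
    a formula is a theorem (tautology) iff it is true under every one
    of the 2**num_vars possible truth assignments."""
    return all(formula(*vals)
               for vals in product([False, True], repeat=num_vars))

# Material implication: 'a -> b' is false only when a is true and b is false.
imp = lambda a, b: (not a) or b

# Peirce's law, ((p -> q) -> p) -> p, is a theorem of classical logic:
print(is_tautology(lambda p, q: imp(imp(imp(p, q), p), p), 2))   # -> True
# A bare conditional p -> q is satisfiable but not a theorem:
print(is_tautology(lambda p, q: imp(p, q), 2))                   # -> False
```

Every propositional statement is thus definitely a theorem or a non-theorem; no such mechanical procedure exists for first-order or higher-order logic.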
Evidentiary reasoning also does not give us the kind of confidence we can get from deductive reasoning. We have only a finite amount of evidence (our actual observations and experience), but there are an infinite number of possible formal systems that would account for that evidence (i.e. the facts in evidence are theorems of the formal system). Furthermore, it might be the case that there is no formal system that accounts for the evidence. It might be the case, for example, that the universe is infinite and truly random, in which case a set of observations and experiences that looks like the workings of every underlying set of natural laws modeled by a formal system will occur at one point or another.
Therefore we have to apply additional formal criteria to evidentiary reasoning for it to have any utility. The additional criteria are simplicity and falsifiability. The criterion of simplicity specifies that if more than one formal system accounts for the evidence, we prefer the formal system with the fewest axioms and inference rules. (A corollary of the simplicity criterion is that two formal systems with the same theorems are equivalent.) But the simplicity criterion isn't enough; otherwise we would prefer the simplest "degenerate" explanation that all statements are true: obviously all statements about evidence follow from this explanation. The criterion of falsifiability specifies that only formal systems in which statements contradicting true statements about observation or experience are non-theorems are interesting.
Note that simplicity is not a criterion of deductive reasoning: the most complicated proof in the world (such as the proof of the four color theorem or of Fermat's last theorem) is just as good as the most elegant, compact proof. The criterion of falsifiability has an analog in the deductive criterion of non-contradiction, but it's more trivial: it specifies that exactly half of all decidable statements are theorems and the other half non-theorems (i.e. if X is a theorem, then not-X is a non-theorem, and vice versa; there are some interesting exceptions to this rule, sadly beyond the scope of this post).
Although related, deductive and evidentiary reasoning work in "opposite" directions. Deduction asks the question: what interesting statements are theorems of this formal system? Evidentiary reasoning asks the opposite question: in what formal system are these interesting statements theorems?
Sunday, June 07, 2009
Population and p-value
A week ago, I reported on what seemed to be a case of statistical illiteracy. I have a question for my more statistically sophisticated readers.
The study in question appears to take into account the entire population of all Chrysler dealers. Is the p-value even meaningful in this case? Doesn't the p-value talk about sampling error? Under what circumstances is the p-value meaningful when applied to a population statistic, instead of a sample statistic? Should the population of Chrysler dealers be considered a sample of some larger population?
Saturday, June 06, 2009
Mathematically proven to be a fucktard
Nazariel natters about numbers:
I have seen the odds of having the right amino acids and other components coming together from the primordial ooze and creating life mathematically computed. It produces a number followed by so many zeros that if I used the smallest font on my computer it would take the rest of this article to write them in, plus another several hundred pages to record them all. ...
Long before the advent of the computer mathematicians were able to calculate odds. At the close of World War One a famous eschatologist, Mr. Clarence Larkin had some mathematicians make some calculations for him. They calculated that the events that took place in the last twenty four hours in the life of Christ which fulfilled Old Testament prophecies about him took place against the odds of fifty three million to one in favor. Put simply, no one but Jesus Christ could have fulfilled these prophecies as he did. ...
What I am saying is that even with a fifty-fifty chance the atheists, agnostics and evolutionists are the poorest gamblers in the world.
Nazariel is correct about one thing, though:
I never did well in math.
You don't say.
Monday, June 01, 2009
Lies, damned lies and statistics
Matt Taibbi picks up the story about the study, noting:
a highly positive correlation between dealer survival and Clinton donors[.] Granted, that P-Value (0.125) isn’t enough to reject the null hypothesis at 95% confidence intervals (our null hypothesis being that the effect is due to random chance), but a 12.5% chance of a Type I error in rejecting a null hypothesis (false rejection of a true hypothesis) is at least eyebrow raising.
(Taibbi's article is noteworthy for the quip, "Hell, if you want to punish a Chrysler dealer, it seems to me that the best thing to do is force him to keep trying to sell Chryslers.")
Finding such a p-value "eyebrow raising" reveals an inexcusable ignorance of statistics. First of all, the 95% confidence interval is not a high bar or a "gold standard". It is, rather, a relatively low standard, a rule of thumb to indicate whether some correlation is worth investigating further. One in twenty studies where the null hypothesis is actually true will find results good to 95% by chance. A result with a one-in-eight chance of arising even when the null hypothesis is true indicates the correlation is not worth investigating further.
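A quick simulation makes the one-in-twenty point concrete. Here the null hypothesis is true by construction (we're flipping a fair coin), yet roughly 5% of simulated "studies" still clear the 95% bar by chance. This is an illustrative sketch, not anything from the original study:

```python
import random

def false_positive_rate(studies=2000, n=1000, z_crit=1.96, seed=1):
    """Simulate studies where the null hypothesis is TRUE (a fair coin)
    and count how often the observed proportion of heads still looks
    'significant' under a two-sided z-test at the 95% level."""
    random.seed(seed)
    significant = 0
    for _ in range(studies):
        heads = sum(random.getrandbits(1) for _ in range(n))
        # z-score of the observed count under the null (normal approximation)
        z = (heads - n * 0.5) / (n * 0.25) ** 0.5
        if abs(z) > z_crit:
            significant += 1
    return significant / studies

# Roughly one study in twenty is "significant" even though nothing is there.
print(false_positive_rate())
```

A p-value of 0.125, the figure the original authors found "eyebrow raising", is a far weaker result still.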
Second, the original authors admit they "matched dealer data against several variables including" (but presumably not limited to) seven specific criteria (party affiliation, donations to three candidates and "other", donation amount and zip code). When you compare several variables, you are doing several studies. Even assuming they calculated only seven different possible correlations, the probability that one of them would have achieved a p-value of 12.5% by chance is extremely high. There are statistical tests, such as Tukey's test, that correctly account for doing multiple comparisons. The authors do not report the results of any multiple comparison analyses.
I realize that even my two-week tutelage under a statistician gives me a better understanding of statistics than most scientists (and perhaps many professional statisticians), but really, it's completely indefensible and evidence of nothing but statistical illiteracy to see this study as having anything but a completely negative result.
Update: The hypothesis that the Obama administration would favor Clinton donors (p = 0.125) more strongly than Obama donors (p = 0.509), and treat Republican (p = 0.636) and Democratic (p = 0.676) donors equally is wildly implausible. It's hard to interpret drawing a causal conclusion as anything but incompetence or dishonesty.
Monday, May 12, 2008
Modal logic
Just as standard logic is a system for dealing rigorously with the ideas of "true" and "false", modal logic is a system for dealing rigorously with the ideas of "sometimes true" and "always (or never) true".
Modal logic is fine, as far as it goes. You can use it for a lot of different things, so long as your definition of "sometimes" and "always" is consistent. I'm sometimes in my house, but not always; I'm always male. Scientific laws are universals; facts are accidents. It's required that you pay your taxes; giving $3 of those taxes to presidential campaign funding is optional.
But (at least some) philosophers seem to like modal logic because it's easy to create equivocations which are difficult to detect. Plantinga's modal ontological argument is a perfect example of an equivocation fallacy.
- It is proposed that a being has maximal excellence in a given possible world W if and only if it is omnipotent, omniscient and wholly good in W; and
- It is proposed that a being has maximal greatness if it has maximal excellence in every possible world.
- Maximal greatness is possibly exemplified. That is, it is possible that there be a being that has maximal greatness. (Premise)
- Therefore, possibly it is necessarily true that an omniscient, omnipotent and perfectly good being exists.
- Therefore, it is necessarily true that an omniscient, omnipotent and perfectly good being exists. (By S5)
- Therefore, an omniscient, omnipotent and perfectly good being exists.
As Graham Oppy observes, "Perhaps somewhat surprisingly, Plantinga himself agrees: the "victorious" modal ontological argument is not a proof of the existence of a being which possesses maximal greatness." When a philosopher denies the conclusion of his own argument, you must suspect he's bullshitting you.
A careful examination of premise 3 — Maximal greatness is possibly exemplified. That is, it is possible that there be a being that has maximal greatness. — shows the problem. But first some background.
One of the uses of modal logic is to examine the concept of logically possible worlds, i.e. those worlds where true statements about that world are logically consistent, but some statements that are true about that world are not true of our world. This is just a rigorous way of talking about subjunctive and counterfactual reasoning, which people routinely employ: e.g. "If I hadn't gone back for my wallet, I would have caught the train," or, "If Ralph Nader hadn't run, then Al Gore would have been inaugurated President in 2001."
This semantic way of employing modal logic, though, assumes that each set of consistent non-modal truths defines a possible world. A non-modal statement (e.g. "Al Gore was inaugurated as US President in 2001") is different from a modal statement (e.g. "There exists a possible world in which Al Gore was inaugurated President in 2001"). The non-modal statement is (sadly) false in this particular world, but it could easily have been true (along with other statements) without any fundamental logical contradiction.
The modal statement, however, is true in all logically possible worlds. Even if Al Gore was or was not inaugurated President in this or any particular possible world, it is true in all possible worlds that some such possible world exists, perhaps elsewhere. A modal statement in possible world semantics does not divide possible worlds into those worlds where it is true and those worlds where it is not. It's either true everywhere or true nowhere.
So on one horn of the dilemma, Plantinga's premise #3 is simply not well-formed: it is not a statement of modal logic.
But perhaps Plantinga does not intend logically possible world semantics. Perhaps, as his comment leads us to believe, he means epistemic possibility: we don't know whether or not God exists; it's epistemically possible that God exists. But if so, he seems to use modal logic in a weird way, weird even for a philosopher.
Consider this similar argument:
- All true arithmetic statements are true in all possible worlds. (Definition)
- If Goldbach's conjecture is true in any possible world, it is true in all possible worlds. (By 1)
- It's possible that Goldbach's conjecture is true. (Premise)
- Therefore Goldbach's conjecture is true in at least one possible world.
- Therefore Goldbach's conjecture is true in all possible worlds. (By 1)
- Therefore Goldbach's conjecture is true.
Plantinga goes on to say, "Take any valid argument: once you see how it works, you may think that asserting or believing the premise is tantamount to asserting or believing the conclusion." His argument can be construed to mean that "either the existence of God is logically impossible or it is logically necessary." I'm sure that those paying Dr. Plantinga's salary are quite pleased that he has gone to elaborate lengths to prove that logical arguments are indeed logical, and modal logic is indeed modal.
Thursday, May 08, 2008
Christian Logic?
I'm checking out Christian Logic. Their list of fallacies is sound and thorough enough. I'll have to check out the rest of the site.
Monday, September 10, 2007
How to solve the logic problems (part 2)
I'll talk here about the last three problems from the How Logical Are You? quiz.
Question 6: One of the very tricky things about logic problems with all/none/some problems is that English is ambiguous and inconsistent about what "not", "none" or "no" really apply to. If the problem doesn't use "some" (see question 8), we can use material implications. "All P's are Q's" means "if P then Q" (p -> q). "No P's are Q's" means "if P then not Q" as well as "if Q then not P" (p -> ~q and q -> ~p). Again, we can extract the valid statements by using the contrapositive (see question 1).
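A brute-force truth table confirms these equivalences. A minimal sketch in Python, encoding material implication directly:

```python
from itertools import product

def implies(a, b):
    """Material implication: 'if a then b' is false only when a is
    true and b is false."""
    return (not a) or b

for p, q in product([False, True], repeat=2):
    # "All P's are Q's": p -> q agrees with its contrapositive ~q -> ~p
    # on every assignment of truth values.
    assert implies(p, q) == implies(not q, not p)
    # "No P's are Q's": p -> ~q is equivalent to q -> ~p.
    assert implies(p, not q) == implies(q, not p)

print("all equivalences hold")
```

Note that the converse (q -> p) does not share this property: it disagrees with p -> q whenever exactly one of p and q is true.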
If you're middle-aged as I am, you might remember Venn diagrams from your New Math class. A Venn diagram is a good way of visualizing all/some/none relationships.

Notice that "All A's are B's" is asymmetrical: A's relationship to B is different from B's relationship to A. That's why we use the asymmetrical if ... then... On the other hand "No A's are B's" is symmetrical: A has the same relationship to B as B does to A. That's why we use two material implications.
One important trick in logic questions is that "not true" does not always mean "definitely false" (and "not false" does not always mean "definitely true"). In many cases some assertions are—given the premises—uncertain, not determinate, unknown. You can validly assert neither their truth nor their falsity. This question tries to trick you because it specifies a relationship between musicians and chefs and a relationship between teachers and chefs, but it doesn't specify any relationship at all between musicians and teachers.
Question 7: This question is not a logic question per se, but rather a pattern recognition question. There's a general pattern in these types of problems: The patterns that come earlier in the test are usually simple and direct; they become more complicated and abstract as the test progresses. Since this is the first pattern-type question, the pattern you're looking for is indeed fairly simple.
Once you get the pattern, you can get the answer with some simple arithmetic.
Question 8: Like question 6, this is an all/none/some question, but this one uses "some". "Some A's are B's" (and "some A's are not B's"), however, gives us very little information, so little that you can rarely draw any valid conclusions. The Venn diagrams are ambiguous:

Since "some" questions are ambiguous, you want to look for "none of the above". The definitely true or false statements you can extract from a "some" condition are so grammatically complex and ambiguous that test makers rarely bother to include them.
Sunday, September 09, 2007
How to solve the logic problems
On Friday, I took the How Logical Are You? quiz. I thought it might be useful to discuss how to answer each of the questions (I won't give the actual answers, though).
Question 1: There's an easy way and a hard way to solve this problem. The easy way is to examine each answer in turn and see if it satisfies the conditions in the question: If Ralph is 60, is he four times as old as Frank is at 15? If so, add 20 years to each age: When Ralph is 80, will he be twice as old as Frank at 35?
There's an additional shortcut. In a question like this, typically all the answers will satisfy the simpler (usually first) condition, but only one will satisfy the more complicated condition. Check the more complicated condition first for each answer; check the simpler condition only if the answer satisfies the more complicated one.
You can solve questions like this the hard way by solving simultaneous algebraic equations. Note: you should learn basic algebra; it's very useful.
There are two variables, Ralph's and Frank's present ages, so there must be two conditions (there are infinitely many ways for two variables to satisfy a single condition). Write down the conditions algebraically (R means Ralph's present age, F means Frank's):
- R = 4 * F
- (R + 20) = 2 * (F + 20)
- R = 4 * F
- R = (2 * (F + 20)) - 20 = 2 * F + 2 * 20 - 20 = 2 * F + 20
- 4 * F = 2 * F + 20
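The "hard way" can also be checked mechanically. Here is a minimal sketch in Python, using a different, entirely hypothetical age problem (Alice and Bob, with made-up numbers) so as not to spoil the quiz's answer; it solves the general 2x2 linear system by Cramer's rule:

```python
from fractions import Fraction

# Hypothetical problem (not the quiz's): Alice is three times Bob's age,
# and in 10 years she will be twice his age.
#   A = 3 * B
#   A + 10 = 2 * (B + 10)
# Rearranged into standard form: A - 3B = 0 and A - 2B = 10.

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1, a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("no unique solution")
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y

A, B = solve_2x2(1, -3, 0, 1, -2, 10)
print(A, B)  # 30 10
```

Checking: 30 = 3 * 10, and in ten years 40 = 2 * 20, so both conditions hold.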
You can also solve this problem using graphs. Each condition specifies a line; the answer is the point where the lines intersect. I've left the numbers off the graph so as to not spoil the question.

Question 2: This question tests your understanding of material implication. The key words are "if ... then ..." and the absence of the key word "only". (See question 4). What follows the "if" is usually called the antecedent (often labeled as p), and what follows the "then" is called the consequent (q); we can express a material implication in logical notation as p -> q. Sometimes the test taker will try to trip you up by reversing the order in the question (see question 4).
There is only one valid way of transforming a material implication by changing the order of the antecedent and consequent and/or adding "not" to either or both: the contrapositive, which reverses the antecedent and consequent and negates both: if not q then not p (~q -> ~p). The two most common invalid transformations have special names: the converse (q -> p) and the inverse (~p -> ~q). All the other combinations (such as ~p -> q) are also invalid, but don't have names.
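A truth table makes this concrete. The sketch below enumerates all four assignments to p and q and confirms that only the contrapositive is equivalent to the original implication:

```python
from itertools import product

# Material implication: p -> q is false only when p is true and q is false.
def implies(a, b):
    return (not a) or b

# All four truth assignments to (p, q).
rows = list(product([False, True], repeat=2))

original       = [implies(p, q) for p, q in rows]
contrapositive = [implies(not q, not p) for p, q in rows]
converse       = [implies(q, p) for p, q in rows]
inverse        = [implies(not p, not q) for p, q in rows]

print(original == contrapositive)  # True: always equivalent
print(original == converse)        # False
print(original == inverse)         # False
print(converse == inverse)         # True: they're each other's contrapositive
```

Note the last line: the converse and inverse agree with each other (each is the other's contrapositive), which is exactly why both fail in the same way.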
Question 3: This is a fairly easy question to solve; I think most people will get this one right by intuition. It's interesting, though, because it's self-referential: all the possible answers refer to themselves. Each statement refers to a set of statements, and the set includes each statement. To devotees of Russell and Whitehead, this question is not, strictly speaking, meaningful. If you allow this sort of self-reference, directly or indirectly, you can create paradoxes: statements that are both true and false. The problem with Russell and Whitehead's strict prohibition of self-reference is, as Kurt Gödel showed, that you can end up with statements that are neither provable nor refutable, thus casting philosophical doubt on the law of the excluded middle.
Question 4: This is a very tricky question! It looks like the "if" makes "Neko drives" the antecedent and "Neko goes to the movies" the consequent (see question 2). But "only if" (as opposed to "if and only if") is a tricky grammatical way to introduce the consequent. The question should be read as, "If Neko goes to the movies, then she drives." You can then apply the same sort of analysis as described in question 2: only the contrapositive is valid; the inverse and converse are not.
Question 5: This question is meant to distract you with irrelevant information, so I'll just give you a couple of hints. First, the total number of socks and shoes is irrelevant; the answer depends only on the number of different colors. Second, since the question asks for the fewest number of each that guarantees a pair, you need only consider the worst-case scenario; it's irrelevant that even fewer choices might happen to give a pair, or that the correct answer might yield more than one pair.
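The worst-case reasoning is just the pigeonhole principle, and it can be checked by brute force with made-up numbers (the quiz's actual counts are withheld here to avoid spoilers):

```python
from itertools import product

# Pigeonhole check with a hypothetical count of 4 sock colors: every
# possible sequence of 5 draws repeats a color, but some sequence of 4
# draws does not. So (colors + 1) draws is the fewest that guarantees
# a pair in the worst case.

def every_draw_has_pair(colors, drawn):
    """True iff every possible sequence of draws contains a repeated color."""
    return all(len(set(draw)) < len(draw)
               for draw in product(range(colors), repeat=drawn))

colors = 4
print(every_draw_has_pair(colors, colors))      # False: could get one of each
print(every_draw_has_pair(colors, colors + 1))  # True: some color must repeat
```

The worst case is drawing exactly one sock of each color; one more draw then forces a match.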
Two of the next three questions are tricky and get into quantified logic and sets. I'll address them in another post.
Friday, July 13, 2007
Probability and the anthropic principle
In the comments to his review of Bede Rundle's "Why there is Something rather than Nothing", [Update: Law has pulled the examples into their own post] Stephen Law attempts to undermine the anthropic principle with a number of counterexamples:
Here's a Swinburne type illustration. Suppose I am asked to guess each one of 52 cards, one by one. If I ever get one wrong, my brains will be blown out.
I start guessing, and amazingly, I get all 52 cards correct. Now you may say, "What's so improbable about that? After all, the probability of you getting them all right is 1, as you wouldn't be here otherwise would you?"
But of course, there's a sense in which something deeply improbable has happened. So improbable, in fact, that it would be reasonable for me to suspect this result wasn't just a matter of chance.
As a condemned spy, you are put before a firing squad of twenty expert marksmen, who load, aim, and fire at your heart from close range.
Amazingly, they all miss. You feign death, and survive.
Pure luck that they all missed? Possibly. But highly unlikely. Far more likely that the miss was deliberately arranged.
It won't do to now say "But their all missing is not amazing at all. It's wholly unremarkable. After all, had they not all missed, I would not be here to ponder my luck!"
From the context, I suspect that Law is not really trying to argue against the weak anthropic principle itself, but rather against naive conceptions of it which appear to undermine Rundle's probabilistic argument. I think Rundle's argument fails for other reasons (which I might address later), but the arguments Law reproduces are good counterexamples for the weak anthropic principle and deserve rebuttal in their own right.
The Weak Anthropic Principle[1] states:
WAP: We observe X because if not-X, then we would not be here to observe it.
The weak anthropic principle is the strongest response to the Fine Tuning Argument for the existence of God, but, as Law notes, it can be discussed in more prosaic contexts.
It is dangerous and misleading for anyone, even a professional statistician (as I have been told by more than one professional statistician), to trust one's own superficial intuition about probability. We have apparently evolved and learned probabilistic intuitions that, while useful in the special circumstances of ordinary life, are wildly off-base in other contexts.
In order to understand any probabilistic argument, it is, in my opinion, always necessary to rigorously quantify the underlying numbers: to be explicit about precisely what one is asserting a probability of. It is more important to be rigorous than realistic: a rigorous quantification over fictions or counter-factuals such as possible worlds or experiments not performed is preferable to no quantification at all. If the quantification is rigorous, I can at least apprehend the meaning of the probability; any metaphysical argument over the unreality of the quantification can at least be an argument about something well-defined.
In the Frequentist interpretation of probability[2], a probability is the ratio between the count of some subset of a population and the count of the entire population. A frequentist probability is usually estimated by computing the ratio between the count of a subset of a sample and the count of the entire sample. For instance, if I want to estimate the probability that a white male 18-35 has watched episode CABF08 of The Simpsons, I can randomly select some white males 18-35 and ask them if they watched that episode. In this case,
- The population is all the white males 18-35
- The probability is the number of white males 18-35 who watched that episode divided by the total number of all white males 18-35
- The sample is all the white males 18-35 who I actually asked
- The estimate of the probability is the number of white males 18-35 who actually told me they watched that episode divided by the number of white males 18-35 in the sample
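The estimation procedure above can be sketched with a small simulation; the 23% watch rate, population size, and sample size below are all made-up numbers for illustration, not data about the actual episode:

```python
import random

# Made-up population of 100,000 people, 23% of whom watched the episode.
# Sampling at random and taking the subset/sample ratio estimates the
# population probability (the subset/population ratio).

random.seed(1)
population = [random.random() < 0.23 for _ in range(100_000)]
true_probability = sum(population) / len(population)

sample = random.sample(population, 2_000)
estimate = sum(sample) / len(sample)

print(round(true_probability, 3))  # close to 0.23 by construction
print(round(estimate, 3))          # the sample ratio approximates it
```

A random sample of 2,000 from this population will typically land within about a percentage point of the true ratio.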
It's critically important to note that the population is defined by how you choose the sample. In this example, I might call people at random, ask them their race, sex and age, and accept them only if they say they are white, male, and 18-35; otherwise I will not include them in the sample. Since race, sex and age are criteria for choosing the sample, they define the population.
However, it is not just explicit criteria that define the sample, and thus the population; implicit and unconscious criteria also define the sample. For instance, if I were to ask some of my friends who are white males 18-35, then my population would be white males 18-35 who are friends of Larry. If I call people at random from the phone book, then my sample, and thus my population, are white males 18-35 who live in my city. Whenever the definition of sample—implicit or explicit—does not match the definition of the population, a statistical argument becomes vulnerable to a counterargument of selection bias. (As an exercise, examine the selection criteria of Testing Major Evolutionary Hypotheses about Religion with a Random Sample for sampling biases; I've detected two. If Wilson were actually computing statistics, would these biases affect his results?)
I say "vulnerable to" instead of "guilty of" because, of course, every sample has an inherent selection bias: one could reasonably say that the "population" is every white male 18-35 included in the sample. One must make the positive argument that the criteria defining the sample are uncorrelated with the statistic being measured. (I'll discuss in a further essay how to make such an argument.)
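A small simulation can illustrate how an implicit criterion biases the estimate; again, all the rates and sizes below are invented for illustration:

```python
import random

# Made-up numbers: the full population watches at rate 0.23 outside my
# city, but the implicit criterion "lives in my city" selects a
# subpopulation that watches at rate 0.40. Sampling only from that
# subpopulation estimates the subpopulation's rate, not the population's.

random.seed(2)
city = [random.random() < 0.40 for _ in range(20_000)]
elsewhere = [random.random() < 0.23 for _ in range(80_000)]
population = city + elsewhere  # overall rate: 0.2*0.40 + 0.8*0.23 = 0.264

unbiased = random.sample(population, 2_000)
biased = random.sample(city, 2_000)  # phone-book-style implicit criterion

print(round(sum(unbiased) / 2_000, 2))  # near the population rate (~0.26)
print(round(sum(biased) / 2_000, 2))    # near the city rate (~0.40)
```

The biased sample is a perfectly good estimate, but of the wrong population: white males 18-35 who live in my city.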
The weak anthropic principle, whether applied to the fine tuning argument or to the more prosaic examples above, is best understood not as a positive argument in itself but rather as a rebuttal to a statistical argument on the grounds of selection bias. Thus, to understand whether or not the weak anthropic principle undermines some argument, it is necessary to quantify the probabilities and determine whether survival or existence is biasing the sample. We then have a basis for evaluating the validity of the weak anthropic principle with regard to Law's examples.
In the first example (the card-guessing game quoted above), there are two quantifiable populations, about which we can discuss two distinct probabilities.
The first probability is the probability of surviving the exercise. In this case, there are several ways of quantifying the population: all the possible worlds in which the guesser guesses accurately or inaccurately, the population of arrangements of cards and guesses, the number of counter-factual statements about guesses, etc. They all give the same numbers; you pick whatever suits your metaphysical preferences. It's important to note that survival is not a criterion for defining the population: The population includes possible worlds/arrangements of cards/counter-factual statements in which the guesser dies. So we can meaningfully say that the probability of guessing 52 cards correctly is very low, without regard to the fact that the guesser dies if he guesses wrong, since we're including cases where the guesser guesses wrong (and dies) in the population.
However, we can ask a different question, with a different population: given this game exists, what is the probability that I would speak to (and, more importantly, receive an oral response from!) a guesser who has survived? Having survived the game is part of my selection criteria; my population is those people who have played the game and survived. I may be legitimately astonished that he survived the game, but I'm not statistically entitled to be astonished that I'm talking to a survivor.
There is, of course, the observation that we shouldn't be talking to a survivor at all. But even this observation depends on a rigorous quantification: only a finite number of human beings have ever lived, the probability that any one of them would have survived the game is very low, and we know that number independently of the details of the game. We can use a possible-worlds or other abstract or fictional quantification to evaluate the probability of anyone surviving.
The weak anthropic principle does not itself argue for a chance rather than causal explanation for some low-probability event. It argues only that we're not entitled to compute some probabilities about some populations, because our sample is biased, and does not represent the population we're trying to draw conclusions about.
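A scaled-down, hypothetical version of the game (10 coin flips instead of 52 cards) makes the two populations easy to simulate:

```python
import random

# Hypothetical scaled-down game: guess 10 fair coin flips; one wrong guess
# kills you. Over the population of all players, survival is rare (1/1024).
# But if we only ever interview survivors, 100% of interviewees survived:
# conditioning on survival is what biases the second sample.

random.seed(3)
trials = 200_000
survivors = sum(all(random.random() < 0.5 for _ in range(10))
                for _ in range(trials))

print(survivors / trials)  # close to 1/1024: the unconditioned probability

interviewed = [True] * survivors  # the sample selected by "still alive"
print(sum(interviewed) / max(len(interviewed), 1))  # 1.0 among survivors
```

The first population (all players) supports genuine astonishment at survival; the second (players I can interview) does not, because survival was a criterion for inclusion.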
[1] The strong anthropic principle states, in contrast, that X is true because we are here to observe it, e.g. that human intelligence caused the physical universe to come into existence or to have its particular properties. This is not as science-fictional a principle as it might first appear: the strong anthropic principle has been offered as a serious explanation of the measurement problem and the role of the observer in quantum mechanics.
[2] It's not my intention in this essay to deny the Bayesian interpretation of probability. Even the staunchest Bayesian admits that the frequentist interpretation has validity and meaning; she denies that only frequentist probabilities have meaning.