Sunday, December 04, 2005

Bayesian Reasoning




Currently, I am trying to understand a form of reasoning called "Bayesian logic." This is important to me because I'm trying to understand the work of a scientist named Eric Schadt. Eric has developed a very important new way of correlating mRNA expression data with the genes responsible for controlling patterns of mRNA expression.

Eric's work allows one to look for patterns of gene expression and compute the site in the genome that might control specific patterns. If a specific pattern appears in a population of people, the obvious question is whether a site in the genome can explain the variations in the pattern. If the variations in pattern correlate with some disease pattern associated with a specific subset of people, the implication is very strong that a mutation is present at the indicated site in the genome.

Eric's contribution is to use Bayesian logic to connect gene expression patterns to specific sites in the genome. The concept is very simple. (How many times have you heard that before?) Nowadays we have strains of inbred mice. We also have a map of the locations of all of the genes in all these mice. So if we look at several strains of mice and find that these mice all have the same pattern of gene expression for some specific group of genes, and that these mice all have the same set of genes in a specific location, it is likely that genes in that location are controlling the expression pattern! This probability can be strengthened if we know something about the rules regulating the control of expression of mRNAs by specific genes. Those rules, of course, depend on the metabolic pathways leading from a gene to its products, to the interactions of those products, to the influences of the gene products on each other, and finally back to the effects of the gene products on the genes responsible for expressing the mRNAs seen in the expression patterns.

Of course each of these interactions between gene products is probabilistic. The mathematical models developed by Eric and his colleagues allow them to create a network of such probabilities.

I am still trying to understand the most basic concept, which is the concept of a Bayesian relationship itself. So here is my first attempt, based on an explanation I found on the web. The idea is that I have green and red marbles. While blindfolded, I drop three of them into a bucket. I can't see inside the bucket. So I reach in three times and each time pull out one marble. Each time I pull out a red marble. The question is: can I quantify the probability that all three marbles are red?

Bayes' Theorem, as shown here, allows me to do this. Call A the event that all three marbles are red, and B the event that I pulled out a red marble three times. P(A|B) is the probability that all three marbles are red given that I found a red marble three times. P(B) is simply the probability that all three draws come out red, whatever is actually in the bucket. P(A and B) is the probability that all the marbles are red and that all the draws are red. I must admit I find this last formula less than intuitive. I think it means that the cases where all the marbles are red and all the draws are red make up only 1/2 of all the cases where all the draws are red.

P(A|B) = (the probability that all of the marbles are red and our draws all come out red) / (the probability that our draws all come out red).

Put another way: the probability that something is true based on the available evidence = the probability that it is true and that we see this evidence, divided by the probability of seeing this evidence at all.
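
To check the 1/2 for myself, here is a small Python sketch of the marble calculation. It rests on two assumptions that are mine and are not stated in the explanation above: each marble dropped into the bucket is red or green with equal chance, and each marble I pull out goes back into the bucket before the next draw. The function name `posterior_all_red` and its parameters are made up for this illustration.

```python
from fractions import Fraction
from math import comb


def posterior_all_red(n_marbles=3, n_draws=3, p_red=Fraction(1, 2)):
    """Return P(all marbles are red | every draw came out red).

    Assumptions (mine, not from the post): each marble is red with
    probability p_red independently of the others, and every drawn
    marble is put back into the bucket before the next draw.
    """
    joint = {}  # k red marbles -> P(k red marbles AND all draws red)
    for k in range(n_marbles + 1):
        # Prior: chance the bucket holds exactly k red marbles.
        prior = comb(n_marbles, k) * p_red**k * (1 - p_red) ** (n_marbles - k)
        # Likelihood: chance of drawing red n_draws times from that bucket.
        likelihood = Fraction(k, n_marbles) ** n_draws
        joint[k] = prior * likelihood

    # P(B): total chance of seeing all-red draws, over every possible bucket.
    p_evidence = sum(joint.values())
    # P(A|B) = P(A and B) / P(B)
    return joint[n_marbles] / p_evidence


print(posterior_all_red())  # prints 1/2
```

Under these assumptions the chance of an all-red bucket before any draws is 1/8, the chance of three red draws is 1/4, and dividing one by the other gives the 1/2. With a different starting assumption about how the marbles were chosen, the answer would change.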

1 comment:

SM Schwartz said...

Take a look here for a great example of the use of Bayesian statistics:

http://www.abelard.org/briefings/bayes.htm