In Appendix H of *Intermediate Physics for Medicine and Biology*, Russ Hobbie and I briefly mention the severe lung disease cystic fibrosis. Analyzing this disease provides an opportunity to examine the prevalence of a genetic disorder. I’ll do this by creating a new homework problem.

Problem 5 ½. About 1 in every 2500 people is born with cystic fibrosis, an autosomal recessive disorder. What is the probability of the gene responsible for cystic fibrosis in the population? What fraction of the population are carriers of the disease?

To answer these questions, first we must know that an “autosomal recessive disorder” is one in which you only get the disease if you have two copies of a recessive gene. To a first approximation, there are often two variants (or alleles) of a gene governing a particular protein: dominant (*A*) and recessive (*a*). In order to have cystic fibrosis, you must have two copies of the recessive allele (*aa*). If you have only one copy (*Aa*), you are healthy but are a carrier for the disease: your children could potentially get the disease if your mate is also a carrier. If you have no copies of the recessive allele (*AA*) then you’re healthy and your children will also be healthy.

Let’s assume the probability of the dominant allele is *p*, and the probability of the recessive allele is *q*. Since we assume there are only two possibilities, we know that *p* + *q* = 1. Our goal is to find *q*, the probability of the gene responsible for cystic fibrosis in the population.

When two people mate, they each pass on to their offspring one of their two copies of the gene. The probability that the child receives the dominant allele from both parents (*AA*), and so is healthy and not a carrier, is *p*^2. The probability that the child receives the recessive allele from both parents (*aa*), and so has the disease, is *q*^2. There are two ways for the child to be a carrier: *A* from dad and *a* from mom, or *a* from dad and *A* from mom. So, the probability of a child being a carrier (*Aa*) is 2*pq*. There are only three possibilities or genotypes: *AA*, *Aa*, and *aa*. The sum of their probabilities must equal one: *p*^2 + 2*pq* + *q*^2 = 1. But this expression is equivalent to (*p* + *q*)^2 = 1, and we already knew that *p* + *q* = 1, so the result isn’t surprising.
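The genotype bookkeeping can be checked with a few lines of Python. This is only a sketch under the same assumptions as the text (two alleles, each parent passing on one allele at random); the function name `genotype_probabilities` and the use of exact fractions are my own choices for illustration.

```python
from fractions import Fraction
from itertools import product


def genotype_probabilities(q):
    """Return the probabilities of genotypes AA, Aa, and aa.

    q is the probability of the recessive allele a; p = 1 - q is the
    probability of the dominant allele A. Hypothetical helper that assumes
    only two alleles and random mating.
    """
    p = 1 - q
    allele_prob = {"A": p, "a": q}
    probs = {"AA": 0, "Aa": 0, "aa": 0}
    # Enumerate the four (allele from dad, allele from mom) combinations.
    for dad, mom in product("Aa", repeat=2):
        # Sorting puts "A" before "a", so both carrier cases map to "Aa".
        genotype = "".join(sorted((dad, mom)))
        probs[genotype] += allele_prob[dad] * allele_prob[mom]
    return probs


probs = genotype_probabilities(Fraction(1, 4))
print(probs)                # the AA, Aa, and aa probabilities
print(sum(probs.values()))  # the three probabilities sum to one
```

The two carrier cases (*A* from dad and *a* from mom, or the reverse) are added into the same entry, which is where the factor of 2 in 2*pq* comes from.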

The only people who suffer from cystic fibrosis have the genotype *aa*, so *q*^2 is equal to the fraction of people with the disease. The problem states that this fraction is 1/2500 (0.04%). So, *q* is the square root of 1/2500, or 1/50 (2%; wasn’t that nice of me to make the fraction be the reciprocal of a perfect square?). One out of every fifty copies of the gene governing cystic fibrosis is defective (that is, it is the recessive version that can potentially lead to the disease). If *q* is 1/50, then *p* is 49/50 (98%). The fraction of carriers is 2*pq*, or 3.92%. The only reason this result is not exactly 4% is that we don’t count someone with the disease (*aa*) as a carrier, even though they could pass the disease to their children (a carrier by definition has the genotype *Aa*). If we round off our result to the nearest percent, then 1 out of every 25 people (4% of the population) is a carrier.
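The arithmetic can be reproduced in a short Python sketch. The function name `allele_and_carrier_frequencies` is hypothetical, and the calculation assumes, as the text does, that the disease fraction equals *q*^2.

```python
import math


def allele_and_carrier_frequencies(disease_fraction):
    """Return (q, p, carrier fraction) from the fraction born with the disease.

    Assumes the disease fraction equals q**2 (two alleles, random mating).
    """
    q = math.sqrt(disease_fraction)  # probability of the recessive allele a
    p = 1.0 - q                      # probability of the dominant allele A
    carriers = 2.0 * p * q           # fraction with genotype Aa
    return q, p, carriers


q, p, carriers = allele_and_carrier_frequencies(1 / 2500)
print(f"q = {q:.2%}, p = {p:.2%}, carriers = {carriers:.2%}")
```

Calling the function with other disease fractions shows how quickly carriers outnumber patients when a recessive disorder is rare.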

This calculation is based on several assumptions: no natural selection, no inbreeding, and no selection of embryos based on genetic testing. Cystic fibrosis is such a severe disease that often victims don’t survive long enough to have children (modern medicine is making this less true). The untreated disease is so lethal that one wonders why natural selection didn’t eliminate it from our gene pool long ago. One possible reason is that carriers of cystic fibrosis might be better able to resist other diseases (such as cholera, typhoid fever, or tuberculosis) than are normal people.

*Originally published at **http://hobbieroth.blogspot.com**.*