As you know by now, random mutations occasionally occur within any population of organisms. Each mutation may leave the organism in which it occurs more or less fit, that is, more or less suited to its particular environment. Mutations may be beneficial, neutral, or disadvantageous depending on the selective pressure put on the organism by the environment in which it is living. Because of a mutation, the organism may be better or worse able to compete for resources; in other words, natural selection is acting on that organism. Recall that during the coal-powered Industrial Revolution in England, the melanistic Peppered Moths were better camouflaged against the sooty, black, city tree trunks, and thus lived to reproduce. Perhaps, however, they were initially less fit with respect to some of their other genes. Over time, more mutations would occur and be acted upon by natural selection, so that the gene pool of the city-dwelling Peppered Moths would gradually change to “improve” the reproductive potential of the black moths and make them more fit for that particular environment.
That’s the idea behind adaptation. While disadvantageous mutations are “weeded out,” other, beneficial and neutral mutations are retained, resulting in increased variation within the gene pool of the population. The beneficial and neutral mutations are passed on to the next generation through sexual reproduction, resulting in an overall change in the percentages of the various alleles within that population.
Because different populations are under different selective pressures, different alleles, both “good” and “bad,” are present in different populations in different percentages. In humans, consider the uneven distribution within the various ethnic groups of skin-color alleles, eye-color alleles, or alleles for genetic disorders like sickle-cell or Tay-Sachs.
Given a suitable environment and enough elapsed time following the original mutation, the gene pool of a specific population will usually reach some equilibrium point where the proportions of the various alleles become stable. Thus, initially, we need to make several background assumptions when examining the percentages of alleles in a population:
If all of these conditions are fulfilled, then it is possible to calculate/predict the frequency of each allele in a population. The Hardy-Weinberg Law (1908) says that, for some gene with alleles A and a, if the frequency/proportion of A is “p” and the frequency/proportion of a is “q,” then $p + q = 1$ and $p^2 + 2pq + q^2 = 1$, where $p^2$, $2pq$, and $q^2$ are the expected frequencies of genotypes AA, Aa, and aa.
However, if all of these conditions are not fulfilled and there are other factors like mutation, migration, or natural selection influencing the gene pool, then those factors must also be taken into account (as explained below).
For PTC (phenylthiocarbamide) paper, about 70% of the
population can taste the paper (the dominant trait) and about 30% cannot.
This means that genotypes TT and Tt (together since we can’t tell them
apart) make up about 0.70 of the population, so we can say that
$p^2 + 2pq = 0.70$, and genotype tt makes up about 0.30 of the
population, so we can say that $q^2 = 0.30$.
If $q^2 = 0.30$,
then by taking the square root of both sides, $q = 0.55$.
Since $p + q$ must equal 1, $p = 1 - q = 1 - 0.55 = 0.45$.
This means that $p^2 = (0.45)^2 = 0.20$
and $2pq = 2 \times 0.55 \times 0.45 = 0.50$.
Note that $p^2 + 2pq + q^2 = 0.20 + 0.50 + 0.30 = 1$.
Then, the probability of a dominant phenotype
(taster) equals the sum of all possible ways to have a dominant genotype
(TT and Tt), so the probability of a taster is equal to
the probability of TT + the probability of Tt,
or 0.20 + 0.50 = 0.70
(to double check, note that this is
equal to the 70% statistic noted above). Suppose you want to determine the
probability of two tasters having a child who is a non-taster. For the child
to be a non-taster, his/her genotype must be tt, and the only way this could
be is if both parents are genotype Tt. Out of all people with dominant
phenotypes (TT and Tt), 0.20/0.70 = ²⁄₇ of them have the
genotype TT and 0.50/0.70 = ⁵⁄₇ of them have genotype Tt. Thus, for this population,
the probability of both parents being Tt, is
⁵⁄₇ × ⁵⁄₇ = ²⁵⁄₄₉ ≈ ½.
If both parents are Tt, then from a regular Punnett square, remember that
the probability of them having a tt child is ¼. This means that the
probability of two tasters having a non-taster child is
~½ × ¼ = ~⅛; more precisely, ²⁵⁄₄₉ × ¼ ≈ 0.128 or 12.8%.
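The PTC calculation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes; the function names are hypothetical. It carries full precision rather than rounding p and q to two decimals, so it gives about 0.125 instead of the 0.128 obtained from the rounded values.

```python
import math

def genotype_freqs_from_recessive(q_squared):
    """Return (p, q, f_TT, f_Tt, f_tt) given the recessive phenotype frequency."""
    q = math.sqrt(q_squared)   # frequency of the recessive allele t
    p = 1.0 - q                # frequency of the dominant allele T
    return p, q, p * p, 2 * p * q, q * q

def prob_nontaster_child_of_two_tasters(q_squared):
    """Probability that two taster parents have a non-taster (tt) child."""
    _, _, f_TT, f_Tt, _ = genotype_freqs_from_recessive(q_squared)
    # Fraction of tasters who are heterozygous (Tt).
    het_given_taster = f_Tt / (f_TT + f_Tt)
    # Both parents must be Tt, and a Tt x Tt cross gives tt 1/4 of the time.
    return het_given_taster ** 2 * 0.25

print(round(prob_nontaster_child_of_two_tasters(0.30), 3))
```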
The Hardy-Weinberg Law can also be used to determine if breeding is, indeed, random.
For the human MN blood group, a person can have either type M
(genotype MM) or type MN (genotype MN) or type N (genotype NN). From a
population in Australia there were 320 type M people, 800 type MN, and 880
type N, so:
Genotype | People | M alleles | N alleles |
---|---|---|---|
MM | 320 | 640 | 0 |
MN | 800 | 800 | 800 |
NN | 880 | 0 | 1760 |
Total | 2000 | 1440 | 2560 |

Note that the 2000 people carry 4000 alleles in all. Of these, 1440 are M alleles, so
1440/4000 = 0.36 = 36% of the alleles.
Similarly, there are 2560 N alleles, so 2560/4000 = 0.64 = 64% of the
alleles.
Thus, p = 0.36 and q = 0.64.
If p and q have these values and there is random breeding, then we
should expect to see
$p^2 = 0.13$, $2pq = 0.46$, and $q^2 = 0.41$.
This means that in a population of 2000 individuals we should then expect to see
0.13 × 2000 = 260 type M people,
0.46 × 2000 = 920 type MN people, and
0.41 × 2000 = 820 type N people.
A χ² comparison with the actual numbers indicates that there
is not random breeding (or the two sets of numbers would be more nearly the
same):
O | E | (O – E)²/E |
---|---|---|
320 | 260 | 13.846 |
800 | 920 | 15.652 |
880 | 820 | 4.390 |
Σ | | 33.888 = χ²calc |
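The degrees of freedom (df) = 3 – 1 = 2, so χ²tab at the 0.05 level = 5.991. (Strictly, one more degree of freedom is lost because p was estimated from the same data, which gives df = 1 and χ²tab = 3.841; the conclusion is the same either way.) Thus, χ²calc is greater than χ²tab, so the null hypothesis of random breeding can be rejected: there is not random breeding.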
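As a rough sketch of the χ² comparison just described, the following Python computes the allele frequencies, the expected genotype counts, and the χ² statistic from the observed counts. The function name is hypothetical. Because it does not round the expected frequencies to two decimals, it gives a χ² of about 34.8 rather than 33.888, which leads to the same conclusion.

```python
def hardy_weinberg_chi_square(n_MM, n_MN, n_NN):
    """Return (p, q, expected_counts, chi_square) for observed genotype counts."""
    total = n_MM + n_MN + n_NN
    # Each MM person carries two M alleles, each MN person carries one.
    p = (2 * n_MM + n_MN) / (2 * total)
    q = 1.0 - p
    expected = [p * p * total, 2 * p * q * total, q * q * total]
    observed = [n_MM, n_MN, n_NN]
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, q, expected, chi_square

p, q, expected, chi_square = hardy_weinberg_chi_square(320, 800, 880)
print(p, q)
print([round(e, 1) for e in expected])
print(round(chi_square, 2))
```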
As an example of non-random breeding, consider plants that are self-pollinating. In this case, AA can only mate with AA, Aa can only mate with Aa (producing AA, Aa, and aa offspring), and aa can only mate with aa. Thus, over time, the number of AA and aa individuals in the population increases relative to the number of Aa individuals.
The Hardy-Weinberg Law can also be used in cases of sex-linked genes.
In humans, red-green colorblindness is a sex-linked recessive
trait. A colorblind male has the genotype $X^bY$ and a normal male
has the genotype $X^BY$; thus, for males, the genotype and phenotype
frequencies are equal. Females could be $X^BX^B$ (normal),
$X^BX^b$ (carrier), or $X^bX^b$
(colorblind); thus, the normal Hardy-Weinberg Law applies. If 10% (0.10) of
the males are colorblind and 90% (0.90) are normal, then p = 0.90 and
q = 0.10 because each male has only one allele for this gene. From this the
phenotype frequencies for the females can be calculated:
$p^2 = (0.90)^2 = 0.81$ (or 81%) normal,
$2pq = 2 \times 0.90 \times 0.10 = 0.18$ (or 18%) carriers, and
$q^2 = (0.10)^2 = 0.01$ (or 1%) colorblind.
A slightly more complicated case is that of multiple alleles.
In the human ABO blood group, there are three alleles for this
gene: $I^A$, $I^B$, and $i$. A person can have any two of the
three alleles, so
$I^AI^A$ or $I^Ai$ make type A blood,
$I^BI^B$ or $I^Bi$ make type B blood,
$I^AI^B$ makes type AB blood, and
$ii$ makes type O blood.
Let p represent the frequency of $I^A$,
q represent the frequency of $I^B$, and
r represent the frequency of $i$.
Remember that p + q + r = 1.
The Hardy-Weinberg Law says that at equilibrium,
$(p + q + r)^2 = p^2 + q^2 + r^2 + 2pq + 2pr + 2qr = 1$
where $p^2$ is the probability of $I^AI^A$ and $2pr$
is the probability of $I^Ai$
(thus probability of type A = $p^2 + 2pr$),
$q^2$ is the probability of $I^BI^B$ and $2qr$ is the
probability of $I^Bi$
(thus probability of type B = $q^2 + 2qr$),
$r^2$ is the probability of $ii$
(thus probability of type O = $r^2$),
and $2pq$ is the probability of $I^AI^B$
(thus probability of type AB = $2pq$).
If out of 2000 people, 37.8% are type A, 14.0% type B, 4.5% type AB, and
43.7% type O,
then $r^2 = 0.437$ so $r = 0.661$.
Temporarily “ignoring” $I^B$, the frequency of type A blood +
frequency of type O blood =
$[p^2 + 2pr] + r^2 = (p + r)^2$.
We know that there are 37.8% with type A blood ($p^2 + 2pr$) and
43.7% with type O blood ($r^2$),
so $(p + r)^2 = 0.378 + 0.437 = 0.815$.
Then $p + r = \sqrt{0.815} = 0.903$,
so $p = 0.903 - r = 0.903 - 0.661 = 0.242$.
Also, the frequency of type B + frequency of type O =
$q^2 + 2qr + r^2 = (q + r)^2 = 0.140 + 0.437 = 0.577$,
so $q + r = \sqrt{0.577} = 0.760$.
Therefore, $q = 0.760 - r = 0.760 - 0.661 = 0.099$.
To check, p + q + r = 0.242 + 0.099 + 0.661 = 1.002 ≈ 1.0, so that is correct.
Also, $p^2 + q^2 + r^2 + 2pq + 2pr + 2qr = 0.059 + 0.010 + 0.437 + 0.048 + 0.320 + 0.131 = 1.005 \approx 1$,
so that is correct.
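The ABO estimate above follows a fixed recipe (take r from type O, then p and q from the square roots of A + O and B + O), which can be sketched in Python as follows. The function name is hypothetical, and the sketch simply mirrors the square-root method used in the text.

```python
import math

def abo_allele_freqs(freq_A, freq_B, freq_O):
    """Estimate (p, q, r) for alleles I^A, I^B, i from phenotype frequencies."""
    r = math.sqrt(freq_O)               # type O = r^2
    p = math.sqrt(freq_A + freq_O) - r  # type A + type O = (p + r)^2
    q = math.sqrt(freq_B + freq_O) - r  # type B + type O = (q + r)^2
    return p, q, r

p, q, r = abo_allele_freqs(0.378, 0.140, 0.437)
print(round(p, 3), round(q, 3), round(r, 3))
print(round(p + q + r, 3))  # close to 1, but not exactly 1
```

The three estimates need not sum to exactly 1, because each comes from a different subset of the observed phenotype frequencies.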
The Hardy-Weinberg Law also applies for multiple genes, whether linked or non-linked.
For a gene with alleles A and a,
$p_A + q_a = 1$ and
$p_A^2 + 2p_Aq_a + q_a^2 = 1$
and for some other gene with alleles B and b,
$p_B + q_b = 1$ and
$p_B^2 + 2p_Bq_b + q_b^2 = 1$.
Note that, assuming these are not linked genes, $p_A$ and
$q_a$ are totally unrelated to and independent of $p_B$ and
$q_b$.
If, for example, it is desired to find the frequency of AaBb, multiply the
frequencies needed ($2p_Aq_a \times 2p_Bq_b$).
If $p_A = 0.2$ (thus $q_a = 0.8$) and if $p_B = 0.1$
(thus $q_b = 0.9$),
then the probability of AaBb should be $2 \times 0.2 \times 0.8 \times 2 \times 0.1 \times 0.9 = 0.058$.
The effects of mutation can also be calculated. If mutations
of allele A to allele A' (not necessarily A to a — could be a to A) recur at
a given rate, and if this is the only mutation of A that takes place, then
the frequency of A can go from 100% down to 0% while the frequency of A' goes
from 0% to 100%. We can symbolize the mutation rate as u, and for
this example, let it be $1 \times 10^{-6}$. There is a concept that
is similar to the half-life of radioactive isotopes: the number of
generations (t) needed to reduce the frequency of A to 0.5 of its
original value. The equation for this is
$t = -\ln(0.5)/u$, where $\ln(0.5) = -0.693$,
so if $u = 1 \times 10^{-6}$,
$t = -(-0.693)/(1 \times 10^{-6}) = 6.93 \times 10^{-1+6} = 6.93 \times 10^{5} = 693{,}000$
generations. Thus, if $p_A$ starts out at 0.96, after 693,000
generations, $p_A$ will be 0.48 and after another 693,000 generations
(= 1,386,000 total) it will be 0.24.
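The “half-life” calculation can be checked with a short Python sketch. The function names are hypothetical, and the sketch assumes that each generation a fraction u of the remaining A alleles mutates, so that the frequency after t generations is $p_0(1 - u)^t$; the formula $t = -\ln(0.5)/u$ in the text is the usual approximation to this for small u.

```python
import math

def generations_to_fraction(u, fraction=0.5):
    """Generations needed to reduce an allele to `fraction` of its starting frequency."""
    return -math.log(fraction) / u

def frequency_after(p0, u, generations):
    """Frequency of A after `generations`, assuming a fraction u mutates each generation."""
    return p0 * (1.0 - u) ** generations

t_half = generations_to_fraction(1e-6)
print(round(t_half))                                    # about 693,000
print(round(frequency_after(0.96, 1e-6, t_half), 3))
print(round(frequency_after(0.96, 1e-6, 2 * t_half), 3))
```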
If the reverse mutation also occurs, eventually, the population will reach an equilibrium point where the frequencies of A and A' alleles are stable.
For a gene with alleles A and a, p is the frequency of A and
q is the frequency of a. Let u be the rate of mutation from
A → a and v be the rate of mutation from a → A.
For allele a, the net rate of change in frequency is
$\Delta q = \text{gain} - \text{loss} = up - vq$. When equilibrium is established,
gain = loss, or $\Delta q = 0$ and $up = vq$. Thus, at equilibrium:

$$
\begin{aligned}
vq &= up \\
   &= u(1 - q) \\
   &= u - uq \\
vq + uq &= u \\
q(u + v) &= u \\
q &= u/(u + v)
\end{aligned}
$$
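To illustrate the equilibrium result, the sketch below applies $\Delta q = up - vq$ generation by generation and compares the outcome with $u/(u + v)$. The function names and the mutation rates used in the example are hypothetical values chosen only for illustration.

```python
def mutation_equilibrium_q(u, v):
    """Equilibrium frequency of a when A -> a occurs at rate u and a -> A at rate v."""
    return u / (u + v)

def iterate_mutation(q, u, v, generations):
    """Apply delta_q = u*p - v*q once per generation and return the final q."""
    for _ in range(generations):
        p = 1.0 - q
        q += u * p - v * q
    return q

# Hypothetical rates, deliberately large so that equilibrium is reached quickly.
u, v = 1e-3, 5e-4
print(mutation_equilibrium_q(u, v))
print(iterate_mutation(0.1, u, v, 20000))
```

Both starting above and starting below the equilibrium value should lead to the same final frequency, since the equilibrium depends only on u and v.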
The effects of migration can also be calculated. If immigration is sporadic, we say there is gene exchange between populations. If it happens more often, there is gene flow.
If we let $q_{\text{init}}$ be the q value of the
original population under consideration for a certain allele,
$q_m$ be the q value for the migratory population,
$q_{\text{fin}}$ be the frequency after immigration has taken place,
and m be the fraction of the total population which are new migrants,
then
$q_{\text{fin}}$ is composed of the frequency of that allele in the
migrants ($q_m$) times the proportion of migrants (m) plus the
frequency of that allele in the non-migrants ($q_{\text{init}}$) times the
proportion of non-migrants ($1 - m$), or:

$$
\begin{aligned}
q_{\text{fin}} &= mq_m + (1 - m)q_{\text{init}} \\
 &= mq_m - mq_{\text{init}} + q_{\text{init}} \\
 &= m(q_m - q_{\text{init}}) + q_{\text{init}} \\
\Delta q &= q_{\text{fin}} - q_{\text{init}} \\
 &= m(q_m - q_{\text{init}})
\end{aligned}
$$
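The migration formula is simple enough to express directly in Python. This is a minimal sketch with hypothetical function names, and the numbers in the example are made up purely to show the arithmetic.

```python
def q_after_migration(q_init, q_m, m):
    """Allele frequency after a fraction m of the population is replaced by migrants."""
    return m * q_m + (1.0 - m) * q_init

def delta_q_migration(q_init, q_m, m):
    """Change in allele frequency caused by one round of migration."""
    return m * (q_m - q_init)

# Hypothetical example: residents have q = 0.2, migrants have q = 0.6,
# and migrants make up 10% of the combined population.
print(q_after_migration(0.2, 0.6, 0.1))
print(delta_q_migration(0.2, 0.6, 0.1))
```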
Natural selection affects different genotypes (AA, Aa, aa)
differently — selection will be for or against each one of these genotypes.
Let w = adaptive value and s = selection coefficient, such
that w + s = 1. Note that as w increases, s decreases and vice versa. These
values are assigned to each genotype (irrespective of the other genotypes).
w = 1 if the genotype is the most adaptive (able to adapt to new conditions).
If w is less than 1, there is selection against that genotype. Also,
the larger s is, the more selection will take place. For example, if we
have:

Genotype | AA | Aa | aa |
---|---|---|---|
w | 1.00 | 1.00 | 0.99 |

Genotype | AA | Aa | aa |
---|---|---|---|
w | 0.90 | 0.90 | 1.00 |
Three different cases are possible:
If the initial value of p (pi) = 0.8, the initial
value of q (qi) = 0.2, and s = 0.1, over the course of a number
of generations, p will decrease until it becomes 0 and q will increase until
it reaches 1. How long this takes depends on what s is — if s = 1, then AA
and Aa will not produce any offspring. For example, a population could
have:
Genotype | AA | Aa | aa |
---|---|---|---|
w | 0.90 | 0.95 | 1.00 |
s | 0.10 | 0.05 | 0.00 |
In this case, the general formula would be:
Genotype | AA | Aa | aa |
---|---|---|---|
w | 1.00 | 1.00 | $1 - s$ |
Mating | AA (0.176) | Aa (0.823) |
---|---|---|
AA (0.176) | AA × AA: 0.176 × 0.176 = 0.031 | AA × Aa: 0.176 × 0.823 = 0.145 |
Aa (0.823) | Aa × AA: 0.176 × 0.823 = 0.145 | Aa × Aa: 0.823 × 0.823 = 0.677 |
The frequencies/probabilities of the possible progeny from
each of these matings would be:
Mating | AA progeny | Aa progeny | aa progeny |
---|---|---|---|
AA × AA | 1.000 × 0.031 = 0.031 | none | none |
AA × Aa | 0.500 × 0.290 = 0.145 | 0.500 × 0.290 = 0.145 | none |
Aa × Aa | 0.250 × 0.677 = 0.169 | 0.500 × 0.677 = 0.338 | 0.250 × 0.677 = 0.169 |
If s = 1.00 against aa (all of them die) and we start with qi = pi = 0.50, the following chart can be constructed:
gen. | $q_a$ init | $q_a^2$ | $p_A^2$ | $2p_Aq_a$ | ratio of Aa:aa | proportion of Aa left = $2p_Aq_a/(p_A^2 + 2p_Aq_a)$ | prob. Aa × Aa = (Aa left) × (Aa left) | prob. aa offspring = $(q_a\,\text{fin})^2$ = (prob. Aa × Aa) × 0.25 | $q_a$ fin |
---|---|---|---|---|---|---|---|---|---|
1 | 0.50 | 0.50 × 0.50 = 0.25 | 0.50 × 0.50 = 0.25 | 2 × 0.50 × 0.50 = 0.50 | 0.50:0.25 = 2:1 | 0.50/(0.50 + 0.25) = 0.67 | 0.67 × 0.67 = 0.44 | 0.25 × 0.44 = 0.11 | √0.11 = 0.33 |
2 | 0.33 | 0.11 | 0.44 | 0.44 | 4:1 | 0.50 | 0.25 | 0.06 | 0.25 |
3 | 0.25 | 0.06 | 0.56 | 0.38 | 6:1 | 0.40 | 0.16 | 0.04 | 0.20 |
4 | 0.20 | 0.04 | 0.64 | 0.32 | 8:1 | 0.33 | 0.11 | 0.03 | 0.17 |
5 | 0.17 | 0.03 | 0.69 | 0.28 | 10:1 | 0.29 | 0.08 | 0.02 | 0.14 |
n | $1/(n+1)$ | $1/(n+1)^2$ | $n^2/(n+1)^2$ | $2n/(n+1)^2$ | 2n:1 | $2/(n+2)$ | $4/(n+2)^2$ | $1/(n+2)^2$ | $1/(n+2)$ |
100 | $9.9 \times 10^{-3}$ | $9.8 \times 10^{-5}$ | $9.8 \times 10^{-1}$ | $1.96 \times 10^{-2}$ | 200:1 | $1.96 \times 10^{-2}$ | $3.84 \times 10^{-4}$ | $9.61 \times 10^{-5}$ | $9.8 \times 10^{-3}$ |
Note especially the column for the ratio of Aa:aa. This means
that for every aa that is produced (then dies or becomes sterile), there are
2n Aa individuals produced that could have aa offspring (for example, in the
100th generation, 200 Aa are produced for every aa). Thus, the more
generations that pass, the less effective continued selection actually
is.
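The pattern in the chart for s = 1 (q going from 1/(n+1) to 1/(n+2) each generation) is equivalent to the recursion q' = q/(1 + q). The sketch below assumes that recursion and uses exact fractions so the pattern is visible without rounding; the function names are hypothetical.

```python
from fractions import Fraction

def next_q_lethal_recessive(q):
    """One generation of complete selection (s = 1) against aa: q' = q / (1 + q)."""
    return q / (1 + q)

def q_after_generations(q0, generations):
    """Apply the recursion `generations` times, starting from q0."""
    q = q0
    for _ in range(generations):
        q = next_q_lethal_recessive(q)
    return q

q = Fraction(1, 2)
for gen in range(1, 6):
    print(gen, q, round(float(q), 2))
    q = next_q_lethal_recessive(q)
```

Starting from q = 1/2 in generation 1, the value at the start of generation 100 comes out as 1/101, which matches the last row of the chart.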
If, however, s = 0.10 against aa, then:
gen | $p_i$ | $q_i$ | $p^2$ = no. of AA | $2pq$ = no. of Aa | $q^2$ = no. of aa | $p^2 \times 1$ = A gametes from AA | $2pq \times 1$ = A and a gametes from Aa | $q^2(1 - s)$ = a gametes from aa | total gametes = $1 - sq^2$ | no. of A gametes | no. of a gametes | fraction of A gametes ($p_f$) | fraction of a gametes ($q_f$) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0.5000 | 0.5000 | 0.2500 | 0.5000 | 0.2500 | 0.2500 | 0.2500 of A, 0.2500 of a | 0.2500 × 0.9000 = 0.2250 | 0.9750 | 0.5000 | 0.4750 | 0.5128 | 0.4872 |
2 | 0.5128 | 0.4872 | 0.2632 | 0.4997 | 0.2372 | 0.2632 | 0.2498 of each | 0.2134 | 0.9763 | 0.5130 | 0.4633 | 0.5254 | 0.4745 |
3 | 0.5254 | 0.4745 | 0.2761 | 0.4987 | 0.2252 | 0.2761 | 0.2493 of each | 0.2027 | 0.9775 | 0.5255 | 0.4520 | 0.5376 | 0.4624 |
Note that with this smaller value for s, less selection takes
place. The general expressions for the gametes produced are:
AA genotypes contribute $p^2 \times 1$ of the gametes, all of which are A.
Aa genotypes contribute $2pq \times 1$ of the gametes, which are half A and half a.
aa genotypes contribute $q^2 \times (1 - s)$ of the gametes, all of which are
a.
Note that this means there are $p^2 + pq$ of the A
gametes produced
and $pq + q^2 \times (1 - s) = pq + q^2 - sq^2$ of the
a gametes produced.
Then, the total number of gametes produced is
$(p^2 + 2pq) + q^2(1 - s) = p^2 + 2pq + q^2 - sq^2$.
Remember that $p^2 + 2pq + q^2 = 1$, so
$p^2 + 2pq + q^2 - sq^2$ can be simplified to
$1 - sq^2$.
Thus, q for the next generation (n + 1) can be calculated
from the number of a gametes produced divided by the total
number of gametes produced or:
$$
\begin{aligned}
q_{n+1} &= [q_n^2(1 - s) + 0.5 \times (2p_nq_n)]/(1 - sq_n^2) \\
 &= [q_n^2 - sq_n^2 + p_nq_n]/(1 - sq_n^2) \\
 &= [q_n^2 - sq_n^2 + (1 - q_n) \times q_n]/(1 - sq_n^2) \\
 &= [q_n^2 - sq_n^2 + q_n - q_n^2]/(1 - sq_n^2) \\
 &= [q_n - sq_n^2]/(1 - sq_n^2)
\end{aligned}
$$
We can, then, calculate the change in q (Δq) as a
result of selection:
$$
\begin{aligned}
\Delta q = q_{n+1} - q_n &= [q_n - sq_n^2]/(1 - sq_n^2) - q_n \\
 &= [q_n - sq_n^2]/(1 - sq_n^2) - [q_n(1 - sq_n^2)]/(1 - sq_n^2) \\
 &= [q_n - sq_n^2]/(1 - sq_n^2) - [q_n - sq_n^3]/(1 - sq_n^2) \\
 &= [q_n - sq_n^2 - q_n + sq_n^3]/(1 - sq_n^2) \\
 &= [sq_n^3 - sq_n^2]/(1 - sq_n^2) \\
 &= -sq_n^2(1 - q_n)/(1 - sq_n^2)
\end{aligned}
$$

where, if it helps, $1 - q_n = p_n$.
So, for the above example,
gen | q | $\Delta q = q_f - q_i$ |
---|---|---|
1 | 0.500 | –0.0128 |
2 | 0.487 | –0.0125 |
3 | 0.475 | –0.0121 |
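The recursion for q under selection against aa, and the expression for Δq, can be turned into a short Python sketch. The function names are hypothetical; the formulas are the ones derived above, and the first few generations for s = 0.10 can be compared with the chart.

```python
def next_q(q, s):
    """q in the next generation with selection coefficient s against aa."""
    return (q - s * q * q) / (1.0 - s * q * q)

def delta_q(q, s):
    """Change in q over one generation of selection against aa."""
    return -s * q * q * (1.0 - q) / (1.0 - s * q * q)

q = 0.5
for gen in range(1, 4):
    print(gen, round(q, 4), round(delta_q(q, 0.10), 4))
    q = next_q(q, 0.10)
```

Setting s = 1 in `next_q` reduces it to q/(1 + q), the complete-selection case.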
In this case, Aa is superior to both AA and aa, and this
is called overdominance or balanced polymorphism. An example
of this would be the gene for sickle-cell anemia and resistance to malaria
in humans. Thus, for example:

Genotype | AA | Aa | aa |
---|---|---|---|
w | $1 - s_A$ | 1.00 | $1 - s_a$ |
After a generation of selection, there will be
$p^2(1 - s_A)$ from AA,
plus $2pq$ from Aa,
plus $q^2(1 - s_a)$ from aa, so
$p^2(1 - s_A) + 2pq + q^2(1 - s_a) = 1 - s_Ap^2 - s_aq^2$ total gametes produced.
The numbers in the following chart correspond to $s_A = 0.2$ and $s_a = 0.6$, as can be seen from the first row.

gen | $p_i$ | $q_i$ | $p^2$ | $q^2$ | $2pq$ | $p^2(1 - s_A)$ | $q^2(1 - s_a)$ | total | no. of A = $p^2(1 - s_A) + pq$ | no. of a = $q^2(1 - s_a) + pq$ | % of A | % of a |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0.500 | 0.500 | 0.250 | 0.250 | 0.500 | 0.200 | 0.100 | 0.800 | 0.450 | 0.350 | 56.250 | 43.750 |
2 | 0.562 | 0.438 | 0.316 | 0.191 | 0.492 | 0.253 | 0.077 | 0.822 | 0.499 | 0.323 | 60.740 | 39.260 |
3 | 0.607 | 0.393 | 0.369 | 0.154 | 0.477 | 0.295 | 0.062 | 0.834 | 0.534 | 0.300 | 64.004 | 35.996 |
Eventually, the population will reach equilibrium. The number
of generations required will be determined by sA and sa.
At equilibrium, $p_{n+1} = p_n$, so

$$
\begin{aligned}
p &= \frac{p^2(1 - s_A) + \tfrac{1}{2}(2pq)}{p^2(1 - s_A) + 2pq + q^2(1 - s_a)} \\
\text{and } q &= 1 - p, \text{ so} \\
p &= \frac{p^2(1 - s_A) + p(1 - p)}{p^2(1 - s_A) + 2p(1 - p) + (1 - p)^2(1 - s_a)} \\
1 &= \frac{p(1 - s_A) + (1 - p)}{p^2(1 - s_A) + 2p(1 - p) + (1 - p)^2(1 - s_a)} \\
p(1 - s_A) + (1 - p) &= p^2(1 - s_A) + 2p(1 - p) + (1 - p)^2(1 - s_a) \\
0 &= (p^2 - p)(1 - s_A) + (1 - p)(2p - 1) + (1 - p)^2(1 - s_a) \\
0 &= -p(1 - s_A) + (2p - 1) + (1 - p)(1 - s_a) \\
0 &= -p + ps_A + 2p - 1 + 1 - p - s_a + ps_a \\
0 &= ps_A - s_a + ps_a \\
s_a &= p(s_A + s_a) \\
\text{thus } \hat{p} &= s_a/(s_A + s_a) \\
\text{and similarly } \hat{q} &= s_A/(s_A + s_a).
\end{aligned}
$$
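As an illustration of this equilibrium, the sketch below iterates the overdominance recursion and compares the result with $s_a/(s_A + s_a)$. The function names are hypothetical, and the example values $s_A = 0.2$ and $s_a = 0.6$ are the ones implied by the first row of the chart above (0.250 × 0.8 = 0.200 and 0.250 × 0.4 = 0.100).

```python
def next_p(p, s_A, s_a):
    """p in the next generation when AA and aa are both selected against."""
    q = 1.0 - p
    a_gametes_A = p * p * (1.0 - s_A) + p * q   # A gametes from AA and Aa
    total = 1.0 - s_A * p * p - s_a * q * q     # all gametes after selection
    return a_gametes_A / total

def equilibrium_p(s_A, s_a):
    """Predicted equilibrium frequency of A under overdominance."""
    return s_a / (s_A + s_a)

p = 0.5
for _ in range(200):
    p = next_p(p, 0.2, 0.6)
print(round(p, 4), equilibrium_p(0.2, 0.6))
```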
Genotype | AA | Aa | aa |
---|---|---|---|
w | 1.0 | 1.0 | 0.8 |

Genotype | AA | Aa | aa |
---|---|---|---|
w | 0.6 | 1.0 | 0.8 |
Several interactions between mutation and selection are possible.
Consider the following examples:
Case | w for AA | w for Aa | w for aa | Mutation | Interaction |
---|---|---|---|---|---|
1 | $1 - s$ | $1 - s$ | 1 | [figure: direction of mutation] | selection and mutation work in the same direction |
2 | $1 - s$ | $1 - s$ | 1 | [figure: direction of mutation] | selection and mutation work against each other; eventually, equilibrium will be established |
3 | 1 | 1 | $1 - s$ | [figure: direction of mutation] | selection and mutation work against each other here too |
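Recall that, due to mutation, $\Delta q_{\text{mut}} = up - vq$ and because of
selection, $\Delta q_{\text{sel}} = -sq^2(1 - q)/(1 - sq^2)$.
Equilibrium will be reached when the effects of mutation and selection
cancel each other out, that is, when they are working in exactly opposite
directions, or $\Delta q_{\text{mut}} = -\Delta q_{\text{sel}}$, so:

$$
\begin{aligned}
up - vq &= -[-sq^2(1 - q)/(1 - sq^2)] \\
u(1 - q) - vq &= sq^2(1 - q)/(1 - sq^2) \\
u(1 - q) &\simeq sq^2(1 - q) \\
u &\simeq sq^2
\end{aligned}
$$

The approximation in the third line drops the $vq$ term and treats $1 - sq^2$ as 1, which is reasonable when q is small.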
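The balance $u \simeq sq^2$ can be rearranged to $q \simeq \sqrt{u/s}$, and the sketch below compares that approximation with a direct generation-by-generation iteration of $\Delta q_{\text{mut}} + \Delta q_{\text{sel}}$. The function names and the values of u, v, and s are hypothetical, chosen only to illustrate the calculation.

```python
import math

def approx_equilibrium_q(u, s):
    """Approximate equilibrium q from u = s * q^2."""
    return math.sqrt(u / s)

def step(q, u, v, s):
    """One generation of mutation plus selection against aa."""
    dq_mut = u * (1.0 - q) - v * q
    dq_sel = -s * q * q * (1.0 - q) / (1.0 - s * q * q)
    return q + dq_mut + dq_sel

# Hypothetical values: forward mutation only (v = 0), mild selection against aa.
u, v, s = 1e-5, 0.0, 0.1
q = 0.05
for _ in range(50000):
    q = step(q, u, v, s)
print(round(q, 5), approx_equilibrium_q(u, s))
```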
There are also several cases of mutation from recessive to
dominant with selection against the dominant. A semidominant lethal
refers first to the expression of the gene and secondly to the vitality of
the organism. A gene is semilethal if the organism is born alive,
lives for a while, then dies. A gene is subvital if the organism is
not as viable. In general, these can all be expressed as:
Case | w for AA | w for Aa | w for aa |
---|---|---|---|
4 | $1 - s$ | 1 | 1 |
4a | 0 | 1 | 1 |
4b | 0 | $1 - s$ | 1 |
4c | 0 | 0 | 1 |