Today (when I started this post many months ago) I compared the results of my genetic sequencing with my parents' results (the outcome was predictable, to be honest). I share a lot with each of them--and my parents share none between the two of them.
I'll be transparent. Much of what I learned in 10th grade biology was beyond my ability to apply practically here, and despite doing significant work in mechanobiology and TA'ing a biomolecular engineering class in grad school I was floundering trying to come up with an explanation for completely identical, half identical, and nonidentical segments.
The completely identical and nonidentical were simple concepts. If you compare with a sibling, then the genetic material you received from your parents is the same on both "halves" of the chromosome for identical segments. Nonidentical segments are when neither of the chromosome halves match up.
I highly recommend this post from Barry Starr at Stanford (Ask A Geneticist). He actually explains what most articles fail to address on a meaningful level.
My goal is to briefly recap Barry's explanation, and expound on the possibilities.
Let's start with two parents:
Mom Dad
They each have genetic code inherited from their parents. We'll identify genetic material in their offspring as coming from a particular chromosome using these colors. On the left is the chromosome pair of one parent (yellow & orange), and on the right is the chromosome pair of the other parent (blue & red).
These parents have two children:
Kid A Kid B
Each child inherits material from one parent on one half of the chromosome, and material from the other parent on the other half.
If you consider Kid A, the left chromosome of the pair comes completely from dad, so that whole segment is "half identical" with dad. The right chromosome of Kid A's pair is "half identical" with mom because it contains genetic material entirely from mom.
Ok, now let's compare the siblings to each other.
The following segments are "completely identical" between the two kids:
Kid A Kid B
In the boxes you can see that the colors of each pair match--this is a segment that is "completely identical" because it contains the exact same genetic material from each parent.
Results from the site 23andMe display full siblings the following way (7 chromosomes shown):
Consider chromosome #2 above. The purple color refers to the boxed portion of our siblings above--this part of the chromosome is shared identically by both siblings.
There are large segments in purple, but there is significantly more in the pink color. Pink is referring to "half identical" segments.
The following segments are "half identical" between the two kids:
Kid A Kid B
In the top box the kids share DNA from Dad (red), in the second box the kids share DNA from Mom (yellow), in the third box the kids share DNA from Dad (blue), in the fourth box the kids share DNA from Dad (blue), and in the fifth box the kids share DNA from Mom (orange). In each case, the second non-matching chromosome in the pair still contains genetic material from the other parent, but it comes from a different half of the parent's chromosome compared to their sibling.
That leaves "nonidentical" segments. In the 23andMe manner of displaying the comparison it's represented with light grey lines.
The following segments are "nonidentical" between the two kids:
Kid A Kid B
None of the boxed segments have the same DNA between the two kids (though if you look closely you can see that I goofed a smidge, and the yellow/orange transition in the middle box doesn't line up perfectly).
Ok, so hopefully the terms are more meaningful, but what of incidence?
Let's isolate a small segment and look at the possibilities.
The left half of each chromosome can come from either of Dad's pair, and the right half of each can come from either of Mom's pair, so each of the four above can have two possibilities--this means there are 2 × 2 × 2 × 2 = 16 possibilities.
Let's attribute them all among the three categories:
Completely identical (four)
Red Yellow Red Yellow
Blue Yellow Blue Yellow
Red Orange Red Orange
Blue Orange Blue Orange
Half identical (eight)
Red Yellow Red Orange
Red Orange Red Yellow
Blue Yellow Blue Orange
Blue Orange Blue Yellow
Blue Yellow Red Yellow
Blue Orange Red Orange
Red Yellow Blue Yellow
Red Orange Blue Orange
Nonidentical (four)
Red Yellow Blue Orange
Red Orange Blue Yellow
Blue Yellow Red Orange
Blue Orange Red Yellow
These give us rough proportions to expect. This is a VERY simplified set of proportions, but they get us in the ballpark.
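The same tally can be checked by brute force. Here is a minimal Python sketch, assuming each kid independently gets one of Dad's two copies and one of Mom's two copies for the segment; the names `DAD`, `MOM`, and `classify` are just my own labels for illustration.

```python
from collections import Counter
from itertools import product

DAD = ("Red", "Blue")       # Dad's two chromosome copies
MOM = ("Yellow", "Orange")  # Mom's two chromosome copies


def classify(kid_a, kid_b):
    """Label a segment by how many parental copies the two kids have in common."""
    # kid_a and kid_b are (copy_from_dad, copy_from_mom) tuples.
    matches = sum(a == b for a, b in zip(kid_a, kid_b))
    return ("nonidentical", "half identical", "completely identical")[matches]


# Each kid has 2 x 2 = 4 options, so a pair of kids has 4 x 4 = 16.
kid_options = list(product(DAD, MOM))
counts = Counter(classify(a, b) for a, b in product(kid_options, kid_options))

for label, n in counts.items():
    print(label, n)
```

Dividing each count by 16 gives the rough proportions for a single segment.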
Keep in mind that you can basically do this with billions of segments if we think of the above diagrams as small numbers of base pairs. According to Wikipedia:
The centimorgan is also often used to imply distance along a chromosome, but the number of base pairs it corresponds to varies widely. In the Human genome, the centimorgan is about 1 million base pairs.
Over numbers that large, you're going to get reasonably close to an expected value.
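To get a feel for that convergence, here is a small simulation sketch in Python. It makes a big simplifying assumption: every segment is inherited independently with a 50/50 coin flip per parent. Real DNA is passed down in long blocks, so actual sibling comparisons wander further from the expected values than this toy model does. The function name `simulate` and its parameters are my own.

```python
import random


def simulate(n_segments, seed=0):
    """Return the fractions of [nonidentical, half identical, completely identical]."""
    rng = random.Random(seed)
    tally = [0, 0, 0]
    for _ in range(n_segments):
        # For each parent, the two kids got the same copy half the time.
        dad_match = rng.random() < 0.5
        mom_match = rng.random() < 0.5
        tally[dad_match + mom_match] += 1  # 0, 1, or 2 matching copies
    return [t / n_segments for t in tally]


for n in (16, 1_000, 100_000):
    print(n, simulate(n))
```

With only a handful of segments the fractions bounce around; as the count grows they settle near 0.25, 0.5, and 0.25.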
When I compared the ratio of completely identical to half identical segments between my brother and me, sure enough, I got about 0.33. That matches the possibilities above: take the four "completely identical" possibilities and divide by the sum of the "half identical" and "completely identical" possibilities (8 + 4 = 12) and you get 0.33. I divide by the sum because it appears that 23andMe includes "completely identical" in the "half identical" segments portion from how they're counted in the analysis. Both are referred to as "shared."
The following are some of my numbers to show how close it can be to 50%:
Reference the following from the ISOGG wiki:
Neither 23andMe nor Family Tree DNA distinguishes between fully identical regions and half-identical regions when computing the number of shared centiMorgans (cMs). Thus, for example, the shared cM for full-siblings will on average be 75% of the total length of the genome, of which on average 50 percentage points are half-identical and 25 percentage points are fully identical.
**Different companies do the calculations in different ways, using different methods, so results can't be compared directly, unless you go through some work to put the numbers on the same basis.**
If we reconsider our rough proportions from the "16 possibilities" this agrees nicely. The "Completely identical" section is counted twice in the last segment because both "sides" (one from each parent) are the same--see the diagrams from "Completely identical" above.
Notice in my example that (13.1% + 26.1%) x 2 = 78.4%--pretty close to 75%.
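The expected values behind these figures fall straight out of the "16 possibilities." Here is a short Python sketch of the arithmetic using exact fractions; the variable names are my own, and it assumes a half identical region matches on one of the two copies while a completely identical region matches on both.

```python
from fractions import Fraction

fully_identical = Fraction(4, 16)  # completely identical possibilities
half_identical = Fraction(8, 16)   # half identical possibilities

# "Shared" length when fully and half identical regions are lumped together.
shared_length = fully_identical + half_identical

# Ratio of completely identical to everything counted as shared.
ratio = fully_identical / shared_length

# Fraction of DNA in common: half identical regions match on only one
# of the two copies, so they count for half.
dna_in_common = fully_identical + half_identical / 2

print(shared_length, ratio, dna_in_common)
```

That gives 3/4 for the shared length, 1/3 for the ratio, and 1/2 for the DNA siblings have in common on average.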
All that being said, siblings share about 50% of DNA with each other, and about 50% with each parent.
23andMe lists a table of average shared DNA by relationship here.
The International Society of Genetic Genealogy also weighs in (again, more information).