Molecular analysis of Hugh Wilson's DNA

    Molecular Biology is well known, at least to those in my professional position, for excessive hype and massive positive projections that tend to generate public interest and governmental funding but rarely come true.  Thus, the huge public expenditure to sequence quite a few genomes has produced little in the way of real, productive results.  So, the first rule of 'genetic genealogy', the application of molecular biology to track human lineages, is to assume that wild claims are probably exaggerations that will change through time, and also that all the excitement needs to be taken with a large grain of salt.

    On the other hand, it's also safe to assume that analysis of human DNA will eventually be the primary method used to sort out the human story and track infraspecific lineages.  So, I am in the process of having my DNA subjected to various types of analysis.  Progress with this enterprise, which logically extends to the family group, at least to some extent, will be tracked on this page.  This page can also be used to display info generated by DNA analysis of other members of the immediate clan.

    There can be no doubt that Wilson cultural roots carry a heavy dose of religion, and much of that is extreme Protestant.  The Alliance, Ohio males (John, Dan, Elvin) were much involved with the local Masonic Lodge; John Wilson's Mother, Charlotte Borton Wilson, came from a long line of noted Quakers, and his Grandmother (Elizabeth Dungan Wilson) was descended from Baptist royalty (The Rev. Thomas Dungan).  Thus, the 'P' element of the descriptive term 'WASP' is applicable to the Wilson line, at least with regard to cultural tradition.  The 'White Anglo Saxon' part is a bit more complex because the description is more biological than cultural.  Are we White?  Are we part of an 'Anglo-Saxon' lineage?  Analysis of the DNA that - in reality - connects us to the past provides information regarding the path that our ancestors have followed, and the larger branches of the human phylogenetic tree to which we belong.

The Y Chromosome

    Recent analysis of Wilson DNA suggests that we are indeed 'white' (Caucasian/European) as reflected by those DNA markers that have been examined on the Wilson 'Y' chromosome, which reflect that small portion of our genome that is fully concordant with the surname.  Suggestions that we are not completely white, such as Elvin Wilson's "1/2 Indian" comment for an ancestor of Charlotte Borton, cannot be explored through the Y chromosome, which is purely paternal: a unique genetic gift from Father to Son (overview).  It has several attributes that make it a prime target for those interested in charting family histories through DNA analysis: it is fairly stable or resistant to change through time and it carries various types of markers - especially short tandem repeats ('STRs' - overview) - that are easy to define and compare (more info).  It appears that the tag 'Anglo-Saxon' also fits, at least with regard to the Y Chromosome.

    A sample of Wilson DNA was sent to Family Tree DNA (FTDNA) in the Spring of 2006 for an initial analysis of 12 Y Chromosome markers and this was later expanded to 37 markers.  Clients using this company are assigned a 'results' page that provides various ways to look at their data, which is a 'value' for each of the markers examined.  The set of values for all 37 markers is known as the 'haplotype' and this combined string of marker values provides an identity for the Alliance, Ohio 'Wilson' Y Chromosome.  As of June, 2013 there is a single match within the FTDNA database for all 37 markers derived from Hugh Wilson's Y-DNA, and there are four individuals that match values at 36 of the 37 markers.  The matching surnames are Crawford, Bell, Sexton, and Montgomery.  The surname 'Bell' occurs twice among the 9 individuals that have 35 matching values to those of the 37 'Wilson' Y-DNA markers, as does the surname 'Armstrong'.  Related haplotypes, i.e., those that differ by only a few markers, are grouped into 'haplogroups', or genetically related lineages with similar evolutionary histories with regard to both space and time.  Thus, ancient biological connections might be evident via DNA that are not evident via cultural connections, such as surnames.  Surnames came into general use among human populations in the 1300s.  Before that time, folks that would later be called 'Crawford', 'Bell', 'Armstrong', and 'Wilson' were quite likely moving along the same path in terms of gene flow and geography.
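    The idea of a '36 of 37' match can be made concrete with a small sketch.  This is only an illustration of counting matching marker values between two haplotypes; the marker values below are made up for the example (they are not the Wilson results), the function name is hypothetical, and FTDNA's actual matching procedure may well differ in its details.

```python
def count_matches(haplotype_a, haplotype_b):
    """Return (matching markers, markers compared) for two haplotypes.

    Each haplotype is a dict mapping a marker name to its value
    (the repeat count).  Only markers present in both are compared.
    """
    shared = haplotype_a.keys() & haplotype_b.keys()
    matching = sum(1 for marker in shared if haplotype_a[marker] == haplotype_b[marker])
    return matching, len(shared)


# Hypothetical values for six markers, for illustration only.
reference = {"DYS393": 13, "DYS390": 24, "DYS19": 14,
             "DYS391": 11, "DYS385a": 11, "DYS385b": 14}
candidate = {"DYS393": 13, "DYS390": 24, "DYS19": 14,
             "DYS391": 10, "DYS385a": 11, "DYS385b": 14}

matching, compared = count_matches(reference, candidate)
print(f"{matching} of {compared} markers match")
```

    With a full 37-marker haplotype the same count would yield the '37 of 37', '36 of 37', and '35 of 37' groupings described above.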

    It is also possible to link up with 'projects' that are managed by other FTDNA clients and thereby add your data to a larger, collective enterprise.  The Wilson DNA results were added to the Wilson Surname Project and reference to 'results, classic' (go to Y DNA results, use your browser's 'find in page' and enter the user number: 55926) demonstrates that Ezekiel Wilson's DNA (assuming minimal change through 6 generations to Hugh Wilson) is not closely related to other Wilson Y-DNA samples associated with this project.  There is no strong matching with others that carry the surname 'Wilson'.  Among the Wilson samples involved with this project, ours is the only one assigned to Y-DNA haplogroup R1b1a2a1a1a3.  This fairly long haplogroup tag represents our position on the 'Tree' (see Wiki overview here) and this helps to locate our DNA identity in terms of both space and time, i.e., a track to the path that our DNA has followed for the past 50k years or so.  I would guess that our isolation within the 'Wilson Project' is a temporary situation that results from the relatively small 'Wilson' sample size.  Eventually, one can assume that more Wilsons with a genetic connection to our haplogroup clan will come into the sample.  However, at this time (Nov, 2013), our closest connections are to surnames other than Wilson and it is possible to find matches to this set of Y-chromosome mutations elsewhere among other Family Tree DNA 'Projects'.

An Odd Bit of Wilson DNA

    The Y chromosome handed down to Wilson males from Ezekiel Wilson carries a 'mutation' that is currently unknown among other Wilson samples represented in the FTDNA data set, at least those that are part of the 'Wilson Project' mentioned above.  This unusual feature, initially known as the 'Null DYS 439', is relatively rare and potentially interesting, at least with regard to tracking the history of Wilson males.  The genetic basis of this 'null' feature is indicated by the comment from FTDNA for Y Chromosome marker DYS439:  "** 55926 has  an imputed value for DYS439 as the actual repeat count could not be measured by the FTDNA primers".  Markers examined by these commercial services are DNA Y Chromosome Segments (DYS), each identified by a number (here, 439) that marks a specific location on the chromosome.  As indicated above, the segments are characterized by the presence of Short Tandem Repeats (STRs) of nucleotides (A, T, G, and C) and the variation observed among samples reflects different numbers of repeats at a given DYS.  The DYS is defined by the presence of 'flanking regions' (on either side of the series of STRs) that are characterized by a specific sequence of nucleotides.  FTDNA uses machines that find these flanking regions, via recognition of the unique flanking strings of nucleotides, amplify the DNA between the flanking regions, and determine the number of repeats, which is the 'value' of a given marker.  This works most of the time but - in rare cases - the machines are unable to recognize the flanking region because one nucleotide of the standard flanking sequence has changed.  Such a change is called a single nucleotide polymorphism (SNP).  These mechanics are overviewed at the Null DYS439 Project.
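    The logic of a 'null' reading can be sketched in a few lines of code.  This is a toy model under stated assumptions: the flanking sequences, the repeat motif, and the function name are all invented for illustration, and exact string matching is only a rough stand-in for the chemistry of primer binding used in real PCR-based testing.

```python
def count_repeats(sequence, left_flank, right_flank, motif):
    """Count tandem copies of `motif` between two flanking sequences.

    Returns None (a 'null' reading) when either flank cannot be found,
    which is how a single changed nucleotide in a flank shows up here.
    """
    start = sequence.find(left_flank)
    if start == -1:
        return None  # left flank not recognized
    start += len(left_flank)
    end = sequence.find(right_flank, start)
    if end == -1:
        return None  # right flank not recognized
    region = sequence[start:end]
    count = 0
    # Step through the region one motif at a time, from its start.
    while region.startswith(motif, count * len(motif)):
        count += 1
    return count


# Hypothetical flanks and motif, chosen only for illustration.
LEFT, RIGHT, MOTIF = "CCTGAATC", "TTCAGGCA", "GATA"

normal = "AAGT" + LEFT + MOTIF * 12 + RIGHT + "CCAT"
# Same stretch of DNA, but with one nucleotide changed in the left flank.
with_snp = "AAGT" + "CCTGTATC" + MOTIF * 12 + RIGHT + "CCAT"

print(count_repeats(normal, LEFT, RIGHT, MOTIF))    # 12
print(count_repeats(with_snp, LEFT, RIGHT, MOTIF))  # None
```

    In the second case the repeats are still present; the count is missing only because the search string no longer matches the changed flank, which parallels the 'could not be measured' comment quoted above.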

   NOTE:  As of early 2013, FTDNA changed its PCR methods and, as a result, the mutation causing the DYS439 null is no longer detected by their screen.  However, the SNP that caused the PCR reading error is present and detectable as an SNP, which is known as 'L1' or 'S26'.  Thus, the FTDNA group previously known as the 'Null DYS439 Project' is now the 'R1b-L1/S26 Y-DNA Haplogroup Project' and I remain a member.  Again, if I am a member - it is reasonable to assume that all males of the immediate Wilson clan are also members because we all carry the same Y-chromosome.

    Data from the Wilson DNA was submitted to the Null DYS439 project in 2006 (again, 'find' sample tag 55926 on the results pages).  The data listings at this site demonstrate a handy feature of the SNP L1, i.e., its presence allows firm connection to a Y-DNA haplogroup, known, at one time, as R1b1c9a.  The string of 37 'alleles' found on that portion of the Wilson Y-chromosome examined so far represents what is called a haplotype (short for a 'haploid genotype').  Related haplotypes form haplogroups and haplogroups represent - in theory - genetic lineages.  Thus, the current Y DNA Haplogroup Tree from the International Society of Genetic Genealogy depicts current haplogroups, starting with 'A' and ending with 'R'.  Ezekiel Wilson DNA is firmly linked with the 'R' group.  Examination of haplogroup R shows a split into R1a and R1b.  There is no doubt that the Ezekiel Wilson DNA is associated with the general R1b lineage and, thanks to the 'L1' mutation and a further SNP-based 'deep clade' analysis provided by Family Tree DNA, the placement is further refined to subgroup R1b1a2a1a1a3a1a.  If these alignments are correct, those carrying the Ezekiel Wilson Y chromosome migrated northward through Europe with the glacial retreat (ca. 12k years ago), ending up in England or southern Scotland, probably via the Anglo-Saxon invasion during the 5th and 6th centuries.  The lineage was apparently coherent prior to the general use of surnames, which began in the 1300s.  Thus, those (of the current sample) showing highest affinity to the Ezekiel Wilson Y chromosome haplotype have different surnames: Moffate, Brown, Graham, Bell (cluster 2 in the tree below [green]):

The R1b-L1/S26 Y-DNA Haplogroup Project site no longer produces trees (as above from 2006), but the 'Wilson' sample should group as part of a lineage that includes Bells and often Moffatts.

  The current data set for the L1 SNP project (2013) links the Wilson sample with surnames: Graham, Armstrong, Nelson, and - again - Bell.  Thus, we have a fairly consistent connection (linkage via the full set of 37 markers and the 'L1' mutation) with folks that carry the surnames 'Bell' and 'Armstrong' at this point in time.  Again, this genetic linkage extends back in human history to a time prior to the use of surnames.  In fact, there are no other Wilsons in the L1/S26 group data set, but there are many samples representing haplogroup R1b1a2a1a1a3a1a, a haplogroup missing from samples of the 'Wilson Project' data set.  This places us in what appears to be an interesting position with regard to future developments in Y-DNA analysis, in that our haplogroup is well defined and it can only become better defined as data accumulate.

    The haplogroup R1b1a2a1a1a3a1a is closely linked with ancient inhabitants of England, the Anglo-Saxons, and I have seen no reason - from various analytical options available for Y-DNA data - to question that linkage.  However, DNA genealogy is an emerging enterprise that appears to be permeated by the levels of hype and hubris that are associated with molecular phylogeny in general.  One must assume that constructs and scenarios generated at this time will appear archaic and, perhaps, silly in the not too distant future.  So, a 'grain of salt' is needed but - the fact is - those generating trees, maps, and theories of relationships are playing with real data and there is no doubt that the process will be refined and improved as time passes.

    Perhaps the most interesting aspect of the Ezekiel Wilson y-chromosome is its stability, i.e., its content at the molecular level - which carries some biological weight in the general area of 'maleness' - has been relatively (to other chromosomes) constant through time.  It is tempting to think of it as a physical, ancestral bit of material that is passed (with typical Wilson style and enthusiasm) down the generational path, i.e., extant y-chromosomes of Wilson males are probably identical to that carried by Ezekiel.  This biochemical reality carries a hint of immortality in that we (Wilson males) are, at least with regard to this bit of biological material, extensions of Ezekiel Wilson.

    There is also the Darwinian reality of its continued existence.  It has survived early human migrations out of Africa and the subsequent wide range of harsh selective events (war, pestilence, etc.) faced by the human family.  While all extant human males share this distinction of survival, many lineages passed into history and oblivion.  The tentative nature of this dynamic is exemplified by events of the past several Wilson generations.  Ezekiel Wilson was apparently orphaned by the Small Pox epidemic of 1793 in Philadelphia.  Both parents died and Ezekiel was permanently separated from his only sibling, a sister.  Thus, he alone was left to pass on his father's y-chromosome.  Ezekiel Wilson and Elizabeth Dungan had 12 children from 1815 to 1836, but only 3 males: George, John, and Amos.  It appears that George and John stayed in the Bucks County, PA area, they both married, and they lived into their late 50s.  It's likely that they produced male children, but we have no information.  Amos Wilson and Charlotte Borton had 6 children but only 2 survived to reproductive maturity and only one, John, carried the y-chromosome.  John had four children, but only one male, Dan.  Dan had five children, three of whom were males, but only one - Elvin - produced grandchildren.  Thus, survival of the Wilson y-chromosome through three generations (John, Dan, Elvin) was based on the reproductive success of a single individual at each generational node.  A single premature death would have ended the line, i.e., brought about extinction.  Thanks to Elvin, the Wilson y-chromosome is now carried by Gary, Hugh, Derek, Christopher Donaldson, Gary Hale, Quentin, Christopher Derek, Truman, and Tate.  So, in terms of simple survival into the future, the Wilson 'Y' appears to be on firm ground.

Mitochondrial DNA

    Family Tree DNA also did an analysis of Hugh Wilson's mitochondrial DNA (mtDNA).  Mitochondria are present in our cells as 'organelles' or intracellular elements that do several jobs relating - mostly - to energy flow.  The 'endosymbiotic' theory of cellular evolution places them as, at one time in the distant past, free-living bacteria that were brought into modern cells.  As is the case with bacteria, they have only one, circular chromosome.  The mitochondria of sperm are generally not passed on to the embryo; only those of the egg are.  Thus, this bit of circular DNA, which is really quite abundant in the body because there are many cells and each has many mitochondria, is inherited maternally; in my case from Fern Donaldson Wilson.  Analysis of markers at the two 'variable regions' of Hugh Wilson's mtDNA (and that of Fern's siblings and their progeny) places it in the mtDNA haplogroup 'T2', an interesting European lineage.  Sample mtDNA data can be obtained by going to the FTDNA Wilson Project, selecting mtDNA results and, as above, using your browser's 'find in page' option to search for '55926'.  As of November, 2013, two full matches to the marker set at both of my mtDNA variable regions have been documented by Family Tree DNA.  The known maternal surname path from Hugh Wilson goes:  Donaldson, Fairall, Henry, Kittle.

Beyond the y-chromosome and mtDNA

    A 'deep clade' analysis was accomplished recently (May 2012) on this DNA sample by Family Tree DNA.  This apparently focuses on a suite of single nucleotide polymorphisms (SNPs) to produce a more refined determination of haplogroup.  This analysis confirms that 'R1b1a2a1a1a3a1a' represents the Wilson haplogroup.  Thus, it appears that two separate lines of evidence, the L1/S26 SNP and the 'deep clade' array of SNPs, support placement of the sample in this Y-DNA haplogroup.  A Google search for this haplogroup (try it here) will pull from the internet current notions regarding its evolutionary significance.

    Also, a test was conducted (May 2012) on Hugh Wilson's DNA by Ancestry.Com.  This is evidently an 'autosomal' test (covering the 22 pairs of non-sex chromosomes) that appears to be fairly detailed with regard to data points (SNPs) across the entire genome (see a current review here).

ethnicity map

    The initial results (screen grab above) indicate that Hugh Wilson's total DNA is 77% Central European, 16% British Isles, and 7% 'uncertain'.  This, like the Y-DNA and mtDNA tests from Family Tree DNA, will probably become more informative as their procedures evolve and the sample size increases. 

    An added element of the analysis involves the ability, via their on-line system, to link up with family tree info that has been uploaded by other participants.  Thus, when a match occurs, the alert from Ancestry.Com includes tree info from the matching individual in the form of a listing of common surnames found in the two trees.  A recent 'close' match (2nd or 3rd cousins) included the surname 'Henry' and the location "Marshall, Illinois"; both link to my Grandmother, Mable Fairall Donaldson.  This connection between genetic and traditional family history data is made possible by the combination of genetic and family tree data at Ancestry.Com.  Later interaction with the person responsible for the "Fanberg" tree on Ancestry.Com revealed that the matching individual, John Henry, is linked to Fern Wilson's maternal Grandfather, John Henry, the Civil War hero, via Daniel Burton Henry, Clara Henry Fairall's brother.  This represents the first local instance of firm genealogical linkage established by my DNA sequence.

    Ancestry.Com provided, in the Fall of 2013, an 'upgraded' view of my DNA 'ethnicity' (which, in my view, is a confusion of terms, biological vs. cultural), which provides an initial break not present in the initial results, i.e., 99% European and 1% Asian.  The 'trace' Asian elements are East and Central Asia, as mapped.  There is no indication that this 'Asian' component is Native American.  The European element is 75% 'Great Britain' and 5% 'Europe West'.  The remaining 19% is:  Italy/Greece 8%, Iberian Peninsula 4%, Scandinavia 3%, Europe East 2%, Ireland 2%, and Finland/Northwest Russia < 1%.