The Genealogical Proof Standard: an introduction

About two and a half years ago, in my National African American Genealogy Examiner column, I wrote a post called “What is the Genealogical Proof Standard?”

The GPS recognizes, as you will discover in your own research, that genealogy research often leaves unanswered, and unfortunately unanswerable, questions. Not every fact can be proven with a simple statement on a document. However, through the use of the GPS, and indeed through practice, you can be sure that your conclusions are as close as possible to the truth.[1]

The Genealogical Proof Standard is a set of guidelines by which researchers can judge the thoroughness of their research and analysis, and the reliability of their conclusions.

My understanding of the Genealogical Proof Standard has grown further in the past few years. Over the next week or so, I would like to discuss the Standard as well as how to apply it to your research. Each post (and occasionally more than one) will discuss one or more of the conditions of the Genealogical Proof Standard:

1. Conduct a “reasonably exhaustive search” for all information that is or may be pertinent to the question for which you are seeking an answer.

2. Completely and accurately cite every source of information discovered in this search.

3. Analyze and correlate the collected information to assess its quality as evidence.

4. Resolve any conflicts caused by contradictory items of evidence or information contrary to your conclusion.

5. Arrive at a “soundly reasoned, coherently written conclusion.”

I hope you enjoy the coming posts.


[1] Michael Hait, “What is the Genealogical Proof Standard?,” in National African American Genealogy Examiner, posted 15 May 2009 ( : accessed 20 Nov 2011).

Do I have a citation obsession?

I discuss source citations in this blog a lot. I know. I just can’t help it.

But academics in other fields are not above obsessing over citations either.

Kurt Schick, a writing teacher at James Madison University, posted “Citation Obsession? Get Over It!” in the Commentary section of the Chronicle of Higher Education. Mr. Schick agrees with many of my readers, I am sure:

What a colossal waste. Citation style remains the most arbitrary, formulaic, and prescriptive element of academic writing taught in American high schools and colleges. Now a sacred academic shibboleth, citation persists despite the incredibly high cost-benefit ratio of trying to teach students something they (and we should also) recognize as relatively useless to them as developing writers.[1]

Mr. Schick decries the time and energy that universities spend teaching how to cite in specific formats: MLA, APA, Chicago/Turabian, etc. In his opinion citation formats are nearly indistinguishable and relatively simplistic:

Why, then, could we not simply ask students to include a list of references with the essential information? Why couldn’t we wait to infect them with citation fever until they are ready to publish (and then hand them the appropriate style guide, which is typically no more difficult to follow than instructions for programming your DVR)?

In Mr. Schick’s opinion, citation format is unimportant until publication. I have heard this same argument used in the genealogy field on numerous occasions. (Of course, Mr. Schick refers mostly to published sources, whereas we genealogists should be using mostly original record sources.)

Instead of teaching citations, universities and colleges should instead “reinvest time wasted on formatting to teach more-important skills like selecting credible sources, recognizing bias or faulty arguments, paraphrasing and summarizing effectively, and attributing sourced information persuasively and responsibly.” These are all very important skills, I agree. However, in genealogy, why separate the two processes?

To me an accurate source citation is more than just how we know “where we got the information.” It’s more than how a reader can reproduce your research or assess the quality of your sources.

The internal process of a researcher creating an accurate source citation develops certain necessary evaluation skills. In order to fully cite a record source–whether a published item, a government record, or an unpublished manuscript–you must understand certain things about the record. Who created it? When and where was it created? Where is it currently stored? How does this record fit into the larger collection of records of which it is a part?
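For readers who keep research notes in a structured form, the questions above can be sketched as a simple checklist. This is only an illustration, not part of any genealogical standard: the class name, the field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class RecordProfile:
    """Hypothetical checklist of what to know about a record before citing it."""
    creator: Optional[str] = None             # Who created it?
    created_when: Optional[str] = None        # When was it created?
    created_where: Optional[str] = None       # Where was it created?
    current_repository: Optional[str] = None  # Where is it currently stored?
    parent_collection: Optional[str] = None   # What larger collection is it part of?


def unanswered(profile: RecordProfile) -> List[str]:
    """Return the names of the questions that still have no answer."""
    return [f.name for f in fields(profile) if not getattr(profile, f.name)]


# Hypothetical, partly evaluated record: only two questions answered so far.
profile = RecordProfile(
    creator="(hypothetical) county register of wills",
    current_repository="(hypothetical) state archives",
)
print(unanswered(profile))
```

A record whose list comes back empty is one you are probably ready to cite in full; anything left in the list points to something about the source that still needs to be understood.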

These questions are among the five things you have to know about every record. In other words, taking the time to create a full and accurate citation itself inspires a deeper understanding of that source. I believe that this is the reason that the Genealogical Proof Standard lists the condition about citing your sources separately from the other four conditions, after searching for relevant sources and before analyzing and correlating the information. Creating the citation allows the researcher to evaluate the source itself, rather than solely focusing on the information that source contains.

This explains my seeming obsession with citations.


[1] Kurt Schick, “Citation Obsession? Get Over It!,” in Commentary, Chronicle of Higher Education, posted 30 October 2011 ( : accessed 13 Nov 2011).

For another response, see also Carol Fisher Saller, “‘Citation Obsession’? Dream On,” in Lingua Franca blog, posted 3 September 2011 ( : accessed 13 Nov 2011).

Does a “reasonably exhaustive search” include online family trees?

There is a lot of junk on the Internet.

More experienced genealogists, both professionals and hobbyists, know this. We repeat it in our blogs, in our research plans, in our conversations with other genealogists. We stay away from the Public Family Trees on Ancestry and FamilySearch’s International Genealogical Index. After all, these all just have junk put online by those “shaky leaf” clickers, right?

One should by no means trust an online family tree.

But neither should one trust a death certificate or a 19th-century county history or a federal census record or an obituary.

Just because it’s online does not make it more or less garbage than any other source. You still should evaluate the information the same way you would in any other record. Identify the informant. Determine their involvement in the reported event or the source of their information (if secondary).

Two cases are perfect examples of this philosophy:

Almost fifteen years ago, when “Internet genealogy” barely had an existence, I came across a family tree that contained my then-earliest known ancestor in my male line: Myron Grant Hait, my great-great-grandfather. I contacted the owner, who turned out to be my grandfather’s first cousin. My great-grandfather, who lived in New York, was one of six brothers, all of whom lived in different and distant states: California, Montana, North Carolina, Louisiana, etc. In those pre-Facebook days, distant relatives did not always maintain close contact. When my grandfather moved to Washington, D. C., to work for the federal government, he had even less contact with the extended family. He knew his uncles, but did not know any of his cousins.

This cousin, Linda, just so happened to have quite a number of family records in her possession, including letters to and from my great-grandparents from back in the 1970s when she started researching, and a family history written by my great-great-grandmother in the 1930s. She also put me in touch with another cousin who had in her possession a copy of a family bible, several old family photos, and a collection of Civil War letters!

Not all of her research was completely accurate, but much of it was, and of course the original records in the possession of these long-lost (to me) branches of the family were indispensable. Had I ignored this online family tree, I would never have obtained many of these records.

The second case involves a family that I was working on for a client. While searching for records on Ancestry, I discovered a public family tree. Though not a single offline source was cited, the information was extremely specific. I jotted down a few notes from the tree for confirmation, but then went on along my merry research way.

The next day at the Maryland State Archives I happened to run into a friend of mine: also a professional genealogist, member of my APG chapter, and a fellow Certified Genealogist. I knew that she did a lot of research in this particular county, so I asked her if she was familiar with the families I was researching. To make a long story short, the owner of the Ancestry public family tree was her client, who had uploaded the results of her research to the site without any source citations. In other words, though it looked like “junk” because it did not have any sources cited for any of the information, the tree actually reflected the work of a Certified professional genealogist. As I continued to research the family, I was able to confirm all of the information that was in the public tree.

As the first example shows, online family trees are often a great way to identify other descendants of the families you are researching. Some of these distant cousins may have family records passed down in their lines that you do not have access to: items like family bibles, old family photos, etc.

The first condition of the Genealogical Proof Standard is that we conduct a reasonably exhaustive search for all records relevant to our research problem. If you have ignored the search for family records in other lines, have you met this requirement?

The limits of online genealogy research

Rarely do I mention my other columns on Examiner (though the RSS feeds show up over on the right). But I wanted to point readers to a series of posts that I wrapped up today.

Since February 2010 I have been working on an online case study concerning the family history of a former slave named Jefferson Clark. I call this an online case study because I specifically chose to use only records available online. My subject was chosen at random from African American families living in Texas in 1870.

I would like to invite you all to read this case study. The techniques that I use throughout the series of posts demonstrate the importance of skillful analysis and correlation of information in your research. When access to records is limited, it is vital to utilize indirect evidence to form conclusions.

Because the subject was chosen at random, the case study also demonstrates how a professional genealogist operates. In beginning this research, I had no family records that had been passed down, no older relatives to interview, and no previous research to consult. I truly had to start from scratch. Many of my client projects begin the same way. In a project I worked on last week, the only information I was provided was a newspaper marriage announcement for the client’s grandparents.

The first post in this series–“The Jefferson Clark family of Leon County, Texas: an online case study (part one)”–appeared on 21 February 2010. Because this was not a client project, and was being conducted strictly for use in my “National African American Genealogy” column, I had to fit research in when I had time.

Today’s article, the final word on this online case study, is entitled “The strengths and limits of online genealogy research.” I may continue this case study, in a more limited capacity, using records not available online.

You can find links to all of the articles in this series under the “Case Studies” section of my webpage. Unfortunately I was unable to edit some of the earlier articles to include links to the later ones, due to a change in Examiner’s article publishing platform. However, from the “Case Studies” page of my website, you can easily open each article in a new browser tab.

Let me know what you think, either here or on the Examiner pages.

… but we do need Evidence Explained.

[Please read “Why we don’t always need source citation templates …” before reading this post.]

Elizabeth Shown Mills’s 1997 book Evidence! Citation & Analysis for the Family Historian (Baltimore: Genealogical Publishing Company, 1997) contains about 84 total pages of text, not including the Acknowledgment, Introduction, Bibliography, Appendixes, and Index. Of these 84 pages, 25 are contained in the chapter “Fundamentals of citation,” 19 are contained in the chapter “Fundamentals of analysis,” and 40 are contained in the section of “Citation Formats,” which contains templates for over 100 genealogical sources.

The first edition of Evidence Explained (Baltimore: Genealogical Publishing Company, 2007) contains 804 pages, not including the introduction and indexes. Of these 804 pages, 26 pages are contained in the first chapter, “Fundamentals of Evidence Analysis,” and 52 pages are contained in the second chapter, “Fundamentals of Citation.” The remaining chapters are individually identified by broad resource types.

It is important to note that each chapter does indeed contain “QuickCheck Models” (citation templates) but there is no section of this book that is explicitly called “Citation Formats,” or anything of the like. It is also important to note that this book is named Evidence Explained, not Citations Explained.

When this book was first released in 2007, I lugged the 800-plus-page book on the train every day for a month and read it cover-to-cover, much as I did years before with Evidence! It never occurred to me at the time that other genealogists might consider this book a mere collection of citation templates. I have since become aware that this is exactly how many view the book.

To prove that it is not a mere guide to citations or a collection of templates, let’s look at a sample chapter. I chose Chapter 8, “Local & State Records: Courts & Governance,” at random.

  • The chapter runs from page 371 through page 418.
  • Pages 373-382: QuickCheck Models (10 pages).
  • Pages 383-385: Basic Issues. This section contains such important information about records analysis as the following passage: “Many of the ‘original’ court records you consult at the city and county level are record copies (see 1.27) rather than true originals. Historically, attorneys presented the court with documents critical to the case at hand—contracts, depositions, petitions, etc. Courts then maintained these loose documents in bundles, envelopes, jackets, or packets. Certain items of particular significance from a legal standpoint would be copied into record books, although the original packets would usually be preserved, at least for a certain number of years.” [8.5, page 385] Note that this is just one short example, and that it does not at all concern citation. These three pages contain only five short example citations, demonstrating other issues being discussed.
  • Pages 385-390: Citation Issues. This section discusses specific notes about citing these records. There are several examples in this section, again used to demonstrate the issues being discussed. These notes are insightful, not only for the specific examples being discussed, but for other record groups as well. Take this gem, for example: “Many counties and some cities are no longer functioning jurisdictions or else they have changed their names. Even so, the basic citation pattern remains the same. You would likely add a brief comment to your First Reference Note to explain the situation.” [8.12, page 388]
  • Pages 390-409: City & County Records. This section contains detailed descriptions and summaries of several record groups, as well as citation examples. It includes background information and basic formats for bound volumes, loose case files, and off-site archival records. The record groups discussed include bastardy cases (presentments), bonds [“Historically, bonds have been posted in a variety of matters. In addition to the better-known administration, guardian, and marriage bonds, bonds also guaranteed appearance in court, peaceful conduct toward others, payment of legal obligations, fulfillment of duty as a public officer, financial support for slaves being freed, and much more.” (8.22, page 396)], coroner’s inquests, county commissioners’ records, election certificates and returns, indigent records, insanity hearings, etc. This section provides not only an education in how to cite various city and county records, with examples that demonstrate variations, but also an education in many lesser-known and lesser-used record groups. It also contains other important tips, like, “The ‘source of the source’ cited by databases such as this one could refer to the original numbering scheme of the court that created the record or it could refer to a new number assigned by the archive that created the database.” This is an important distinction to make when analyzing records, not only when citing them.
  • Pages 409-418: Colony & State Records. This section contains information about state archival inventories/finding aids, as well as general agencies and record groups: colony-wide courts, state or provincial appellate cases, governors’ papers, legislative petitions, and state pension files. Among the information that does not consist of citation templates, one will find the following passage: “When a case is appealed from a local court to a district, state, provincial, or federal court, the file generated at the local level is transmitted to the higher court, where it is assigned a new docket number or case number. The case name may also be reversed. For example, a case might originate locally as John Brown v. Sam Smith. If the case was decided in favor of Brown, then Smith appealed, the name of the new case before the appellate court would be Sam Smith v. John Brown. Your citation to the appellate case should carry the label and the case number used in the appellate court, not the label and number of the original case at the local level.” (8.39, pages 413-414)

While the 45 pages in this chapter do contain quite a few citation examples, they include only 10 pages of citation templates. Taken individually, there are 223 citation examples in this chapter. However, this quantity counts each individual citation separately: the same record may be provided in source list entry, first reference note, and short reference note examples, and so be counted as three separate citations. The actual number of individual record examples cited within the chapter is fewer than 100.

The citation examples demonstrate variations in how any individual source might have to be cited. But neither the examples nor the templates will cover every single source that one will encounter. There will be major variations even within one record group, depending on whether you are accessing the record at the courthouse or an archives, a microfilmed or digital image copy, an original file or a record copy; depending on how the archives has organized their record groups; depending on whether the record refers to an earlier case or a separate file; and many other factors. While Evidence Explained does indeed address all of these factors, they are not always noted within the section devoted to the record group that you are looking for specifically.

The bulk of Evidence Explained, in fact, does not consist solely of a discussion of citation issues, as the above brief exploration shows. It certainly contains far more than simply citation templates. Those who have not read anything more than the first two chapters, and the citation models and examples, are missing out on the true value of this book.

And of course, as my previous commenter noted, and there are many out there who seem to agree with him, “I can assure you, I will never read it.” If the book were an 800-pound collection of source citation templates, I would agree with you. There would be nothing to read.

In my opinion, Evidence Explained is a much greater work, concerned as much with principles of evidence analysis as with source citation. These two aspects of research cannot be separated, though this is a lesson that many have yet to learn.

Why we don’t always need source citation templates …

A commenter on my previous post, “Why citation software should be avoided,” noted,

Citations are easy, or should be. Simply provide a key at the beginning of how your citations are organized, then include who,what, when, where, and where found. That should be sufficient for anyone to find it and verify it, if possible. Why do we need an 800+ page book for that?

To a certain extent, I completely agree with this statement.

For me, probably based on my experience using and citing many different record groups for 40-60 hours a week for a few years now, citation is easy or “should be.” When I look at Evidence Explained (Baltimore: Genealogical Publishing Company, 2007), it makes sense to me. I look at specific examples only because I cannot remember a small detail concerning that particular record group. But about 97% of the source citations that I write are written without the use of a template.

Ultimately, source citations provide exactly the information my commenter noted: “who,what, when, where, and where found.” And of course, the necessary key to the organization of the citation.

This is precisely the point that I wanted to make with my earlier posts, “Source Citations: Getting it ‘Right,’” parts one, two, three, and four. In these posts, I explain the logic behind why several of the more common citations are organized the way they are.

Take a look at the accepted citation for a book, in reference note format:

Michael Hait, Online State Resources for Genealogy, Version 1.0, e-book (Harrington, Del.: Hait Family History Research Publications, 2011), page 37.

This citation provides all of the necessary details to locate this reference.

Now, look at an example from Evidence Explained selected at random:

Midmar Parish (Aberdeenshire, Scotland), Old Parish Registers, OPR 222/1, p. 65, James Edward baptism (1727); FHL microfilm 993,344, item 1. [Evidence Explained, 1st ed., p. 366]

This citation provides the creator (Midmar Parish), the record (Old Parish Registers), the specific volume and page, followed by “where found” (the FHL microfilm).

Here is another example for a completely different record group, again selected at random:

Passenger list, El Sagrado Corazón de Jesús, 1779; Papeles Procedentes de Cuba, edición 141, legajo 689, folio 414; Archivo General de Indias, Seville, Spain; consulted as microfilm PPC roll 68, Clayton Library, Houston. [Evidence Explained, 1st ed., p. 640]

This one is a little different, but ultimately the same. The first element cited is not the creator, but a specific record contained within a larger record set, like an article in a journal or a chapter in a book. But otherwise the citation contains the same elements in the same order.
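For those who think in terms of data, the logic of the Midmar Parish example above can be sketched in a few lines of Python. This is only my own illustration of the "same elements in the same order" idea, not a format taken from Evidence Explained: the class name, the field names, and the way I have split the citation into five parts are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    """Hypothetical container for the elements of a reference note."""
    creator: str      # who created the record
    record: str       # what the record is
    locator: str      # the specific volume and page
    item: str         # the specific entry, with its date
    where_found: str  # the copy actually consulted

    def reference_note(self) -> str:
        # The elements describing the record are joined with commas;
        # the "where found" layer follows a semicolon.
        described = ", ".join([self.creator, self.record, self.locator, self.item])
        return f"{described}; {self.where_found}."


midmar = Citation(
    creator="Midmar Parish (Aberdeenshire, Scotland)",
    record="Old Parish Registers",
    locator="OPR 222/1, p. 65",
    item="James Edward baptism (1727)",
    where_found="FHL microfilm 993,344, item 1",
)

print(midmar.reference_note())
```

Once you know what each element is and why it sits where it does, assembling the note is mechanical. The hard part, and the part no template can do for you, is identifying those elements for the record in front of you.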

So do we really need an 800-page book of source citation templates?

Not if we “get it.” At least, not on every single citation. You may need to use the templates from time to time to figure out some idiosyncrasy of a specific record.

But … (please read on in the next post) … but we need Evidence Explained.

