How Machine-Learning and OCR Are Changing Family History – October 26 2020 – FamilySearch

I saw this on FamilySearch today: How Machine-Learning and OCR Are Changing Family History. I have included a brief portion of the article below. I do want to caution people that OCR is not foolproof. We use it where I volunteer to scan marriage licenses into the system. The person doing it has to check each index entry afterward and correct or add missing information. It probably gets around 90% correct, but it misses some information and makes wild guesses on some of what it does include. For the most part, these are typewritten applications, and he only looks at certain fields when doing the OCR. Generally, FamilySearch has two people index each record. If both agree, the entry is usually accepted. If there are differences between the two indexers, a third person reviews the record. That doesn’t mean that both indexers get it right. I remember seeing a record where someone had indexed the maiden name as Ruhr. Looking at the record, it actually said Unknown.
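For anyone curious how that two-indexer check might look in code, here is a rough sketch in Python. It is only an illustration of the idea described above, not FamilySearch's actual system: the function names (`normalize`, `disputed_fields`, `reconcile`), the field names, and the rule of ignoring case and extra spaces are all my own assumptions.

```python
from typing import Callable, Dict, List

# One indexer's transcription of a record: field name -> value typed in.
Record = Dict[str, str]


def normalize(value: str) -> str:
    """Collapse whitespace and ignore case (an assumed comparison rule)."""
    return " ".join(value.split()).casefold()


def disputed_fields(first: Record, second: Record) -> List[str]:
    """Return the fields where the two indexers disagree, sorted by name."""
    fields = sorted(set(first) | set(second))
    return [
        f for f in fields
        if normalize(first.get(f, "")) != normalize(second.get(f, ""))
    ]


def reconcile(
    first: Record,
    second: Record,
    arbitrate: Callable[[str, str, str], str],
) -> Record:
    """Accept agreed fields as-is; hand each disputed field to a reviewer.

    `arbitrate` stands in for the third person: it receives the field name
    and both candidate values and returns the value to keep.
    """
    result = dict(first)
    for field in disputed_fields(first, second):
        result[field] = arbitrate(field, first.get(field, ""), second.get(field, ""))
    return result


# Hypothetical example: the indexers agree on the surname (ignoring case)
# but not on the maiden name, so only that field goes to the reviewer.
indexer_a = {"surname": "Nielsen", "maiden_name": "Ruhr"}
indexer_b = {"surname": "nielsen", "maiden_name": "Unknown"}
final = reconcile(indexer_a, indexer_b, lambda field, a, b: b)
```

Notice that a check like this only catches disagreements. If both indexers make the same mistake, nothing gets flagged, which is exactly the weakness I mentioned above.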

October 26, 2020 – by David Nielsen


If this article caught your eye, you probably have an interest in indexing or in online historical records. Maybe you’ve made indexing a part of your weekly or monthly volunteer efforts. If so, keep up the amazing work! You’re making it possible for people around the world to discover their ancestors and learn more about their family histories.

Still, our indexing volunteers have a colossal task in front of them. The world has billions and billions of records waiting to be indexed. Although we have hundreds of thousands of people willing to help out, we’re still outnumbered and it is clear that our volunteers will need help.


Enter optical character recognition—also called OCR, or computer-assisted indexing. Either name works—the more important thing is that the technology works. Thanks to OCR, we’re improving the quality of indexing, increasing the number of indexed records, and accelerating the speed at which historical records become available to the people who visit our website.

The result is more information for people to search and more documents to explore—in short, more opportunities to make that discovery about your family that connects you to your past.


FamilySearch and Computer-Assisted Indexing

So far, FamilySearch has employed optical character recognition to index a whopping 64 million historical records. The project in question involves a collection of Spanish-language records, namely christenings, marriages, burials, and other church documents. When the project is complete, nearly 900 million records will have been indexed and will need review by an actual person.


About FamilySearch

FamilySearch International is the largest genealogy organization in the world. FamilySearch is a nonprofit, volunteer-driven organization sponsored by The Church of Jesus Christ of Latter-day Saints. Millions of people use FamilySearch records, resources, and services to learn more about their family history. To help in this great pursuit, FamilySearch and its predecessors have been actively gathering, preserving, and sharing genealogical records worldwide for over 100 years. Patrons may access FamilySearch services and resources free online or through over 5,000 family history centers in 129 countries, including the main Family History Library in Salt Lake City, Utah.

About Wichita Genealogist

Originally from Gulfport, Mississippi. Live in Wichita, Kansas now. I suffer from Bipolar I, ultra-ultra rapid cycling, mixed episodes. Blog on a variety of topics: genealogy, DNA, and mental health, among others.