GEDCOM

Unlike many other introductions to GEDCOM, this text is not about the technical details of the GEDCOM data format, but about basic facts and real-world issues.

misnomer

GEDCOM is an acronym for Genealogical Data Communications.

That name is an unfortunate misnomer. If you knew nothing but the name, you would probably guess that GEDCOM is some kind of communication protocol, but it is not. GEDCOM isn't some language through which genealogy applications talk to each other. A few such protocols do exist, but GEDCOM isn't one of these. GEDCOM is not about data communications at all; it is just a file format for data exchange.

data transfer

GEDCOM is a data file format, specifically created for the transfer of genealogy data. The idea is that you can transfer data between two genealogy applications by exporting it from one application into a GEDCOM file and then importing that GEDCOM file into the other application.

There are several reasons why, in actual practice, GEDCOM does not fully live up to that ideal.

de facto standard

GEDCOM is a de facto standard for data transfer between genealogy applications. GEDCOM is not a de jure standard managed by some official standards body. There is no de jure standard for genealogy, but almost every genealogy application supports GEDCOM.

There are several alternative data formats, but none is as widely supported.

before GEDCOM

rekeying

Users who switched between the first few genealogy applications had to print out their data from the old application, and then rekey all of it into the new one. No one had very big databases yet, so that was doable, but it was also error-prone and cumbersome. Nobody enjoyed rekeying their data, and many early adopters had years of research on paper, so their databases grew quickly.

direct import

Soon, several genealogy applications supported direct import from competing products. This is easiest for the user, so even today, many genealogy applications support direct import from several competing products.

However, the direct import is generally limited to a handful of major products, while there are literally hundreds of genealogy applications on the market. It is impractical for any vendor to support them all, and even if direct import from a particular product is supported, that support does not always include the most recently released version of that product.

GEDCOM versions

early versions

GEDCOM is strongly associated with PAF, but PAF isn't the first product to support GEDCOM. The Family History System is the only application to ever support GEDCOM 1.0.

The first version of PAF to support GEDCOM is PAF 2.0, which supports GEDCOM 2.0.

compatibility

The various versions of GEDCOM are not very compatible with each other. That hardly matters anymore, as GEDCOM 5.5 was released in 1995, and GEDCOM 5.5.1 in 1999. Practically every genealogy application supports GEDCOM 5.5 or 5.5.1.

versions

This table shows the main GEDCOM versions.

date        version  brief note
1984        1.0      first version
1985-12     2.0      PAF 2.0
1987-02     2.1      PAF 2.1
1987-10-09  3.0      lineage-linked form
1989-08-04  4.0
1991-09-25  5.0      draft lineage-linked structures
1993-11-04  5.3      Unicode, schema (abandoned), used draft
1995-12-11  5.5      official standard
1999-10-02  5.5.1    de facto standard
2000-12-18  5.6      unreleased draft
2001-12-28  6.0      abandoned draft

There have been additional drafts in between these versions. GEDCOM version 5.3 is a draft, but is included in this table because several applications, most notably several versions of Family Tree Maker, use it anyway.

GEDCOM 5.5 versus 5.5.1

Officially, version 5.5.1 is still a draft, but that is only because FamilySearch forgot to make it official. Many applications, including their own PAF application, use GEDCOM features introduced in GEDCOM 5.5.1.

GEDCOM 5.6

GEDCOM 5.6 is an unreleased GEDCOM draft from 2000 that surfaced early in 2011. It does not really offer any new features, except that its GEDXML foreshadows GEDCOM XML.

GEDCOM XML

GEDCOM XML 6.0 is an ill-chosen name. It is not version 6.0 of GEDCOM, but a draft of an intended replacement of GEDCOM, and should really have received another name and version number 0.9. FamilySearch abandoned development of this replacement, and did not resume maintaining GEDCOM.

current version

Version 5.5.1, introduced on 1999 Oct 2, is still the latest version of GEDCOM and the de facto standard.

GEDCOM

Several genealogy software vendors started talking about a standard for exchanging data and one of them created GEDCOM.

GEDCOM soon enjoyed widespread support among genealogy software, not because it is the best standard for genealogy data, but merely because it was the first one. Once several major vendors supported it, every new genealogy application had to support it.

owner

GEDCOM was created by FamilySearch. At the time they created GEDCOM, FamilySearch was still known as the Family History Department (FHD) of The Church of Jesus Christ of Latter-day Saints (LDS). The LDS is the largest Mormon denomination, and Mormons have an interest in genealogy for religious reasons. FamilySearch is one of the earliest genealogy software vendors; they started selling their application, Personal Ancestral File (PAF), in 1984.

Although GEDCOM is strongly associated with PAF, PAF wasn't the first application to support GEDCOM. PAF 2.0 is the second application to support GEDCOM.

The first application to support GEDCOM, and the only one to ever support GEDCOM 1.0, is the Family History System (FHS), an MS-DOS application created by Phillip Brown. PAF 2.0, the first version of PAF to support GEDCOM, supports GEDCOM 2.0.

The Family History Department changed its name and is now known as FamilySearch. FamilySearch owns and officially maintains the GEDCOM specification, but has been remarkably inactive in its role as keeper of the standard since the release of GEDCOM version 5.5.1.

religious data format

In some sense GEDCOM is perfect. Genealogists tend to think of GEDCOM as a genealogical data format, but to the LDS it is a religious data format, a format to exchange data between databases they maintain for religious reasons.

GEDCOM has shortcomings as a genealogical data format because the LDS is not primarily interested in genealogy, but in recording religious rites performed for their ancestors.

problems

Practically all genealogy applications support GEDCOM, but that still does not mean that you can expect a flawless transfer of your data by exporting your data to a GEDCOM file from one product and then importing that GEDCOM file into another product.

insufficient

The GEDCOM specification is far from perfect. There are various known errors and unnecessary limitations that should have been fixed a long time ago, but FamilySearch refuses to fix or update the specification. The most unbelievable shortcoming is that the GEDCOM specification still does not provide a standard for any partnership type other than marriage.

extensions

Vendors are allowed to extend GEDCOM to add support for genealogical data that standard GEDCOM does not support, but other genealogy applications may not support these extensions.

The combination of whatever idiosyncrasies and shortcomings a product's GEDCOM files have, and the GEDCOM extensions that product uses, is known as that product's GEDCOM dialect. Vendors do try to support each other's GEDCOM dialects, but at the same time generally do not bother to document their own GEDCOM dialect.
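To get a rough impression of how much of a file is dialect rather than standard GEDCOM, you can list the extension tags it uses. The following is a minimal sketch, not a GEDCOM parser. It assumes the GEDCOM 5.5 convention that user-defined tags start with an underscore; the function name and the sample data are made up for illustration.

```python
import re

# One GEDCOM line: level, optional @xref@ identifier, tag, optional value.
LINE = re.compile(r"^\s*(\d+)\s+(?:(@[^@]+@)\s+)?(\S+)(?: (.*))?$")

def extension_tags(gedcom_text):
    """Return the sorted list of underscore-prefixed tags used in the text."""
    tags = set()
    for line in gedcom_text.splitlines():
        match = LINE.match(line)
        # Assumption: extensions follow the underscore convention.
        if match and match.group(3).startswith("_"):
            tags.add(match.group(3))
    return sorted(tags)

# Made-up sample; the two underscore tags stand in for vendor extensions.
SAMPLE = """\
0 HEAD
1 CHAR UTF-8
0 @I1@ INDI
1 NAME John /Smith/
1 _UID 1234
1 BIRT
2 _PRIM Y
0 TRLR
"""

print(extension_tags(SAMPLE))  # ['_PRIM', '_UID']
```

A list like this only shows which extensions are present. What each tag means still has to come from the vendor's documentation, if there is any.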

quality

So, some problems that users encounter are inherent in limitations of the GEDCOM specification itself, but many problems are caused by the low quality of vendors' GEDCOM implementations.

The GEDCOM specification allows several character sets to be used. A common problem with old genealogy applications is that they do not support the character sets that they should support, which limits their ability to import GEDCOM files correctly or in fact import them at all.
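A GEDCOM file declares its character set on a CHAR line in its header record, so an importer can check that declaration before it tries to read anything else. The sketch below shows one way to find it. The function name is hypothetical, and the approach is an assumption for illustration: it only works for files whose bytes are ASCII-compatible, so it will not find the declaration in a file stored as UTF-16.

```python
def declared_charset(gedcom_bytes):
    """Return the value of the header's CHAR line, or None if there is none."""
    # Drop a UTF-8 byte order mark, if present.
    if gedcom_bytes.startswith(b"\xef\xbb\xbf"):
        gedcom_bytes = gedcom_bytes[3:]
    # Decode leniently; levels and tags are plain ASCII.
    # Limitation: UTF-16 files are not handled by this sketch.
    text = gedcom_bytes.decode("ascii", errors="replace")
    in_header = False
    for line in text.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) < 2:
            continue
        level, tag = parts[0], parts[1]
        if level == "0":
            if in_header:
                break  # the header record has ended
            in_header = (tag == "HEAD")
        elif in_header and level == "1" and tag == "CHAR":
            return parts[2].strip() if len(parts) > 2 else None
    return None

print(declared_charset(b"0 HEAD\n1 CHAR ANSEL\n0 TRLR\n"))  # ANSEL
```

An importer that does not support the declared character set should say so, rather than import the file with mangled names.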

Another common problem is that implementations provide incomplete support for the GEDCOM standard. In practice, many applications support no more than the application itself uses. For example, a common shortcoming of many genealogy applications is that they allow just one name per individual, while the GEDCOM specification allows more than one.
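To see whether the one-name limitation would affect your data, you can count the names each individual has in a GEDCOM file. This is a minimal sketch with a hypothetical function name and made-up sample data. It assumes that each individual is a level 0 INDI record and that each of its names is a level 1 NAME line.

```python
def names_per_individual(gedcom_text):
    """Map each individual's identifier to its number of NAME lines."""
    counts = {}
    current = None
    for line in gedcom_text.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) < 2:
            continue
        if parts[0] == "0":
            # A level 0 line starts a new record; only INDI records count.
            if len(parts) == 3 and parts[2].strip() == "INDI":
                current = parts[1]
                counts[current] = 0
            else:
                current = None
        elif current is not None and parts[0] == "1" and parts[1] == "NAME":
            counts[current] += 1
    return counts

# Made-up sample: the first individual has a married and a birth name.
SAMPLE = """\
0 HEAD
0 @I1@ INDI
1 NAME Mary /Smith/
1 NAME Mary /Jones/
0 @I2@ INDI
1 NAME John /Smith/
0 TRLR
"""

print(names_per_individual(SAMPLE))  # {'@I1@': 2, '@I2@': 1}
```

Any individual with a count above one is at risk of losing names when imported into an application that keeps just one.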

import log

On import of a GEDCOM file, a genealogy application should produce an import log, a simple text file that provides a log of any issues encountered during the import.

What makes many of the GEDCOM import limitations worse is that many genealogy applications do not bother to make an import log, or are not honest about the application's limitations. Some vendors will rather lie that your GEDCOM file is wrong than admit to a limitation in their product.

Even with an honest import log, it can be difficult to understand what went wrong, but without an import log the average user is unable to judge how well the import went.
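What an honest import log amounts to can be sketched in a few lines. The example below is an illustration under stated assumptions, not how any particular product works: the set of supported tags, the function name and the wording of the messages are all invented. The point is that every line the importer does not handle gets reported, with its line number, as a limitation of the importer.

```python
# Hypothetical importer that understands only these tags.
SUPPORTED_TAGS = {"HEAD", "CHAR", "INDI", "NAME", "SEX",
                  "BIRT", "DEAT", "DATE", "PLAC", "TRLR"}

def import_with_log(gedcom_text, supported=SUPPORTED_TAGS):
    """Return log messages for everything this importer does not import."""
    log = []
    skip_below = None  # level of the last unsupported tag, if any
    for number, line in enumerate(gedcom_text.splitlines(), start=1):
        parts = line.strip().split(None, 2)
        if len(parts) < 2 or not parts[0].isdigit():
            log.append(f"line {number}: not a valid GEDCOM line, skipped")
            continue
        level = int(parts[0])
        # Subordinate lines of an unsupported tag are covered by its entry.
        if skip_below is not None and level > skip_below:
            continue
        skip_below = None
        if parts[1].startswith("@") and len(parts) > 2:
            tag = parts[2].split()[0]  # record line: level, @xref@, tag
        else:
            tag = parts[1]
        if tag not in supported:
            log.append(f"line {number}: tag {tag} is not supported by this "
                       "importer; it and its subordinate lines were not imported")
            skip_below = level
    return log

# Made-up sample with one extension tag and one unsupported standard tag.
SAMPLE = """\
0 HEAD
1 CHAR UTF-8
0 @I1@ INDI
1 NAME John /Smith/
1 _UID 1234
1 RESI
2 DATE 1900
1 BIRT
2 DATE 1 JAN 1880
0 TRLR
"""

for entry in import_with_log(SAMPLE):
    print(entry)
```

For the sample, the log has two entries, one for line 5 and one for line 6. Note that both messages blame the importer, not the file.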

multimedia

GEDCOM does support multimedia. However, this was only added to GEDCOM after several applications had already decided on their own approach. Although the current standard has been around for some time, transfer of multimedia between applications remains problematic, not least because the standard is insufficient.

There are two main issues. One is that the multimedia files must be transferred along with the GEDCOM file, but that the standard does not specify any format for packaging all the files together, leaving the user to manage the file transfers themselves.

The second problem is that the specification does not specify where multimedia files should be stored with respect to the database or GEDCOM file; in practice GEDCOM files contain full directory paths that are unlikely to match those of another application on another system.
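Before you move a GEDCOM file to another system, you can check which of its multimedia references are full paths that are unlikely to exist there. The sketch below assumes that multimedia files are referenced on FILE lines; the function name and the sample paths are made up. It treats a path as absolute if it is absolute by either Windows or POSIX rules, because a GEDCOM file does not tell you which system produced it.

```python
from pathlib import PurePosixPath, PureWindowsPath

def absolute_media_paths(gedcom_text):
    """Return (line number, path) for each FILE line with an absolute path."""
    found = []
    for number, line in enumerate(gedcom_text.splitlines(), start=1):
        parts = line.strip().split(None, 2)
        if len(parts) == 3 and parts[1] == "FILE":
            path = parts[2]
            # Check both conventions; the file may come from either system.
            if (PureWindowsPath(path).is_absolute()
                    or PurePosixPath(path).is_absolute()):
                found.append((number, path))
    return found

# Made-up sample: two full paths and one relative path.
SAMPLE = r"""0 @M1@ OBJE
1 FILE C:\Users\Ann\Pictures\grandma.jpg
0 @M2@ OBJE
1 FILE /home/ann/pictures/grandpa.jpg
0 @M3@ OBJE
1 FILE media/wedding.jpg
"""

for number, path in absolute_media_paths(SAMPLE):
    print(number, path)
```

For the sample, this reports lines 2 and 4. The relative path on line 6 has a chance of surviving the move, provided the media folder travels along with the GEDCOM file.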

export

When it comes to GEDCOM support, vendors still tend to focus on GEDCOM import rather than GEDCOM export. Vendors focus on the ability of their application to import GEDCOM files created by other applications. Many vendors even proudly include, in their feature list, all the applications whose files they believe their application imports perfectly.

However, what is more important to you as a user is the quality of the GEDCOM export, and how well other applications support the product's GEDCOM dialect. After all, if no other application can import those files, you have been locked into that product, unable to switch to another.

FTW TEXT

Some vendors have taken so many liberties with the GEDCOM specification that what their application produces isn't GEDCOM at all. Family Tree Maker is rightly infamous for producing an FTW GEDCOM dialect so awful that it seems deliberately incompatible.

Even worse, several versions of Family Tree Maker default to creating ostensible GEDCOM files that are not GEDCOM files, but FTW TEXT files. The product's dialog boxes are dishonest about this in a way that makes a user who does not know better believe that FTW TEXT is real GEDCOM. The current owner of Family Tree Maker, Ancestry.com, should release a free FTW TEXT to GEDCOM conversion tool, but still has not done so.

GEDCOM alternatives

alternatives

Over the years, various alternatives to GEDCOM have been proposed, including FamilySearch's own ill-named GEDCOM XML 6.0. The sheer number of available alternatives is an embarrassment of riches.

These alternatives generally offer worthwhile advantages over GEDCOM, yet not one alternative has achieved significant industry adoption.

adoption

One reason for limited support for alternatives is that most vendors are not eager to support a standard controlled by another vendor.

Some proposals are vendor-independent, but getting any new standard - however good - adopted is difficult. Vendors are unlikely to invest in a format unless it is about to become the new industry standard, but it will not become a new standard unless vendors invest in it.

common extensions

One approach to solving some of GEDCOM's limitations that has been successful is the development of common extensions: a collection of GEDCOM extensions common to a group of products.

GEDCOM 5.5 EL (Extended Location) was developed by a group of German genealogy vendors in collaboration with the Verein für Computergenealogie e.V. (Society for Computer Genealogy). GEDCOM 5.5 EL is supported by many German genealogy applications and is freely available to other vendors to implement in their product.

replacement

Another approach to deal with GEDCOM's limitations is to create another, better standard, to replace GEDCOM. Many GEDCOM alternatives have been proposed. Most have been forgotten. None enjoy wide industry support.

The GEDCOM Alternatives article provides an overview. Two current developments are FHISO and GEDCOM X.

FHISO

BetterGEDCOM, an informal grassroots project to create a GEDCOM replacement, has spawned the formal Family History Information Standards Organisation (FHISO). FHISO aims to develop modern standards for genealogy data.

GEDCOM X

Late in 2011, FamilySearch's GEDCOM X project was uncovered. FamilySearch opened up the GEDCOM X website early in 2012. The name is likely to cause confusion; like GEDCOM XML, GEDCOM X is not a new version of GEDCOM, but yet another GEDCOM alternative.

conclusion

GEDCOM is a data format for genealogical data. GEDCOM allows transferring data from one genealogy application to another, but because of inherent GEDCOM limitations, incomplete specifications, unsupported dialects and poor implementations, that transfer may be less than perfect. On top of that, many applications do not even provide an import log to help you figure out how well the import went.

GEDCOM is not perfect, and not perfectly supported either, but it is the only widely supported standard.

In practice, basic data such as names and vital events transfer just fine, and that is already a large improvement on a world without any standard for genealogy data. A lot of other data such as notes and sources generally transfer successfully as well. Moreover, GEDCOM dialects of popular products tend to be supported by many other products.

Vendors tend to stress the ability of their product to import GEDCOM files created by competing products, but to a user, the more important thing is the quality of the GEDCOM files it exports, as that largely determines the ability of other products to import those GEDCOM files. Only when other applications will import the file can you use a GEDCOM file to do what it was designed to do: move your data from one application to another.

updates

2010-11-05: GEDCOM Alternatives

The GEDCOM Alternatives article provides an overview of the many GEDCOM alternatives proposed over the years.

2011-01-07: GEDCOM 5.6

The hitherto largely unknown and never officially released GEDCOM 5.6 draft has surfaced.

2011-12-12: GEDCOM X revealed

FamilySearch's GEDCOM X alternative to GEDCOM revealed through the GEDCOM X article.

2012-02-01: Family History Information Standards Organisation

Family History Information Standards Organisation (FHISO) officially introduced.

2012-02-02: GEDCOM X website public

GEDCOM X was known already; the GEDCOM X website has now been made public too.

2014-09-01: GEDCOM 1.0

Article updated regarding GEDCOM 1.0: there is an implementation. See the GEDCOM 1.0 article for an example GEDCOM 1.0 file.

Copyright © Tamura Jones. All Rights reserved.