Genealogy Using a Computer and the Internet
A Tutorial Document
L. David Roper (
July 2000

"If you don't know where you're from, you don't know where you're going." anonymous
"Those who cannot remember the past are condemned to repeat it." George Santayana - Philosopher 1863-1952

If any links on this page do not work, please e-mail me. (Sometimes links are down temporarily.) I realize that the many links may make the document hard to read; you can turn off the underlining in your browser if that is distracting. I favor inclusion of much information over ease of reading.

A Lindows/Linux Genealogy Notebook Machine


This tutorial document is not about how to do genealogy, nor is it about how to use computers. Instead, it is about how to use a computer as a tool in doing genealogy, with emphasis on using the Internet. For those who need introductory materials about how to do genealogy, see:

Some good sources for those who need help with computing are the web sites:

When I started doing genealogy in the early 1960s, paper and pencil were the most used technologies, with a typewriter to put the results in a readable form for distribution. One had to write to and visit many relatives, libraries and archival centers to gather the data.

The practice of gathering, organizing and reporting information about one's ancestors and relatives has been greatly changed by the continuing development of powerful computers and software and by the increasing information available on the Internet. We now have this wonderful tool set to help us do genealogy better and faster.

One still must start by contacting relatives and continue to gather data from them, libraries and archival centers through the years. However, every day more of the data that were formerly available only in libraries and archival centers are becoming available on the Internet.

A Computer for Genealogy

Using the Internet for Genealogy 

Back Up Your Data

You must regularly back up your data. If you do not, I guarantee that some day you will be very sorry! When you first start doing genealogy, floppy disks (1.44 Mbytes) are sufficient for backup. Most genealogy programs have a backup feature that compresses the files so that you can get much more on a disk. Windows 98 also has a backup program that does a good job of compressing files; unfortunately it is not automatically installed when one installs Windows on a machine; you may have to run Windows Setup in the Control Panel to install the Backup program. After a few years of doing genealogy, using floppy disks for backup will get very unwieldy since many floppy disks will be required. Then get an Iomega ZIP drive (100 Mbytes or 250 Mbytes), as described in the computer section above.

Every night I back up copies of changed genealogy files to two other computers on my home Local-Area-Network (LAN) and zipped copies of my genealogy files to a ZIP disk; once a month I put that weekly disk in a safety deposit box at my bank.
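A nightly routine like the one above can be scripted. The following is only a minimal sketch in Python, not the procedure I actually use: the function name zip_changed_files and its parameters are hypothetical, and it assumes that "changed" means "modified after the time of the previous backup".

```python
import os
import zipfile


def zip_changed_files(source_dir, archive_path, since_timestamp):
    """Compress files under source_dir modified after since_timestamp.

    since_timestamp is seconds since the epoch (for example, the time of
    the previous backup). Returns the relative paths that were archived.
    """
    archived = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _subdirs, filenames in os.walk(source_dir):
            for name in sorted(filenames):
                full_path = os.path.join(folder, name)
                # Never try to put the archive inside itself.
                if os.path.abspath(full_path) == os.path.abspath(archive_path):
                    continue
                if os.path.getmtime(full_path) > since_timestamp:
                    relative = os.path.relpath(full_path, source_dir)
                    archive.write(full_path, arcname=relative)
                    archived.append(relative)
    return archived
```

The resulting zip file can then be copied to another computer on the LAN or to a removable disk. Restoring a file from the archive now and then is a good check that the backup really works.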

In general you should back up your entire hard disk(s) to a DAT tape. I use a Hewlett-Packard 8-Gbyte DAT tape drive to back up my desktop and laptop (through the LAN) every week; once a month I put that weekly tape in a safety deposit box at my bank.

There are several web sites that provide free backup disk space. The price you pay is having advertising displayed on your computer screen when you do the backups. Some of the web sites are:

To determine whether you have adequately backed up your valuable data, ask yourself the following questions: Can I get to my data if:

Back Up and Share Data on CD-ROMs

It is now possible to create your own CD-ROMs (called CD-R or CD-RW disks, which can hold 650 Mbytes), but I find that ZIP disks are quite adequate if I compress the data files. Also, I think that ZIP disks are less susceptible than CD-ROMs to mechanical damage from handling; however, the reverse may be true for electronic damage.

As a result of studying CD-RW drives for this document, I have started using a CD-RW drive to make CD-RW and CD-R disks with genealogy data on them. The factors that convinced me to do this are:

There are two ways to create CD-ROMs:

Most CD-RW drives can also create CD-R disks. The read speed of CD-RW drives is much lower than that of the fastest CD-ROM drives.

I use a CD-RW drive to create, using my flat-bed scanner and Adobe Acrobat, CD-Rs containing back issues of the newsletter of my Capt. Daniel Little Family as PDF files to be read by the free Adobe Acrobat Reader. Then the newsletter issues are preserved for the long term. I also use it to send large genealogy files to others.

If you don't have a CD-ROM burner, will make CD-ROMs for you at a nominal price.

Genealogy and Privacy

Some people are very concerned about privacy with regard to genealogy data. (Often they are the very ones who are most anxious to get data they want from you.) They do not seem to realize that dates and places of births, marriages and deaths are public records. (I know of one young lady who was incensed that her marriage was published in the newspaper, where her parents first learned about it.) So one could always collect from the public records and be perfectly correct in making those data available to anyone. However, much genealogy data comes from the memory of persons and perhaps this makes some people think that those data for persons still alive should not be made public.

Some genealogists delete all dates and places for people who are thought to be living. There are programs for doing this automatically for GEDCOM files.
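To show roughly what such a program does, here is a minimal sketch in Python. It is not any particular published program. The function name strip_living_details is hypothetical, and the sketch assumes that a person is "thought to be living" when the individual (INDI) record has no death (DEAT) or burial (BURI) event. Real programs usually also consider birth dates and clean the marriage dates in family (FAM) records, which this sketch does not do.

```python
def strip_living_details(gedcom_text):
    """Drop DATE and PLAC lines from INDI records that show no death event."""
    # Split the file into records; each record begins with a level-0 line.
    records = []
    for line in gedcom_text.splitlines():
        if line.startswith("0 ") or not records:
            records.append([])
        records[-1].append(line)

    output = []
    for record in records:
        header = record[0].split()
        is_individual = len(header) >= 3 and header[2] == "INDI"
        # Assumption: a level-1 DEAT or BURI line means the person has died.
        has_death = any(
            line.split()[:2] in (["1", "DEAT"], ["1", "BURI"])
            for line in record
        )
        for line in record:
            parts = line.split()
            tag = parts[1] if len(parts) > 1 else ""
            if is_individual and not has_death and tag in ("DATE", "PLAC"):
                continue  # omit dates and places for presumed-living persons
            output.append(line)
    return "\n".join(output)
```

Names and family links are kept, so the pedigree structure survives while the details for presumed-living persons are omitted.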

When people object to the fact that I have data about their families and distribute those data freely to others, I try to reason with them as described above and about the value of the data for posterity. I give them the sources of the data so that they will know who to contact if they wish. Finally, if they insist, I remove dates and places for the persons that they request.

Once I had a relative get angry at me for indicating that some of her children are adopted. I agree that one should not emphasize that children are adopted, but I also believe that genealogical records should indicate such facts, which are public records. Nevertheless, I deleted those indicators in this case.

Often the most sensitive fact about a family is a child born out of wedlock, sometimes called a "natural" child in old documents. (PAF4 allows one to put "not married" in the marriage date field for such cases.) I do not recommend using the words "natural" or "bastard" child to describe such a child. Sometimes such a child is just listed as a child of a marriage of one of the parents, with perhaps a note explaining the situation. A judgment often must be made about how much of the truth to record in these cases.

As I was completing this document a person sent me the following message: "Where did you get my family information? Have we ever spoken before or traded info? Just curious; it's a bit unsettling to find someone who has your name on their site, and you don't know or have forgotten who they are..." I answered with details of the many sources where I got the data, told her how to download the entire huge family to which she belonged and asked her for any new data that she might have. I just received a friendly note back from her acknowledging that one of my sources was data she had submitted to World Family Tree.

Many of the Internet sites that make genealogical data available delete dates and places for persons who could still be alive. World Family Tree does the unfortunate thing of putting the word Private in the date fields on its CD-ROMs for those who may still be alive; it can involve much work to get rid of that unneeded word. (See above for PAF Pal that can get rid of it easily for PAF files.)

I prominently tell people to only send me data that I can freely give to others. My overall attitude is that all knowledge should be shared, except in cases of embarrassment and danger, which are judgment calls. I sometimes do not record events that might be deemed embarrassing.

Authenticity of Genealogy Data

I occasionally get a complaint that there are many errors in the data I have posted on the Internet. In most cases I have obtained those data from other persons by means of memories, books, letters, e-mail or the Internet. There is no way that I can guarantee that those data are authentic. (I have obtained much data myself from archival documents and family records, but those data are a small amount compared to the huge amount of data I obtained in other ways.)

A genealogist's worst nightmare is to record someone as being dead who is alive. As I wrote this section this nightmare happened to me for the first time in thirty years of genealogy work. The person said "I am sorry to bother you on this matter, it was just a little upsetting to me." I had the wrong sister of hers declared dead. I replied "Please forgive me for any discomfort that this error may have caused you. Thank you for correcting this egregious error." I sent to her a file containing all the data I had for her grandparents and for her ancestors. I also took the opportunity to ask her to help me get correct data for the descendants of her grandparents. As I finished this paragraph I got a kind note back from her.

My general approach is to indicate all places from which I get data in the notes for an individual or a marriage and then compare data that disagree to try to ascertain which is more likely to be correct. Some factors I use in deciding which of two sets of conflicting data I prefer are:

A genealogy data set will never be totally accurate. By comparing archival documents, family records, and family memories one uses reason to try to make the set as internally consistent as possible. One cannot expect more than that. I would guess that the data for about 500,000 people that I have collected are more than 95% correct.

At some point when collecting large amounts of data for a surname, enough data will be available in the collection to allow the collector to begin making many connections between otherwise unconnected data. This is happening for me now for my Roper and Franklin families. I often get new data from someone and can immediately connect the new data to a family I already have in my data set. Often I peruse one of these two data sets looking for connections; I almost always find some.

Recommendations about Collecting and Organizing Data

Although this is not a tutorial on genealogy, here are a few recommendations about collecting and organizing genealogy data:

Genealogical Genetics

Elementary genetics teaches that a human egg becomes a female embryo when a sperm gives it an X chromosome (mostly DNA) and a male embryo when the sperm gives it a Y chromosome (mostly DNA). Thus the Y chromosome is passed down generations only through the male line. You might want to read some articles about the Y chromosome.

So one can determine paternal lineage by comparing the DNA coding of Y chromosomes. This was recently done for descendants of Sally Hemings, a slave of Thomas Jefferson, with the result that one of her sons had a Jefferson Y chromosome, either from Thomas or from one of his near relatives.

I know of one surname for which an extensive project for Y-chromosome DNA testing has been carried out: the Savin family. I am trying to set up such testing for the Roper and Franklin families. (Almost every Franklin family has a tradition that they are related to Benjamin Franklin, statesman; my Franklin family is no exception. Such a project could validate some of those traditions.)

Mitochondrial DNA (mitochondria are the energy-producing organelles in the cytoplasm of cells, outside the nucleus) is passed down generations only through females. Thus, testing of mitochondrial DNA can aid in establishing matrilineal lines. This is more difficult in most European lines because the surnames of females are lost in naming children; the exceptions are Spanish and Portuguese, but only down one generation.
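The two inheritance rules in this section amount to following only father links (Y chromosome) or only mother links (mitochondrial DNA) through a pedigree. The Python sketch below illustrates that idea under stated assumptions: the function names trace_line and share_direct_line and the layout of the parents dictionary are hypothetical, the sex of each person is not modeled, and a Y-chromosome comparison is meaningful only when both persons tested are male.

```python
def trace_line(person, parents, which):
    """List person and then each ancestor reached by repeating one link.

    parents maps a name to a dict that may hold "father" and/or "mother".
    which is "father" for the Y-chromosome line or "mother" for the
    mitochondrial line.
    """
    line = [person]
    seen = {person}
    while True:
        parent = parents.get(line[-1], {}).get(which)
        if parent is None or parent in seen:  # end of data, or a data loop
            break
        line.append(parent)
        seen.add(parent)
    return line


def share_direct_line(person_a, person_b, parents, which):
    """True if the two direct lines meet, so matching DNA would be expected."""
    line_a = set(trace_line(person_a, parents, which))
    line_b = set(trace_line(person_b, parents, which))
    return bool(line_a & line_b)
```

For example, two half-brothers with the same father share a direct male line but not a direct female line, so their Y chromosomes would be expected to match while their mitochondrial DNA would not.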

See Alan Savin's short article Introduction to Genetic Genealogy.

Leave Your Data for Posterity

"Record your history so others may benefit from it." Chinese fortune cookie

Do not allow your data collection to die with you! Make arrangements for your data and genealogy files to be sent to some libraries or genealogy data depositories, such as the LDS Genealogy Library in Salt Lake City. I have such instructions in my safety deposit box at the bank, where I keep recent copies of the data on a ZIP disk.

The author is a retired physics professor who has been using computers and doing genealogy since the early 1960s.