History and Description
These files of population and death statistics from developing countries were amassed by the Organisation for Economic Co-operation and Development, an intergovernmental organization representing 25 of the world's democracies with advanced market economies. One result of the project was the UN Model Life Tables for Developing Countries. The data contained in these files are for the most part raw census and vital registration data, and the quality of many (probably most) of them was such that the UN could not use them as standards for the model life tables.
For some countries, registered deaths are available from the 1920s up to the mid-1970s. Some statistics are available for both urban and rural populations, as well as other subgroups of the population. For many of the countries included in the data files, this is almost all of the data that exist. Aside from their historical interest, these data may be useful for illustrating patterns of age misreporting or for teaching methods of adjusting data.
Format of the Data
Each file contains a number of series of populations and deaths. They are arranged by country, alphabetically (French names), with populations (from earliest to most recent) followed by deaths (from earliest to most recent).
Each series begins with a header record (with a 0 in column 1) that identifies the series that follows and gives additional information about the series. The format of these headers is rather loosely structured, making it somewhat difficult to write a general program to process all the data. The header contains a variable number of free-format alphanumeric fields. The general format is:
- "0" in column 1 identifies a header record
- "S" or "D" ("S"=population; "D"=deaths)
- Country or region identifier, with spaces represented by "_"
- Date (Month/year for populations, year for deaths)
- "VAR=" or "REG=" or "=" followed by the number of age groups; there is also a final group for those with unknown age, which is not included in this count. "REG=" (or just plain "=") indicates that the age groups are of regular width.
- "CLASS=1" or "CLASS=5" indicates that the age intervals are in single years or 5-year groups. Series with "regular" widths in Class 5 have a first age interval representing age 0, a second interval representing ages 1-4, and 5-year intervals thereafter, except for the last interval, which represents people with unknown age. If this field is left off, the data are CLASS=5.
- For series with variable-width age intervals, a series of fields follows indicating the irregularities involved. These fields have the general structure ">x:w", meaning that above age x, the width of the interval is w years. Later fields may modify earlier ones, e.g.:
  >10:10 >30:15
  If the field refers to a single age interval, the ">" is sometimes left off. Sometimes the fields are enclosed in parentheses.
- The record is terminated by a "."
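As a rough illustration of the age-group conventions above, the following Python sketch expands a regular series into (start age, width) pairs and splits a single ">x:w" field into its parts. The function names are hypothetical and not part of the original documentation. Two points are assumptions: that header fields are separated by whitespace (so a parenthesis may be attached to either end of a field), and that the age-group count excludes the unknown-age category, as stated above. Whether the last known-age group is open-ended is not documented, so the sketch does not mark it as such.

```python
import re

# Optional "(" and ")" around the field, optional ">" before the age.
_WIDTH_FIELD = re.compile(r"\(?(>?)(\d+):(\d+)\)?")


def regular_age_groups(n_groups, age_class=5):
    """Return (start_age, width) pairs for a series with regular age groups.

    CLASS=1: single years of age. CLASS=5: age 0, ages 1-4, then 5-year
    groups. The unknown-age category is not included in the result.
    """
    if age_class == 1:
        return [(age, 1) for age in range(n_groups)]
    if age_class == 5:
        groups = [(0, 1), (1, 4)]
        start = 5
        while len(groups) < n_groups:
            groups.append((start, 5))
            start += 5
        return groups[:n_groups]
    raise ValueError(f"unsupported CLASS value: {age_class}")


def parse_width_field(field):
    """Split a field such as ">10:10" into (age, width, applies_above).

    applies_above is True when the ">" prefix is present, i.e. the width
    applies to all intervals above the given age; False when the field
    refers to a single interval (assumed reading of the ">"-less form).
    """
    match = _WIDTH_FIELD.fullmatch(field)
    if match is None:
        raise ValueError(f"not a width field: {field!r}")
    return int(match.group(2)), int(match.group(3)), match.group(1) == ">"
```

For instance, `regular_age_groups(4)` gives `[(0, 1), (1, 4), (5, 5), (10, 5)]`, and `parse_width_field(">10:10")` gives `(10, 10, True)`.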
There are numerous exceptions to the general format, the chief ones being:
- There is sometimes a field containing "(E)" in between fields 2 and 3 as enumerated above. The meaning of this is not indicated in the documentation, but, based on an examination of the literature regarding the data, it seems to indicate that the series is an estimate derived from a survey or extensive correction of census or vital statistics data.
- If data are available for urban and rural populations separately, they are designated by a "U" or "R" between fields 4 and 5 as enumerated above.
- There are sometimes one or more spaces embedded in field 5 as described above; i.e., the format can be any of the following:
  "VAR=xx", "REG=xx", "=xx", "VAR =xx", "REG =xx"
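The header layout and the exceptions listed above can be sketched as a small Python parser. This is a minimal sketch under stated assumptions, not a tested reader for the actual files: it assumes that fields are separated by whitespace, that the optional "(E)", "U", and "R" markers appear exactly where described, and that anything left over after the age-group count is either a "CLASS=" field or an interval-width field. The function name `parse_header`, the dictionary keys, and the sample headers used in the note below are invented for illustration.

```python
import re


def parse_header(line):
    """Parse one header record into a dictionary (hypothetical layout)."""
    if not line.startswith("0"):
        raise ValueError("header records have '0' in column 1")
    body = line[1:].strip()
    if body.endswith("."):          # the record is terminated by "."
        body = body[:-1]
    # Close up embedded spaces: "VAR =12" -> "VAR=12", "REG =12" -> "REG=12".
    body = re.sub(r"\b(VAR|REG)\s+=", r"\1=", body)
    tokens = body.split()

    kind = tokens.pop(0)
    if kind not in ("S", "D"):
        raise ValueError(f"expected 'S' or 'D', got {kind!r}")

    estimate = bool(tokens) and tokens[0] == "(E)"
    if estimate:
        tokens.pop(0)

    country = tokens.pop(0).replace("_", " ")
    date = tokens.pop(0)            # month/year or year, kept as text

    area = None
    if tokens and tokens[0] in ("U", "R"):
        area = tokens.pop(0)

    match = re.fullmatch(r"(VAR|REG)?=(\d+)", tokens.pop(0))
    if match is None:
        raise ValueError("age-group count field not found where expected")

    age_class = 5                   # default when CLASS= is left off
    width_fields = []
    for token in tokens:
        if token.startswith("CLASS="):
            age_class = int(token[len("CLASS="):])
        else:
            width_fields.append(token)   # e.g. ">10:10", left unparsed

    return {
        "kind": kind,
        "estimate": estimate,
        "country": country,
        "date": date,
        "area": area,
        "regular": match.group(1) != "VAR",
        "n_groups": int(match.group(2)),
        "age_class": age_class,
        "width_fields": width_fields,
    }
```

Given the made-up header `0 D (E) SRI_LANKA 1971 U VAR =12 >10:10 >30:15 .`, this returns a deaths series flagged as an estimate for urban "SRI LANKA", with 12 variable-width age groups and the two width fields kept as raw strings. Because the real headers are loosely structured, a reader built along these lines should log and skip any header it cannot parse rather than guess.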
Following the header record is a variable number of 91-column records, each containing, in columns 2-91, ten 9-digit numbers representing the populations or deaths for the groups described in the header record. Column 1 is blank on the data records. The numbers are entered in fixed format with a decimal point following each number. The first set of records is for the male population, with as many records as needed to accommodate the number of age groups indicated in the header record, plus a final category for those of unknown age. (The documentation doesn't indicate whether every series contains a category for unknown age, so care should be used in making an assumption about the number of following records.) Following the male data, the female data begin on a new record with the same format.
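The fixed-column layout described above can be read with simple string slicing. The Python sketch below is illustrative only and its function names are hypothetical. It assumes that each number occupies a 9-column field including its trailing decimal point, that unused fields on the last record of each sex are blank, and that zero counts are written explicitly rather than left blank. Because the documentation does not say whether every series has an unknown-age category, that choice is left to the caller through the `has_unknown` parameter.

```python
import math


def parse_data_record(line):
    """Return the numbers found in columns 2-91 of one data record."""
    padded = line.rstrip("\n").ljust(91)
    values = []
    for i in range(10):
        # Field i occupies 9 columns, starting at column 2 (index 1).
        field = padded[1 + 9 * i : 1 + 9 * (i + 1)].strip()
        if field:                       # skip blank trailing fields
            values.append(float(field))
    return values


def records_per_sex(n_groups, has_unknown=True):
    """Number of data records needed for one sex."""
    n_values = n_groups + (1 if has_unknown else 0)
    return math.ceil(n_values / 10)


def split_by_sex(lines, n_groups, has_unknown=True):
    """Split the data records after a header into (male, female) values."""
    per_sex = records_per_sex(n_groups, has_unknown)
    male = [v for ln in lines[:per_sex] for v in parse_data_record(ln)]
    female = [v for ln in lines[per_sex:2 * per_sex]
              for v in parse_data_record(ln)]
    return male, female
```

A series with 17 age groups plus an unknown-age category has 18 values per sex and therefore 2 records per sex. Comparing the number of values actually read with the expected count is one way to detect a series that lacks the unknown-age category.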
Availability of Data
The data have no restrictions on use. See below for a file list.
References
- Waltisperger, D. Mortality Project: Annotated Bibliography on the Sources of Demographic Data. Volume I: Africa - Near East. Paris, Organisation for Economic Co-operation and Development. Development Centre. 1978.
- Canedo, A. Mortality Project: Annotated Bibliography on the Sources of Demographic Data. Volume II: Latin America and Caribbean Area. Paris, Organisation for Economic Co-operation and Development. 1979.
- Canedo, A. and Waltisperger, D. Mortality Project: Annotated Bibliography on the Sources of Demographic Data. Volume III: Asia. Paris, Organisation for Economic Co-operation and Development. 1979.
These books (bound together in one volume in the OPR library) contain detailed information about the sources of data, quality of data, and previous analyses or corrections of the data.
The following 7 files are available for downloading:
| File Name | Size (Zipped) | File Type | Access | Note |