Application Series 13


RECORDING THE NEXT GENERATION

Dr Campbell describes the use of SIR in one of the longest-running continuous databases known, the 'Aberdeen Maternity and Neonatal Databank'.

Background

In 1950 Sir Dugald Baird, the Professor of Obstetrics and Gynaecology at the University of Aberdeen, Scotland, decided that all obstetric and gynaecological events in Aberdeen Maternity Hospital and its associated nursing homes would be recorded for research purposes. Thanks to his foresight, data relating to all reproductive events (deliveries and gynaecological events) for the residents of the geographically defined area of Aberdeen City have been continuously recorded since that date.

Apart from a very small number of missed cases (less than 0.1%), this is a complete record for a city with a current population of 250,000. The population remains quite stable despite the oil boom of the 1970s: approximately 60% of women currently in their first pregnancy were born and bred in Aberdeen. Such stability means that this unique and valuable databank holds the reproductive history of up to three generations of Aberdeen women, sometimes within one family.

Ingenuity of early days

The data were originally recorded on Cope Chat cards, which are ruled cards about 9" by 4" with a series of holes punched round the four sides. A category was allocated to each hole and, where that category applied to a case, the hole was converted to a notch with a special punch. Information was recorded by writing on the card as well as by 'notching'.

Analysis was carried out by pushing a long needle through a chosen hole in a stack of cards and lifting. Cards which had been 'notched' at that hole dropped out, so it was easy to collect a subset of cards with a specific characteristic. This ingenious method is a far cry from the power and flexibility of PQL!
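The needle-and-notch selection described above can be modelled in a few lines of Python. This is only an illustrative sketch: the card identifiers and hole names (`first_pregnancy`, `twin_delivery`) are invented for the example and are not taken from the original cards.

```python
from dataclasses import dataclass, field


@dataclass
class Card:
    case_id: str
    # Names of the holes that have been cut open to the card edge.
    notched: set = field(default_factory=set)


def needle_sort(deck, hole):
    """Push a needle through `hole` and lift the deck.

    Cards notched at that hole fall off the needle (dropped);
    cards with an intact hole stay on the needle (lifted).
    """
    dropped = [card for card in deck if hole in card.notched]
    lifted = [card for card in deck if hole not in card.notched]
    return dropped, lifted


# Hypothetical three-card deck.
deck = [
    Card("A1", {"first_pregnancy", "twin_delivery"}),
    Card("A2", {"first_pregnancy"}),
    Card("A3", {"twin_delivery"}),
]

# Two passes of the needle select cards having both characteristics.
first, _ = needle_sort(deck, "first_pregnancy")
both, _ = needle_sort(first, "twin_delivery")
selected_ids = [card.case_id for card in both]
print(selected_ids)
```

Repeating the pass on the cards that dropped out narrows the selection, which is the manual equivalent of combining conditions in a query.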

Computers arrive

When the university mainframe computer became available for use in the 1970s, a suite of programs was set up to store, manipulate and analyse the data. These programs became more and more complex over the years as the number of records and the quantity of data recorded for each event increased.

A complicated system of batch command files was created by the small support team to perform routine tasks while any non-routine analyses required specialist programming. The system became more and more unwieldy while also being reliant on the knowledge and expertise of fewer and fewer people.

Crisis!

In late 1984 a crisis point was reached when the last remaining member of the original team left suddenly. The options for coping were to abandon the data completely, to attempt to unravel the complicated web of programs, or to transfer to a new system.

The last option was chosen, as discarding such a priceless data set was unthinkable, but what should the new system be? A new suite of programs could be developed, but past experience militated against this. One of the new relational systems on the market could be used. Alternatively SIR, a system untried at Aberdeen at least, could be introduced. After discussion and debate between 'database experts', common sense prevailed and SIR was chosen.

SIR Introduced

The latter part of 1985 was used to develop the SIR database in consultation with colleagues and computing advisers. One of the early tasks was to get data entry functioning in order to maintain continuity of data collection. As SIR/FORMS was not available on the mainframe at the time, data entry was carried out using a simple micro-based card index system. This went live on 1st January 1986, and the programs to transfer the data from the micro into SIR were set up. By April all was working with the new databank, and attention then had to be focused on rescuing the historical data.

After a number of attempts to work out the complex interactions of the existing suite of programs failed, a simple Fortran program was written which dumped data from the latest set of tapes as raw data files. SPSS was then used to generate the input files for Batch Data Input, and thus the data from 1950 onwards were incorporated.

Data Entry

When the university changed mainframes in 1988, SIR/FORMS and then SIR/MFORMS became available and data entry was transferred to these. Data are entered into an independent database in order to maximise response speeds, accommodate alternative indices and allow for entry of text descriptions which will ultimately be converted to codes.

Up to five simultaneous users update this 'input' database from the paper case note records once a woman has been discharged. Once a week a series of SIR programs converts text descriptions to numeric codes, calculates and stores various items such as the duration of the stages of labour, and updates the full research database. Any cases with text which cannot be coded are left to be dealt with manually.
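The weekly transfer can be pictured roughly as follows. This is a Python sketch of the general idea, not the SIR programs themselves: the code table, field names and timestamp format are all assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical code table; the databank's real coding scheme is not shown here.
DELIVERY_CODES = {"spontaneous vertex": 1, "forceps": 2, "caesarean section": 3}

TIME_FORMAT = "%Y-%m-%d %H:%M"  # assumed format for recorded times


def code_text(text, table):
    """Return the numeric code for a text description, or None if it cannot be coded."""
    return table.get(text.strip().lower())


def minutes_between(start, end):
    """Whole minutes elapsed between two timestamps in TIME_FORMAT."""
    delta = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    return int(delta.total_seconds() // 60)


def weekly_update(input_records):
    """Split records into coded rows for the research database and rows left for manual coding."""
    coded, manual = [], []
    for rec in input_records:
        code = code_text(rec["delivery_text"], DELIVERY_CODES)
        if code is None:
            manual.append(rec)  # uncodable text is dealt with by hand
            continue
        coded.append({
            "case_id": rec["case_id"],
            "delivery_code": code,
            # Derived item: duration of the first stage of labour.
            "first_stage_min": minutes_between(rec["labour_onset"], rec["full_dilatation"]),
        })
    return coded, manual


# Invented example records.
records = [
    {"case_id": "C1", "delivery_text": "Forceps ",
     "labour_onset": "1995-01-03 22:15", "full_dilatation": "1995-01-04 05:45"},
    {"case_id": "C2", "delivery_text": "breech extraction?",
     "labour_onset": "1995-01-05 08:00", "full_dilatation": "1995-01-05 14:00"},
]

coded, manual = weekly_update(records)
print(coded)
print([rec["case_id"] for rec in manual])
```

Keeping the uncodable records in a separate list mirrors the practice of leaving them in the input database until someone has assigned a code by hand.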

The databank in use

At the start of 1995 the databank contained just over 110,000 cases, i.e. women, comprising 2.5 million records stored in 35 different record types. It grows at the rate of approximately 5,500 pregnancies/deliveries per year, with between two and three thousand new cases added.

The data are used for a number of research-related tasks. These range from epidemiological studies within Aberdeen to collaborative work with other hospitals world-wide. When working with external units, anonymous extracts of the data are generally provided, usually as SPSS files.

A popular current line of research attempts to link adult diseases with the antenatal environment. As this databank covers at least 45 years, we receive more and more requests to link adults to their mothers' records and thus to their own birth details. This is achieved by using SIR tabfiles to create secondary indices and matching on whatever identification has been provided by the requester.
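The matching step can be illustrated with a small Python analogy. It does not use SIR tabfile syntax, and the field names (`baby_dob`, `baby_sex`, `mother_case_id`, `study_id`) are hypothetical; the point is only that an index is built on whichever identifying fields the requester has supplied, and ambiguous or missing matches are set aside.

```python
from collections import defaultdict


def build_index(birth_records, key_fields):
    """Build a secondary index from the chosen identifying fields to birth records."""
    index = defaultdict(list)
    for rec in birth_records:
        key = tuple(rec[name] for name in key_fields)
        index[key].append(rec)
    return index


def link_adults(requests, birth_records, key_fields):
    """Link each requested adult to a mother's case when exactly one birth record matches."""
    index = build_index(birth_records, key_fields)
    linked, unmatched = {}, []
    for req in requests:
        key = tuple(req[name] for name in key_fields)
        candidates = index.get(key, [])
        if len(candidates) == 1:
            linked[req["study_id"]] = candidates[0]["mother_case_id"]
        else:
            # No match, or more than one: needs further identification.
            unmatched.append(req["study_id"])
    return linked, unmatched


# Invented birth records held in the databank.
births = [
    {"mother_case_id": "M100", "baby_dob": "1952-03-14", "baby_sex": "F"},
    {"mother_case_id": "M205", "baby_dob": "1952-03-14", "baby_sex": "M"},
    {"mother_case_id": "M310", "baby_dob": "1960-07-01", "baby_sex": "F"},
    {"mother_case_id": "M311", "baby_dob": "1960-07-01", "baby_sex": "F"},
]

# Invented request from a researcher who supplied date of birth and sex only.
requests = [
    {"study_id": "S1", "baby_dob": "1952-03-14", "baby_sex": "F"},
    {"study_id": "S2", "baby_dob": "1960-07-01", "baby_sex": "F"},
    {"study_id": "S3", "baby_dob": "1971-11-30", "baby_sex": "M"},
]

linked, unmatched = link_adults(requests, births, ["baby_dob", "baby_sex"])
print(linked)
print(unmatched)
```

Passing a different list of key fields rebuilds the index on other identifiers, which corresponds to matching on "whatever identification has been provided".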

Many researchers learn SIR in order to extract their own datasets for analysis, although for reasons of confidentiality this is restricted to workers in Aberdeen who already have access to such confidential patient data.

Portability

The power, flexibility and portability of SIR have meant that the databank has survived two changes of machine and operating system and responds quickly to changes in the data items recorded as medical practice evolves.

Apart from the data entry staff, all maintenance is carried out by a single clinician in research time with some support from a member of the university computing centre staff.

For more information on 'The Aberdeen Maternity and Neonatal Databank' contact:

Dr. D. M. Campbell,
Dept of Obstetrics and Gynaecology,
Aberdeen Maternity Hospital,
SCOTLAND
