Application Series 7


DSS MASTERS SURVEYS

Stuart Mitchenall, Computing Services Manager at the Department of Social Security, describes the skills his Department has developed in complex survey analysis and the essential role of SIR.

Background

Since the early 1980s the Department of Social Security (DSS), the UK Government department responsible for payments to pensioners, disabled people, the unemployed and other (generally) low-income groups, had been having problems with the level of detail available in existing socio-economic and household survey data. The Department's analysts were having to base forecasts of significant government expenditure on very small samples of particular client groups (such as single male pensioners over 80 or disabled women in their 50s).

Solution

By the end of the decade it had been agreed that the best answer to this problem was to launch a survey based on the needs of the Department, with a sample size large enough to give statistically valid samples of a far higher percentage of interest groups. As some groups of benefit recipients totalled only 4,000 to 5,000 in the entire population, it was obvious that adequate samples of every group would not be achieved without specifically targeting those groups. Nevertheless, for a variety of reasons the survey was aimed at a general sample rather than at specific groups. Issues such as confidentiality, problems with using DSS data to identify clients, and the UK Data Protection Acts all had an effect.

The survey was named the Family Resources Survey (FRS), and the project was launched using contracted resources.

The Use of Computers

From day one DSS insisted that these contractors use computer-aided personal interviewing (CAPI) techniques. Before the contracts were awarded we had analysed the various packages available and had decided to use Blaise, a tool developed and supplied by the Central Bureau of Statistics (CBS) in the Netherlands. We had looked at various solutions, but had decided that a dedicated tool able to run on the 286 portable computers then available was better than trying to adapt other software.

Blaise performed best for the type of survey we were conducting, with complex routing and validation features, combined with mathematical capability to validate quite complex individual requirements for survey questions if needed.

SIR the Best Option

Where does SIR come in? The UK government had established a pattern of using SIR for surveys with an inherently hierarchical structure, and we had some experience of the database from earlier analytical work. We had also used Ingres and Oracle for specific solutions, but found these products inefficient when measured against the needs of our analysts. That is not to say SIR was perfect, but our judgement was that it was better. Further, it supported reporting tools designed for the analytical environment, and in that sense it was unique. So we decided to use our existing expertise and go with a solution we had some confidence was a good fit to our data.

SIR Co-operates

The next problem was the environment. We were obliged at that time to consider open systems implementations very strongly, and we really wanted to use Unix, but SIR was not available on the Unix variants we were using for other purposes. DSS pushed us towards an ICL solution, and we were able to get preferential terms on DRS 6000 equipment. SIR agreed to co-operate in porting SIR to DRS/NX, and ICL agreed to cover the costs of installing equipment to allow the port. From that point all went pretty smoothly. ICL installed the system at the SIR offices, and the port was completed on schedule and made available to us in Central London in time for us to develop the database ready for the first pilot data.

How did SIR interact with a CAPI system? Blaise already had extensive output facilities, generating output files suitable for import into, amongst others, SAS and SPSS. With contract help we were able to generate a standard SIR export file to import the data into SIR, and had it not been for some predicted, but unaddressed, problems, we would have had a very simple importation system for our database.

The Problems

Firstly, we failed to treat the process as a single task. The companies conducting the survey were also given the task of writing the survey instrument for use in the field, and they wrote it without any thought for the output format they were generating or the eventual format of the database. They produced an ideal survey instrument without further consideration, and this was a mistake. Because of the way Blaise formats output, it would have been quite possible to ensure that the output data was block structured in line with the database structure, and that output was consistent from case to case. Unfortunately neither of these was true, so large amounts of time have had to be spent analysing individual cases to determine data value positions.

Secondly, Blaise follows some unique naming conventions relating to the recurring incidence of data - income for head of household, spouse, children, lodger, granddad, etc. Thus changes in data position within the output structure also result in changes in name.

Finally, when we embarked upon the survey the portables available were more restricted than those of today, and the questionnaire had to be split to enable the programs to run. Ideally it would have been split into database record types, which would have reduced the problems immensely, although validation of records across both individuals and households would then have been more complicated. We ended up with two large records (Household and Benefit Unit) which have to be read into import database records. We then process the import data to correct all the problems and derive the analytical record structure of the database.
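The restructuring step described above can be sketched roughly as follows. This is a minimal illustration in Python, not the Department's actual Blaise or SIR code: the field names (HHID, Region, Age1, Income1), the trailing-digit naming convention and the function split_household_record are all assumptions invented for the example.

```python
import re
from collections import defaultdict

# Hypothetical convention: a repeated field carries a trailing person
# number, e.g. "Income1", "Income2". Real Blaise output names may differ.
REPEATED_FIELD = re.compile(r"^([A-Za-z_]+?)(\d+)$")


def split_household_record(flat, key="HHID"):
    """Split one flat import record into a household record and person records."""
    household = {}
    persons = defaultdict(dict)
    for name, value in flat.items():
        match = REPEATED_FIELD.match(name)
        if match:
            variable, person_no = match.group(1), int(match.group(2))
            if value not in ("", None):  # ignore unused person slots
                persons[person_no][variable] = value
        else:
            household[name] = value
    # Each person record keeps the household key so the hierarchy is preserved.
    person_records = [
        {key: flat[key], "PersonNo": n, **fields}
        for n, fields in sorted(persons.items())
    ]
    return household, person_records


# Illustrative data only; not taken from the Family Resources Survey.
flat_record = {
    "HHID": "H001",
    "Region": "London",
    "Age1": 67, "Income1": 120.0,
    "Age2": 64, "Income2": 85.5,
    "Age3": "", "Income3": "",
}
household, people = split_household_record(flat_record)
```

Here the variable name no longer depends on where a person sits in the output, and each person record carries the household identifier, which is the kind of link a hierarchical database relies on.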

Lessons

I leave those to you, but remain confident that using a computerised field instrument in conjunction with a SIR database provides, if properly implemented, a very rapid way of generating a large socio-economic database with the majority of the data prevalidated. The principle adopted early, that if you don't collect the right information at source then no method of imputing the data is ever entirely dependable, remains true. Had we put more thought into the interface between Blaise and SIR when designing the whole system, I am sure we would have cut our processing times and the resources required by an order of magnitude.

For further information about DSS use of SIR please contact:

Stuart Mitchenall
Computing Services Manager
Department of Social Security
Adelphi Building
1-11 John Adam Street
London WC2N 6HT UK
