Experimental Comparison of a Statistical and a Case History Technique of Attitude Research

Samuel Stouffer
University of Wisconsin

This is a report of an experimental comparison of one statistical and one case history technique of research on social attitudes.

The statistical method investigated is that which has been developed by L. L. Thurstone, professor of psychology at the University of Chicago. The method employs a test of attitudes toward a particular object such as war, the Negro, prohibition, the church, union labor. The test comprises a set of statements, each of which has been assigned a scale value on a linear attitude continuum by an application of the psychophysical method of equal-appearing intervals. A person's attitude score is obtained by merely averaging the scale values of the opinions which he indorses.
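The scoring rule just described can be sketched in a few lines. This is only an illustration: the statement labels and scale values below are hypothetical and are not drawn from any published Thurstone test.

```python
from statistics import mean

# Hypothetical scale values on a linear attitude continuum.
# These numbers are illustrative assumptions, not values from a real test.
SCALE_VALUES = {
    "statement_a": 1.5,
    "statement_b": 4.0,
    "statement_c": 6.5,
    "statement_d": 9.0,
}


def attitude_score(endorsed):
    """Return the mean scale value of the statements a person endorses."""
    if not endorsed:
        raise ValueError("at least one statement must be endorsed")
    return mean(SCALE_VALUES[s] for s in endorsed)


# A respondent who endorses two of the four statements.
score = attitude_score(["statement_b", "statement_c"])
```

The score is simply the average of the endorsed scale values, which is why a clerk can compute it quickly once the scale values have been established.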

If valid, the Thurstone technique promises to be a very important tool of research in the social sciences. Though necessarily somewhat complicated in construction, the Thurstone test is simple in application. The test can be taken in about fifteen minutes. The easy method of computing scores enables a clerk to get several hundred indexes in less time than it would take to analyze a handful of case histories. The test supplies indexes whose relationship with a variety of associated factors, such as sex, age, economic status, home background, education, social participation, etc., can be studied quantitatively and directly. All that is necessary is to get a large number of cases. The sample then may be broken into groups and these groups again into smaller subgroups for the purpose of holding constant experimentally the qualitative background factors which cannot be held constant mathematically by partial correlation. Statistical analysis of these smaller subgroups doubtless can be aided by using some of the new techniques developed by R. A. Fisher and others.

The principal criticism of the Thurstone method, and of other statistical devices for measuring attitudes, is that they do not measure attitudes. Instead, it has been said, they measure opinions, which are thought to be rather capricious indexes, subject to the whims of every vagrant wind. Attitudes, it is believed, are shut up and subjective, perhaps capable of revealing themselves to a student skilled in sympathetic introspection, but not yielding to the open sesame of such a simple device as a test.

On the other hand, the case history methods of studying attitudes have certain rather obvious advantages. The more we know about an individual's background and overt behavior over a period of time, the more accurate we ordinarily should be in interpreting what his attitude is. We see his opinions not in abstraction but in their cultural setting. Of course, the utility of the case method is limited by the time and labor which it takes to collect and analyze the cases. Frequently enough cases cannot be studied in any single investigation to yield more than preliminary hunches. In a problem in social research there are usually several factors or variables, a thorough analysis of which requires a large number of cases so that sets of adequate subsamples may be obtained in which certain factors are relatively constant. However, even as an exploratory device or as a device to illuminate the "why" of correlations found by quantitative methods, the case method has been called into question, because of lack of objectivity. It has been said, for example, that several interpretations of the same case may differ as widely as several psychoanalytic analyses of the same dream.

Summing up, we have on our hands two ifs. It may be said that an attitude test produces quickly and cheaply a set of indexes of enough people to make possible a study which holds constant a variety of factors associated with attitudes, but that the study may be worthless if the test does not measure what it purports to measure. On the other hand, the case method, so useful in suggesting preliminary hypotheses and in throwing light on the why of correlations quantitatively ascertained because it studies the individual's behavior and feelings in his own cultural setting, also may be dubious if competent investigators fail to agree in their sympathetic introspections.

The experiment here reported is an investigation of these two ifs. Only a few of the findings can be reported here. The detailed methods used and an attempt to appraise theoretical implications of the findings will appear in a forthcoming paper in the American Journal of Sociology.

H. N. Smith's test of attitudes toward prohibition (a test constructed by the Thurstone method) was given to 238 students of the University of Chicago. The test evidently measured something consistently, as two parallel halves of the test yielded a reliability coefficient, using the Spearman-Brown formula, of .94, about as high as the reliability coefficients of many intelligence tests.
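For two parallel halves, the Spearman-Brown formula steps the correlation between the halves up to an estimate of full-test reliability, r_full = 2 r_half / (1 + r_half). The sketch below applies that standard formula and also inverts it. The half-test correlation is not reported in the text, so the value recovered here is an inference from the reported .94, not a figure from the study.

```python
def spearman_brown_split_half(r_half):
    """Estimate full-test reliability from the correlation of two parallel halves."""
    return 2 * r_half / (1 + r_half)


def implied_half_correlation(r_full):
    """Invert the split-half formula: the half correlation implied by r_full."""
    return r_full / (2 - r_full)


# Half-test correlation implied by the reported reliability of .94
# (an inference; the text does not report this value).
r_half = implied_half_correlation(0.94)

# Stepping it back up recovers the reported reliability.
r_full = spearman_brown_split_half(r_half)
```

Under these assumptions the two halves would have correlated at roughly .89.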

Each of these 238 students wrote an account, in about a thousand words, of his or her experiences and feelings from childhood to the present day in connection with prohibition laws and in connection with drinking liquor one's self. The students wrote anonymously and their documents were matched with the test sheets by the use of code numbers. The student was not informed of the precise purpose of the experiment, and, of course, no premium was placed on consistency between the test and case history.

Each of the 238 case histories was read carefully by four judges, who were graduate students in sociology at the University of Chicago, selected by a committee of the faculty on the basis of their experience in the interpretation of case materials, their acquaintance with the theoretical literature on social attitudes, and their supposed insight into human nature. Each judge, without knowledge of what another judge had done, made two ratings as to each paper. He did this by checking with a cross, somewhere along a five-inch graphic rating scale, the favorableness or unfavorableness of the subject's present attitude toward prohibition laws; and also by checking with another cross, somewhere along another five-inch graphic rating scale, the favorableness or unfavorableness of the subject's present attitude toward drinking liquor himself or herself.

This numerical score was then transmuted into relative terms by using a standard measure, z = (X - M)/s, in which X was the score, M the mean of all the 238 scores assigned by this particular judge, and s the standard deviation of these 238 scores. The four ratings of an individual's attitude were added to provide a composite index.
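The standardization and summation can be sketched as follows. The ratings are made-up numbers for two hypothetical judges and three subjects, where the study used four judges and 238 subjects. The text does not say whether s was computed with N or N - 1 in the denominator, so the use of the population standard deviation here is an assumption.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Convert one judge's raw scores to z = (X - M) / s."""
    m = mean(scores)
    s = pstdev(scores)  # assumption: population standard deviation (divides by N)
    return [(x - m) / s for x in scores]


def composite_index(ratings_by_judge):
    """Sum each subject's standardized ratings across all judges."""
    standardized = [standardize(r) for r in ratings_by_judge]
    return [sum(zs) for zs in zip(*standardized)]


# Hypothetical raw ratings: one inner list per judge, one entry per subject.
ratings = [
    [1.0, 2.0, 3.0],  # judge 1
    [2.0, 4.0, 6.0],  # judge 2 uses a wider range of the scale
]
composite = composite_index(ratings)
```

Standardizing within each judge first means that a judge who habitually marks toward one end of the scale, or who spreads his marks more widely, does not dominate the composite.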

These indexes, like the scores on the Smith test, apparently measured something consistently. The judges agreed in their interpretations surprisingly well. The reliability coefficient, using the Spearman-Brown formula, was .96.
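The text does not state how the Spearman-Brown formula was applied to four judges. One common reading, assumed here, is the general form for a composite of k parallel raters, r_k = k r / (1 + (k - 1) r), where r is the typical correlation between any two raters. On that assumption, the sketch below recovers the single-pair agreement implied by a composite reliability of .96.

```python
def spearman_brown(r_single, k):
    """Reliability of a composite of k parallel raters with pairwise correlation r_single."""
    return k * r_single / (1 + (k - 1) * r_single)


def implied_single_correlation(r_composite, k):
    """Invert the general formula: the pairwise correlation implied by r_composite."""
    return r_composite / (k - (k - 1) * r_composite)


# Pairwise agreement implied by the reported .96 for four judges
# (an inference under the stated assumption, not a figure from the study).
r_single = implied_single_correlation(0.96, 4)
```

If this reading is right, any two judges would have correlated at about .86 on average.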

It was feared, however, that the conditions of the experiment might have been exceptionally favorable to high agreement. Would two laymen, for example, without knowledge of the theoretical literature on attitudes and with strong personal feelings on prohibition, agree in their interpretation of the documents? We asked the superintendent of the Illinois Anti-Saloon League and the secretary of the Illinois Association Opposed to Prohibition to read a random sample of 99 cases. Again, to our surprise, we found that the ratings by these two men agreed almost as closely with each other as with the ratings of our four judges or as did the ratings of the four judges with one another on the same 99 cases.

It was evident, then, that something had been measured reliably by the case history technique, just as something had been measured reliably by the test. These somethings each had been presumed to be attitudes. But were these two somethings the same?

At least a first approximation to an answer was found by correlating the test scores of the 238 individuals with the composite ratings by four judges as to attitudes toward prohibition laws. The correlation was .81, which became .86 when corrected for attenuation. It is quite apparent that if the judges of the case histories were getting at attitudes the test was too. At least, this would seem true with respect to the particular continuum (favorable to unfavorable) abstracted for purposes of study. Whether use of a different attitude continuum would result in less, or more, agreement, awaits further inquiry.
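The correction for attenuation is conventionally Spearman's formula, r_corrected = r_xy / sqrt(r_xx r_yy), which divides the observed correlation by the geometric mean of the two reliabilities. The text does not name the formula or the reliabilities used, so the sketch below assumes this formula and the rounded figures quoted earlier (.94 for the test, .96 for the judges' composite).

```python
from math import sqrt


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimate the correlation between two measures if both were perfectly reliable."""
    return r_xy / sqrt(r_xx * r_yy)


# Inputs are the rounded values quoted in the text; pairing them with
# this formula is an assumption, not something the text states.
corrected = correct_for_attenuation(0.81, 0.94, 0.96)
```

With these rounded inputs the result is about .85, just under the reported .86. The small gap may reflect rounding of the published figures or a reliability coefficient specific to the prohibition-law ratings; the text does not say which.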

Several other checks tended to strengthen, rather than weaken, confidence in a conclusion that the two methods were getting at pretty much the same thing. There was no exception. To cite only one example: The correlation between attitudes toward prohibition, as measured by the test, and attitudes toward drinking liquor one's self, as measured by the case histories, was .58. This checks very closely with the correlation between attitudes toward prohibition laws, as measured by the case histories, and attitudes toward drinking liquor one's self, as measured by the case histories, which was .60.

The tentative conclusion, if confirmed by further research, is (1) that the Thurstone method of measuring social attitudes yields indexes which are quite comparable with indexes obtained independently by a case history method, and (2) that different interpreters of cases can agree in their inferences as to attitudes, at least in inferences of the not-too-complicated type made in the present investigation.


