Methods For Measuring Attitudes

Daniel D. Droba
University of North Dakota

Bain (8), Clark (29), Droba (36), Folsom (40), Katz and Allport (56), Lundberg (67), Murphy (69), Rice (77, 79), and Thurstone (98) have given accounts of the methods of measuring attitudes. Each of these reports constitutes a distinct contribution to the measurement of attitudes. In the present review an attempt is being made to go a little further in classifying, analyzing, and evaluating the existing techniques. Certain important phases of the methods will be separated out and followed through the literature. The result is a cross-section of the methods.

The literature reviewed here is limited to studies concerning attitudes in a certain restricted sense. Attitudes refer to a rather definite set of phenomena having a definite, specified object of reference. Hence studies concerning traits, without a definite object of reference, such as introversion, ascendance, aggressiveness, will not be included. Borderline studies may occur, but for purposes of clarity a line must be drawn.


The term "method of absolute ranking" is used in place of the term "questionnaire," which has had a very indefinite meaning and has been used to cover two or more types of methods (e.g., Koos (58)). In this method degrees of attitude are expressed separately for each indicator. The subject has to decide the degree of attitude in the case of each statement or question in the scale, without reference to other statements or questions.

To illustrate, Watson (112) used a number of impersonal statements expressing attitudes toward a specific race. The subject ranks his agreement with each statement in five steps. For example, the statement "Jews will try to get the best of a bargain even if they have to cheat to do so" is followed by the choices: All, Most, Many, Few, No. The ranking is made separately for every statement without any reference whatsoever to another statement. For this reason the method is called "absolute."

Galton may be said to have been the first to have used the method for measuring psychological data. His objects of measurement were images and not attitudes. The method has been employed very extensively since for various purposes. Although Koos (58) includes a few ratings in his review, the great majority of the reported investigators have used the method of absolute ranking, and his summary gives a fair picture of the extent of the use of the method outside the domain of attitudes. He has reviewed experiments reported in seven educational periodicals during 1925-26. He found that 143 out of the 581 investigations reported (24.6 per cent) used the method which we have here called "absolute ranking."

The first men to apply the method to the measurement of attitudes seem to have been Moore (68) and Symonds (89), both of whom had the subject rank each indicator separately in two categories: either Yes or No. It has been applied quite frequently since. The writer found altogether thirty-one recent studies in which the method of absolute ranking was used (4, 6, 7, 9, 11, 12, 24, 26, 41, 43, 48, 52, 57, 59, 60, 63, 64, 65, 68, 70, 75, 88, 89, 110, 111, 112, 113, 114, 115, 117, 122).

The first disadvantage of this method is that only a narrow range of degrees of attitude is measured. Only two investigators use as many as seven steps (60, 70), all the others using only two to five steps. Since it is hard to rank a statement on too many steps, it seems neither desirable nor practicable to extend the steps beyond five. But the broad variety of attitudes for and against an issue is not represented adequately in such a relatively narrow range of steps.

A second disadvantage is that the method does not offer adequate units of measurement. The number of steps (five in the above example), represented by a number of possible responses or rankings, is arbitrary. There is no guarantee that the distance between "All" and "Most" is even approximately the same as the distance between "Most" and "Many." For this reason there is no basis for comparing the scores obtained for different groups. Such scores cannot be represented by a curve of distribution.

An advantage is that it takes a relatively short time to construct a test by the use of this method.

By the "case method" is meant an essay type of description of an attitude consisting of at least a paragraph. For example, Bogardus (22) has asked a number of Americans to describe their attitudes toward the Filipinos. Either oral or written descriptions may be used. If oral indicators are used, the method may be called an informal case method or interview; if written indicators are employed, it may be called a formal case method. In the bibliography seven references are to the informal case method (1, 23, 25, 61, 66, 78, 120) and twelve to the formal case method (15, 17, 19, 20, 22, 30, 62, 87, 91, 106, 120, 121).

Both the formal and the informal case methods may be subdivided into two types. In the first type an individual describes orally or in writing his own attitude toward the issue in question. In the second type he describes the attitudes of his acquaintances. This latter type has hardly ever been used owing to its inexactness. In our literature only Lasker (62) has used it. All others have employed the self-description type.

Historically this method is the oldest of all. Case studies for various purposes were made long before the questionnaire studies. The application of the case method to the study of attitude measurement is, however, of rather recent origin. Thomas and Znaniecki's Polish Peasant in Europe and America (91) is perhaps the best example of the earlier attempts to study attitudes by this method. Among the later investigators who have applied the method in a more limited and probably more accurate sense Bogardus stands out as the best representative (15, 17, 19-23).

A drawback of the case method is that it is not amenable to quantitative analysis. In a description of an attitude the depth rather than the breadth is taken into consideration. Some writers such as Calkins (25) and Bogardus (23) have tried to analyze attitudes by the use of this method. However, an analysis of this type is subject to crude error since it is made by a single individual. Only Stouffer (87) has used several judges to estimate the degrees of attitudes expressed in the written descriptions.

The only advantage of the method is that it may be employed to explain attitudes that have been measured by a more accurate procedure. The development of attitudes in one individual or in a smaller group can be traced by the use of this method.

In the method of relative ranking the decision of the subject about an indicator is relative to another indicator. The subject may be asked to arrange in order of merit occupations or nationalities so that each occupation is relative to another occupation and each nationality to another nationality. The same procedure can be applied to statements expressing attitudes toward certain topics. Accordingly two variations of this method may be noted. In the first variation, items designating the object of attitude, such as nationalities or occupations, are ranked. In the second variation, items indicating the attitudes themselves are put in order of preference. The latter procedure can be applied either in administering or in constructing the test. In administering the test the subject checks one or more indicators with which he agrees, and he may not be aware of the fact that the indicators are related to each other. But in scoring this relatedness is brought to light.

The method of relative ranking was used two or three decades ago for various purposes such as the studying of affective values, beliefs, men of science, and shades of gray. Bogardus, in devising his "social distance" test, may be said to have been the first to apply it to attitude measurement (14).

Investigations of this type may be divided into four groups. (a) Investigators in the first group, in administering their tests, have asked the subjects to rank the object of their attitude, such as teachers and novelists (5, 13, 15, 19, 31, 47, 86, 115, 119). (b) In the second group may be classified an investigation in which the experimenter has requested the subjects to rank statements expressing the attitudes (80). (c) The greatest number of experiments fall under the third type. These differ from the first two in that the statements are first ranked by a special group of judges without expressing their attitudes. Then in administering the test the statements are presented in a ranked form and the subject's task is simply to mark one or more of the items representing attitudes toward the object in question (2, 3, 4, 10, 14, 16, 18, 19, 21, 51, 54, 56, 82, 83, 107, 116, 118, 124). (d) The fourth group might be represented by an attempt to apply an elaborate statistical procedure to the method of construction (94, 95, 99). The ranking of the attitude indicators is based on the proportion of a large number of judges considering one statement to be more or less in favor of an issue. The standard deviation of judgments about one statement is taken as the unit of measurement.

With the exception of the fourth type the method of relative ranking is based on very arbitrary units of measurement. The steps of order of merit are considered to be the units of measurement. The distance between the items representing the steps, however, cannot be thought to be equal, not even approximately so. Consequently the scores obtained from such tests cannot be plotted in a frequency distribution and no adequate comparison of attitudes of groups can be made.

It is a simple method in its first three forms. It can easily be set up and applied to the study of a variety of specific attitudes, as Vetter (107) has done. The first form of the method is best applicable to words representing the objects of an attitude, such as nationalities and occupations. It is less applicable to statements expressing the attitudes themselves, such as "The United States should join the League of Nations." That is, it is easier to rank words than statements, and words represent more uniform units of classification. The third form, in the construction of which statements are ranked, provides a more satisfactory scoring method than the first two forms.

The fourth form, in which Thurstone has used a special statistical method, is the most accurate of the four. It provides for approximately equal units of measurement and makes possible a distribution of attitudes along a linear scale. Yet it must be recognized that this procedure is laborious and in most cases impracticable.



By a "graphic rating scale" in the measurement of attitudes is meant a line along which the steps representing the various degrees of attitudes are indicated by words, numbers, or phrases. Objects of attitudes are not represented by the graphic rating scale. Two types of graphic rating scales may be distinguished: the self-rating type, in which the subject marks his own attitude on the line, and the "rating by others" type, in which a person's attitudes are rated by his friends.

The use of the graphic rating scale for measuring attitudes is very recent. Porter (75) has had several judges rate their friends' attitudes on a scale from 0 to 10. This is the "rating by others" type. Rice (76) has employed a graphic self-rating scale to measure degrees of attitude extending from an extreme degree of radicalism to an extreme degree of reactionism. Thurstone and Chave (101) also used a self-rating scale and so did Droba (35).

One of the possible disadvantages of the self-rating scale is that the raters tend to overestimate their desirable attitudes and to underestimate their undesirable attitudes. However, this happens only if the rater's attitude deviates from the central or neutral position. That is, if his attitude is somewhere inside the neutral or middle range, he will very likely estimate himself correctly on the self-rating scale. This tendency has been found with respect to other traits by several investigators. It will, of course, have to be verified in the field of attitude testing. Ratings by others, though time consuming, are probably more reliable than the self-rating scales.

The rating scale is only a very rough way of measuring attitudes. A check mark on a line is to some extent influenced by chance factors and may vary considerably from one trial to another. A person's discrimination may be very keen, yet there are no concrete meanings or acts to guide him as is the case in a statement scale.

Some of the advantages of the graphic rating scale that are applicable in attitude measurement, as Freyd (42) pointed out, are as follows: The scheme is very simple and is easily grasped. It can be quickly filled out and can be easily scored. The numbering of the various steps may be altered at will. Several types of attitudes may be studied on several self-rating scales in a comparatively brief period of time.


The essential feature of the method of paired comparisons is that the indicators are presented to the subject in pairs and he has to decide which of the two is preferable. Words or statements may be used for this purpose.

The method of paired comparisons was originated by G. T. Fechner. It is the old method of constant difference except that every item representing a degree of attitude is used as a standard and a comparison item. It was applied to attitude measurement first by Thurstone (92). By the use of the method he has measured attitudes toward crime and toward nationalities (93, 97). He also devised a technique for constructing a scale on the basis of percentages of preferred items.
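The step from percentages of preferred items to scale values can be illustrated with a short sketch. The text does not give the computation, so the version below assumes one common simplification (often called Case V of Thurstone's law of comparative judgment): each proportion is converted to a normal deviate and the deviates are averaged for each item. The function name, the clipping constant, and the three-item data are hypothetical.

```python
from statistics import NormalDist


def scale_values(pref, clip=0.01):
    """Return one scale value per item from a matrix of preference proportions.

    pref[i][j] is the proportion of subjects who preferred item i to item j.
    Assumption (not from the source): Case V scaling, i.e. the scale value of
    item i is the mean normal deviate of its proportions against all others.
    """
    inv = NormalDist().inv_cdf
    n = len(pref)
    values = []
    for i in range(n):
        deviates = []
        for j in range(n):
            if i == j:
                continue
            # Clip so that unanimous preferences do not give infinite deviates.
            p = min(max(pref[i][j], clip), 1 - clip)
            deviates.append(inv(p))
        values.append(sum(deviates) / len(deviates))
    lowest = min(values)
    # Shift so that the least preferred item sits at zero (arbitrary origin).
    return [v - lowest for v in values]


# Hypothetical proportions for three items A, B, C.
example = [
    [0.50, 0.70, 0.90],  # A preferred to A, B, C
    [0.30, 0.50, 0.75],  # B preferred to A, B, C
    [0.10, 0.25, 0.50],  # C preferred to A, B, C
]
values = scale_values(example)
```

Because every value depends on the pooled choices of many subjects, the result describes the group, not any one individual.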

Guilford (44, 45, 46) has suggested a short-cut to calculating scale values from the percentages that are obtained for each item. The scale values calculated by the Guilford method correlated almost perfectly with the scale values found by the elaborate Thurstone method. The Guilford method was tested in a weight experiment under conditions not quite comparable with conditions in an attitude experiment. Instead of asking one subject to make a large number of judgments it would have been better to use a large number of subjects and have each make only one judgment about the whole series. Still it is safe to say that the G-method is a more convenient method to use in attitude measurement than the T-method, as demonstrated by Guilford in another experiment (46). Its one weakness is that the unit does not stay constant.

The method cannot be used to measure individual attitudes. That is, it is impossible to obtain individual scores because the very calculation of the scale values is dependent upon a combination of markings of a number of individuals. Nor can the standing of a group of individuals on an attitude scale of this type be determined by a single score. The scatter of the scale values themselves along the scale is the only picture of the attitude of a group.

Theoretically it would be possible to use a large number of items in the construction of the scale by the use of this method and then select the most evenly distributed ones to constitute the scale for use in testing attitudes. In this way the above two disadvantages would be eliminated. The enormous labor connected with such a procedure, however, eliminates it as a practical possibility.

For the purpose of comparing the attitudes of two or more groups the method is a useful and objective tool. A correlation can be calculated between the scale values obtained from one group and the scale values obtained from another group. If the correlation is high, the two groups agree closely with respect to the object of the attitude. If the correlation is low, the agreement is slight and the attitudes of the two groups toward the issue in question are shown to be markedly different.
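The group comparison just described can be sketched as a product-moment correlation between two lists of scale values for the same items. The text does not say which coefficient is meant, so the Pearson coefficient is an assumption here, and the two sets of scale values are invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scale values for the same five items from two groups.
group_one = [0.0, 0.6, 1.1, 1.9, 2.4]
group_two = [0.1, 0.4, 1.3, 1.7, 2.6]
agreement = pearson_r(group_one, group_two)
```

A value of `agreement` near 1 would indicate that the two groups order and space the items in much the same way.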

In this method it is possible to obtain distances between the items such as nationalities and use them as an indication of degrees of favorableness or unfavorableness of one nationality toward the others. This would be a more accurate way of calculating social distance in the sense Bogardus has conceived it. Any nationality can be taken for the origin from which the scale values are to be calculated depending upon the nationality of the subjects used. If the subjects are Americans, the American nationality should be chosen for the origin and the distance determined from it. If the subjects are Russians the Russian nationality in the list should mark the beginning of the scale. All other nationalities would then be considered with reference to the Russians.
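Choosing the origin amounts to subtracting the scale value of the subjects' own nationality from every other scale value. A minimal sketch follows; the function name, the group labels, and the numbers are hypothetical, and the use of absolute differences as "distance" is an assumption.

```python
def distances_from_origin(scale_values, origin):
    """Re-express scale values as distances from the chosen origin item.

    scale_values maps an item label to its scale value; origin is the label
    of the subjects' own group. Absolute differences are an assumption.
    """
    base = scale_values[origin]
    return {label: abs(value - base) for label, value in scale_values.items()}


# Hypothetical scale values for four groups labeled W, X, Y, Z.
values = {"W": 2.4, "X": 1.9, "Y": 1.1, "Z": 0.0}

# Subjects belonging to group W: all distances are taken from W.
from_w = distances_from_origin(values, "W")

# Subjects belonging to group Y: the same scale re-anchored at Y.
from_y = distances_from_origin(values, "Y")
```

The same set of scale values thus yields a different table of distances for each group of subjects, with the subjects' own group always at zero.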


The method of equal appearing intervals differs from the method of relative ranking only in the construction of the scale. In administering the scale no difference exists between the two methods. But there is some difference in scoring and the practical application of the methods.

Essentially a relatively large number of indicators is sorted into a number of piles, say, seven, eight, nine, ten, or eleven, according to the degree of attitude expressed in the indicators. A scale value is obtained for each indicator on the basis of percentages of classification. Also a measure of the scatter of the judgments is determined for each indicator. Mainly on the basis of the scale values and the dispersions a smaller number of indicators is selected to constitute the scale. As a result the intervals between the indicators on the final scale appear to the majority of judges approximately equal.
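The computation for a single indicator can be sketched as follows. The text says only that the scale value rests on "percentages of classification" and that a measure of scatter is found; the sketch assumes the median of the pile placements as the scale value and the interquartile range as the scatter, both found by linear interpolation. The function names and the proportions are hypothetical.

```python
def percentile_point(props, q):
    """Point on the pile continuum below which a fraction q of judgments fall.

    props[k-1] is the proportion of judges placing the indicator in pile k.
    Pile k is treated as the interval from k - 0.5 to k + 0.5 (an assumption).
    """
    cumulative = 0.0
    for pile, p in enumerate(props, start=1):
        if p > 0 and cumulative + p >= q:
            # Interpolate linearly inside the pile that contains the point.
            return (pile - 0.5) + (q - cumulative) / p
        cumulative += p
    return len(props) + 0.5


def scale_value_and_scatter(props):
    """Return (median, interquartile range) for one indicator."""
    median = percentile_point(props, 0.50)
    spread = percentile_point(props, 0.75) - percentile_point(props, 0.25)
    return median, spread


# Hypothetical sorting of one statement into seven piles.
proportions = [0.0, 0.0, 0.2, 0.6, 0.2, 0.0, 0.0]
median, spread = scale_value_and_scatter(proportions)
```

Under these assumptions, indicators with a small spread and with medians spaced evenly along the continuum would be the ones retained for the final scale.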

The method of equal appearing intervals is a variation of the method of mean gradation first used by Plateau, a Belgian physicist, about the middle of the nineteenth century. It was first suggested probably by Boas, a German writer, in the second half of the last century and has since been used by a number of European and American psychophysicists for the solution of psychophysical problems.

For the purpose of measuring attitudes the method was first used by Chave (27) and Droba (32, 33). Chave has applied it to the measurement of attitudes toward the church and published his scale in collaboration with Thurstone (101, 102). Droba has used the method for measuring attitudes toward war (35, 37). Following these studies twenty-two other experiments were reported using this method (28, 38, 39, 50, 53, 55, 72, 73, 74, 81, 84, 85, 87, 90, 96, 97, 100, 103, 104, 105, 108, 109).

The most evident disadvantage of the method of equal appearing intervals is the "end effect." Subjects tend to place an indicator more frequently into the end piles than into the intermediate piles. The two end piles seem to be the most conspicuous groups in the series since they mark the limits of the scale. The "end effect" tends to shorten unduly the distances between the end statements and the adjacent statements in the final scale. Consequently the scale values may indicate an even distribution of statements along the scale, yet actually the middle statements are further apart than the end statements.

This disadvantage can probably be eliminated by the use of a method which is more similar to the old method of mean gradations than to the method of equal appearing intervals. In the method to be suggested we will be concerned with only two groups of indicators at a time. Divide the whole series of statements into two groups, for and against an object. Then subdivide each of the two groups into two, giving a strong attitude "against," a mild attitude "against," a mild attitude "for," and a strong attitude "for." The subdivision may be continued until a desired number of piles or groups is obtained.

In the suggested procedure the classification of statements is more accurate. If eight piles are used, which in most cases is a sufficient number, each statement must be read three times before the classification is complete. By a successive rereading of the statement the understanding of the statement will increase. This is an appropriate condition because with continued subdivision of the piles the discrimination will necessarily become finer.

Classification is easier than in the method of equal appearing intervals. In the latter method the subject has to decide into which of the several groups the statement belongs. According to our method the choice to be made is between two groups only. It is not necessary for the subject to keep in his mind all the classes at a time. Only one discrimination is to be made at a time, e.g., does the statement belong to a group more strongly against the issue or does it belong to a group less strongly against the issue?
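The successive subdivision described above can be sketched as a series of two-way choices: three readings of a statement place it in one of eight piles. In the sketch the judge's decision is represented by a hypothetical callback, and the simulated judge and its numerical "favorableness" values are invented purely to make the example runnable.

```python
def classify_by_halving(statement, in_upper_half, readings=3):
    """Assign a statement to one of 2**readings piles by repeated halving.

    in_upper_half(statement, low, mid, high) stands in for the judge's single
    discrimination: True if the statement belongs to piles mid+1..high,
    False if it belongs to piles low..mid. Pile 1 is most strongly against.
    """
    low, high = 1, 2 ** readings
    for _ in range(readings):
        mid = (low + high) // 2  # last pile of the lower half
        if in_upper_half(statement, low, mid, high):
            low = mid + 1
        else:
            high = mid
    return low


# Hypothetical judge: each statement has an invented favorableness in [0, 1).
favorableness = {"s1": 0.0, "s2": 0.5, "s3": 0.99}


def simulated_judge(statement, low, mid, high):
    # Upper half if the statement lies at or above the dividing point.
    return favorableness[statement] >= mid / 8


piles = {s: classify_by_halving(s, simulated_judge) for s in favorableness}
```

The first call to the judge corresponds to the simple division into "against" and "for"; each later call only asks about the two halves of the group already chosen.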

In the first reading of the statements the task is simply to decide which statements are for and which statements are against the issue without the necessity of keeping in mind the limits. It is in the second reading of the statements that the ends of the scale come into view. However, the neutral ends of each of the two halves of the scale appear also. As a consequence, a possible "end effect" is counterbalanced by the effect of the neutral ends. In the third reading of the statements the "end effect" is counterbalanced by the effect of the ends found midway between the neutral limits and the extreme ends of the scale.

In conclusion a brief reference will be made to the different types of indicators. It is to be noted that writers do not always use the various indicators consistently. The question form, if used, is generally employed throughout the investigation. However, the personal form and the impersonal forms are sometimes used interchangeably in one experiment. This inconsistency appears to be greater if the above three main forms are further analyzed and divided into sub-forms. It would be better from the point of view of scaling if one indicator form were used consistently, as pointed out by Droba (34). At least three reasons may be mentioned for this suggestion. The use of one or at most two forms of statements in a scale would be more in line with the requirement of a unidimensional scale than a mixture of several forms. The reliability of the scale would probably vary with the forms of statements used, or at least with the degree of uniformity in the use of the forms. The various forms also do not foretell equally the behavior of the individual endorsing them. The distance between behavior and certain forms of indicators is probably greater than between behavior and some other forms. These contentions should, of course, be demonstrated experimentally.


