# Rank Order as a Psychophysical Method

### Louis L. Thurstone

University of Chicago

In every psychophysical experiment it is necessary for the investigator to
select a psychophysical method and in doing so he is guided by two principal
types of consideration, namely experimental convenience and theoretical
propriety. Other things being equal, he will choose that experimental procedure
which is simplest for the subject and most convenient. He will also be
constrained to select a method which is legitimate for his problem. One of the
simplest of experimental procedures is to ask the subject to arrange
a series of stimuli in absolute rank order according to length, weight,
brightness, beauty or whatever the psychological continuum may happen to be.
This experimental procedure does not ordinarily lend itself to psychological
measurement because the subjective increments represented by the successive rank
orders are ordinarily entirely unknown. Occasionally it is legitimate to assume
that the entire distribution of subjective values is Gaussian and then it is
possible to translate rank orders into legitimate psychological measurement. We
are dealing now with the situation in which a small number of stimuli, such as
ten or twenty, are to be dealt with and in which no assumption can be made
legitimately regarding their distribution in subjective value.

For theoretical purposes the constant method is probably the best in most experimental problems. The ideal form of the constant method is the method of paired comparison in which not one but all of the stimuli serve as standards, but the constant method is also one of the most laborious experimental
methods. If there are twenty stimuli and if the method is to be complete there
would be required 190 judgments in order to compare each stimulus with
every other stimulus. This assumes that the constant method is used in complete
form with every stimulus serving in turn as a standard. The method has serious
limitations when only one or two of the stimuli are used as standards. We shall
also assume that the intermediate category is excluded. If the reader insists on
using the intermediate category of judgment in the constant method, this paper
is of no interest to him.

Our present problem is to devise a plan whereby simple absolute rank order may be used as the experimental procedure with the advantages of the much more laborious constant method. Given the data for absolute rank order we shall extract the proportion of judgments "A is greater than B" for every possible pair of stimuli in the given series. These derived proportions will be used instead of the proportions that are obtained directly in the constant method. From these derived proportions the subjective separations between any pair of stimuli can then be readily calculated by the equation of comparative judgment. The method will be derived first theoretically and then we shall describe its empirical verification.
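The last step, from a proportion to a subjective separation, can be sketched as follows. This is a minimal illustration only: it assumes the simplest form of the equation of comparative judgment (equal and uncorrelated discriminal dispersions, commonly called Case V), which this paper does not explicitly name, and the function name `scale_separation` is hypothetical.

```python
from statistics import NormalDist


def scale_separation(p_a_over_b):
    """Separation S_a - S_b in units of sigma * sqrt(2), assuming Case V.

    The separation is the unit-normal deviate that corresponds to the
    proportion of judgments "A is greater than B".
    """
    # inv_cdf raises StatisticsError unless 0 < p < 1, so a proportion of
    # exactly 0 or 1 cannot be converted into a finite separation.
    return NormalDist().inv_cdf(p_a_over_b)


# A proportion of .50 means no separation; proportions above .50 give
# positive separations and proportions below .50 give negative ones.
print(round(scale_separation(0.50), 3))
print(round(scale_separation(0.84), 3))
```

Proportions of exactly 0 or 1 carry no usable information about distance under this assumption, which is why they must be excluded before the conversion.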

If a subject has placed four stimuli *A B C D* in the rank order *B D A C* it is possible to tabulate his various comparisons as though he had made them separately. If each of the four stimuli were to be compared with every other one in the series it would require six separate judgments, namely *AB AC AD BC BD CD.* If there are *n* stimuli in the series it would require *n(n − 1)* such judgments with counterbalanced order of presentation or half that many if counterbalanced order is disregarded. This would give only one judgment for each of the possible pairs of stimuli. Now if the four stimuli have been placed in the rank order *B D A C* by one subject, it is clear that six judgments may be extracted from this one rank order. Evidently the above rank order series is equivalent to the judgments *B > D, B > A, B > C, D > A, D > C, A > C.* If a large number of subjects have arranged fifteen or twenty stimuli in rank order it is an almost prohibitive task to tabulate the separate judgments to which the single rank order is equivalent. However, it can be done by a shorter procedure.
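The direct tabulation described above can be sketched in a few lines. The function name `implied_judgments` is hypothetical, and the sketch assumes the rank order is listed from first to last so that each stimulus is judged greater than every stimulus that follows it.

```python
from itertools import combinations


def implied_judgments(rank_order):
    """Return every pairwise judgment implied by a single rank order.

    Each returned pair (x, y) stands for the judgment "x > y".
    """
    # combinations() keeps the input order, so the first member of each
    # pair always precedes the second in the rank order.
    return list(combinations(rank_order, 2))


judgments = implied_judgments(["B", "D", "A", "C"])
print(judgments)
# [('B', 'D'), ('B', 'A'), ('B', 'C'), ('D', 'A'), ('D', 'C'), ('A', 'C')]

# Twenty stimuli imply 190 judgments for each subject.
print(len(implied_judgments(range(20))))  # 190
```

Repeating this for every subject is the laborious count that the shorter procedure is meant to avoid.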

Let there be $n$ specimens in the series to be arranged in rank order by $N$ subjects. Let $A$ and $B$ be two of these specimens and let $a_1$ = frequency with which specimen $A$ is placed in rank 1 by the $N$ subjects, $b_1$ = frequency with which specimen $B$ is placed in rank 1 by the $N$ subjects, $p_{a1}$ = proportion of the $N$ subjects who place specimen $A$ in rank 1, $p_{b1}$ = proportion who place specimen $B$ in rank 1, and similarly for the other specimens and the other rank orders. See *figure 1.*

Since these values of $p$ may be regarded as probabilities, we have $p_{b2} + p_{b3} + p_{b4} + \dots + p_{bn} = p_{b>1}$ = probability that any subject at random (or any one judgment of a single subject) will place $B$ in a rank higher than rank 1.

Hence $p_{a1} \cdot p_{b>1}$ = probability that any subject, chosen at random, will place $A$ in rank 1, and $B$ in a higher rank. Similarly, $p_{a2} \cdot p_{b>2}$ = probability that a subject will place $A$ in rank 2 and $B$ in a higher rank.

In general, this product may be written $p_{ak} \cdot p_{b>k}$ = probability that $A$ will be perceived in rank $k$ and that $B$ will be perceived in a rank higher than $k$.

Summing for all of the $n$ ranks, we have $\sum_k p_{ak} \, p_{b>k}$ = probability that $B$ will be perceived in a rank higher than that of $A$.

But we must also consider the possibility that the two specimens will be perceived as of practically equal or nearly equal rank. If it were possible for two specimens to be perceived in the same rank order, then $p_{ak} \, p_{bk}$ = probability that both specimens $A$ and $B$ will be perceived in the same rank order $k$. But we assume that the subject is asked to place all the $n$ specimens in absolute rank order without any duplicate or tied ranks and hence it is experimentally impossible for the two specimens to be placed in the same rank order. We shall make the assumption that if the two specimens are perceived to be sufficiently nearly alike to warrant the same rank order, the probability $p^{1}_{a>b} = .50$, and the probability $p^{1}_{b>a} = .50$. The notation $p^{1}$ refers to a single class interval. This is not quite correct because we know that if the two stimuli are slightly different in objective measurement then there will be a slight majority of correct judgments while the incorrect judgments will be in the minority.

This, of course, ignores the possible time and space errors. This is legitimate in dealing with rank order as an experimental method because the subject is given a series of stimuli or specimens to sort out into a rank order by his own devices. It is therefore a matter of chance which of any pair of stimuli is perceived first and whether it happens to be held to the right or to the left of the second stimulus. The subject has the privilege of revising his results and of looking at any and all of the specimens in any order and as many times as he may choose. The problem of constant errors may therefore be ignored.

However, if the two specimens differ slightly in psychological value so that $A > B$ and if they are presented to a subject repeatedly or once to a group of subjects in counterbalanced order by the constant method, we should find that $p_{a>b} > .50$. In other words, a slight majority would favor specimen $A$. The departure of the judgment $p_{a>b}$ from .50 will be small if the difference between $A$ and $B$ is small. Since in a rank order experiment we shall assume that $n$ is as large as 10 or 15 or 20, the interval in value represented by one rank order is relatively small. In such situations and especially when the discriminal error is much larger than the interval represented by one rank order, our assumption is approximately correct, namely that $\tfrac{1}{2} \, p_{ak} \, p_{bk}$ = probability that both $A$ and $B$ will be perceived in the same rank order interval and that $B$ will be perceived higher than $A$.

Hence we may write the formula with this approximation as

$$p_{b>a} = \sum_k p_{ak} \, p_{b>k} + \tfrac{1}{2} \sum_k p_{ak} \, p_{bk} \tag{1}$$

In other words, we have expressed the proportion of subjects who perceive $B$ higher than $A$ in terms of the frequencies with which the two specimens are placed in the $n$ rank orders. We can now use simple absolute rank order as an experimental procedure and we can obtain the same results as with the order of merit method and practically the same results as with the constant method with counterbalanced order. It is taken for granted here that the intermediate category is not used.
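Equation (1) needs only the two columns of rank proportions, which the following minimal sketch makes concrete. The function name `estimate_b_over_a` and the toy proportions are hypothetical. The lists are assumed to be indexed in rank order 1, 2, ..., n, with "higher" meaning a larger rank number as in the definition of the proportion of B above rank 1.

```python
def estimate_b_over_a(p_a, p_b):
    """Estimate the proportion of subjects placing B higher than A.

    p_a[k] and p_b[k] are the proportions of subjects who placed
    specimens A and B in rank k + 1.
    """
    if len(p_a) != len(p_b):
        raise ValueError("both specimens need one proportion per rank")
    total = 0.0
    b_above = sum(p_b)  # proportion of B at the current rank or above
    for pa_k, pb_k in zip(p_a, p_b):
        b_above -= pb_k  # now the proportion of B strictly above rank k
        # First term: A in rank k, B in a higher rank.
        # Second term: both in rank k, split evenly between the two orders.
        total += pa_k * b_above + 0.5 * pa_k * pb_k
    return total


# Hypothetical three-rank example.
p_a = [0.5, 0.5, 0.0]
p_b = [0.0, 0.5, 0.5]
print(estimate_b_over_a(p_a, p_b))  # 0.875
print(estimate_b_over_a(p_b, p_a))  # 0.125
```

The two estimates sum to unity, as they must when the shared-rank probability is divided equally between the two orders.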

The approximation involved in the last term of equation (1) is close enough for all situations in which the number of stimuli to be arranged in rank order is greater than 10 or 15. The approximation is not satisfactory when the number of stimuli is small such as five or six. In the present study we have used the approximation represented by equation (1) but we shall develop here a more general formula for translating rank order into the proportions of the constant method which can be used when the experiment involves only a small number of stimuli.

If the number of stimuli represented in *figure 1* is rather small it is evident that some distortion is introduced by regarding the probability ordinates to be constant within each class interval. That is in effect the assumption in deriving the approximation equation (1). That equation is derived as though the diagrams in *figure 1* were drawn as column diagrams instead of probability polygons. Let *figure 2* represent one of these class intervals in which the probabilities show variation within the class interval. The probability that stimulus 1 will be perceived in this class interval is $p_1$, and the notation $p_2$ has a similar interpretation for the second stimulus. The horizontal dotted lines in *figure 2* represent the situation in which a stimulus would be as likely to be perceived at one part of the class interval as at any other part of it. Let the sloping straight line through $p_1$ represent for each value of $x$ the probability that stimulus 1 will be perceived at $x$. As we have drawn *figure 2*, the stimulus is more likely to be perceived in the upper part of the class interval than in the lower part of it but our correction formula will cover the general case in which the probability is assumed to vary throughout the class interval according to any linear function.

By inspection of *figure 2* we see that the equations of the two sloping straight lines are as follows:

in which

$p_{1x}$ = the probability that stimulus 1 will be perceived at any point $x$ within the class interval.

$p_{2x}$ = the probability that stimulus 2 will be perceived at any point $x$ within the class interval.

$m_1$ and $m_2$ are the slopes of the lines.

$p_1$ and $p_2$ are the probabilities that stimuli 1 and 2 respectively will be perceived in the class interval.

The probability $P_{1x}$ that stimulus 1 will be perceived higher than $x$ but within the class interval is therefore

The probability that stimulus 2 will be perceived at $x$ and that stimulus 1 will be perceived higher than $x$ but within the class interval is the product of these probabilities, namely,

After integrating and simplifying, we have

Inspection of equation (6) makes it evident that if the two slopes $m_1$ and $m_2$ are zero, this term becomes identical with the second term of equation (1) which is what we should expect. Furthermore, if *a* and *b* are interchanged as well as $p_1$ and $p_2$ we should have

In other words, the probability that both stimuli will be perceived in the same class interval is $p_1 p_2$. This probability is split into two parts, namely the probability $p^{1}_{1>2}$ that 1 is perceived above 2, and the probability $p^{1}_{2>1}$ that 2 is perceived above 1 *in the same class interval.*
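One way to carry out the integration described above is sketched here. It assumes a class interval of unit width with $x$ measured from the midpoint of the interval, so that each sloping line passes through its own $p$ at $x = 0$. That coordinate convention is an assumption, and the lines below are a worked derivation under it rather than a transcription of the numbered equations.

```latex
% Assumed convention: unit class interval, x measured from its midpoint.
\begin{align*}
p_{1x} &= p_1 + m_1 x, \qquad p_{2x} = p_2 + m_2 x,
  \qquad -\tfrac{1}{2} \le x \le \tfrac{1}{2} \\
P_{1x} &= \int_{x}^{1/2} (p_1 + m_1 t)\, dt
  = p_1\left(\tfrac{1}{2} - x\right) + \tfrac{m_1}{2}\left(\tfrac{1}{4} - x^2\right) \\
p^{1}_{1>2} &= \int_{-1/2}^{1/2} p_{2x}\, P_{1x}\, dx
  = \tfrac{1}{2}\, p_1 p_2 + \tfrac{1}{12}\left(m_1 p_2 - m_2 p_1\right) \\
p^{1}_{2>1} &= \tfrac{1}{2}\, p_1 p_2 + \tfrac{1}{12}\left(m_2 p_1 - m_1 p_2\right) \\
p^{1}_{1>2} + p^{1}_{2>1} &= p_1 p_2
\end{align*}
```

With both slopes equal to zero each part reduces to $\tfrac{1}{2} p_1 p_2$, and the two parts always sum to $p_1 p_2$, in agreement with the properties stated in the text.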

If we use equation (6) in summation form instead of the second term of equation (1), we shall have a closer approximation to the true value of $p_{1>2}$. Let the class interval of *figure 2* be designated $k$ and let the two stimuli 1 and 2 in *figure 2* be designated $b$ and $a$ respectively, so that $p_2$ and $p_1$ in that figure become, in the more general notation, $p_{ak}$ and $p_{bk}$ respectively. Then the complete formula becomes

in which the slopes $m_{ak}$ and $m_{bk}$ are defined as follows:

The notation in the general equation (8) may be summarized as follows:

$p_{b>a}$ = estimated proportion of subjects who judge stimulus $b$ higher than stimulus $a$.

$p_{ak}$ = proportion of subjects who place stimulus $a$ in any particular rank order $k$. Similar interpretation for $p_{bk}$.

$p_{a>k}$ = proportion of subjects who place stimulus $a$ higher than any specified class interval $k$. Similar interpretation for $p_{b>k}$.

$p_{a(k+1)}$ = proportion of subjects who place stimulus $a$ in the rank order next higher than $k$. Similar interpretation for $p_{b(k+1)}$.

$p_{a(k-1)}$ = proportion of subjects who place stimulus $a$ in the rank order next lower than $k$. Similar interpretation for $p_{b(k-1)}$.

It should be noted that both of the equations (1) and (8) are approximation equations but that equation (8) involves the fewest assumptions. Thus equation (1) assumes that the probabilities in *figure 1* can be adequately represented as column diagrams. This is legitimate for most problems where the number of stimuli is as large as, say, 20. Equation (8) assumes only that the variation in the probabilities of each class interval in *figure 2* is linear. It is represented by frequency polygons in *figure 1* instead of column diagrams. In most experimental situations formula (1) is adequate since it shows only a very slight discrepancy with the actual count for $p_{b>a}$.
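A slope-corrected estimate in the spirit of equation (8) might be sketched as below. Two things here are assumptions rather than the printed formula: the slope in each rank is taken as a central difference of the neighboring proportions (suggested by the appearance of the next higher and next lower ranks in the notation), and the correction coefficient of 1/12 comes from integrating two linear probabilities over a class interval of unit width. The names `slopes` and `estimate_b_over_a_linear` are hypothetical.

```python
def slopes(p):
    """Central-difference slope in each rank (an assumed definition).

    Ranks beyond either end of the table are treated as zero.
    """
    padded = [0.0] + list(p) + [0.0]
    return [(padded[k + 2] - padded[k]) / 2.0 for k in range(len(p))]


def estimate_b_over_a_linear(p_a, p_b):
    """Column-diagram estimate plus a linear within-interval correction."""
    if len(p_a) != len(p_b):
        raise ValueError("both specimens need one proportion per rank")
    m_a, m_b = slopes(p_a), slopes(p_b)
    total = 0.0
    b_above = sum(p_b)
    for pa_k, pb_k, ma_k, mb_k in zip(p_a, p_b, m_a, m_b):
        b_above -= pb_k  # proportion of B strictly above rank k
        total += pa_k * b_above + 0.5 * pa_k * pb_k
        # Assumed correction: B gains when its probability rises within
        # the interval and loses when A's probability rises.
        total += (mb_k * pa_k - ma_k * pb_k) / 12.0
    return total


# Hypothetical three-rank proportions.
p_a = [0.5, 0.5, 0.0]
p_b = [0.0, 0.5, 0.5]
print(estimate_b_over_a_linear(p_a, p_b))
print(estimate_b_over_a_linear(p_b, p_a))
```

Because the correction term changes sign when the two specimens are interchanged, the two estimates still sum to unity, and identical distributions still give .50.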

Returning to formula (1) we shall now show its application to some
experimental data. The psychophysical comparisons of social stimuli are much
more complex than the comparisons of simple sensory stimuli such as line lengths
and weights. If our method is applicable to the complexities of social stimuli,
it may safely be assumed to be applicable to the simpler case of sensory
stimuli. We shall test the formula on Miss Hevner's data for judgments about
handwriting specimens.**[1]**

In her experiment on the order of merit method she asked 370 subjects to arrange twenty specimens of handwriting in rank order. From such experimental data it was of course possible to count the number of subjects who placed each one of the twenty specimens in each one of the twenty rank orders. For example, 59 out of the 370 subjects placed specimen 4 in the fifth place from the top in excellence. In a table of this kind there must of course be as many rank orders as there are specimens because tied ranks were not allowed. From this table a second table was prepared showing the *proportion* of all the subjects who placed each specimen in each of the twenty rank orders. In the above example this proportion is 0.1595. This means that about sixteen percent of the entire group of 370 subjects placed specimen 4 in rank order five. This is shown in *Table 1* and the rest of the table is interpreted in the same manner.
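The conversion from the frequency table to the proportion table is a single division per cell, sketched below. Only the figures 59 and 370 come from the text; the function name `rank_proportions` and the three-rank toy row are hypothetical.

```python
def rank_proportions(counts, n_subjects):
    """Convert one specimen's rank frequencies into proportions."""
    # Without tied ranks every subject places the specimen exactly once,
    # so the frequencies in a row must total the number of subjects.
    if sum(counts) != n_subjects:
        raise ValueError("rank frequencies must total the number of subjects")
    return [count / n_subjects for count in counts]


# The one cell quoted in the text: 59 of 370 subjects.
print(round(59 / 370, 4))  # 0.1595

# Hypothetical row for a three-rank table with 10 subjects.
print(rank_proportions([2, 5, 3], 10))  # [0.2, 0.5, 0.3]
```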

*Table 2* is a summary of the calculation for estimating the proportion of subjects who perceived specimen 2 to be better than specimen 1. The first column is a list of the twenty rank orders. For each specimen a strip was prepared similar to the second and third columns. Columns four and five in *Table 2* represent such a strip for the second stimulus. These strips were used in calculation so as to avoid unnecessary transcription. The sixth column shows the product $p_{1k} \cdot p_{2>k}$ for each of the rank orders. This is merely the product of columns two and five in the table. The last column shows the product $p_{1k} \cdot p_{2k}$ for each of the rank orders. It is the product of items in columns two and four. In actual calculation the entries in the last two columns were not recorded. The products were calculated on a Marchant calculating machine and they were allowed to total without recording of the separate items in the last two columns. The entries of these two columns are here shown for completeness although in practice it is not necessary to write them. Only the sums for the last two columns are recorded. These sums are shown at the bottom of the table. The simple calculation of the estimated proportion of all the subjects who perceived specimen 2 to be better than specimen 1 is also there indicated. It is done in accordance with equation 1.

This procedure was carried out for each of the $\tfrac{1}{2}n(n-1) = 190$ possible pairs of stimuli although it should be noted that when the two specimens are far apart in excellence the amount of overlapping is small so that the calculation is then short. For example, the strip for specimen 1 covers the first 12 rank orders as shown in the second and third columns of *Table 2.* The corresponding strip for specimen 20 covers the rank orders from 14 to 20 inclusive. Since there is no overlapping it is clear that none of the subjects regarded specimen 20 as better than specimen 1 and consequently we can write without further calculation that the estimated proportion of the subjects who perceived specimen 1 as better than specimen 20 is unity. The amount of calculation is a maximum when the two specimens are nearly of the same degree of excellence and two such specimens are shown in the example of *Table 2.* Consequently the labor of calculation is not nearly so great as would be indicated by merely multiplying the labor of *Table 2* by 190.
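The shortcut for widely separated specimens amounts to checking whether the two strips share any rank before doing the multiplication. In the sketch below the strip extents (ranks 1 to 12 and ranks 14 to 20) come from the text, while the uniform proportions inside each strip and the function names are hypothetical.

```python
def occupied_ranks(p):
    """Return the first and last rank (1-based) with a nonzero proportion."""
    ranks = [k for k, value in enumerate(p, start=1) if value > 0]
    if not ranks:
        raise ValueError("the strip has no nonzero proportion")
    return ranks[0], ranks[-1]


def strips_overlap(p_a, p_b):
    """True when the two specimens share at least part of their rank range."""
    lo_a, hi_a = occupied_ranks(p_a)
    lo_b, hi_b = occupied_ranks(p_b)
    return lo_a <= hi_b and lo_b <= hi_a


# Strip extents from the text, filled with hypothetical uniform values.
specimen_1 = [1 / 12] * 12 + [0.0] * 8
specimen_20 = [0.0] * 13 + [1 / 7] * 7

# No overlap: the proportion is 0 or 1 and needs no further calculation.
print(strips_overlap(specimen_1, specimen_20))  # False
print(20 * 19 // 2)  # 190 pairs in all
```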

Miss Hevner actually tabulated the number of judges who placed each one of the twenty specimens higher than every other specimen. This was accomplished by her records of order of merit for 370 subjects. This was an exceedingly laborious procedure but it was done in order to compare the order of merit method with two other psychophysical methods, namely the method of equal appearing intervals and the method of paired comparison. In order to test our equation 1 we listed the proportions estimated by equation 1 and also the actual proportions tabulated by Miss Hevner for the order of merit method. The discrepancy for each proportion was listed in the form $(p_H - p_c)$ in which $p_H$ refers to the actual proportion found by Miss Hevner by the order of merit method and $p_c$ refers to the estimated proportions calculated by our present equation (1). The distribution of discrepancies is shown in *figure 3.* The average discrepancy, disregarding sign, is .0078 which shows a very close agreement. The fact that the average discrepancy is less than one percent constitutes practical justification for equation (1) as a method of estimating the proportions of the constant method when the experimental procedure was that of simple rank order. The close similarity in results from the order of merit method and the constant method was demonstrated by Miss Hevner.
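The summary statistic used in this comparison, the average discrepancy disregarding sign, can be sketched as follows. The three pairs of proportions are hypothetical placeholders and are not taken from Miss Hevner's tables; the function name is likewise hypothetical.

```python
def mean_absolute_discrepancy(observed, estimated):
    """Average of |p_H - p_c| over all pairs of stimuli."""
    if len(observed) != len(estimated):
        raise ValueError("one estimated proportion is needed per observed one")
    return sum(abs(h - c) for h, c in zip(observed, estimated)) / len(observed)


# Hypothetical proportions for three pairs of stimuli.
p_h = [0.62, 0.35, 0.91]  # counted directly from the rank orders
p_c = [0.61, 0.36, 0.90]  # estimated by equation (1)
print(round(mean_absolute_discrepancy(p_h, p_c), 4))  # 0.01
```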

It is to be expected that the scale values for the twenty specimens determined by the two sets of proportions should be practically identical and this is shown in *figure 4.* We have tabulated the scale values determined by Miss Hevner for the order of merit method and for the same raw data treated by the present formula. The agreement is practically perfect as shown in *figure 4.* Not only are the two scales comparable but the units are identical since the proportions themselves agree and consequently the slope of the plot in *figure 4* is unity.

An important psychophysical inference may be drawn from these experiments. In a previous paper I have assumed that the correlation between discriminal errors of two specimens that are being compared is zero. That is a fundamental assumption of the law of comparative judgment. At that time I did not see any clear method of testing that assumption separately from the other assumptions involved in the same study. Since the equation of comparative judgment has been shown to fit experimental data for a wide variety of stimuli I have felt assured that the assumptions were justified. In the present data we find, however, a specific verification for the assumption that the correlational term of the equation of comparative judgment is zero.**[2]** If it were not zero, then the probabilities involved in equation (1) would not be independent and consequently the product of the several probabilities would not tally with the experimentally observed proportion of judgments that constitute a compound event. Since the equation satisfies the experimentally observed frequencies with a degree of accuracy that is unusual in psychological work, we are justified in concluding that the several probabilities are truly independent and that therefore the correlational term in the law of comparative judgment is zero as has been previously assumed.

In Miss Hevner's study a comparison was made between the order of merit
method and the constant method in its complete form, namely paired comparison.
These two methods were shown to be identical when treated by the equation of
comparative judgment. We have here shown that it is not necessary in the order
of merit method to tabulate separately all of the *n(n − 1)* judgments for
each subject that are implied in his arrangement of *n *stimuli in a single
rank order. It is possible to estimate the proportions directly from a frequency
table of rank orders for each specimen. This makes it possible to use simple
rank order when that method is experimentally the easiest and to extract from
the rank order data the proportions that would be obtained by the laborious
constant method or the even more laborious paired comparison method. Miss Hevner
has previously shown that the order of merit method gives results practically
identical with the constant method. We have also verified our previous
assumption that the discriminal errors in a comparison of two stimuli are
usually uncorrelated. This assumption enters the law of comparative judgment
when the correlational term is assumed to be zero.

(Manuscript received May 27, 1930)