Application of the "Order of Merit Method" to Advertising

Edward K. Strong

THE "order of merit method" has only of late been applied to psychological problems. By this method a series of stimuli are arranged according to some designated order. For example, a series of soap advertisements are arranged in the order of preference according to which the subject would buy the soap. The great advantage of this method over that of the more generally used one of "paired comparison," is the comparative ease and quickness with which a large number of stimuli may be graded on the basis of the given criterion. This method facilitates the obtaining of results from a large number of subjects, thus avoiding the small "select" groups so commonly used in psychological experiments. A second feature of the method not yet appreciated by many psychologists is the ability to secure judgments upon very complex stimuli. Not only in these cases may the stimuli be too complex to be analyzed into their component parts, but the resulting judgments may also be based on so many details that they can not be analyzed through introspections. Yet with all these complications a series of judgments may be secured that will not vary greatly for the same individual if repeated after considerable lapses of time. In fact, one of the striking points of the method is this reliability of the judgments.

Professor Cattell was the first to make use of this method in his study of two hundred shades of gray. Since then it has been used in the study of beliefs, the measurement of scientific merit, judgments and their reliability, jokes, family resemblance in handwriting, etc. In all these cases an order of superiority was established. But now the question arises: How much superior is the best one in the series to the second, and how may this superiority be measured? In other words, has A twice as much "pulling power" as B, or three times as much, or is there a very slight difference between them in "pulling power"? In the preceding work the probable errors of the positions were employed not only as a measure of the reliability of the judgments involved, but also as a measure of the differences between the assigned positions. Recent work, however, of Dr. Hollingworth and myself would indicate that the P.E. is, to some extent, a function of the number of possible positions to be assigned the stimuli, and to that extent, at least, is not a measure of the absolute difference between the positions. Professor Thorndike has determined an absolute scale for handwriting where the differences between each grade of handwriting are supposedly equal. This was done by selecting such successive samples as were judged by a certain per cent. of the judges as better than the succeeding sample. Just such an absolute scale for advertisements is what has been attempted in this report. It is not enough to know the order of superiority in a set of advertisements. The amount of difference in superiority is essential to any practical use of the series. For example, if in a series of five advertisements three are between eighty-five and ninety per cent. of perfection and the other two are below forty per cent. of perfection, then surely their order of superiority is not enough. The first three are practically equal from a business point of view, while the last two should immediately be discontinued from use.

Accordingly a set of fifty Packer Tar Soap advertisements was secured through the courtesy of Mr. Edward A. Olds, Jr., of the Packer Manufacturing Company. This set includes advertisements covering a period of twenty years, many of which have been long discarded from use. Twenty-five subjects were employed, fifteen men and ten women. The following type-written directions were given to the subjects together with the fifty advertisements:


Sort these fifty advertisements according to the order in which you would buy the soap.

Take for granted that each advertisement represents a different make of soap. Arrange them into as many piles as you desire; but so arrange them that the difference in superiority of one pile over the next is "just noticeable."

If the superiority of one advertisement over the next is more than "just noticeable," leave as many gaps as you feel are needed to indicate this superiority.

After the sorting was completed a second set of directions was given them, as follows:


Designate the pile, if there is such, which has no appeal to you at all. The piles above it should then all have an increasing appeal for you to buy their soap and the piles below it should have an increasing negative effect upon you (i. e., prejudice, distaste, or disgust with the soap).

The subjects sorted the advertisements into piles ranging in number from six to thirty-seven. The highest pile was arbitrarily assigned the value of one hundred, and the pile which the subject designated as having no appeal was assigned the value of zero. The piles between these two were assigned values proportionately. The piles below the "no appeal" pile were assigned correspondingly negative values. The values assigned to the advertisements are thus figured from (1) the advertisements which the subject considered the best in the set and (2) the advertisements which the subject considered of no appeal. Considerable care was taken in every case that each subject understood the meaning of "no appeal," so that as far as possible it had the same content for all twenty-five subjects. It is believed that this zero point does actually approximate the zero point of appeal in advertisements. The one hundred mark, of course, simply marks the best advertisement in the fifty.

Table 1
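
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: positions (piles and any gaps left between them) are taken to be equally spaced, "proportionately" is read as linear interpolation, and the function name `pile_values` and its arguments are hypothetical, not taken from the paper. The case of a subject who designates no "no appeal" pile is not covered, since the text does not say how it was handled.

```python
def pile_values(n_positions, no_appeal_index):
    """Assign a value to each position, best (index 0) to worst.

    Assumption: positions, including any gaps the subject left,
    are equally spaced, so values fall off linearly from 100 at
    the best pile to 0 at the "no appeal" pile and continue below
    zero at the same rate.
    """
    if not 0 < no_appeal_index < n_positions:
        raise ValueError("no-appeal pile must lie below the best pile")
    step = 100.0 / no_appeal_index  # value lost per position
    return [100.0 - step * i for i in range(n_positions)]


# Hypothetical subject: seven positions, the fifth (index 4) has no appeal.
print(pile_values(7, 4))
```

Because each subject's values are anchored to that subject's own best pile and own "no appeal" pile, subjects who used six piles and subjects who used thirty-seven are placed on a common scale.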

Table I. gives the results from these arrangements. The first column gives the advertisements by number in the order of superiority based on the median judgment of the twenty-five subjects. The following three columns give the median judgment and its quartile for the fifteen men, the ten women, and the twenty-five subjects, respectively. The distribution of judgments throughout this series, with the exception of four "copy ads," approximates very closely to the normal curve of distribution, and hence the quartile approximates very closely to the P.E. The P.E. of the median position is then the quartile divided by the square root of the number of cases. The fifth column gives the rating of the Packer Manufacturing Company, and the sixth column gives similarly the average rating of three members of the Blackman-Ross Advertising Agency. It is scarcely necessary to repeat that the results of the Packer Manufacturing Company are not based upon carefully compiled data, but only upon the judgment of the firm based on their business experience. Any one familiar with advertising knows that such data have not been compiled for any extensive set of advertisements, let alone a series of fifty extending over twenty years of service. If such data did exist, they could not be used at their full face value, as an advertisement of twenty years ago might have been very effective then and be out of date to-day.
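
The statement that the P.E. of the median is the quartile divided by the square root of the number of cases can be made concrete as follows. The sketch assumes that "quartile" means the semi-interquartile range, (Q3 - Q1)/2; the sample judgments are invented for illustration and are not data from Table I.

```python
from math import sqrt
from statistics import median, quantiles


def quartile_deviation(judgments):
    """Semi-interquartile range, assumed here to be the paper's 'quartile'."""
    q1, _, q3 = quantiles(judgments, n=4, method="inclusive")
    return (q3 - q1) / 2


def pe_of_median(judgments):
    """Probable error of the median: quartile / sqrt(number of cases)."""
    return quartile_deviation(judgments) / sqrt(len(judgments))


# Invented values for one advertisement from five hypothetical subjects.
sample = [10, 20, 30, 40, 50]
print(median(sample), quartile_deviation(sample), pe_of_median(sample))
```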

The order of the twenty-five subjects correlates +.52 with the order of either of the two advertising experts. The correlation between the orders of the two advertising experts is +.64.[1] These relationships are lower than those which have been obtained with other sets of advertisements.
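
As an illustration of how two orders of preference may be correlated, the sketch below uses Spearman's rank-difference formula for rankings without ties, together with Spearman's correction for attenuation mentioned in the footnote. The paper does not state which formula was used, so the choice of Spearman's coefficient is an assumption; the function names, the sample rankings, and the reliability figures are likewise hypothetical.

```python
from math import sqrt


def rank_correlation(order_a, order_b):
    """Spearman's rho for two rankings of the same items, no ties."""
    n = len(order_a)
    if n != len(order_b) or n < 2:
        raise ValueError("need two rankings of equal length, n >= 2")
    d_squared = sum((a - b) ** 2 for a, b in zip(order_a, order_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def correct_for_attenuation(r, reliability_a, reliability_b):
    """Spearman's correction: observed r over the root of the reliabilities."""
    return r / sqrt(reliability_a * reliability_b)


# Invented ranks assigned to five advertisements by two judges.
subjects = [1, 2, 3, 4, 5]
expert = [2, 1, 4, 3, 5]
r = rank_correlation(subjects, expert)
print(r)
# Reliabilities of 0.8 are placeholders, not figures from the study.
print(correct_for_attenuation(r, 0.8, 0.8))
```

With reliabilities below one, the corrected coefficient is larger in magnitude than the observed one, which is the sense of the footnote's remark.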

From the above figures, then, we have an order of the superiority of Packer's Tar Soap advertisements as to "pulling power." Also, by constructing a scale from the data of Table I., we have the amount of difference between any two advertisements. From an inspection of such a scale it is very evident that there is a far greater difference between the advertisement ranked first and the one ranked fifth than there is between the advertisement ranked sixth and the one ranked seventeenth.

To further check the reliability of this method, eight advertisements were so chosen that the difference between each and the next should be equal. These advertisements were then arranged by one hundred subjects in the order in which they would buy the soap. The same ratio of men and women was preserved as in the former experiments, so that of the one hundred subjects sixty were men and forty women. Table II shows the results of this experiment. The first column gives the order of superiority as assigned by the median judgment of the one hundred subjects; the second column gives likewise the order as determined above by the twenty-five subjects. Then follow the orders as assigned by the Packer Manufacturing Company and the Blackman-Ross Advertising Agency. It is only fair to state that advertisement No. 4 was badly torn at the start of the experiment with the one hundred subjects, and when mended became badly wrinkled. This injury to its appearance, I believe, will fully explain its lower position with the one hundred subjects than with the twenty-five subjects. Below this table are given the coefficients of correlation between the four different orders of preference. The order of the one hundred subjects correlates as high in one case with the three other orders as the two orders of the advertising experts correlate with each other, and higher in the other two cases.

Table 2 Order of Superiority of Eight Advertisements

It is evident, then, that the "order of merit method" does give results that correlate high with results obtained in business. A series of lathe advertisements from the Bullard Machine Company has given a correlation of one hundred when compared with the data of this company.

Considerable information as to the factors which enter into good and bad advertisements is obtained from these results. In fact, the few advertising men that have seen them are very enthusiastic in declaring that this laboratory study checks up with their business experience. One such expert stated that in his judgment the order from the twenty-five subjects was nearer the truth than that of the Packer Manufacturing Company. However, let us turn from results of interest mainly to advertising men, and consider a number of sex differences which this study brings out.

An inspection of Table I. shows that the range of judgments for the men is much less than that for the women, i. e., from +84 to 0 for the men and from +75 to -43 for the women. Not only is the range of judgments for women greater than that for men, but the variability of the judgments for each advertisement is also greater. The average A.D. of the median judgments for each advertisement for women is sixty-nine per cent. greater than for the men, while the absolute range of difference of judgment for the women is seventy-one per cent. greater than for the men. The explanation of this situation appears to be that when women are given an equal opportunity with men to rate appeals (advertisements), they are able to classify their dislikes as readily as their preferences, which the men do not do. Such a condition naturally results in a greater total range (where methods of experimentation similar to those in this chapter are used) and consequently in a seemingly greater variability. A careful analysis of the data will not really show greater variability of judgment among the women. What it does show is that women have more and greater dislikes than men and are surer of them. This is also shown in the fact that the women rank thirteen advertisements as negative in appeal while the men do not rank any, the thirteen occupying thirty-six per cent. of the entire range of the women.

If now we turn to the question of sex difference in "appeal," we find that there are twelve advertisements that the men ranked higher than the women, and nine advertisements that the women ranked higher than the men. Four of the advertisements of the above twenty-one are ranked above the sixteenth position by the twenty-five subjects, the remainder are ranked below the twenty-second position of the fifty. It is evident that the two sexes nearly agreed about the best advertisements but disagreed about the poorer ones.

Only those sex differences are considered here in which the probable errors of the true medians from the obtained do not overlap, that is to say, the chances are more than equal that the differences discussed are real and not due to chance. Among the advertisements ranked higher by the women than by the men we have the three "kitten ads," the "baby in the satchel ad," the "little boy in the cart ad," the "tired tourist ad," and the "letter to Santa Claus ad." The main feature of all these advertisements is the irrelevancy of their cuts. Among the twelve advertisements ranked higher by the men than by the women, only two can be grouped under the heading of irrelevancy: No. 20, a mother and naked child, and No. 36, two children. This preference for the irrelevant among women confirms the early work of Gale upon attention value. He states that "the female attention was more susceptible to irrelevancy, as it was also to cuts, than was the masculine attention."
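
The criterion used above to select sex differences, that the probable errors of the two medians do not overlap, can be expressed as a simple check. The sketch reads "do not overlap" as the medians differing by more than the sum of their probable errors, which is an interpretation; the function name and the numbers are hypothetical and are not taken from Table I.

```python
from math import sqrt


def probable_errors_separate(median_men, quartile_men, n_men,
                             median_women, quartile_women, n_women):
    """True when the bands median +/- P.E. of the two groups do not overlap."""
    pe_men = quartile_men / sqrt(n_men)        # P.E. of the men's median
    pe_women = quartile_women / sqrt(n_women)  # P.E. of the women's median
    return abs(median_men - median_women) > pe_men + pe_women


# Invented medians and quartiles for fifteen men and ten women.
print(probable_errors_separate(60, 15, 15, 40, 20, 10))
print(probable_errors_separate(50, 15, 15, 45, 20, 10))
```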

Another difference that might be mentioned here is the preference of the men for the so-called "copy ads." Of the twelve advertisements preferred by the men, three were "copy ads" and four were "half copy and half cut ads." Only one of the advertisements preferred by the women could be considered as approximating a "copy ad," and there the main interest, apparently small, I should judge, would lie in the three small cuts. We should conclude, then, that women are more interested in irrelevant matter and in cuts than are men.

In conclusion, let me repeat that we have in this "order of merit method" a system of handling very complex material, and in all the cases in which it has been possible to check up its results with known conditions it has shown a high degree of reliability. Many questions of conduct, esthetics, morals, and religion, which have been too complex to be handled experimentally in the past, can be investigated to advantage by this method.



  1. Neither of these coefficients of correlation has been corrected for attenuation: hence the true coefficients would be somewhat higher than those given here.

