A CRITICAL STUDY OF THE THURSTONE TECHNIQUE OF SOCIAL ATTITUDE MEASUREMENT

Approved:

Approved:
Dean of the Graduate School

THIS IS AN ORIGINAL MANUSCRIPT. IT MAY NOT BE COPIED WITHOUT THE AUTHOR'S PERMISSION.

THESIS

Presented to the Faculty of the Graduate School of The University of Texas in Partial Fulfillment of the Requirements For the Degree of DOCTOR OF PHILOSOPHY

By Marie Elizabeth Faddis Gentry, B.A., M.A.

Austin, Texas
June, 1933

PREFACE

The writer wishes to express her sincere appreciation to Dr. F. A. C. Perrin for his suggestion of this problem and for his assistance in its development, and to Dr. L. A. Jeffress for statistical aid and constructive criticisms of the manuscript. She is also grateful to Dr. D. B. Klein for helpful suggestions. Finally she wishes to acknowledge her indebtedness to all the students who gave their time as subjects for the experiment.

M. E. F. G.

TABLE OF CONTENTS

Chapter Page
I The Problem .... 1
II The Experimental Procedure .... 6
III Construction of a Scale by the Method of Equal Appearing Intervals .... 19
IV Construction of a Scale by the Order of Merit Method .... 46
V Comparison of the Method of Equal Appearing Intervals and the Order of Merit Method .... 63
VI Some Measures of Reliability and Validity of the Experimental Scale .... 78
VII Summary and Conclusions .... 83

LIST OF TABLES

Table Page
I Summary of Sorting of 77 Statements by 200 persons .... 22-24
II Proportion of 200 judges who placed each one of twenty opinions toward smoking in each of twenty rank orders by the order of merit method .... 51
III Estimation of the proportions of subjects who perceived opinion 2 > 1 .... 52
IV Calculated proportion of judgments in which the opinion at the top was considered more favorable than the opinion at the side .... 53
V The rank order of the statements determined by the method of equal appearing intervals and by the order of merit method .... 55
VI Calculation of for statement 1 .... 58
VII Determination of the scale of distance S z - .... 59
VIII Scale separation between opinions .... 60
IX Scale values, order of merit method .... 61
X Scale values of the twenty opinions by the two methods .... 65
XI Scale values of the twenty statements calculated from unweighted proportions .... 72

LIST OF FIGURES

Figure Page
1 Cumulative frequency curve for statement 39. Method of equal appearing intervals .... 25
2 Cumulative frequency curve for statement 145. Method of equal appearing intervals .... 26
3 Cumulative frequency curve for statement 70. Method of equal appearing intervals .... 27
4 Average ambiguity throughout the scale .... 31
5 Criterion of irrelevance for statement 106 .... 35
6 Criterion of irrelevance for statement 61 .... 36
7 Criterion of irrelevance for statement 43 .... 37
8 Experimental scale of 31 statements .... 41
9 Experimental scale of 20 statements .... 64
10 Comparison of the scale values from the method of equal appearing intervals and the order of merit method .... 66
11 Comparison of those who smoke and those who do not .... 82

CHAPTER I
THE PROBLEM

The term attitude has been defined in various ways. For this study we use the term as defined by Thurstone¹ in his latest publication on the subject.

Attitude is the affect for or against a psychological object. Affect in its primitive form is described as appetition or aversion. Appetition is the positive form of affect which in more sophisticated situations appears as liking the psychological object, defending it, favoring it in various ways. Aversion is the negative form of affect which is described as hating the psychological object, disliking it, destroying it, or otherwise reacting against it.
Attitude is here used to describe potential action toward the object with regard only to the question whether the potential action will be favorable or unfavorable toward the object. For example, if we say that a man's attitude toward prohibition is negative, we mean that his potential actions about prohibition may be expected to be against it, barring compromises in particular cases. When we say that a man's attitude toward prohibition is negative, we have merely indicated the affective direction of his potential action toward the object. We have not said anything about the particular detailed manner in which he might act. The affect about an object may be of strong intensity or it may be weak. The positive and negative affect therefore constitutes a linear continuum, with a neutral point or zone and two opposite directions, one positive and the other negative. Measurement along this affective continuum is of a discriminatory character with the discriminal error as a unit of measurement.

A subject's attitude is measured by the endorsement or rejection of opinions. Opinion, as defined by Thurstone,² signifies "a verbalization of attitude ... An opinion symbolizes an attitude."

The methods selected for the measurement of attitude are methods which originally were utilized in experiments on length of lines, lifted weights, etc., that is, for stimuli which had physical correlates. Cattell was the first to extend psychophysical method to social stimuli, for which there is only the psychological continuum, known as the S-scale. He used the order of merit method to rate scientific men.³ His students applied psychophysical method to still other social values. Ratings of the excellence of handwriting by Thorndike, and of literary merit by Wells, are two examples. To Thurstone belongs the credit for applying psychophysical method to the scaling of attitude.
The three methods that he has utilized for this purpose are the method of paired comparison, the order of merit method, and the method of equal appearing intervals. The Law of Comparative Judgment, formulated by Thurstone⁴ in 1927, gives us a way of handling the method of paired comparison so that it satisfies the criterion of internal consistency. But this method involves the use of every stimulus as a standard, and hence is very laborious if the stimuli are numerous. The order of merit method likewise is practically justified only when there are few stimuli to be ranked, that is, not more than twenty. Thurstone has devised a technique for this method whereby it is possible to utilize the Law of Comparative Judgment. This involves the extraction of the proportion of judgments "A is greater than B" for every possible pair of stimuli in the given series. Thurstone describes⁵ the theory of this procedure as follows:

If a subject has placed four stimuli A B C D in the rank order B D A C it is possible to tabulate his various comparisons as though he had made them separately. If each of the four stimuli were compared with every other one in the series it would require six separate judgments, namely AB AC AD BC BD CD. If there are n stimuli in the series it would require n(n-1) such judgments with counterbalanced order of presentation or half that many if counterbalanced order is disregarded. This would give only one judgment for each of the possible pairs of stimuli. Now if four stimuli have been placed in the rank order B D A C by one subject, it is clear that six judgments may be extracted from this one rank order. Evidently the above rank order series is equivalent to the judgments B>D, B>A, B>C, D>A, D>C, A>C.

When it is desired to scale a great many stimuli the method of equal appearing intervals is the only one practicable.* This procedure Thurstone has applied to the scaling of attitudes toward the church, toward the movies, and toward other social issues.
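The extraction of paired judgments from a rank order, as described in the passage quoted from Thurstone, can be sketched in a few lines of Python. The sketch is illustrative only: the function names are hypothetical, and the simple tally in `preference_proportions` assumes that every judge ranks the same complete set of stimuli. It is not necessarily the estimation procedure applied to the data of this study.

```python
from collections import Counter
from itertools import combinations


def pairwise_judgments(rank_order):
    """Return every judgment (higher, lower) implied by one rank order.

    rank_order lists the stimuli from the highest rank to the lowest.
    """
    # combinations() keeps the input order, so the first member of each
    # pair is always the stimulus that was ranked higher.
    return list(combinations(rank_order, 2))


def pair_count(n):
    """Number of distinct pairs among n stimuli, order of presentation ignored."""
    return n * (n - 1) // 2


def preference_proportions(rank_orders):
    """Proportion of judges who ranked the first stimulus above the second.

    Assumption of this sketch: each rank order contains the same stimuli,
    each exactly once.
    """
    counts = Counter()
    for order in rank_orders:
        counts.update(pairwise_judgments(order))
    judges = len(rank_orders)
    return {pair: count / judges for pair, count in counts.items()}


# The rank order B D A C from the quoted example.
print(pairwise_judgments(["B", "D", "A", "C"]))
# [('B', 'D'), ('B', 'A'), ('B', 'C'), ('D', 'A'), ('D', 'C'), ('A', 'C')]
```

For four stimuli `pair_count(4)` gives the six judgments listed in the quotation; for a series of twenty stimuli it gives 190, which suggests why direct paired comparison becomes laborious as the series grows.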
The general concept of attitude measurement has been criticized by many people. They say that to attempt to arrange attitudes along a continuum is to force something which is multidimensional into a line. Thurstone discusses this argument in the following manner:⁶

When we discuss opinions, we quickly find that these opinions are multidimensional, that they cannot all be represented in a linear continuum. The various opinions cannot be completely described merely as "more" or "less". They scatter in many dimensions, but the very idea of measurement implies a linear continuum of some sort such as length, price, volume, weight, age. When the idea of measurement is applied to scholastic achievement, for example, it is necessary to force the qualitative variations into a scholastic linear scale of some kind. We judge in a similar way qualities such as mechanical skill, the excellence of handwriting, and the amount of a man's education, as though these traits were strung out along a single scale, although they are, of course, in reality scattered in many dimensions. As a matter of fact, we get along quite well with the concept of a linear scale in describing traits even so qualitative as education, social and economic status, or beauty. A scale or linear continuum is implied when we say that a man has more education than another, or that a woman is more beautiful than another, even though, if pressed, we admit that perhaps the pair involved in each of the comparisons have little in common. It is clear that the linear continuum which is implied in a "more or less" judgment may be conceptual, that it does not necessarily have the physical existence of a yardstick.

Therefore, Thurstone's justification for the measurement of attitude is a pragmatic one. It works.
The purposes of this study are the following: (1) to construct a scale by the method of equal appearing intervals, varying some of the conditions of Thurstone's procedure, in order to determine whether the labor involved could be decreased without decreasing at the same time the reliability of the resulting scale; (2) to compare the method of equal appearing intervals with the order of merit method; and (3) to construct a scale of the attitude toward smoking. We hoped that by carrying out these steps we could make a critical analysis of Thurstone's technique of scaling attitude.

¹ Thurstone, L. L.: "The Measurement of Social Attitudes," Jour. of Abnormal and Social Psychology, 1931, Vol. XXVI, p. 261.
² Thurstone, L. L.: The Measurement of Attitude, The University of Chicago Press, p. 7.
³ Hollingworth, H. L.: "Prof. Cattell's Studies by the Method of Relative Position," Columbia Univ. Con. to Phil. and Psy.
⁴ Thurstone, L. L.: "A Law of Comparative Judgment," Psy. Rev., 1927, Vol. 34, pp. 273-286.
⁵ Thurstone, L. L.: "Rank Order as a Psychophysical Method," Journal of Experimental Psychology, 1931, Vol. 14, p. 188.
* Thurstone's article, "A Scale for Measuring Attitude toward the Movies," Journal of Educational Research, 1930, Vol. 22, pp. 89-94, outlines a technique involving the method of equal appearing intervals for the selection of stimuli and the order of merit method for the actual scaling.
⁶ Thurstone, L. L.: The Measurement of Attitude, pp. 10-11.

CHAPTER II
THE EXPERIMENTAL PROCEDURE

The steps to be followed in the procedure worked out by Thurstone⁷ for the method of equal appearing intervals are (1) selection of an attitude variable, (2) collection of opinions describing this attitude variable, (3) editing of the opinions, (4) sorting of the selected opinions into piles, (5) determination of the scale values of the opinions, and (6) final selection of opinions to form an equally graduated scale. Each of these steps will be elaborated in later paragraphs.

1. Selection of the attitude variable

For the first step, it is necessary to select an attitude variable about which one can speak in terms of more or less. The attitude variable chosen for this experiment was the favorableness toward the social practice of smoking. Smoking was selected, rather than some other practice, because there is no university restriction against it, and also because it is generally indulged in by both sexes at the University of Texas. As a consequence, there should have been no restraint when the scaled opinions were presented for endorsement. Moreover, smoking is a practice which is of interest to all students, and hence an issue about which every student has opinions.

⁷ Thurstone, L. L.: The Measurement of Attitude, Chapters ii-iii, pp. 22-58.

2. Collection of attitudes

The collection of the attitudes was made in the main from a group of 60 students, most of whom were juniors and seniors, the rest graduates. We thought it best to get the opinions from the students who were to be the subjects to scale and indorse them. This group of sixty was used in every step of the procedure. The students were directed to state any opinions toward the practice of smoking that they themselves professed, or that they had heard professed by others. As a result, over fifteen hundred statements of opinion were collected, many of them duplicates, and many which had nothing to do with the attitude in question.

3. Editing of the opinions

Thurstone⁸ formulates a list of informal criteria for the selection of opinions:

(1) The statements should be as brief as possible so as not to fatigue the subjects who are asked to read the whole list.
(2) The statements should be such that they can be indorsed or rejected in accordance with their agreement or disagreement with the attitude of the reader.
(3) Every statement should be such that acceptance or rejection of the statement does indicate something regarding the reader's attitude about the issue in question.
(4) Double-barrelled statements should be avoided except possibly as examples of neutrality when better neutral statements do not seem readily available.
(5) One must insure that at least a fair majority of the statements really belong on the attitude variable that is to be measured.
(6) As far as possible the opinions should reflect the present attitude of the subject rather than his attitudes in the past.
(7) One should avoid statements which are evidently applicable to a very restricted range of indorsers.
(8) Each opinion selected for the attitude scale should be such that it is not possible for subjects from both ends of the scale to endorse it.
(9) As far as possible the statements should be free from related and confusing concepts.
(10) Other things being equal, slang may be avoided except where it serves the purpose of describing an attitude more briefly than it could otherwise be stated.

From the 1500 opinions collected, 145 were selected. These 145 statements were mimeographed on 2| x 4 inch cards and presented to the 60 subjects mentioned above, and also to six members of the psychology department, in a preliminary experiment.

⁸ Ibid., pp. 22-23; 57-58.

4. The preliminary experiment

The purposes of the preliminary experiment were as follows: (1) to determine whether Thurstone's directions were satisfactory for the subjects with whom we had to deal at the University of Texas; (2) to determine whether favorableness or unfavorableness toward smoking could be extracted from statements involving the sex factor and the situation factor, as well as from statements concerning smoking in general; (3) to select the most satisfactory statements; (4) to determine the best conditions under which to conduct the experiment proper; (5) to determine the approximate amount of time to be given to the experiment; and (6) to determine whether the statements were well distributed along the scale.
The results of this preliminary experiment will be discussed in later paragraphs, as each step in the procedure is described. The direction sheet given with the cards was a duplicate of Thurstone's,⁹ except that the wording was adapted to smoking instead of to the church:

DIRECTIONS FOR SORTING CARDS

1. The 145 cards contain statements of attitude toward smoking. These have been made by students and other people.
2. As a first step in the making of a scale that may be used in a test of opinions relating to smoking we want a number of persons to sort these 145 cards into eleven piles.
3. You are given eleven cards with letters on them, A, B, C, D, E, F, G, H, I, J, K. Please arrange these before you in the regular order. On card A put those statements which you believe express the most favorable attitude toward smoking. On card F put those expressing a neutral position. On card K put those cards which express the most unfavorable attitude toward smoking. On the rest of the cards arrange statements in accordance with the degree of approval or disapproval in them. It will aid in placing these statements if you ask yourself, "How favorable toward smoking is the person who makes this statement?"*
4. This means that when you are through sorting you will have eleven piles arranged in order of approval estimate from A, the highest, to K, the lowest. The intervals between the successive piles should be apparently equal shifts of attitude.*
5. Do not try to get the same number in each pile. They are not evenly distributed.
6. The numbers on the cards are code numbers and have nothing to do with the arrangement in piles.
7. You will find it easier to sort them if you look over a number of the cards, chosen at random, before you begin to sort them.
8. It will probably take you about forty-five minutes to sort them.
9. When you are through sorting, please put a rubber band around each pile, each with its letter card on top.
Replace the eleven sets with the direction sheet in the large envelope and return to the person in charge.
10. Put your name and classification on the slip enclosed.

The sortings from this preliminary experiment were tabulated and cumulative frequency curves drawn to get the scale values, as described in more detail for the experiment proper. As a result of this preliminary experiment the list of 145 opinions was cut to 77. Of the sixty-eight thrown out, the majority concerned smoking by the two sexes and in different situations, and a few were very ambiguous, having a Q-value of over 2.5 scale units. (The concept of the Q-value will be explained later.) It was found to be impossible to arrange the statements dealing with sex and situation along the continuum of favorableness toward the general practice of smoking.

It is contended by some individuals that it is also impossible to arrange statements that are concerned with opinions about smoking from different standpoints, such as health, morals, social advantages, and so on, along the continuum of favorableness toward smoking. Against this contention we claim that it is not health, morals, social advantages, and so on, that are being arranged along the continuum, but rather the degree of affect toward smoking that is abstracted from the opinions. The fact that these statements have been arranged rather uniformly by the subjects in this study is evidence that something which is common to all of the statements has been abstracted from them. This common something is the degree of favorableness expressed by the statements.

The 77 opinions selected for presentation in the experiment proper are as follows:

(1) 1. I get no pleasure from smoking, but I do not object to others smoking.
(5) 2. Smoking indicates bad home training.
(7) 3. Smoking detracts from some personalities and makes other types more attractive.
(10) 4. Smoking is hard on personal appearance, for it ruins the teeth, stains the fingers, and makes smudges on the clothes.
(13) 5. I think it is for the individual to judge for himself whether or not he shall smoke.
(20) 6. I approve of smoking though I seldom smoke myself.
(21) 7. I like smoking but I do not miss it when a cigarette is not available.
(25) 8. Smoking is a dirty habit.
(26) 9. Smoking is socially obnoxious because the odor of tobacco is nauseating to a great many people.
(27) 10. I think a party is more enjoyable when everyone smokes.
(29) 11. Smoking is bad because of the prodigious amount of time and effort wasted.
(30) 12. I have no particular desire to smoke and therefore do not.
(31) 13. Smoking should always be discouraged regardless of its few good effects.
(32) 14. Smoking is just like any other appetite, a bad thing if indulged in to the extreme.
(34) 15. I do not like smoking because it is such a messy habit, resulting in ashes, cigarette butts, etc., in a room where it has been done.
(35) 16. I like to see others smoke and to smell the smoke.
(39) 17. Smoking encourages carelessness in personal habits.
(40) 18. Smoking uses money that could be spent to better advantage in other ways.
(43) 19. I smoke occasionally but I don't care anything about it.
(46) 20. Smoking has the effects of a drug and weakens the will power.
(48) 21. Smoking is as bad as a drug when it becomes a habit.
(50) 22. Smoking is a valueless and expensive habit.
(51) 23. I do not approve of smoking because it leads to the acquisition of other expensive habits.
(52) 24. Smoking is all right for those who find it agreeable.
(53) 25. Smoking is an aid to poise and self-possession.
(54) 26. Smoking is a bad habit because it distracts one from the necessity of doing something.
(55) 27. I like to smoke while I am studying, because it lends a little distraction in the midst of tedious work.
(56) 28. Smoking is a source of harmless enjoyment.
(61) 29. Smoking is a bad habit because it is only a temporary relief for nervousness and later increases it.
(62) 30. Smoking just finishes off a good meal.
(63) 31. Smoking "breaks the ice" when people are not well acquainted.
(64) 32. Smoking leaves an after-taste which is very unpleasant.
(65) 33. Smoking merely satisfies for a while a craving which is intensified by the very act designed to give relief.
(67) 34. Smoking leads to other bad habits.
(68) 35. Smoking tends to make the mind become sluggish and inactive.
(70) 36. Smoking is harmful to everyone, regardless of sex, because nicotine is a poison.
(72) 37. Smoking is immoral, regardless of time, place, or person.
(74) 38. Smoking is a joy that all should partake of.
(75) 39. Smoking is a good way of passing leisure time.
(76) 40. Smoking makes one feel more at ease and more self-confident.
(77) 41. Smoking cheers up many a lonely hour.
(78) 42. Smoking helps a person to forget his troubles.
(79) 43. I can think better when I smoke.
(80) 44. Smoking is a comfort, physically and spiritually.
(81) 45. A cigarette is very consoling when one is hungry or cold.
(86) 46. Smoking is necessary to have a good party.
(87) 47. Smoking creates a feeling of friendliness.
(89) 48. Boys and girls should be taught and made to smoke since it is the socially approved custom.
(90) 49. I consider smoking just as desirable as candy-eating or any other harmless pastime appetite.
(92) 50. A person who does not smoke is a "wet-blanket".
(93) 51. Smoking prevents one from getting restful sleep.
(94) 52. Smoking makes one extremely nervous.
(95) 53. Smoking is good because it aids digestion.
(96) 54. Smoking takes away one's natural appetite for food.
(98) 55. Smoking harms the teeth.
(99) 56. Smoking weakens the lungs, and makes conditions better for the development of T.B.
(100) 57. Smoking is a bad habit because it is injurious to the eyes.
(101) 58. Smoking ruins the throat and voice.
(102) 59. Tobacco smoke, being a mild germ killer, is good for the throat and nasal passages.
(104) 60. Smoking by both parents will lead to degeneracy of the race.
(105) 61. Smoking is a good habit because it enables one to carry on in spite of exhaustion.
(106) 62. Smoking makes life more enjoyable.
(111) 63. Smoking makes for "pipe" dreams, an escape from reality, and is therefore bad.
(112) 64. Smoking is a good thing because it frequently provides an emotional outlet.
(113) 65. Smoking is conducive to absolute relaxation and is therefore beneficial to the nerves of the smoker.
(115) 66. Smoking is valuable to a nervous person because it gives him something to do with his hands.
(120) 67. Smoking is a good habit because it is conducive to concentration.
(123) 68. Smoking is neither a good nor a bad habit, but is merely a habit of indifferent character.
(124) 69. Smoking is an unnatural and perverted taste.
(132) 70. Smoking is a bad habit because the desire for a cigarette interferes with concentration on another activity.
(135) 71. Smoking is a good escape from boredom.
(136) 72. It is all right for a person to smoke occasionally to relieve nervous or emotional tension.
(139) 73. Smoking lowers one's resistance to disease and epidemics.
(140) 74. I have a feeling of disgust when I see anyone smoke.
(141) 75. It is no concern of mine whether people smoke or not.
(143) 76. I have a better time if I smoke than if I don't.
(145) 77. I get more enjoyment from smoking than from anything else I do.

As will be readily seen, some of these opinions do not describe the attitude variable; but in order to check Thurstone's criteria of ambiguity and irrelevance they were included among the 77.

⁹ Ibid., p. 31.
* This sentence was added by the experimenter.

5. Sorting of the opinions in the main experiment

At the time of the experiment, before any sorting took place, the experimenter explained in general the purpose of the scaling procedure, and emphasized the fact that it was not the subjects' attitude toward smoking that was concerned, but rather their judgment of the degree of favorableness or of unfavorableness expressed by the opinions.

Two hundred and twenty subjects performed the experiment, this number including the group of 60 mentioned above and 160 others, most of whom were sophomores. They served in groups of twenty-five or less, working at large tables. The experimenter was present at every session and read all of the directions. In addition, the professor whose class was serving helped to supervise the experiment. The 77 cards with the sheet of directions were placed in a 6| x inch manila envelope.

As a result of experience in the preliminary experiment we considered it necessary to revise Thurstone's directions. We found (1) that some subjects did not read all of the directions; (2) that many subjects who read all of the directions did not read them carefully; and (3) that some subjects who read the directions carefully did not understand them. Although this is true of any experiment, the proportion of the group was too large to ignore in this case. To make sure that some time was spent reading the direction sheet, and that everyone read it to the end, the experimenter read it aloud while the subjects read silently. The revised directions are as follows:

DIRECTIONS FOR SORTING CARDS

1. The 77 cards contain statements of attitude toward smoking. As a first step in the making of a scale that may be used in a test of opinions relating to smoking we want you to sort these 77 cards into eleven piles.
2. Think of a scale

A                    F                    K
extremely            neutral              extremely
favorable                                 unfavorable

with the other letters in order along this scale, all expressing a different degree of favorableness toward smoking.
You are given eleven cards with the letters A, B, C, D, E, F, G, H, I, J, K. Please arrange these before you in the following order:*

A B C D E F G H I J K

Under card A put those statements which you believe express the most favorable attitude toward smoking. Under card F put those statements which express the neutral position. Under card K put those cards which express the most unfavorable attitude toward smoking. Under the rest of the cards arrange statements in accordance with the degree of approval or disapproval in them. It will aid in placing these statements if you ask yourself, "How favorable toward smoking is the person who makes this statement?"
3. This means that when you are through sorting you will have eleven piles arranged in order of approval estimate from A, the highest, to K, the lowest, that is, B should express less approval than A, C less than B, and so on. The change in the degree of approval from A to B should be approximately equal to the change in the degree of approval from B to C, C to D, etc., that is, the intervals between the piles should be approximately equal.
4. Do not try to get the same number in each pile. They are not evenly distributed.
5. The numbers on the cards are code numbers and have nothing to do with the arrangement in piles.

Not only was the wording of the directions revised, but the procedure also was changed somewhat. Each step was timed, the time varying for different groups; as a rule the smaller groups took less time than the larger ones. We thought it best to allow that time which was adapted to the majority of the group. The primary purpose of the timing was to make certain that everyone at least attempted to do all that he was directed to do. During the preliminary experiment the subjects did the sorting as quickly as possible in order to get away.
The subjects of the experiment proper were told that the experiment would take a minimum of an hour and fifteen minutes, and that it would be impossible to leave early by hurrying. The following are the directions for timing. Each item was read by the experimenter after the majority of the group had completed the previous one:

EXPERIMENTER'S DIRECTION SHEET

For the following steps we are going to give you a definite amount of time. We will tell you when to begin and when to stop. Please work the entire time and follow the directions closely.

1. First of all, write on the top card your name, and the name of the professor who announced this procedure to you.
2. You will be given 10 minutes to make a rough sorting into the eleven piles. Go through all of the cards and place them in the correct pile as nearly as possible in this short time. (10 minutes)
3. Now make a more careful sorting. Go through all of the cards again, leaving the eleven piles before you the entire time. It is necessary to sort into piles a second time because the range of opinion is not realized until at least half of the cards are sorted. Moreover, it is easy to make mistakes. (10 minutes)
4. Write in the upper right-hand corner of each card with an opinion on it, the letter of the pile into which you placed it. This finishes the first step. (5 minutes)
5. Now rank the statements within each pile. For example, take statements in pile A and rank them from highest approval to least approval within that pile. Do the same for all the other piles, leaving the piles of cards in front of you as before. Put the card with the highest approval on the top of the pile. This will take you 15 minutes. Be sure to get through all of the cards this time. Do not change any of the cards from one pile to another. You will be given an opportunity to do this in the next step. (15 minutes)
6. This step is the most difficult of all of the steps and requires the greatest care.
Go through all of the statements once more, carefully comparing each statement with the two on either side of it, to be sure it belongs between them as to degree of approval. Make especially careful comparisons between the last two cards in A and the first two in B, the last two in B and the first two in C, and so on. If in this careful ranking you feel that some card in a pile really belongs in another pile, change it, without however changing the letter you previously placed on it. If you change a card from one pile to another, do not feel that it is because you were not careful enough in the pile method. We are comparing the two methods, and if you change the letter you have previously placed on a card, it will invalidate some of the results. (10 minutes)
7. Now write in the upper left-hand corner of each card with an opinion on it, the rank number you have given it. Start with the first card in A, that card which expresses greatest approval of smoking, and mark it 1, the second highest 2, and so on, continuing your counting from pile to pile. Suppose that the last card in A was 5, then the first card in B will be 6. Be sure that the cards of the adjoining piles are stacked so that each succeeding card expresses less approval than the preceding. Keep the cards in the separate piles as before. The last card numbered should be 77. (5 minutes)
8. When you are through, please put a rubber band around each pile, each with its letter card on top. Place the cards in the envelopes and return the pencils to the box at the front of the room.

The procedure was further changed by the addition of the task of ranking the statements. It was because we desired to compare the scales constructed by the method of equal appearing intervals and the order of merit method that we cut down the number of opinions in the preliminary experiment. The ranking procedure will be discussed in Chapter IV.
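The record that each card carries at the end of the session, namely the pile letter written in step 4 and the rank number written in step 7, can be illustrated with a short Python sketch. Everything in it is hypothetical: the function name, the data layout, and the placements of the example cards are invented for illustration and are not taken from the experimental data.

```python
PILE_LETTERS = "ABCDEFGHIJK"  # A = most favorable pile, K = most unfavorable


def number_cards(final_piles, written_letter=None):
    """Assign consecutive rank numbers across the piles (step 7).

    final_piles: dict mapping a pile letter to the card code numbers lying
        in that pile after step 6, ordered from highest to least approval.
    written_letter: optional dict mapping a card code to the letter written
        on it in step 4; a card without an entry is assumed to lie in the
        pile whose letter it bears.
    Returns a dict: card code -> (pile letter, rank number).
    """
    written_letter = written_letter or {}
    records = {}
    rank = 0
    for letter in PILE_LETTERS:
        for code in final_piles.get(letter, []):
            rank += 1  # counting continues from pile to pile
            records[code] = (written_letter.get(code, letter), rank)
    return records


# Hypothetical sorting of eight cards; the placements are illustrative only.
piles = {"A": [145, 74, 143, 106, 27], "B": [62, 56], "K": [72]}
# Card 62 was moved from A to B in step 6 and therefore keeps its letter A.
records = number_cards(piles, written_letter={62: "A"})
print(records[62])  # ('A', 6): the last card in A was 5, so B starts at 6
```

Keeping the letter and the rank number as separate entries reflects the instruction in step 6 that a card may change piles without any change in the letter already written on it, so that one sorting session yields data for both methods.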
*It was considered important to have the piles arranged in order, to insure the presence of the idea of the scale during the entire experiment. It was found necessary to supervise every subject individually to see that he followed the order indicated.

CHAPTER III

CONSTRUCTION OF A SCALE BY THE METHOD OF EQUAL APPEARING INTERVALS

10 Ibid., pp. 32-35.

1. Tabulation of cases

Because of the labor of tabulating the data, Thurstone^10 makes a plea for the elimination of subjects who have not understood the directions, or who have done the sorting in a careless and perfunctory manner, and whose sorting, as a consequence, is inconsistent with the sorting by the majority of subjects. As a criterion for the elimination of individual cases, he excluded all subjects who had placed 30 or more of his 130 statements in one of the eleven piles. He says that this eliminated many subjects who were known to do the work carelessly, or who showed in conversation that they had evidently failed to understand the directions. But this criterion seems arbitrary and, what is more important, insufficient for the removal of careless subjects. A subject may be careless and, at the same time, may not put approximately one fourth of the statements in one pile. Therefore, a different method was used for the exclusion of subjects. It is perchance a more subjective method, in that it depends somewhat on the judgment of the experimenter, but it seems to work more effectively than does Thurstone's. It is more effective in that it rules out many cases which would not have been rejected by the use of Thurstone's method, although it leaves in a few which would have been thrown out by his rule. There are some cases not excluded by his method which are more inconsistent than many which were. From the tabulation of the sorting in the preliminary experiment, a general impression of the scaling of each statement was secured.
By merely glancing at the statements, it can usually be determined whether they are favorable, unfavorable, or neutral. This is especially true with regard to the statements usually placed in the extreme piles. Figure 4 and the discussion of it, pages 30-32, give the evidence upon which this statement is based. To eliminate cases, the statements placed in the extreme piles, A and K, and in the neutral pile F were surveyed. If pile A contained extremely favorable, extremely unfavorable, and neutral statements, and if piles F and K also included statements which were placed by the majority in piles far distant from each other, the case was thrown out. After a little experience with the statements, it was very easy to spot the subjects who either did not understand or were careless.

Thurstone^11 stresses the fact that it is important not to load the results, that is, not to bring about an artificial selection of cases. To determine whether the percentage of cases rejected made this a possibility, the cases eliminated by the two methods, Thurstone's and that just described, were compared. Thurstone excluded 41 out of 341 cases, or 12 per cent, whereas in this study we excluded 20 out of 220, or 9 per cent of the cases. Inasmuch as we eliminated a smaller percentage of cases than did Thurstone, unless the method of exclusion was an illegitimate one, we did not load the results to any degree.

11 Ibid., p. 35.

2. Determination of scale values

Thurstone's technique was followed in the remaining steps. The tabulations for each statement were converted into cumulative proportions, which are given in Table I. The first column gives the code number of the statements, the second their scale values, and the third their Q-value, which is a measure of ambiguity, and which will be explained in a later paragraph. The remaining columns give the cumulative frequencies for each statement.
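The conversion of one statement's pile tallies into the cumulative proportions of Table I can be sketched as follows. This is only an illustration: the function name is hypothetical, and the tallies are not the original counts but figures back-calculated so that the result reproduces the Table I row for statement 10.

```python
from itertools import accumulate

PILES = "ABCDEFGHIJK"  # eleven piles, A = greatest approval of smoking


def cumulative_proportions(counts):
    """Turn per-pile tallies for one statement into cumulative proportions."""
    total = sum(counts)
    return [round(running / total, 2) for running in accumulate(counts)]


# Illustrative tallies for 200 judges (assumed, chosen to match Table I, statement 10).
counts_10 = [0, 0, 0, 0, 0, 0, 14, 50, 56, 52, 28]
row_10 = cumulative_proportions(counts_10)

for pile, value in zip(PILES, row_10):
    print(pile, value)
```

Each entry is the proportion of judges who placed the statement in that pile or in any pile to its left, so the last entry is necessarily unity.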
The method of reading this part of the table, with regard to statement number 10 as an example, is as follows: No one of the 200 subjects placed statement number 10 in the first six piles; 7 per cent placed it in pile G; 32 per cent placed it in G or H; 60 per cent placed it in G, H, or I; etc.

These cumulative frequencies were plotted. Figures 1, 2, and 3 are examples of the graphs. Referring to Figure 1, the curve crosses the 50% line at the interpolated value of 7.8, which is the scale value assigned to the opinion. The scale value is therefore the median of the distribution of placements for that statement. Likewise the interpolated values of the quartile points of the curve were determined. The distance between them is taken as a measure of the ambiguity of the statement. The quartile points for Figure 1 are 8.5 and 7.2; the distance between them, 1.3, is called by Thurstone the Q-value of the statement. If a statement is very ambiguous, it will be placed over a great range of piles, the distance between the quartiles will be great, and the Q-value will be large. On the other hand, if the statement is placed rather consistently by the various subjects, it will have a small Q-value.

Figures 2 and 3 represent curves for statements at the extreme ends of the scale. Their scale values are obtained by extrapolation of the curve. The Q-value of Figure 2 is equal to the upper quartile distance doubled; that of Figure 3 is equal to the lower quartile distance doubled. It should be noted that for the latter the entry in column K is not plotted, for of necessity it is unity, since all judgments which would have placed the statement beyond point 11.0 are included in K. The frequency distributions for the statements scaled at the extremes suffer from what Thurstone calls an "end effect", which produces a skewed curve.
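The reading of the median and quartile points from a row of cumulative proportions can be sketched numerically. The sketch below assumes straight-line interpolation between pile boundaries, whereas the thesis read the values from a drawn curve, so small differences are to be expected. The function name is hypothetical, and the extrapolation used for extreme statements is not attempted.

```python
def crossing(cum, level):
    """Scale position at which the cumulative curve reaches `level`.

    cum[i] is the cumulative proportion at the upper boundary of pile i,
    that is, at scale point i + 1; the curve is taken to start at 0.
    """
    previous = 0.0
    for i, value in enumerate(cum):
        if value >= level:
            return i + (level - previous) / (value - previous)
        previous = value
    raise ValueError("cumulative curve never reaches the requested level")


# Row for statement 39 as printed in Table I (piles A through K).
cum_39 = [.00, .00, .00, .00, .00, .00, .16, .58, .88, .99, 1.00]

scale_value = crossing(cum_39, 0.50)                    # median
q_value = crossing(cum_39, 0.75) - crossing(cum_39, 0.25)  # interquartile distance

print(round(scale_value, 2), round(q_value, 2))
```

For statement 39 the straight-line values fall close to the graphical ones quoted in the text (7.8 for the median, 1.3 for Q).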
The end effect means that all statements which would have been placed by some people in piles to the left of pile A are placed in A, and all statements which would have been placed by some individuals to the right of pile K are placed in K. This "effect" cannot be avoided, since it is in the nature of the method that the scale must be arbitrarily cut off at either end. The scale values for these statements are therefore not so accurate as those for the other statements.

Figure 1. Cumulative Frequency Curve for Statement 39, Method of Equal Appearing Intervals

Figure 2. Cumulative Frequency Curve for Statement 145, Method of Equal Appearing Intervals

Figure 3. Cumulative Frequency Curve for Statement 70, Method of Equal Appearing Intervals

State-   Scale               Accumulative Proportions
 ment    value     Q     A     B     C     D     E     F     G     H     I     J     K
                       0-1   1-2   2-3   3-4   4-5   5-6   6-7   7-8   8-9  9-10 10-11
    1     5.4   0.6   .00   .01   .01   .03   .18   .90   .99  1.00  1.00  1.00  1.00
    5     9.2   1.8   .00   .00   .00   .00   .00   .00   .06   .19   .44   .73  1.00
    7     5.5   0.6   .00   .00   .01   .02   .11   .87   .94   .96   .99  1.00  1.00
   10     8.7   1.7   .00   .00   .00   .00   .00   .00   .07   .32   .60   .86  1.00
   15     5.5   0.4   .01   .01   .01   .02   .03  1.00  1.00  1.00  1.00  1.00  1.00
   20     4.4   1.0   .02   .05   .13   .30   .85  1.00  1.00  1.00  1.00  1.00  1.00
   21     4.6   0.9   .00   .01   .04   .19   .75   .98   .99  1.00  1.00  1.00  1.00
   25     8.7   1.9   .00   .00   .00   .00   .00   .01   .09   .31   .58   .75  1.00
   26     8.6   1.9   .00   .00   .00   .00   .00   .00   .10   .36   .62   .87  1.00
   27     2.8   1.7   .04   .29   .61   .86   .99   .99   .99  1.00  1.00  1.00  1.00
   29     8.2   1.6   .00   .00   .00   .00   .00   .00   .18   .43   .77   .93  1.00
   30     5.7   0.7   .00   .00   .00   .00   .08   .72   .97   .99   .99  1.00  1.00
   31     9.9   2.6   .00   .00   .00   .00   .00   .00   .12   .20   .31   .52  1.00
   32     6.3   1.2   .00   .00   .02   .04   .10   .36   .79   .92   .96   .99  1.00
   34     7.8   1.8   .00   .00   .00   .00   .00   .00   .25   .55   .80   .95  1.00
   35     4.0   1.3   .05   .12   .23   .47   .92   .99  1.00  1.00  1.00  1.00  1.00
   39     7.9   1.2   .00   .00   .00   .00   .00   .00   .16   .58   .88   .99  1.00
   40     7.4   1.5   .00   .00   .00   .00   .01   .03   .37   .70   .89   .97  1.00
   43     5.1   1.0   .00   .00   .00   .02   .44   .87   .99  1.00  1.00  1.00  1.00
   46     9.6   1.4   .00   .00   .00   .00   .01   .01   .03   .07   .27   .65  1.00
   48     9.7   1.6   .00   .00   .00   .00   .00   .00   .07   .16   .30   .63  1.00
   50     8.4   2.0   .00   .00   .00   .00   .00   .01   .14   .37   .65   .85  1.00
   51     7.8   1.5   .00   .00   .00   .00   .01   .01   .20   .59   .80   .93  1.00
   52     5.1   0.8   .00   .00   .00   .03   .42   .95   .98  1.00  1.00  1.00  1.00
   53     2.2   1.6   .12   .44   .74   .93  1.00  1.00  1.00  1.00  1.00  1.00  1.00
   54     7.9   1.4   .00   .00   .00   .00   .00   .00   .14   .55   .82   .94  1.00
   55     2.7   1.4   .06   .24   .63   .88  1.00  1.00  1.00  1.00  1.00  1.00  1.00
   56     2.9   2.2   .10   .28   .48   .75   .97  1.00  1.00  1.00  1.00  1.00  1.00
   61     8.9   1.3   .00   .00   .00   .00   .00   .00   .04   .20   .56   .90  1.00
   62     2.7   1.6   .11   .42   .60   .89   .99  1.00  1.00  1.00  1.00  1.00  1.00
   63     3.2   1.5   .03   .16   .43   .80   .99   .99  1.00  1.00  1.00  1.00  1.00
   64     7.2   1.5   .00   .00   .00   .00   .01   .01   .44   .75   .95  1.00  1.00
   65     7.9   1.9   .00   .00   .00   .01   .05   .12   .29   .52   .79   .93  1.00
   67     8.6   1.6   .00   .00   .00   .00   .00   .00   .09   .29   .63   .90  1.00
   68     8.9   1.5   .00   .00   .00   .00   .00   .00   .05   .22   .52   .89  1.00
   70    10.7   0.8   .00   .00   .00   .00   .00   .00   .00   .03   .08   .26  1.00
   72    11.0   0.8   .00   .00   .00   .00   .00   .00   .02   .02   .04   .07  1.00
   74     0.6   1.0   .73   .90   .95   .98  1.00  1.00  1.00  1.00  1.00  1.00  1.00
   75     3.3   1.4   .02   .13   .36   .75  1.00  1.00  1.00  1.00  1.00  1.00  1.00
   76     2.4   1.5   .10   .39   .73   .95  1.00  1.00  1.00  1.00  1.00  1.00  1.00
   77     2.4   1.4   .11   .42   .66   .90  1.00  1.00  1.00  1.00  1.00  1.00  1.00
   78     2.6   1.5   .08   .27   .62   .91  1.00  1.00  1.00  1.00  1.00  1.00  1.00
   79     1.8   1.3   .14   .56   .85   .97  1.00  1.00  1.00  1.00  1.00  1.00  1.00
   80     1.2   2.0   .45   .71   .90   .97  1.00  1.00  1.00  1.00  1.00  1.00  1.00
   81     2.7   1.5   .05   .28   .62   .92  1.00  1.00  1.00  1.00  1.00  1.00  1.00
   86     1.9   1.9   .23   .52   .74   .91  1.00  1.00  1.00  1.00  1.00  1.00  1.00
   87     2.6   1.6   .08   .31   .62   .91  1.00  1.00  1.00  1.00  1.00  1.00  1.00
   89     1.1   0.6   .80   .87   .91   .94   .99   .99   .99   .99  1.00  1.00  1.00
   90     2.7   2.8   .20   .41   .56   .73   .98   .99  1.00  1.00  1.00  1.00  1.00
   92     1.3   2.8   .52   .68   .81   .89   .95   .95   .96   .97   .98   .98  1.00
   93     8.4   1.5   .00   .00   .00   .00   .00   .00   .10   .36   .73   .93  1.00
   94     9.0   1.3   .00   .00   .00   .00   .00   .00   .02   .12   .48   .85  1.00
   95     2.1   1.7   .17   .49   .78   .94  1.00  1.00  1.00  1.00  1.00  1.00  1.00
   96     8.0   1.6   .00   .00   .00   .00   .01   .01   .15   .50   .77   .96  1.00
   98     8.5   1.6   .00   .00   .00   .00   .00   .00   .10   .32   .66   .94  1.00
   99     9.5   1.3   .00   .00   .00   .01   .01   .01   .02   .08   .29   .69  1.00
  100     9.0   1.2   .00   .00   .00   .00   .00   .00   .02   .14   .49   .85  1.00
  101     9.3   1.3   .00   .00   .00   .00   .00   .01   .03   .10   .37   .74  1.00
  102     2.5   1.6   .11   .42   .66   .91  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  104    10.4   1.2   .00   .00   .00   .00   .00   .00   .01   .03   .09   .33  1.00
  105     1.7   1.4   .22   .60   .85   .97  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  106     1.5   1.8   .37   .66   .85   .95  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  111     8.0   1.5   .00   .00   .00   .00   .00   .00   .17   .49   .77   .94  1.00
  112     2.1   1.3   .10   .47   .79   .95  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  113     1.6   1.4   .26   .62   .86   .96  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  115     3.1   1.4   .04   .17   .45   .89  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  120     1.6   1.3   .24   .68   .91   .98  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  123     5.4   0.6   .00   .00   .00   .00   .01   .84   .98   .99  1.00  1.00  1.00
  124     8.8   2.0   .00   .00   .00   .00   .00   .01   .11   .37   .54   .72  1.00
  132     8.4   1.5   .00   .00   .00   .00   .00   .01   .11   .37   .72   .90  1.00
  135     3.3   1.4   .03   .11   .41   .77   .99   .99  1.00  1.00  1.00  1.00  1.00
  136     4.4   0.8   .01   .02   .08   .26   .89   .96   .99  1.00  1.00  1.00  1.00
  139     9.0   1.7   .03   .03   .05   .05   .07   .07   .13   .25   .50   .83  1.00
  140    10.1   0.8   .00   .00   .00   .00   .00   .00   .03   .09   .13   .39  1.00
  141     5.5   0.6   .01   .98  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  143     2.9   1.8   .07   .25   .54   .82   .98  1.00  1.00  1.00  1.00  1.00  1.00
  145     0.2   1.0   .86   .96   .98  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00

Table I. Summary of Sorting of 77 Statements by 200 Persons

3. Selection of statements for the experimental scale

After securing the scale values for all the statements, it is possible to arrange them into a scale. But the scale we are seeking is one which has statements more or less equally spaced along it. The statements were selected for the final scale on the basis of four points: (1) the criterion of ambiguity, (2) the criterion of irrelevance, (3) the scale value, and (4) inspection of the statements.

We thought a fifth point, introspection, also might offer some aid in the selection. Therefore, we requested the first sixty subjects to mark the five opinions with which they had had the most difficulty, and to write on the other side of the card the reason for the difficulty. We classified the reasons into three groups: those which claimed that the opinion did not describe the attitude variable, those which claimed that the opinion was ambiguous, and those which expressed irrelevant criticisms. The results of these introspective reports are as follows:
(1) Ten of the seventy-seven statements were not marked, that is, they gave no difficulty.
(2) Thirty-one escaped the charge of ambiguity. We determined by an objective method, which is described above, that nine of these thirty-one statements were ambiguous, but they were not recognized as such by the subjects. Only ten of the forty-six indicated as ambiguous proved to be ambiguous by this same objective method.
(3) Eleven statements were marked because they did not describe the attitude variable. Only two of these were eliminated for this reason.
(4) Three-fifths of the introspections were classified as irrelevant.
Therefore, we conclude that introspections cannot aid in the selection of statements.

The criterion of ambiguity gives us a basis for eliminating unsuitable statements. Other things being equal, a statement with high ambiguity should be eliminated. An ambiguous statement is one which carries more than one meaning, that is, one which is judged from different standpoints by the subjects. A glance at Table I will show that the ambiguity varies for different statements. It becomes important to know whether the ambiguity varies for different parts of the scale. The Q-value throughout the scale was plotted in order to determine how uniform it was. This is represented in Figure 4. The curve dips markedly at the neutral point and at the ends. This means that the neutral and extreme statements are the easiest to discriminate or place. This result is not surprising. Thurstone suggests that it is ideal to have the statements of the same average ambiguity throughout the scale. Although this is not true for our scale, the average ambiguity in all parts of the scale is lower than that of the corresponding scale units of Thurstone's scale of attitude toward the church.

The second criterion used in the selection of statements is that of irrelevance. This criterion is concerned with the indorsements of statements. The entire list of 77 statements was mimeographed and presented to 200 subjects who were requested to indorse those statements with which they agreed.
There is a certain amount of inconsistency in the indorsement of opinions, that is, inconsistency from the viewpoint of the experimenter. This may be due to inconsistency of thinking by the subjects, to carelessness, to both, or to neither. But the inconsistencies vary with the statement which is chosen as a basis of comparison with the others. This variation is taken to mean that the statements themselves are defective. Thurstone^13 describes the criterion thus:

This criterion is constructed as follows: Suppose that a statement of low ambiguity is properly scaled at the point 6. If a subject has an attitude which is also scaled properly at the point 6, then we should expect him to check that statement. Another subject who is scaled at the point 12 should be less likely to check that statement, and similarly there should be a low probability that a subject at the point zero will check the statement at 6 on the scale.

In order to make this type of analysis quantitative we have devised a rather crude index of similarity which is based on the voting of any large group of subjects. The index of similarity for any pair of statements is based on three facts, namely,

$n_a$ = the total number of subjects who indorse statement a in the comparison;
$n_b$ = the total number of subjects who indorse statement b in the comparison;
$n_{ab}$ = the total number of subjects who indorse both a and b.

If the two statements a and b are practically identical in the attitudes they reflect, then we should expect to find that those subjects who indorse statement a will also indorse statement b. This factor, $n_{ab}$, will therefore be in the numerator of the index of similarity. On the other hand, the statements vary considerably in intrinsic popularity even when they are scaled at identical points on the scale. The more popular a statement is, the larger will be the number of people who indorse it and any other statement.
In order to reduce the index of similarity to the same basis of popularity for all statements, the number of subjects who indorse both statements is divided by the product of the number of total indorsements for each of the two statements so that the index of similarity becomes

\[ \frac{n_{ab}}{n_a \cdot n_b} \]

If we tabulate the indices for statement a with each of all the other statements in turn, we shall have the common factor $1/n_a$ which may be disregarded since it is a constant. We shall have then

\[ \text{Index of similarity for statement } a = \frac{n_{ab}}{n_b} \]

This index is written for the comparison of statement a with each of the others. It is evident that the maximum possible value for this index is unity and its minimum value zero. If all the people who indorse statement a also indorse statement b, then the index of similarity is unity as it should be, because the two statements are then very similar in the attitudes reflected. If, on the other hand, none of those who indorse statement a indorse statement b, then the index is zero, and this is reasonable because the two statements are then evidently very different in the attitudes which they describe.

Figures 5, 6, and 7 represent the plotted indices of similarity. The small arrow on the top line of the figure represents the scale value of the statement. The criterion of irrelevance is the appearance of the graph. If the plotted points are relatively high near the arrow and become relatively low with increasing distance from it, the statement is satisfactory. This means that persons who indorse statements in one part of the scale are not so likely to indorse statements in parts of the scale distant from it. If, on the other hand, the points go more or less horizontally across the page, the statement is unsatisfactory. This means that people who indorse statements in one part of the scale are just as likely to indorse statements in other parts of the scale.
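The tabulation behind the index of similarity can be sketched as follows. The function name and the five indorsement records are hypothetical and serve only to show the counting; the statement codes are borrowed as labels and the data are not the thesis's.

```python
def similarity_indices(indorsements, a):
    """Index of similarity n_ab / n_b of statement `a` with every other statement.

    indorsements: one set of indorsed statement codes per subject.
    Statements that nobody indorsed do not appear in the result.
    """
    n = {}     # n_b: subjects indorsing statement b
    n_ab = {}  # subjects indorsing both a and b
    for checked in indorsements:
        for b in checked:
            n[b] = n.get(b, 0) + 1
            if a in checked:
                n_ab[b] = n_ab.get(b, 0) + 1
    return {b: n_ab.get(b, 0) / n_b for b, n_b in n.items() if b != a}


# Invented indorsement records for five subjects (illustration only).
subjects = [{106, 113, 120}, {106, 113}, {106, 61}, {61, 72}, {72}]
indices = similarity_indices(subjects, a=106)

for code in sorted(indices):
    print(code, indices[code])
```

Plotting these indices against the scale values of the other statements gives the kind of graph on which the criterion of irrelevance is judged.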
A horizontal plot is probably due to the fact that people in different parts of the scale are indorsing the same statement for different reasons. Figures 5 and 6 are satisfactory, whereas Figure 7 is unsatisfactory.

Because the determination of the criterion of irrelevance is such a laborious task, involving the tabulation of the indorsements of every statement with every other statement, it was shortened by tabulating only 75 out of the 200 cases. Moreover, an attempt was made to shorten the calculation. A preliminary selection of statements was made on the basis of the criterion of ambiguity, scale value, and inspection of the statements. Because we desired to have the statements as equally spaced as possible along the scale, the scale value of the statements plays a part in the elimination of statements. Selection by inspection means selection on the basis of clearness and wording. Some statements can be readily identified as much better than others. In this way 45 statements were selected. The remainder, 14 in number, were eliminated on the basis of the criterion of irrelevance. These fourteen were statements satisfactory with regard to ambiguity, scale value, and inspection.

The criterion of irrelevance was determined for each of these 45 statements with all of the other 44. Inasmuch as it is the appearance of the graph that is the essence of the criterion of irrelevance, so long as the statements occurred in every part of the scale, it was thought unnecessary to plot each of the 45 statements with every one of the other 76. To test this conclusion, the first ten of the 45 statements were each plotted with the other 76 and with the other 44. The same conclusions were drawn concerning the value of the plot in these ten cases.

In the opinion of the experimenter, the criterion of irrelevance is not worth the tremendous amount of labor required for its determination. A good selection of statements can be made without it.
The evidence for the conclusion that the criterion of irrelevance can be dispensed with is as follows: Before the criterion was determined, a preference was recorded among the statements whose selection was doubtful on the basis of ambiguity, scale value, and inspection. In ten out of fourteen cases the correct preference was made. In the other four the criterion of irrelevance gave satisfactory plots, but not so satisfactory as for those selected. Also, five statements were selected which, it was thought, would give a horizontal plot. The selection again was correct. In other words, the determination of the indices of similarity is probably unnecessary in most cases. As if to corroborate this conclusion, Thurstone's* latest publication concerning a technique for scaling makes no mention of the use of this criterion in the selection of statements.

Thurstone says nothing about any peculiar results for his nearly neutral statements, that is, those on either side of the neutral point. Those selected in this experiment do not meet the criterion of irrelevance satisfactorily. The plot of all of them goes more or less horizontally across the diagram. Figure 7 is one of those selected and is typical. A glance at the statements may give a clew to the reason. It was impossible, however, to select statements which met all four points satisfactorily. The statements read:

20. I approve of smoking though I seldom smoke myself. (scale value 4.4)
21. I like smoking but I do not miss it when a cigarette is not available. (4.6)
43. I smoke occasionally but I don't care anything about it. (5.1)
32. Smoking is just like any other appetite, a bad thing if indulged in to the extreme. (6.3)

It is easily understood why people from all parts of the scale indorse these statements.

13 Ibid., p. 45.
15 Ibid., p. 47.
* Thurstone, L. L.: "A Scale for Measuring Attitude toward the Movies," Journal of Educational Research, 1930, Vol. 22, pp. 89-94.

Figure 4. Average Ambiguity throughout the Scale

Figure 5. Criterion of Irrelevance for Statement 106

Figure 6. Criterion of Irrelevance for Statement 61

Figure 7. Criterion of Irrelevance for Statement 43

4. The experimental scale

The 31 statements selected were put into a scale which is represented in Figure 8. The numbers inside the small circles are the code numbers of the statements. In all but three scale steps there are three statements. The last statement, scaled at 11.2, was added for the purpose of completeness. The fact that there is only one statement in this step makes no difference because no one in the 200 cases indorsed it. The first scale unit has in it two statements, which were indorsed by one and two persons, respectively, out of the 75 cases tabulated. The indorsement of these statements, therefore, affects the score of only two subjects. Only one statement occurred in the seventh scale step.

An inspection of the range of scale values brings out a difficulty of scaling attitude. The scale values extend from 0.2 to 11.2. Probably most people would agree that the extremely favorable attitude toward smoking, and toward other issues as well, is not as extreme as the extremely unfavorable attitude, that is, the distance from the neutral point to the favorable end is not so great as the distance from the neutral point to the unfavorable end. A glance at the two statements at the extreme ends will illustrate this point:

145. I get more enjoyment from smoking than from anything else I do.
72. Smoking is immoral regardless of time, place, or person.

The neutral point is about 5.5 although the sixth pile was the neutral one. The cumulative proportion curve brings the value to the left. This is merely a mechanical result of drawing it. The two most neutral statements have values of 5.4 and 5.7, and from the tabulation of their sortings it can be seen that the neutral point belongs in between.
Taking 5.5 to be the neutral point, the distance between this and the extremely favorable end, 0.2, is 5.3; that between 5.5 and the extremely unfavorable end, 11.2, is 5.7 scale units. Therefore, the distances differ as they should.

Figure 8. Experimental Scale of 31 Statements

5. Reliability of the scale values

Having completed the scale by the method of equal appearing intervals, we turn to some measures of reliability of the scale values. Thurstone^15 suggests two methods for determining their reliability. The first method, which gives an approximate estimate of reliability of the scale values, is the following: The Q-value is twice the quartile deviation of the distribution of each opinion on the subjective scale, that is, it is equal to the quartile distance. Therefore,

\[ Q = 2q \]

A comparison of the values obtained by Thurstone and in this experiment follows:

                                       Our results        Thurstone's results
                                       (31 statements)    (45 statements)
Average Q-value of the statements
in the experimental scale                  1.23                1.67
q                                          0.63                0.84

The standard deviation of the distribution of scale values, assuming a normal distribution, is, therefore, on the average,

                                       Our results        Thurstone's results
$\sigma_{dist.} = q / 0.67$            0.92 scale units   1.25 scale units

The scale value of an opinion is the median of its distribution on the subjective scale. Hence, the standard error of the scale value is, again assuming a normal distribution,

                                                Our results              Thurstone's results
$\sigma_{med.} = 1.25\,\sigma_{dist.}/\sqrt{n}$   .08 when n = 200         .09 when n = 300
$p.e._{med.} = 0.67\,\sigma_{med.}$               0.67 x .08               .06 scale units
                                                = .054 scale units

Thus it is seen that the reliability of Thurstone's church scale and that of the smoking scale are just about the same, although the $\sigma$'s of the distributions differ a good deal. This result has been obtained with 200 cases instead of 300, which shows that 200 cases were sufficient to stabilize the scale values. There are several reasons for this result: (1) More effective elimination of the cases of carelessness and misunderstanding decreased the ambiguity of the statements.
(2) Fewer statements are included in the present experimental scale than in Thurstone's. An additional 14 statements probably would have included some of greater ambiguity, and hence would have raised the values. The average ambiguity for the entire group of 77 statements is 1.42. A substitution of this value into the formulas above gives the values:

$\sigma_{dist.} = 0.71 / 0.67 = 1.06$ scale units
$\sigma_{med.} = .09$ when n = 200
$p.e._{med.} = .06$ scale units

The last two values are again the same as Thurstone's, and the $\sigma$ of the distribution is smaller as before. The latter would have been less for 45 statements, the number Thurstone used in his calculation. (3) Thurstone's values are for different opinions, which concern a different issue, the church.

A second method of estimating reliability is to determine the change in stability of scale value as the size of the group changes. For this purpose the scale values were determined for the first 140 subjects who sorted the statements. The sorting continued until 200 subjects had done it. The scale values were then calculated for the entire group of 200. Again we will compare Thurstone's results and those of this experiment.

Our results
Change in scale value from 140 subjects to 200 subjects
Mean discrepancy for the whole list of 77 statements = 0.06 scale units.
Mean discrepancy for the 31 statements in the experimental scale = 0.03 scale units.

Thurstone's results
Change in scale value from 150 subjects to 300 subjects
Mean discrepancy for the whole list of 130 statements = 0.074 scale units.
Mean discrepancy for the 45 statements in the experimental scale = 0.056 scale units.

A valid comparison between these two sets of results cannot be made. Thurstone used a greater number of cases and a greater number of statements, which concerned a different issue. Again, part of the difference is probably due to better experimental conditions and a better method of eliminating careless cases.
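The arithmetic of the first reliability method can be checked with a short sketch. It uses the rounded constants 0.67 and 1.25 that appear in the text (more exact values would be about 0.6745 and 1.2533), and the function name is hypothetical.

```python
import math


def scale_value_reliability(mean_q, n):
    """Return (sigma of distribution, standard error of median, probable error).

    mean_q: average Q-value (interquartile distance) of the statements.
    n: number of judges. A normal distribution of placements is assumed.
    """
    q = mean_q / 2                        # quartile deviation
    sigma_dist = q / 0.67                 # q is about 0.67 sigma for a normal curve
    sigma_med = 1.25 * sigma_dist / math.sqrt(n)
    pe_med = 0.67 * sigma_med
    return sigma_dist, sigma_med, pe_med


scale_31 = scale_value_reliability(1.23, 200)   # the 31-statement scale
whole_77 = scale_value_reliability(1.42, 200)   # the whole list of 77 statements

print([round(v, 3) for v in scale_31])
print([round(v, 3) for v in whole_77])
```

Both sets of figures agree, after rounding, with the values quoted in the text for the smoking scale.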
Whatever the comparison with Thurstone's results may show, it can be said that for this experiment 200 cases were sufficient to make the scale values stable.

15 Ibid., pp. 42-44.

CHAPTER IV

CONSTRUCTION OF A SCALE BY THE ORDER OF MERIT METHOD

We desired to make a comparison between the scale constructed by the method of equal appearing intervals and that constructed by the order of merit method. It was necessary, therefore, to use the same statements and the same subjects for both. It will be remembered that after the subjects had sorted the 77 statements into piles, they also ranked the opinions within the piles. (See directions 5, 6, and 7, pages 17-18.) This method was followed, inasmuch as the sorting into piles was considered the best preliminary procedure for so many statements. The number of statements was cut down from 145 to 77 in the preliminary experiment, because it would have required too much time to sort into piles and rank 145 statements.

Of the 31 statements selected for the experimental scale by the method of equal appearing intervals, we chose 20 statements, approximately equally spaced along the scale. This was for the purpose of decreasing the amount of work necessary. Twenty statements involve n(n-1)/2, or 190, possible pairs of statements, whereas 31 statements would involve 465 possible pairs of statements. Later explanation will show the significance of the pairs of statements.

1. Tabulation of the rank orders

Preliminary to tabulation of the rank order data, it was necessary to select from the sorting of each subject the twenty statements to be used, keeping them in the rank order into which they had been placed relative to each other, and then to renumber them from 1 to 20. The procedure for scaling statements by the order of merit method is that outlined by Thurstone,^16 except for a few changes which were suggested by him in a later paper.^17 These changes are instituted at this point.
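The selection and renumbering of the chosen statements within one subject's ranking, and the count of pairs, can be sketched as follows. The function names and the short ranking are hypothetical; a real ranking would list all 77 code numbers and keep 20 of them.

```python
def rerank(full_order, chosen):
    """Renumber the chosen statements from 1, preserving their relative order.

    full_order: one subject's statement codes, from greatest to least approval.
    chosen: the set of statement codes retained for the order of merit scale.
    """
    kept = [code for code in full_order if code in chosen]
    return {code: rank for rank, code in enumerate(kept, start=1)}


def number_of_pairs(n):
    """Number of distinct pairs among n statements, n(n-1)/2."""
    return n * (n - 1) // 2


# Invented fragment of one subject's ranking (illustration only).
ranking = [145, 74, 89, 80, 106, 113]
new_ranks = rerank(ranking, chosen={145, 89, 113})

print(new_ranks)
print(number_of_pairs(20), number_of_pairs(31))
```

The pair counts show why the reduction from 31 to 20 statements cuts the labor to well under half.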
Thurstone made a comparison of some experimental proportions from a Doctor's Thesis by Kate Hevner and the calculated proportions secured by the use of his equation below. Miss Hevner,^18 using handwriting specimens, constructed a scale by the order of merit method. She asked 370 subjects to arrange 20 specimens in rank order. From the data she tabulated the number of subjects who placed each one of the twenty specimens higher than every other specimen. From a table of this sort, she made a second, showing the proportion of all the subjects who placed each specimen higher than every other specimen. The tabulating necessary is very laborious.

Thurstone has constructed an equation which can be used to secure these proportions. It saves a great deal of work in the tabulating, because it is only necessary to tabulate the number of subjects who placed each one of the twenty specimens in each one of the rank orders. He compared the proportions calculated by means of his equation and the experimental proportions secured by Miss Hevner. He found that the average discrepancy, disregarding sign, was .0078, which, he says,^19 "constitutes practical justification for equation (1) as a method of estimating the proportions of the constant method when the experimental procedure was that of rank order." Therefore, Thurstone concludes^20 that "it is not necessary in the order of merit method to tabulate separately all of the n(n-1) judgments for each subject that are implied in his arrangement of n stimuli in a single rank order. It is possible to estimate the proportions directly from a frequency table of rank orders for each specimen." The equation expresses "the proportion of subjects which perceive B higher than A in terms of the frequencies with which the two specimens are placed in the n rank orders".
It is written as follows:

\[ P_{B>A} = \Sigma\,(p_{Ak} \cdot p_{B>k}) + \tfrac{1}{2}\,\Sigma\,(p_{Ak} \cdot p_{Bk}) \tag{1} \]

in which
$P_{B>A}$ = proportion of subjects who perceive B higher than A;
$\Sigma\,(p_{Ak} \cdot p_{B>k})$ = the probability that B will be perceived in a rank higher than A;
$\tfrac{1}{2}\,\Sigma\,(p_{Ak} \cdot p_{Bk})$ = the probability that both A and B will be perceived in the same rank order interval and that B will be perceived higher than A.

Therefore, the procedure suggested in this paper was used. The rank orders assigned by the 200 subjects were tabulated for the 20 specimens.

16 Thurstone, L. L.: "The Measurement of Opinion," J. of Abn. & Soc. Psych., Vol. XXII, 1928, pp. 415-430.
17 Thurstone, L. L.: "Rank Order as a Psychophysical Method," J. of Exper. Psych., Vol. XIV, 1931, pp. 187-201.
18 Hevner, K.: A Comparative Study of Three Psychophysical Methods, Doctor's Thesis, The University of Chicago.
19 Thurstone, L. L.: "Rank Order as a Psychophysical Method," loc. cit., p. 199.
20 Ibid., p. 201.

3. Determination of scale values

The rank order frequencies were converted into proportions, which are given in Table II. The first column gives the code numbers of the statements. The second column represents their rank order as determined by the method of equal appearing intervals. The remaining columns represent the proportion of the 200 subjects who placed the statement in the rank order at the top of the column. Thus, column three reads as follows: the entry opposite opinion number 145 is the proportion of the 200 cases who ranked it first, the entry opposite statement 113 is the proportion who ranked it first, and so on.

Table III shows the calculation for estimating the proportion of subjects who judged statement 2 to be more unfavorable than statement 1. The first column is a list of the twenty rank orders. For each opinion a strip was prepared similar to the second and third columns. Columns four and five represent such a strip for the second statement. The sixth column is a product of columns two and five for each of the rank orders. In calculation the entries in these last two columns were not recorded, the products being allowed to total on the calculating machine.
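The estimate of a pairwise proportion from two rank-order distributions can be sketched as below. This is an illustration under stated assumptions, not a transcription of Table III: rank numbers are taken to increase toward the unfavorable end, cases in which both statements fall in the same rank are assumed to be split evenly (the factor of one half), and the three-rank distributions are invented.

```python
def proportion_beyond(p_a, p_b):
    """Estimated proportion of judges who place B beyond A (in a later rank).

    p_a[k], p_b[k]: proportion of judges who put A (or B) in rank k.
    """
    total = 0.0
    for k, share_a in enumerate(p_a):
        b_later = sum(p_b[k + 1:])                 # B in any rank after k
        total += share_a * b_later                 # A and B in different ranks
        total += 0.5 * share_a * p_b[k]            # same rank: assumed even split
    return total


# Invented rank distributions over three ranks (illustration only).
p_first = [0.6, 0.3, 0.1]
p_second = [0.2, 0.5, 0.3]

forward = proportion_beyond(p_first, p_second)
reverse = proportion_beyond(p_second, p_first)

print(round(forward, 3), round(reverse, 3))
```

The two directions are complementary, so only one proportion per pair needs to be computed.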
Only the sums for the two columns were recorded. These are shown at the bottom of the table. In accordance with equation (1) the calculation of the estimated proportion of all the subjects who judged statement 2 to be more unfavorable than statement 1 is given also:

$$p_{2>1} = .788725 + .06705 = .855775$$

This procedure was carried out for each of the ½ n(n-1) = 190 possible pairs of statements. These calculated proportions from the 190 tables, represented by Table III, have been transferred to one complete table, Table IV. The first column and the numbers at the top of the columns represent the rank orders of the statements, assigned as a result of the scaling by the method of equal appearing intervals. This table should be read as follows: 85.5% ranked statement 1 as more favorable than statement 2, 90.3% ranked statement 1 as more favorable than statement 3, and so on. The summations at the bottom of the columns give us a method for determining the rank order of the statements from the data of the order of merit method. This rank order is given below the summations. Table V gives the rank orders of the statements from the method of equal appearing intervals and the order of merit method. The rank orders are the same.

From this point onward the procedure follows that outlined in Thurstone's article, “The Measurement of Opinion”.22 The next step is the application of the Law of Comparative Judgment. The law can be applied to any situation in which a number of observers make one discriminatory judgment for each possible pair of stimuli in any given stimulus series. It can be employed in the scaling of opinions which have been ranked, because from the rank order data there has been determined, for each possible pair of statements, the proportion of the two hundred judges who considered one of the statements more unfavorable toward smoking than the other (Table IV). The shorter form of the law, Thurstone's Case V, has been used in this study.
This form is applicable to the situation in which each subject of a group of subjects gives only one judgment for every stimulus pair.24

$$S_A - S_B = x_{AB}\sqrt{2} \qquad (2)$$

in which

$S_A$ and $S_B$ = the two scale values

$x_{AB}$ = sigma value of the observed proportion of judgments

The following table represents the method of determining the scale distance between any two statements A and B so that all the available comparisons are taken into account:

$$(S_C - S_A) - (S_C - S_B) = S_B - S_A$$
$$(S_D - S_A) - (S_D - S_B) = S_B - S_A$$
$$(S_E - S_A) - (S_E - S_B) = S_B - S_A$$
etc.

in which the left hand members are determined by the calculated proportions and by equation (2). We have then the same numerical value for the scale distance $(S_B - S_A)$ from all the equations except for the “observational errors” in the calculated proportions.25 It is necessary to weight these values in determining the final scale distance, because the standard errors of the numerical values of $(S_B - S_A)$ from the different equations vary. Thurstone's equation26 for this purpose is

$$w_K = \frac{1}{\dfrac{p_{KA}\,q_{KA}}{z_{KA}^{2}} + \dfrac{p_{KB}\,q_{KB}}{z_{KB}^{2}}} \qquad (3)$$

Table VI gives the calculation of $pq/z^{2}$ for statement 1. The first column shows the numerical identification of the statements. Column p was taken from Table IV, column 2. Column q gives the values (1-p). Column x is the sigma value of the given value of p. z is the value of the ordinate of the probability curve at x when the area of the surface is taken as unity and σ as unity. The last three columns are self-explanatory. Columns q, x, z and pq were read directly from the Kelley-Wood tables.* One such table was prepared for each of the twenty statements.

Table VII shows the calculation of the scale distance $(S_1 - S_2)$ as an example, the result at the bottom of that table being 1.145507 / 1.208239 = .948132. The first column is the numerical identification of the statements. The second and third columns are sigma values of statements 1 and 2, taken from Table VI. The fourth column is the difference $x_1 - x_2$.
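The Table VI quantities described above can be sketched in Python. This is a hedged illustration only: the helper names are ours, the standard library's unit normal distribution stands in for the Kelley-Wood tables, and the weight is computed on the assumption that equation (3) takes the form w = 1 / (pq/z² for one statement + pq/z² for the other), a form that reproduces the first weight entered in Table VII to about four decimal places.

```python
from statistics import NormalDist

UNIT_NORMAL = NormalDist()  # mean 0, sigma 1, total area 1


def table_vi_row(p):
    """Return the Table VI quantities for one proportion p, with 0 < p < 1."""
    q = 1.0 - p
    x = UNIT_NORMAL.inv_cdf(p)  # sigma value of the proportion p
    z = UNIT_NORMAL.pdf(x)      # ordinate of the unit normal curve at x
    return {"p": p, "q": q, "x": x, "z": z,
            "z2": z * z, "pq": p * q, "pq_over_z2": p * q / (z * z)}


def weight(p_ka, p_kb):
    """Weight of one difference under the assumed form of equation (3)."""
    return 1.0 / (table_vi_row(p_ka)["pq_over_z2"]
                  + table_vi_row(p_kb)["pq_over_z2"])


row_2 = table_vi_row(0.855)     # entry for statement 2 in Table VI
w_first = weight(0.500, 0.145)  # first line of Table VII
print(row_2["x"], row_2["pq_over_z2"], w_first)
```

Proportions of exactly 0 or 1 have no finite sigma value, so `inv_cdf` raises an error for them; this corresponds to the blank lines of Tables VI and VII.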
In Table VII, the fifth column is the weight w determined from equation (3), and the last column is the weighted difference. The scale distance is secured by the equation

$$S_1 - S_2 = \frac{\sum w\,(x_1 - x_2)}{\sum w}$$

which is given at the bottom of the table. Since there are twenty statements, there are nineteen tables like this one. Table VIII shows the scale distances between adjacent statements when they are arranged in rank order. Each of the entries in column 2 was determined by a table like Table VII. From the scale distances of Table VIII it is possible to obtain the final scale values which are given in Table IX. The statement having rank 1, that is, opinion 145, was arbitrarily chosen as the origin of the scale and the scale values of all of the other statements were secured by successive additions of the scale separations. They go from 0.00 to 7.49.

21 Ibid., pp. 190-191.
22 After Thurstone, loc. cit.
23 Thurstone, L. L.: “The Measurement of Opinion”, loc. cit., p. 424.
24 Ibid., p. 424.
25 Ibid., pp. 424-425.
26 Ibid., p. 426.
* Kelley, T. L.: Statistical Method, pp. 373-385.

[Table II: the body of the table is not recoverable from this copy.]
Table II. Proportion of 200 Judges Who Placed Each One of Twenty Opinions toward Smoking in Each of Twenty Absolute Rank Orders by the Order of Merit Method

Rank     Strip No. 1        Strip No. 2
Order    p_1k    p_1>k      p_2k    p_2>k     p_1k·p_2>k    p_1k·p_2k
  1      .795    .205       .110    .890
  2      .105    .100       .315    .575
  3      .035    .065       .235    .340
  4      .020    .045       .130    .210
  5      .025    .020       .070    .140
  6      .010    .010       .060    .080
  7      .010    .000       .040    .040
  8                         .030    .010
  9                         .010    .000
 10-20   (no entries)
                                              Σ = .788725   ½Σ = .06705
                                              p_2>1 = .855775
Table III. Estimation of the Proportion of Subjects Who Perceived Opinion 2 > 1

Side      1      2      3      4      5      6      7      8      9     10     11     12     13     14     15     16     17     18     19     20
   1      -   .145   .097   .080   .057   .037   .023   .017   .007   .001   .000   .001   .000   .000   .000   .000   .000   .000   .000   .000
   2   .855      -   .399   .329   .279   .166   .119   .097   .040   .003   .001   .005   .001   .000   .000   .000   .000   .000   .000   .000
   3   .903   .601      -   .417   .360   .195   .127   .111   .045   .003   .001   .005   .001   .000   .000   .000   .000   .000   .000   .000
   4   .920   .671   .583      -   .447   .256   .174   .142   .062   .012   .006   .008   .006   .000   .000   .000   .000   .000   .000   .000
   5   .943   .721   .640   .553      -   .292   .195   .154   .062   .005   .001   .008   .001   .000   .000   .000   .000   .000   .000   .000
   6   .963   .834   .805   .744   .708      -   .362   .241   .124   .018   .006   .006   .001   .001   .001   .000   .001   .001   .001   .001
   7   .977   .881   .873   .826   .805   .638      -   .307   .168   .020   .002   .023   .001   .000   .000   .000   .000   .000   .000   .000
   8   .983   .903   .889   .858   .846   .759   .693      -   .355   .214   .037   .055   .007   .005   .004   .003   .002   .005   .001   .001
   9   .993   .960   .955   .938   .938   .876   .832   .645      -   .161   .047   .068   .004   .001   .001   .000   .001   .005   .001   .001
  10   .999   .997   .997   .988   .995   .982   .980   .786   .839      -   .240   .069   .023   .018   .015   .012   .012   .007   .011   .011
  11  1.000   .999   .999   .994   .999   .994   .998   .963   .953   .760      -   .273   .041   .014   .010   .005   .004   .009   .002   .002
  12   .999   .995   .995   .992   .992   .994   .977   .945   .932   .831   .727      -   .204   .118   .076   .038   .030   .035   .011   .008
  13  1.000   .999   .999   .994   .999   .999   .999   .993   .996   .977   .959   .796      -   .324   .201   .108   .083   .082   .023   .016
  14  1.000  1.000  1.000  1.000  1.000   .999  1.000   .995   .999   .982   .986   .882   .676      -   .341   .208   .154   .144   .049   .029
  15  1.000  1.000  1.000  1.000  1.000   .999  1.000   .996   .999   .985   .990   .924   .799   .659      -   .347   .262   .228   .086   .045
  16  1.000  1.000  1.000  1.000  1.000  1.000  1.000   .997  1.000   .988   .995   .962   .892   .792   .653      -   .376   .309   .116   .059
  17  1.000  1.000  1.000  1.000  1.000   .999  1.000   .998   .999   .988   .996   .970   .917   .846   .738   .624      -   .402   .175   .084
  18  1.000  1.000  1.000  1.000  1.000   .999  1.000   .995   .995   .993   .991   .965   .918   .856   .772   .693   .598      -   .267   .121
  19  1.000  1.000  1.000  1.000  1.000   .999  1.000   .999   .999   .989   .998   .989   .977   .951   .914   .884   .825   .733      -   .219
  20  1.000  1.000  1.000  1.000  1.000   .999  1.000   .999   .999   .989   .998   .992   .984   .971   .955   .941   .916   .879   .781      -
 Sum 18.535 16.706 16.231 15.713 15.425 14.182 13.479 12.380 11.573  9.919  8.981  8.001  6.453  5.556  4.681  3.863  3.264  2.839  1.524   .597
Rank      1      2      3      4      5      6      7      8      9     10     11     12     13     14     15     16     17     18     19     20
Table IV. The Calculated Proportion of Judgments in Which the Opinion at the Top Was Considered More Favorable than the Opinion at the Side

Statement   E. A. I.   O. M.
   145          1         1
   113          2         2
    79          3         3
   112          4         4
    76          5         5
   115          6         6
    75          7         7
    35          8         8
    20          9         9
    43         10        10
    30         11        11
    32         12        12
    64         13        13
    39         14        14
   132         15        15
    61         16        16
   101         17        17
    48         18        18
    70         19        19
    72         20        20
Table V. The Rank Order of the Statements Determined by the Method of Equal Appearing Intervals and by the Order of Merit Method

Statement    p       q        x          z          z²         pq        pq/z²
    1                      0.000000   .398942    .159156    .250000     1.5708
    2      .855    .145    1.058122   .227923    .051949    .123975     2.3865
    3      .903    .097    1.298837   .171628    .029456    .087591     2.9736
    4      .920    .080    1.405072   .148666    .022102    .073600     3.3300
    5      .943    .057    1.580467   .114420    .013092    .053751     4.1056
    6      .963    .037    1.786614   .080868    .006539    .035631     5.4482
    7      .977    .023    1.995393   .054490    .002969    .022471     7.5685
    8      .983    .017    2.120072   .042160    .001777    .016711     9.4041
    9      .993    .007    2.457264   .019487    .000379    .006951    18.3403
   10      .999    .001    3.090229   .003367    .000011    .000999    90.8181
   11      (no entries)
   12      .999    .001    3.090229   .003367    .000011    .000999    90.8181
   13-20   (no entries)
Table VI. Calculation of pq/z² for Statement 1

Statement      x_1          x_2        x_1 - x_2       w        w(x_1 - x_2)
    1        0.000000    -1.058122    +1.058122     .252690      .267377
    2       +1.058122     0.000000
    3       +1.298837    +0.255936    +1.042901     .218231      .227593
    4       +1.405072    +0.442676    +0.962396     .199310      .191815
    5       +1.580467    +0.585815    +0.994652     .169865      .168957
    6       +1.786614    +0.970093    +0.816521     .130250      .106352
    7       +1.995393    +1.180001    +0.815392     .097851      .079787
    8       +2.120072    +1.298837    +0.821235     .080790      .066348
    9       +2.457264    +1.750686    +0.706578     .042531      .030051
   10       +3.090229    +2.747781    +0.342448     .007997      .002739
   11       (no entries)
   12       +3.090229    +2.575829    +0.514400     .008724      .004488
   13-20    (no entries)
                                          Sum       1.208239     1.145507
            S_1 - S_2 = 1.145507 / 1.208239 = .948132
Table VII. Determination of the Scale Distance (S_1 - S_2)

Specimens        Scale Separation
S_1 - S_2            .948132
S_2 - S_3            .181324
S_3 - S_4            .195317
S_4 - S_5            .097446
S_5 - S_6            .441856
S_6 - S_7            .251203
S_7 - S_8            .314955
S_8 - S_9            .293381
S_9 - S_10           .770755
S_10 - S_11          .385194
S_11 - S_12          .489982
S_12 - S_13          .673501
S_13 - S_14          .377746
S_14 - S_15          .341565
S_15 - S_16          .309240
S_16 - S_17          .228218
S_17 - S_18          .151268
S_18 - S_19          .582001
S_19 - S_20          .456908
Table VIII. Scale Separation between Opinions

Statement            Scale Value
   145  (S_1)          0.0000
   113  (S_2)          0.9481
    79  (S_3)          1.1294
   112  (S_4)          1.3247
    76  (S_5)          1.4221
   115  (S_6)          1.8640
    75  (S_7)          2.1152
    35  (S_8)          2.4302
    20  (S_9)          2.7236
    43  (S_10)         3.4944
    30  (S_11)         3.8796
    32  (S_12)         4.3696
    64  (S_13)         5.0431
    39  (S_14)         5.4208
   132  (S_15)         5.7624
    61  (S_16)         6.0716
   101  (S_17)         6.2998
    48  (S_18)         6.4511
    70  (S_19)         7.0331
    72  (S_20)         7.4900
Table IX. Scale Values, Order of Merit Method

CHAPTER V

COMPARISON OF THE METHOD OF EQUAL APPEARING INTERVALS AND THE ORDER OF MERIT METHOD

The relationship between the order of merit method and the method of equal appearing intervals can be shown by comparing the two scales resulting from their application. This comparison has been represented in two ways. The first is given in Figure 9, which represents the scale values determined by the order of merit method plotted on the same base line as that for the method of equal appearing intervals; that is, the scale units were made comparable, so that one unit of the order of merit scale is equal to 1.466 scale units of the other scale. The scale values of the statements determined by the former method were then read off in terms of the base line of the latter method. The two sets of values are given in Table X. The differences between them were averaged to give the value of .14 scale units. This small average difference means that there is close correspondence between the two scales.

The second way of showing this relationship is to plot the scale values secured by one method against the values obtained by the other. This has been done in black ink in Figure 10. It is apparent that the relation between the two sets of scale values is a linear one. Miss Hevner also compared these two methods by a graph which is shown in blue ink in Figure 10.* She found the relationship curvilinear. We offer the following possibilities for the explanation of the difference in the relationship found by the two studies: (1) the kind of specimens scaled was different for the two studies.
We used opinions toward smoking, whereas Miss Hevner used handwriting specimens. (2) The method of securing the proportion of subjects who ranked one specimen higher than another was different in the two studies. (3) We weighted the proportions mentioned in (2), whereas Miss Hevner did not weight hers. (4) The “end effect” may account for the fact that Miss Hevner's values at the ends of the scale do not fit a straight line. (5) Some of the experimental conditions of the two studies were different. Each of these possibilities will be considered in detail in the following paragraphs.

(1) It is the relationship between the method of equal appearing intervals and the order of merit method that we are seeking to determine. It is unlikely that the kind of specimens used for the scaling influences this relationship.

(2) The method of calculating a set of proportions differed in the two studies. In this experiment we determined the proportions of judges who placed one statement above another by means of Thurstone's equation. Miss Hevner tabulated from the experimental data of rank order the number of subjects who placed one specimen above another, and from these frequencies calculated the proportion of the entire group of subjects. Thurstone, in his article “Rank Order as a Psychophysical Method”, determined the scale values of Miss Hevner's specimens by the use of his equation and plotted these against the scale values secured by her. The relation was a linear one, all of the points falling on or touching the line, which extends from the lower left hand corner to the upper right hand corner. This means that the scale values were practically identical. Therefore, the difference in the procedure for determining the proportions cannot help to explain the difference in relationship between the two methods secured in the two studies.
(3) In order to determine whether the weighting of proportions is a causal factor of the difference, we calculated the scale distances utilizing Miss Hevner's procedure.30 All of the proportions referred to above which had a value of .97 or more or of .03 or less were excluded from the calculations, because they are the most unreliable ones. The remaining proportions were given equal weight. Miss Hevner says,31 “When the measures of internal consistency are calculated, it will be seen that this approximation is not unwarranted.” The test of internal consistency consists of using the scale values of the specimens as starting points, and by reversing the whole process, finally arriving at a set of calculated proportions which may be compared with the experimental proportions. The degree of similarity between such proportions gives a measure of the internal consistency of the scale values. The average discrepancy between the calculated and experimental proportions for all the specimens, obtained by Miss Hevner for the order of merit method, was .012. This discrepancy is very small.

The next step is to determine from the sigma values corresponding to the retained proportions the scale separation for every pair of specimens. The tables represented by Table VII were used for this purpose, but it was first necessary to cross out the x values which corresponded to the proportions excluded. The calculations necessary are as follows: If any of the other specimens be designated by $S_k$, then from equation (2), page 56, we have

$$S_1 - S_k = x_{1k}\sqrt{2} \qquad (4)$$

and

$$S_2 - S_k = x_{2k}\sqrt{2} \qquad (5)$$

where $x_{1k}\sqrt{2}$ represents the distance on the scale between specimen $S_1$ and the other given specimen $S_k$, and $x_{2k}\sqrt{2}$ represents the distance on the scale between $S_2$ and the given specimen. Subtracting (5) from (4) we have

$$S_1 - S_2 = \sqrt{2}\,(x_{1k} - x_{2k}) \qquad (6)$$

In other words, $\sqrt{2}\,(x_{1k} - x_{2k})$ is the scale separation of $S_1$ and $S_2$ when calculated in terms of but one of the twenty specimens. Summing such values for all specimens, we have, in general terms,

$$n(S_1 - S_2) = \sqrt{2}\,\sum (x_{1k} - x_{2k}) \qquad (7)$$

or

$$S_1 - S_2 = \frac{\sqrt{2}}{n}\,\sum (x_{1k} - x_{2k}) \qquad (8)$$

that is, the actual scale separation for any two specimens is equal to the sum of the scale separations for those two in terms of all the other specimens, divided by the total number of such calculated scale separations. For Table VII $\sum (x_{1k} - x_{2k})$ is 4.872, and the number of items is 5. Therefore, the scale separation would be $4.872\sqrt{2}/5$, or 1.37. As before, the scale value of $S_1$ was arbitrarily designated as the origin of the scale, and the other scale values were secured by successive additions of the scale separations. The scale values obtained in this manner are given in Table XI. They are plotted in red ink in Figure 10. The points fall into a straight line as before. The values based on the weighted proportions fit the line better, that is, they do not scatter so far from it. The straight line relationship secured for the scale values based on unweighted proportions shows that the failure of Miss Hevner to weight her proportions is not a factor in the explanation of our problem.

(4) The “end effect”, which is characteristic of the frequency distributions of the statements at the extremes of the scale constructed by the method of equal appearing intervals, may explain the difference in relationship between the two methods obtained in the two studies. The two extreme piles A and K contain all of the statements which would have been placed in piles outside of the extremes if there had been more piles; that is, the scale has been arbitrarily cut off at A and K, which fact constitutes the “end effect”. With regard to the left end of the scale, as a rule more than 50% of the placements of the most “extreme” statements are in pile A. It is thus necessary to extrapolate the curve until it reaches the 50% line, in order to read the scale value of the specimen at this point.
With regard to the extreme right end of the scale, all of the cumulative proportions for pile K equal 1.00. But this last proportion cannot be used in plotting, because if there had not been an end effect, 1.00 would have occurred in some pile beyond K. It is necessary to extrapolate the curve to the 50% line after it has been drawn through the other points. For both cases of extrapolation, it is often difficult to know to what point on the line to extrapolate the curve, due to the fact that there is a possible range of sometimes as much as an entire scale unit. It is a matter of judgment. A glance at Figures 2 and 3 will illustrate this fact.

In order to determine which distributions were characterized by the “end effect”, and whether the range afforded for extrapolation was sufficient to move the points into a straight line, we calculated the cumulative proportions and plotted them for the five specimens of handwriting at both ends of Miss Hevner's scale, constructed by the method of equal appearing intervals. A glance at the curve drawn in blue ink in Figure 10 will show that it would be necessary to move the points at the extremes much more than a scale unit, in order to make the points fall into anything like a straight line. But the range over which one could extrapolate the above mentioned curves was not sufficient. Therefore, this aspect of the end effect does not offer an explanation of the difference.

(5) The last possibility is to be found in the differences in the experimental conditions33 for the two studies. Presumably in Miss Hevner's study the materials were given to the subjects without further supervision on the part of the experimenter, whereas every step carried out by the subjects in our experiment was supervised. As a consequence, the subjects in her study were no doubt more careless. But carelessness can hardly explain the regular curve obtained by Miss Hevner.
Again, most of the specimens of handwriting were placed over a great range of piles, that is, the excellence of handwriting was not uniformly judged. This “ambiguity” may be due to the fact that excellence of handwriting is more difficult to judge than favorableness of an opinion, but it probably has its source in the greater amount of carelessness. What effect this had on the relationship between the scales, if any, we do not know. Since the scale values of the “extreme” specimens are not so representative as those of the other statements, due to the end effect which necessitates extrapolation of the curves, it is possible that great “ambiguity” would make these particular scale values even less representative. Especially is this true for those at the left end, which have a greater ambiguity than those at the right end. In fact, the latter are placed more uniformly than any others of the entire group of twenty. Carelessness and ambiguity together might be sufficient to lower the points at the ends of Miss Hevner's line of relationship. If the points at the left end of the scale were raised one and one-half scale units, and those at the right end one-half scale unit, a straight line could be drawn through them, the slope of which turns out to be very similar to that found in this study.

Moreover, in Miss Hevner's study the subjects first placed 72 specimens of handwriting into eleven piles. They then ranked 20 specimens which were duplicates of twenty of the 72 specimens. In the present study, the subjects ranked 77 statements, utilizing the assortment into eleven piles as a preliminary step for the procedure. Although they were permitted to change opinions from one pile to another, the original assortment into piles may have made more similar the scales constructed by these two methods. But we do not think that this played a very big part in the similarity of the scales secured by us.
Furthermore, for the method of equal appearing intervals Miss Hevner did not direct her subjects to make the intervals between the piles equal, but only successive. It is possible that the presence of this direction would have changed some of the scale values of the specimens. Whether they would have fit a straight line, when they were plotted against those secured by the order of merit method, we do not know.

Inasmuch as we have eliminated the first three possibilities, the explanation of the difference in results must be in terms of the other two, or in terms of possibilities which we have not considered; but to us it seems most likely that the solution of our problem is to be found in the “end effect” in combination with the great ambiguity of the specimens used by Miss Hevner.

33 Miss Hevner concludes with several factors which point to the superiority of the order of merit method over the method of equal appearing intervals. Her scales from the order of merit method and the method of paired comparison agree rather well, that is, the relationship is linear, but the relationship between these two methods and the method of equal appearing intervals is curvilinear. The last argument for the superiority of the order of merit method does not apply to the results of this study. In fact, in this study the two scales agree better than do her two scales constructed by the method of paired comparison and the order of merit method. There are other factors, however, that do point to the superiority of the order of merit method: (1) The order of merit method does not suffer from the “end effect”. (2) The unit of measurement in the scale constructed by the method of equal appearing intervals is an arbitrary one, namely one-eleventh of the range on the psychological continuum, represented by the particular statements used in this study.
The unit of measurement for the order of merit method is the standard discriminal error projected by a single statement on the psychological continuum. Such a unit of measurement can be obtained by the direct application of the Law of Comparative Judgment. (3) Also there is a test of internal consistency available for the order of merit method, although it was not applied in this study. This test has been described on page 69. This test could not have been applied, for the reason that to follow the directions would have meant to compare the calculated proportions with calculated proportions. One set of proportions was calculated by the use of equation (1); the other set would have been calculated by reversing the procedure of obtaining the scale values from the proportions. This would not have been a test of consistency. No test of internal consistency has been devised for the method of equal appearing intervals.

Hevner, K.: loc. cit., p. 44.
* The curve reproduced has been plotted from the values given in Table XVII, p. 41. If these are correct, there were some errors in plotting the graph on p. 44.
After Thurstone, loc. cit., p. 200.
30 After Hevner, loc. cit., pp. 17-33.
31 Ibid., p. 24.
33 Ibid., pp. 7-11.
33 Ibid., p. 45.
34 Thurstone, L. L.: “The Law of Comparative Judgment,” loc. cit.

Figure 9. Experimental Scale of 20 Statements. A: Order of Merit Method. B: Equal Appearing Intervals.

Figure 10. Comparison of the Scale Values from the Method of Equal Appearing Intervals and the Order of Merit Method

Statement   E. A. I.   O. M.
   145         0.2       0.2
   113         1.6       1.6
    79         1.8       1.8
   112         2.1       2.1
    76         2.4       2.3
   115         3.1       2.9
    75         3.3       3.3
    35         4.0       3.8
    20         4.4       4.2
    43         5.1       5.3
    30         5.7       6.0
    32         6.3       6.6
    64         7.2       7.6
    39         7.9       8.1
   132         8.4       8.6
    61         8.9       9.1
   101         9.3       9.4
    48         9.7       9.7
    70        10.7      10.6
    72        11.2      11.2
Table X. The Scale Values of the Twenty Opinions by the Two Methods

Statement    O. M.
   145        0.00
   113        1.37
    79        1.58
   112        1.83
    76        1.93
   115        2.52
    75        2.89
    35        3.27
    20        3.72
    43        4.74
    30        5.71
    32        6.25
    64        7.32
    39        7.87
   132        8.35
    61        8.77
   101        9.09
    48        9.32
    70       10.16
    72       10.85
Table XI. Scale Values of the Twenty Statements Calculated from Unweighted Proportions

CHAPTER VI

SOME MEASURES OF RELIABILITY AND VALIDITY OF THE EXPERIMENTAL SCALE

1. The reliability of the individual scores

The experimental scale constructed by the method of equal appearing intervals was presented to two hundred voters for endorsement of the 31 opinions. A copy of it, together with the directions, is given here:

EXPERIMENTAL STUDY OF ATTITUDE TOWARD SMOKING*

This is an experimental study of the distribution of attitude toward smoking. You will be asked to read a list of statements about smoking, and to indorse those that express your own opinion.

1. Name ........
2. Sex ........ (Underline one)
3. Course ........ Professor ........
4. Do you smoke? ........ (Underline one)
5. How many cigarettes do you smoke per day: 0-10; 10-20; 20-30; 30-40; 40-50; 50-60. (Underline one) If you smoke a pipe or cigar, indicate the number of pipesful, or cigars per day ........
6. Underline classification: Fresh., Soph., Jun., Sen., Grad., Fac., Unclas.

Check every statement below that expresses your opinion toward smoking. Interpret the statements in accordance with your own experience with smoking.

(1) 1. I get no pleasure from smoking but I do not object to others smoking.
(20) 2. I approve of smoking though I seldom smoke myself.
(39) 3. Smoking encourages carelessness in personal habits.
(48) 4. Smoking is as bad as a drug when it becomes a habit.
(21) 5. I like smoking but I do not miss it when a cigarette is not available.
(32) 6. Smoking is just like any other appetite, a bad thing if indulged in to the extreme.
(35) 7. I like to see others smoke and to smell the smoke.
(40) 8. Smoking uses money that could be spent to better advantage in other ways.
(30) 9. I have no particular desire to smoke and therefore do not.
(55) 10. I like to smoke while I am studying, because it lends a little distraction in the midst of tedious work.
(61) 11. Smoking is a bad habit because it is only a temporary relief for nervousness and later increases it.
(43) 12. I smoke occasionally but I don't care anything about it.
(63) 13. Smoking “breaks the ice” when people are not well acquainted.
(64) 14. Smoking leaves an after-taste which is very unpleasant.
(70) 15. Smoking is harmful to everyone, regardless of sex, because nicotine is a poison.
(74) 16. Smoking is a joy that all should partake of.
(75) 17. Smoking is a good way of passing leisure time.
(98) 18. Smoking harms the teeth.
(72) 19. Smoking is immoral, regardless of time, place or person.
(76) 20. Smoking makes one feel more at ease and more self-confident.
(79) 21. I can think better when I smoke.
(99) 22. Smoking weakens the lungs, and makes conditions better for T.B.
(106) 23. Smoking makes life more enjoyable.
(112) 24. Smoking is a good thing because it frequently provides an emotional outlet.
(132) 25. Smoking is a bad habit because the desire for a cigarette interferes with concentration on another activity.
(140) 26. I have a feeling of disgust when I see anyone smoke.
(101) 27. Smoking ruins the throat and voice.
(113) 28. Smoking is conducive to absolute relaxation and is therefore beneficial to the nerves of the smoker.
(115) 29. Smoking is valuable to a nervous person because it gives him something to do with his hands.
(104) 30. Smoking by both parents will lead to degeneracy of the race.
(145) 31. I get more enjoyment from smoking than from anything else I do.

The data requested at the top of the sheet would be used in an investigation of the attitude toward smoking. Only a part of it was utilized here for the purposes of validity. The opinions occur in random order as to scale value, in order to induce the subjects to read every statement.
The unit of measurement of the scale is defined by the number of equal appearing intervals called for by the directions. An arbitrary origin was assigned to the favorable end of the scale, and each scale unit received a numerical designation. The method of scoring the blanks was that of averaging the scale values of the opinions indorsed by each subject.

In order to test the reliability of the scores, the scale was given a second time from four to six weeks following the first presentation. We considered this period of time sufficient to obviate the possibility of remembering the previous indorsement. Of course, at the same time, the possibility of a change in attitude during that period enters to lower the correlation. But the correlation coefficient is high enough to indicate that this possibility is not very serious. It is much more likely that the correlation was lowered by the carelessness of the subjects, some of whom did not take the second indorsement seriously. The correlation between the individual scores in the two indorsements was [...].

* The wording of the directions is similar to Thurstone's blank for the church.

2. Validity of the experimental scale

For the purpose of securing some idea of the validity of this scale, the group of two hundred who indorsed the statements was divided into two parts, those who smoke and those who do not. The distributions of their scores were plotted in Figure 11. The distributions have been given the same area by expressing each class-frequency as a proportion of the entire group. The arrow indicates the mean score of the group. For those who smoke it is 4.8; for those who do not smoke, 6.7. There is probably a positive correlation between action and attitude with regard to smoking, although we do not know how high it is. Since the distribution for those who smoke is farther to the right, that is, toward the favorable end, than is that for those who do not, the scale has some validity.
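The scoring rule and the comparison of group means described above can be sketched as follows. This is a minimal illustration under stated assumptions: the scale values are the equal appearing interval values of Table X for a few of the statements only, not the full list of 31, and the indorsement blanks for the two groups are invented for the example; they are not data from this study. The names used are ours.

```python
from statistics import mean

# Scale values by the method of equal appearing intervals for a few of the
# statements (code number -> scale value), copied from Table X.
SCALE_VALUE = {145: 0.2, 113: 1.6, 76: 2.4, 75: 3.3, 43: 5.1,
               64: 7.2, 101: 9.3, 70: 10.7, 72: 11.2}


def score(indorsed_codes):
    """Score one blank: the mean scale value of the opinions indorsed."""
    if not indorsed_codes:
        raise ValueError("a blank with no indorsements cannot be scored")
    return mean(SCALE_VALUE[code] for code in indorsed_codes)


# Hypothetical blanks, invented for illustration only.
smokers = [[113, 76, 75], [145, 43]]
non_smokers = [[64, 101, 70], [43, 72]]

mean_smokers = mean(score(blank) for blank in smokers)
mean_non_smokers = mean(score(blank) for blank in non_smokers)
print(round(mean_smokers, 2), round(mean_non_smokers, 2))
```

Because the origin lies at the favorable end of the scale, a lower mean score indicates a more favorable attitude.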
The difference between the means for the two groups is almost two scale units. These graphs are one measure of the relationship between action and attitude. Many of those who smoke have scores on the unfavorable side of the scale. In the majority of such cases, this is due to the fact that they indorsed statements which imply the unhealthfulness of smoking. If 5.5 is taken as the neutral point, it will be seen that most of those who do not smoke have scores on the unfavorable side.

Figure 11. Comparison of Those Who Smoke and Those Who Do Not

CHAPTER VII
SUMMARY AND CONCLUSIONS

The primary purpose of this study was to make a critical analysis of Thurstone's technique of scaling attitude by the method of equal appearing intervals. Two hundred subjects placed 77 statements of opinion about smoking into eleven piles, which were subjectively spaced equally along a continuum which extended from “extremely favorable” through a neutral point to “extremely unfavorable”. The scale values of the statements were calculated by plotting a cumulative proportion curve from the data for each specimen, and reading the value of the median directly from this curve. The scale values are given in Table I.

As a result of the construction of this scale, the following conclusions were drawn:

1. Two hundred subjects are sufficient for the stabilization of the scale values of the statements, provided the following precautions are taken: (1) the subjects are supervised, (2) especial care is exercised to see that the subjects understand the directions and follow them, (3) the subjects are forced to revise their assortment of the statements into piles, and (4) an effective method is employed for the elimination of cases which exhibit carelessness or misunderstanding, in order that the scale values shall be as reliable as possible.

2. The “criterion of irrelevance” is of little value for the selection of statements. It is not worth the labor required for its determination.
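The reading of the median from the cumulative proportion curve, summarized above, can also be carried out numerically. The following is a minimal sketch, assuming that the eleven piles are treated as unit intervals from 0 to 11 (an assumption suggested by the neutral point of 5.5) and that the curve is read by linear interpolation within the pile that contains the median. The pile counts are hypothetical and are not taken from Table I.

```python
def median_scale_value(pile_counts):
    """Scale value of one statement: the median of its distribution over piles.

    pile_counts[k] is the number of judges who placed the statement in
    pile k + 1; that pile is assumed to cover the interval (k, k + 1).
    The median is found by linear interpolation inside the pile in which
    the cumulative count first reaches half of the judges.
    """
    total = sum(pile_counts)
    if total == 0:
        raise ValueError("no judgments recorded for this statement")
    half = total / 2
    cumulative = 0
    for k, count in enumerate(pile_counts):
        if count and cumulative + count >= half:
            return k + (half - cumulative) / count
        cumulative += count
    raise RuntimeError("unreachable when total > 0")


# Hypothetical sorting of one statement by 200 judges into eleven piles.
counts = [0, 0, 10, 30, 60, 60, 30, 10, 0, 0, 0]
print(median_scale_value(counts))
```

Reading the median from a plotted curve and interpolating within the pile are the same operation so long as the curve is drawn with straight segments between the pile boundaries.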
The next step in the critical analysis of the method of equal appearing intervals was the construction of a scale by the order of merit method. The scale values for the order of merit method were obtained by applying the “Law of Comparative Judgment” in its simplest form, Case V. These scale values are given in Table IX. The results show that for the attitude toward smoking the scale constructed by the method of equal appearing intervals is very similar to that constructed by the order of merit method. The two methods are related in a linear fashion, as shown in Figure 10. It is valuable to know that the two methods give approximately the same results, because the method of equal appearing intervals involves less work than does the order of merit method.

A secondary purpose of this study was to construct a scale for the attitude toward smoking. The experimental scale thus constructed consists of 31 statements more or less equally spaced. It is represented in Figure 8.

The reliability of the scale values secured by the method of equal appearing intervals was very high. It was determined by increasing the number of cases from 140 to 200, and calculating the change in scale values. For the whole list of 77 statements the mean discrepancy was 0.06 scale units; for the 31 statements in the experimental scale, it was 0.03 scale units.

The experimental scale was then presented to 200 subjects for indorsement of the opinions. The individual score was determined by averaging the scale values of the opinions indorsed. The opinions were presented a second time after an interval of from four to six weeks. The correlation coefficient for the two sets of scores was which shows a relatively high reliability of the individual scores.

A measure of the validity of the scale is suggested in Figure 11. The group of voters was divided into two parts, those who smoke and those who do not. Their scores were plotted in frequency distributions and the means of the groups determined.
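The Case V solution mentioned in this summary can be illustrated with a short sketch. It assumes a complete table of proportions in which each entry gives the proportion of judges who considered the opinion at the top more favorable than the opinion at the side, and it uses the simple unweighted solution in which the scale value of an opinion is the mean of the normal deviates in its column. The estimation of these proportions from rank orders and any weighting used in the study are not reproduced here, and the proportions shown are hypothetical.

```python
from statistics import NormalDist


def case_v_scale_values(p):
    """Unweighted Case V scale values from a table of proportions.

    p[i][j] is the proportion of judges who considered opinion j (top)
    more favorable than opinion i (side). Each proportion is converted to
    a normal deviate, and the scale value of opinion j is the mean of the
    deviates in column j. The lowest value is set to zero, so that larger
    values mean "more favorable" (an arbitrary choice for this sketch).
    Proportions of exactly 0 or 1 have no finite deviate and will raise
    statistics.StatisticsError.
    """
    n = len(p)
    nd = NormalDist()
    z = [[0.0 if i == j else nd.inv_cdf(p[i][j]) for j in range(n)]
         for i in range(n)]
    column_means = [sum(z[i][j] for i in range(n)) / n for j in range(n)]
    lowest = min(column_means)
    return [m - lowest for m in column_means]


# Hypothetical proportions for three opinions (not taken from Table IV).
proportions = [
    [0.50, 0.69, 0.88],
    [0.31, 0.50, 0.76],
    [0.12, 0.24, 0.50],
]
print([round(v, 2) for v in case_v_scale_values(proportions)])
```

Because Case V assumes equal discriminal dispersions and no correlation between them, the separations obtained are expressed in a single common unit, which is what allows them to be compared with the values from the method of equal appearing intervals after a linear transformation.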
The mean of the group who smoke was 4.8, of those who do not smoke, 6.7. The difference in the means of the two groups and the appearance of the distributions indicate that the scale has validity.

Bibliography

Hevner, K.: “A Comparative Study of Three Psychophysical Methods,” Thesis, The University of Chicago, 1928.

Hollingworth, H. L.: “Professor Cattell's Studies by the Method of Relative Position,” Col. Univ. Contr. to Phil. and Psych., Vol. XXII, No. 30.

Thurstone, L. L.: “A Law of Comparative Judgment,” Psychological Review, Vol. XXXIV, 1927, pp. 273-286.

Thurstone, L. L.: “The Measurement of Opinion,” Journal of Abnormal and Social Psychology, Vol. XXII, 1928, pp. 415-430.

Thurstone, L. L.: The Measurement of Attitude, The University of Chicago Press, 1929.

Thurstone, L. L.: “A Study for Measuring Attitude toward the Movies,” Journal of Educational Research, Vol. XXII, pp. 89-94.

Thurstone, L. L.: “The Measurement of Social Attitudes,” Journal of Abnormal and Social Psychology, Vol. XXVI, pp. 249-267.

Thurstone, L. L.: “Rank Order as a Psychophysical Method,” Journal of Experimental Psychology, Vol. XIV, 1931, pp. 187-201.