Publication date: 10/10/2017 20:53

The vast majority of psychological research is aimed at achieving two main goals:

  1. Reveal relationships between indicators. Correlation analysis is used for this.
  2. Establish differences in the severity of psychological indicators between two or more groups. In this case, either the Mann-Whitney U-test or Student's t-test is used.

In this article, we consider the main aspects of using the Mann-Whitney test to process the results of empirical research in term papers, graduation theses, and master's theses in psychology.

Why is the Mann-Whitney test needed?

Psychological research studies generalized data rather than the results of individual subjects. For example, when the characteristics of psychological parameters are studied in two groups, it is the average values in these groups that are examined.

Recall that the average (arithmetic mean) reflects the typical level of an indicator in the group. It is calculated as follows:

  • The scores of all subjects in the group are summed.
  • The sum is divided by the number of subjects.

Thus, no statistical criteria are needed to compare psychological indicators in two individual subjects. Suppose that testing showed a personal anxiety level of 40 points for Ivanov and 50 points for Petrov. In this case, we can safely say that Petrov is more anxious than Ivanov. However, when two groups are compared, the situation becomes more complicated.

For example, we calculated the average level of personal anxiety in a group of women (58 points) and in a group of men (49 points). Since averages are statistics computed from samples and not just numbers, they cannot simply be compared directly. That is, we cannot yet say that the anxiety of women is higher than that of men. So how can anxiety levels in groups of men and women be compared?

For this, there are statistical criteria for analyzing differences. Their calculation allows us to conclude with a certain accuracy whether there are differences in the severity of indicators in the two groups or not.

Student's t-test is used to analyze differences in mean values between two groups. The Mann-Whitney U-test compares not the mean values themselves but the severity of the indicators (via their ranks); when it detects a difference, the mean values in the groups will usually differ accordingly.

Calculation of the Mann-Whitney criterion: an explanation in simple words

In the vast majority of psychological studies, statistical criteria, including the Mann-Whitney test, are calculated using statistical programs. The best known are SPSS and STATISTICA. Nevertheless, it is important to understand the essence of the calculation in general terms - this will help the psychology student at the thesis defense.

Let's return to our example with the anxiety of men and women. Suppose we have two groups of 10 people. Each subject has a certain personal anxiety score. We need to find out whether anxiety levels differ between the male and female groups. The calculation of the Mann-Whitney criterion proceeds roughly as follows:

  1. The anxiety scores in each group are entered into a table and ranked, that is, arranged in ascending order.
  2. Next, the data for men and women are combined into a common column (marked, for example, in different colors) and ranked again.
  3. Then the analysis is carried out. If the data for men and women (blue and red numbers) mostly alternate, then there is probably no difference.
  4. But if the data for men are grouped mainly at the top, where the low scores are, and the data for women at the bottom, where the high scores are, then most likely there are differences.

This is an informal, hands-on explanation. Statistical programs use special algorithms that numerically evaluate this overlap between the data of the two groups (blue and red numbers) and draw a conclusion about the presence or absence of differences.
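The idea of "alternating" versus "grouped" data can be sketched in a few lines of Python. This is only an illustration: the anxiety scores below are invented for the example, and counting pairs is one simple way to put a number on the overlap, not necessarily the algorithm a particular statistical program uses.

```python
# Hypothetical anxiety scores, invented for illustration only.
men = [38, 41, 44, 45, 47, 49, 50, 52, 53, 55]
women = [48, 51, 54, 56, 57, 58, 60, 61, 63, 65]

# Merge the two groups, tagging each score with its group (the "colour").
combined = sorted([(score, "M") for score in men] + [(score, "W") for score in women])

# Reading the labels in ascending order of score shows how the groups interleave.
sequence = "".join(label for _, label in combined)
print(sequence)

# Count, over all man-woman pairs, how often the man's score is higher.
# 0 means the groups do not overlap at all; half of all pairs means
# the two groups alternate evenly.
overlap = sum(1 for m in men for w in women if m > w)
print(overlap, "of", len(men) * len(women), "pairs")
```

In this made-up data the labels for men cluster at the low end and the labels for women at the high end, which is the pattern that suggests a difference between the groups.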

What you need to know about the Mann-Whitney criterion for thesis defense

The Mann-Whitney U-test is a non-parametric statistical test used to compare the severity of indicators in two independent (unrelated) samples.

What does non-parametric mean? Without going into statistical subtleties, you need to understand the following. Parametric statistical tests are more accurate, but they impose more stringent requirements on the data. That is, before the calculation, you need to check the data in the groups, for example, for normality of distribution. This means that on the distribution graph such data should form a bell shape - most subjects have average values, and a minority have low or high values. Student's t-test is a parametric test.

Non-parametric criteria are less accurate, but they do not impose strict requirements on the data. The data can be almost anything.

What does independent samples mean? It means that the groups do not overlap, that is, they consist of different subjects. Differences in related samples are calculated, for example, when evaluating the effectiveness of training, when measurements are taken "before" and "after" and then compared. Student's t-test has a variant for related samples. The Mann-Whitney test is used only for independent ones.

Limitations of the Mann-Whitney test

  1. The number of subjects in each group when using the Mann-Whitney test should not exceed 60.
  2. The minimum number of subjects is 3 in each group.
  3. The groups need not be strictly the same size, but their sizes should not differ greatly.
  4. The compared indicators can be both psychological (anxiety, aggressiveness, self-esteem, etc.) and non-psychological (academic success, effectiveness of professional activity, etc.).

"Why did you choose to calculate the Mann-Whitney test?"

It is this very question that frightens many psychology students before defending their diploma. We offer the following answer as a basis for individual modifications:

“In this paper, we did not test the data for normality of distribution, so we used the non-parametric Mann-Whitney statistical test, designed to detect differences in indicators between two independent samples.”

It is important to understand that this question actually means the following: "Why did you choose the Mann-Whitney test and not Student's t-test?" These two criteria are the ones most often used for comparative analysis in psychological research.

Therefore, the answer should indicate that the data were not checked for normality, for example, because of the small size of the groups, and that a non-parametric criterion was chosen for this reason.

Level of statistical significance

If you use a statistical program to calculate the Mann-Whitney test, two important indicators will be present in the output:

  1. U is the numerical value of the criterion itself. To determine whether the differences in the severity of indicators between the groups are reliable, the obtained value Uemp must be compared with the critical value Ucr from a special table. If Uemp ≤ Ucr, the differences in the severity of indicators between the groups are statistically significant.
  2. p is the level of statistical significance. This indicator is present in the calculation of all statistical criteria and reflects how reliable the conclusion about the presence of differences is. Two levels are accepted in psychological research:
  • p ≤ 0.01 - error probability of 1%;
  • p ≤ 0.05 - error probability of 5%.

An example of data analysis using the Mann-Whitney test in a diploma in psychology

The results of a comparative analysis of indicators of resilience in young people and people of mature age

Scale         Average: young people   Average: people of mature age    Mann-Whitney U test   Level of statistical significance (p)
Involvement   32.9                    40.9                                                   0.000*
Control       27.2                    28.3                             1170.5                0.584
Risk taking   17.9                    14.4                                                   0.000*
Vitality      78.0                    83.6                             1022.5                0.117

* - differences are statistically significant (p ≤ 0.05)
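Reading such a table comes down to comparing each p-value with the chosen threshold. A minimal Python sketch follows, using the p-values reported in the table; the threshold of 0.05 is taken from the footnote (read here as p ≤ 0.05, which is an assumption about the lost symbol), and the function name `significant_scales` is hypothetical.

```python
# p-values as reported in the table; a printed 0.000 is a rounded value,
# not an exact zero.
p_values = {
    "Involvement": 0.000,
    "Control": 0.584,
    "Risk taking": 0.000,
    "Vitality": 0.117,
}

ALPHA = 0.05  # significance threshold assumed from the table footnote


def significant_scales(p_by_scale, alpha=ALPHA):
    """Return the scales whose p-value does not exceed alpha."""
    return [scale for scale, p in p_by_scale.items() if p <= alpha]


print(significant_scales(p_values))
```

Only the scales returned by the function would be interpreted as showing a difference between the groups; the others are reported as showing no statistically significant difference.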

Analysis of the data given in the table allows us to draw the following conclusions:

The scores on the “involvement” scale are statistically significantly higher in the group of mature people than in the group of young people. This means that people of mature age, compared with young people, are more involved in what is happening and enjoy their own activities to a greater extent. Young people, in turn, more often than mature people experience a feeling of rejection, of being “outside” of life. This result is associated with age-related psychological characteristics: young people have not yet found their place in life, which leads to a lack of involvement in what is happening, whereas mature people are largely rooted in life, which allows them to be more involved.

The scores on the “risk acceptance” scale are statistically significantly higher in the group of young people than in the group of mature people. This means that young people, compared with people of mature age, are more strongly convinced that everything that happens to them contributes to their development through knowledge gained from experience, whether positive or negative. Young people, to a greater extent than mature people, regard life as a way of gaining experience; they are ready to act without reliable guarantees of success, at their own risk, and consider the pursuit of simple comfort and security to impoverish a person's life.

So, the differences in the indicators of hardiness between young people and people of mature age are multidirectional: young people are more inclined to accept risk, while people of mature age are more involved in what is happening. As a result, no differences were found in the overall hardiness scores of the two groups.

From Wikipedia, the free encyclopedia

The Mann-Whitney U-test is a statistical test used to assess the differences between two independent samples in terms of the level of a quantitatively measured trait. It makes it possible to detect differences in the value of a parameter between small samples.

Other names: the Mann-Whitney-Wilcoxon test (MWW), the Wilcoxon rank-sum test, or the Wilcoxon-Mann-Whitney test. Less common: the number of inversions test.

History

This method for detecting differences between samples was proposed in 1945 by Frank Wilcoxon. In 1947 it was substantially revised and expanded by H. B. Mann and D. R. Whitney, by whose names it is usually called today.

Description of the criterion

A simple nonparametric test. The power of the test is higher than that of the Rosenbaum Q-test.

This method determines whether the zone of overlapping values between two series (the ranked series of parameter values in the first sample and the same in the second sample) is small enough. The smaller the criterion value, the more likely it is that the differences between the parameter values in the samples are significant.

Criterion Applicability Limitations

  1. Each of the samples must contain at least 3 values of the trait. It is permissible for one sample to contain two values, provided the other contains at least five.
  2. The sample data should contain no matching values (all numbers are different), or very few such matches (up to 10).

Using a Criterion

To apply the Mann-Whitney U-test, you need to perform the following operations.

  1. Compile a single ranked series from both compared samples, arranging their elements in ascending order of the trait and assigning a lower rank to a lower value. The total number of ranks will be $N = n_1 + n_2$, where $n_1$ is the number of elements in the first sample and $n_2$ is the number of elements in the second sample.
  2. Divide the single ranked series into two, consisting of the units of the first and second samples, respectively. Calculate separately the sum of the ranks of the elements of the first sample and the sum of the ranks of the elements of the second sample. Determine the larger of the two rank sums ($T_x$), corresponding to the sample with $n_x$ elements.
  3. Determine the value of the Mann-Whitney U-test using the formula: $U = n_1 \cdot n_2 + \frac{n_x \cdot (n_x + 1)}{2} - T_x$.
  4. Using the table for the selected level of statistical significance, determine the critical value of the criterion for the given $n_1$ and $n_2$. If the obtained value of $U$ is less than or equal to the tabular value, a significant difference between the levels of the trait in the samples is recognized (the alternative hypothesis is accepted). If the obtained value of $U$ is greater than the tabular value, the null hypothesis is accepted. The lower the value of $U$, the higher the significance of the differences.
  5. If the null hypothesis is true, the criterion has the mathematical expectation $M(U) = \frac{n_1 \cdot n_2}{2}$ and the variance $D(U) = \frac{n_1 \cdot n_2 \cdot (n_1 + n_2 + 1)}{12}$, and for sufficiently large samples ($n_1 > 19,\; n_2 > 19$) it is distributed almost normally.
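The steps above can be sketched in plain Python. This is a minimal illustration, not a replacement for a statistical package: the function names `midranks` and `mann_whitney_u` and the sample data are invented for the example, tied values are given the average of their ranks (a common convention that the text does not spell out), and the z-score uses the expectation and variance from step 5, so it is only meaningful for the large samples mentioned there.

```python
import math


def midranks(values):
    """Rank values in ascending order; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(sample1, sample2):
    """Return (U, z) following the five steps described in the text."""
    n1, n2 = len(sample1), len(sample2)
    ranks = midranks(list(sample1) + list(sample2))  # step 1: joint ranking
    t1 = sum(ranks[:n1])                             # step 2: rank sums
    t2 = sum(ranks[n1:])
    # T_x is the larger rank sum, n_x the size of the sample it belongs to
    # (if the sums are equal, sample 1 is used).
    t_x, n_x = (t1, n1) if t1 >= t2 else (t2, n2)
    u = n1 * n2 + n_x * (n_x + 1) / 2 - t_x          # step 3: the formula
    mean_u = n1 * n2 / 2                             # step 5: M(U)
    var_u = n1 * n2 * (n1 + n2 + 1) / 12             # step 5: D(U)
    z = (u - mean_u) / math.sqrt(var_u)
    return u, z


# Hypothetical data, invented for illustration only.
u, z = mann_whitney_u([12, 15, 17, 20], [14, 22, 25, 27, 30])
print(u, round(z, 3))
```

For samples as small as these, U would be compared with the tabulated critical value (step 4); the z-score is shown only to illustrate how the normal approximation is formed.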

See also

  • The Kruskal-Wallis test is a generalization of the Mann-Whitney U-test to more than two samples.

Literature

  • Mann H. B., Whitney D. R. On a test of whether one of two random variables is stochastically larger than the other. // Annals of Mathematical Statistics. - 1947. - Vol. 18. - P. 50-60.
  • Wilcoxon F. Individual Comparisons by Ranking Methods. // Biometrics Bulletin. - 1945. - Vol. 1. - P. 80-83.
  • Gubler E. V., Genkin A. A. Application of nonparametric statistical criteria in biomedical research. - L., 1973.
  • Sidorenko E. V. Methods of mathematical processing in psychology. - St. Petersburg, 2002.

Mann-Whitney test: example, table

A criterion in mathematical statistics is a strict rule according to which a hypothesis with a certain level of significance is accepted or rejected. To build it, you need to find a certain function. It should depend on the final results of the experiment, that is, on empirically found values. It is this function that will be a tool for assessing the discrepancy between the samples.

Statistically significant value. General information

A value is called statistically significant if the probability that it, or an even more extreme value, arose by chance is small. A difference is said to be statistically significant if the observed data would be unlikely to occur were there no difference. This does not mean, however, that the difference must be large or important.

The level of statistical significance of the test

This term should be understood as the probability of rejecting the null hypothesis when it is in fact true. This is also called a Type I error or a false positive decision. In most cases, the procedure relies on the p-value: the probability, assuming the null hypothesis is true, of observing a value of the test statistic (computed from the sample) at least as extreme as the one obtained. The null hypothesis is rejected if this p-value is less than the level declared by the analyst. The smaller the p-value, the more reason there is to reject the hypothesis.
The significance level is usually denoted by the letter α (alpha). Commonly used levels are 0.1%, 1%, 5% and 10%. If, say, the chances that a result is a coincidence are stated to be 1 in 1000, this refers to the 0.1% level of statistical significance. Different α-levels have their pros and cons. A lower level gives more confidence that an accepted alternative hypothesis is significant, but it increases the risk that a false null hypothesis will not be rejected. The choice of the optimal α-level therefore depends on the "significance-power" balance or, equivalently, on the trade-off between the probabilities of false positive and false negative decisions. A synonym for "statistical significance" in the Russian-language literature is the term "reliability".

Null Hypothesis Definition

In mathematical statistics, this is an assumption that is tested for consistency with the available empirical data. In most cases, the null hypothesis is the hypothesis that there is no correlation between the variables under study or that the distributions under study do not differ (are homogeneous). In standard research, the researcher tries to disprove the null hypothesis, that is, to show that it is inconsistent with the experimental data. There must also be an alternative hypothesis, which is accepted in place of the null one.

Key Definition

The U criterion (Mann-Whitney) in mathematical statistics makes it possible to evaluate the differences between two samples in terms of the level of some quantitatively measured trait. This method is well suited to estimating differences in small samples. This simple criterion was proposed by Frank Wilcoxon in 1945. In 1947 the method was revised and supplemented by H. B. Mann and D. R. Whitney, whose names it bears to this day. In psychology, mathematics, statistics and many other sciences, the Mann-Whitney criterion is one of the basic tools for the mathematical substantiation of the results of research.

Description

The Mann-Whitney test is a relatively simple non-parametric method. Its power is considerable and is noticeably higher than that of the Rosenbaum Q-test. The method evaluates how small the zone of overlapping values between the samples is, namely between the ranked series of values of the first and second samples. The smaller the criterion value, the more likely it is that the differences in parameter values are reliable. To apply the U (Mann-Whitney) criterion correctly, some limitations must be kept in mind. Each sample must contain at least 3 values of the trait. It is permissible for one sample to contain two values, but then the other must contain at least five. The samples under study should contain a minimal number of matching values; ideally, all numbers should be different.

Usage

How should the Mann-Whitney test be used correctly? The procedure relies on a table of critical values. The first step is to create a single series from both compared samples, which is then ranked. That is, the elements are arranged in ascending order of the attribute, and a lower rank is assigned to a lower value. As a result, the total number of ranks is:

N = N1 + N2,

where N1 and N2 are the numbers of units in the first and second samples, respectively. Next, the single ranked series is divided into two: the units from the first sample and the units from the second. The sum of the ranks is then calculated for each of the two series in turn. The larger of the two sums (Tx) is determined; it corresponds to the sample with nx units. The value of the criterion is then calculated by the formula U = N1 · N2 + nx · (nx + 1) / 2 - Tx. Finally, the critical value of the criterion for the given N1 and N2 is found from the table for the chosen level of significance.
If the obtained value is less than or equal to the value from the table, a significant difference in the levels of the trait in the studied samples is stated. If the obtained value is greater than the table value, the null hypothesis is accepted. Note that if the null hypothesis is true, the criterion has the mean N1 · N2 / 2 and the variance N1 · N2 · (N1 + N2 + 1) / 12, and for sufficiently large samples it is distributed almost normally. The lower the value of the Mann-Whitney test, the higher the significance of the differences.

Tables of probabilities associated with the values of the Mann-Whitney test. For the experimental value of the criterion (the smaller of the two values) and the sample sizes, find the probability that both groups belong to the same population. Thus, a low probability value, for example, P …

[Tables 3-6: probabilities associated with the values of the Mann-Whitney test]

Table of critical values of the Mann-Whitney test for the significance level. If the empirical value of the criterion does not exceed the critical value, the difference between the samples is significant at the given significance level, that is, the null hypothesis should be rejected.

[Table of critical values, indexed by N1 and N2]

2. U - Mann-Whitney test

The criterion is designed to assess the differences between two samples in terms of the level of any trait, quantitatively measured. It allows you to detect differences between small samples when n1 and n2 are greater than or equal to 3 (or n1 = 2, and n2 is then greater than or equal to 5.)

The method determines whether the zone of overlapping values between two series is small enough. The smaller this zone, the more likely it is that the differences are significant. The empirical (actually obtained) value of the U criterion reflects how large the zone of overlap between the series is. The lower Uemp, the more likely it is that the differences are significant.

Hypotheses.

H0: The level of the trait in group 2 is not lower than the level of the trait in group 1.

H1: The level of the trait in group 2 is lower than the level of the trait in group 1.

Limitations of the U criterion.

1. There must be at least 3 observations in each sample or, in extreme cases, a ratio of 2 to 5 or more is allowed.

2. There should be no more than 60 observations in each sample.

Algorithm for calculating the criterion U - Mann-Whitney.

1. Transfer all sample data to individual cards, marking in color or in some other way which sample each value belongs to.

2. Lay out all the cards in a common row in ascending order of the trait, regardless of which sample they belong to.

3. Rank the values on the cards (according to the ranking algorithm), assigning a lower rank to a lower value. There should be n1 + n2 ranks in total (the size of the first sample plus the size of the second sample).

4. Re-arrange the cards into two rows, according to whether they belong to sample 1 or sample 2.

5. Calculate the sum of the ranks separately for each of the two rows.

6. Determine the larger of the two rank sums.

7. Determine the value of U by the formula: U = n1 · n2 + nx · (nx + 1) / 2 - Tx, where Tx is the larger of the two rank sums and nx is the number of observations in the sample with the larger rank sum.

8. Determine the critical values of U from the tables and, on this basis, accept or reject the hypothesis H0.
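As a worked illustration of the algorithm above, consider two small samples invented for this example (they are not data from the text): sample 1 = {3, 5, 8} and sample 2 = {4, 9, 10, 12}. The formula used for U is the standard statement of the criterion, with T_x the larger rank sum and n_x the size of the sample it belongs to; the rank of each value in the joint row is shown in parentheses.

```latex
\begin{align*}
\text{Joint ranking: } & 3\,(1),\ 4\,(2),\ 5\,(3),\ 8\,(4),\ 9\,(5),\ 10\,(6),\ 12\,(7) \\
T_1 &= 1 + 3 + 4 = 8, \qquad T_2 = 2 + 5 + 6 + 7 = 20 \\
T_1 + T_2 &= 28 = \frac{7 \cdot 8}{2} \quad \text{(check on the ranking)} \\
T_x &= 20, \qquad n_x = 4 \\
U &= n_1 n_2 + \frac{n_x (n_x + 1)}{2} - T_x = 3 \cdot 4 + \frac{4 \cdot 5}{2} - 20 = 2
\end{align*}
```

The resulting value would then be compared with the tabulated critical value for n1 = 3 and n2 = 4.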

3. H - Kruskal - Wallis criterion

The H criterion is used to assess differences in the severity of the analyzed trait simultaneously between three, four or more samples. It allows you to identify the degree of change in the trait in the samples, without indicating, however, the direction of these changes.

The criterion is based on the principle that the smaller the overlap between the samples, the higher the significance level of H emp. It should be emphasized that the samples may contain different numbers of subjects, although in the tasks below the samples are of equal size.

Working with data begins with the fact that all samples are conditionally combined in the order of occurring values ​​into one sample, and the values ​​of this combined sample are ranked. Then the obtained ranks are affixed to the original sample data, and the sum of the ranks is calculated separately for each sample. The criterion is based on the following idea – if the differences between the samples are insignificant, then the sums of ranks will not differ significantly from one another and vice versa.

The value of H emp is calculated by the formula:

$$H_{emp} = \frac{12}{N(N+1)} \sum_{i=1}^{c} \frac{T_i^2}{n_i} - 3(N+1)$$

where N is the total number of members in the combined sample;

n_i is the number of members in each individual sample;

T_i^2 are the squares of the sums of ranks for each sample;

c is the number of samples.
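A short Python sketch of this calculation follows. It assumes the standard form of the statistic, H = 12/(N(N + 1)) · Σ(T_i²/n_i) − 3(N + 1), without a correction for ties, and gives tied values the mean of their ranks. The function names are hypothetical and chosen only for illustration.

```python
def average_ranks(values):
    """Ascending ranks (1-based); tied values share the mean of their ranks."""
    first_pos, count = {}, {}
    for pos, v in enumerate(sorted(values), start=1):
        first_pos.setdefault(v, pos)      # first rank occupied by this value
        count[v] = count.get(v, 0) + 1    # how many times the value occurs
    return [first_pos[v] + (count[v] - 1) / 2 for v in values]


def kruskal_wallis_h(*samples):
    """Return (H, degrees of freedom) for two or more independent samples."""
    pooled = [x for s in samples for x in s]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    # Sum of T_i^2 / n_i, where T_i is the rank sum of sample i.
    total, start = 0.0, 0
    for s in samples:
        t_i = sum(ranks[start:start + len(s)])
        total += t_i ** 2 / len(s)
        start += len(s)
    h = 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)
    return h, len(samples) - 1  # v = c - 1


h, df = kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(round(h, 3), df)
```

For three completely separated samples of three values each, the rank sums are 6, 15 and 24, which gives H = 7.2 with v = 2.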

When determining the critical values of the criterion for four or more samples, use the table for the chi-square criterion, having first calculated the number of degrees of freedom v. For example, for c = 4 samples, v = c - 1 = 4 - 1 = 3.

We emphasize that if we used criteria that compare only two series of values, the result obtained above would require six pairwise comparisons: the first sample with the second, the first with the third, and so on.
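The count of six follows from the number of unordered pairs that can be formed from four samples, c(c − 1)/2. A small sketch, with the sample labels 1 to 4 used purely for illustration:

```python
from itertools import combinations
from math import comb

c = 4  # number of samples, as in the example above
pairs = list(combinations(range(1, c + 1), 2))  # all unordered pairs of samples
n_pairs = comb(c, 2)                            # c(c - 1)/2

print(n_pairs)  # 6
print(pairs)
```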

To use the criterion H the following conditions must be met:

1. The measurement must be made on an ordinal, interval, or ratio scale.

2. Samples must be independent.

3. The compared samples may contain different numbers of subjects.

4. When comparing three samples, it is allowed that one of them contains n = 3 and the other two n = 2. However, in this case, differences can be established only at the 5% significance level.

5. Table 9 of the Appendix covers only three samples with (n1, n2, n3) ≤ 5, that is, each of the three samples can contain at most 5 subjects.

6. With a larger number of samples and a different number of subjects in each sample, use the table for the chi-square criterion. In this case, the number of degrees of freedom is determined by the formula v = c - 1, where c is the number of compared samples.

Mann-Whitney U-test

The Mann-Whitney U-test is a statistical criterion used to evaluate the differences between two samples in terms of the level of some quantitatively measured trait. It makes it possible to detect differences in the value of a parameter between small samples.

Other names: the Mann-Whitney-Wilcoxon test (MWW), the Wilcoxon rank-sum test, or the Wilcoxon-Mann-Whitney test.

History

This method of detecting differences between samples was proposed in 1945 by Frank Wilcoxon. In 1947 it was substantially revised and extended by H. B. Mann and D. R. Whitney, after whom it is usually named today.

Description of the criterion

A simple nonparametric test. The power of the test is higher than that of the Rosenbaum Q-test.

This method determines if the area of ​​overlapping values ​​between two series (the ranked series of parameter values ​​in the first sample and the same in the second sample) is small enough. The smaller the criterion value, the more likely it is that the differences between the parameter values ​​in the samples are significant.

Criterion Applicability Limitations

  1. Each of the samples must contain at least 3 feature values. It is allowed that in one sample there are two values, but in the second there are at least five.
  2. There should be no matching values ​​in the sample data (all numbers are different) or there should be very few such matches.

Using a Criterion

To apply the Mann-Whitney U-test, you need to perform the following operations.

See also

  • The Kruskal-Wallis test is a generalization of the Mann-Whitney U-test to more than two samples.

Literature

  • Mann H.B., Whitney D.R. On a test of whether one of two random variables is stochastically larger than the other. // Annals of Mathematical Statistics. - 1947. - No. 18. - P. 50-60.
  • Wilcoxon F. Individual Comparisons by Ranking Methods. // Biometrics Bulletin 1. - 1945. - P. 80-83.
  • Gubler E. V., Genkin A. A. Application of non-parametric statistics criteria in biomedical research. - L., 1973.
  • Sidorenko E.V. Methods of mathematical processing in psychology. - St. Petersburg, 2002.

Wikimedia Foundation. 2010.

Purpose of the criterion

The Mann-Whitney U test is designed to assess the differences between two samples in terms of the level of any trait measured on at least an ordinal scale. It makes it possible to identify differences between small samples, when n1, n2 ≥ 3 or n1 = 2, n2 ≥ 5, and is more powerful than the Rosenbaum criterion.

This method determines whether the zone of overlapping values ​​between two rows of ordered values ​​is small enough. At the same time, the 1st row (group sample) is the row of values ​​in which the values, according to the preliminary estimate, are higher, and the 2nd row is the one where they are presumably lower.

The smaller the area of overlap, the more likely the differences are to be significant. These differences are sometimes referred to as differences in the location of the two samples.

The calculated (empirical) value of the criterion U reflects how large the zone of coincidence between the rows is. Therefore, the smaller U emp is, the more likely it is that the differences are significant.

Limitations of the criterion

    The trait must be measured on an ordinal, interval, or ratio scale.

    Samples must be independent.

    Each sample must contain at least 3 observations: n1, n2 ≥ 3; it is allowed that there are 2 observations in one sample, but then there must be at least 5 in the second.

    Each sample should contain no more than 60 observations: n1, n2 ≤ 60. However, already at n1, n2 > 20 ranking becomes quite laborious.

Algorithm for calculating the Mann-Whitney criterion.

    To calculate the criterion, it is necessary to mentally combine all the values ​​of the 1st sample and the 2nd sample into one common combined sample and arrange them.

It is convenient to make all calculations in a table (table 28), consisting of 4 columns. This table contains the ordered values ​​of the combined sample.

Wherein:

    merged sample values ​​are sorted in ascending order;

    the values ​​of each of the samples are recorded in their own column: the values ​​of the 1st sample are recorded in column No. 2, the values ​​of the 2nd sample are recorded in column No. 3;

    each value is written on a separate line;

    the total number of rows in this table is N=n 1 +n 2 , where n 1 is the number of subjects in the 1st sample, n 2 is the number of subjects in the 2nd sample

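Table 28

Column No. 1: R 1 (ranks of the 1st sample) | Column No. 2: values of the 1st sample | Column No. 3: values of the 2nd sample | Column No. 4: R 2 (ranks of the 2nd sample)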

    The values of the combined sample are ranked according to the ranking rules. The ranks R 1 corresponding to the values of the 1st sample are written in column No. 1, and the ranks R 2 corresponding to the values of the 2nd sample are written in column No. 4.

    The sum of ranks is calculated separately for column No. 1 (for sample 1) and separately for column No. 4 (for sample 2). Be sure to check if the total rank sum matches the calculated rank sum for the pooled sample.

    Determine the larger of the two rank sums. Let's denote it as T x.

    Determine the calculated value of the criterion U by the formula:

$$U = n_1 n_2 + \frac{n_x (n_x + 1)}{2} - T_x$$

where n1 is the number of subjects in sample 1,

n2 is the number of subjects in sample 2,

Tx is the larger of the two rank sums,

nx is the number of subjects in the sample with the larger sum of ranks.

    Decision rule: Determine the critical values of U from the table of critical values for the Mann-Whitney test.

If U emp > U cr 0.05, the differences between the samples are not statistically significant.

If U emp ≤ U cr 0.05, the differences between the samples are statistically significant.

The smaller the value of U, the higher the significance of the differences.
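The decision rule runs in the opposite direction to most criteria: a small empirical value, not a large one, indicates significance. A minimal sketch of the comparison follows. The function name `is_significant` is hypothetical, and the numbers in the usage lines are arbitrary placeholders, not values from the table of critical values.

```python
def is_significant(u_emp, u_cr):
    """Differences are significant when U_emp does not exceed the critical value."""
    return u_emp <= u_cr


# Placeholder numbers for illustration only; take the real U_cr from the
# table of critical values for the given n1 and n2.
print(is_significant(10, 15))  # True: U_emp <= U_cr
print(is_significant(20, 15))  # False: U_emp > U_cr
```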

Control questions:

    What are the conditions for applying Student's criterion?

    What parameters of the trait distributions do you need to know in order to calculate Student's t-test?

    Formulate a decision rule based on the results of calculating Student's criterion.

    Why is it necessary to evaluate the variability of the traits in the samples at the same time when calculating Student's t-test?

    How can two variances be compared?

    In what cases is it necessary to introduce the Snedecor correction into the decision rule for Student's criterion?

    What are the conditions for applying the Rosenbaum criterion?

    Formulate a decision rule based on the results of calculating the Rosenbaum criterion.

    List the conditions for applying the Mann-Whitney test.

    What is the pooled sample when calculating the Mann-Whitney test?

    Formulate a decision rule based on the results of calculating the Mann-Whitney criterion.

Independent practical task:

Study the Kruskal-Wallis criterion and the Jonckheere trend criterion from textbooks on your own. Write a summary following the same scheme as the one used in the lectures.

Materials for studying the topic:

a) basic literature:

    Ermolaev O. Yu. Mathematical statistics for psychologists [Text]: textbook / O. Yu. Ermolaev. - 5th ed. - M.: MPSI: Flinta, 2011. - 336 p. - pp. 101-124; 169-172.

    Nasledov A. D. Mathematical methods of psychological research: Analysis and interpretation of data [Text]: textbook / A. D. Nasledov. - 3rd ed., stereotype. - St. Petersburg: Speech, 2007. - 392 p. - pp. 162-167; 173-176; 181-182.

    Sidorenko E. V. Methods of mathematical processing in psychology [Text] / E. V. Sidorenko. - St. Petersburg: Speech, 2010. - 350 p.: ill. - pp. 39-72.

b) additional literature:

    Glass J. Statistical methods in pedagogy and psychology [Text] / J. Glass, J. Stanley. - M., 1976. - 494 p. - pp. 265-280.

    Kuteinikov A. N. Mathematical methods in psychology [Text]: educational and methodological complex / A. N. Kuteinikov. - St. Petersburg: Speech, 2008. - 172 p.: tab. - pp. 81-93.

    Sukhodolsky G. V. Fundamentals of mathematical statistics for psychologists [Text]: textbook / G. V. Sukhodolsky. - St. Petersburg: Publishing House of St. Petersburg State University, 1998. - 464 p. - pp. 305-323.

Mann-Whitney U criterion

Purpose of the criterion. The criterion is designed to assess the differences between two samples in the level of any quantitatively measured trait. It makes it possible to detect differences between small samples, when n1, n2 ≥ 3 or n1 = 2, n2 ≥ 5, and is more powerful than the Rosenbaum Q criterion.

This method determines if the area of ​​overlapping values ​​between two series is small enough. We remember that we call the 1st row (sample, group) the row of values ​​in which the values, according to a preliminary estimate, are higher, and the 2nd row is the one where they are supposedly lower.

The smaller the area of overlap, the more likely it is that the differences are significant. These differences are sometimes referred to as differences in the location of the two samples. The empirical value of the criterion reflects how large the zone of coincidence between the rows is. That is why the smaller U emp is, the more likely it is that the differences are significant.

Hypotheses.

The level of non-verbal intelligence in the group of physics students is higher than in the group of psychology students.

Graphical representation of the U criterion. Fig. 7.25 shows three of the many possible relationships between two series of values.

In option (a), the second row is lower than the first, and the rows hardly overlap. The area of overlap (S1) is too small to obscure the differences between the rows. There is a chance that the differences between them are significant. We can determine this exactly using the U criterion.

In option (b), the second row is also lower than the first, but the area of overlapping values of the two rows (S2) is quite extensive. It may not yet reach the critical size at which the differences would have to be recognized as insignificant. But whether this is so can only be determined by an exact calculation of the U criterion.

In option (c), the second row is lower than the first, but the area of overlap (S3) is so extensive that the differences between the rows are obscured.

Fig. 7.25. Possible relationships between the series of values in two samples

Note. S1, S2 and S3 denote the areas of possible overlap.

Limitations of the U criterion.

  1. Each sample must contain at least three observations: n1, n2 ≥ 3; it is allowed that there are two observations in one sample, but then there must be at least 5 in the second.
  2. Each sample should contain no more than 60 observations: n1, n2 ≤ 60. However, already at n1, n2 > 20 ranking becomes quite laborious.

Let us return to the results of the examination of students of the physics and psychology faculties of Leningrad University using D. Wechsler's technique for measuring verbal and non-verbal intelligence. Using the Rosenbaum Q criterion, it was determined at a high level of significance that the level of verbal intelligence in the sample of physics students is higher. Let us now try to establish whether this result is reproduced when the samples are compared by the level of non-verbal intelligence. The data are given in the table.

2 is below the level of the trait in sample 1 at a statistically significant level. The smaller the value of U, the higher the significance of the differences.

Now let us carry out all this work on the material of our example. As a result of steps 1-6 of the algorithm, we build a table (Table 7.4).

Table 7.4

Calculation of rank sums for the samples of students of the physics and psychology faculties

Physics students (n1 = 14) | Psychology students (n2 = 12)

Non-verbal intelligence score

Average 107.2

The total sum of ranks: 165 + 186 = 351. The calculated sum according to formula (5.1) is as follows:

$$\sum R_i = \frac{N(N+1)}{2} = \frac{26 \cdot (26+1)}{2} = 351$$

The actual and calculated sums are equal. We see that in terms of the level of non-verbal intelligence, the sample of psychology students is the "higher" one. It is this sample that accounts for the larger rank sum: 186. Now we are ready to formulate the statistical hypotheses:

H0: the group of psychology students does not outperform the group of physics students in terms of non-verbal intelligence;

H1: the group of psychology students outperforms the group of physics students in terms of non-verbal intelligence.

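In accordance with the next step of the algorithm, we determine the empirical value of U:

$$U_{emp} = 14 \cdot 12 + \frac{12 \cdot (12+1)}{2} - 186 = 60$$

Because in our case n1 ≠ n2, we also calculate the empirical value of U for the second rank sum (165), substituting the corresponding nx into formula (7.4):

$$U_{emp} = 14 \cdot 12 + \frac{14 \cdot (14+1)}{2} - 165 = 108$$

For comparison with the critical value we take the smaller of the two values, U emp = 60.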
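The arithmetic of this step can be checked with a few lines of Python. The sketch assumes that formula (7.4) is the usual U = n1·n2 + nx(nx + 1)/2 − Tx. The function name `u_from_rank_sum` is hypothetical, and the sample sizes and rank sums are the ones given in the example.

```python
def u_from_rank_sum(n1, n2, n_x, t_x):
    """U for the sample of size n_x whose rank sum is t_x (assumed formula 7.4)."""
    return n1 * n2 + n_x * (n_x + 1) // 2 - t_x


n_phys, n_psy = 14, 12      # sample sizes from the example
t_phys, t_psy = 165, 186    # rank sums from the example

# The rank sums must add up to N(N + 1)/2 for N = 26.
n_total = n_phys + n_psy
assert t_phys + t_psy == n_total * (n_total + 1) // 2  # 351

u_psy = u_from_rank_sum(n_phys, n_psy, n_psy, t_psy)     # larger rank sum
u_phys = u_from_rank_sum(n_phys, n_psy, n_phys, t_phys)  # second rank sum
u_emp = min(u_psy, u_phys)

print(u_psy, u_phys, u_emp)  # 60 108 60
```

The two values add up to n1·n2 = 168, which is a convenient additional check on the calculation.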

According to Appendix 8, we determine the critical values for n1 = 14, n2 = 12:

We remember that the U criterion is one of two exceptions to the general rule for deciding whether differences are significant: we can state that the differences are significant if U emp ≤ U cr 0.05. In our case U emp = 60, and U emp > U cr 0.05.

Hence, H0 is accepted: the group of psychology students does not surpass the group of physics students in terms of the level of non-verbal intelligence.

Let us pay attention to the fact that for this case the Rosenbaum Q-criterion is not applicable, since the range of variability in the group of physicists is wider than in the group of psychologists: both the highest and the lowest values ​​of non-verbal intelligence fall on the group of physicists (see Table 7.4) .