A New Weighted Rank Correlation

Problem Statement: In many real-life situations, two independent sources rank n objects and interest focuses on agreement in the top rankings. Spearman's rho and Kendall's tau coefficients assign equal weights to all rankings. As a result, the literature has proposed several weighted correlation coefficients that emphasize the top rankings, including the top-down correlation, the weighted Kendall's tau and Blest's correlation coefficient. Approach: This article introduced a new weighted rank correlation coefficient that was sensitive to agreement in the top rankings. It presented the limiting distribution under the null hypothesis of independence and provided a summary of quantiles of the exact null distribution for n = 3(1)9. Results: The article summarized the power comparison between the new weighted coefficient and other weighted coefficients, and showed that the new weighted rank correlation coefficient provided the locally most powerful rank test. Conclusions/Recommendations: The new weighted correlation should be used along with other weighted coefficients when interest focuses on agreement in the top rankings, in order to make an effective inference.


INTRODUCTION
Every year many students apply for postgraduate courses and research, so universities receive a large number of applications. A postgraduate committee can choose only a few of the applicants, according to criteria such as GPA and the average grade in the major courses they have previously studied. Since the number of applicants is large, the aim is to minimize the effort and cost of interviewing all the candidates while still choosing the best among them. In such cases, a measure that gives more weight to those who have higher grades is required. Many other real-life cases require more weight for the top values in order to reach a decision. For instance, two panels of judges at an Olympic event may want to choose the best participants.
For such cases, correlation measures that give more weight to the top rankings were presented by [1,4,5]. To review these measures briefly, let $\{(X_i, Y_i),\ 1 \le i \le n\}$ be an independently and identically distributed (i.i.d.) sample from a bivariate distribution, where $q_i$ is the rank of the $Y$ value whose corresponding $X$ has rank $i$ among $\{X_j\}$. Throughout, we assume that no ties occur among the variables being considered. If ties occur, the average of the weighted scores can be used. Iman and Conover [4] introduced the top-down correlation coefficient, $R_t$, as:
$$R_t = \frac{\sum_{i=1}^{n} S_i S_{q_i} - n}{n - S_1} \tag{1}$$
where $S_i$ is the Savage score [4], defined as:
$$S_i = \sum_{j=i}^{n} \frac{1}{j}$$
Shieh [5] proposed the weighted Kendall's tau, $R_k$, which is defined in terms of $m$, the number of top rankings taken into account, and of the sign function, where $\operatorname{sgn}(a) = -1$, $0$ or $1$ if $a <$, $=$ or $> 0$, respectively.
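The Savage scores and the top-down correlation can be illustrated with a short Python sketch. It assumes untied paired rankings $(i, q_i)$ stored as a list `q` with `q[i-1]` equal to $q_i$; the function names are illustrative and do not come from the article.

```python
def savage_scores(n):
    """Return [S_1, ..., S_n] with S_i = sum_{j=i}^{n} 1/j."""
    scores = [0.0] * n
    tail = 0.0
    # Accumulate the harmonic tail from j = n down to j = 1.
    for i in range(n, 0, -1):
        tail += 1.0 / i
        scores[i - 1] = tail
    return scores


def top_down_correlation(q):
    """Top-down correlation for paired rankings (i, q[i-1]), no ties."""
    n = len(q)
    if n < 2 or sorted(q) != list(range(1, n + 1)):
        raise ValueError("q must be a permutation of 1..n with n >= 2")
    s = savage_scores(n)
    cross = sum(s[i] * s[q[i] - 1] for i in range(n))
    return (cross - n) / (n - s[0])


# Illustrative rankings (not data from the article): one disagreement
# at the top of the list versus one at the bottom.
top_swap = [2, 1, 3, 4, 5]
bottom_swap = [1, 2, 3, 5, 4]
print(top_down_correlation(top_swap), top_down_correlation(bottom_swap))
```

In this sketch, a swap among the top two ranks lowers the coefficient more than a swap among the bottom two, which is the behaviour a top-weighted measure is meant to have.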
A graphical approach was proposed by [1], leading to a correlation coefficient $R_b$, which is given by:
$$R_b = \frac{2n+1}{n-1} - \frac{12}{n(n+1)^2(n-1)} \sum_{i=1}^{n} (n+1-i)^2 q_i \tag{3}$$
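As a quick check of the formula for $R_b$, the following Python sketch evaluates it for a list `q` of untied ranks, with `q[i-1]` equal to $q_i$. The function name and the example rankings are illustrative assumptions, not taken from the article.

```python
def blest_correlation(q):
    """Blest-type coefficient R_b for paired rankings (i, q[i-1]), no ties."""
    n = len(q)
    if n < 2 or sorted(q) != list(range(1, n + 1)):
        raise ValueError("q must be a permutation of 1..n with n >= 2")
    # Weighted sum of (n + 1 - i)^2 * q_i over i = 1..n.
    weighted_sum = sum((n + 1 - i) ** 2 * q[i - 1] for i in range(1, n + 1))
    return (2 * n + 1) / (n - 1) - 12.0 * weighted_sum / (n * (n + 1) ** 2 * (n - 1))


# Perfect agreement and complete reversal for n = 3 (illustrative inputs).
print(blest_correlation([1, 2, 3]))
print(blest_correlation([3, 2, 1]))
```

The weights $(n+1-i)^2$ are largest for small $i$, so disagreement in the top ranks changes the coefficient more than disagreement in the bottom ranks.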

A new weighted rank correlation, $R_w$, that depends on weighted scores will be introduced, along with its asymptotic distribution under the null hypothesis of independence. Some exact and approximate quantiles of $R_w$ are then summarized, and power comparisons between $R_w$ and the other reviewed coefficients are presented. Finally, an example is given for illustration.

MATERIALS AND METHODS
A new weighted rank correlation: Let $(X_i, Y_i)$, $1 \le i \le n$, be an i.i.d. sample from a bivariate distribution and let $(i, q_i)$, $i = 1, 2, \ldots, n$, be paired rankings of n objects, where $q_i$ is the rank of the $Y$ value whose corresponding $X$ has rank $i$ among all $\{X_j\}$. The weighted scores are defined in terms of $i$, the rank of the ordered observation in a sample of size $n$, and a weight $w$ with $0 < w < 1$.
The new weighted rank coefficient $R_w$ is obtained by computing the ordinary Pearson correlation coefficient, $r$, on the weighted scores. The statistic $R_w$ has a maximum value of 1. However, its minimum possible value equals -1 only for n = 2, as for the top-down correlation [4]; as $n \to \infty$, the minimum increases from -1 towards a value of approximately $-2 \times 10^{-6}$ to $-3 \times 10^{-4}$, depending on the value of $w$.
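The construction "Pearson correlation computed on scores" can be sketched in Python as follows. The explicit formula for the weighted scores is not reproduced in this text, so the scores are passed in as an argument. The helper `geometric_scores`, which uses $w^i$, is a hypothetical stand-in chosen only so that the sketch runs; it should not be read as the article's definition.

```python
def score_correlation(q, scores):
    """Pearson correlation between scores[i-1] and scores[q_i - 1], i = 1..n."""
    n = len(q)
    if n < 2 or len(scores) != n or sorted(q) != list(range(1, n + 1)):
        raise ValueError("q must be a permutation of 1..n matching len(scores)")
    x = list(scores)
    y = [scores[qi - 1] for qi in q]
    # y is a rearrangement of x, so both share the same mean and variance.
    mean = sum(x) / n
    sxx = sum((v - mean) ** 2 for v in x)
    sxy = sum((a - mean) * (b - mean) for a, b in zip(x, y))
    return sxy / sxx


def geometric_scores(n, w):
    """ASSUMPTION: hypothetical scores w**i with 0 < w < 1, for illustration only."""
    return [w ** i for i in range(1, n + 1)]


scores = geometric_scores(5, 0.9)
print(score_correlation([1, 2, 3, 4, 5], scores))  # perfect agreement
print(score_correlation([5, 4, 3, 2, 1], scores))  # complete reversal
```

Because the two score vectors are rearrangements of each other, the coefficient reaches -1 only when the scores are symmetric about their mean. This is always the case for n = 2 and generally fails for larger n, which is consistent with the remark on the minimum value.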
The asymptotic distribution of $R_w$: Now, the asymptotic distribution of $R_w$ is derived under $H_0$, the null hypothesis of independence. The alternative hypothesis of a positive dependence in the rankings can be detected using any of several statistics. The weighted rank correlation $R_w$ is more sensitive to agreement in the top ranks than to agreement in the bottom ranks. For a test of $H_0$ that is equally sensitive to agreement among ranks at all levels, Spearman's rho or Kendall's tau correlation coefficient can be used. If the marginal distributions are normal and the alternative hypothesis is bivariate normal with positive correlation, the Pearson correlation coefficient, $r$, provides the most powerful test of $H_0$ against the alternative. Under $H_0$, the asymptotic distribution of $R_w$ is given by the following theorem:
Theorem 1: Under the null hypothesis of independence, $E(R_w) = 0$, $V(R_w) = 1/(n-1)$ and the asymptotic distribution of $(n-1)^{1/2} R_w$ is the standard normal distribution.

Proof:
The mean and the variance of $R_w$ under $H_0$ are computed as follows. By substituting the null expectation of the weighted scores in (5), we directly obtain that $E(R_w) = 0$. For the variance, $R_w$ can be expressed through terms of the form $a(R_i, f)\,a(Q_i, g)$; that is, $R_w$ is written as a linear rank statistic. Under $H_0$, using Theorem V.1.8 in Hájek and Šidák [3], the distribution of the statistic $R_w$ for $n \to \infty$ is asymptotically normal with mean 0 and the variance given in Theorem 1.
Exact and approximate quantiles of $R_w$: When the null hypothesis is true, all permutations of ranks $(i, q_i)$, $1 \le i \le n$, are equally likely, where $w$ can take any value between 0 and 1, exclusive. Exact and approximate quantiles of $R_w$ can then be computed for chosen values of $w$, say 0.3, 0.6 and 0.9. Exact quantiles for n = 3(1)9 (that is, n = 3, 4, ..., 9) are summarized in Table 1 and, for large n, approximate quantiles are shown in Table 2.
Power comparisons between $R_w$ and the coefficients $R_t$, $R_k$ and $R_b$, given in (1), (2) and (3), respectively, are shown in Table 3. From Table 3, we note that $R_w$ has better power than the other correlation coefficients, especially for w = 0.9 at small sample sizes (e.g., n = 8) and at significance level $\alpha = 0.05$, as shown in Fig. 1.
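Since all permutations are equally likely under the null hypothesis, the exact null distribution for small n can be obtained by brute-force enumeration. The Python sketch below does this for a Pearson correlation on scores and checks the moments stated in Theorem 1. The scores $w^i$ used in the demonstration are a hypothetical stand-in (the article's score formula is not reproduced here), and the function names are illustrative.

```python
from itertools import permutations


def exact_null_values(scores):
    """All values of the score correlation over the n! equally likely permutations."""
    n = len(scores)
    mean = sum(scores) / n
    centered = [s - mean for s in scores]
    sxx = sum(c * c for c in centered)
    return [
        sum(centered[i] * centered[p[i]] for i in range(n)) / sxx
        for p in permutations(range(n))
    ]


def exact_upper_p_value(observed, null_values, tol=1e-12):
    """P(R >= observed) under the null; tol guards against rounding error."""
    hits = sum(1 for v in null_values if v >= observed - tol)
    return hits / len(null_values)


# ASSUMPTION: hypothetical scores w**i with w = 0.9 and n = 5.
n, w = 5, 0.9
values = exact_null_values([w ** i for i in range(1, n + 1)])
mean = sum(values) / len(values)
variance = sum((v - mean) ** 2 for v in values) / len(values)
print(mean, variance, 1 / (n - 1))
print(exact_upper_p_value(1.0, values))  # only perfect agreement attains 1
```

Enumeration involves n! permutations, which is still feasible at n = 9 (362,880 permutations) but grows quickly, so the normal approximation is the practical choice for larger n.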

RESULTS AND DISCUSSION
Numerical Example: To illustrate the new weighted rank correlation, we use a data set, given in Table 4, that was also used by [5]. The data set concerns two techniques, A and B, used to select the most effective variables out of 20 variables for the evaluation of some software packages. The two techniques agree strongly on the top six variables, but disagree considerably on the remaining ones. In such circumstances, we may want to place more emphasis on the top rankings rather than weighting all rankings equally. Therefore, we calculate several weighted rank statistics, along with our weighted rank correlation at different weight values. The corresponding p-value is evaluated for each statistic; these values are given in Table 5.
From Table 5 we can conclude that, at different weight values, our weighted rank correlation and the top-down correlation provide strong evidence (p-value < 0.001) against the null hypothesis of independence of A and B. The choice of weight depends on the degree of emphasis the user wishes to place on the top ranks. However, we suggest the weight w = 0.9 since, as shown in Table 3, the new weighted rank correlation with w = 0.9 has higher power than the other rank correlation coefficients.
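For a sample of this size (n = 20), one way to approximate a p-value for $R_w$ is the normal limit in Theorem 1. The Python sketch below assumes a one-sided test against positive dependence and uses a made-up coefficient value; it does not reproduce the entries of Table 5, and the function name is illustrative.

```python
import math


def asymptotic_p_value(r, n):
    """One-sided p-value P(Z >= sqrt(n - 1) * r) for a standard normal Z."""
    if n < 2:
        raise ValueError("n must be at least 2")
    z = math.sqrt(n - 1) * r
    # Upper tail of the standard normal via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical observed coefficient, for illustration only.
print(asymptotic_p_value(0.9, 20))
```

For small n, the exact permutation distribution is preferable to this approximation.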

CONCLUSION
This article proposed a new weighted rank correlation coefficient that was sensitive to agreement in the top rankings. Under the null hypothesis of independence, the limiting distribution of the proposed coefficient was derived, along with exact and approximate quantiles for different sample sizes. As shown, the test based on the new weighted rank correlation coefficient was the locally most powerful rank test. Therefore, when interest focuses on the top rankings, we recommend using the new weighted rank correlation coefficient, together with other weighted coefficients, to reach an effective decision.
A generalization of this work to the case where more than two independent sources rank n objects with a focus on the top rankings, known as a concordance measure, will be presented elsewhere.