Friedman-Rafsky test

Use this test to compare the distributions of two quantitative data samples. Available in Excel using the XLSTAT software.

What is the Friedman-Rafsky test?

The Friedman-Rafsky test is a nonparametric two-sample test. The null hypothesis is:

H0: The X and Y samples follow the same distribution function, i.e., Fx = Fy

This test is a multivariate generalization of the Wald-Wolfowitz test.
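To make the idea concrete, here is a minimal sketch of the runs-type statistic behind the test, assuming Euclidean distance and a label-permutation p-value. It pools the two samples, builds a minimum spanning tree, and counts the edges that join points from different samples; few such edges suggest the samples differ. The function names (prim_mst, runs_statistic, friedman_rafsky) are hypothetical, and this is not XLSTAT's implementation, which may standardize the statistic differently.

```python
import math
import random

def prim_mst(points):
    """Return the N-1 edges (i, j) of a Euclidean minimum spanning tree (Prim, O(N^2))."""
    n = len(points)
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link from each node to the tree
    best_from = [-1] * n         # tree node that provides that link
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest node that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges

def runs_statistic(edges, labels):
    """R = 1 + number of tree edges joining points from different samples."""
    return 1 + sum(labels[i] != labels[j] for i, j in edges)

def friedman_rafsky(x, y, n_perm=2000, seed=0):
    """Return (R, permutation p-value); small R is evidence against H0."""
    pooled = list(x) + list(y)
    labels = [0] * len(x) + [1] * len(y)
    edges = prim_mst(pooled)          # the tree does not depend on the labels
    r_obs = runs_statistic(edges, labels)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)           # relabel the points at random under H0
        if runs_statistic(edges, labels) <= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)

if __name__ == "__main__":
    x = [(0, 0), (0, 1), (1, 0), (1, 1)]
    y = [(10, 10), (10, 11), (11, 10), (11, 11)]
    print(friedman_rafsky(x, y))
```

Under H0, each of the N - 1 tree edges joins two different samples with probability 2mn / (N(N - 1)), where N = m + n, so the expected value of R is 2mn/N + 1. Values of R well below that are what the test looks for.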

Nonparametric tests do not assume that the data follow any particular distribution. They can thus be applied even if the validity conditions of parametric tests are not met. Check our guide to learn more about the differences between parametric and nonparametric tests.

If you are not sure about the test you should use, read our guide about choosing the appropriate test according to the situation.

How to run a Friedman-Rafsky test in XLSTAT?

The XLSTAT dialog box of the Friedman-Rafsky test is divided into several tabs that correspond to a variety of options, ranging from data selection to the display of results:

Data format:

Separated samples: select two tables (one sample with m lines and another with n lines). The tables may have different numbers of lines, but they must have the same number p of columns.
Merged samples: select a table with m+n lines and p quantitative variables. Select the (binary) data identifying the sample to which each line belongs.
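The two data formats carry the same information. As an illustration outside XLSTAT, the following sketch converts between them using plain Python lists; the helper names merge_samples and split_samples and the 0/1 coding of the sample identifier are assumptions made for this example.

```python
def merge_samples(x, y):
    """Stack two tables (m and n lines, p columns each) and build a 0/1 sample identifier."""
    if x and y and len(x[0]) != len(y[0]):
        raise ValueError("both samples must have the same number of columns")
    table = [list(row) for row in x] + [list(row) for row in y]
    sample_id = [0] * len(x) + [1] * len(y)
    return table, sample_id

def split_samples(table, sample_id):
    """Recover the two separated tables from a merged table and its identifier."""
    x = [row for row, s in zip(table, sample_id) if s == 0]
    y = [row for row, s in zip(table, sample_id) if s == 1]
    return x, y

if __name__ == "__main__":
    table, sample_id = merge_samples([[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0]])
    print(table, sample_id)
```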

Distance: this option allows you to select the metric you want to apply:

Euclidean distance,
Manhattan distance,
Chebychev distance,
Canberra distance.
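For reference, the four metrics can be sketched in a few lines using their standard textbook definitions. The function names are illustrative, and treating a 0/0 term as 0 in the Canberra distance is a common convention assumed here, not something confirmed for XLSTAT.

```python
import math

def euclidean(a, b):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(u - v) for u, v in zip(a, b))

def chebyshev(a, b):
    """Largest absolute coordinate difference."""
    return max(abs(u - v) for u, v in zip(a, b))

def canberra(a, b):
    """Sum of |u - v| / (|u| + |v|); terms where both coordinates are 0 are skipped."""
    total = 0.0
    for u, v in zip(a, b):
        denom = abs(u) + abs(v)
        if denom > 0:
            total += abs(u - v) / denom
    return total

if __name__ == "__main__":
    a, b = (1.0, 2.0), (4.0, 6.0)
    for metric in (euclidean, manhattan, chebyshev, canberra):
        print(metric.__name__, metric(a, b))
```

The choice of metric changes the edge weights and can therefore change which minimum spanning tree is found. The Canberra distance, for instance, gives more weight to differences between coordinates that are close to zero.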

Minimum spanning tree / Algorithm:

You can choose between three methods to compute the minimum spanning tree:

Chazelle (Soft-Heap) (by default): the Chazelle algorithm is deterministic and has the lowest asymptotic bound of the three, with a running time of O(m α(m,n)), where α is the functional inverse of Ackermann's function. In this bound, m and n denote the numbers of edges and nodes of the graph, not the sample sizes.
Kruskal: the Kruskal algorithm is one of the most widely used algorithms to compute a minimum spanning tree. It is a greedy algorithm and is recommended for small samples.
Boruvka: this was the first algorithm invented to compute a minimum spanning tree. It is also a greedy algorithm.
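As an illustration of the greedy approach, here is a minimal sketch of Kruskal's algorithm with a union-find structure. It sorts the edges by weight and keeps each edge that connects two components not yet joined. The function name kruskal_mst and the (weight, i, j) edge format are assumptions made for this example. It is not XLSTAT's code.

```python
def kruskal_mst(n_nodes, weighted_edges):
    """Return the tree edges (i, j, weight) of a minimum spanning tree.

    weighted_edges is an iterable of (weight, i, j) tuples with 0 <= i, j < n_nodes.
    """
    parent = list(range(n_nodes))   # union-find forest: every node starts as its own root

    def find(i):
        # Follow parent links to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    tree = []
    for w, i, j in sorted(weighted_edges):   # cheapest edges first
        ri, rj = find(i), find(j)
        if ri != rj:                          # the edge joins two separate components
            parent[ri] = rj
            tree.append((i, j, w))
    return tree

if __name__ == "__main__":
    edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3), (5, 1, 3)]
    print(kruskal_mst(4, edges))
```

For the test, the input graph is complete: every pair of observations is an edge weighted by the chosen distance, so N observations give N(N-1)/2 candidate edges to sort.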

What are the results of the XLSTAT Friedman-Rafsky test?

Descriptive statistics: the table of descriptive statistics shows the simple statistics for all the variables selected. The number of observations per variable and per sample, the minimum, the maximum, the quartiles, the mean, the variance, and the standard deviation are displayed.

Results regarding the Friedman-Rafsky test: this table shows the detailed results of the test, such as the value of the statistic W.

Results regarding the minimum spanning tree: this table gives you a view of the minimum spanning tree. Its four columns list the edges of the tree, the two nodes that each edge connects, the weight of the edge (the distance between the two nodes), and whether the two nodes come from the same sample.
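A table of this kind can be sketched from a list of tree edges and a sample identifier. The layout below mirrors the four columns described above, but the field names and the helper mst_table are hypothetical and do not reproduce XLSTAT's actual output.

```python
def mst_table(edges, sample_id):
    """Build one row per tree edge: edge number, end nodes, weight, same-sample flag.

    edges is a list of (i, j, weight) tuples; sample_id[k] is the sample of node k.
    """
    rows = []
    for k, (i, j, w) in enumerate(edges, start=1):
        rows.append({
            "edge": k,
            "nodes": (i, j),
            "weight": w,
            "same_sample": sample_id[i] == sample_id[j],
        })
    return rows

if __name__ == "__main__":
    # Hypothetical tree on four points: nodes 0 and 1 from sample 0, nodes 2 and 3 from sample 1.
    edges = [(0, 1, 1.0), (1, 2, 2.5), (2, 3, 0.7)]
    for row in mst_table(edges, [0, 0, 1, 1]):
        print(row)
```

The edges flagged as not coming from the same sample are the ones that matter for the test: the fewer there are, the more the two samples are separated in the tree.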

Results regarding the distance matrix: this table shows the distance between every pair of points from the two samples.