
Minkowski distance
Objective
The Minkowski distance (e.g. Kruskal 1964) is a generalised metric that includes others as special cases of the generalised form. Although theoretically infinite measures exist by varying the order of the equation just three have gained importance.
Equation

In the equation dMKD is the Minkowski distance between the data record i and j, k the index of a variable, n the total number of variables y and λ the order of the Minkowski metric. Although it is defined for any λ > 0, it is rarely used for values other than 1, 2 and ∞. As infinity can not be displayed in computer arithmetics the Minkowski metric is transformed for λ = ∞ and it becomes:
 Or in easier words the Minkowski metric of the order ∞ returns the distance along that axis on which the two objects show the greatest absolute difference.
|
The way distances are measured by the Minkowski metric of different orders between two objects with three variables (here displayed in a coordinate system with x-, y- and z-axes). The unfolded cube shows the way the different orders of the Minkowski metric measure the distance between the two points. |
Synonyms
Different names for the Minkowski distance or Minkowski metric arise form the order:
- λ = 1 is the Manhattan distance. Synonyms are L1-Norm, Taxicab or City-Block distance. For two vectors of ranked ordinal variables the Mahattan distance is sometimes called Footruler distance.
- λ = 2 is the Euclidean distance. Synonyms are L2-Norm or Ruler distance. For two vectors of ranked ordinal variables the Euclidean distance is sometimes called Spearman distance.
- λ = ∞ is the Chebyshev distance. Synonym are Lmax-Norm or Chessboard distance.
Usage
The Minkowski distance is often used when variables are measured on ratio scales with an absolute zero value. Variables with a wider range can overpower the result. Even a few outliers with high values bias the result and disregard the alikeness given by a couple of variables with a lower upper bound.
Algorithm
The algorithm controls whether the data input matrix is rectangular or not. If not the function returns FALSE and a defined, but empty output matrix. When the matrix is rectangular the Minkowski distance of the respective order is calculated. Therefore the dimensions of the respective arrays of the output matrix and the titles for the rows and columns set. As the result is a square matrix, which is mirrored along the diagonal only values for one triangular half and the diagonal are computed. When errors occur during computation the function returns FALSE.
Source
Function dist_Minkowski (InputMatrix : t2dVariantArrayDouble; MinkowskiOrder: Double; Var OutputMatrix : t2dVariantArrayDouble) : Boolean; // The function dist_MinkowskiDistance calculates the Minkowski distance of // the order given in MinkowskiOrder. The cases are expected in the rows. // The variables are expected in the columns. Function returns FALSE if // at least one cell can not be calculated. The result matrix is returned // in OutputMatrix. // (c) Dr. Jan Schulz, 04.May 2008; www.code10.info Var OutputMatrixSize : Integer; InputCols : Integer; InputRows : Integer; RunnerY : Integer; RunnerX : Integer; i : Integer; Summed : Double; FirstVal : Double; SecondVal : Double; Begin // if one dimension is zero or matrix not rectangular or Minkowski order is zero If Not mtx_IsRectangular (InputMatrix, InputRows, InputCols) Or (MinkowskiOrder = 0) THen Begin //create an empty matrix, return FALSE and exit mtx_Create (OutputMatrix, 1, 1, 0, 'Erroneous Minkowski distance matrix'); dist_Minkowski := False; Exit; end;
// Minkowski order needs to be positive MinkowskiOrder := Abs (MinkowskiOrder);
// let's expect the best case ... dist_Minkowski := True;
// define and set the row dimension of the result matrix OutputMatrixSize := High (InputMatrix.Cells) + 1; SetLength (OutputMatrix.Cells, OutputMatrixSize);
// create the column dimension of the array For RunnerY := Low (OutputMatrix.Cells) to High (Outputmatrix.Cells) do Begin SetLength (OutputMatrix.Cells [RunnerY], OutputMatrixSize); end;
// define title of matrix OutputMatrix.MatrixName := 'Minkowski distance matrix of order ' + FloatToStr (MinkowskiOrder);
// Set Row/Col-Title for the new matrix SetLength (OutputMatrix.RowTitle, OutputMatrixSize); SetLength (OutPutMatrix.ColTitle, OutPutMatrixSize); For RunnerY := Low (InputMatrix.RowTitle) to High (InputMatrix.RowTitle) do Begin // names for rows and columns are the same in this triangualary matrix OutputMatrix.RowTitle [RunnerY] := InputMatrix.RowTitle [RunnerY]; OutputMatrix.ColTitle [RunnerY] := InputMatrix.RowTitle [RunnerY]; end;
// compare every object For RunnerY := Low (OutputMatrix.Cells) to High (OutputMatrix.Cells) do Begin //with every other object For RunnerX := Low (OutputMatrix.Cells) to RunnerY do Begin Summed := 0;
//include all variables in analysis For i := 0 to High (InputMatrix.Cells [0]) do Begin FirstVal := InputMatrix.Cells [RunnerX, i]; SecondVal := InputMatrix.Cells [RunnerY, i];
// add the power of the absolute difference Summed := Summed + Power (Abs (FirstVal - SecondVal), MinkowskiOrder); end;
Summed := Power (Summed, 1/MinkowskiOrder); // set the calculated value on both sides of diagonal and diagonal itself OutputMatrix.Cells [RunnerX, RunnerY] := Summed; OutputMatrix.Cells [RunnerY, RunnerX] := Summed; end; end; end;
Example
For a data matrix aInputMatrix of the type t2dVariantArrayDouble, populated with:
Data
|
Var1
|
Var2
|
Var3
|
Case1
|
1
|
1
|
1
|
Case2
|
1
|
1
|
0
|
Case3
|
2
|
2
|
2
|
Case4
|
10
|
10
|
10
|
Case5
|
11
|
11
|
11
|
Case6
|
10
|
5
|
0
|
the call of:
aBooleanVar := dist_Minkowski (aInputMatrix, 1, aOutputMatrix);
returns the respective Minkowski matrix of the first order in aOutputMatrix:
Manhattan
distance
|
Case1
|
Case2
|
Case3
|
Case4
|
Case5
|
Case6
|
Case1
|
0
|
1
|
3
|
27
|
30
|
14
|
Case2
|
1
|
0
|
4
|
28
|
31
|
13
|
Case3
|
3
|
3
|
0
|
24
|
27
|
13
|
Case4
|
27
|
28
|
24
|
0
|
3
|
15
|
Case5
|
30
|
31
|
27
|
3
|
0
|
18
|
Case6
|
14
|
13
|
13
|
15
|
18
|
0
|
and the call of:
aBooleanVar := dist_Minkowski (aInputMatrix, 2, aOutputMatrix);
returns the respective Minkowski matrix of the second order in aOutputMatrix:
Euclidean
distance
|
Case1
|
Case2
|
Case3
|
Case4
|
Case5
|
Case6
|
Case1
|
0
|
1.000
|
1.732
|
15.588
|
17.321
|
9.899
|
Case2
|
1.000
|
0
|
2.449
|
16.186
|
17.916
|
9.849
|
Case3
|
1.732
|
2.449
|
0
|
13.856
|
15.588
|
8.775
|
Case4
|
15.588
|
16.186
|
13.856
|
0
|
1.732
|
11.180
|
Case5
|
17.321
|
17.916
|
15.588
|
1.732
|
0
|
12.570
|
Case6
|
9.899
|
9.849
|
8.775
|
11.180
|
12.570
|
0
|
Characteristic for the Minkowski distance is to represent the absolute distance between objects independently from their distance to the origin. This is contrary to several other distance or similarity/dissimilarity measurements. Thus, the distance between the objects Case1 and Case3 is the same as between Case4 and Case5 for the above data matrix, when investigated by the Minkowski metric.
Literature
Kruskal J.B. (1964): Multidimensional scaling by optimizing goodness of fit to a non metric hypothesis. Psychometrika 29(1):1-27.
|