Reliability of multivariate measures of geometric distance with special reference to the Penrose distance

Authors

  • Maciej Henneberg

DOI:

https://doi.org/10.18778/1898-6773.50.1.05

Abstract

The author briefly describes the idea of multivariate geometric distance (eqs. (1), (1a)) and notes that, for comparisons between groups, it may be treated formally as the variance of one set of averages around the other. Because the averages of the variables characterizing each group are computed from a limited number of observations, they are only estimates subject to a certain degree of error. According to well-known statistical theorems, the variance of sample means around the mean of the general population is a function of the standard deviation and the sample size (eqs. (6)–(10)). For normalized values (S.D. = 1) this variance, whose square root is called the standard error, becomes simply proportional to the inverse of the sample size. Therefore a multivariate geometric distance computed from sample averages (of the C²_H type) can be regarded as consisting of two parts: the variance between the averages of the general populations from which the samples are drawn, and the variance of random error (eq. (11)). This allows geometric distance values to be corrected for sample size (eq. (12)). Furthermore, when assessing the general similarity or dissimilarity of two populations, it must be taken into account that a distance measure is computed from only a limited number of variables (m). An appropriate correction for the number of variables can be obtained by computing confidence limits (eq. (13) and following). "Distances" estimated directly from samples are labelled Ĉ², those corrected for sample size only C², and those corrected for both sample size and number of variables ζ², with the following indices: H for "shape distance", Q for "size distance", P for "generalized distance", and R for "generalized distance" corrected for intercorrelation between variables. Numerical examples of statistical evaluation are given.
When computing confidence interval limits for "generalized" measures, it should be taken into account that the probabilities of error of their constituent parts combine; the appropriate error estimates should therefore be multiplied not by, e.g., 1.96 for a 95% confidence interval, but by 0.76 (see Example I and Example II). From general considerations and numerical results it may be concluded that, for the range of sample sizes and numbers of variables most commonly encountered in physical anthropology, the confidence intervals of "generalized" measures of distance are quite wide and their reliability correspondingly low, which puts their practical usefulness in serious doubt. A simple geometric distance turns out to be more reliable and easier to test. The third part of the paper deals with empirical evidence of the influence of sample size on distance values. First, relationships between sample sizes and individual Ĉ²_R distances computed by other authors are presented (table 1); the relationship is clear-cut and statistically significant. The same kind of relationship for the D² of Mahalanobis (table 2) turned out to be insignificant. Further evidence is provided by the correlation between average sample size (arithmetic, geometric and harmonic means) and average distance calculated for the same sets of sample groupings (tables 3 and 4). This last result casts doubt on the claims of some authors that biological distance between groups decreased through time; before arriving at such a conclusion they should correct their results for the effects of sample size. The overall conclusion of the paper is that tests of statistical significance should be applied to the various measures of distance and that the best of these measures is the D² of Mahalanobis. If this measure cannot be used for whatever reason, it is most advisable to use a simple measure of geometric distance corrected for sample size.
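
The sample-size effect described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's eq. (12), which is not reproduced on this page. It assumes that the distance is the mean squared difference of standardized group means (S.D. = 1), that the two samples are independent, and that the random-error part is therefore 1/n_a + 1/n_b. All function names are hypothetical.

```python
import numpy as np


def raw_distance(means_a, means_b):
    """Mean squared difference of standardized group means (assumed form)."""
    diff = np.asarray(means_a, dtype=float) - np.asarray(means_b, dtype=float)
    return float(np.mean(diff ** 2))


def error_variance(n_a, n_b):
    """Assumed random-error part for S.D. = 1 and independent samples."""
    return 1.0 / n_a + 1.0 / n_b


def corrected_distance(means_a, means_b, n_a, n_b):
    """Raw distance minus the assumed random-error part."""
    return raw_distance(means_a, means_b) - error_variance(n_a, n_b)


def simulate_null(n_a=20, n_b=30, m=10, reps=2000, seed=0):
    """Draw both samples from the same population (true distance zero).

    Returns the average raw distance and the average corrected distance
    over all replications.
    """
    rng = np.random.default_rng(seed)
    raw = np.empty(reps)
    for i in range(reps):
        a = rng.standard_normal((n_a, m)).mean(axis=0)
        b = rng.standard_normal((n_b, m)).mean(axis=0)
        raw[i] = raw_distance(a, b)
    raw_avg = float(raw.mean())
    return raw_avg, raw_avg - error_variance(n_a, n_b)
```

Under these assumptions, `simulate_null()` should return an average raw distance close to 1/20 + 1/30 even though the two populations are identical, and an average corrected distance close to zero. Smaller samples inflate the raw value further.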

References

Anderson T. W., 1958, Introduction to Multivariate Statistical Analysis, New York.

Caliński T., Z. Kaczmarek, 1976, A note on the calculation and use of the generalized distance between multivariate samples, Zesz. Nauk. UAM, Geografia, 8, 7.

Constandse-Westermann T. S., 1972, Coefficients of Biological Distance, Anthrop. Publ. Oosterhout.

Czekanowski J., 1909, Zur Differentialdiagnose der Neandertalgruppe, Korrespondenzblatt der Deutsch. Ges. für Anthrop., 40, 44.

Jasicki B., S. Panek, P. Sikora, E. Stołyhwo, 1962, Zarys antropologii, Warszawa.

Knussmann R., 1962, Moderne statistische Verfahren in der Rassenkunde, [in:] Die Neue Rassenkunde (ed. I. Schwidetzky), Stuttgart, 233.

Knussmann R., 1967, Penrose-Abstand und Diskriminanzanalyse, Homo, 18, 134.

Mahalanobis P. C., 1930, On tests and measures of group divergence, J. Asiat. Soc. Beng., 26, 541.

Penrose L. S., 1954, Distance, size and shape, Ann. Eug., 18, 337.

Piontek J., 1979, Procesy mikroewolucyjne w europejskich populacjach ludzkich, Poznań.

Piontek J., M. Kaczmarek, 1981, Badania etnograficzne w antropologii: Próba nowego spojrzenia, Przegl. Antrop., 47, 129.

Rahman N. A., 1962, On the sampling distribution of the studentized Penrose measure of distance, Ann. Hum. Genet., 26, 97.
DOI: https://doi.org/10.1111/j.1469-1809.1962.tb01315.x

Rösing F. W., 1975, Die Fränkische Bevölkerung von Mannheim-Vogelstang (6.-7. Jh.) und die Merowingerzeitlichen Germanengruppen Europas, Hamburg.

Rösing F. W., I. Schwidetzky, 1977, Vergleichend-statistische Untersuchungen zur Anthropologie des frühen Mittelalters (500 - 1000 n.d.Z.), Homo, 28, 65.

Roth-Lutra K. H., 1970, Vergleichend-statistische Untersuchungen zur Anthropologie des Früh- und Hochmittelalters in Europa I., Homo, 21, 104.

Roth-Lutra K. H., 1971, Vergleichend-statistische Untersuchungen zur Anthropologie des Früh- und Hochmittelalters in Europa II., Homo, 22, 84.

Schwidetzky I., 1967a, Erfahrungen mit dem Penrose-Abstand, Homo, 18, 140.

Schwidetzky I., 1967b, Ergebnisse der Penrose-Analyse: Das Gesamtmaterial, Homo, 18, 174.

Schwidetzky I., 1972, Vergleichend-statistische Untersuchungen zur Anthropologie der Eisenzeit, Homo, 23, 245.

Schwidetzky I., F. W. Rösing, 1975, Vergleichend-statistische Untersuchungen zur Anthropologie der Römerzeit (0 - 500 u.Z.), Homo, 26, 193.

Svób I, 1978, Genetyka populacji, Warszawa.

Thoma A., 1978, Distance de forme entre groupes, Bull. et Mém. de la Soc. d’Anthrop. de Paris, 5, XIII, 15.
DOI: https://doi.org/10.3406/bmsap.1978.1902

Wanke A., 1955, Indywidualne określanie taksonomiczne, Przegl. Antrop., 21, 968.

Published

1984-03-30

How to Cite

Henneberg, M. (1984). Reliability of multivariate measures of geometric distance with special reference to the Penrose distance. Anthropological Review, 50(1), 65–80. https://doi.org/10.18778/1898-6773.50.1.05

Section

Articles
