Proximity Problems
Abstract
The aim of this short paper is to give an algebraic result that relates two criteria in multidimensional scaling...
Key-words: Euclidean distance, multidimensional scaling, strain, sstress, comparing criteria.
Introduction
We consider a predistance matrix <math>\,\Delta=(\delta_{ij}^2)\,</math> defined as a real symmetric <math>\,n\times n\,</math> matrix where <math>\,\delta_{ii}=0\,</math> for all <math>\,i\,</math> and <math>\,\delta_{ij}\geqslant 0\,</math> for all <math>\,i,j\,</math>.

<math>\,\Delta\,</math> is said to be a Euclidean distance matrix of dimension <math>\,p\,</math> if there exist points <math>\,x_1,\ldots,x_n\,</math> in <math>\,\Bbb R^p\,</math> such that

<math>\delta_{ij}^2=\|x_i-x_j\|^2\quad\text{for all}~i,j</math>

where <math>\,\|~\|\,</math> denotes the Euclidean norm.
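To make the definition concrete, here is a minimal NumPy sketch (ours, not part of the paper) that builds a matrix of squared dissimilarities from sampled points; by construction it is a Euclidean distance matrix of dimension <math>\,p\,</math>:

<pre>
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 2
X = rng.standard_normal((n, p))            # n points in R^p

# Delta[i,j] = ||x_i - x_j||^2 (squared dissimilarities)
G = X @ X.T                                # Gram matrix
d = np.diag(G)
Delta = d[:, None] + d[None, :] - 2 * G

assert np.allclose(Delta, Delta.T)         # symmetric
assert np.allclose(np.diag(Delta), 0.0)    # zero diagonal
</pre>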
A problem common to various sciences is to find the Euclidean distance matrix closest, in some sense, to a given predistance matrix <math>\,\Delta\,</math>. There are three statements of the closest-EDM problem prevalent in the literature, the multiplicity due primarily to choice of projection on the EDM versus positive semidefinite (PSD) cone, and vacillation between the distance-square variable <math>\,\delta_{ij}^2\,</math> versus absolute distance <math>\,\delta_{ij}\,</math>.
During the past two decades a large amount of work has been devoted to Euclidean distance matrices and to the approximation of predistances by an EDM, in a series of works including Gower [3,4], Mathar..., Critchley [1], Hayden et al..., etc.
Mathematical preliminaries
It is well known that <math>\,\Delta\,</math> is a Euclidean distance matrix of dimension <math>\,p\,</math> if and only if the symmetric matrix

<math>W_s=-\frac{1}{2}\,(I-es^t)\,\Delta\,(I-se^t)\qquad(2)</math>

is positive semidefinite with <math>\,\text{rank}\,W_s=p\,</math>, where <math>\,e\,</math> is the vector of ones and <math>\,s\,</math> is any vector such that <math>\,s^te=1\,</math>.
This result was proved by Gower [4] as a generalization of an earlier result of Schoenberg [5]. Later Gower considered the particular choices <math>\,s=\frac{1}{n}e\,</math> and <math>\,s=e_i\,</math>, where <math>\,e_i\,</math> is the <math>\,i\,</math>th vector from the standard basis. In what follows, when <math>\,s=\frac{1}{n}e\,</math>, the matrix <math>\,W_s\,</math> will be denoted by <math>\,W\,</math>:

<math>W=-\frac{1}{2}\left(I-\frac{1}{n}ee^t\right)\Delta\left(I-\frac{1}{n}ee^t\right)\qquad(3)</math>
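This characterization is easy to test numerically. The following sketch (the helper name W_of is ours) forms <math>\,W_s\,</math> for an admissible <math>\,s\,</math> and checks positive semidefiniteness:

<pre>
import numpy as np

def W_of(Delta, s):
    """W_s = -1/2 (I - e s^t) Delta (I - s e^t), for any s with s^t e = 1."""
    n = Delta.shape[0]
    J_s = np.eye(n) - np.outer(np.ones(n), s)
    return -0.5 * J_s @ Delta @ J_s.T

# Delta as built in the previous snippet (any predistance matrix works)
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 2))
G = X @ X.T
Delta = np.diag(G)[:, None] + np.diag(G)[None, :] - 2 * G

s = np.full(6, 1.0 / 6.0)                  # Gower's choice s = e/n gives W
print(np.all(np.linalg.eigvalsh(W_of(Delta, s)) >= -1e-10))  # True iff Euclidean
</pre>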
We see no compelling reason to prefer one particular <math>\,s\,</math> over another. Each has its own coherent interpretation. Neither can we say any particular problem formulation produces generally better results than another. Dattorro...
The aim of this short paper is to clarify that point...
We shall also use the notation

<math>J_s=I-es^t\,,\qquad J=I-\frac{1}{n}ee^t</math>

so the equations (2) and (3) can be written:

<math>W_s=-\frac{1}{2}\,J_s\Delta J_s^t\,,\qquad W=-\frac{1}{2}\,J\Delta J\,.</math>
It is easy to verify the following properties:

<math>J^2=J\,,\qquad JJ_s=J\,,\qquad J_s^tJ=J\,,\qquad W=JW_sJ\,.</math>
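These properties can also be verified numerically; a small sketch (same conventions as above, arbitrary admissible <math>\,s\,</math>):

<pre>
import numpy as np

rng = np.random.default_rng(1)
n = 5
D0 = rng.random((n, n))
Delta = D0 + D0.T
np.fill_diagonal(Delta, 0.0)               # any predistance matrix

e = np.ones(n)
s = rng.random(n); s /= s.sum()            # s^t e = 1
J  = np.eye(n) - np.outer(e, e) / n        # centering matrix
Js = np.eye(n) - np.outer(e, s)

W_s = -0.5 * Js @ Delta @ Js.T
W   = -0.5 * J  @ Delta @ J

assert np.allclose(J @ J, J)               # J^2 = J
assert np.allclose(J @ Js, J)              # J J_s = J
assert np.allclose(Js.T @ J, J)            # J_s^t J = J
assert np.allclose(W, J @ W_s @ J)         # W = J W_s J
</pre>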
Classical MDS
Given <math>\,p\,</math>, let <math>\,\Omega_n(p)\,</math> denote the closed set of symmetric <math>\,n\times n\,</math> matrices that are positive semidefinite and have rank no greater than <math>\,p\,</math>.
Let <math>\,\|~\|_F\,</math> denote the Frobenius norm and <math>\,\Delta\,</math> a given symmetric <math>\,n\times n\,</math> matrix of squared dissimilarities. Let <math>\,W=W(\Delta)\,</math> and <math>\,W_s=W_s(\Delta)\,</math>.
Classical MDS can be defined by the optimization problem '''(P)''':

<math>\begin{array}{cl}
\text{minimize} & \|W-B\|_F^2\\
\text{subject to} & B\in\Omega_n(p)
\end{array}</math>
Problem '''(P)''' can be viewed as a particular case of a more general optimization problem '''(P_s)''':

<math>\begin{array}{cl}
\text{minimize} & \|W_s-B\|_F^2\\
\text{subject to} & B\in\Omega_n(p)
\end{array}</math>
The following explicit solution to problem '''(P)''' (respectively problem '''(P_s)''') is well known: let <math>\,\lambda_1\geqslant\ldots\geqslant\lambda_n\,</math> denote the eigenvalues of <math>\,W\,</math> (respectively of <math>\,W_s\,</math>) and <math>\,v_1\ldots v_n\,</math> denote the corresponding eigenvectors.
Assume that the <math>\,p\,</math> largest eigenvalues are positive. Then

<math>B^*=\sum_{i=1}^p\lambda_i^{}\,v_i^{}v_i^t</math>
is a global minimum of problem '''(P)''' (respectively of problem '''(P_s)'''). Furthermore, the minimum value for problem '''(P)''' is

<math>f=\sum_{i=p+1}^n\lambda_i^2(W)\qquad(12)</math>
and for problem '''(P_s)'''

<math>f_s=\sum_{i=p+1}^n\lambda_i^2(W_s)\,.\qquad(13)</math>
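A hedged computational sketch of this explicit solution (plain NumPy; the function name classical_mds and the input checks are ours):

<pre>
import numpy as np

def classical_mds(M, p):
    """Minimizer and minimum of ||M - B||_F^2 over PSD B with rank <= p,
    assuming the p largest eigenvalues of the symmetric matrix M are positive."""
    lam, V = np.linalg.eigh(M)             # ascending order
    lam, V = lam[::-1], V[:, ::-1]         # decreasing order, as in the paper
    B = (V[:, :p] * lam[:p]) @ V[:, :p].T  # B* = sum_{i<=p} lambda_i v_i v_i^t
    fmin = np.sum(lam[p:] ** 2)            # f = sum_{i>p} lambda_i^2
    return B, fmin
</pre>

Applying it to <math>\,W\,</math> gives <math>\,f\,</math>, and applying it to <math>\,W_s\,</math> gives <math>\,f_s\,</math>.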
In Section 5, we will prove that for any squared dissimilarity matrix <math>\,\Delta\,</math> we have

<math>f\leqslant f_s</math>

that is, at the minimum, the strain criterion always gives a smaller value than criterion '''(P_s)'''. In order to show this result we shall use the following lemma.
;Lemma.
Let <math>\,\lambda_i(A)\,</math> denote the eigenvalues of any symmetric matrix <math>\,A\,</math>, taken in decreasing order.
* For all symmetric <math>\,A,B\,</math>: <math>\lambda_i(A+B)\leqslant\lambda_i(A)+\lambda_1(B)</math>
* For all positive semidefinite <math>\,A,B\,</math>: <math>\lambda_i(AB)\leqslant\lambda_i(A)\,\lambda_1(B)</math>
'''Proof.''' See, for instance, Wilkinson...
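A quick randomized check of both inequalities (a sketch; eigenvalues sorted in decreasing order as in the lemma):

<pre>
import numpy as np

rng = np.random.default_rng(2)
n = 6
eigs = lambda M: np.sort(np.linalg.eigvals(M).real)[::-1]  # decreasing

A = rng.standard_normal((n, n)); A = (A + A.T) / 2         # symmetric
B = rng.standard_normal((n, n)); B = (B + B.T) / 2

# lambda_i(A+B) <= lambda_i(A) + lambda_1(B)
assert np.all(eigs(A + B) <= eigs(A) + eigs(B)[0] + 1e-10)

# for PSD A, B: lambda_i(AB) <= lambda_i(A) lambda_1(B)
A, B = A @ A, B @ B                                        # squares are PSD
assert np.all(eigs(A @ B) <= eigs(A) * eigs(B)[0] + 1e-10)
</pre>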
Comparing strain and sstress
In this section we recall a result (see [2]) that relates the strain and sstress criteria. The sstress criterion is given by:

<math>\text{sstress}(D)=\|\Delta-D\|_F^2\,.</math>
'''Result.''' The following inequality holds: Given <math>\,p\leqslant n-1\,</math>, for any <math>\,B\in\Omega_n(p)\,</math>, let <math>\,D=\text{diag}(B)e^t+e\,\text{diag}(B)^t-2B\,</math>. Then

<math>\|W(\Delta)-W(D)\|_F\leqslant\frac{1}{2}\,\|\Delta-D\|_F\,.</math>

'''Proof.''' Let <math>\,E=\Delta-D\,</math>; we have <math>\,W(\Delta)-W(D)=-\frac{1}{2}JEJ\,</math>. Writing <math>\,E=JEJ+(E-JEJ)\,</math>, and noting that the two terms are orthogonal in the Frobenius inner product, we get

<math>\|E\|_F^2=\|JEJ\|_F^2+\|E-JEJ\|_F^2\geqslant\|JEJ\|_F^2\,.\qquad\square</math>
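Numerically, the proof's key step is that <math>\,X\mapsto JXJ\,</math> is an orthogonal projection in the Frobenius inner product; a sketch of the Pythagorean identity used above:

<pre>
import numpy as np

rng = np.random.default_rng(3)
n = 6
E = rng.standard_normal((n, n)); E = (E + E.T) / 2   # stands in for Delta - D
J = np.eye(n) - np.ones((n, n)) / n                  # centering matrix

JEJ = J @ E @ J
lhs = np.linalg.norm(E, 'fro') ** 2
rhs = np.linalg.norm(JEJ, 'fro') ** 2 + np.linalg.norm(E - JEJ, 'fro') ** 2
assert np.isclose(lhs, rhs)                 # ||E||^2 = ||JEJ||^2 + ||E - JEJ||^2
assert np.linalg.norm(JEJ, 'fro') <= np.linalg.norm(E, 'fro')
</pre>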
Main result
In this section we show an inequality involving the criteria <math>\,f\,</math> in (12) and <math>\,f_s\,</math> in (13).
;Theorem.
For any <math>\,s\in\Bbb R^n\,</math> such that <math>\,s^te=1\,</math> and for any <math>\,p\,</math> we have

<math>f\leqslant f_s\,.</math>
'''Proof.''' We show that for all <math>\,i\,</math>, <math>\,|\lambda_i(W)|\leqslant|\lambda_i(W_s)|\,</math>.
Toward this end, we consider the two cases:
* If <math>\,W\,</math> is psd then <math>\,W_s\,</math> is psd and the inequality becomes <math>\,\lambda_i(W)\leqslant\lambda_i(W_s)\,</math>. But

<math>\lambda_i(W)=\lambda_i(JW_sJ)\leqslant\lambda_i(W_s)\,\lambda_1(J)=\lambda_i(W_s)</math>

because <math>\,\lambda_1(J)=1\,</math>.
* If <math>\,W\,</math> is not psd, then using the definition of <math>\,J\,</math>:

<math>\begin{array}{rcl}
\lambda_i^2(W)=\lambda_i(W^2) &=& \lambda_i(JW_sJW_sJ)=\lambda_i(W_sJW_sJ)\\
&=& \lambda_i\left(W_sJW_s-\frac{1}{n}\,ee^tW_sJW_s\right)\\
&\leqslant& \lambda_i(W_sJW_s)+\lambda_1\left(-\frac{1}{n}\,ee^tW_sJW_s\right)
\end{array}</math>

But

<math>\lambda_1\left(-\frac{1}{n}\,ee^tW_sJW_s\right)=0</math>

since this matrix has rank one and its only possibly nonzero eigenvalue equals <math>\,-\frac{1}{n}(W_se)^tJ(W_se)\leqslant 0\,</math>. Because <math>\,J\,</math> and <math>\,W_s^2\,</math> are PSD, we have

<math>\lambda_i(W_sJW_s)\leqslant\lambda_i(W_s^2)\,\lambda_1(J)=\lambda_i(W_s^2)\,.\qquad\square</math>
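A randomized sanity check of the theorem (a sketch under our conventions: <math>\,f\,</math> and <math>\,f_s\,</math> computed as the tail sums (12) and (13), for a random predistance matrix and random admissible <math>\,s\,</math>):

<pre>
import numpy as np

rng = np.random.default_rng(4)
n, p = 7, 2
D0 = rng.random((n, n))
Delta = D0 + D0.T
np.fill_diagonal(Delta, 0.0)                         # generally non-Euclidean

def W_of(Delta, s):
    J_s = np.eye(n) - np.outer(np.ones(n), s)
    return -0.5 * J_s @ Delta @ J_s.T

def crit(M):                                         # sum_{i>p} lambda_i(M)^2
    lam = np.sort(np.linalg.eigvalsh(M))[::-1]
    return np.sum(lam[p:] ** 2)

f = crit(W_of(Delta, np.full(n, 1.0 / n)))           # s = e/n gives W, hence f
for _ in range(100):
    s = rng.random(n); s /= s.sum()
    assert f <= crit(W_of(Delta, s)) + 1e-9          # f <= f_s
</pre>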
Modified Gower problem
In this section we consider the following problem: given a non-Euclidean matrix <math>\,\Delta\,</math>, can we find an <math>\,s\,</math> that maximizes the total squared real distances '''from the points to the centroid given by''' <math>\,s\,</math> in the fitted configuration? What is this choice of <math>\,s\,</math>?
This problem can be written as an optimization problem in the following manner. First remark that if <math>\,\Delta\,</math> is not Euclidean, the number of positive eigenvalues of <math>\,W_s=W_s(\Delta)\,</math> does not depend on <math>\,s\,</math>; call this number <math>\,p\,</math>.
The total squared real distances '''from the points to the centroid given by''' <math>\,s\,</math> in the fitted configuration can be written as

<math>\sum_{i=1}^p\lambda_i(W_s)</math>

where <math>\,\lambda_i(W_s)\,</math> denotes the <math>\,i\,</math>th eigenvalue of <math>\,W_s\,</math>. But by a well-known result we have

<math>\sum_{i=1}^p\lambda_i(W_s)=\max_{X^tX=\mathbf{I}_p}\;\text{trace}(X^tW_sX)</math>

where <math>\,X\,</math> is an <math>\,n\times p\,</math> matrix and <math>\,\mathbf{I}_p\,</math> is the <math>\,p\times p\,</math> identity matrix. So the final optimization problem can be written as

<math>\max_{s^te=1,\,s\geqslant 0}\;\max_{X^tX=\mathbf{I}_p}\;\text{trace}(X^tW_sX)\qquad(20)</math>

where

<math>X^tW_sX=X^tWX-X^tWse^tX-X^tes^tWX+X^tes^tWse^tX\,.</math>
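The "well known result" invoked here is the Ky Fan characterization of the sum of the <math>\,p\,</math> largest eigenvalues; the maximum is attained at the <math>\,p\,</math> leading eigenvectors, as this sketch confirms:

<pre>
import numpy as np

rng = np.random.default_rng(5)
n, p = 6, 2
M = rng.standard_normal((n, n)); M = (M + M.T) / 2   # stands in for W_s

lam, V = np.linalg.eigh(M)
lam, V = lam[::-1], V[:, ::-1]                       # decreasing order

X = V[:, :p]                                         # satisfies X^t X = I_p
assert np.isclose(np.trace(X.T @ M @ X), lam[:p].sum())
</pre>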
'''Question:''' Is it true that at the optimum, problem (20) is equivalent to the problem

<math>\max_{X^tX=\mathbf{I}_p}\;\max_{s^te=1,\,s\geqslant 0}\;\text{trace}(X^tW_sX)\,?\qquad(21)</math>
References
[1] Critchley, F., 1986. On certain linear mappings between inner-product and squared-distance matrices. Linear Algebra Appl. 105, 91-107.

[2] De Leeuw, J., Heiser, W., 1982. Theory of multidimensional scaling. In: Krishnaiah, P.R., Kanal, L.N. (Eds.), Handbook of Statistics, vol. 2. North-Holland, Amsterdam, pp. 285-316 (Chapter 13).

[3] Gower, J.C., 1966. Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika 53, 315-328.

[4] Gower, J.C., 1982. Euclidean distance geometry. Math. Scientist 7, 1-14.

[5] Schoenberg, I.J., 1935. Remarks to Maurice Fréchet's article "Sur la définition axiomatique d'une classe d'espaces distanciés vectoriellement applicable sur l'espace de Hilbert". Ann. of Math. 38, 724-738.

[6] Torgerson, W.S., 1952. Multidimensional scaling: I. Theory and method. Psychometrika 17, 401-419.

[7] Trosset, M.W., 1997. Numerical algorithms for multidimensional scaling. In: Klar, R., Opitz, P. (Eds.), Classification and Knowledge Organization. Springer, Berlin, pp. 80-92.