Proximity Problems

by M. Bennani Dosse

= Abstract =

The aim of this short paper is to give an algebraic result that relates two criteria in multidimensional scaling...

;Key-words
Euclidean distance, multidimensional scaling, strain, sstress, comparing criteria.

= Introduction =

We consider an <math>\,n\times n\,</math> matrix <math>\,D=[d_{ij}]\,</math> defined as a real symmetric matrix that is ''hollow'' (<math>\,d_{ii}=0\,</math> for <math>\,i=1\ldots n\,</math>) and nonnegative (<math>\,d_{ij}\geq 0\,</math> for all <math>\,i,j\,</math>).

<math>D\,</math> is said to be a Euclidean distance matrix of dimension <math>\,p\,</math> if there exists a list of points <math>\,\{z_1\ldots z_n\}\,</math> in <math>\,\mathbb{R}^p\,</math> <math>\,(p\leq n-1)\,</math> such that

<math>d_{ij}=\|z_i-z_j\|^2 \quad\forall\,i,j=1\ldots n</math>

where <math>\,\|~\|\,</math> denotes the Euclidean norm. Denote by <math>\,\mathbb{EDM}^n(p)\,</math> the set of all <math>\,n\times n\,</math> Euclidean distance matrices of dimension <math>\,p\,</math>.
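
For concreteness, here is a minimal numerical sketch (ours, not from the paper; it assumes NumPy, and the point list and sizes are arbitrary illustrations) that builds such a <math>\,D\,</math> from points in <math>\,\mathbb{R}^p\,</math> and checks the three defining properties:

<pre>
import numpy as np

# illustrative sketch: build D = [||z_i - z_j||^2] from a point list
n, p = 6, 2
rng = np.random.default_rng(0)
Z = rng.standard_normal((n, p))        # rows are the points z_1 ... z_n

G = Z @ Z.T                            # Gram matrix of inner products
d = np.diag(G)
D = d[:, None] + d[None, :] - 2 * G    # d_ij = ||z_i - z_j||^2

assert np.allclose(D, D.T)             # symmetric
assert np.allclose(np.diag(D), 0.0)    # hollow
assert (D >= -1e-12).all()             # nonnegative (up to rounding)
</pre>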

A problem common to various sciences is to find the Euclidean distance matrix <math>\,D\in\mathbb{EDM}^n(p)\,</math> closest, in some sense, to a given ''predistance matrix'' <math>\,\Delta=[\delta_{ij}]\,</math> defined to be any symmetric hollow nonnegative real matrix. There are three statements of the closest-EDM problem prevalent in the literature, the multiplicity due primarily to choice of projection on the EDM versus positive semidefinite (PSD) cone and vacillation between the distance-square variable <math>\,d_{ij}\,</math> versus absolute distance <math>\,\sqrt{d_{ij}}\,</math>.

During the past two decades a large amount of work has been devoted to Euclidean distance matrices and approximation of predistances by an <math>\,\mathbb{EDM}^n(p)\,</math> in a series of works including Gower [6-8], Mathar..., Critchley..., Hayden et al..., etc.

= Mathematical preliminaries =

It is well known that <math>\,D\in\mathbb{EDM}^n(p)\,</math> if and only if the symmetric <math>\,n\times n\,</math> matrix

<math>W_s(D)=-\frac{1}{2}(I-es^{\rm T})D(I-se^{\rm T})\qquad(1)</math>

is positive semidefinite with <math>\,\text{rank}(W_s(D))\leq p\,</math>, where <math>\,e\,</math> is a vector of ones and <math>\,s\,</math> is any vector such that <math>\,s^{\rm T}e=1\,</math>.

This result was proved by Gower... as a generalization of an earlier result of Schoenberg... Later Gower considered the particular choices <math>\,s=\frac{1}{n}e\,</math> and <math>\,s=e_i\,</math> where <math>\,e_i\,</math> is the <math>\,i^\text{th}\,</math> vector from the standard basis. In what follows, when <math>\,s=\frac{1}{n}e\,</math>, the matrix <math>\,W_s(D)\,</math> will be denoted <math>\,W(D)\,</math>:

<math>W(D)=-\frac{1}{2}(I-\frac{1}{n}ee^{\rm T})D(I-\frac{1}{n}ee^{\rm T})\qquad(2)</math>

We see no compelling reason to prefer one particular <math>\,s\,</math> over another. Each has its own coherent interpretation. Neither can we say any particular problem formulation produces generally better results than another. Dattorro...

The aim of this short paper is to clarify that point...

We shall also use Wolkowicz' notation:

<math>\begin{array}{rcl}
J_s &=& I-es^{\rm T}\qquad(3)\\
J   &=& I-\frac{1}{n}ee^{\rm T}\qquad(4)
\end{array}</math>

so that equations (1) and (2) can be written:

<math>\begin{array}{rcl}
W_s(D) &=& -\frac{1}{2}J_sDJ_s^{\rm T} \\
W(D)   &=& -\frac{1}{2}JDJ
\end{array}</math>

It is easy to verify the following properties:

<math>\begin{array}{c}
J^{\rm T}=J,\;J^2=J,\;Je=0\\
J_s^2=J_s,\;J_se=0,\;s^{\rm T}J_s=0\\
JJ_s=J,\;J_sJ=J_s\\
W=JW_sJ\\
W_s=J_sWJ_s^{\rm T}
\end{array}</math>
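
These identities are easy to verify numerically. A short sketch (ours; it assumes NumPy, and the helper name <code>W_s_of</code> is an illustration, not the paper's notation) that forms <math>\,W_s(D)\,</math> via <math>\,J_s\,</math> and confirms Gower's criterion on a <math>\,D\,</math> built from points in <math>\,\mathbb{R}^p\,</math>:

<pre>
import numpy as np

def W_s_of(D, s):
    """W_s(D) = -(1/2)(I - e s^T) D (I - s e^T), as in eq. (1)."""
    n = D.shape[0]
    Js = np.eye(n) - np.outer(np.ones(n), s)   # J_s = I - e s^T, eq. (3)
    return -0.5 * Js @ D @ Js.T

# a true Euclidean distance matrix of dimension p
n, p = 6, 2
rng = np.random.default_rng(1)
Z = rng.standard_normal((n, p))
G = Z @ Z.T
D = np.diag(G)[:, None] + np.diag(G)[None, :] - 2 * G

s = np.full(n, 1.0 / n)                 # the choice s = e/n gives W(D)
lam = np.linalg.eigvalsh(W_s_of(D, s))
assert lam.min() > -1e-10               # positive semidefinite
assert (lam > 1e-10).sum() <= p         # rank(W_s(D)) <= p
</pre>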

= Classical Multidimensional Scaling =

Given <math>\,p\leq n\,</math>, let <math>\,\mathbb{S}_+^n(p)\,</math> denote the closed set of symmetric <math>\,n\times n\,</math> matrices that are positive semidefinite and have rank no greater than <math>\,p\,</math>.

Let <math>\,\|~\|_{\rm F}\,</math> denote the Frobenius norm and <math>\,\Delta\,</math> a given symmetric <math>\,n\times n\,</math> matrix of squared dissimilarities. Let <math>\,W=W(\Delta)\,</math> and <math>\,W_s=W_s(\Delta)\,</math>.

Classical MDS can be defined by the optimization problem

<math>\begin{array}{rl}
\text{minimize}_B&\|W-B\|_{\rm F}^2\\
\text{subject to}&B\in\mathbb{S}_+^n(p)
\end{array}~~~~~~~~~~~~~~~~~~~~~~~~\textbf{(P)}</math>

Problem '''(P)''' can be viewed as a particular case of a more general optimization problem

<math>\begin{array}{rl}
\text{minimize}_B&\|W_s-B\|_{\rm F}^2\\
\text{subject to}&B\in\mathbb{S}_+^n(p)
\end{array}~~~~~~~~~~~~~~~~~~~~~~~~(\textbf{P}_\textbf{s})</math>

The following explicit solution to problem '''(P)''' (respectively problem '''(P<sub>s</sub>)''') is well known: let <math>\,\lambda_1\geq\ldots\geq\lambda_n\,</math> denote the eigenvalues of <math>\,W\,</math> (respectively of <math>\,W_s\,</math>) and <math>\,v_1\ldots v_n\,</math> denote the corresponding eigenvectors.

Assume that the <math>\,p\,</math> largest eigenvalues are positive. Then

<math>B^\star=\sum_{i=1}^p\lambda_iv_iv_i^{\rm T}</math>

is a global minimum of problem '''(P)''' (respectively of problem '''(P<sub>s</sub>)'''). Furthermore, the minimum value for problem '''(P)''' is

<math>f=\sum_{i=p+1}^n\lambda_i^2(W)\qquad(12)</math>

and for problem '''(P<sub>s</sub>)'''

<math>f_s=\sum_{i=p+1}^n\lambda_i^2(W_s)\qquad(13)</math>
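
The explicit solution translates directly into code. A sketch (ours, assuming NumPy; the function name <code>classical_mds</code> is an illustration) that keeps the <math>\,p\,</math> largest eigenpairs and returns the minimizer together with the minimum value <math>\,f\,</math> or <math>\,f_s\,</math>:

<pre>
import numpy as np

def classical_mds(Ws, p):
    """Explicit solution of (P) / (P_s): keep the p largest eigenpairs.

    Returns B* = sum_{i<=p} lambda_i v_i v_i^T and the minimum value
    sum_{i>p} lambda_i^2; assumes, as in the text, that the p largest
    eigenvalues are positive."""
    lam, V = np.linalg.eigh(Ws)        # eigh returns ascending order
    lam, V = lam[::-1], V[:, ::-1]     # reorder to nonincreasing
    B_star = (V[:, :p] * lam[:p]) @ V[:, :p].T
    return B_star, float(np.sum(lam[p:] ** 2))

# usage sketch: f   = classical_mds(W,  p)[1]   with W  = W(Delta)
#               f_s = classical_mds(Ws, p)[1]   with Ws = W_s(Delta)
</pre>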

In the '''Main result''' section below, we will prove that for any squared dissimilarity matrix <math>\,\Delta\,</math> we have

<math>f\leq f_s</math>

that is, at the minimum, the strain criterion always gives a smaller value than criterion '''(P<sub>s</sub>)'''. In order to show this result we shall use the following lemma.

;Lemma
Let <math>\,\lambda(C)\in\mathbb{R}^n\,</math> denote the eigenvalues of a symmetric <math>\,n\times n\,</math> matrix <math>\,C\,</math>, arranged in nonincreasing order.
* For all <math>\,A,B\,</math>
<math>\lambda_i(A+B) \leq \lambda_i(A)+\lambda_1(B)</math>
* For all positive semidefinite <math>\,A,B\,</math>
<math>\lambda_i(A\,B)\leq \lambda_i(A)\lambda_1(B)\qquad\diamond</math>


'''Proof.''' See, for instance, Wilkinson...
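
Both inequalities are easy to spot-check numerically. A sketch (ours, assuming NumPy; sizes and trial counts are arbitrary):

<pre>
import numpy as np

def eigs_desc(M):
    """Eigenvalues of a symmetric matrix in nonincreasing order."""
    return np.sort(np.linalg.eigvalsh(M))[::-1]

rng = np.random.default_rng(5)
n = 6
for _ in range(100):
    A = rng.standard_normal((n, n)); A = (A + A.T) / 2
    B = rng.standard_normal((n, n)); B = (B + B.T) / 2
    # lambda_i(A+B) <= lambda_i(A) + lambda_1(B)
    assert np.all(eigs_desc(A + B) <= eigs_desc(A) + eigs_desc(B)[0] + 1e-9)

    X = rng.standard_normal((n, n)); P = X @ X.T   # PSD
    Y = rng.standard_normal((n, n)); Q = Y @ Y.T   # PSD
    mu = np.sort(np.linalg.eigvals(P @ Q).real)[::-1]  # spectrum of PQ is real
    # lambda_i(PQ) <= lambda_i(P) * lambda_1(Q)
    assert np.all(mu <= eigs_desc(P) * eigs_desc(Q)[0] + 1e-6)
</pre>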

= Comparing strain and sstress =

In this section we recall a result (see [2]) that relates the strain and sstress criteria. The sstress criterion is given by:

<math>\begin{array}{rl}
\text{minimize}&S(D)=\|\Delta-D\|^2\\
\text{subject to}&D\in\mathbb{EDM}^n(p)
\end{array}</math>

'''Result.''' The following inequality holds: given <math>\,p\leq n-1\,</math>, for any <math>\,B\in\mathbb{S}_+^n(p)\,</math>, let <math>\,D=\text{diag}(B)e^{\rm T}+e\;\text{diag}(B)^{\rm T}-2B\,</math>. Then

<math>\|\Delta-D\|^2 \geq 4\|W-B\|^2</math>


'''Proof.''' Let <math>\,B\in\mathbb{S}_+^n(p)\,</math>; we have

<math>\begin{array}{rcl}
\delta_{ij} &=& w_{ii}+w_{jj}-2w_{ij}\\
d_{ij} &=& b_{ii}+b_{jj}-2b_{ij}
\end{array}</math>

Writing <math>\,a_{ij}=w_{ij}-b_{ij}\,</math> we get

<math>\sum_i\sum_j (\delta_{ij}-d_{ij})^2=2n\,\sum_ia_{ii}^2+4\sum_i\sum_j a_{ij}^2\geq 4\sum_i\sum_ja_{ij}^2\qquad\diamond</math>
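
A numerical check of the Result (our sketch, not the paper's code; it assumes NumPy, and for simplicity it tests a centered <math>\,B\,</math> with <math>\,Be=0\,</math>, the case produced by classical MDS solutions):

<pre>
import numpy as np

n, p = 7, 3
rng = np.random.default_rng(2)

A = np.abs(rng.standard_normal((n, n)))
Delta = (A + A.T) / 2
np.fill_diagonal(Delta, 0.0)           # random predistance matrix

J = np.eye(n) - np.ones((n, n)) / n
W = -0.5 * J @ Delta @ J               # W = W(Delta), eq. (2)

X = rng.standard_normal((n, p))
B = (J @ X) @ (J @ X).T                # centered element of S_+^n(p)
d = np.diag(B)
D = d[:, None] + d[None, :] - 2 * B    # D = diag(B)e^T + e diag(B)^T - 2B

lhs = np.sum((Delta - D) ** 2)         # ||Delta - D||^2
rhs = 4.0 * np.sum((W - B) ** 2)       # 4 ||W - B||^2
assert lhs >= rhs - 1e-9
</pre>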

= Main result =

In this section we show an inequality involving the criteria <math>\,f\,</math> in (12) and <math>\,f_s\,</math> in (13).

;Theorem.
For any <math>\,s\in\mathbb{R}^n\,</math> such that <math>\,s^{\rm T}e=1\,</math> and for any <math>\,p\,</math> we have

<math>\,f \leq f_s\qquad(18)</math>


'''Proof.''' We show, for all <math>\,i\,</math>, that <math>\,|\lambda_i(W)|\leq|\lambda_i(W_s)|\,</math>.
Toward that end, we consider two cases:
* If <math>\,W\,</math> is PSD then <math>\,W_s\,</math> is PSD and the inequality becomes <math>\,\lambda_i(W)\leq \lambda_i(W_s)\,</math>. But
<math>\lambda_i(W)=\lambda_i(JW_sJ)\leq \lambda_i(W_s)\lambda_1(J)=\lambda_i(W_s)</math>
because <math>\,\lambda_1(J)=1\,</math>.

* If <math>\,W\,</math> is not PSD then, using the definition of <math>\,J\,</math>:
<math>\begin{array}{rcl}
\lambda_i(W^2) &=& \lambda_i(W_sJW_s-\frac{1}{n}ee^{\rm T}W_sJW_s)\\
&\leq& \lambda_i(W_sJW_s)+\lambda_1(-\frac{1}{n}ee^{\rm T}W_sJW_s)
\end{array}</math>

But

<math>\lambda_1(-\frac{1}{n}ee^{\rm T}W_sJW_s)= 0</math>

and, because <math>\,J\,</math> and <math>\,W_s^2\,</math> are PSD, we have

<math>\lambda_i(W_sJW_s)\leq\lambda_i(W_s^2)\lambda_1(J)=\lambda_i(W_s^2)\qquad\diamond</math>
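
The theorem can also be probed numerically. A sketch (ours, assuming NumPy; the random <math>\,\Delta\,</math>, sizes, and trial count are arbitrary) comparing <math>\,f\,</math> with <math>\,f_s\,</math> over many feasible <math>\,s\,</math>:

<pre>
import numpy as np

def tail(Ws, p):
    """f or f_s: the sum of squared eigenvalues beyond the p largest."""
    lam = np.sort(np.linalg.eigvalsh(Ws))[::-1]
    return float(np.sum(lam[p:] ** 2))

n, p = 8, 3
rng = np.random.default_rng(3)
A = np.abs(rng.standard_normal((n, n)))
Delta = (A + A.T) / 2
np.fill_diagonal(Delta, 0.0)

e = np.ones(n)
J = np.eye(n) - np.outer(e, e) / n
f = tail(-0.5 * J @ Delta @ J, p)          # strain value f, s = e/n

for _ in range(100):
    s = rng.random(n); s /= s.sum()        # s^T e = 1
    Js = np.eye(n) - np.outer(e, s)
    f_s = tail(-0.5 * Js @ Delta @ Js.T, p)
    assert f <= f_s + 1e-9                 # eq. (18)
</pre>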

= Modified Gower problem =

In this section we consider the following problem: given a non-Euclidean matrix, can we find an <math>\,s\,</math> that maximizes the total squared real distances from the points to the centroid given by <math>\,s\,</math> in the fitted configuration? What is this choice of <math>\,s\,</math>?

This problem can be written as an optimization problem in the following manner. First note that if <math>\,\Delta\,</math> is not Euclidean, then the number of negative eigenvalues of <math>\,W_s=W_s(\Delta)\,</math> does not depend on <math>\,s\,</math>; call that number <math>\,p\,</math>.

The total squared real distances from the points to the centroid given by <math>\,s\,</math> in the fitted configuration can be written as

<math>\sum_{i=1}^p\lambda_i(W_s)\qquad(19)</math>

where <math>\,\lambda_i(W_s)\,</math> denotes the <math>\,i^{\rm th}\,</math> eigenvalue of <math>\,W_s\,</math>. But by a well-known result... we have, for <math>\,X\!\in\mathbb{R}^{n\times p}\,</math>

<math>\sum_{i=1}^p\lambda_i(W_s)=\max_{X^{\rm T}X=I}\;\mathrm{tr}(X^{\rm T}W_sX)</math>

So the final optimization problem can be written as

<math>\max_{s^{\rm T}e=1,\,s\succeq 0}\;\max_{X^{\rm T}X=I}\;\mathrm{tr}(X^{\rm T}W_sX)\qquad(20)</math>

where

<math>\,X^{\rm T}W_sX=X^{\rm T}WX-X^{\rm T}Wse^{\rm T}X-X^{\rm T}es^{\rm T}WX+X^{\rm T}es^{\rm T}Wse^{\rm T}X\,</math>
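
A sketch of how the objective of problem (20) can be evaluated (ours, assuming NumPy; <code>gower_objective</code> and the crude random search are illustrations, placeholders for a real optimization method). For fixed feasible <math>\,s\,</math> the inner maximum over <math>\,X^{\rm T}X=I\,</math> equals <math>\,\sum_{i=1}^p\lambda_i(W_s)\,</math>, so the outer problem can be explored by sampling <math>\,s\,</math>:

<pre>
import numpy as np

def gower_objective(Delta, s, p):
    """Sum of the p largest eigenvalues of W_s(Delta): the value of the
    inner maximum over X^T X = I in problem (20)."""
    n = Delta.shape[0]
    Js = np.eye(n) - np.outer(np.ones(n), s)
    Ws = -0.5 * Js @ Delta @ Js.T
    lam = np.sort(np.linalg.eigvalsh(Ws))[::-1]
    return float(np.sum(lam[:p]))

n, p = 8, 3
rng = np.random.default_rng(4)
A = np.abs(rng.standard_normal((n, n)))
Delta = (A + A.T) / 2
np.fill_diagonal(Delta, 0.0)

# crude random search over the feasible set {s : s^T e = 1, s >= 0}
cands = rng.random((200, n))
cands /= cands.sum(axis=1, keepdims=True)
best_s = max(cands, key=lambda s: gower_objective(Delta, s, p))
print(best_s, gower_objective(Delta, best_s, p))
</pre>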

;Question
Is it true that, at the optimum, problem (20) is equivalent to the problem...

<math>\max_{X^{\rm T}X=I}\;\max_{s^{\rm T}e=1,\,s\succeq 0}\;\mathrm{tr}(X^{\rm T}W_sX)\qquad(21)</math>

= References =

[1] Critchley, F., 1986. On certain linear mappings between inner-product and squared-distance matrices. Linear Algebra Appl. 105, 91-107.

[2] De Leeuw, J., Heiser, W., 1982. Theory of multidimensional scaling. In: Krishnaiah, P.R., Kanal, L.N. (Eds.), Handbook of Statistics, vol. 2. North-Holland, Amsterdam, pp. 285-316 (Chapter 13).

[3] Gower, J.C., 1966. Some distance properties of latent root and vector methods in multivariate analysis. Biometrika 53, 315-328.

[4] Gower, J.C., 1982. Euclidean distance geometry, Math. Scientist 7, 1-14.

[5] Schoenberg, I.J., 1935. Remarks to Maurice Fréchet's article Sur la définition axiomatique d'une classe d'espaces distanciés vectoriellement applicable sur l'espace de Hilbert. Ann. of Math. 38, 724-738.

[6] Torgerson, W.S., 1952. Multidimensional scaling: I. Theory and method. Psychometrika 17, 401-419.
