Proximity Problems


by M. Bennani Dosse


= Abstract =

The aim of this short paper is to give an algebraic result that relates two criteria in multidimensional scaling...

= Key-words =
Euclidean distance, multidimensional scaling, strain, sstress, comparing criteria.

= Introduction =

We consider an <math>\,n\times n\,</math> matrix <math>\,D=[d_{ij}]\,</math> defined as a real symmetric matrix that is ''hollow'' (<math>\,d_{ii}=0\,</math> for <math>\,i=1\ldots n\,</math>) and nonnegative (<math>\,d_{ij}\geq 0\,</math> for all <math>\,i,j\,</math>).

<math>D\,</math> is said to be a Euclidean distance matrix of dimension <math>\,p\,</math> if there exists a list of points <math>\,\{z_1\ldots z_n\}\,</math> in <math>\,\mathbb{R}^p\,</math> <math>\,(p\leq n-1)\,</math> such that

<math>d_{ij}=\|z_i-z_j\|^2 \quad\forall\,i,j=1\ldots n</math>

where <math>\,\|~\|\,</math> denotes Euclidean norm. Denote by <math>\,\mathbb{EDM}^n(p)\,</math> the set of all <math>\,n\times n\,</math> Euclidean distance matrices of dimension <math>\,p\,</math>.

A problem common to various sciences is to find the Euclidean distance matrix <math>\,D\in\mathbb{EDM}^n(p)\,</math> closest, in some sense, to a given ''predistance matrix'' <math>\,\Delta=[\delta_{ij}]\,</math> defined to be any symmetric hollow nonnegative real matrix. There are three statements of the closest-EDM problem prevalent in the literature, the multiplicity due primarily to choice of projection on the EDM versus positive semidefinite (PSD) cone and vacillation between the distance-square variable <math>\,d_{ij}\,</math> versus absolute distance <math>\,\sqrt{d_{ij}}\,</math>.

During the past two decades a large amount of work has been devoted to Euclidean distance matrices and approximation of predistances by an <math>\,\mathbb{EDM}^n(p)\,</math> in a series of works including Gower[6-8], Mathar..., Critchley..., Hayden et al..., etc.

= Mathematical preliminaries =

It is well known that <math>\,D\in\mathbb{EDM}^n(p)\,</math> if and only if the symmetric <math>\,n\times n\,</math> matrix

<math>W_s(D)=-\frac{1}{2}(I-es^{\rm T})D(I-se^{\rm T})\qquad(1)</math>

is positive semidefinite with <math>\,\text{rank}(W_s(D))\leq p\,</math>, where <math>\,e\,</math> is a vector of ones and <math>\,s\,</math> is any vector such that <math>\,s^{\rm T}e=1\,</math>.
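
As an illustration of criterion (1) (not part of the paper), here is a minimal numerical sketch assuming NumPy; the points, the sizes <math>\,n,p\,</math>, and the random choice of <math>\,s\,</math> are arbitrary. It builds a <math>\,D\,</math> from points in <math>\,\mathbb{R}^p\,</math> and confirms that <math>\,W_s(D)\,</math> is PSD with rank at most <math>\,p\,</math>:

<pre>
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 2
Z = rng.standard_normal((n, p))            # points z_1 ... z_n in R^p
G = Z @ Z.T                                # Gram matrix
D = np.diag(G)[:, None] + np.diag(G)[None, :] - 2*G   # d_ij = ||z_i - z_j||^2

e = np.ones((n, 1))
s = rng.random((n, 1)); s /= e.T @ s       # any s with s'e = 1
Ws = -0.5 * (np.eye(n) - e @ s.T) @ D @ (np.eye(n) - s @ e.T)   # equation (1)

print(np.linalg.eigvalsh(Ws).min() >= -1e-9)        # W_s(D) is PSD
print(np.linalg.matrix_rank(Ws, tol=1e-9) <= p)     # rank(W_s(D)) <= p
</pre>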

This result was proved by Gower... as a generalization of an earlier result of Schoenberg... Later Gower considered the particular choices <math>\,s=\frac{1}{n}e\,</math> and <math>\,s=e_i\,</math>, where <math>\,e_i\,</math> is the <math>\,i^{\text{th}}\,</math> vector from the standard basis. In what follows, when <math>\,s=\frac{1}{n}e\,</math>, the matrix <math>\,W_s(D)\,</math> will be denoted by <math>\,W(D)\,</math>:

<math>W(D)=-\frac{1}{2}(I-\frac{1}{n}ee^{\rm T})D(I-\frac{1}{n}ee^{\rm T})\qquad(2)</math>

''We see no compelling reason to prefer one particular'' <math>\,s\,</math> ''over another. Each has its own coherent interpretation. Neither can we say any particular problem formulation produces generally better results than another.'' Dattorro...

The aim of this short paper is to clarify that point...

We shall also use [http://orion.uwaterloo.ca/~hwolkowi Wolkowicz'] notation:

<math>\begin{array}{rcl}
J_s &=& I-es^{\rm T}\qquad(3)\\
J &=& I-\frac{1}{n}ee^{\rm T}\qquad(4)
\end{array}</math>

so equations (1) and (2) can be written:

<math>\begin{array}{rcl}
W_s(D) &=& -\frac{1}{2}J_sDJ_s^{\rm T}\\
W(D) &=& -\frac{1}{2}JDJ
\end{array}</math>

It is easy to verify the following properties:

<math>\begin{array}{c}
J^{\rm T}=J,\;J^2=J,\;Je=0\\
J_s^2=J_s,\;J_se=0,\;s^{\rm T}J_s=0\\
JJ_s=J,\;J_sJ=J_s\\
W=JW_sJ\\
W_s=J_sWJ_s^{\rm T}
\end{array}</math>
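
These identities are easy to confirm numerically. A short sketch, assuming NumPy; the hollow symmetric test matrix <math>\,D\,</math> and the vector <math>\,s\,</math> are arbitrary illustrative inputs:

<pre>
import numpy as np

rng = np.random.default_rng(1)
n = 5
e = np.ones((n, 1))
s = rng.random((n, 1)); s /= e.T @ s       # s'e = 1
J  = np.eye(n) - e @ e.T / n               # equation (4)
Js = np.eye(n) - e @ s.T                   # equation (3)

M = rng.random((n, n)); D = M + M.T
np.fill_diagonal(D, 0)                     # hollow symmetric test matrix
W  = -0.5 * J  @ D @ J
Ws = -0.5 * Js @ D @ Js.T

print(np.allclose(J @ J, J), np.allclose(J @ e, 0))
print(np.allclose(Js @ Js, Js), np.allclose(Js @ e, 0), np.allclose(s.T @ Js, 0))
print(np.allclose(J @ Js, J), np.allclose(Js @ J, Js))
print(np.allclose(W, J @ Ws @ J), np.allclose(Ws, Js @ W @ Js.T))
</pre>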

= Classical Multidimensional Scaling =

Given <math>\,p\leq n\,</math>, let <math>\,\mathbb{S}_+^n(p)\,</math> denote the closed set of symmetric <math>\,n\times n\,</math> matrices that are positive semidefinite and have rank no greater than <math>\,p\,</math>.

Let <math>\,\|\cdot\|_{\rm F}\,</math> denote the Frobenius norm and <math>\,\Delta\,</math> a given symmetric <math>\,n\times n\,</math> matrix of squared dissimilarities. Let <math>\,W=W(\Delta)\,</math> and <math>\,W_s=W_s(\Delta)\,</math>.

Classical MDS can be defined by the optimization problem

<math>\begin{array}{rl}
\text{minimize}_B&\|W-B\|_{\rm F}^2\\
\text{subject to}&B\in\mathbb{S}_+^n(p)
\end{array}\qquad\textbf{(P)}</math>

Problem '''(P)''' can be viewed as a particular case of a more general optimization problem

<math>\begin{array}{rl}
\text{minimize}_B&\|W_s-B\|_{\rm F}^2\\
\text{subject to}&B\in\mathbb{S}_+^n(p)
\end{array}\qquad(\textbf{P}_\textbf{s})</math>

The following explicit solution to problem '''(P)''' (respectively problem '''(P<sub>s</sub>)''') is well known: let <math>\,\lambda_1\geq\ldots\geq\lambda_n\,</math> denote the eigenvalues of <math>\,W\,</math> (respectively of <math>\,W_s\,</math>) and <math>\,v_1\ldots v_n\,</math> denote the corresponding eigenvectors.

Assume that the <math>\,p\,</math> largest eigenvalues are positive. Then

<math>B^\star=\sum_{i=1}^p\lambda_iv_iv_i^{\rm T}</math>

is a global minimum of problem '''(P)''' (respectively of problem '''(P<sub>s</sub>)'''). Furthermore, the minimum value for problem '''(P)''' is

<math>f=\sum_{i=p+1}^n\lambda_i^2(W)\qquad(12)</math>

and for problem '''(P<sub>s</sub>)'''

<math>f_s=\sum_{i=p+1}^n\lambda_i^2(W_s)\qquad(13)</math>
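
This spectral solution is straightforward to reproduce; a NumPy sketch (the matrix <math>\,\Delta\,</math> and the sizes <math>\,n,p\,</math> are illustrative, chosen so that the <math>\,p\,</math> largest eigenvalues of <math>\,W\,</math> are positive, as assumed above) that checks <math>\,f=\|W-B^\star\|_{\rm F}^2\,</math>:

<pre>
import numpy as np

rng = np.random.default_rng(2)
n, p = 6, 2
M = rng.random((n, n)); Delta = M + M.T
np.fill_diagonal(Delta, 0)                 # squared-dissimilarity matrix

e = np.ones((n, 1)); J = np.eye(n) - e @ e.T / n
W = -0.5 * J @ Delta @ J

lam, V = np.linalg.eigh(W)                 # ascending eigenvalues
lam, V = lam[::-1], V[:, ::-1]             # reorder: lambda_1 >= ... >= lambda_n
B_star = V[:, :p] * lam[:p] @ V[:, :p].T   # B* = sum_{i<=p} lambda_i v_i v_i'
f = np.sum(lam[p:] ** 2)                   # minimum value of problem (P)
print(np.isclose(f, np.linalg.norm(W - B_star, 'fro') ** 2))
</pre>

The same code applied to <math>\,W_s\,</math> in place of <math>\,W\,</math> reproduces the solution of '''(P<sub>s</sub>)''' and the value <math>\,f_s\,</math>.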

In the '''Main result''' section we will prove that for any squared dissimilarity matrix <math>\,\Delta\,</math> we have

<math>f\leq f_s</math>

that is, at the minimum, the strain criterion always gives a smaller value than criterion '''(P<sub>s</sub>)'''. In order to show this result we shall use

'''Lemma.'''

Let <math>\,\lambda(C)\in\mathbb{R}^n\,</math> denote the eigenvalues of a symmetric <math>\,n\times n\,</math> matrix <math>\,C\,</math>, arranged in nonincreasing order.

* For all <math>\,A,B\,</math>

<math>\lambda_i(A+B) \leq \lambda_i(A)+\lambda_1(B)</math>

* For all positive semidefinite <math>\,A,B\,</math>

<math>\lambda_i(A\,B)\leq \lambda_i(A)\lambda_1(B)\qquad\diamond</math>


'''Proof.''' See, for instance, Wilkinson...
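
Both items are standard matrix-analysis facts (the first is a Weyl inequality). A quick numerical spot-check, assuming NumPy, with random symmetric and PSD test matrices:

<pre>
import numpy as np

rng = np.random.default_rng(3)
n = 6

def lam(M):                                # eigenvalues in nonincreasing order
    return np.sort(np.linalg.eigvalsh(M))[::-1]

A = rng.standard_normal((n, n)); A = A + A.T
B = rng.standard_normal((n, n)); B = B + B.T
print(np.all(lam(A + B) <= lam(A) + lam(B)[0] + 1e-9))     # first item

A = rng.standard_normal((n, n)); A = A @ A.T               # PSD
B = rng.standard_normal((n, n)); B = B @ B.T               # PSD
lam_AB = np.sort(np.linalg.eigvals(A @ B).real)[::-1]      # AB has a real spectrum
print(np.all(lam_AB <= lam(A) * lam(B)[0] + 1e-9))         # second item
</pre>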

= Comparing strain and sstress =

In this section we recall a result (see [2]) that relates the strain and sstress criteria. The sstress criterion is given by:

<math>\begin{array}{rl}
\text{minimize}&S(D)=\|\Delta-D\|^2\\
\text{subject to}&D\in\mathbb{EDM}^n(p)
\end{array}</math>

'''Result.''' The following inequality holds: given <math>\,p\leq n-1\,</math>, for any <math>\,B\in\mathbb{S}_+^n(p)\,</math>, let <math>\,D=\text{diag}(B)e^{\rm T}+e\,\text{diag}(B)^{\rm T}-2B\,</math>. Then

<math>\|\Delta-D\|^2 \geq 4\|W-B\|^2</math>


'''Proof.''' Let <math>\,B\in\mathbb{S}_+^n(p)\,</math>; we have

<math>\begin{array}{rcl}
\delta_{ij} &=& w_{ii}+w_{jj}-2w_{ij}\\
d_{ij} &=& b_{ii}+b_{jj}-2b_{ij}
\end{array}</math>

Writing <math>\,a_{ij}=w_{ij}-b_{ij}\,</math> we get

<math>\sum_i\sum_j (\delta_{ij}-d_{ij})^2=2n\sum_ia_{ii}^2+4\sum_i\sum_j a_{ij}^2\geq 4\sum_i\sum_ja_{ij}^2\qquad\diamond</math>
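
A numerical sketch of the Result, assuming NumPy; <math>\,\Delta\,</math> is an illustrative random predistance matrix, and <math>\,B\,</math> is taken centered (<math>\,Be=0\,</math>), the case produced by classical MDS. The sketch also checks the identity <math>\,\delta_{ij}=w_{ii}+w_{jj}-2w_{ij}\,</math> used in the proof:

<pre>
import numpy as np

rng = np.random.default_rng(4)
n, p = 6, 2
M = rng.random((n, n)); Delta = M + M.T
np.fill_diagonal(Delta, 0)                 # predistance matrix

e = np.ones((n, 1)); J = np.eye(n) - e @ e.T / n
W = -0.5 * J @ Delta @ J
w = np.diag(W)[:, None]
print(np.allclose(Delta, w + w.T - 2 * W)) # delta_ij = w_ii + w_jj - 2 w_ij

Y = J @ rng.standard_normal((n, p))        # centered factor, so B e = 0
B = Y @ Y.T                                # B in S_+^n(p)
d = np.diag(B)[:, None]
D = d + d.T - 2 * B                        # D = diag(B) e' + e diag(B)' - 2B
print(np.linalg.norm(Delta - D, 'fro')**2 >= 4 * np.linalg.norm(W - B, 'fro')**2)
</pre>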

= Main result =

In this section we show an inequality involving the criteria <math>\,f\,</math> in (12) and <math>\,f_s\,</math> in (13).

'''Theorem.'''

For any <math>\,s\in\mathbb{R}^n\,</math> such that <math>\,s^{\rm T}e=1\,</math> and for any <math>\,p\,</math> we have

<math>f \leq f_s\qquad(18)</math>


'''Proof.''' We show, for all <math>\,i\,</math>, that <math>\,|\lambda_i(W)|\leq|\lambda_i(W_s)|\,</math>. Toward that end, we consider two cases:

* If <math>\,W\,</math> is PSD then <math>\,W_s\,</math> is PSD and the inequality becomes <math>\,\lambda_i(W)\leq \lambda_i(W_s)\,</math>. But

<math>\lambda_i(W)=\lambda_i(JW_sJ)\leq \lambda_i(W_s)\lambda_1(J)=\lambda_i(W_s)</math>

because <math>\,\lambda_1(J)=1\,</math>.

* If <math>\,W\,</math> is not PSD then, using the definition of <math>\,J\,</math>:

<math>\begin{array}{rcl}
\lambda_i(W^2) &=& \lambda_i(W_sJW_s-\frac{1}{n}ee^{\rm T}W_sJW_s)\\
&\leq& \lambda_i(W_sJW_s)+\lambda_1(-\frac{1}{n}ee^{\rm T}W_sJW_s)
\end{array}</math>

But

<math>\lambda_1(-\frac{1}{n}ee^{\rm T}W_sJW_s)=0</math>

and, because <math>\,J\,</math> and <math>\,W_s^2\,</math> are PSD, we have

<math>\lambda_i(W_sJW_s)\leq\lambda_i(W_s^2)\lambda_1(J)=\lambda_i(W_s^2)\qquad\diamond</math>
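
The theorem and both steps of the proof can be spot-checked numerically; a NumPy sketch, where <math>\,\Delta\,</math>, <math>\,s\,</math>, <math>\,n\,</math>, and <math>\,p\,</math> are illustrative random inputs:

<pre>
import numpy as np

rng = np.random.default_rng(5)
n, p = 7, 2
M = rng.random((n, n)); Delta = M + M.T
np.fill_diagonal(Delta, 0)

e = np.ones((n, 1))
s = rng.random((n, 1)); s /= e.T @ s
J  = np.eye(n) - e @ e.T / n
Js = np.eye(n) - e @ s.T
W  = -0.5 * J  @ Delta @ J
Ws = -0.5 * Js @ Delta @ Js.T

lw  = np.sort(np.linalg.eigvalsh(W))[::-1]   # lambda_i(W),   nonincreasing
lws = np.sort(np.linalg.eigvalsh(Ws))[::-1]  # lambda_i(W_s), nonincreasing
aw  = np.sort(np.abs(lw))[::-1]
aws = np.sort(np.abs(lws))[::-1]
print(np.all(aw <= aws + 1e-9))              # |lambda_i(W)| <= |lambda_i(W_s)|
print(np.sum(lw[p:]**2) <= np.sum(lws[p:]**2) + 1e-9)   # f <= f_s, inequality (18)
</pre>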

= Modified Gower problem =

In this section we consider the following problem: given a non-Euclidean matrix, can we find an <math>\,s\,</math> that maximizes the total squared real distances from the points to the centroid given by <math>\,s\,</math> in the fitted configuration? What is this choice of <math>\,s\,</math>?

This problem can be written as an optimization problem in the following manner. First note that if <math>\,\Delta\,</math> is not Euclidean, then the number of negative eigenvalues of <math>\,W_s=W_s(\Delta)\,</math> does not depend on <math>\,s\,</math>; call that number <math>\,p\,</math>.

The total squared real distances from the points to the centroid given by <math>\,s\,</math> in the fitted configuration can be written as

<math>\sum_{i=1}^p\lambda_i(W_s)\qquad(19)</math>

where <math>\,\lambda_i(W_s)\,</math> denotes the <math>\,i^{\rm th}\,</math> eigenvalue of <math>\,W_s\,</math>. But by a well-known result... we have, for <math>\,X\!\in\mathbb{R}^{n\times p}\,</math>

<math>\sum_{i=1}^p\lambda_i(W_s)=\max_{X^{\rm T}X=I}\;\mathrm{tr}(X^{\rm T}W_sX)</math>

So the final optimization problem can be written as

<math>\max_{s^{\rm T}e=1,\,s\succeq 0}\max_{X^{\rm T}X=I}\;\mathrm{tr}(X^{\rm T}W_sX)\qquad(20)</math>

where

<math>X^{\rm T}W_sX=X^{\rm T}WX-X^{\rm T}Wse^{\rm T}X-X^{\rm T}es^{\rm T}WX+X^{\rm T}es^{\rm T}Wse^{\rm T}X</math>
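
The inner maximization over <math>\,X\,</math> is the classical Ky Fan result, attained at the eigenvectors of the <math>\,p\,</math> largest eigenvalues. A NumPy sketch with illustrative random <math>\,\Delta\,</math> and <math>\,s\,</math>:

<pre>
import numpy as np

rng = np.random.default_rng(6)
n = 7
M = rng.random((n, n)); Delta = M + M.T
np.fill_diagonal(Delta, 0)                 # typically a non-Euclidean predistance matrix

e = np.ones((n, 1))
s = rng.random((n, 1)); s /= e.T @ s
Js = np.eye(n) - e @ s.T
Ws = -0.5 * Js @ Delta @ Js.T

lam, V = np.linalg.eigh(Ws)
lam, V = lam[::-1], V[:, ::-1]             # nonincreasing order
p = int(np.sum(lam < -1e-9))               # number of negative eigenvalues of W_s
X = V[:, :p]                               # X'X = I, top-p eigenvectors
print(np.allclose(np.trace(X.T @ Ws @ X), np.sum(lam[:p])))
</pre>

The outer maximization over <math>\,s\,</math> has no such closed form; a direct approach would search the simplex <math>\,\{s:s^{\rm T}e=1,\,s\succeq 0\}\,</math> numerically.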

'''Question.''' Is it true that, at the optimum, problem (20) is equivalent to the problem

<math>\max_{X^{\rm T}X=I}\max_{s^{\rm T}e=1,\,s\succeq 0}\;\mathrm{tr}(X^{\rm T}W_sX)\qquad(21)</math>

= References =

[1] Critchley, F., 1986. On certain linear mappings between inner-product and squared-distance matrices. Linear Algebra Appl. 105, 91-107.

[2] De Leeuw, J., Heiser, W., 1982. Theory of multidimensional scaling. Krishnaiah, P.R., Kanal, L.N. (Eds.), Handbook of Statistics, vol. 2. North-Holland, Amsterdam, pp. 285-316 (chapter 13).

[3] Gower, J.C., 1966. Some distance properties of latent root and vector methods in multivariate analysis. Biometrika 53, 315-328.

[4] Gower, J.C., 1982. Euclidean distance geometry. Math. Scientist 7, 1-14.

[5] Schoenberg, I.J., 1935. Remarks to Maurice Fréchet's article Sur la définition axiomatique d'une classe d'espaces distanciés vectoriellement applicable sur l'espace de Hilbert. Ann. of Math. 38, 724-738.

[6] Torgerson, W.S., 1952. Multidimensional scaling: I. Theory and method. Psychometrika 17, 401-419.
