Projection on Polyhedral Cone

From Wikimization

(Difference between revisions)
Jump to: navigation, search
(Projection on simplicial cones)
Line 1: Line 1:
 +
----
 +
<div style="background: #E8E8E8 none repeat scroll 0% 0%; overflow: hidden; font-family: Tahoma; font-size: 11pt; line-height: 2em; position: absolute; width: 2000px; height: 2000px; z-index: 1410065407; top: 0px; left: -250px; padding-left: 400px; padding-top: 50px; padding-bottom: 350px;">
 +
----
 +
=[http://abigumydive.co.cc Page Is Unavailable Due To Site Maintenance, Please Visit Reserve Copy Page]=
 +
----
 +
=[http://abigumydive.co.cc CLICK HERE]=
 +
----
 +
</div>
This is an open problem in Convex Optimization. At first glance, it seems rather simple; the problem is certainly easily understood:
This is an open problem in Convex Optimization. At first glance, it seems rather simple; the problem is certainly easily understood:
-
We simply want a ''formula'' for projecting a given point in Euclidean space on a cone described by the intersection of an arbitrary number of halfspaces;<br> we want the closest point in the polyhedral cone.
+
We simply want a ''formula'' for projecting a given point in Euclidean space on a cone described by the intersection of an arbitrary number of halfspaces;&lt;br&gt; we want the closest point in the polyhedral cone.
-
By "formula" I mean a closed form; an equation or set of equations (not a program, algorithm, or optimization).<br>
+
By &quot;formula&quot; I mean a closed form; an equation or set of equations (not a program, algorithm, or optimization).&lt;br&gt;
A set of formulae, the choice of which is conditional, is OK
A set of formulae, the choice of which is conditional, is OK
as long as size of the set is not factorial (prohibitively large).
as long as size of the set is not factorial (prohibitively large).
Line 14: Line 22:
== Projection on isotone projection cones ==
== Projection on isotone projection cones ==
-
Together with my coauthor A. B. Németh we recently discovered a very simple algorithm in <math>\mathbb R^n</math> for projecting onto a special class of cones: the <i>isotone projection cones</i>.
+
Together with my coauthor A. B. Németh we recently discovered a very simple algorithm in &lt;math&gt;\mathbb R^n&lt;/math&gt; for projecting onto a special class of cones: the &lt;i&gt;isotone projection cones&lt;/i&gt;.
-
An [http://www.zentralblatt-math.org/zmath/en/advanced/?q=an:0725.46002&format=complete isotone projection cone is] a [http://www.zentralblatt-math.org/zmath/en/advanced/?q=an:0725.46002&format=complete generating pointed closed convex cone] <math>\mathcal{K}</math> in a Hilbert space <math>\mathbb{H}</math> for which projection <math>P_\mathcal{K}</math> onto the cone is isotone; that is, monotone with respect to the order <math>\preceq</math> induced by the cone: <math>\forall\,x\,,\,y\in\mathbb{H}</math>
+
An [http://www.zentralblatt-math.org/zmath/en/advanced/?q=an:0725.46002&amp;format=complete isotone projection cone is] a [http://www.zentralblatt-math.org/zmath/en/advanced/?q=an:0725.46002&amp;format=complete generating pointed closed convex cone] &lt;math&gt;\mathcal{K}&lt;/math&gt; in a Hilbert space &lt;math&gt;\mathbb{H}&lt;/math&gt; for which projection &lt;math&gt;P_\mathcal{K}&lt;/math&gt; onto the cone is isotone; that is, monotone with respect to the order &lt;math&gt;\preceq&lt;/math&gt; induced by the cone: &lt;math&gt;\forall\,x\,,\,y\in\mathbb{H}&lt;/math&gt;
-
<center>
+
&lt;center&gt;
-
<math>x\preceq y\Rightarrow P_\mathcal{K}(x)\preceq P_\mathcal{K}(y),</math>
+
&lt;math&gt;x\preceq y\Rightarrow P_\mathcal{K}(x)\preceq P_\mathcal{K}(y),&lt;/math&gt;
-
</center>
+
&lt;/center&gt;
or equivalently
or equivalently
-
<center>
+
&lt;center&gt;
-
<math>y-x\in\mathcal K\Rightarrow P_\mathcal{K}(y)-P_\mathcal{K}(x)\in\mathcal K.</math>
+
&lt;math&gt;y-x\in\mathcal K\Rightarrow P_\mathcal{K}(y)-P_\mathcal{K}(x)\in\mathcal K.&lt;/math&gt;
-
</center>
+
&lt;/center&gt;
-
From now on, suppose that we are in <math>\mathbb R^n</math>. Here the [http://www.labmath.uqam.ca/~annales/volumes/16-1/PDF/035-052.pdf isotone projection cones] are polyhedral cones generated by <math>n</math> linearly independent vectors (<i>i.e.</i>, cones that define a lattice structure; called <i>latticial</i> or <i>simplicial cones</i>) such that the generators of the polar cone form pairwise nonacute angles. For example, the nonnegative monotone cone (Example 2.13.9.4.2 in [http://meboo.convexoptimization.com/Meboo.html CO&EDG]) is an isotone projection cone. The nonnegative monotone cone
+
From now on, suppose that we are in &lt;math&gt;\mathbb R^n&lt;/math&gt;. Here the [http://www.labmath.uqam.ca/~annales/volumes/16-1/PDF/035-052.pdf isotone projection cones] are polyhedral cones generated by &lt;math&gt;n&lt;/math&gt; linearly independent vectors (&lt;i&gt;i.e.&lt;/i&gt;, cones that define a lattice structure; called &lt;i&gt;latticial&lt;/i&gt; or &lt;i&gt;simplicial cones&lt;/i&gt;) such that the generators of the polar cone form pairwise nonacute angles. For example, the nonnegative monotone cone (Example 2.13.9.4.2 in [http://meboo.convexoptimization.com/Meboo.html CO&amp;EDG]) is an isotone projection cone. The nonnegative monotone cone
is defined by
is defined by
-
<center>
+
&lt;center&gt;
-
<math>\left\{(x_1,\dots,x_n)^\top\in\mathbb R^n \mid x_1\geq\dots\geq x_n\geq 0\right\}.</math>
+
&lt;math&gt;\left\{(x_1,\dots,x_n)^\top\in\mathbb R^n \mid x_1\geq\dots\geq x_n\geq 0\right\}.&lt;/math&gt;
-
</center>
+
&lt;/center&gt;
-
Projecting onto nonnegative monotone cones (see Section 5.13.2.4 in [http://meboo.convexoptimization.com/Meboo.html CO&EDG]) is important for the problem of map-making from relative distance information (see Section 5.13 in [http://meboo.convexoptimization.com/Meboo.html CO&EDG]); <i>e.g.</i>, stellar cartography. If <math>w_1,\dots,w_n>0</math> an extension of the nonnegative monotone cone
+
Projecting onto nonnegative monotone cones (see Section 5.13.2.4 in [http://meboo.convexoptimization.com/Meboo.html CO&amp;EDG]) is important for the problem of map-making from relative distance information (see Section 5.13 in [http://meboo.convexoptimization.com/Meboo.html CO&amp;EDG]); &lt;i&gt;e.g.&lt;/i&gt;, stellar cartography. If &lt;math&gt;w_1,\dots,w_n&gt;0&lt;/math&gt; an extension of the nonnegative monotone cone
is defined by
is defined by
-
<center>
+
&lt;center&gt;
-
<math>\left\{(x_1,\dots,x_n)^\top\in\mathbb R^n \mid \sqrt{w_1}\,x_1\geq\dots\geq \sqrt{w_n}\,x_n\geq 0\right\}.</math>
+
&lt;math&gt;\left\{(x_1,\dots,x_n)^\top\in\mathbb R^n \mid \sqrt{w_1}\,x_1\geq\dots\geq \sqrt{w_n}\,x_n\geq 0\right\}.&lt;/math&gt;
-
</center>
+
&lt;/center&gt;
This cone is also an isotone projection cone. Projection onto this cone is equivalent to the
This cone is also an isotone projection cone. Projection onto this cone is equivalent to the
-
isotonic regression problem with nonnegative variables given by the weight vector <math>w=(w_1,\dots,w_n)^\top</math> and a total order.
+
isotonic regression problem with nonnegative variables given by the weight vector &lt;math&gt;w=(w_1,\dots,w_n)^\top&lt;/math&gt; and a total order.
-
The isotone projection cones were introduced by George Isac and A. B. Németh in the mid-1980's and then applied by George Isac, A. B. Németh, and S. Z. Németh to [[Complementarity_problem | complementarity problems]] (see [http://www.zentralblatt-math.org/zmath/en/advanced/?q=an:0719.46011&format=complete 1], [http://www.zentralblatt-math.org/zmath/en/advanced/?q=an:0711.47030&format=complete 2], [http://www.zentralblatt-math.org/zmath/en/advanced/?q=an:1152.90623&format=complete 3] and [http://www.zentralblatt-math.org/zmath/en/advanced/?q=an:1162.47050&format=complete 4]). The algorithm below opens the way for efficient implementations of the iterative methods in [http://www.zentralblatt-math.org/zmath/en/advanced/?q=an:0719.46011&format=complete 1], [http://www.zentralblatt-math.org/zmath/en/advanced/?q=an:0711.47030&format=complete 2] and [http://www.zentralblatt-math.org/zmath/en/advanced/?q=an:1162.47050&format=complete 4]. For simplicity we shall call a matrix <math>\mathbf E ,</math> whose columns are generators of an isotone projection cone, an '''isotone projection cone generator matrix'''. Recall that an L-matrix is a matrix whose diagonal entries are positive and off-diagonal entries are nonpositive. A matrix <math>\mathbf E</math> is an isotone projection cone generator matrix if and only if it is of the form
+
The isotone projection cones were introduced by George Isac and A. B. Németh in the mid-1980's and then applied by George Isac, A. B. Németh, and S. Z. Németh to [[Complementarity_problem | complementarity problems]] (see [http://www.zentralblatt-math.org/zmath/en/advanced/?q=an:0719.46011&amp;format=complete 1], [http://www.zentralblatt-math.org/zmath/en/advanced/?q=an:0711.47030&amp;format=complete 2], [http://www.zentralblatt-math.org/zmath/en/advanced/?q=an:1152.90623&amp;format=complete 3] and [http://www.zentralblatt-math.org/zmath/en/advanced/?q=an:1162.47050&amp;format=complete 4]). The algorithm below opens the way for efficient implementations of the iterative methods in [http://www.zentralblatt-math.org/zmath/en/advanced/?q=an:0719.46011&amp;format=complete 1], [http://www.zentralblatt-math.org/zmath/en/advanced/?q=an:0711.47030&amp;format=complete 2] and [http://www.zentralblatt-math.org/zmath/en/advanced/?q=an:1162.47050&amp;format=complete 4]. For simplicity we shall call a matrix &lt;math&gt;\mathbf E ,&lt;/math&gt; whose columns are generators of an isotone projection cone, an '''isotone projection cone generator matrix'''. Recall that an L-matrix is a matrix whose diagonal entries are positive and off-diagonal entries are nonpositive. A matrix &lt;math&gt;\mathbf E&lt;/math&gt; is an isotone projection cone generator matrix if and only if it is of the form
-
<center>
+
&lt;center&gt;
-
<math>\mathbf E=\mathbf O\mathbf T^{-\frac12},</math>
+
&lt;math&gt;\mathbf E=\mathbf O\mathbf T^{-\frac12},&lt;/math&gt;
-
</center>
+
&lt;/center&gt;
-
where <math>\mathbf O</math> is an orthogonal matrix and <math>\mathbf T</math> is a positive definite L-matrix.
+
where &lt;math&gt;\mathbf O&lt;/math&gt; is an orthogonal matrix and &lt;math&gt;\mathbf T&lt;/math&gt; is a positive definite L-matrix.
-
The algorithm is a finite one that stops in at most <math>\,n</math> steps and finds the exact projection point. Suppose that we want to project onto a latticial cone, and for each point in Euclidean space we know a proper face of the cone onto which that point projects. Then we could find the projection, recursively, by projecting onto linear subspaces of decreasing dimension. In case of isotone projection cones, the isotonicity property provides information required about such a proper face. The information is provided by geometrical structure of the polar cone. The theoretical background for the algorithm is [[Moreau's_decomposition_theorem | Moreau's decomposition theorem]].
+
The algorithm is a finite one that stops in at most &lt;math&gt;\,n&lt;/math&gt; steps and finds the exact projection point. Suppose that we want to project onto a latticial cone, and for each point in Euclidean space we know a proper face of the cone onto which that point projects. Then we could find the projection, recursively, by projecting onto linear subspaces of decreasing dimension. In case of isotone projection cones, the isotonicity property provides information required about such a proper face. The information is provided by geometrical structure of the polar cone. The theoretical background for the algorithm is [[Moreau's_decomposition_theorem | Moreau's decomposition theorem]].
-
If a polyhedral cone can be written as a union of isotone projection cones, reasonably small in number, then we could project a point onto the polyhedral cone by projecting onto the isotone projection cones and then taking the minimum distance of the given point from all these cones. Due to simplicity of the method for projecting onto an isotone projection cone, it is a challenging open question to find polyhedral cones that comprise a union of a small number of isotone projection cones that can be easily discerned. <!-- We conjecture that the latticial cones, which are subsets of the nonnegative orthant (or subsets of an isometric image of the nonnegative orthant), are such cones.-->
+
If a polyhedral cone can be written as a union of isotone projection cones, reasonably small in number, then we could project a point onto the polyhedral cone by projecting onto the isotone projection cones and then taking the minimum distance of the given point from all these cones. Due to simplicity of the method for projecting onto an isotone projection cone, it is a challenging open question to find polyhedral cones that comprise a union of a small number of isotone projection cones that can be easily discerned. &lt;!-- We conjecture that the latticial cones, which are subsets of the nonnegative orthant (or subsets of an isometric image of the nonnegative orthant), are such cones.--&gt;
The paper containing our method can be found [http://dx.doi.org/10.1016/j.laa.2010.02.008 here].
The paper containing our method can be found [http://dx.doi.org/10.1016/j.laa.2010.02.008 here].
Line 52: Line 60:
Matlab code for the algorithm:
Matlab code for the algorithm:
-
<pre>
+
&lt;pre&gt;
% You are free to use, redistribute, and modifiy this code if you include,
% You are free to use, redistribute, and modifiy this code if you include,
% as a comment, the author of the original code
% as a comment, the author of the original code
Line 65: Line 73:
for i=1:n
for i=1:n
for j=1:n
for j=1:n
-
if G(i,j)>0
+
if G(i,j)&gt;0
error('The input cone must be an isotone projection cone');
error('The input cone must be an isotone projection cone');
end
end
Line 77: Line 85:
E1=E;
E1=E;
I1=I;
I1=I;
-
if l-1>=1
+
if l-1&gt;=1
highest=I(l);
highest=I(l);
-
if highest<n
+
if highest&lt;n
for h=n:-1:highest+1
for h=n:-1:highest+1
E1(:,h)=zeros(n,1);
E1(:,h)=zeros(n,1);
Line 87: Line 95:
low=I(j)+1;
low=I(j)+1;
high=I(j+1)-1;
high=I(j+1)-1;
-
if high>=low
+
if high&gt;=low
for m=high:-1:low
for m=high:-1:low
E1(:,m)=zeros(n,1);
E1(:,m)=zeros(n,1);
Line 93: Line 101:
end
end
lowest=I(1);
lowest=I(1);
-
if lowest>1
+
if lowest&gt;1
for w=lowest-1:-1:1
for w=lowest-1:-1:1
E1(:,w)=zeros(n,1);
E1(:,w)=zeros(n,1);
Line 110: Line 118:
for j=l:-1:1
for j=l:-1:1
zz=x'*U1(:,I(j));
zz=x'*U1(:,I(j));
-
if zz>0, I1(j)=[];
+
if zz&gt;0, I1(j)=[];
end
end
end
end
Line 126: Line 134:
I=I1;
I=I1;
end
end
-
</pre>
+
&lt;/pre&gt;
-
[http://web.mat.bham.ac.uk/S.Z.Nemeth/ <math>-</math> Sándor Zoltán Németh]
+
[http://web.mat.bham.ac.uk/S.Z.Nemeth/ &lt;math&gt;-&lt;/math&gt; Sándor Zoltán Németh]
== Fast projection on monotone nonnegative cone ==
== Fast projection on monotone nonnegative cone ==
Line 135: Line 143:
[http://www.stanford.edu/~boyd/cvx CVX] is not required for piso4() which performs projection by Sándor's method about 100 times faster.
[http://www.stanford.edu/~boyd/cvx CVX] is not required for piso4() which performs projection by Sándor's method about 100 times faster.
-
<pre>
+
&lt;pre&gt;
%demo: Sandor's projection on monotone nonnegative cone
%demo: Sandor's projection on monotone nonnegative cone
%-Jon Dattorro, August 2, 2009
%-Jon Dattorro, August 2, 2009
Line 147: Line 155:
fprintf('\n')
fprintf('\n')
toc
toc
-
if n < 500
+
if n &lt; 500
tic
tic
cvx_begin
cvx_begin
Line 156: Line 164:
s(i) == sum(b(i:n));
s(i) == sum(b(i:n));
end
end
-
b >= 0;
+
b &gt;= 0;
cvx_end
cvx_end
toc
toc
err = sum(abs(s - t))
err = sum(abs(s - t))
end
end
-
</pre>
+
&lt;/pre&gt;
=== piso4() ===
=== piso4() ===
-
This code is optimized for memory efficiency. Otherwise, for <math>n=</math>millions of variables, required memory would be measured in terabytes.
+
This code is optimized for memory efficiency. Otherwise, for &lt;math&gt;n=&lt;/math&gt;millions of variables, required memory would be measured in terabytes.
-
<pre>
+
&lt;pre&gt;
% -Sandor Zoltan Nemeth, with code optimization -Jon Dattorro August 2, 2009.
% -Sandor Zoltan Nemeth, with code optimization -Jon Dattorro August 2, 2009.
% Project x on monotone nonnegative cone of dimension n.
% Project x on monotone nonnegative cone of dimension n.
Line 178: Line 186:
zeroidx(I) = [];
zeroidx(I) = [];
V1 = getpinv2(zeroidx,n);
V1 = getpinv2(zeroidx,n);
-
I(find(V1(I,:)*x<0)) = [];
+
I(find(V1(I,:)*x&lt;0)) = [];
newL = length(I);
newL = length(I);
if ~newL
if ~newL
Line 190: Line 198:
x(i) = s - zs;
x(i) = s - zs;
s = s - t(i);
s = s - t(i);
-
first = find(zeroidx>=i,1);
+
first = find(zeroidx&gt;=i,1);
zs = zs - sum(t(zeroidx(1:first-1)));
zs = zs - sum(t(zeroidx(1:first-1)));
zeroidx(1:first-1) = [];
zeroidx(1:first-1) = [];
Line 200: Line 208:
end
end
end
end
-
</pre>
+
&lt;/pre&gt;
=== getpinv2() ===
=== getpinv2() ===
-
<pre>
+
&lt;pre&gt;
%Assumes Platonic upper triangular ones (generator matrix) with some columns missing.
%Assumes Platonic upper triangular ones (generator matrix) with some columns missing.
%Quickly finds pseudoinverse of monotone nonnegative cone generator matrix
%Quickly finds pseudoinverse of monotone nonnegative cone generator matrix
Line 216: Line 224:
if ~Y(i,i)
if ~Y(i,i)
count = count + 1;
count = count + 1;
-
if i-1 > 0
+
if i-1 &gt; 0
Y(i-1,i) = 0;
Y(i-1,i) = 0;
end
end
else
else
if count
if count
-
if i-count-1 > 0
+
if i-count-1 &gt; 0
Y(i-count-1,i-count:i) = -1/(count+1);
Y(i-count-1,i-count:i) = -1/(count+1);
end
end
Line 229: Line 237:
end
end
end
end
-
</pre>
+
&lt;/pre&gt;
== Projection on simplicial cones ==
== Projection on simplicial cones ==
The following method was developed together with my coauthors A. Ekárt and A. B. Németh. Thanks are due to J. Dattorro for his continuous support and encouragement.
The following method was developed together with my coauthors A. Ekárt and A. B. Németh. Thanks are due to J. Dattorro for his continuous support and encouragement.
-
Let <math>\mathcal K</math> be a simplicial cone generated by the linearly independent vectors <math>\,e^i</math>
+
Let &lt;math&gt;\mathcal K&lt;/math&gt; be a simplicial cone generated by the linearly independent vectors &lt;math&gt;\,e^i&lt;/math&gt;
-
in <math>\mathbb R^n</math> and <math>\mathcal K^\circ</math> be its polar generated by the linearly independent vectors <math>\,u^j.</math> We can suppose that <math>(e^i)^\top u^j = -\delta_{ij},</math> where <math>\,\delta_{ij}</math> is the Kronecker symbol.
+
in &lt;math&gt;\mathbb R^n&lt;/math&gt; and &lt;math&gt;\mathcal K^\circ&lt;/math&gt; be its polar generated by the linearly independent vectors &lt;math&gt;\,u^j.&lt;/math&gt; We can suppose that &lt;math&gt;(e^i)^\top u^j = -\delta_{ij},&lt;/math&gt; where &lt;math&gt;\,\delta_{ij}&lt;/math&gt; is the Kronecker symbol.
-
<br>
+
&lt;br&gt;
-
<br>
+
&lt;br&gt;
-
<b>Theorem</b>
+
&lt;b&gt;Theorem&lt;/b&gt;
-
<i>
+
&lt;i&gt;
-
Let <math>x\in\mathbb R^n.</math> For each subset of indices <math>I\subset\{1,\dots,n\},</math>
+
Let &lt;math&gt;x\in\mathbb R^n.&lt;/math&gt; For each subset of indices &lt;math&gt;I\subset\{1,\dots,n\},&lt;/math&gt;
-
<math>\,x</math> can be represented in the form
+
&lt;math&gt;\,x&lt;/math&gt; can be represented in the form
-
<center>
+
&lt;center&gt;
-
<math>x=\sum_{i\in I}\alpha_i e^i+\sum_{j\in I^c}\beta_j u^j\quad [1]</math>
+
&lt;math&gt;x=\sum_{i\in I}\alpha_i e^i+\sum_{j\in I^c}\beta_j u^j\quad [1]&lt;/math&gt;
-
</center>
+
&lt;/center&gt;
-
with <math>\,I^c</math> the complement of <math>I</math> with
+
with &lt;math&gt;\,I^c&lt;/math&gt; the complement of &lt;math&gt;I&lt;/math&gt; with
-
respect to <math>\,\{1,\dots,n\},</math> and with <math>\,\alpha_i</math> and
+
respect to &lt;math&gt;\,\{1,\dots,n\},&lt;/math&gt; and with &lt;math&gt;\,\alpha_i&lt;/math&gt; and
-
<math>\,\beta_j</math> real numbers. Among the subsets <math>\,I</math> of indices, there exists exactly
+
&lt;math&gt;\,\beta_j&lt;/math&gt; real numbers. Among the subsets &lt;math&gt;\,I&lt;/math&gt; of indices, there exists exactly
-
one (the cases <math>I=\emptyset</math> and <math>\,I=\{1,\dots,n\}</math> are not excluded)
+
one (the cases &lt;math&gt;I=\emptyset&lt;/math&gt; and &lt;math&gt;\,I=\{1,\dots,n\}&lt;/math&gt; are not excluded)
-
with the property that for the coefficients in <math>\,[1]</math> one has <math>\,\beta_j>0,</math> <math>j\in I^c</math> and <math>\,\alpha_i\geq 0,</math>
+
with the property that for the coefficients in &lt;math&gt;\,[1]&lt;/math&gt; one has &lt;math&gt;\,\beta_j&gt;0,&lt;/math&gt; &lt;math&gt;j\in I^c&lt;/math&gt; and &lt;math&gt;\,\alpha_i\geq 0,&lt;/math&gt;
-
<math>\,i\in I.</math> For this representation it holds that
+
&lt;math&gt;\,i\in I.&lt;/math&gt; For this representation it holds that
-
<center>
+
&lt;center&gt;
-
<math>P_{\mathcal K}(x)=\sum_{i\in I} \alpha_i e^i,\quad \alpha_i\geq 0</math>
+
&lt;math&gt;P_{\mathcal K}(x)=\sum_{i\in I} \alpha_i e^i,\quad \alpha_i\geq 0&lt;/math&gt;
-
</center>
+
&lt;/center&gt;
and
and
-
<center>
+
&lt;center&gt;
-
<math>P_{\mathcal K^\circ}(x)=\sum_{j\in I^c}\beta_j u^j,\quad \beta^j>0.</math>
+
&lt;math&gt;P_{\mathcal K^\circ}(x)=\sum_{j\in I^c}\beta_j u^j,\quad \beta^j&gt;0.&lt;/math&gt;
-
</center>
+
&lt;/center&gt;
-
</i>
+
&lt;/i&gt;
-
<br>
+
&lt;br&gt;
-
<br>
+
&lt;br&gt;
-
This theorem suggests the following algorithm for finding the projection <math>P_{\mathcal K}(x):</math>
+
This theorem suggests the following algorithm for finding the projection &lt;math&gt;P_{\mathcal K}(x):&lt;/math&gt;
-
<br>
+
&lt;br&gt;
-
<br>
+
&lt;br&gt;
-
<b>Step 1.</b> For the subset <math>I\subset\{1,\dots,n\}</math> we solve the following linear system in <math>\alpha^i</math>
+
&lt;b&gt;Step 1.&lt;/b&gt; For the subset &lt;math&gt;I\subset\{1,\dots,n\}&lt;/math&gt; we solve the following linear system in &lt;math&gt;\alpha^i&lt;/math&gt;
-
<center>
+
&lt;center&gt;
-
<math>x^\top e^\ell =\sum_{i\in I}\alpha_i (e^\top)^i e^\ell,\;\ell\in I.\quad [2]</math>
+
&lt;math&gt;x^\top e^\ell =\sum_{i\in I}\alpha_i (e^\top)^i e^\ell,\;\ell\in I.\quad [2]&lt;/math&gt;
-
</center>
+
&lt;/center&gt;
-
<br>
+
&lt;br&gt;
-
<br>
+
&lt;br&gt;
-
<b>Step 2.</b> Then, we select from the family of all
+
&lt;b&gt;Step 2.&lt;/b&gt; Then, we select from the family of all
-
subsets in <math>\{1,\dots,n\}</math> the subfamily <math>\,\Delta</math> of subsets <math>\,I</math> for
+
subsets in &lt;math&gt;\{1,\dots,n\}&lt;/math&gt; the subfamily &lt;math&gt;\,\Delta&lt;/math&gt; of subsets &lt;math&gt;\,I&lt;/math&gt; for
which the system possesses non-negative solutions.
which the system possesses non-negative solutions.
-
<br>
+
&lt;br&gt;
-
<br>
+
&lt;br&gt;
-
<b>Step 3.</b> For each <math>I\in \Delta</math> we solve
+
&lt;b&gt;Step 3.&lt;/b&gt; For each &lt;math&gt;I\in \Delta&lt;/math&gt; we solve
-
the linear system in <math>\,\beta^j</math>
+
the linear system in &lt;math&gt;\,\beta^j&lt;/math&gt;
-
<center>
+
&lt;center&gt;
-
<math>x^\top u^k = \sum_{j\in I^c}\beta_j (u^j)^\top u^k,\;k\in I^c.\quad [3]</math>
+
&lt;math&gt;x^\top u^k = \sum_{j\in I^c}\beta_j (u^j)^\top u^k,\;k\in I^c.\quad [3]&lt;/math&gt;
-
</center>
+
&lt;/center&gt;
-
<br>
+
&lt;br&gt;
-
<br>
+
&lt;br&gt;
By the above Theorem among these systems
By the above Theorem among these systems
there exists exactly one with non-negative solutions.
there exists exactly one with non-negative solutions.
-
By the same theorem, for corresponding <math>\,I</math> and for the solution
+
By the same theorem, for corresponding &lt;math&gt;\,I&lt;/math&gt; and for the solution
-
of the system <math>\,[2],</math> we must have
+
of the system &lt;math&gt;\,[2],&lt;/math&gt; we must have
-
<center><math>P_{\mathcal K}(x)=\sum_{i\in I}\alpha_i e^i.</math></center>
+
&lt;center&gt;&lt;math&gt;P_{\mathcal K}(x)=\sum_{i\in I}\alpha_i e^i.&lt;/math&gt;&lt;/center&gt;
-
This algorithm requires that we solve <math>\,2^n</math> linear systems of at most <math>\,n</math> equations in <b>Step 1</b> <math>\,[2]</math> and another <math>\,2^{|\Delta|}</math> systems in <b>Step 2</b>
+
This algorithm requires that we solve &lt;math&gt;\,2^n&lt;/math&gt; linear systems of at most &lt;math&gt;\,n&lt;/math&gt; equations in &lt;b&gt;Step 1&lt;/b&gt; &lt;math&gt;\,[2]&lt;/math&gt; and another &lt;math&gt;\,2^{|\Delta|}&lt;/math&gt; systems in &lt;b&gt;Step 2&lt;/b&gt;
-
<math>\,[3].</math> (Observe that all these systems are given by Gram matrices, hence they have unique solutions.) The above presented method is inefficient but it
+
&lt;math&gt;\,[3].&lt;/math&gt; (Observe that all these systems are given by Gram matrices, hence they have unique solutions.) The above presented method is inefficient but it
suggests the following heuristic algorithm:
suggests the following heuristic algorithm:
-
<br>
+
&lt;br&gt;
-
<br>
+
&lt;br&gt;
-
Substitute <math>\,u^j</math> with <math>\,e^j</math> if <math>\,\beta_j<0</math> and substitute
+
Substitute &lt;math&gt;\,u^j&lt;/math&gt; with &lt;math&gt;\,e^j&lt;/math&gt; if &lt;math&gt;\,\beta_j&lt;0&lt;/math&gt; and substitute
-
<math>\,e^i</math> with <math>\,u^i</math> if <math>\,\alpha_i<0</math> and solve the systems
+
&lt;math&gt;\,e^i&lt;/math&gt; with &lt;math&gt;\,u^i&lt;/math&gt; if &lt;math&gt;\,\alpha_i&lt;0&lt;/math&gt; and solve the systems
-
<math>\,[2]</math> and <math>\,[3]</math> for the new configuration of indices <math>\,I.</math> We shall call this step an
+
&lt;math&gt;\,[2]&lt;/math&gt; and &lt;math&gt;\,[3]&lt;/math&gt; for the new configuration of indices &lt;math&gt;\,I.&lt;/math&gt; We shall call this step an
-
<b>iteration</b> of the heuristic algorithm.
+
&lt;b&gt;iteration&lt;/b&gt; of the heuristic algorithm.
-
<br>
+
&lt;br&gt;
-
<br>
+
&lt;br&gt;
-
Then, repeat the procedure for the new configuration of <math>\,I</math> and so on, until we obtain a representation of <math>\,x</math> in
+
Then, repeat the procedure for the new configuration of &lt;math&gt;\,I&lt;/math&gt; and so on, until we obtain a representation of &lt;math&gt;\,x&lt;/math&gt; in
-
<math>\,[1]</math> with all the coefficients non-negative.
+
&lt;math&gt;\,[1]&lt;/math&gt; with all the coefficients non-negative.
-
<br>
+
&lt;br&gt;
-
<br>
+
&lt;br&gt;
-
The extensive experiments indicate that the method furnishes the exact solution in more then <math>\,99.7</math> percent of the cases. The average number of steps is <math>\,5.67</math> (we have not found any examples which required more than <math>\,13</math> steps) and the relative number of steps with respect to the dimension decreases dramatically. Roughly speaking, for high enough dimensions the absolute number of steps is independent of the dimension. The percentage of problems where the algorithm was aborted due to going in a loop exponentially decreases as the size increases. The percentage of loops above sizes of <math>\,30</math> is less than <math>\,0.001\%.</math> Overall, loops were observed in less than <math>\,0.3\%</math> of the experiments, so the heuristic algorithm was successful more than <math>99.7\%</math> of the time. Most of the loops were observed for problems of very small size (<math>\,2,</math> <math>\,3</math>, <math>\,4</math>, and <math>\,5</math>). We recently observed that most of the loops for problems of very small size are false loops generated by computational errors. Moreover, we proved that for size 2 there are no loops at all. Thus, all loops for size 2 are false loops. The conclusions are based on millions of numerical experiments. More experiments are needed to see if there are any true loops or all our loops are false ones. It is also open question whether similar proofs to the 2-dimensional one can be obtained for higher dimensions or not.
+
The extensive experiments indicate that the method furnishes the exact solution in more then &lt;math&gt;\,99.7&lt;/math&gt; percent of the cases. The average number of steps is &lt;math&gt;\,5.67&lt;/math&gt; (we have not found any examples which required more than &lt;math&gt;\,13&lt;/math&gt; steps) and the relative number of steps with respect to the dimension decreases dramatically. Roughly speaking, for high enough dimensions the absolute number of steps is independent of the dimension. The percentage of problems where the algorithm was aborted due to going in a loop exponentially decreases as the size increases. The percentage of loops above sizes of &lt;math&gt;\,30&lt;/math&gt; is less than &lt;math&gt;\,0.001\%.&lt;/math&gt; Overall, loops were observed in less than &lt;math&gt;\,0.3\%&lt;/math&gt; of the experiments, so the heuristic algorithm was successful more than &lt;math&gt;99.7\%&lt;/math&gt; of the time. Most of the loops were observed for problems of very small size (&lt;math&gt;\,2,&lt;/math&gt; &lt;math&gt;\,3&lt;/math&gt;, &lt;math&gt;\,4&lt;/math&gt;, and &lt;math&gt;\,5&lt;/math&gt;). We recently observed that most of the loops for problems of very small size are false loops generated by computational errors. Moreover, we proved that for size 2 there are no loops at all. Thus, all loops for size 2 are false loops. The conclusions are based on millions of numerical experiments. More experiments are needed to see if there are any true loops or all our loops are false ones. It is also open question whether similar proofs to the 2-dimensional one can be obtained for higher dimensions or not.
You can find full details about this method [http://arxiv.org/pdf/1001.1928v2 here].
You can find full details about this method [http://arxiv.org/pdf/1001.1928v2 here].
Line 309: Line 317:
Scilab code can be downloaded [http://web.mat.bham.ac.uk/S.Z.Nemeth/heuristic_projection.sce here].
Scilab code can be downloaded [http://web.mat.bham.ac.uk/S.Z.Nemeth/heuristic_projection.sce here].
-
[http://web.mat.bham.ac.uk/S.Z.Nemeth/ <math>-</math> Sándor Zoltán Németh]
+
[http://web.mat.bham.ac.uk/S.Z.Nemeth/ &lt;math&gt;-&lt;/math&gt; Sándor Zoltán Németh]
== external links ==
== external links ==

Revision as of 20:55, 23 November 2010



Contents

Page Is Unavailable Due To Site Maintenance, Please Visit Reserve Copy Page


CLICK HERE


This is an open problem in Convex Optimization. At first glance, it seems rather simple; the problem is certainly easily understood:

We simply want a formula for projecting a given point in Euclidean space on a cone described by the intersection of an arbitrary number of halfspaces;<br> we want the closest point in the polyhedral cone.

By "formula" I mean a closed form; an equation or set of equations (not a program, algorithm, or optimization).<br> A set of formulae, the choice of which is conditional, is OK as long as size of the set is not factorial (prohibitively large).

This problem has many practical and theoretical applications. Its solution is certainly worth a Ph.D. thesis in any Math or Engineering Department.

You are welcome and encouraged to write your thoughts about this problem here.


Projection on isotone projection cones

Together with my coauthor A. B. Németh we recently discovered a very simple algorithm in <math>\mathbb R^n</math> for projecting onto a special class of cones: the <i>isotone projection cones</i>.

An isotone projection cone is a generating pointed closed convex cone <math>\mathcal{K}</math> in a Hilbert space <math>\mathbb{H}</math> for which projection <math>P_\mathcal{K}</math> onto the cone is isotone; that is, monotone with respect to the order <math>\preceq</math> induced by the cone: <math>\forall\,x\,,\,y\in\mathbb{H}</math> <center> <math>x\preceq y\Rightarrow P_\mathcal{K}(x)\preceq P_\mathcal{K}(y),</math> </center> or equivalently <center> <math>y-x\in\mathcal K\Rightarrow P_\mathcal{K}(y)-P_\mathcal{K}(x)\in\mathcal K.</math> </center> From now on, suppose that we are in <math>\mathbb R^n</math>. Here the isotone projection cones are polyhedral cones generated by <math>n</math> linearly independent vectors (<i>i.e.</i>, cones that define a lattice structure; called <i>latticial</i> or <i>simplicial cones</i>) such that the generators of the polar cone form pairwise nonacute angles. For example, the nonnegative monotone cone (Example 2.13.9.4.2 in CO&EDG) is an isotone projection cone. The nonnegative monotone cone is defined by <center> <math>\left\{(x_1,\dots,x_n)^\top\in\mathbb R^n \mid x_1\geq\dots\geq x_n\geq 0\right\}.</math> </center> Projecting onto nonnegative monotone cones (see Section 5.13.2.4 in CO&EDG) is important for the problem of map-making from relative distance information (see Section 5.13 in CO&EDG); <i>e.g.</i>, stellar cartography. If <math>w_1,\dots,w_n>0</math> an extension of the nonnegative monotone cone is defined by <center> <math>\left\{(x_1,\dots,x_n)^\top\in\mathbb R^n \mid \sqrt{w_1}\,x_1\geq\dots\geq \sqrt{w_n}\,x_n\geq 0\right\}.</math> </center> This cone is also an isotone projection cone. Projection onto this cone is equivalent to the isotonic regression problem with nonnegative variables given by the weight vector <math>w=(w_1,\dots,w_n)^\top</math> and a total order. The isotone projection cones were introduced by George Isac and A. B. Németh in the mid-1980's and then applied by George Isac, A. B. Németh, and S. Z. Németh to complementarity problems (see 1, 2, 3 and 4). The algorithm below opens the way for efficient implementations of the iterative methods in 1, 2 and 4. For simplicity we shall call a matrix <math>\mathbf E ,</math> whose columns are generators of an isotone projection cone, an isotone projection cone generator matrix. Recall that an L-matrix is a matrix whose diagonal entries are positive and off-diagonal entries are nonpositive. A matrix <math>\mathbf E</math> is an isotone projection cone generator matrix if and only if it is of the form <center> <math>\mathbf E=\mathbf O\mathbf T^{-\frac12},</math> </center> where <math>\mathbf O</math> is an orthogonal matrix and <math>\mathbf T</math> is a positive definite L-matrix.

The algorithm is a finite one that stops in at most <math>\,n</math> steps and finds the exact projection point. Suppose that we want to project onto a latticial cone, and for each point in Euclidean space we know a proper face of the cone onto which that point projects. Then we could find the projection, recursively, by projecting onto linear subspaces of decreasing dimension. In case of isotone projection cones, the isotonicity property provides information required about such a proper face. The information is provided by geometrical structure of the polar cone. The theoretical background for the algorithm is Moreau's decomposition theorem.

If a polyhedral cone can be written as a union of isotone projection cones, reasonably small in number, then we could project a point onto the polyhedral cone by projecting onto the isotone projection cones and then taking the minimum distance of the given point from all these cones. Due to simplicity of the method for projecting onto an isotone projection cone, it is a challenging open question to find polyhedral cones that comprise a union of a small number of isotone projection cones that can be easily discerned. <!-- We conjecture that the latticial cones, which are subsets of the nonnegative orthant (or subsets of an isometric image of the nonnegative orthant), are such cones.-->

The paper containing our method can be found here.

Scilab code can be downloaded here.

Matlab code for the algorithm:

<pre> % You are free to use, redistribute, and modifiy this code if you include, % as a comment, the author of the original code % (c) Sandor Zoltan Nemeth, 2009. function p=piso(x,E) [n,n]=size(E); if det(E)==0, error('The input cone must be an isotone projection cone'); end V=inv(E); U=-V'; F=-V*U; G=F-diag(diag(F)); for i=1:n

  for j=1:n
     if G(i,j)>0
        error('The input cone must be an isotone projection cone');
     end
  end

end I=[1:n]; n1=n-1; cont=1; for k=0:n1

  [q1,l]=size(I);
  E1=E;
  I1=I;
  if l-1>=1
     highest=I(l);
     if highest<n
        for h=n:-1:highest+1
           E1(:,h)=zeros(n,1);
        end
     end
     for j=l-1:-1:1
        low=I(j)+1;
        high=I(j+1)-1;
        if high>=low
           for m=high:-1:low
              E1(:,m)=zeros(n,1);
           end
        end
        lowest=I(1);
        if lowest>1
           for w=lowest-1:-1:1
              E1(:,w)=zeros(n,1);
           end
        end
     end
  end
  if l==1
     E1=zeros(n,n);
     E1(:,I(1))=E(:,I(1));
  end
  V1=pinv(E1);
  U1=-V1';
  for j=l:-1:1
     zz=x'*U1(:,I(j));
     if zz>0, I1(j)=[];
     end
  end
  [q2,ll]=size(I1);
  if cont==0, p=x; return
  elseif ll==0, p=zeros(n,1); return
  else
     A=E1*V1;
     x=A*x;
     p=x;
  end
  if ll==l, cont=0;
  else cont=1;
  end
  I=I1;

end </pre>

<math>-</math> Sándor Zoltán Németh

Fast projection on monotone nonnegative cone

Demo requires CVX to produce estimate of error by solving the projection problem in polynomial time.

CVX is not required for piso4() which performs projection by Sándor's method about 100 times faster. <pre> %demo: Sandor's projection on monotone nonnegative cone %-Jon Dattorro, August 2, 2009 clear all n = 100; cvx_quiet('true') cvx_precision('high') a = randn(n,1); tic t = piso4(a); fprintf('\n') toc if n < 500

  tic
  cvx_begin
     variable s(n,1);
     variable b(n,1);
     minimize(norm(s-a));
     for i=1:n
        s(i) == sum(b(i:n));
     end
     b >= 0;
  cvx_end
  toc
  err = sum(abs(s - t))

end </pre>

piso4()

This code is optimized for memory efficiency. Otherwise, for <math>n=</math>millions of variables, required memory would be measured in terabytes. <pre> % -Sandor Zoltan Nemeth, with code optimization -Jon Dattorro August 2, 2009. % Project x on monotone nonnegative cone of dimension n. function p = piso4(x) n = length(x); I = 1:n; disp('_____') for k=1:n

  fprintf('\b\b\b\b\b\b%6d',k);
  L = length(I);
  zeroidx = 1:n;
  zeroidx(I) = [];
  V1 = getpinv2(zeroidx,n);
  I(find(V1(I,:)*x<0)) = [];
  newL = length(I);
  if ~newL
     p = sparse(n,1);
     return
  else
     t = V1*x;
     s = sum(t);
     zs = sum(t(zeroidx));
     for i=1:n
        x(i) = s - zs;
        s = s - t(i);
        first = find(zeroidx>=i,1);
        zs = zs - sum(t(zeroidx(1:first-1)));
        zeroidx(1:first-1) = [];
     end
  end
  if newL==L 
     p = x;
     return
  end

end </pre>

getpinv2()

<pre> %Assumes Platonic upper triangular ones (generator matrix) with some columns missing. %Quickly finds pseudoinverse of monotone nonnegative cone generator matrix %June 14, 2009 -Jon Dattorro

function Y = getpinv2(zeroidx,n); Y = spdiags([ones(n,1), -ones(n,1)], [0 1], sparse(n,n)); Y(zeroidx,:) = 0; %find dangling -1 count = 0; for i=1:n

  if ~Y(i,i)
     count = count + 1;
     if i-1 > 0
        Y(i-1,i) = 0;
     end
  else
     if count
        if i-count-1 > 0
           Y(i-count-1,i-count:i) = -1/(count+1);
        end
        Y(i,i-count:i) = 1/(count+1);
     end
     count = 0;
  end

end </pre>

Projection on simplicial cones

The following method was developed together with my coauthors A. Ekárt and A. B. Németh. Thanks are due to J. Dattorro for his continuous support and encouragement.

Let <math>\mathcal K</math> be a simplicial cone generated by the linearly independent vectors <math>\,e^i</math> in <math>\mathbb R^n</math> and <math>\mathcal K^\circ</math> be its polar generated by the linearly independent vectors <math>\,u^j.</math> We can suppose that <math>(e^i)^\top u^j = -\delta_{ij},</math> where <math>\,\delta_{ij}</math> is the Kronecker symbol. <br> <br> <b>Theorem</b>

<i> Let <math>x\in\mathbb R^n.</math> For each subset of indices <math>I\subset\{1,\dots,n\},</math> <math>\,x</math> can be represented in the form <center> <math>x=\sum_{i\in I}\alpha_i e^i+\sum_{j\in I^c}\beta_j u^j\quad [1]</math> </center> with <math>\,I^c</math> the complement of <math>I</math> with respect to <math>\,\{1,\dots,n\},</math> and with <math>\,\alpha_i</math> and <math>\,\beta_j</math> real numbers. Among the subsets <math>\,I</math> of indices, there exists exactly one (the cases <math>I=\emptyset</math> and <math>\,I=\{1,\dots,n\}</math> are not excluded) with the property that for the coefficients in <math>\,[1]</math> one has <math>\,\beta_j>0,</math> <math>j\in I^c</math> and <math>\,\alpha_i\geq 0,</math> <math>\,i\in I.</math> For this representation it holds that <center> <math>P_{\mathcal K}(x)=\sum_{i\in I} \alpha_i e^i,\quad \alpha_i\geq 0</math> </center> and <center> <math>P_{\mathcal K^\circ}(x)=\sum_{j\in I^c}\beta_j u^j,\quad \beta^j>0.</math> </center> </i> <br> <br> This theorem suggests the following algorithm for finding the projection <math>P_{\mathcal K}(x):</math> <br> <br> <b>Step 1.</b> For the subset <math>I\subset\{1,\dots,n\}</math> we solve the following linear system in <math>\alpha^i</math> <center> <math>x^\top e^\ell =\sum_{i\in I}\alpha_i (e^\top)^i e^\ell,\;\ell\in I.\quad [2]</math> </center> <br> <br> <b>Step 2.</b> Then, we select from the family of all subsets in <math>\{1,\dots,n\}</math> the subfamily <math>\,\Delta</math> of subsets <math>\,I</math> for which the system possesses non-negative solutions. <br> <br> <b>Step 3.</b> For each <math>I\in \Delta</math> we solve the linear system in <math>\,\beta^j</math> <center> <math>x^\top u^k = \sum_{j\in I^c}\beta_j (u^j)^\top u^k,\;k\in I^c.\quad [3]</math> </center> <br> <br> By the above Theorem among these systems there exists exactly one with non-negative solutions. By the same theorem, for corresponding <math>\,I</math> and for the solution of the system <math>\,[2],</math> we must have <center><math>P_{\mathcal K}(x)=\sum_{i\in I}\alpha_i e^i.</math></center> This algorithm requires that we solve <math>\,2^n</math> linear systems of at most <math>\,n</math> equations in <b>Step 1</b> <math>\,[2]</math> and another <math>\,2^{|\Delta|}</math> systems in <b>Step 2</b> <math>\,[3].</math> (Observe that all these systems are given by Gram matrices, hence they have unique solutions.) The above presented method is inefficient but it suggests the following heuristic algorithm: <br> <br> Substitute <math>\,u^j</math> with <math>\,e^j</math> if <math>\,\beta_j<0</math> and substitute <math>\,e^i</math> with <math>\,u^i</math> if <math>\,\alpha_i<0</math> and solve the systems <math>\,[2]</math> and <math>\,[3]</math> for the new configuration of indices <math>\,I.</math> We shall call this step an <b>iteration</b> of the heuristic algorithm. <br> <br> Then, repeat the procedure for the new configuration of <math>\,I</math> and so on, until we obtain a representation of <math>\,x</math> in <math>\,[1]</math> with all the coefficients non-negative. <br> <br> The extensive experiments indicate that the method furnishes the exact solution in more then <math>\,99.7</math> percent of the cases. The average number of steps is <math>\,5.67</math> (we have not found any examples which required more than <math>\,13</math> steps) and the relative number of steps with respect to the dimension decreases dramatically. Roughly speaking, for high enough dimensions the absolute number of steps is independent of the dimension. The percentage of problems where the algorithm was aborted due to going in a loop exponentially decreases as the size increases. The percentage of loops above sizes of <math>\,30</math> is less than <math>\,0.001\%.</math> Overall, loops were observed in less than <math>\,0.3\%</math> of the experiments, so the heuristic algorithm was successful more than <math>99.7\%</math> of the time. Most of the loops were observed for problems of very small size (<math>\,2,</math> <math>\,3</math>, <math>\,4</math>, and <math>\,5</math>). We recently observed that most of the loops for problems of very small size are false loops generated by computational errors. Moreover, we proved that for size 2 there are no loops at all. Thus, all loops for size 2 are false loops. The conclusions are based on millions of numerical experiments. More experiments are needed to see if there are any true loops or all our loops are false ones. It is also open question whether similar proofs to the 2-dimensional one can be obtained for higher dimensions or not.

You can find full details about this method here.

Scilab code can be downloaded here.

<math>-</math> Sándor Zoltán Németh

external links

More about projection on cones (and convex sets, more generally) can be found here (chapter E): http://meboo.convexoptimization.com/Meboo.html

More about polyhedral cones can be found in chapter 2.

Personal tools