This bugzilla service is closed. All entries have been migrated to https://gitlab.com/libeigen/eigen
Bug 50 - HouseholderSequence API cleanup
Summary: HouseholderSequence API cleanup
Status: RESOLVED FIXED
Alias: None
Product: Eigen
Classification: Unclassified
Component: Householder (show other bugs)
Version: 3.0
Hardware: All All
: --- Unknown
Assignee: Benoit Jacob
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 3.0-beta3
  Show dependency treegraph
 
Reported: 2010-10-16 04:22 UTC by Benoit Jacob
Modified: 2019-12-04 09:39 UTC (History)
1 user (show)



Attachments
proposed API changes (14.93 KB, patch)
2010-12-26 19:40 UTC, Benoit Jacob
no flags Details | Diff
Patch (relative to Benoit's patch 57) to remove .setLength(); does not work (2.71 KB, patch)
2010-12-29 18:32 UTC, Jitse Niesen
no flags Details | Diff
Patch (relative to Benoit's patch 57) to hide .setTrans() (1.59 KB, patch)
2011-01-01 16:21 UTC, Jitse Niesen
jacob.benoit.1: review+
Details | Diff

Description Benoit Jacob 2010-10-16 04:22:01 UTC
Is it really good to have these constructors taking all these integer parameters? Is "actualVectors" really a good name? Investigate
Comment 1 Jitse Niesen 2010-12-12 15:37:09 UTC
I don't understand the API fully yet (I'm writing documentation while I try to make sense of it), but here are some thoughts.

The full constructor takes two integer arguments, one of which is where the Householder sequence starts (shift) and one of which is the length of the Householder sequence (actualVectors). I don't see why the latter is necessary; isn't that the same as the length of the vector of Householder coefficients? If the user has a vector of length L but only wants to construct a sequence of length M, then I think it's cleaner to let the user truncate the vector with .head(L) instead of passing another argument. 

In any case, actualVectors seems a terrible name - for one, it is an integer and not (a) vector(s). How about calling the two integer arguments sequenceStart and sequenceLength?

There is also a boolean argument, trans. However, this seems to be used only internally (to construct the transpose/adjoint of a HouseholderSequence). This suggests putting it last, and possibly moving it to an internal constructor.
Comment 2 Benoit Jacob 2010-12-22 04:54:49 UTC
Hi Jitse, thanks for the comments and sorry for not replying earlier. I was busy, and now I'm on vacation. Sorry for the fail. I'll try to reply ASAP.
Comment 3 Benoit Jacob 2010-12-26 09:41:45 UTC
(In reply to comment #1)
> The full constructor takes two integer arguments, one of which is where the
> Householder sequence starts (shift) and one of which is the length of the
> Householder sequence (actualVectors). I don't see why the latter is necessary;
> isn't that the same as the length of the vector of Householder coefficients?

No, that can be different: actualVectors is the number of Householder vectors that will actually be used to form the Householder sequence, so it's _at most_ the length of the first householder vector but can be smaller. See in ColPivHouseholderQR.h, the actualVectors parameter passed there is the number of nonzero pivots.

> If
> the user has a vector of length L but only wants to construct a sequence of
> length M, then I think it's cleaner to let the user truncate the vector with
> .head(L) instead of passing another argument.

In order to construct a householdersequence, you need more than a vector, you need a triangular part of a matrix. Typically, you have the lower triangle in a matrix, and you construct a sequence of householder vectors from the columns. See HouseholderQR.h.

> In any case, actualVectors seems a terrible name - for one, it is an integer
> and not (a) vector(s).

I agree, let's rename it.

> How about calling the two integer arguments
> sequenceStart and sequenceLength?

sequenceLength is a good name.

But I don't like sequenceStart as it wrongly puts it on an equal footing with it, whereas it's a quite different thing. I'd keep the name 'shift'.

> There is also a boolean argument, trans. However, this seems to be used only
> internally (to construct the transpose/adjoint of a HouseholderSequence). This
> suggests putting it last, and possibly moving it to an internal constructor.

Good point!

Also, for actualVectors/sequenceLength and for shift, since these are often not explicitly specified, we could consider not allowing to specify them in the constructor, and instead adding methods allowing to set them and returning a ref to *this to allow for method chaining, e.g.:

   return householderSequence(...).setLength(...).setShift(...);
Comment 4 Benoit Jacob 2010-12-26 19:40:07 UTC
Created attachment 57 [details]
proposed API changes

Here's a patch implementing the proposed changes.

actualVectors is renamed to length.

All fancy ctors are removed. Instead, use setTrans(), setLength(), setShift().

I kept setTrans() public since it is used e.g. by ColPivHouseholderQR.h.
Comment 5 Jitse Niesen 2010-12-29 18:32:33 UTC
Created attachment 60 [details]
Patch (relative to Benoit's patch 57) to remove .setLength(); does not work

(In reply to comment #3)
> In order to construct a householdersequence, you need more than a vector, you
> need a triangular part of a matrix. Typically, you have the lower triangle in a
> matrix, and you construct a sequence of householder vectors from the columns.
> See HouseholderQR.h.

You're going too fast for me. I'm afraid I still don't understand why the length needs to be specified. As you say, the constructor takes a matrix and a vector. Say the matrix is m-by-n and the vector has length p. The columns of the matrix are the Householder vectors, so each one has length n. This is the dimension of the space that the HouseholderSequence acts on. But I don't get how the other pieces of data - n, p, and the length as set by .setLength() - interact. 

My guess was that only the first _length_ columns of the matrix of Householder vectors and only the first _length_ elements of the vector of Householder coefficients are used. I thought that we could simplify the API by assuming that length equals p. This means in ColPivHouseholderQR::householderQ(), replacing 
   HouseholderSequenceType(m_qr, m_hCoeffs.conjugate())
      .setLength(m_nonzero_pivots);
by
   HouseholderSequenceType(m_qr, m_hCoeffs.conjugate().head(m_nonzero_pivots))

I prepared a patch to illustrate my idea. However, it doesn't work. The qr_colpivoting test segfaults. I don't understand why; the segfault happens in HouseholderSequence::evalTo(), where m_coeffs.coeff(k) is accessed, even though k < m_coeffs.size(). Perhaps m_coeffs is a reference to a temporary which has disappeared?

Anyway, I guess I don't really understand how HouseholderSequence works.
 
> But I don't like sequenceStart as it wrongly puts it on an equal footing with
> it, whereas it's a quite different thing. I'd keep the name 'shift'.

Yeah, shift is a good name, so let's keep it.
 
> Also, for actualVectors/sequenceLength and for shift, since these are often not
> explicitly specified, we could consider not allowing to specify them in the
> constructor, and instead adding methods allowing to set them and returning a
> ref to *this to allow for method chaining, e.g.:
> 
>    return householderSequence(...).setLength(...).setShift(...);

I like this.
Comment 6 Benoit Jacob 2010-12-29 19:05:14 UTC
(In reply to comment #5)
> You're going too fast for me. I'm afraid I still don't understand why the
> length needs to be specified. As you say, the constructor takes a matrix and a
> vector. Say the matrix is m-by-n and the vector has length p. The columns of
> the matrix are the Householder vectors, so each one has length n.

... no no no, this isn't how this works. But it's my fault since I never documented this stuff at all. Let me try now.

To understand HouseholderSequence, the best is to first understand how Householder QR works. We have a n*p matrix A that we want to turn into a triangular matrix R by applying unitary transformations to A on the left.

We first look at A.col(0) and find a n*n unitary U_0 (a Householder transformation) such that (U_0 * A).col(0) is zero except for its top coefficient, so that U_0 * A looks like this:

 x x ... x
 0 x ... x
 : :     :
 : :     :
 0 x ... x

We then apply the same process to the remaining matrix (without the first row and the first column). So this time we are dealing with a (n-1)*(p-1) matrix, and the new Householder unitary U_1 we're constructing is (n-1)*(n-1), however we're extending it to a n*n unitary by letting it act trivially on the first coordinate i.e.

 1 0
 0 U_1

and by abuse of language, we still call that U_1.

The matrix U_1 U_0 A looks like:

 x x x ... x
 0 x x ... x
 0 0 x ... x
 : : :     :
 : : :     :
 0 0 x ... x

We continue this process until the resulting matrix U_n ... U_1 U_0 A is triangular, this gives us R, and Q is U_0* ... U_n*.

That was a high-level overview of the Householder QR algorithm. What I skipped is how these Householder transformations U_i are computed, but the important point is that each U_i is technically a i*i unitary, and the corresponding Householder vector is of size (n-i), not n. Moreover, these (n-i)-size vectors are stored in 2 parts: a size (n-i-1) "essential part" and a separate single "householder coefficient".

These householder vectors "essential parts" are stored in the zeroed lower triangle of A.

When we want to apply the corresponding Householder transformations all at once, i.e. to apply Q, we read the householder coefficients and we read the householder essential parts from the lower part of A, and apply a certain algorithm to compute the transformation.

This is where HouseholderSequence comes into play: it takes a reference to a vector of householder coeffs, and a reference to a matrix where to read the essential parts from the lower triangle from, and allows to easily compute and apply the resulting unitary transformation.

Thus, HouseholderSequence (at least the default "left" variant) really only cares about the lower triangle of the matrix passed to it.

Now let's explain "length".

If you are doing HouseholderQR on a n*p matrix, obviously there are n Householder transformations, so your householder sequence has length n.

However, if you're doing ColPivHouseholderQR, it may find that your matrix is non-full-rank and terminate early. In this case, only the (rank) first Householder vectors have been computed. You then need a way to only apply the (rank) first Householder transformations. This is what HouseholderSequence::setLength() allows you to do.

In short, HouseholderSequence::setLength() allows you to specify how many Householder transformations to read to construct the resulting unitary transformation.

> This is the
> dimension of the space that the HouseholderSequence acts on. But I don't get
> how the other pieces of data - n, p, and the length as set by .setLength() -
> interact.

In the special case of ColPivHouseholderQR, think of the length as the rank of the matrix.

In general, think of it as number of Householder transformations in the sequence. A householdersequence is just a sequence of householder transformations, and although it's often the case that one wants to interprete all of the lower triangle of a matrix as householder vectors, there are exceptions.

> 
> My guess was that only the first _length_ columns of the matrix of Householder
> vectors and only the first _length_ elements of the vector of Householder
> coefficients are used.

Yes! That's it.

 I thought that we could simplify the API by assuming
> that length equals p. This means in ColPivHouseholderQR::householderQ(),
> replacing 
>    HouseholderSequenceType(m_qr, m_hCoeffs.conjugate())
>       .setLength(m_nonzero_pivots);
> by
>    HouseholderSequenceType(m_qr, m_hCoeffs.conjugate().head(m_nonzero_pivots))
> 
> I prepared a patch to illustrate my idea. However, it doesn't work. The
> qr_colpivoting test segfaults. I don't understand why; the segfault happens in
> HouseholderSequence::evalTo(), where m_coeffs.coeff(k) is accessed, even though
> k < m_coeffs.size(). Perhaps m_coeffs is a reference to a temporary which has
> disappeared?

Perhaps householdersequence uses the matrix, not the vector, to compute the default length, so you'd have to pass a block in m_qr.

I know that technically one could emulate setLength by constructing Householdersequences from blocks. It just isn't easy to use, and leads to more template instantiations.

> 
> Anyway, I guess I don't really understand how HouseholderSequence works.

Hope the above clarifies it.
Comment 7 Benoit Jacob 2010-12-29 19:06:32 UTC
> each U_i is technically a i*i unitary

I meant (n-i)*(n-i).
Comment 8 Benoit Jacob 2010-12-30 15:19:30 UTC
Pushed.
Comment 9 Jitse Niesen 2011-01-01 16:21:28 UTC
Created attachment 62 [details]
Patch (relative to Benoit's patch 57) to hide .setTrans()

Benoit, thanks for your extensive explanation. I will use it as a basis to document the class.

(In reply to comment #6)
> 
> I know that technically one could emulate setLength by constructing
> Householdersequences from blocks. It just isn't easy to use, and leads to more
> template instantiations.

Okay, I understand it now.

However, should HouseholderSequence really expose the .setTrans() function? The .transpose() function is standard throughout Eigen and does something pretty similar; just replace 
   HouseholderSequence(...).setTrans()
by
   HouseholderSequence(...).transpose()

The disadvantage is that the latter creates two HouseholderSequence objects, but I don't think that's a big deal.

I attach a patch to fix ideas.
Comment 10 Benoit Jacob 2011-01-04 14:24:38 UTC
Comment on attachment 62 [details]
Patch (relative to Benoit's patch 57) to hide .setTrans()

(In reply to comment #9)
> However, should HouseholderSequence really expose the .setTrans() function? The
> .transpose() function is standard throughout Eigen and does something pretty
> similar; just replace 
>    HouseholderSequence(...).setTrans()
> by
>    HouseholderSequence(...).transpose()

Indeed, I had forgotten, but they do the same thing.

> 
> The disadvantage is that the latter creates two HouseholderSequence objects,
> but I don't think that's a big deal.

Indeed not a big deal: HouseholderSequence objects are lightweight, just like expressions which we are returning by value all the time. Optimizing compilers are expected to be able to optimize them away.

Please push.
Comment 11 Nobody 2019-12-04 09:39:00 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to gitlab.com's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.com/libeigen/eigen/issues/50.

Note You need to log in before you can comment on or make changes to this bug.