Is it really good to have these constructors taking all these integer parameters? Is "actualVectors" really a good name? Investigate
I don't understand the API fully yet (I'm writing documentation while I try to make sense of it), but here are some thoughts. The full constructor takes two integer arguments, one of which is where the Householder sequence starts (shift) and one of which is the length of the Householder sequence (actualVectors). I don't see why the latter is necessary; isn't that the same as the length of the vector of Householder coefficients? If the user has a vector of length L but only wants to construct a sequence of length M, then I think it's cleaner to let the user truncate the vector with .head(L) instead of passing another argument. In any case, actualVectors seems a terrible name - for one, it is an integer and not (a) vector(s). How about calling the two integer arguments sequenceStart and sequenceLength? There is also a boolean argument, trans. However, this seems to be used only internally (to construct the transpose/adjoint of a HouseholderSequence). This suggests putting it last, and possibly moving it to an internal constructor.
Hi Jitse, thanks for the comments and sorry for not replying earlier. I was busy, and now I'm on vacation. Sorry for the fail. I'll try to reply ASAP.
(In reply to comment #1) > The full constructor takes two integer arguments, one of which is where the > Householder sequence starts (shift) and one of which is the length of the > Householder sequence (actualVectors). I don't see why the latter is necessary; > isn't that the same as the length of the vector of Householder coefficients? No, that can be different: actualVectors is the number of Householder vectors that will actually be used to form the Householder sequence, so it's _at most_ the length of the first householder vector but can be smaller. See in ColPivHouseholderQR.h, the actualVectors parameter passed there is the number of nonzero pivots. > If > the user has a vector of length L but only wants to construct a sequence of > length M, then I think it's cleaner to let the user truncate the vector with > .head(L) instead of passing another argument. In order to construct a householdersequence, you need more than a vector, you need a triangular part of a matrix. Typically, you have the lower triangle in a matrix, and you construct a sequence of householder vectors from the columns. See HouseholderQR.h. > In any case, actualVectors seems a terrible name - for one, it is an integer > and not (a) vector(s). I agree, let's rename it. > How about calling the two integer arguments > sequenceStart and sequenceLength? sequenceLength is a good name. But I don't like sequenceStart as it wrongly puts it on an equal footing with it, whereas it's a quite different thing. I'd keep the name 'shift'. > There is also a boolean argument, trans. However, this seems to be used only > internally (to construct the transpose/adjoint of a HouseholderSequence). This > suggests putting it last, and possibly moving it to an internal constructor. Good point! Also, for actualVectors/sequenceLength and for shift, since these are often not explicitly specified, we could consider not allowing to specify them in the constructor, and instead adding methods allowing to set them and returning a ref to *this to allow for method chaining, e.g.: return householderSequence(...).setLength(...).setShift(...);
Created attachment 57 [details] proposed API changes Here's a patch implementing the proposed changes. actualVectors is renamed to length. All fancy ctors are removed. Instead, use setTrans(), setLength(), setShift(). I kept setTrans() public since it is used e.g. by ColPivHouseholderQR.h.
Created attachment 60 [details] Patch (relative to Benoit's patch 57) to remove .setLength(); does not work (In reply to comment #3) > In order to construct a householdersequence, you need more than a vector, you > need a triangular part of a matrix. Typically, you have the lower triangle in a > matrix, and you construct a sequence of householder vectors from the columns. > See HouseholderQR.h. You're going too fast for me. I'm afraid I still don't understand why the length needs to be specified. As you say, the constructor takes a matrix and a vector. Say the matrix is m-by-n and the vector has length p. The columns of the matrix are the Householder vectors, so each one has length n. This is the dimension of the space that the HouseholderSequence acts on. But I don't get how the other pieces of data - n, p, and the length as set by .setLength() - interact. My guess was that only the first _length_ columns of the matrix of Householder vectors and only the first _length_ elements of the vector of Householder coefficients are used. I thought that we could simplify the API by assuming that length equals p. This means in ColPivHouseholderQR::householderQ(), replacing HouseholderSequenceType(m_qr, m_hCoeffs.conjugate()) .setLength(m_nonzero_pivots); by HouseholderSequenceType(m_qr, m_hCoeffs.conjugate().head(m_nonzero_pivots)) I prepared a patch to illustrate my idea. However, it doesn't work. The qr_colpivoting test segfaults. I don't understand why; the segfault happens in HouseholderSequence::evalTo(), where m_coeffs.coeff(k) is accessed, even though k < m_coeffs.size(). Perhaps m_coeffs is a reference to a temporary which has disappeared? Anyway, I guess I don't really understand how HouseholderSequence works. > But I don't like sequenceStart as it wrongly puts it on an equal footing with > it, whereas it's a quite different thing. I'd keep the name 'shift'. Yeah, shift is a good name, so let's keep it. > Also, for actualVectors/sequenceLength and for shift, since these are often not > explicitly specified, we could consider not allowing to specify them in the > constructor, and instead adding methods allowing to set them and returning a > ref to *this to allow for method chaining, e.g.: > > return householderSequence(...).setLength(...).setShift(...); I like this.
(In reply to comment #5) > You're going too fast for me. I'm afraid I still don't understand why the > length needs to be specified. As you say, the constructor takes a matrix and a > vector. Say the matrix is m-by-n and the vector has length p. The columns of > the matrix are the Householder vectors, so each one has length n. ... no no no, this isn't how this works. But it's my fault since I never documented this stuff at all. Let me try now. To understand HouseholderSequence, the best is to first understand how Householder QR works. We have a n*p matrix A that we want to turn into a triangular matrix R by applying unitary transformations to A on the left. We first look at A.col(0) and find a n*n unitary U_0 (a Householder transformation) such that (U_0 * A).col(0) is zero except for its top coefficient, so that U_0 * A looks like this: x x ... x 0 x ... x : : : : : : 0 x ... x We then apply the same process to the remaining matrix (without the first row and the first column). So this time we are dealing with a (n-1)*(p-1) matrix, and the new Householder unitary U_1 we're constructing is (n-1)*(n-1), however we're extending it to a n*n unitary by letting it act trivially on the first coordinate i.e. 1 0 0 U_1 and by abuse of language, we still call that U_1. The matrix U_1 U_0 A looks like: x x x ... x 0 x x ... x 0 0 x ... x : : : : : : : : 0 0 x ... x We continue this process until the resulting matrix U_n ... U_1 U_0 A is triangular, this gives us R, and Q is U_0* ... U_n*. That was a high-level overview of the Householder QR algorithm. What I skipped is how these Householder transformations U_i are computed, but the important point is that each U_i is technically a i*i unitary, and the corresponding Householder vector is of size (n-i), not n. Moreover, these (n-i)-size vectors are stored in 2 parts: a size (n-i-1) "essential part" and a separate single "householder coefficient". These householder vectors "essential parts" are stored in the zeroed lower triangle of A. When we want to apply the corresponding Householder transformations all at once, i.e. to apply Q, we read the householder coefficients and we read the householder essential parts from the lower part of A, and apply a certain algorithm to compute the transformation. This is where HouseholderSequence comes into play: it takes a reference to a vector of householder coeffs, and a reference to a matrix where to read the essential parts from the lower triangle from, and allows to easily compute and apply the resulting unitary transformation. Thus, HouseholderSequence (at least the default "left" variant) really only cares about the lower triangle of the matrix passed to it. Now let's explain "length". If you are doing HouseholderQR on a n*p matrix, obviously there are n Householder transformations, so your householder sequence has length n. However, if you're doing ColPivHouseholderQR, it may find that your matrix is non-full-rank and terminate early. In this case, only the (rank) first Householder vectors have been computed. You then need a way to only apply the (rank) first Householder transformations. This is what HouseholderSequence::setLength() allows you to do. In short, HouseholderSequence::setLength() allows you to specify how many Householder transformations to read to construct the resulting unitary transformation. > This is the > dimension of the space that the HouseholderSequence acts on. But I don't get > how the other pieces of data - n, p, and the length as set by .setLength() - > interact. In the special case of ColPivHouseholderQR, think of the length as the rank of the matrix. In general, think of it as number of Householder transformations in the sequence. A householdersequence is just a sequence of householder transformations, and although it's often the case that one wants to interprete all of the lower triangle of a matrix as householder vectors, there are exceptions. > > My guess was that only the first _length_ columns of the matrix of Householder > vectors and only the first _length_ elements of the vector of Householder > coefficients are used. Yes! That's it. I thought that we could simplify the API by assuming > that length equals p. This means in ColPivHouseholderQR::householderQ(), > replacing > HouseholderSequenceType(m_qr, m_hCoeffs.conjugate()) > .setLength(m_nonzero_pivots); > by > HouseholderSequenceType(m_qr, m_hCoeffs.conjugate().head(m_nonzero_pivots)) > > I prepared a patch to illustrate my idea. However, it doesn't work. The > qr_colpivoting test segfaults. I don't understand why; the segfault happens in > HouseholderSequence::evalTo(), where m_coeffs.coeff(k) is accessed, even though > k < m_coeffs.size(). Perhaps m_coeffs is a reference to a temporary which has > disappeared? Perhaps householdersequence uses the matrix, not the vector, to compute the default length, so you'd have to pass a block in m_qr. I know that technically one could emulate setLength by constructing Householdersequences from blocks. It just isn't easy to use, and leads to more template instantiations. > > Anyway, I guess I don't really understand how HouseholderSequence works. Hope the above clarifies it.
> each U_i is technically a i*i unitary I meant (n-i)*(n-i).
Pushed.
Created attachment 62 [details] Patch (relative to Benoit's patch 57) to hide .setTrans() Benoit, thanks for your extensive explanation. I will use it as a basis to document the class. (In reply to comment #6) > > I know that technically one could emulate setLength by constructing > Householdersequences from blocks. It just isn't easy to use, and leads to more > template instantiations. Okay, I understand it now. However, should HouseholderSequence really expose the .setTrans() function? The .transpose() function is standard throughout Eigen and does something pretty similar; just replace HouseholderSequence(...).setTrans() by HouseholderSequence(...).transpose() The disadvantage is that the latter creates two HouseholderSequence objects, but I don't think that's a big deal. I attach a patch to fix ideas.
Comment on attachment 62 [details] Patch (relative to Benoit's patch 57) to hide .setTrans() (In reply to comment #9) > However, should HouseholderSequence really expose the .setTrans() function? The > .transpose() function is standard throughout Eigen and does something pretty > similar; just replace > HouseholderSequence(...).setTrans() > by > HouseholderSequence(...).transpose() Indeed, I had forgotten, but they do the same thing. > > The disadvantage is that the latter creates two HouseholderSequence objects, > but I don't think that's a big deal. Indeed not a big deal: HouseholderSequence objects are lightweight, just like expressions which we are returning by value all the time. Optimizing compilers are expected to be able to optimize them away. Please push.
-- GitLab Migration Automatic Message -- This bug has been migrated to gitlab.com's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.com/libeigen/eigen/issues/50.