Bug 206 - JacobiSVD problem size constructor does not preallocate all of its resources
JacobiSVD problem size constructor does not preallocate all of its resources
Status: RESOLVED FIXED
Product: Eigen
Classification: Unclassified
Component: SVD
3.0
All All
: --- Unknown
Assigned To: Nobody
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-02-28 18:46 UTC by Adolfo Rodriguez Tsouroukdissian
Modified: 2011-10-31 05:21 UTC (History)
2 users (show)



Attachments
Removes explicit heap allocations from JacobiSVD and its preconditioners. (14.15 KB, patch)
2011-03-01 18:44 UTC, Adolfo Rodriguez Tsouroukdissian
no flags Details | Diff
Removes heap allocations from JacobiSVD and dependent modules. (25.99 KB, patch)
2011-03-02 18:23 UTC, Adolfo Rodriguez Tsouroukdissian
no flags Details | Diff
Removes heap allocations from JacobiSVD and dependent modules. (27.52 KB, patch)
2011-03-03 18:00 UTC, Adolfo Rodriguez Tsouroukdissian
no flags Details | Diff
Removes heap allocations from JacobiSVD and dependent modules. (27.52 KB, patch)
2011-03-04 16:27 UTC, Adolfo Rodriguez Tsouroukdissian
no flags Details | Diff
Removes heap allocations from JacobiSVD and dependent modules. (32.43 KB, patch)
2011-03-04 16:29 UTC, Adolfo Rodriguez Tsouroukdissian
no flags Details | Diff
part 1: correctly forward computationOptions (6.12 KB, patch)
2011-03-06 20:01 UTC, Benoit Jacob
jacob.benoit.1: review+
Details | Diff
part 2: Non-allocating versions of HouseholderSequence::applyThisOnTheRight/Left (3.20 KB, patch)
2011-03-08 10:18 UTC, Adolfo Rodriguez Tsouroukdissian
no flags Details | Diff
part 2: Non-allocating versions of HouseholderSequence::applyThisOnTheRight/Left (6.73 KB, patch)
2011-03-08 14:45 UTC, Adolfo Rodriguez Tsouroukdissian
no flags Details | Diff
part 3: Reimplement FullPivHouseholderQR<T>::matrixQ() using ReturnByValue (7.21 KB, patch)
2011-03-08 19:13 UTC, Adolfo Rodriguez Tsouroukdissian
no flags Details | Diff
part 2: Non-allocating versions of HouseholderSequence::applyThisOnTheRight/Left (10.55 KB, patch)
2011-03-09 13:15 UTC, Adolfo Rodriguez Tsouroukdissian
jacob.benoit.1: review+
Details | Diff
part 3: Reimplement FullPivHouseholderQR<T>::matrixQ() using ReturnByValue (8.47 KB, patch)
2011-03-09 13:18 UTC, Adolfo Rodriguez Tsouroukdissian
jacob.benoit.1: review+
Details | Diff
part 4: Removes heap allocations from JacobiSVD and its preconditioners (19.35 KB, patch)
2011-03-09 13:25 UTC, Adolfo Rodriguez Tsouroukdissian
jacob.benoit.1: review-
Details | Diff
part 2: Non-allocating versions of HouseholderSequence::applyThisOnTheRight/Left (5.60 KB, patch)
2011-04-07 13:13 UTC, Adolfo Rodriguez Tsouroukdissian
no flags Details | Diff
part 4: Removes heap allocations from JacobiSVD and its preconditioners (17.82 KB, patch)
2011-04-07 13:18 UTC, Adolfo Rodriguez Tsouroukdissian
no flags Details | Diff
part 4: Removes heap allocations from JacobiSVD and its preconditioners (17.84 KB, patch)
2011-04-08 08:58 UTC, Adolfo Rodriguez Tsouroukdissian
no flags Details | Diff
part 2: Non-allocating versions of HouseholderSequence::applyThisOnTheRight/Left (5.51 KB, patch)
2011-04-19 11:43 UTC, Adolfo Rodriguez Tsouroukdissian
jacob.benoit.1: review+
Details | Diff
part 4: Removes heap allocations from JacobiSVD and its preconditioners (21.61 KB, patch)
2011-04-19 11:46 UTC, Adolfo Rodriguez Tsouroukdissian
jacob.benoit.1: review+
Details | Diff

Description Adolfo Rodriguez Tsouroukdissian 2011-02-28 18:46:55 UTC
Using Eigen 3.0 beta 4.

Problem size constructor does not preallocate all of the heap resources used by the compute method on a similarly sized problem. The following code reproduces the issue.

MatrixXd A = MatrixXd::Random(rows, cols);
Eigen::JacobiSVD<MatrixXd> svd(rows, cols, Eigen::ComputeThinU | Eigen::ComputeThinV); // Performs 4 allocations
svd.compute(A); // Performs 7 allocations, but does not compute U nor V!
svd.compute(A, Eigen::ComputeThinU | Eigen::ComputeThinV); // Performs 8 allocations.  Does compute U and V

All (undesired) allocations performed by compute() take place in lines 558-559 of JacobiSVD:

if(!internal::qr_preconditioner_impl<MatrixType, QRPreconditioner, internal::PreconditionIfMoreColsThanRows>::run(*this, matrix)
  && !internal::qr_preconditioner_impl<MatrixType, QRPreconditioner, internal::PreconditionIfMoreRowsThanCols>::run(*this, matrix))
Comment 1 Adolfo Rodriguez Tsouroukdissian 2011-03-01 18:44:00 UTC
Created attachment 110 [details]
Removes explicit heap allocations from JacobiSVD and its preconditioners.
Comment 2 Adolfo Rodriguez Tsouroukdissian 2011-03-01 18:49:55 UTC
When using dynamic-sized matrices and problem size constructor, the attached patch removes allocations done by JacobiSVD and its preconditioners. However, the solver is still not preallocating all of its resources, because the QR module has to be patched as well, namely:

HouseholderQR.householderQ() -> Allocates 1 instance
ColPivHouseholderQR.householderQ() -> Allocates 1 instance
FullPivHouseholderQR.matrixQ() -> Allocates 2 instances

A second (future) patch should address these issues.
Comment 3 Adolfo Rodriguez Tsouroukdissian 2011-03-02 11:11:41 UTC
(In reply to comment #2)

Please ignore the below lines. Wrong assessment. The problem is elsewhere. Working on fix.
> HouseholderQR.householderQ() -> Allocates 1 instance
> ColPivHouseholderQR.householderQ() -> Allocates 1 instance
> FullPivHouseholderQR.matrixQ() -> Allocates 2 instances
Comment 4 Adolfo Rodriguez Tsouroukdissian 2011-03-02 18:23:34 UTC
Created attachment 115 [details]
Removes heap allocations from JacobiSVD and dependent modules.

Patch solves most heap allocations done in the compute() method of JacobiSVD. Important notes:
- There are still allocations made when using the ColPivHouseholder and HouseholderQR preconditioners with full U/V computation. This is due to the evalTo() method of HouseHolderSequence that allocates a temporary. It's triggered by calls such as svd.m_matrixU = m_qr.householderQ(). Don't know how to deal with it. Will open separate ticket to discuss that.

- The signature of FullPivHouseholderQR::matrixQ() changed so that it returns a constant reference instead of a matrix copy. Can this API change make it to 3.0?

- For HouseholderSequence objects, applyThisOnTheRight/Left was allocating a temporary. An alternative overload has been provided. Don't know if it's the cleanest solution, but works. Should this change be also reflected on Core/EigenBase.h ? (not done in patch).

- Tested for all combinations of preconditioners, rows<cols, cols<rows, fixed/dynamic storage, thin/full U V computation. Also, test suite is passing.
Comment 5 Adolfo Rodriguez Tsouroukdissian 2011-03-03 18:00:17 UTC
Created attachment 119 [details]
Removes heap allocations from JacobiSVD and dependent modules.
Comment 6 Adolfo Rodriguez Tsouroukdissian 2011-03-04 16:27:53 UTC
Created attachment 125 [details]
Removes heap allocations from JacobiSVD and dependent modules.

Updated patch to include a fix for playing nice with fixed-sized matrices.
Comment 7 Adolfo Rodriguez Tsouroukdissian 2011-03-04 16:29:58 UTC
Created attachment 126 [details]
Removes heap allocations from JacobiSVD and dependent modules.
Comment 8 Benoit Jacob 2011-03-06 18:29:50 UTC
Hi,

Thanks for your work and patch, and sorry for the late reply.

But it's very hard for me to review a monolithic 32 kB patch. Next time, could you please split it into small chunks, each doing one specific thing?

> - The signature of FullPivHouseholderQR::matrixQ() changed so that it returns a
> constant reference instead of a matrix copy. Can this API change make it to
> 3.0?

That doesn't make sense to me: that would be returning a reference to a temporary object.
Comment 10 Adolfo Rodriguez Tsouroukdissian 2011-03-06 19:42:47 UTC
(In reply to comment #8)
> Hi,
> 
> Thanks for your work and patch, and sorry for the late reply.
> 
> But it's very hard for me to review a monolithic 32 kB patch. Next time, could
> you please split it into small chunks, each doing one specific thing?
> 
> > - The signature of FullPivHouseholderQR::matrixQ() changed so that it returns a
> > constant reference instead of a matrix copy. Can this API change make it to
> > 3.0?
> 
> That doesn't make sense to me: that would be returning a reference to a
> temporary object.

It returns a reference to a new class member that stores the Q matrix.
Comment 11 Adolfo Rodriguez Tsouroukdissian 2011-03-06 19:46:01 UTC
(In reply to comment #9)
> Documented there:
> http://eigen.tuxfamily.org/index.php?title=Mercurial#Splitting_a_patch_into_smaller_patches
> 
> with more details for the case where you're using MQ:
> http://eigen.tuxfamily.org/index.php?title=Mercurial_Queues#Splitting_a_patch.2C_the_general_case.2C_including_per-hunk_and_per-line_splitting

I'll try to split the patch into smaller ones. Sorry for the inconvenience. Will let you know when I get around it.
Comment 12 Benoit Jacob 2011-03-06 19:49:17 UTC
> > > - The signature of FullPivHouseholderQR::matrixQ() changed so that it returns a
> > > constant reference instead of a matrix copy. Can this API change make it to
> > > 3.0?
> > 
> > That doesn't make sense to me: that would be returning a reference to a
> > temporary object.
> 
> It returns a reference to a new class member that stores the Q matrix.

But I don't want to add such a new class member! The Q matrix is already 'encoded' in the member data stored in FullPivHouseholderQr. So that new member would be redundant.

I understand your motivation for doing that though, but would use a different approach: let the JacobiSVD perform itself the Householder transformations using a preallocated workspace.
Comment 13 Benoit Jacob 2011-03-06 20:01:09 UTC
Created attachment 128 [details]
part 1: correctly forward computationOptions
Comment 14 Benoit Jacob 2011-03-06 20:02:50 UTC
Note that, as documented there,

http://eigen.tuxfamily.org/index.php?title=Mercurial#Configuring_Mercurial

I like to configure Mercurial so it generates 8 lines of surrounding context and shows function names in patches. The generated patches are then bigger but easier to read.
Comment 15 Benoit Jacob 2011-03-06 20:08:13 UTC
Maybe a viable and reusable approach would be to extend the FullPivHouseholderQR class to have a 'eval matrixQ to' function taking a preallocated destination matrix.
Comment 16 Benoit Jacob 2011-03-06 20:09:20 UTC
...or use a ReturnByValue for matrixQ(), that's exactly what it's for.
Comment 17 Adolfo Rodriguez Tsouroukdissian 2011-03-06 20:24:23 UTC
(In reply to comment #16)
> ...or use a ReturnByValue for matrixQ(), that's exactly what it's for.

That would work for me. My main concern is to control the lifecycle of temporaries. I have to confess that I did not study the details of the decomposition's implementation, hence missed the fact that storing Q explicitly was redundant.
Comment 18 Benoit Jacob 2011-03-07 03:01:04 UTC
Comment on attachment 128 [details]
part 1: correctly forward computationOptions

OK, I checked in this patch together with a unit test. That required me to add a runtime variant of EIGEN_NO_MALLOC. So now, when you define EIGEN_RUNTIME_NO_MALLOC, you get to use the new function internal::set_is_malloc_allowed(bool).
Comment 19 Adolfo Rodriguez Tsouroukdissian 2011-03-08 10:18:02 UTC
Created attachment 132 [details]
part 2: Non-allocating versions of HouseholderSequence::applyThisOnTheRight/Left

For HouseholderSequence objects, applyThisOnTheRight/Left was allocating a
temporary. An alternative overload has been provided. Should this change be also reflected on Core/EigenBase.h ? (not done in patch).
Comment 20 Benoit Jacob 2011-03-08 13:54:49 UTC
(In reply to comment #19)
> Created attachment 132 [details]
> part 2: Non-allocating versions of
> HouseholderSequence::applyThisOnTheRight/Left
> 
> For HouseholderSequence objects, applyThisOnTheRight/Left was allocating a
> temporary. An alternative overload has been provided. Should this change be
> also reflected on Core/EigenBase.h ? (not done in patch).

For now let's only do what's needed to fix JacobiSVD. We can do the EigenBase API addition later.

Can you please rename temp to workspace for homogeneity with Householder.h, and fix indentation:

-      Matrix<Scalar,1,Dest::RowsAtCompileTime> temp(dst.rows());
+       Matrix<Scalar,1,Dest::RowsAtCompileTime,RowMajor,1,

and here too:

         dst.rightCols(rows()-m_shift-actual_k)
-           .applyHouseholderOnTheRight(
+        .applyHouseholderOnTheRight(

We normally align the . in chained method calls.

Also, since 'temp' is a plain matrix, a more elegant shortcut for &temp.coeffRef(0) is just temp.data(). Can you please make this change?
Comment 21 Adolfo Rodriguez Tsouroukdissian 2011-03-08 14:45:38 UTC
Created attachment 134 [details]
part 2: Non-allocating versions of HouseholderSequence::applyThisOnTheRight/Left

Incorporate code homogeneity fixes.
Comment 22 Adolfo Rodriguez Tsouroukdissian 2011-03-08 14:48:04 UTC
> OK, I checked in this patch together with a unit test. That required me to add
> a runtime variant of EIGEN_NO_MALLOC. So now, when you define
> EIGEN_RUNTIME_NO_MALLOC, you get to use the new function
> internal::set_is_malloc_allowed(bool).

This addition is great, and The Right Thing to ensure that no regressions are introduced in the code, allocation-wise. Many thanks.
Comment 23 Adolfo Rodriguez Tsouroukdissian 2011-03-08 19:13:24 UTC
Created attachment 135 [details]
part 3: Reimplement FullPivHouseholderQR<T>::matrixQ() using ReturnByValue

Implementation of ReturnByValue specialization contains two overloads of evalTo(): an API-preserving one, and an additional one that accepts a working vector for preventing allocations during evalTo().

If you're OK with the two versions of evalTo(), I can also propose a similar patch for HouseholderSequence::evalTo() (bug 211).
Comment 24 Adolfo Rodriguez Tsouroukdissian 2011-03-09 13:15:33 UTC
Created attachment 136 [details]
part 2: Non-allocating versions of HouseholderSequence::applyThisOnTheRight/Left

Added overload of HouseholderSequence::evalTo that takes work vector. Fixes issue discussed in bug 211, at least from the POV of my usecase.
Comment 25 Adolfo Rodriguez Tsouroukdissian 2011-03-09 13:18:08 UTC
Created attachment 137 [details]
part 3: Reimplement FullPivHouseholderQR<T>::matrixQ() using ReturnByValue

Trivial addition: Missing (just-in-case) resize of work vector in FullPivHouseholderQRMatrixQReturnType::evalTo
Comment 26 Adolfo Rodriguez Tsouroukdissian 2011-03-09 13:25:34 UTC
Created attachment 138 [details]
part 4: Removes heap allocations from JacobiSVD and its preconditioners

Requires parts 1-3.
Comment 27 Benoit Jacob 2011-03-14 10:28:16 UTC
Comment on attachment 138 [details]
part 4: Removes heap allocations from JacobiSVD and its preconditioners

I have a few problems with this patch.

>+    if (computeThinV) m_thinVWorkspace.resize(rows);
>+    else m_fullVWorkspace.resize(cols);

This seems wrong. Just because computeThinV is false doesn't mean that we're going to compute the full V, right? We could also not be computing V at all.

>+private:
>+  ColPivHouseholderQR<TransposeTypeWithSameStorageOrder> m_qr;
>+  TransposeTypeWithSameStorageOrder m_adjoint;

Why store the adjoint as a member there?

>+  ThinVWorkspaceType m_thinVWorkspace;
>+  FullVWorkspaceType m_fullVWorkspace;

Not a fan of having members for all cases there. The workspace is just a plain C array, i.e. it doesn't have to be specifically of the right Eigen type, so you could just have a single member of some Vector type here, same type for both use cases.
Comment 28 Benoit Jacob 2011-03-14 10:29:32 UTC
Also, note that part 2 still has some indentation problems, as in

    object.method1()
    .method2()

instead of 

    object.method1()
          .method2()
Comment 29 Adolfo Rodriguez Tsouroukdissian 2011-03-17 21:51:08 UTC
(In reply to comment #27)
> Comment on attachment 138 [details]
> part 4: Removes heap allocations from JacobiSVD and its preconditioners
> 
> I have a few problems with this patch.
> 
> >+    if (computeThinV) m_thinVWorkspace.resize(rows);
> >+    else m_fullVWorkspace.resize(cols);
> 
> This seems wrong. Just because computeThinV is false doesn't mean that we're
> going to compute the full V, right? We could also not be computing V at all.

You're right. I realized this after submitting the patch but never got around fixing it.

> 
> >+private:
> >+  ColPivHouseholderQR<TransposeTypeWithSameStorageOrder> m_qr;
> >+  TransposeTypeWithSameStorageOrder m_adjoint;
> 
> Why store the adjoint as a member there?

Will check.

> 
> >+  ThinVWorkspaceType m_thinVWorkspace;
> >+  FullVWorkspaceType m_fullVWorkspace;
> 
> Not a fan of having members for all cases there. The workspace is just a plain
> C array, i.e. it doesn't have to be specifically of the right Eigen type, so
> you could just have a single member of some Vector type here, same type for
> both use cases.
Not a fan of the solution I came up with either. What you propose is to have a single vector member and Map<...> to the appropriate type when used?. Seems cleaner. I didn't think of it at the moment. Still learning how to better take advantage of Eigen.
I won't be able to submit further fixes for the next 3+ weeks (paternity leave, hooray!), the downside being that these patches may not make it on time to 3.0 :(

Many thanks for the code reviews so far.
Comment 30 Adolfo Rodriguez Tsouroukdissian 2011-04-07 13:13:22 UTC
Created attachment 152 [details]
part 2: Non-allocating versions of HouseholderSequence::applyThisOnTheRight/Left

Workspace parameters are now of type MatrixBase<T> instead of T. This is a requirement for the way part4 is currently implemented.
Comment 31 Adolfo Rodriguez Tsouroukdissian 2011-04-07 13:18:30 UTC
Created attachment 153 [details]
part 4: Removes heap allocations from JacobiSVD and its preconditioners

(In reply to comment #27)
> >+    if (computeThinV) m_thinVWorkspace.resize(rows);
> >+    else m_fullVWorkspace.resize(cols);
> 
> This seems wrong. Just because computeThinV is false doesn't mean that we're
> going to compute the full V, right? We could also not be computing V at all.

Fixed.

> >+private:
> >+  ColPivHouseholderQR<TransposeTypeWithSameStorageOrder> m_qr;
> >+  TransposeTypeWithSameStorageOrder m_adjoint;
>
> Why store the adjoint as a member there?

Because doing m_qr.compute(matrix.adjoint()) is causing one unwanted allocation at compute() time. The m_adjoint member defers the allocation to initialization time. Is there a more memory-efficient way of going about this?. Is this extra allocation happening because decompositions are templated on the matrix type, and not MatrixBase, hence the adjoint gets evaluated?.

>
> >+  ThinVWorkspaceType m_thinVWorkspace;
> >+  FullVWorkspaceType m_fullVWorkspace;
>
> Not a fan of having members for all cases there. The workspace is just a plain
> C array, i.e. it doesn't have to be specifically of the right Eigen type, so
> you could just have a single member of some Vector type here, same type for
> both use cases.

It's now implemented with a single workspace member, but attachment part2 had to be modified in order to make it play nice with fixed-size vectors.
Comment 32 Adolfo Rodriguez Tsouroukdissian 2011-04-08 08:58:51 UTC
Created attachment 154 [details]
part 4: Removes heap allocations from JacobiSVD and its preconditioners
Comment 33 Adolfo Rodriguez Tsouroukdissian 2011-04-19 11:43:25 UTC
Created attachment 158 [details]
part 2: Non-allocating versions of HouseholderSequence::applyThisOnTheRight/Left

Reverted last submitted changes to attachment, as they were not needed.
Comment 34 Adolfo Rodriguez Tsouroukdissian 2011-04-19 11:46:51 UTC
Created attachment 159 [details]
part 4: Removes heap allocations from JacobiSVD and its preconditioners

Cosmetic changes: Take advantage of the fact that QR preconditioners are friends of JacobiSVD, and simplify parts of patch.
Comment 35 Benoit Jacob 2011-10-31 04:09:38 UTC
Comment on attachment 158 [details]
part 2: Non-allocating versions of HouseholderSequence::applyThisOnTheRight/Left

Review of attachment 158 [details]:
-----------------------------------------------------------------

very good patch... sorry it took me so long to get back to my eigen reviews. as of today we have a good patch review tool, at least.
Comment 36 Benoit Jacob 2011-10-31 04:16:41 UTC
Comment on attachment 159 [details]
part 4: Removes heap allocations from JacobiSVD and its preconditioners

Review of attachment 159 [details]:
-----------------------------------------------------------------

looks good, I hope this still applies...
Comment 37 Benoit Jacob 2011-10-31 05:20:43 UTC
Rebased and pushed your patches:
 - part 1 was already pushed
 - part 2 pushed as cset 7419a8d76ac7
 - part 3 was partially pushed already; pushed the rest as cset 053b0c623e42
 - part 2 pushed as cset 0fa573497f3c

Thanks for your patience! and for the great patches.
Comment 38 Benoit Jacob 2011-10-31 05:21:22 UTC
i meant: part 4 pushed as cset 0fa573497f3c

Note You need to log in before you can comment on or make changes to this bug.