Difference between revisions of "User:Tellenbach"

(List of fixed issues in Eigen 3.4)
(List of commits in Eigen 3.4)
Line 237: Line 237:
 
|}
 
|}
  
== List of commits in Eigen 3.4 ==
+
== Commits in Eigen 3.4 ==
  
 
* [https://gitlab.com/libeigen/eigen/-/commit/0b56b62f30bec7ac27fe50f7c1d8ffce299218b7 Commit 0b56b62f3]: Reverse compare logic in F32ToBf16 since vec_cmpne is not available in Power8 - now compiles for clang10 default (P8).
 
* [https://gitlab.com/libeigen/eigen/-/commit/0b56b62f30bec7ac27fe50f7c1d8ffce299218b7 Commit 0b56b62f3]: Reverse compare logic in F32ToBf16 since vec_cmpne is not available in Power8 - now compiles for clang10 default (P8).

Revision as of 23:55, 13 August 2021

List of fixed issues in Eigen 3.4

Issue #2298 List of dense linear decompositions lacks complete orthogonal decomposition
Issue #2284 JacobiSVD Outputs Invalid U (Reads Past End of Array)
Issue #2267 [3.4 bug] FixedInt<0> error with gcc 4.9.3
Issue #2263 usage of signed zeros leads to wrong results with -ffast-math
Issue #2251 Method unaryExpr() does not support function pointers in Eigen 3.4rc1
Issue #2242 No matching function for call to "..." in 'Complex.h' and 'GenericPacketMathFunctions.h'
Issue #2229 Copies (& potentially moves?) of Eigen object with large unused MaxRows/ColAtCompileTime are slow (Regression from Eigen 3.2)
Issue #2213 template maxCoeff<PropagateNaN> compilation error with Eigen 3.4.
Issue #2209 unaryExpr deduces wrong return type on MSVC
Issue #2157 forward_adolc test fails since PR !363
Issue #2119 Move assignment swaps even for non-dynamic storage
Issue #2112 Build failure with boost::multiprecision type
Issue #2093 Incorrect evaluation of Ref
Issue #1906 Eigen failed with error C2440 with MSVC on windows
Issue #1850 error C4996: 'std::result_of<T>': warning STL4014: std::result_of and std::result_of_t are deprecated in C++17. They are superseded by std::invoke_result and std::invoke_result_t
Issue #1833 c++20 compilation failure
Issue #1826 -Wdeprecated-anon-enum-enum-conversion warnings (c++20)
Issue #1815 IndexedView of a vector should allow linear access
Issue #1805 Uploaded doxygen documentation does not build LaTeX formulae
Issue #1790 packetmath_1 unit test fails
Issue #1788 Rule-of-three/rule-of-five violations
Issue #1776 subvector_stl_iterator::operator-> triggers 'taking address of rvalue' warning
Issue #1774 std::cbegin() returns non-const iterator
Issue #1752 A change to the C++ Standard will break some tests
Issue #1741 Map<>.noalias()=A*B gives wrong result
Issue #1736 Column access of some IndexedView won't compile
Issue #1718 Use of builtin vec_sel is ambiguous when compiling with Clang for PowerPC
Issue #1695 Stuck in loop for a certain input when using mpreal support
Issue #1692 pass enumeration argument to constructor of VectorXd
Issue #1684 array_reverse fails with clang >=6 + AVX + -O2
Issue #1674 SIMD sin/cos gives wrong results with -ffast-math
Issue #1669 Zero-sized matrices generate assertion failures
Issue #1664 dot product with single column block fails with new static checks
Issue #1652 Corner cases in SIMD sin/cos
Issue #1643 Compilation failure
Issue #1637 Register spilling with recent gcc & clang
Issue #1619 const_iterator vs iterator compilation error
Issue #1615 Performance of (aliased) matrix multiplication with fixed size 3x3 matrices slow
Issue #1611 NEON: plog(+/-0) should return -inf and not NaN
Issue #1585 Matrix product is repeatedly evaluated when iterating over the product expression
Issue #1557 Fail to compute eigenvalues for a simple 3x3 companion matrix for root finding
Issue #1544 SparseQR generates incorrect Q matrix in complex case
Issue #1543 "Fix linear indexing in generic block evaluation" breaks Matrix*Diagonal*Vector product
Issue #1493 dense Q extraction and solve is sometimes erroneous for complex matrices
Issue #1453 Strange behavior for Matrix::Map, if only InnerStride is provided
Issue #1409 Add support for C++17 operator new alignment
Issue #1340 Add operator + to sparse matrix iterator
Issue #1318 More robust quaternion from matrix
Issue #1306 Add support for AVX512 to Eigen
Issue #1305 Implementation of additional component-wise unary functions
Issue #1221 I get tons of error since my distribution upgraded to GCC 6.1.1
Issue #1195 vectorization_logic fails: Matrix3().cwiseQuotient(Matrix3()) expected CompleteUnrolling, got NoUnrolling
Issue #1194 Improve det4x4
Issue #1049 std::make_shared fails to fulfill structure alignment
Issue #1046 fixed matrix types do not report correct alignment requirements
Issue #1014 Eigenvalues 3x3 matrix
Issue #1001 infer dimensions of Dynamic-sized temporaries from the entire expression (if possible)
Issue #977 Add stable versions of normalize() and normalized()
Issue #899 SparseQR occasionally fails for under-determined systems
Issue #864 C++11 alias templates for commonly used types
Issue #751 Make AMD Ordering numerically more robust
Issue #747 Allow for negative stride
Issue #720 Gaussian NullaryExpr
Issue #663 Permit NoChange in setZero, setOnes, setConstant, setRandom
Issue #645 GeneralizedEigenSolver: missing computation of eigenvectors
Issue #632 Optimize addition/subtraction of sparse and dense matrices/vectors
Issue #631 (Optionally) throw an exception when using an unsuccessful decomposition
Issue #564 maxCoeff() returns -nan instead of max, while maxCoeff(&maxRow, &maxCol) works
Issue #556 Matrix multiplication crashes using mingw 4.7
Issue #505 Assert if temporary objects that are still referred to get destructed (was: Misbehaving Product on C++11)
Issue #445 ParametrizedLine should have transform method
Issue #437 [feature request] Add Reshape Operation
Issue #426 Behavior of sum() for Matrix<bool> is unexpected and confusing
Issue #329 Feature request: Ability to get a "view" into a sub-matrix by indexing it with a vector or matrix of indices
Issue #231 STL compatible iterators
Issue #96 Clean internal::result_of
Issue #65 Core - optimize partial reductions
Issue #64 Tests : precision-oriented tests
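
Several entries above concern how NaN values interact with max/min reductions (for example Issue #2213 on maxCoeff<PropagateNaN> and Issue #564 on maxCoeff() returning NaN). The sketch below illustrates the two policies in plain C++ without using Eigen. The names NaNPolicy and max_coeff are hypothetical and exist only for this illustration; this is not Eigen's implementation, only a minimal model of the idea under the assumption that the input is non-empty.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical policy names for this sketch. They only mirror the idea
// behind the PropagateNaN / PropagateNumbers options named in the list.
enum class NaNPolicy { PropagateNaN, PropagateNumbers };

// Max reduction over a NON-EMPTY vector with an explicit NaN policy.
template <NaNPolicy Policy>
double max_coeff(const std::vector<double>& v) {
  double best = v.front();
  for (std::size_t i = 1; i < v.size(); ++i) {
    const double x = v[i];
    if (Policy == NaNPolicy::PropagateNaN) {
      // Once the running result is NaN it stays NaN, so stop early.
      if (std::isnan(best)) break;
      if (std::isnan(x) || x > best) best = x;
    } else {
      // Replace a NaN running result with any value; otherwise a NaN
      // input is skipped because (NaN > best) is false.
      if (std::isnan(best) || x > best) best = x;
    }
  }
  return best;
}
```

With the input {1.0, NaN, 3.0}, the PropagateNaN variant returns NaN, while the PropagateNumbers variant returns 3.0. Making the policy a template parameter keeps the choice visible at the call site instead of depending on comparison order.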

Commits in Eigen 3.4

  • Commit 0b56b62f3: Reverse compare logic in F32ToBf16 since vec_cmpne is not available in Power8 - now compiles for clang10 default (P8).
  • Commit 44cc96e1a: Get rid of used uninitialized warnings for EIGEN_UNUSED_VARIABLE in gcc11+
  • Commit 576e451b1: Add CompleteOrthogonalDecomposition to the table of linear algebra decompositions.
  • Commit 0d8901270: Update code snippet for tridiagonalize_inplace.
  • Commit 6d2506040: Revise the meta_least_common_multiple function template: add a bool variable to check whether A is larger than B. This reduces compile time if A is smaller than B and avoids a compilation failure when A is small and B is large.
  • Commit cb44a003d: Do not set AnnoyingScalar::dont_throw if not defined EIGEN_TEST_ANNOYING_SCALAR_DONT_THROW.
  • Commit 13d7658c5: Fix errors on older compilers (gcc 7.5 - lack of vec_neg, clang10 - can not use const pointers with vec_xl).
  • Commit 338924602: added includes for unordered_map
  • Commit 93bff85a4: remove denormal flushing in fp32tobf16 for avx & avx512
  • Commit 4e0357c6d: Avoid memory allocation in tridiagonalization_inplace_selector::run.
  • Commit 1e9f623f3: Do not build shared libs if not supported
  • Commit 4240b480e: updated documentation for middleCol and middleRow
  • Commit 5b83d3c4b: Make inverse 3x3 faster and avoid gcc bug.
  • Commit 46ecdcd74: Fix MPReal detection and support.
  • Commit 9a1691a14: Fix cmake warnings, FindPASTIX/FindPTSCOTCH.
  • Commit bb33880e5: Fix TriSycl CMake files.
  • Commit 237c59a2a: Modify scalar pzero, ptrue, pselect, and p<binary> operations to avoid memset.
  • Commit 3dc42eeae: Enable equality comparisons on GPU.
  • Commit 7adc1545b: fix:typo in dox (has->have)
  • Commit c0c7b695c: Fix assignment operator issue for latest MSVC+NVCC.
  • Commit c334eece4: _DerType -> DerivativeType as underscore-followed-by-caps is a reserved identifier
  • Commit 5ccb72b2e: Fixed typo in TutorialSparse.dox
  • Commit 9c90d5d83: Fixes #1387 for compilation error in JacobiSVD with HouseholderQRPreconditioner that occurs when input is a compile-time row vector.
  • Commit 5d37114fc: Fix explicit default cache size typo.
  • Commit 930696fc5: Enable extract et al. for HIP GPU.
  • Commit 56966fd2e: Defer to std::fill_n when filling a dense object with a constant value.
  • Commit 5a3c9eddb: Removed superfluous boolean `degenerate` in TensorMorphing.h.
  • Commit 69ec4907d: Make a copy of the input matrix when trying to compute the inverse in place; this fixes #2285.
  • Commit 7571704a4: Fix CMake directory issues.
  • Commit 84955d109: Fix Tensor documentation page.
  • Commit 601814b57: Don't crash when attempting to shuffle an empty tensor.
  • Commit 05bab8139: Fix breakage of conj_helper in conjunction with custom types introduced in !537.
  • Commit eebde572d: Create the ability to disable the specialized gemm_pack_rhs in Eigen (only PPC) for TensorFlow
  • Commit 8190739f1: Fix compile issues for gcc 4.8.
  • Commit b6db01343: Fix inverse nullptr/asan errors for LU.
  • Commit 1f6b1c1a1: Fix duplicate definitions on Mac
  • Commit 517294d6e: Make DenseStorage<> trivially_copyable
  • Commit 94e2250b3: Correct declarations for aarch64-pc-windows-msvc
  • Commit d82d91504: Modify tensor argmin/argmax to always return first occurrence.
  • Commit 380d0e491: Get rid of redundant `pabs` instruction in complex square root.
  • Commit e83af2cc2: Commit 52a5f982 broke conjhelper functionality for HIP GPUs.
  • Commit 413ff2b53: Small cleanup: Get rid of the macros EIGEN_HAS_SINGLE_INSTRUCTION_CJMADD and CJMADD, which were effectively unused, apart from on x86, where the change results in identically performing code.
  • Commit a235ddef3: Get rid of code duplication for conj_helper. For packets where LhsType=RhsType a single generic implementation suffices. For scalars, the generic implementation of pconj automatically forwards to numext::conj, so much of the existing specialization can be avoided. For mixed types we still need specializations.
  • Commit 4780d8dfb: Fix typo in SelfAdjointEigenSolver_eigenvectors.cpp
  • Commit fd5d23fdf: Update ComplexEigenSolver_eigenvectors.cpp
  • Commit a2040ef79: Rewrite balancer to avoid overflows.
  • Commit c2c0f6f64: Fix fix<> for gcc-4.9.3.
  • Commit ee4e099aa: Remove pset, replace with ploadu.
  • Commit 9fc93ce31: EIGEN_STRONG_INLINE was NOT inlining in some critical needed areas (6.6X slowdown) when used with Tensorflow. Changing to EIGEN_ALWAYS_INLINE where appropriate.
  • Commit 1374f49f2: Add missing ppc pcmp_lt_or_nan<Packet8bf>
  • Commit 2d6eaaf68: Fix placement of permanent GPU defines.
  • Commit 47722a66f: Fix more enum arithmetic.
  • Commit 5e75331b9: Fix checking of version number for mingw.
  • Commit b5fc69bdd: Add ability to permanently enable HIP/CUDA gpu* defines.
  • Commit 4b683b65d: Allow custom TENSOR_CONTRACTION_DISPATCH macro.
  • Commit 1cb1ffd5b: Use bit_cast to create -0.0 for floating point types to avoid compiler optimization changing sign with -ffast-math enabled.
  • Commit 4b502a721: Fix c++20 warnings about using enums in arithmetic expressions.
  • Commit 85868564d: Fix parsing of version for nvhpc
  • Commit cbb6ae629: Removed dead code from GPU float16 unit test.
  • Commit 573570b6c: Remove EIGEN_DEVICE_FUNC from CwiseBinaryOp's default copy constructor.
  • Commit 98cf1e076: Add missing NEON ptranspose implementations.
  • Commit ee2a8f713: Modify Unary/Binary/TernaryOp evaluators to work for non-class types.
  • Commit 383504630: predux_half_dowto4 test extended to all applicable packets
  • Commit 4fbd01cd4: Adds macro for checking if C++14 variable templates are supported
  • Commit a883a8797: Use derived object type in conservative_resize_like_impl
  • Commit 0bd9e9bc4: ptranspose test for non-square kernels added
  • Commit 77c66e368: Ensure all generated matrices for inverse_4x4 tests are invertible; this fixes #2248.
  • Commit 2f908f825: Changing the storage of the SSE complex packets to that of the wrapper. This should fix #2242.
  • Commit 82f13830e: Fix calls to device functions from host code
  • Commit d1825cbb6: Device implementation of log for std::complex types.
  • Commit d9288f078: Fix ambiguity due to argument dependent lookup.
  • Commit 85ebd6aff: Fix for issue where numext::imag and numext::real are used before they are defined.
  • Commit 2947c0cc8: Restore ABI compatibility for conj with 3.3, fix conflict with boost.
  • Commit 25424f4cf: Clean up gpu device properties.
  • Commit 42acbd570: Fix numext::arg return type.
  • Commit 9e0dc8f09: Revert addition of unused `paddsub<Packet2cf>`. This fixes #2242
  • Commit da19f7a91: Simplify TensorRandom and remove time-dependence.
  • Commit fc2cc1084: Better CUDA complex division.
  • Commit a33855f6e: Add missing pcmp_lt_or_nan for NEON Packet4bf.
  • Commit 83df5df61: Added complex matrix unit tests for SelfAdjointEigenSolver
  • Commit ac3c5aad3: Tests added and AVX512 bug fixed for pcmp_lt_or_nan
  • Commit 63abb1000: Tests for pcmp_lt and pcmp_le added
  • Commit baf601a0e: Fix for issue with static global variables in TensorDeviceGpu.h
  • Commit 587a69151: Check existence of BSD random before use.
  • Commit 8830d66c0: DenseStorage safely copy/swap.
  • Commit 54425a39b: Make vectorized compute_inverse_size4 compile with AVX.
  • Commit 34d0be9ec: Compilation of basicbenchmark fixed
  • Commit 42a8bdd4d: HasExp added for AVX512 Packet8d
  • Commit 28564957a: Fix taking address of rvalue compiler issue with TensorFlow (plus other warnings).
  • Commit ab7fe215f: Fix ldexp for AVX512 (#2215)
  • Commit 1f4c0311c: Bump to 3.3.91 (3.4-rc1)
  • Commit 3e819d83b: Before 3.4 branch
  • Commit 69adf26aa: Modify googlehash use to account for namespace issues.
  • Commit 9357feedc: Avoid using uninitialized inputs and if available, use slightly more efficient `movsd` instruction for `pset1<Packet2cf>`.
  • Commit a2c054201: Fix typo in TensorDimensions.h
  • Commit dfd6720d8: Fix for float16 GPU unit test.
  • Commit 1e1c8a735: Use EIGEN_HAS_CXX11 and EIGEN_COMP_CXXVER macros to detect C++ version for `std::result_of` and `std::invoke_result`. Fixes #2209
  • Commit f6fc66aa7: fixed doxygen for unsupported iterative solver module
  • Commit d58678069: Make iterators default constructible and assignable, by making...
  • Commit 2859db022: This fixes an issue where the compiler was not choosing the GPU specific specialization of ScanLauncher.
  • Commit fcb5106c6: Scaled epsilon the wrong way.
  • Commit 6197ce1a3: Replace `-2147483648` by `-0.0f` or `-0.0` constants (this should fix #2189). Also, remove unnecessary `pgather` operations.
  • Commit 22edb4682: Align local arrays to Packet boundary.
  • Commit ace7f132e: Fix clang tidy warnings in AnnoyingScalar.
  • Commit 90187a33e: Fix SelfAdjointEigenSolver (#2191)
  • Commit 3ddc0974c: Fix two bugs in commit
  • Commit c24bee612: Fix address of temporary object errors in clang11.
  • Commit e4233b6e3: Add CI infrastructure for pre-merge smoke tests.
  • Commit ae95b74af: Add CMake infrastructure for smoke testing
  • Commit 5bbc9cea9: Add an info() method to the SVDBase class to make it possible to tell the user that the computation failed, possibly due to invalid input. Make Jacobi and divide-and-conquer fail fast and return info() == InvalidInput if the matrix contains NaN or +/-Inf.
  • Commit b5a926a0f: Add GitLab templates for issues and merge requests
  • Commit 78ee3d626: Fix CUDA constexpr issues for numeric_limits.
  • Commit af1247fbc: Use Index type in loop over coefficients.
  • Commit 87729ea39: Eliminate `round_impl` double-promotion warnings for c++03.
  • Commit 748489ef9: Un-defining EIGEN_HAS_CONSTEXPR on the HIP platform
  • Commit d59ef212e: Fixed performance issues for complex VSX and P10 MMA in gebp_kernel (level 3).
  • Commit e7b8643d7: Revert "Revert "Adds EIGEN_CONSTEXPR and EIGEN_NOEXCEPT to rows(), cols(), innerStride(), outerStride(), and size()""
  • Commit 5521c65af: Eliminate mixingtypes_7 warning.
  • Commit 69a4f7095: Revert "Uses _mm512_abs_pd for Packet8d pabs"
  • Commit 824272cde: Re-enable CI for Power
  • Commit 4811e8196: Remove yet another comma at end of enum
  • Commit f019b97ac: Uses _mm512_abs_pd for Packet8d pabs
  • Commit 0cc9b5eb4: Split test commainitializer into two subtests
  • Commit c3fbc6cec: Use singleton pattern for static registered tests.
  • Commit ed964ba3f: Proposed fix for issue #2187
  • Commit 8dfe1029a: Augment NumTraits with min/max_exponent() again.
  • Commit eb71e5db9: Fix another warning on missing commas
  • Commit df4bc2731: Revert "Augment NumTraits with min/max_exponent()."
  • Commit 75ce9cd2a: Augment NumTraits with min/max_exponent().
  • Commit 9fb706244: Silence warning on comma at end of enumerator list
  • Commit b8502a9dd: Updated SelfAdjointEigenSolver documentation to include that the eigenvectors matrix is unitary.
  • Commit 2e83cbbba: Add NaN propagation options to minCoeff/maxCoeff visitors.
  • Commit c0a889890: Fixed output of complex matrices
  • Commit f612df273: Add fmod(half, half).
  • Commit 14b7ebea1: Fix numext::round pre c++11 for large inputs.
  • Commit c9d4367fa: Fix pround and add print
  • Commit d24f9f9b5: Fix NVCC+ICC issues.
  • Commit 14487ed14: Add increment/decrement operators to Eigen::half.
  • Commit b27111078: Bump up rand histogram threshold.
  • Commit d098c4d64: Disable EIGEN_OPTIMIZATION_BARRIER for PPC clang.
  • Commit 543e34ab9: Re-implement move assignments.
  • Commit b8d1857f0: [MSVC-specific] Define EIGEN_ARCH_x86_64 for native x64 (_M_X64 is defined and _M_ARM64EC is not), and define EIGEN_ARCH_ARM64 for both native ARM64 (_M_ARM64 is defined) and ARM64EC (_M_ARM64EC is defined). _M_ARM64EC is defined when the code is compiled by MSVC for ARM64EC, a new ARM64 ABI designed to be compatible with x64 application emulation on ARM64. If _M_ARM64EC is defined, _M_X64 and _M_AMD64 are also defined, so x64-specific code (especially intrinsics) is also compiled to ARM64 instructions (compliant with the ARM64EC ABI) for maximum x64 compatibility. Although a majority of x64-specific intrinsics can be emulated by ARM64 instructions, it is still good to simply recompile the native ARM64 code paths to ARM64EC for pure computation tasks, for performance reasons.
  • Commit 853a5c4b8: Fix ambiguous call to CUDA __half constructor.
  • Commit 94327dbfb: Fix typo: DEVICE -> GPU
  • Commit 1296abdf8: Fix non-trivial Half constructor for CUDA.
  • Commit 604524314: Revert stack allocation limit change that crept in.
  • Commit 1a96d49af: Changing the Eigen::half implementation for HIP
  • Commit 2468253c9: Define EIGEN_CPLUSPLUS and replace most __cplusplus checks.
  • Commit 82d61af3a: Fix rint SSE/NEON again, using optimization barrier.
  • Commit 5f0b4a401: Revert "Adds EIGEN_CONSTEXPR and EIGEN_NOEXCEPT to rows(), cols(), innerStride(), outerStride(), and size()"
  • Commit 6cbb3038a: Adds EIGEN_CONSTEXPR and EIGEN_NOEXCEPT to rows(), cols(), innerStride(), outerStride(), and size()
  • Commit 5bfc67f9e: Deactivate CI for Power due to problems with GitLab runner
  • Commit a6601070f: Add log2 operation to TensorBase
  • Commit 9a663973b: Revert "Fix rint for SSE/NEON."
  • Commit e72dfeb8b: Fix rint for SSE/NEON.
  • Commit 199c5f2b4: geo_alignedbox_5 was failing with AVX enabled, due to storing `Vector4d` in a `std::vector` without using an aligned allocator. Got rid of using `std::vector` and simplified the code. Avoid leading `_`
  • Commit 1e0c7d4f4: Add print for SSE/NEON, use NEON rounding intrinsics if available.
  • Commit 976ae0ca6: Document that using raw function pointers doesn't work with unaryExpr.
  • Commit c65c2b31d: Make half/bfloat16 constructor take inputs by value, fix powerpc test.
  • Commit 39a590dfb: Remove unused include
  • Commit 8f686ac4e: clang 10 aggressively warns about precision loss when converting int to float (or long to double)
  • Commit 2660d01fa: Inherit from `no_assignment_operator` to avoid implicit copy constructor warnings
  • Commit a3521d743: Fix some enum-enum conversion warnings
  • Commit ca528593f: Fixed/masked more implicit copy constructor warnings
  • Commit 81b5fe2f0: ReturnByValue is already non-copyable
  • Commit 4fb3459a2: Fix double-promotion warnings
  • Commit 4bfcee47b: Idrs iterative linear solver
  • Commit 29ebd84cb: Fix NEON sqrt for 32-bit, add prsqrt.
  • Commit fe19714f8: Merge branch 'rmlarsen1/eigen-nan_prop'
  • Commit e67672024: Merge branch 'nan_prop' of https://gitlab.com/rmlarsen1/eigen into nan_prop
  • Commit 5e7d4c33d: Add TODO.
  • Commit fb5b59641: Defer default for minCoeff/maxCoeff to templated variant.
  • Commit e19829c3b: Fix floor/ceil for NEON fp16.
  • Commit 5529db752: Fix SSE/NEON pfloor/pceil for saturated values.
  • Commit 51eba8c3e: Fix indentation.
  • Commit 5297b7162: Make it possible to specify NaN propagation strategy for maxCoeff/minCoeff reductions.
  • Commit ecb7b19df: Disable new/delete test for HIP
  • Commit 6eebe97ba: Fix clang compile when no MMA flags are set. Simplify MMA compiler detection.
  • Commit f284c8592: Don't crash when attempting to slice an empty tensor.
  • Commit 4cb0592af: Fix indentation.
  • Commit 6b34568c7: Merge branch 'nan_prop' of https://gitlab.com/rmlarsen1/eigen into nan_prop
  • Commit 0065f9d32: Make it possible to specify NaN propagation strategy for maxCoeff/minCoeff reductions.
  • Commit 841c8986f: Make it possible to specify NaN propagation strategy for maxCoeff/minCoeff reductions.
  • Commit 113e61f36: Remove unused function scalar_cmp_with_cast.
  • Commit 98ca58b02: Cast anonymous enums to int when used in expressions.
  • Commit c31ead8a1: Having forward template function declarations in a P10 file causes bad code in certain situations.
  • Commit f44197fab: Some improvements for kissfft from Martin Reinecke (pocketfft author): 1. Compute only about half of the factors and use complex conjugate symmetry for the rest, to save time. 2. Calculate all twiddles in double, because that gives the maximum achievable precision when doing float transforms. 3. Reduce all angles to the range 0<angle<pi/4, which gives even more precision.
  • Commit a31effc3b: Add `invoke_result` and eliminate `result_of` warnings for C++17+.
  • Commit 8523d447a: Fixes to support old and new versions of the compilers for built-ins. Cast to non-const when using vector_pair with certain built-ins.
  • Commit 5908aeeab: Fix CUDA device new and delete, and add test.
  • Commit 119763cf3: Eliminate CMake FindPackageHandleStandardArgs warnings.
  • Commit 6cf0ab5e9: Disable fast psqrt for NEON.
  • Commit aba399827: Fix check if GPU compile phase for std::hash
  • Commit db5691ff2: Fix some CUDA warnings.
  • Commit 88d4c6d4c: Accurate pow, part 2. This change adds specializations of log2 and exp2 for double that make pow<double> accurate to 1 ULP. Speed for AVX-512 is within 0.5% of the current implementation.
  • Commit 2ac0b7873: Fixed sparse conservativeResize() when both num cols and rows decreased.
  • Commit 10c77b0ff: Fix compilation errors with later versions of GCC and use of MMA.
  • Commit 73922b017: Fixes Bug #1925. Packets should be passed by const reference, even to inline functions.
  • Commit 5f9cfb252: Add missing adolc isinf/isnan.
  • Commit ce4af0b38: Missing change regarding #1910
  • Commit a7749c09b: Bug #1910: Make SparseCholesky work for RowMajor matrices
  • Commit 128eebf05: Revert "add EIGEN_DEVICE_FUNC to EIGEN_MAKE_ALIGNED_OPERATOR_NEW_IF macros (only if not HIPCC)."
  • Commit 33e0af013: Return nan at poles of polygamma, digamma, and zeta if limit is not defined
  • Commit 7f09d3487: Use the Cephes double subtraction trick in pexp<float> even when FMA is available. Otherwise the accuracy drops from 1 ulp to 3 ulp.
  • Commit 12fd3dd65: add EIGEN_DEVICE_FUNC to EIGEN_MAKE_ALIGNED_OPERATOR_NEW_IF macros (only if not HIPCC).
  • Commit aa8b22e77: Bump to 3.4.99
  • Commit 5336ad859: Define internal::make_unsigned for [unsigned]long long on macOS.
  • Commit 0845df7f7: Fix uninitialized warning on AVX.
  • Commit 9b51dc797: Fixed performance issues for VSX and P10 MMA in general_matrix_matrix_product
  • Commit be0574e21: New accurate algorithm for pow(x,y). This version is accurate to 1.4 ulps for float, while still being 10x faster than std::pow for AVX512. A future change will introduce a specialization for double.
  • Commit 7ff0b7a98: Updated pfrexp implementation.
  • Commit 9ad4096cc: Document possible inconsistencies when using `Matrix<bool, ...>`
  • Commit f702792a7: missing method in packetmath.h void ptranspose(PacketBlock<Packet16uc, 4>& kernel)
  • Commit db61b8d47: Avoid -Wunused warnings in NDEBUG builds.
  • Commit 622c59894: Don't allow all test jobs to fail but only the currently failing ones.
  • Commit 90ee821c5: Use vrsqrts for rsqrt Newton iterations.
  • Commit 9fde9cce5: Adjust bounds for pexp_float/double
  • Commit 4cb563a01: Fix ldexp implementations.
  • Commit 7eb07da53: loop less ptranspose
  • Commit 36200b785: Remove vim specific comments to recognize correct file-type.
  • Commit 54589635a: Replace nullptr by NULL in SparseLU.h to be C++03 compliant.
  • Commit 984d010b7: add specialization of check_sparse_solving() for SuperLU solver, in order to test adjoint and transpose solves
  • Commit b57893065: Fix documentation typos in LDLT.h
  • Commit 66841ea07: Enable bdcsvd on host.
  • Commit 6e3b795f8: Add more tests for pow and fix a corner case for huge exponent where the result is always zero or infinite unless x is one.
  • Commit abcde69a7: Disable vectorized pow for half/bfloat16.
  • Commit f85038b7f: Fix excessive GEBP register spilling for 32-bit NEON.
  • Commit 56c8b14d8: Eliminate implicit conversions from float to double.
  • Commit fb4548e27: Implement bit_* for device.
  • Commit 1615a2799: Fix altivec packetmath.
  • Commit 1414e2212: Fix clang compilation for AltiVec from previous check-in
  • Commit 170a504c2: Add the following functions
  • Commit 598e1b6e5: Add the following functions:
  • Commit 0668c68b0: Allow for negative strides.
  • Commit 288d456c2: Replace language_support module with builtin CheckLanguage
  • Commit 3f4684f87: Include `<cstdint>` in one place, remove custom typedefs
  • Commit 0784d9f87: Fix sqrt, ldexp and frexp compilation errors.
  • Commit a4edb1079: fix test of ExtractVolumePatchesOp
  • Commit 4c42d5ee4: Eliminate implicit conversion warning in test/array_cwise.cpp
  • Commit e0d13ead9: Replace std::isnan with numext::isnan for c++03
  • Commit c35965b38: Remove unused variable in SparseLU.h
  • Commit f0e46ed5d: Fix pow and other cwise ops for half/bfloat16.
  • Commit f19bcffee: Specialize std::complex operators for use on GPU device.
  • Commit 65e2169c4: Add support for Arm SVE
  • Commit b2126fd6b: Fix pfrexp/pldexp for half.
  • Commit 25d8498f8: Fix stable_norm_1 test.
  • Commit 660c6b857: Remove std::cerr in iterative solver since we don't have iostream.
  • Commit d5b798111: Fix signed-unsigned comparison.
  • Commit e409795d6: Proper CPUID
  • Commit cdd8fdc32: Vectorize `pow(x, y)`. This closes https://gitlab.com/libeigen/eigen/-/issues/2085, which also contains a description of the algorithm.
  • Commit bde674164: Improved std::complex sqrt and rsqrt.
  • Commit 21a8a2487: fix paddings of TensorVolumePatchOp
  • Commit 38ae5353a: 1) Provide a better generic paddsub op implementation. 2) Make the paddsub op support Packet2cf/Packet4f/Packet2f in NEON. 3) Make the paddsub op support Packet2cf/Packet4f in SSE.
  • Commit 352f1422d: Remove `inf` local variable.
  • Commit 204408497: Remove TODO from Transform::computeScaleRotation()
  • Commit 3daf92c7a: Transform::computeScalingRotation flush determinant to +/- 1.
  • Commit 587fd6ab7: Only specialize complex `sqrt_impl` for CUDA if not MSVC.
  • Commit 2a6addb4f: Fix for breakage in ROCm support - 210108
  • Commit f149e0ebc: Fix MSVC complex sqrt and packetmath test.
  • Commit 8d9cfba79: Fix rand test for MSVC.
  • Commit e741b4366: Make Transform::computeRotationScaling(0,&S) continuous
  • Commit 0bdc0dba2: Add missing #endif directive in Macros.h
  • Commit cb654b1c4: #define was defined incorrectly because the result_of function was deprecated in c++17 and removed in c++20. Also, EIGEN_COMP_MSVC (which is _MSC_VER) only affects result_of indirectly, which can cause errors.
  • Commit 52d1dd979: Fix Ref initialization.
  • Commit 166fcdecd: Allow CwiseUnaryView to be used on device.
  • Commit bb1de9dbd: Fix Ref Stride checks.
  • Commit 12dda34b1: Eliminate boolean product warnings by factoring out a `combine_scalar_factors` helper function.
  • Commit 070d303d5: Add CUDA complex sqrt.
  • Commit fdf2ee62c: Fix missing EIGEN_DEVICE_FUNC
  • Commit 05754100f: * Add iterative psqrt<double> for AVX and SSE when FMA is available. This provides a ~10% speedup. * Write iterative sqrt explicitly in terms of pmadd. This gives up to 7% speedup for psqrt<float> with AVX & SSE with FMA. * Remove iterative psqrt<double> for NEON, because the initial rsqrt approximation is not accurate enough for convergence in 2 Newton-Raphson steps and with 3 steps, just calling the builtin sqrt insn is faster.
  • Commit 3bee9422d: Merge branch 'lambdaknight/eigen-master'
  • Commit 19e6496ce: Replace call to FixedDimensions() with a singleton instance of FixedDimensions.
  • Commit 6cee8d347: Add an additional step of Newton-Raphson for `psqrt<double>` on Arm, which otherwise has an error of ~1000 ulps.
  • Commit bc7d1599f: TensorStorage with FixedDimensions now has zero instance memory overhead. Removed m_dimension as instance member of TensorStorage with FixedDimensions and instead use the template parameter. This means that the sizeof a pure fixed-size storage is exactly equal to the data it is storing.
  • Commit cf0b5b034: Remove code checking for CMake < 3.5
  • Commit 751f18f2c: Remove comma at the end of enumeration list to silence C++03 warnings
  • Commit 5dc2fbabe: Fix implicit cast to double.
  • Commit 55967f87d: Fix NEON pmax<PropagateNumbers,Packet4bf>.
  • Commit 839aa505c: Fix typo in AVX512 packet math.
  • Commit 536c8a79f: Remove unused macro in Half.h
  • Commit 8c9976d7f: Fix more SSE/AVX packet conversions for peven.
  • Commit c6efc4e0b: Replace M_LOG2E and M_LN2 with custom macros.
  • Commit e82722a4a: Fix MSVC SSE casts.
  • Commit f3d2ea48f: Fix for broken ROCm/HIP Support
  • Commit c7eb3a74c: Don't guard psqrt for std::complex<float> with EIGEN_ARCH_ARM64
  • Commit bccf055a7: Add Armv8 guard on PropagateNumbers implementation.
  • Commit 82c0c18a8: Remove private access of std::deque::_M_impl.
  • Commit 00be0a7ff: Fix vectorization of complex sqrt on NEON
  • Commit 8eb461a43: Remove comma at end of enumerator list in NEON PacketMath
  • Commit 2e8f850c7: Fix a typo in SparseMatrix documentation.
  • Commit 125cc9a5d: Implement vectorized complex square root.
  • Commit 8cfe0db10: Fix host/device calls for __half.
  • Commit baf9d762b: - Enabling PropagateNaN and PropagateNumbers for NEON. - Adding propagate tests to bfloat16.
  • Commit 634bd79b0: Fix unused warning on new `dense_assignment_loop` impl.
  • Commit 655c3a404: Add specialization for compile-time zero-sized dense assignment.
  • Commit 5ec490743: Clean up `#if`s in GPU PacketPath.
  • Commit f9fac1d5b: Add log2() to Eigen.
  • Commit 2dbac2f99: Fix bad NEON fp16 check
  • Commit e2f21465f: Special function implementations for half/bfloat16 packets.
  • Commit 305b8bd27: Remove duplicate #if clause
  • Commit 9ee9ac81d: Fix shfl* macros for CUDA/HIP
  • Commit a9a2f2beb: The function 'prefetch' did not work correctly on the win64 platform
  • Commit f23dc5b97: Revert "Add log2() operator to Eigen"
  • Commit 4d91519a9: Add log2() operator to Eigen
  • Commit 25d8ae746: Small cleanup of generic plog implementations: Adding the term e*ln(2) is split into two steps for no obvious reason. This dates back to the original Cephes code from which the algorithm is adapted. It appears that this was done in Cephes to prevent the compiler from reordering the addition of the 3 terms in the approximation.
  • Commit eb4d4ae07: Include chrono in main for c++11.
  • Commit 71c85df4c: Clean up the Tensor header and get rid of the EIGEN_SLEEP macro.
  • Commit 70fbcf82e: Fix typo in `F32MaskToBf16Mask`.
  • Commit 2627e2f2e: Fix neon cmp* functions for bf16.
  • Commit ddd48b242: Implement CUDA __shfl* for Eigen::half
  • Commit e57281a74: Fix a few issues for AVX512. This change enables vectorized versions of log, exp, log1p, expm1 when AVX512DQ is not available.
  • Commit 1992af3de: Fix #2077, `EIGEN_CONSTEXPR` in `Half`.
  • Commit 7b80609d4: add EIGEN_DEVICE_FUNC to methods
  • Commit 89f90b585: AVX512 missing ops.
  • Commit c5985c46f: Fix typo in doc
  • Commit 68f69414f: Workaround for doxygen class template titles in which the template part of the class signature is lost due to a problem with forward declarations. The problem is probably caused by doxygen bug #7689. It is confirmed to be fixed in doxygen >= 1.8.19.
  • Commit a7170f2ac: Fix doxygen class blocks that were not associated with the correct classes.
  • Commit 550e8f8f5: Include CMakeDependentOption to be able to use cmake_dependent_option
  • Commit 9842366bb: Make inclusion of doc sub-directory optional by adjusting options.
  • Commit aa56e1d98: check for include dirs set
  • Commit 1e74f93d5: Fix some packet-functions in the IBM ZVector packet-math.
  • Commit 79818216e: Revert "Fix Half NaN definition and test."
  • Commit c770746d7: Fix Half NaN definition and test.
  • Commit 22f67b595: Fix boolean float conversion and product warnings.
  • Commit a3b300f1a: Implement missing AVX half ops.
  • Commit 38abf2be4: Fix Half NaN definition and test.
  • Commit 4cf01d2cf: Update AVX half packets, disable test.
  • Commit fd1dcb6b4: Fixes duplicate symbol when building blas
  • Commit 6c9c3f9a1: Remove explicit casts from Eigen::half and Eigen::bfloat16 to bool
  • Commit a8fdcae55: Fix sparse_extra_3, disable counting temporaries for testing DynamicSparseMatrix.
  • Commit 11e4056f6: Re-enable Arm Neon Eigen::half packets of size 8
  • Commit 17268b155: Add bit_cast for half/bfloat to/from uint16_t, fix TensorRandom
  • Commit 41d5d5334: Initialize primitives to fix -Wuninitialized-const-reference.
  • Commit 3669498f5: Fix rule-of-3 for the Tensor module.
  • Commit 60218829b: EOF newline added to InverseSize4.
  • Commit 2d6370654: Add missing parens around macro argument.
  • Commit 6bba58f10: Replace SSE_SHUFFLE_MASK macro with shuffle_mask.
  • Commit e9b55c4db: Avoid promotion of Arm __fp16 to float in Neon PacketMath
  • Commit 117a4c061: Fix missing `EIGEN_CONSTEXPR` pop_macro in `Half`.
  • Commit 394f56405: Unify Inverse_SSE.h and Inverse_NEON.h into a single generic implementation using PacketMath.
  • Commit 8e9cc5b10: Eliminate double-promotion warnings.
  • Commit 9175f50d6: Add EIGEN_DEVICE_FUNC to TranspositionsBase
  • Commit 280f4f240: Enable MathJax in Doxygen.in
  • Commit bb69a8db5: Explicit casts of S -> std::complex<T>
  • Commit 90f6d9d23: Suppress ignored-attributes warning (same as in vectorization_logic). Remove redundant include and using namespace.
  • Commit 8324e5e04: Fix typo in NEON/PacketMath.h
  • Commit 852513e7a: Disable testing of OpenGL by default.
  • Commit bec72345d: Simplify expression for inner product fallback in Gemv product evaluator.
  • Commit 276db21f2: Remove redundant branch for handling dynamic vector*vector. This will be handled by the equivalent branch in the specialization for GemvProduct.
  • Commit cf12474a8: Optimize matrix*matrix and matrix*vector products when they correspond to inner products at runtime.
  • Commit c29935b32: Add support for dynamic dispatch of MMA instructions for POWER 10
  • Commit b714dd970: remove annotation for first declaration of default con/destruction
  • Commit e24a1f57e: [SYCL Function pointer Issue]: SYCL does not support function pointer inside the kernel, due to the portability issue of a function pointer and memory address space among host and accelerators. To fix the issue, function pointers have been replaced by function objects.
  • Commit 696146891: Address issues with `openglsupport` test.
  • Commit 348a48682: Fix erroneous forward declaration of boost nvp.
  • Commit 82fe059f3: Fix issue2045, an error caused by the _mm256_set_m128d op not being supported by gcc 7.x
  • Commit 9d11e2c03: CMakefile update for ROCm 4.0
  • Commit 39a038f2e: Fix for ROCm (and CUDA?) breakage - 201029
  • Commit f895755c0: Remove unused functions in Half.h.
  • Commit 09f015852: Replace numext::as_uint with numext::bit_cast<numext::uint32_t>
  • Commit e265f7ed8: Add support for Armv8.2-a __fp16
  • Commit a725a3233: [SYCL clean up the code] : removing extra #pragma unroll in SYCL which was causing issues in embedded systems
  • Commit b9ff791fe: [Missing SYCL math op]: Adding the missing LDEXP function for SYCL.
  • Commit 61461d682: [Fixing expf issue]: Eigen uses the packet type operation for scalar type float on the Sigmoid function (https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/functors/UnaryFunctors.h#L990). As a result, the SYCL backend breaks, since it only supports packet operations for the vectorized types float4 and double2. The issue has been fixed by adding scalar type float to packet operation pexp for the SYCL backend.
  • Commit ecb7bc951: Bug #2036 make sure find_standard_math_library_test_program actually compiles (and is guaranteed to call math functions)
  • Commit 09f595a26: Make sure compiler does not optimize away calls to math functions
  • Commit 28aef8e81: Improve polynomial evaluation with instruction-level parallelism for pexp_float and pexp<Packet16f>
  • Commit 4a77eda1f: remove unnecessary template specialization of pexp for scalar float/double
  • Commit d9f0d9eb7: Fix missing `pfirst<Packet16b>` for MSVC.
  • Commit 21edea5ed: Fix the specialization of pfrexp for AVX to be faster when AVX2/AVX512DQ is not available, and avoid undefined behavior in C++. Also mask off the sign bit when extracting the exponent.
  • Commit 011e0db31: Fix for ROCm/HIP breakage - 201013
  • Commit 6ea809170: Revert change from 4e4d3f32d168ed9ce09d950f099a60ddcd11240f that broke BFloat16.h build with older compilers.
  • Commit 4700713fa: Add AVX plog<Packet4d> and AVX512 plog<Packet8d> ops, also unified AVX512 plog<Packet16f> op with generic api
  • Commit af6f43d7f: Add specializations for pmin/pmax with prescribed NaN propagation semantics for SSE/AVX/AVX512.
  • Commit 274ef12b6: Remove leftover debug print statement in cxx11_tensor_expr.cpp
  • Commit 208b3626d: Revert generic implementation of `predux`, since it breaks compilation of `predux_any` with MSVC.
  • Commit e3e2cf9d2: Add MatrixBase::cwiseArg()
  • Commit 61fc78bbd: Get rid of nested template specialization in TensorReductionGpu.h, which was broken by c6953f799b01d36f4236b64f351cc1446e0abe17.
  • Commit c6953f799: Add packet generic ops `predux_fmin`, `predux_fmin_nan`, `predux_fmax`, and `predux_fmax_nan` that implement reductions with `PropagateNaN`, and `PropagateNumbers` semantics. Add (slow) generic implementations for most reductions.
  • Commit 807e51528: undefine EIGEN_CONSTEXPR before redefinition
  • Commit 9a4d04c05: Make bitwise_helper a device function to unbreak GPU builds.
  • Commit 4e4d3f32d: Clean up packetmath tests and fix various bugs to make bfloat16 pass (almost) all packetmath tests with SSE, AVX, and AVX512.
  • Commit 7a8d3d5b8: Disable test exceptions when using OpenMP.
  • Commit 9022f5aa8: Mention problems when using potentially throwing scalars and OpenMP
  • Commit d199c17b1: Fix typo in Tutorial_BlockOperations_block_assignment.cpp
  • Commit 4091f6b25: Drop EIGEN_USING_STD_MATH in favour of EIGEN_USING_STD
  • Commit 183a20821: Implement generic bitwise logical packet ops that work for all types.
  • Commit 8f8d77b51: Add EIGEN prefix for HAS_LGAMMA_R
  • Commit 2279f2c62: Use lgamma_r if it is available (update check for glibc 2.19+)
  • Commit b43102440: Don't make assumptions about NaN-propagation for pmin/pmax - it varies across platforms. Change test to only test for NaN-propagation for pfmin/pfmax.
  • Commit f66f3393e: Use reinterpret_cast instead of C-style cast in Inverse_NEON.h
  • Commit 22c971a22: Don't cast away const in Inverse_NEON.h.
  • Commit f93841b53: Use EIGEN_USING_STD to fix CUDA compilation error on BFloat16.h.
  • Commit ee714f79f: Fix CUDA build breakage and incorrect result for absdiff on HIP with long double arguments.
  • Commit f7b185a8b: don't use =*, it might not return a Scalar
  • Commit 9078f47cd: Fix build breakage with MSVC 2019, which does not support MMX intrinsics for 64 bit builds, see: https://stackoverflow.com/questions/60933486/mmx-intrinsics-like-mm-cvtpd-pi32-not-found-with-msvc-2019-for-64bit-targets-c
  • Commit 3b445d9bf: Add generic packet ops corresponding to {std}::fmin and {std}::fmax. The nonsensical NaN-propagation rules for std::min and std::max implemented by pmin and pmax in Eigen are a longstanding source of confusion and bug reports. This change is a first step towards addressing it, as discussed in issue #564.
  • Commit 44b9d4e41: Specialize pldexp_double and pfdexp_double and get rid of Packet2l definition for SSE. SSE does not support conversion between 64 bit integers and double and the existing implementation of casting between Packet2d and Packet2l results in undefined behavior when casting NaN to int. Since pldexp and pfdexp only manipulate exponent fields that fit in 32 bit, this change provides specializations that use existing instructions _mm_cvtpd_pi32 and _mm_cvtsi32_pd instead.
  • Commit d5a0d8949: Fix alignedbox 32-bit precision test failure.
  • Commit 30960d485: Fix failure in GEBP kernel when compiling with OpenMP and FMA
  • Commit f9d1500f7: Revert !182.
  • Commit 068121ec0: Add missing newline at the end of Inverse_NEON.h
  • Commit 74ff5719b: Fix compilation of 64 bit constant arguments to pset1frombits in TypeCasting.h on platforms where uint64_t != unsigned long.
  • Commit 3a0b23e47: Fix compilation of pset1frombits calls on iOS.
  • Commit 6b0c0b587: Provide a more efficient Packet2l->Packet2d cast method
  • Commit 6425e875a: Added AlignedBox::transform(AffineTransform).
  • Commit a967fadb2: Make relative path variables of type STRING
  • Commit e4b24e7fb: Fix Eigen::ThreadPool::CurrentThreadId returning wrong thread id when EIGEN_AVOID_THREAD_LOCAL and NDEBUG are defined
  • Commit ce5c59729: Fix for ROCm/HIP breakage - 200921
  • Commit b8a13f13c: Add CI configuration for ppc64le
  • Commit 821702e77: Fix the #issue1997 and #issue1991 bug triggered by unsupported a[index] ops (type a: __i28d) with the MSVC compiler
  • Commit 493a7c773: Remove EIGEN_CONSTEXPR from NumTraits<boost::multiprecision::number<...>>
  • Commit 38e4a6739: Fix using FindStandardMathLibrary.cmake with -Wall (-Wunused-value) added to CMAKE_CXX_FLAG
  • Commit c4b99f78c: Fix breakage in pcast<Packet2l, Packet2d> due to _mm_cvtsi128_si64 not being available on 32 bit x86. If SSE 4.1 is available use the faster _mm_extract_epi64 intrinsic.
  • Commit 9aad16b44: Fix undefined reference to pset1frombits bug on different platforms
  • Commit c4aa8e0db: Rename variable to avoid shadowing of a previously declared one
  • Commit e55182ac0: Get rid of initialization logic for blueNorm by making the computed constants static const or constexpr. Move macro definition EIGEN_CONSTEXPR to Core and make all methods in NumTraits constexpr when EIGEN_HAS_CONSTEXPR is 1.
  • Commit 14022f5eb: Fix more mildly embarrassing typos in ARM intrinsics in PacketMath.h. 'vmvnq_u64' does not exist for some reason.
  • Commit a5b226920: Fix typo in PacketMath.h
  • Commit 3af744b02: Add missing packet op pcmp_lt_or_nan for Packet2d on ARM.
  • Commit 31a6b88ff: Disable double version of compute_inverse_size4 on Inverse_NEON.h if Packet2d is not supported.
  • Commit 880fa43b2: Add support for CastXML on ARM aarch64
  • Commit 6f0f6f792: Fix compiler error due to c++20 operator== generation rules
  • Commit cc0c38ace: Remove old Clang compiler bug work-arounds. The two LLVM bugs referenced in the comments here have long been fixed. The workarounds were now detrimental because (1) they prevented using fused mul-add on Clang/ARM32 and (2) the unnecessary 'volatile' in 'asm volatile' prevented legitimate reordering by the compiler.
  • Commit bb56a6258: Make bfloat16(float(-nan)) produce -nan, not nan.
  • Commit 3012e755e: Add plog ops support packet2d for NEON
  • Commit e4fb0ddf7: Add EIGEN_UNUSED_VARIABLE to unused variable in Memory.h
  • Commit 65e400896: Fix bfloat16 round on gcc 4.8
  • Commit 5636f80d1: Fix issue #1968. Don't discard return value from "new" in C++17.
  • Commit 7c5d48f31: Unified sse pldexp_double api
  • Commit 71e08c702: Make blueNorm threadsafe if C++11 atomics are available.
  • Commit adc861cab: New CI infrastructure, including AArch64 runners
  • Commit 5328c9be4: Fix half_impl::float_to_half_rtne(float) warning: '<<' causes overflow
  • Commit 35d149e34: Add missing functions for Packet8bf in Altivec architecture. Including new tests for bfloat16 Packets. Fix prsqrt on GenericPacketMath.
  • Commit 85428a344: Add Neon psqrt<Packet2d> and pexp<Packet2d>
  • Commit 527210682: remove semi triggering -Wextra-semi-stmt
  • Commit 5f25bcf7d: Add Inverse_NEON.h
  • Commit 6fe88a3c9: MatrixProduct enhancements:
  • Commit 656885627: Changing u/int8_t to un/signed char because clang does not understand it.
  • Commit 27e664807: fix #1901: warning in Mode==(Upper|Lower)
  • Commit 5b9bfc892: BUG: cmake_minimum_required must be the first command
  • Commit e5886457c: Change Packet8s and Packet8us to use vector commands on Power for pmadd, pmul and psub.
  • Commit 25424d91f: Fix #1974: assertion when reserving an empty sparse matrix
  • Commit 8bb0febaf: add psqrt ops support packet2f/packet4f for NEON
  • Commit 1b1082334: adding attributes to constructors to support hip-clang on ROCm 3.5
  • Commit 603e213d1: Fixing a CUDA / P100 regression introduced by PR 181
  • Commit c060114a2: Fix nightly CI configuration
  • Commit fe8c3ef3c: Add possibility to split test suite build targets and improved CI configuration
  • Commit d10b27fe3: Add missing inline keyword in Quaternion.h.
  • Commit d4a727d09: Disable min/max NaN propagation in test cxx11_tensor_expr
  • Commit d2bb6cf39: Fix compilation error in blasutil test
  • Commit c6820a631: Replace the call to int64_t in the blasutil test by explicit types
  • Commit 8ba1b0f41: bfloat16 packetmath for Arm Neon backend
  • Commit 704798d1d: Add support for Bfloat16 to use vector instructions on Altivec architecture
  • Commit 46f8a1856: Adding an explicit launch_bounds(1024) attribute for GPU kernels.
  • Commit 21122498e: Temporarily turn off the NEON implementation of pfloor as it does not work for large values.
  • Commit 23b7f0572: Disable CI buildstage again
  • Commit d0f5d4bc5: add a banner to advertise the survey
  • Commit 5e484fa11: Fix StlDeque for GCC 10
  • Commit 3ec4f0b64: Fix undefined BF16 union behavior in AVX512.
  • Commit b92206676: Inherit alignment trait from argument in TensorBroadcasting to avoid segfault when the argument is unaligned.
  • Commit 99da2e1a8: Fix clang-tidy warnings in generic bfloat16 implementation
  • Commit 649fd1c2a: Fix CMake install command
  • Commit e48d8e472: Don't allow failure for CI build stage anymore
  • Commit b8ca93842: Improve CI configuration
  • Commit fb0c6868a: Add missing footer declaration
  • Commit c1ffe452f: Fix bfloat16 casts
  • Commit 2ce2f5198: remove piwik tracker
  • Commit 1b84f21e3: Revert change that made conversion from bfloat16 to {float, double} implicit. Add roundtrip tests for casting between bfloat16 and complex types.
  • Commit 38b91f256: Fix cast of bfloat16 to std::complex<T>
  • Commit bed7fbe85: Make sure we take the little-endian path if __BYTE_ORDER__ is not defined.
  • Commit 0e1a33a46: Faster conversion from integer types to bfloat16
  • Commit acab22c20: Avoid division by zero in nonZerosEstimate() for empty blocks.
  • Commit ac2eca6b1: Update tensor reduction test to avoid undefined division of bfloat16 by int.
  • Commit 0aeaf5f45: Make numext::as_uint a device function.
  • Commit 60faa9f89: user-defined copy operations removed in favor of compiler-generated ones
  • Commit b11f817bc: Avoid undefined behavior by union type punning in float_to_bfloat16_rtne
  • Commit 56b3e3f3f: AVX path for BF16
  • Commit 4ab32e2de: Allow implicit conversion from bfloat16 to float and double
  • Commit dcf7655b3: Guard operator<< test by EIGEN_NO_IO.
  • Commit ed00df445: Guard operator<< by EIGEN_NO_IO.
  • Commit fb77b7288: Add operator<< to print a quaternion.
  • Commit ee4715ff4: Fix test basic stuff
  • Commit 8889a2c1c: Add operator==/operator!= to Quaternion. Fixes #1876.
  • Commit 6964ae8d5: Change the sign operator in Eigen to return NaN for NaN arguments, not zero.
  • Commit cb6315318: Make test packetmath C++98 compliant
  • Commit 116c5235a: BF16 for scalar_cmp_with_cast_op
  • Commit 8731452b9: Delete duplicate test cases in vectorization_logic.cpp
  • Commit 9cb8771e9: Fix tensor casts for large packets and casts to/from std::complex
  • Commit 145e51516: Fix denormal check pre c++11.
  • Commit 689b57070: Report custom C++ flags in CMake testing summary
  • Commit f3b8d441f: Remove CI tags to enable shared runners
  • Commit dc0b81fb1: Pass CMAKE_MAKE_PROGRAM to Fortran language support test
  • Commit 13d25f5ed: Add initial CI configuration file.
  • Commit 7222f0b6b: Fix packetmath_1 float tests for arm/aarch64.
  • Commit 14f84978e: Replaced call to deprecated 'load' function with appropriate call to 'on'.
  • Commit ff4e7a082: Add missing Packet2l/Packet2ul ops for NEON.
  • Commit 03ebdf6ac: Added missing NEON pcasts, update packetmath tests.
  • Commit 386d809bd: Support BFloat16 in Eigen
  • Commit 6b9c92fe7: Add Apache 2.0 license text in COPYING.APACHE.
  • Commit cf7adf3a5: Update `things you can do` message using cmake commands
  • Commit 231ce2153: Run two independent chains, when reducing tensors.
  • Commit a475bf14d: Fix pscatter and pgather for Altivec Complex double
  • Commit c6c84ed96: Fix unused variable warning on Arm
  • Commit 6228f2723: Fix #1818: SparseLU: add methods nnzL() and nnzU()
  • Commit 39cbd6578: Fix #1911: add benchmark for move semantics with fixed-size matrix
  • Commit a7d2552af: Remove HasCast and fix packetmath cast tests.
  • Commit 463ec8664: Fix #1757: remove the word 'suicide'
  • Commit b5d66b5e7: Implement scalar_cmp_with_cast_op
  • Commit c4059ffcb: Fix static analyzer warning in SelfadjointProduct.h. Fix compiler warnings in GeneralBlockPanelKernel.h.
  • Commit 1fcaaf460: Update FindComputeCpp.cmake to fix build problems on Windows
  • Commit 3ce18d3c8: Revert ".gitlab-ci.yml: initial commit"
  • Commit c2ab36f47: Fix broken packetmath test for logistic on Arm.
  • Commit 537e2b322: Fix typo in previous update to generic predux_any.
  • Commit fdc1cbdce: Avoid implicit float equality comparison in generic predux_any, but use numext::not_equal_strict to avoid breaking builds that compile with -Werror=float-equal.
  • Commit daf9bbeca: Fix compilation error in logistic packet op.
  • Commit 6d2a9a524: Update run instructions for benchCholesky
  • Commit 029a76e11: Bug #1777: make the scalar and packet path consistent for the logistic function + respective unit test
  • Commit 99b7f7cb9: Fix #556: warnings with mingw
  • Commit 72782d13e: Bug #1767: increase required cmake version to 3.5.0
  • Commit 867a75650: Fix #1833: compilation issue of "array!=scalar" with c++20
  • Commit ab615e411: Save one extra temporary when assigning a sparse product to a row-major sparse matrix
  • Commit 95177362e: .gitlab-ci.yml: initial commit
  • Commit 8d1302f56: Add support for PacketBlock<Packet8s,4> and PacketBlock<Packet16uc,4> ptranspose on NEON
  • Commit 8719b9c5b: Disable test for 32-bit systems (e.g. ARM, i386)
  • Commit 8e1df5b08: Fix incorrect usage of `if defined(EIGEN_ARCH_PPC)` => `if EIGEN_ARCH_PPC`
  • Commit 4e7046063: Fix #1874: it works on both MSVC 2017 and other platforms.
  • Commit 2d67af2d2: Add pscatter for Packet16{u}c (int8)
  • Commit 5328cd62b: Guard usage of decltype since it's a C++11 feature
  • Commit cc86a31e2: Add guard around specialization for bool, which is only currently implemented for SSE.
  • Commit 8a7f360ec: - Vectorizing MMA packing. - Optimizing MMA kernel. - Adding PacketBlock store to blas_data_mapper.
  • Commit a145e4adf: Add newline at the end of StlIterators.h.
  • Commit 8ce9630dd: Fix #1874: workaround MSVC 2017 compilation issue.
  • Commit 9b411757a: Add missing packet ops for bool, and make it pass the same packet op unit tests as other arithmetic types.
  • Commit d640276d3: Added support for reverse iterators for Vectorwise operations.
  • Commit fa8fd4b4d: Indexed view should have RowMajorBit when there is statically a single row
  • Commit a187ffea2: Resolve "IndexedView of a vector should allow linear access"
  • Commit ba9d18b93: Add KLU support to spbenchsolver
  • Commit 5fdc17924: Altivec template functions to better code reusability
  • Commit d3e81db6c: Eigen moved the `scanLauncher` function inside the internal namespace. This commit applies the following changes: - Moving the `scanLauncher` specialization inside internal namespace to fix compiler crash on TensorScan for SYCL backend. - Replacing `SYCL/sycl.hpp` with `CL/sycl.hpp` in order to follow SYCL 1.2.1 standard. - minor fixes: commenting out an unused variable to avoid compiler warnings.
  • Commit c1d944dd9: Remove packet ops pinsertfirst and pinsertlast that are only used in a single place, and can be replaced by other ops when constructing the first/final packet in linspaced_op_impl::packetOp.
  • Commit 5c4e19fbe: Possibility to specify user-defined default cache sizes for GEBP kernel
  • Commit 225ab040e: Remove unused packet op "palign". Clean up a compiler warning in c++03 mode in AVX512/Complex.h.
  • Commit 74ec8e661: Make size odd for transposeInPlace test to make sure we hit the scalar path.
  • Commit 49f1aeb60: Remove traits declaring NEON vectorized casts that do not actually have packet op implementations.
  • Commit 2fd8a5a08: Add parallelization of TensorScanOp for types without packet ops.
  • Commit 0e59f786e: Fix accidental copy of loop variable.
  • Commit 7b76c85da: Vectorize and parallelize TensorScanOp.
  • Commit a74a278ab: Fix confusing template param name for Stride fwd decl.
  • Commit 923ee9aba: Fix the embarrassingly incomplete fix to the embarrassing bug in blocked transpose.
  • Commit a32923a43: Fix (embarrassing) bug in blocked transpose.
  • Commit 1e41406c3: Add missing transpose in cleanup loop. Without it, we trip an assertion in debug mode.
  • Commit fbe7916c5: Fix compilation error with Clang on Android: _mm_extract_epi64 fails to compile.
  • Commit 82f54ad14: Fix perf monitoring merge function
  • Commit ab773c7e9: Extend support for Packet16b:
  • Commit b47c77799: Block transposeInPlace() when the matrix is real and square. This yields a large speedup because we transpose in registers (or L1 if we spill), instead of one packet at a time, which in the worst case makes the code write to the same cache line PacketSize times instead of once.
  • Commit 29f0917a4: Add support to vector instructions to Packet16uc and Packet16c
  • Commit e80ec2435: Remove unused packet op "preduxp".
  • Commit 0aebe19ac: BooleanRedux.h: Add more EIGEN_DEVICE_FUNC qualifiers.
  • Commit 3c02fefec: Add async evaluation support to TensorSlicingOp.
  • Commit 0c67b855d: Add Packet8s and Packet8us to support signed/unsigned int16/short Altivec vector operations
  • Commit e8f40e467: Fix bug in ptrue for Packet16b.
  • Commit 2f6ddaa25: Add partial vectorization for matrices and tensors of bool. This speeds up boolean operations on Tensors by up to 25x.
  • Commit 00f634015: Update PreprocessorDirectives.dox - Added line for the new VectorwiseOp plugin directive (and re-alphabetized the plugin section)
  • Commit 5ab87d8ab: Move eigen_packet_wrapper to GenericPacketMath.h and use it for SSE/AVX/AVX512 as it is already used for NEON. This will allow us to define multiple packet types backed by the same vector type, e.g., __m128i. Use this mechanism to define packets for half and clean up the packet op implementations.
  • Commit 4aae8ac69: Fix typo in TypeCasting.h
  • Commit 1d674003b: Fix bug in vectorized casting of
  • Commit b1aa07a8d: Fix a bug in TensorIndexList.h
  • Commit d46d726e9: CommaInitializer wrongfully asserted for 0-sized blocks. The commainitializer unit-test never actually called `test_block_recursion`, which also was not correctly implemented and would have caused too deep template recursion.
  • Commit c854e189e: Fixed commainitializer test.
  • Commit 39142904c: Resolve C4346 when building eigen on windows
  • Commit f0577a2bf: Speed up matrix multiplication for small to medium size matrices by using half- or quarter-packet vectorized loads in gemm_pack_rhs if they have size 4, instead of dropping down to the scalar path.
  • Commit 8e875719b: Replace norm() with squaredNorm() to address integer overflows
  • Commit 9dda5eb7d: Missing struct definition in NumTraits
  • Commit bcc0e9e15: Add numeric_limits min and max for bool
  • Commit 54a0a9c9d: Bugfix: conjugate_gradient did not compile with lazy-evaluated RealScalar
  • Commit 4fd5d1477: Fix packetmath test build for AVX.
  • Commit 393dbd8ee: Fix bug in https://gitlab.com/libeigen/eigen/-/commit/52d54278beefee8b2f19dcca4fd900916154e174
  • Commit 55c8fe8d0: Fix bug in https://gitlab.com/libeigen/eigen/-/commit/52d54278beefee8b2f19dcca4fd900916154e174
  • Commit 6d2dbfc45: NEON: Fixed MSVC types definitions
  • Commit 52d54278b: Additional NEON packet-math operations
  • Commit deb93ed1b: Adhere to recommended load/store intrinsics for ppc64le
  • Commit 5c22c7a7d: Make file formatting comply with POSIX and Unix standards
  • Commit 5afdaa473: Fixing float32's pround halfway criteria to match STL's criteria.
  • Commit 96cd1ff71: Fixed: - access violation when initializing 0x0 matrices - exception can be thrown during stack unwind while comma-initializing a matrix if eigen_assert is configured to throw
  • Commit cc954777f: Update VectorwiseOp.h to allow Plugins similar to MatrixBase.h or ArrayBase.h
  • Commit 55ecd58a3: Bug https://gitlab.com/libeigen/eigen/-/issues/1415: add missing EIGEN_DEVICE_FUNC to diagonal_product_evaluator_base.
  • Commit 4da2c6b19: Remove reference to non-existent unary_op_base class.
  • Commit eda90baf3: Add missing arguments to numext::absdiff().
  • Commit d5c665742: Add absolute_difference coefficient-wise binary Array function
  • Commit 6ff5a1409: Reenabling packetmath unsigned tests, adding dummy pabs for relevant unsigned types.
  • Commit 232f90408: Add shift_left<N> and shift_right<N> coefficient-wise unary Array functions
  • Commit 54aa8fa18: Implement integer square-root for NEON
  • Commit 37ccb8691: Update NullaryFunctors.h
  • Commit 7158ed4e0: Fixing HIP breakage caused by the recent commit that introduces Packet4h2 as the Eigen::Half packet type
  • Commit d53ae40f7: NEON: Added int64_t and uint64_t packet math
  • Commit 4b9ecf292: NEON: Added int8_t and uint8_t packet math
  • Commit ceaabd4e1: NEON: Added int16_t and uint16_t packet math
  • Commit d5d3cf933: NEON: Added uint32_t packet math
  • Commit eacf97f72: NEON: Implemented half-size vectors
  • Commit 5f411b729: NEON: Set packet_traits<double> flags
  • Commit 88337acae: test/packetmath: Add tests for all integer types
  • Commit 9e6897757: test/packetmath: Made negate non-mandatory
  • Commit b733b8b68: remove duplicate pset1 for half and add some comments about why we need to expose pmul/add/div/min/max on host
  • Commit a45d28256: Don't restrict CMAKE_BUILD_TYPE
  • Commit 98bfc5aaa: Update MarketIO.h
  • Commit 52a2fbbb0: Revert "avoid selecting half-packets when unnecessary"
  • Commit 235bcfe08: Revert "Pick full packet unconditionally when EIGEN_UNALIGNED_VECTORIZE"
  • Commit d7a42eade: Revert "do not pick full-packet if it'd result in more operations"
  • Commit 6ac37768a: Revert "add some static checks for packet-picking logic"
  • Commit 87cfa4862: Revert "Disable test in test/vectorization_logic.cpp, which is currently failing with AVX."
  • Commit b625adffd: Disable test in test/vectorization_logic.cpp, which is currently failing with AVX.
  • Commit f0ce88cff: Include <sstream> explicitly, and don't rely on the implicit include via <complex>.
  • Commit eb6cc2958: Avoid a division in NonBlockingThreadPool::Steal.
  • Commit 776960024: add some static checks for packet-picking logic
  • Commit e9cc0cd35: do not pick full-packet if it'd result in more operations
  • Commit 44df2109c: Pick full packet unconditionally when EIGEN_UNALIGNED_VECTORIZE
  • Commit 5ca10480b: avoid selecting half-packets when unnecessary
  • Commit f584bd9b3: Fail at compile time if default executor tries to use non-default device
  • Commit 3fda850c4: Remove dead code from TensorReduction.h
  • Commit b5df8cabd: fix hip-clang compilation due to new HIP scalar accessor
  • Commit 6d284bb1b: Fix for HIP breakage - 200115. Adding a missing EIGEN_DEVICE_FUNC attr
  • Commit f6c6de5d6: Ensure Igamma does not NaN or Inf for large values.
  • Commit 6601abce8: Remove rogue include in TypeCasting.h. Meta.h is already included by the top-level header in Eigen/Core.
  • Commit b9362fb8f: Convert StridedLinearBufferCopy::Kind to enum class
  • Commit 5a8b97b40: Switching unpacket_traits<Packet4i> to vectorizable=true.
  • Commit 42838c28b: Adding correct cache sizes for PPC architecture.
  • Commit 1d0c45122: Removing executable bit from file mode
  • Commit 35219cea6: Bug #1790: Make `areApprox` check `numext::isnan` instead of bitwise equality (NaNs don't have to be bitwise equal).
  • Commit 2e099e8d8: Added special_packetmath test and tweaked bounds on tests. Refactor shared packetmath code to header file. (Squashed from PR !38)
  • Commit e1ecfc162: Explicitly call ::rint and ::rintf for targets without c++11. Without this, the Windows build breaks when trying to compile numext::rint<double>.
  • Commit da5a7afed: Improvements to the tidiness and completeness of the NEON implementation
  • Commit 452371cea: Fix for gcc build error when using Eigen headers with AVX512
  • Commit 601f89dfd: Adding RInt vector support for SYCL.
  • Commit 2ea5a715c: Properly initialize b vector in SplineFitting
  • Commit 925497411: Don't add EIGEN_DEVICE_FUNC to random() since ::rand is not available in Cuda.
  • Commit a3ec89b5b: Add missing EIGEN_DEVICE_FUNC annotations in MathFunctions.h.
  • Commit 8333e0359: Use data.data() instead of &data (since it is not obvious that Array is trivially copyable)
  • Commit e6fcee995: Don't use the rational approximation to the logistic function on GPUs as it appears to be slightly slower.
  • Commit 4217a9f09: The upper limits for where to use the rational approximation to the logistic function were not set carefully enough in the original commit, and some arguments would cause the function to return values greater than 1. This change sets the versions found by scanning all floating point numbers (using std::nextafterf()).
  • Commit 9623c0c4b: Fix formatting
  • Commit 19876ced7: Bug #1785: Introduce numext::rint.
  • Commit d0ae052da: [SYCL Backend] * Adding Missing operations for vector comparison in SYCL. This caused compiler error for vector comparison when compiling SYCL * Fixing the compiler error for placement new in TensorForcedEval.h This caused compiler error when compiling SYCL backend * Reducing the SYCL warning by removing the abort function inside the kernel * Adding Strong inline to functions inside SYCL interop.
  • Commit eedb7eeac: Protecting integer_types's long long test with a check to see if we have CXX11 support.
  • Commit bcbaad6d8: Bug #1800: Guard against misleading indentation
  • Commit 00de57079: Fix -Werror -Wfloat-conversion warning.
  • Commit 636e2bb3f: Fix for HIP breakage - 191220
  • Commit 1e9664b14: Bug #1796: Make matrix squareroot usable for Map and Ref types
  • Commit d86544d65: Reduce code duplication and avoid confusing Doxygen
  • Commit dde279f57: Hide recursive meta templates from Doxygen
  • Commit c21771ac0: Use double-braces initialization (as everywhere else in the test-suite).
  • Commit a3273aeff: Fix trivial shadow warning
  • Commit 870e53c0f: Bug #1788: Fix rule-of-three violations inside the stable modules. This fixes deprecated-copy warnings when compiling with GCC>=9. Also protect some additional Base-constructors from getting called by user code (#1587)
  • Commit 6965f6de7: Fix unit-test which I broke in previous fix
  • Commit 7a65219a2: Fix TensorPadding bug in squeezed reads from inner dimension
  • Commit 73e55525e: Return const data pointer from TensorRef evaluator.data()
  • Commit ae07801dd: Tensor block evaluation cost model
  • Commit 72166d0e6: Fix some maybe-uninitialized warnings
  • Commit 5a3eaf88a: Workaround class-memaccess warnings on newer GCC versions
  • Commit de07c4d1c: fix compilation due to new HIP scalar accessor
  • Commit 788bef6ab: Reduce block evaluation overhead for small tensor expressions
  • Commit 725216333: Add default definition for EIGEN_PREDICT_*
  • Commit a56607448: Improve accuracy of fast approximate tanh and the logistic functions in Eigen, such that they preserve relative accuracy to within a few ULPs where their function values tend to zero (around x=0 for tanh, and for large negative x for the logistic function).
  • Commit 8e5da7146: Resolve double-promotion warnings when compiling with clang. `sin` was calling `sin(double)` instead of `std::sin(float)`
  • Commit 9b7a2b43c: Renamed .hgignore to .gitignore (removing hg-specific "syntax" line)
  • Commit 06e99aaf4: Bug 1785: fix pround on x86 to use the same rounding mode as std::round.
  • Commit 73a8d572f: Clamp tanh approximation outside [-c, c] where c is the smallest value where the approximation is exactly +/-1. Without FMA, c = 7.90531110763549805, with FMA c = 7.99881172180175781.
  • Commit 88062b7fe: Fix implementation of complex expm1. Add tests that fail with previous implementation, but pass with the current one.
  • Commit 381f8f313: Initialize non-trivially constructible types when allocating a temp buffer.
  • Commit 64272c7f4: Squeeze reads from two inner dimensions in TensorPadding
  • Commit 963ba1015: Add back accidentally deleted default constructor to TensorExecutorTilingContext.
  • Commit 1b6e0395e: Added io test
  • Commit 3c0ef9f39: IO: Fixed printing of char and unsigned char matrices
  • Commit e87af0ed3: Added Eigen::numext typedefs for uint8_t, int8_t, uint16_t and int16_t
  • Commit 15b3bcfca: Bug 1786: fix compilation with MSVC
  • Commit c9220c035: Remove block memory allocation required by removed block evaluation API
  • Commit 1c879eb01: Remove V2 suffix from TensorBlock
  • Commit dbca11e88: Remove TensorBlock.h and old TensorBlock/BlockMapper
  • Commit c49f0d851: Fix for HIP breakage detected on 191210
  • Commit 2918f85ba: Do not use std::vector in getResourceRequirements
  • Commit 8056a05b5: Undo the block size change.
  • Commit dbb703d44: Add async evaluation support to TensorSelectOp
  • Commit 11d646532: Fix AlignedVector3 interface inconsistent with other Vector classes: default constructor and operator- were missing.
  • Commit bb7ccac3a: Add recursive work splitting to EvalShardedByInnerDimContext
  • Commit 25230d186: Improve performance of contraction kernels
  • Commit 08eeb648e: update hg to git hashes
  • Commit 366cf005b: Add missing initialization in cxx11_tensor_trace.cpp.
  • Commit c488b8b32: Replace calls to "hg" by calls to "git"
  • Commit 8fbe0e469: Update old links to bitbucket to point to gitlab.com
  • Commit 114a15c66: Added tag before-git-migration for changeset a7c7d329d89e8484be58df6078a586c44523db37
  • Commit a7c7d329d: Merged in ezhulenev/eigen-01 (pull request PR-769)
  • Commit cacf43397: Merged in anshuljl/eigen-2/Anshul-Jaiswal/update-configurevectorizationh-to-not-op-1573079916090 (pull request PR-754)
  • Commit 8f4536e85: Capture TensorMap by value inside tensor expression AST
  • Commit 4e696901f: Remove __host__ annotation for device-only function.
  • Commit ead81559c: Use EIGEN_DEVICE_FUNC macro instead of __device__.
  • Commit 6358599ec: Fix QuaternionBase::cast for quaternion map and wrapper.
  • Commit 7745f6901: bug #1776: fix vector-wise STL iterator's operator-> using a proxy as pointer type. This changeset fixes also the value_type definition.
  • Commit 66f07efea: Revert the specialization for scalar_logistic_op<float> introduced in:
  • Commit 3b15373bb: Merged in ezhulenev/eigen-02 (pull request PR-767)
  • Commit 312c8e77f: Fix for the HIP build+test errors.
  • Commit 956131d0e: Merged in codeplaysoftware/eigen/SYCL-Backend (pull request PR-691)
  • Commit 00f32752f: [SYCL] Rebasing the SYCL support branch on top of the Eigen upstream master branch. * Unifying all loadLocalTile from lhs and rhs to an extract_block function. * Adding get_tensor operation which was missing in TensorContractionMapper. * Adding the -D method missing from cmake for Disable_Skinny Contraction operation. * Wrapping all the indices in TensorScanSycl into Scan parameter struct. * Fixing typo in Device SYCL * Unifying load to private register for tall/skinny no shared * Unifying load to vector tile for tensor-vector/vector-tensor operation * Removing all the LHS/RHS class for extracting data from global * Removing Outputfunction from TensorContractionSkinnyNoshared. * Combining the local memory version of tall/skinny and normal tensor contraction into one kernel. * Combining the no-local memory version of tall/skinny and normal tensor contraction into one kernel. * Combining General Tensor-Vector and VectorTensor contraction into one kernel. * Making double buffering optional for Tensor contraction when the local memory version is used. * Modifying benchmark to accept custom Reduction Sizes * Disabling AVX optimization for SYCL backend on the host to allow SSE optimization to the host * Adding Test for SYCL * Modifying SYCL CMake
  • Commit 82a47338d: Fix shadow warnings in AlignedBox and SparseBlock
  • Commit ea51a9eac: Add missing EIGEN_DEVICE_FUNC attribute to template specializations for pexp to fix GPU build.
  • Commit 5a3ebda36: Fix warning due to missing cast for exponent arguments for std::frexp and std::ldexp.
  • Commit 2df57be85: Merged in realjhol/eigen/fix-warnings (pull request PR-760)
  • Commit 5496d0da0: Add async evaluation support to TensorReverse
  • Commit bc66c8825: Add async evaluation support to TensorPadding/TensorImagePatch/TensorShuffling
  • Commit c79b6ffe1: Add an explicit example for auto and re-evaluation
  • Commit e78ed6e7f: COMP: Simplify install commands for Eigen
  • Commit 9d5cdc98c: COMP: target_compile_definitions requires cmake 2.8.11
  • Commit e5778b87b: Fix duplicate symbol linking error.
  • Commit 86eb41f1c: SparseRef: Fixed alignment warning on ARM GCC
  • Commit c1a67cb5a: Update ConfigureVectorization.h to not optimize fp16 routines when compiling with cuda.
  • Commit cc3d0e6a4: Add EIGEN_HAS_INTRINSIC_INT128 macro
  • Commit ee404667e: Rollback of PR-746 and partial rollback of https://bitbucket.org/eigen/eigen/commits/668ab3fc474e54c7919eda4fbaf11f3a99246494 .
  • Commit 743c92528: test/packetmath: Silence alignment warnings
  • Commit 0c9745903: Merged in ezhulenev/eigen-01 (pull request PR-746)
  • Commit 8c8cab1af: STYLE: Convert CMake-language commands to lower case
  • Commit 6fb3e5f17: STYLE: Remove CMake-language block-end command arguments
  • Commit f1e830730: 1. Fix a bug in psqrt and make it return 0 for +inf arguments. 2. Simplify handling of special cases by taking advantage of the fact that the builtin vrsqrt approximation handles negative, zero and +inf arguments correctly. This speeds up the SSE and AVX implementations by ~20%. 3. Make the Newton-Raphson formula used for rsqrt more numerically robust:
  • Commit 2cb2915f9: bug #1744: fix compilation with MSVC 2017 and AVX512, plog1p/pexpm1 require plog/pexp, but the latter was disabled on some compilers
  • Commit c3f6fcf2c: bug #1747: one more fix for MSVC regarding the Bessel implementation.
  • Commit b9837ca9a: bug #1281: fix AutoDiffScalar's make_coherent for nested expression of constant ADs.
  • Commit 0fb6e2440: Fix case issue with Lapack unit tests
  • Commit 8af045a28: bug #1774: fix VectorwiseOp::begin()/end() return types regarding constness.
  • Commit 75b4c0a3e: PR 751: Fixed compilation issue when compiling using MSVC with /arch:AVX512 flag
  • Commit 8496f86f8: Enable CompleteOrthogonalDecomposition::pseudoInverse with non-square fixed-size matrices.
  • Commit 002e5b6db: Move to my.cdash.org
  • Commit 13c3327f5: Remove legacy block evaluation support
  • Commit 71aa53dd6: Disable AVX on broken xcode versions. See PR 748. Patch adapted from Hans Johnson's PR 748.
  • Commit 0ed033859: Fix a race in async tensor evaluation: Don't run on_done() until after device.deallocate() / evaluator.cleanup() complete, since the device might be destroyed after on_done() runs.
  • Commit c952b8dfd: Break loop dependence in TensorGenerator block access
  • Commit ebf04fb3e: Fix data race in cxx11_tensor_notification test.
  • Commit 73ecb2c57: Cleanup includes in Tensor module after switch to C++11 and above
  • Commit e7ed4bd38: Remove internal::smart_copy and replace with std::copy
  • Commit fbc0a9a3e: Fix CXX11Meta compilation with MSVC
  • Commit bd864ab42: Prevent potential ODR in TensorExecutor
  • Commit 6332aff0b: This PR fixes: * The specialization of array class in the different namespace for GCC<=6.4 * The implicit call to `std::array` constructor using the initializer list for GCC <=6.1
  • Commit 8e4e29ae9: Merged in deven-amd/eigen-hip-fix-191018 (pull request PR-738)
  • Commit 97c0c5d48: Add block evaluation V2 to TensorAsyncExecutor. Add async evaluation to a number of ops.
  • Commit 102cf2a72: Fix for the HIP build+test errors.
  • Commit 668ab3fc4: Drop support for c++03 in Eigen tensor. Get rid of some code used to emulate c++11 functionality with older compilers.
  • Commit df0e8b813: Propagate block evaluation preference through rvalue tensor expressions
  • Commit 0d2a14ce1: Cleanup Tensor block destination and materialized block storage allocation
  • Commit 02431cbe7: TensorBroadcasting support for random/uniform blocks
  • Commit d380c23b2: Block evaluation for TensorGenerator/TensorReverse/TensorShuffling
  • Commit 39fb9eecc: bug #1747: fix compilation with MSVC
  • Commit a411e9f34: Block evaluation for TensorGenerator + TensorReverse + fixed bug in tensor reverse op
  • Commit b03eb63d7: Merged in ezhulenev/eigen-01 (pull request PR-726)
  • Commit e7d8ba747: bug #1752: make is_convertible equivalent to the std c++11 equivalent and fallback to std::is_convertible when c++11 is enabled.
  • Commit fb557aec5: bug #1752: disable some is_convertible tests for recent compilers.
  • Commit 33e174613: Block evaluation for TensorChipping + fixed bugs in TensorPadding and TensorSlicing
  • Commit f0a4642ba: Implement c++03 compatible fix for changeset 7a43af1a335da2c0489b4119a33ee1cbff0c15d6
  • Commit 196de2efe: Explicitly bypass resize and memmoves when there is already the exact right number of elements available.
  • Commit 36da231a4: Disable an expected warning in unit test
  • Commit d1def335d: fix one more possible conflicts with real/imag
  • Commit 87427d2ea: PR 719: fix real/imag namespace conflict
  • Commit 7a43af1a3: Fix compilation of FFTW unit test
  • Commit f74ab8cb8: Add block evaluation to TensorEvalTo and fix few small bugs
  • Commit 3afb640b5: Fixing incorrect size in Tensor documentation.
  • Commit 20c4a9118: Use "pdiv" rather than operator/ to support packet types.
  • Commit d1dd51cb5: Merged in ezhulenev/eigen-01 (pull request PR-723)
  • Commit 98bdd7252: Fix compilation warnings and errors with clang in TensorBlockV2 code and tests
  • Commit fab4e3a75: Address comments on Chebyshev evaluation code:
  • Commit 60ae24ee1: Add block evaluation to TensorReshaping/TensorCasting/TensorPadding/TensorSelect
  • Commit 6e40454a6: Add beta to TensorContractionKernel and make memset optional
  • Commit bd0fac456: Prevent infinite loop in the nvcc compiler while unrolling the recurrent templates for Chebyshev polynomial evaluation.
  • Commit 9549ba831: Fix perf issue in SimplicialLDLT::solve for complexes (again, m_diag is real)
  • Commit c8b2c603b: Fix speed issue with SimplicialLDLT for complexes: the diagonal is real!
  • Commit 13ef08e5a: Move implementation of vectorized error function erf() to SpecialFunctionsImpl.h.
  • Commit 7c8bc0d92: Fix cxx11_tensor_block_io test
  • Commit 0c845e28c: Fix erf in c++03
  • Commit 71d5bedf7: Fix compilation warnings and errors with clang in TensorBlockV2
  • Commit 5e186b198: Fix for the HIP build+test errors.
  • Commit f35b9ab51: Fix a bug in a packed block type in TensorContractionThreadPool
  • Commit d38e6fbc2: Merged in rmlarsen/eigen (pull request PR-704)
  • Commit 591a554c6: Add TODO to cleanup FMA cost modelling.
  • Commit c64396b4c: Choose TensorBlock StridedLinearCopy type statically
  • Commit c97b20846: Add new TensorBlock api implementation + tests
  • Commit ef9dfee7b: Tensor block evaluation V2 support for unary/binary/broadcasting
  • Commit efd9867ff: bug #1746: Removed implementation of standard copy-constructor and standard copy-assign-operator from PermutationMatrix and Transpositions to allow malloc-less std::move. Added unit-test to rvalue_types
  • Commit e4c1b3c1d: Fix implicit conversion warnings and use pnegate to negate packets
  • Commit ba0736fa8: Fix (or mask away) conversion warnings introduced in 553caeb6a3bb545aef895f8fc9f219be44679017 .
  • Commit 1d5af0693: Add support for asynchronous evaluation of tensor casting expressions.
  • Commit 6de5ed08d: Add generic PacketMath implementation of the Error Function (erf).
  • Commit 28b678649: Fix build on setups without AVX512DQ.
  • Commit e02d42963: Fix for the HIP build+test errors.
  • Commit df0816b71: Merging eigen/eigen.
  • Commit 6e215cf10: Add Bessel functions to SpecialFunctions.
  • Commit 7c7329684: Revert accidental change to GCC diagnostics
  • Commit bf8866b46: Fix maybe-uninitialized warnings in TensorContractionThreadPool
  • Commit 553caeb6a: Use ThreadLocal container in TensorContractionThreadPool
  • Commit facdec5aa: Add packetized versions of i0e and i1e special functions. - In particular refactor the i0e and i1e code so scalar and vectorized path share code. - Move chebevl to GenericPacketMathFunctions.
  • Commit b052ec699: Merged eigen/eigen into default
  • Commit cdb377d0c: Fix for the HIP build+test errors introduced by the ndtri support.
  • Commit 747c6a51c: bug #1736: fix compilation issue with A(all,{1,2}).col(j) by implementing true compile-time "if" for block_evaluator<>::coeff(i)/coeffRef(i)
  • Commit 031f17117: bug #1741: fix self-adjoint*matrix, triangular*matrix, and triangular^-1*matrix with a destination having a non-trivial inner-stride
  • Commit 459b2bcc0: Fix compilation of BLAS backend and frontend
  • Commit 97f1e1d89: Merged in ezhulenev/eigen-01 (pull request PR-698)
  • Commit d918bd9a8: Update ThreadLocal to use separate Initialize/Release callables
  • Commit afa8d1353: Fix some implicit literal to Scalar conversions in SparseCore
  • Commit c06e6fd11: bug #1741: fix SelfAdjointView::rankUpdate and product to triangular part for destination with non-trivial inner stride
  • Commit ea0d5dc95: bug #1741: fix C.noalias() = A*C; with C.innerStride()!=1
  • Commit e3dec4dcc: ThreadLocal container that does not rely on thread local storage
  • Commit 17226100c: Fix a circular dependency regarding pshift* functions and GenericPacketMathFunctions. Another solution would have been to make pshift* fully generic template functions with partial specialization which is always a mess in c++03.
  • Commit 55b63d4ea: Fix compilation without vector engine available (e.g., x86 with SSE disabled): -> ppolevl is required by ndtri even for the scalar path
  • Commit a9cf823db: Merged eigen/eigen
  • Commit e6c183f8f: Fix doc issues regarding ndtri
  • Commit 5702a5792: Fix possible warning regarding strict equality comparisons
  • Commit 99036a361: Merging from eigen/eigen.
  • Commit a8d264fa9: Add test for const TensorMap underlying data mutation
  • Commit f68f2bba0: TensorMap constness should not change underlying storage constness
  • Commit 8e7e3d9bc: Makes Scalar/RealScalar typedefs public in Pardiso's wrappers (see PR 688)
  • Commit e38dd48a2: PR 681: Add ndtri function, the inverse of the normal distribution function.
  • Commit f59bed7a1: Change typedefs from private to protected to fix MSVC compilation
  • Commit 47fefa235: Allow move-only done callback in TensorAsyncDevice
  • Commit 18ceb3413: Add ndtri function, the inverse of the normal distribution function.
  • Commit d55d392e7: Fix bugs in log1p and expm1 where repeated using statements would clobber each other. Add specializations for complex types since std::log1p and std::expm1 do not support complex.
  • Commit 85928e5f4: Guard against repeated definition of EIGEN_MPL2_ONLY
  • Commit facc4e453: Disable tests for contraction with output kernels when using libxsmm, which does not support this.
  • Commit eab7e52db: [Eigen] Vectorize evaluation of coefficient-wise functions over tensor blocks if the strides are known to be 1. Provides up to 20-25% speedup of the TF cross entropy op with AVX.
  • Commit 098712616: Clean up unnecessary namespace specifiers in TensorBlock.h.
  • Commit 0050644b2: Fix doc regarding alignment and c++17
  • Commit e2999d4c3: Fix performance regressions due to https://bitbucket.org/eigen/eigen/pull-requests/662.
  • Commit c694be121: Fixed Tensor documentation formatting.
  • Commit 15f3d9d27: More colamd cleanup: - Move colamd implementation in its own namespace to avoid polluting the internal namespace with Ok, Status, etc. - Fix signed/unsigned warning - move some ugly free functions as member functions
  • Commit a4d1a6cd7: Eigen_Colamd.h updated to replace constexpr with consts and enums.
  • Commit 283558fac: Ordering.h edited to fix dependencies on Eigen_Colamd.h
  • Commit 39f30923c: Eigen_Colamd.h edited replacing macros with constexprs and functions.
  • Commit 0a6b553ec: Eigen_Colamd.h edited online with Bitbucket replacing constant #defines with const definitions
  • Commit f22b7283a: Added leading asterisk for Doxygen to consume as it was removing asterisk intended to be part of the code.
  • Commit 6e17491f4: Fix typo in Umeyama method documentation
  • Commit e0f5a2a45: Remove {} accidentally added in previous commit
  • Commit ea6d7eb32: Move variadic constructors outside `#ifndef EIGEN_PARSED_BY_DOXYGEN` block, to make it actually appear in the generated documentation.
  • Commit 9237883ff: Escape \# inside Doxygen documentation
  • Commit c2671e531: Build deprecated snippets with -DEIGEN_NO_DEPRECATED_WARNING. Also, document LinSpaced only where it is implemented
  • Commit 3cd148f98: Fix expression evaluation heuristic for TensorSliceOp
  • Commit 23b958818: Fix compiler for unsigned integers.
  • Commit 608301459: Add outer/inner chipping optimization for chipping dimension specified at runtime
  • Commit 7eb2e0a95: adding the EIGEN_DEVICE_FUNC attribute to the constCast routine.
  • Commit ef8aca6a8: Merged in codeplaysoftware/eigen (pull request PR-667)
  • Commit 4ac93f8ed: Allocate non-const scalar buffer for block evaluation with DefaultDevice
  • Commit 9ea490c82: [SYCL] : * Modifying TensorDeviceSYCL to use `EIGEN_THROW_X`. * Modifying TensorMacro to use `EIGEN_TRY/CATCH(X)` macro. * Modifying TensorReverse.h to use `EIGEN_DEVICE_REF` instead of `&`. * Fixing the SYCL device macro in SpecialFunctionsImpl.h.
  • Commit 81a03bec7: Fix TensorReverse on GPU with m_stride[i]==0
  • Commit 8053eeb51: Fix CUDA compilation error for pselect<half>.
  • Commit 74a9dd110: Fix preprocessor condition to only generate a warning when calling Eigen::GpuDevice::synchronize() from device code, but not when calling from a non-GPU compilation unit.
  • Commit 70d4020ad: Remove comma causing warning in c++03 mode.
  • Commit 6e7c76481: Merge with Eigen head
  • Commit 878845cb2: Add block access to TensorReverseOp and make sure that TensorForcedEval uses block access when preferred
  • Commit 1f61aee5c: [SYCL] This PR adds the minimum modifications to the Eigen unsupported module required to run it on devices supporting SYCL. * Abstracting the pointer type so that both SYCL memory and pointer can be captured. * Converting SYCL virtual pointer to SYCL device memory in Eigen evaluator class. * Binding SYCL placeholder accessor to command group handler by using bind method in Eigen evaluator node. * Adding SYCL macro for controlling loop unrolling. * Modifying the TensorDeviceSycl.h and SYCL executor method to adopt the above changes.
  • Commit 7d08fa805: [SYCL] This PR adds the minimum modifications to the Eigen unsupported module required to run it on devices supporting SYCL. * Abstracting the pointer type so that both SYCL memory and pointer can be captured. * Converting SYCL virtual pointer to SYCL device memory in Eigen evaluator class. * Binding SYCL placeholder accessor to command group handler by using bind method in Eigen evaluator node. * Adding SYCL macro for controlling loop unrolling. * Modifying the TensorDeviceSycl.h and SYCL executor method to adopt the above changes.
  • Commit 16a56b2dd: [SYCL] This PR adds the minimum modifications to Eigen core required to run Eigen unsupported modules on devices supporting SYCL. * Adding SYCL memory model * Enabling/Disabling SYCL backend in Core * Supporting Vectorization
  • Commit adec097c6: Remove extra comma (causes warnings in C++03)
  • Commit 229db8157: Optimize evaluation strategy for TensorSlicingOp and TensorChippingOp
  • Commit ba506d5bd: Fix for a ROCm/HIP-specific compile error introduced by a recent commit.
  • Commit c9394d7a0: Remove extra "one" in comment.
  • Commit b8f8dac4e: Update comment as suggested by tra@google.com.
  • Commit e5e63c2ca: Fix grammar.
  • Commit 302a404b7: Added comment explaining the surprising EIGEN_COMP_CLANG && !EIGEN_COMP_NVCC clause.
  • Commit b5237f53b: Fix CUDA build on Mac.
  • Commit 988f24b73: Various fixes for packet ops. 1. Fix buggy pcmp_eq and unit test for half types. 2. Add unit test for pselect and add specializations for SSE 4.1, AVX512, and half types. 3. Get rid of FIXME: Implement faster pnegate for half by XOR'ing with a sign bit mask.
  • Commit e0be7f30e: bug #1724: Mask buggy warnings with g++-7 (grafted from 427f2f66d69ae9b124c2f8bcd927fb6e19e07e91 )
  • Commit fab51d133: Updated Eigen_Colamd.h, namespacing macros ALIVE & DEAD as COLAMD_ALIVE & COLAMD_DEAD to prevent conflicts with other libraries / code.
  • Commit 79c402e40: Fix shadow warnings in TensorContractionThreadPool
  • Commit edf2ec28d: Fix block mapper type name in TensorExecutor
  • Commit f0b36fb9a: evalSubExprsIfNeededAsync + async TensorContractionThreadPool
  • Commit 619cea949: Revert accidentally removed <memory> header from ThreadPool
  • Commit 66665e7e7: Asynchronous expression evaluation with TensorAsyncDevice
  • Commit f6c51d920: Fix missing header inclusion and colliding definitions for half type casting, which broke build with -march=native on Haswell/Skylake.
  • Commit bc40d4522: Const correctness in TensorMap<const Tensor<T, ...>> expressions
  • Commit 1187bb65a: Add more tests for corner cases of log1p and expm1. Add handling of infinite arguments to log1p such that log1p(inf) = inf.
  • Commit 6e77f9bef: Remove shadow warnings in TensorDeviceThreadPool
  • Commit 9aba52740: Revert changes to std_fallback::log1p that broke handling of arguments less than -1. Fix packet op accordingly.
  • Commit b021cdea6: Clean up float16 a.k.a. Eigen::half support in Eigen. Move the definition of half to Core/arch/Default and move arch-specific packet ops to their respective sub-directories.
  • Commit 84fefdf32: Merged in ezhulenev/eigen-01 (pull request PR-683)
  • Commit 8b5ab0e4d: Fix get_random_seed on Native Client
  • Commit 690178801: Asynchronous parallelFor in Eigen ThreadPoolDevice
  • Commit 2fb24384c: Merged in jaopaulolc/eigen (pull request PR-679)
  • Commit 57f6b6259: Merged in rmlarsen/eigen (pull request PR-680)
  • Commit 071311821: Remove XSMM support from Tensor module
  • Commit 5ac7984ff: Fix debug macros in p{load,store}u
  • Commit db9147ae4: Add missing pcmp_XX methods for double/Packet2d
  • Commit a3298b22e: Implement vectorized versions of log1p and expm1 in Eigen using Kahan's formulas, and change the scalar implementations to properly handle infinite arguments.
  • Commit 787f6ef02: Fix packed load/store for PowerPC's VSX
  • Commit 4d29aa029: Fix offset argument of ploadu/pstoreu for Altivec
  • Commit 66d073c38: bug #1718: Add cast to successfully compile with clang on PowerPC
  • Commit 6d432eae5: Make is_valid_index_type return false for float and double when EIGEN_HAS_TYPE_TRAITS is off.
  • Commit f715f6e81: Add workaround for choosing the right include files with FP16C support with clang.
  • Commit ffaf658ec: PR 655: Fix missing Eigen namespace in Macros
  • Commit 0b24e1cb5: [SYCL] Adding the SYCL memory model. The SYCL memory model provides : * an interface for SYCL buffers to behave as a non-dereferenceable pointer * an interface for placeholder accessor to behave like a pointer on both host and device
  • Commit c1b0aea65: Merged in Artem-B/eigen (pull request PR-654)
  • Commit b08527b0c: Clean up CUDA/NVCC version macros and their use in Eigen, and a few other CUDA build failures.
  • Commit b4c49bf00: Minor build improvements
  • Commit 561440058: digits10() needs to return an integer. Problem reported on https://stackoverflow.com/questions/56395899
  • Commit 36e0a2b93: Merged in deven-amd/eigen-hip-fix-190524 (pull request PR-649)
  • Commit 2c3893016: fix for HIP build errors that were introduced by a commit earlier this week
  • Commit 56bc4974f: GEMV: remove double declaration of constant.
  • Commit ac21a08c1: Cast Index to RealScalar. This fixes compilation issues with RealScalar types that are not implicitly castable from Index (e.g. ceres Jet types). Reported by Peter Anderson-Sprecher via email.
  • Commit 3eb5ad0ed: Enable support for F16C with Clang. The required intrinsics were added here: https://reviews.llvm.org/D16177 and are part of LLVM 3.8.0.
  • Commit e92486b8c: Merged in rmlarsen/eigen (pull request PR-643)
  • Commit fd595d42a: Merge
  • Commit cc7ecbb12: Merged in scramsby/eigen (pull request PR-646)
  • Commit 01654d97f: Prevent potential division by zero in TensorExecutor
  • Commit 78d301572: Merged in ezhulenev/eigen-01 (pull request PR-644)
  • Commit bf9cbed8d: Merged in glchaves/eigen (pull request PR-635)
  • Commit 96a276803: Always evaluate Tensor expressions with broadcasting via tiled evaluation code path
  • Commit ab0a30e42: Make Eigen build with cuda 10 and clang.
  • Commit 734a50dc6: Make Eigen build with cuda 10 and clang.
  • Commit c8d8d5c0f: Merged in rmlarsen/eigen_threadpool (pull request PR-640)
  • Commit 5f32b79ed: Collapsed revision from PR-641 * SparseLU.h - corrected example, it didn't compile * Changed encoding back to UTF8
  • Commit ad372084f: Removing unused API to fix compile error in TensorFlow due to AVX512VL, AVX512BW usage
  • Commit 4ccd1ece9: bug #1707: Fix deprecation warnings, or disable warnings when testing deprecated functions
  • Commit d3ef7cf03: Fix build with clang on Windows.
  • Commit e5ac8cbd7: A) fix deadlocks in thread pool caused by EventCount
  • Commit c5019f722: Use pade for matrix exponential also for complex values.
  • Commit 45b40d91c: Fix AVX512 & GCC 6.3 compilation
  • Commit e6667a706: Fix stupid shadow-warnings (with old clang versions)
  • Commit e54dc24d6: Restore C++03 compatibility
  • Commit cca76c272: Restore C++03 compatibility
  • Commit 8e33844fc: Fix traits for scalar_logistic_op.
  • Commit ff06ef758: Eigen: Fix MSVC C++17 language standard detection logic. To detect C++17 support, use _MSVC_LANG macro instead of _MSC_VER. _MSC_VER can indicate whether the current compiler version could support the C++17 language standard, but not whether that standard is actually selected (i.e. via /std:c++17). See these web pages for more details: https://devblogs.microsoft.com/cppblog/msvc-now-correctly-reports-__cplusplus/ https://docs.microsoft.com/en-us/cpp/preprocessor/predefined-macros
  • Commit e9f0eb8a5: Add masked_store_available to unpacket_traits
  • Commit 96e30e936: Add masked pstoreu for Packet16h
  • Commit b4010f02f: Add masked pstoreu to AVX and AVX512 PacketMath
  • Commit 578407f42: Fix regression in changeset ae33e866c750c6c24ada5c6f7f3ec15815d0e683
  • Commit ac50afaff: Merged in ezhulenev/eigen-01 (pull request PR-633)
  • Commit d4dcb71bc: Speed up GEMV on AVX-512 builds, just as done for GEBP previously.
  • Commit ae33e866c: Fix compilation with PGI version 19
  • Commit 665ac22cc: Merged in ezhulenev/eigen-01 (pull request PR-632)
  • Commit 01d7e6ee9: Check if gpu_assert was overridden in TensorGpuHipCudaDefines
  • Commit 8ead5bb3d: Fix doxygen warnings to enable static code analysis
  • Commit 07355d47c: Get rid of SequentialLinSpacedReturnType deprecation warnings in DenseBase.h
  • Commit 144ca3332: Remove deprecation annotation from typedef Eigen::Index Index, as it would generate too many build warnings.
  • Commit a7b7f3ca8: Add missing EIGEN_DEPRECATED annotations to deprecated functions and fix few other doxygen warnings
  • Commit 68a2a8c44: Use packet ops instead of AVX2 intrinsics
  • Commit 8c7a6feb8: Adding lowlevel APIs for optimized RHS packet load in TensorFlow SpatialConvolution
  • Commit 4270c6281: Split the implementation of i?amax/min into two. Based on PR-627 by Sameer Agarwal. Like the Netlib reference implementation, I*AMAX now uses the L1-norm instead of the L2-norm for each element. Changed I*MIN accordingly.
  • Commit 039ee5212: Tweak cost model for tensor contraction when parallelizing over the inner dimension.
  • Commit 9a3f06d83: Update ThreadPoolDevice example to include ThreadPool creation and passing pointer into constructor.
  • Commit 66a885b61: adding EIGEN_DEVICE_FUNC to the recently added TensorContractionKernel constructor. Not having the EIGEN_DEVICE_FUNC attribute on it was leading to compiler errors when compiling Eigen in the ROCm/HIP path
  • Commit 629ddebd1: Add missing semicolon
  • Commit 4e2f6de1a: Add support for custom packed Lhs/Rhs blocks in tensor contractions
  • Commit 45e65fbb7: bug #1695: fix a numerical robustness issue. Computing the secular equation at the middle range without a shift might give a wrong sign.
  • Commit 8de66719f: Collapsed revision from PR-619 * Add support for pcmp_eq in AltiVec/Complex.h * Fixed implementation of pcmp_eq for double
  • Commit f11364290: ICC does not support -fno-unsafe-math-optimizations
  • Commit 3031d5720: PR 621: Fix documentation of EIGEN_COMP_EMSCRIPTEN
  • Commit 51e399fc1: Updates requested in the PR feedback. Also dropping code within #ifdef EIGEN_HAS_OLD_HIP_FP16
  • Commit 2dbea5510: Merged eigen/eigen into default
  • Commit 5c93b38c5: Merged in rmlarsen/eigen (pull request PR-618)
  • Commit 48898a988: fix unit test in c++03: c++03 does not allow passing local or anonymous enum as template param
  • Commit cf7e2e277: bug #1692: enable enum as sizes of Matrix and Array
  • Commit e42f9aa68: Make clipping outside [-18:18] consistent for vectorized and non-vectorized paths of scalar_logistic_<float>.
  • Commit 1936aac43: Merged in tellenbach/eigen/sykline_consistent_include_guards (pull request PR-617)
  • Commit bd9c2ae3f: Fix include guard comments
  • Commit 8450a6d51: Clean up half packet traits and add a few more missing packet ops.
  • Commit b013176e5: Remove undefined std::complex<int>
  • Commit 97f9a46cb: PR 593: Add variadic ctor for DiagonalMatrix with unit tests
  • Commit 45ab514fe: revert debug stuff
  • Commit 6a3400314: Remove EIGEN_MPL2_ONLY guard in IncompleteCholesky that is no longer needed after the AMD reordering code was relicensed to MPL2.
  • Commit d7d2f0680: bug #1684: partially workaround clang's 6/7 bug #40815
  • Commit 690f0795d: Merged in rmlarsen/eigen (pull request PR-615)
  • Commit 190143367: erm.. use proper id
  • Commit 90302aa8c: update tracking code
  • Commit 77f7d4a89: Clean up PacketMathHalf.h and add a few missing logical packet ops.
  • Commit 001f10e3c: Fix segfaults with cuda compilation
  • Commit 899c16fa2: Fix a bug in TensorGenerator for 1d tensors
  • Commit 0f8bfff23: Fix a data race in NonBlockingThreadPool
  • Commit 656d9bc66: Apply SSE's pmin/pmax fix for GCC <= 5 to AVX's pmin/pmax
  • Commit 2df4f0024: Change license from LGPL to MPL2 with agreement from David Harmon.
  • Commit 3c3f639fe: Merge.
  • Commit f4ec8edea: Add macro EIGEN_AVOID_THREAD_LOCAL to make it possible to manually disable the use of thread_local.
  • Commit 41cdc370d: Fix placement of "#if defined(EIGEN_GPUCC)" guard region.
  • Commit cc407c9d4: Fix placement of "#if defined(EIGEN_GPUCC)" guard region.
  • Commit 1bc2a0a57: Add missing return to NonBlockingThreadPool::LocalSteal
  • Commit 4e4dcd902: Remove redundant steal loop
  • Commit 4d808e834: Merged in rmlarsen/eigen_threadpool (pull request PR-606)
  • Commit 2ea18e505: Merged in ezhulenev/eigen-01 (pull request PR-610)
  • Commit 25abaa2e4: Check that inner block dimension is continuous
  • Commit 5d9a6686e: Block evaluation for TensorGeneratorOp
  • Commit b4861f477: Merged in ezhulenev/eigen-01 (pull request PR-609)
  • Commit bfbf7da04: bug #1689 fix used-but-marked-unused warning
  • Commit a407e022e: Tune tensor contraction threadpool heuristics
  • Commit 56c6373f8: Add an extra check for the RunQueue size estimate
  • Commit b1a862749: Do not create Tensor<const T> in cxx11_tensor_forced_eval test
  • Commit 0318fc7f4: Remove EIGEN_MPL2_ONLY guards around code re-licensed from LGPL to MPL2 in https://bitbucket.org/eigen/eigen/commits/2ca1e732398ea2c506427e9031212d63e9253b96
  • Commit efb5080d3: Do not initialize invalid fast_strides in TensorGeneratorOp
  • Commit b95941e5c: Add tiled evaluation for TensorForcedEvalOp
  • Commit 694084ecb: Use fast divisors in TensorGeneratorOp
  • Commit b0d406d91: Enable construction of Ref<VectorType> from a runtime vector.
  • Commit 9ba81cf0f: Fully qualify Eigen::internal::aligned_free
  • Commit 22144e949: bug #1629: fix compilation of PardisoSupport (regression introduced in changeset a7842daef2c82a9be200dff54d455f6d4a0b199c )
  • Commit b071672e7: Do not keep latex logs
  • Commit cf4a1c81f: Fix specialization for conjugate on non-complex types in TensorBase.h.
  • Commit c181dfb8a: Consistently use EIGEN_BLAS_FUNC in BLAS.
  • Commit 9558f4c25: Merged in rmlarsen/eigen_threadpool (pull request PR-596)
  • Commit 2ca1e7323: Merged in rmlarsen/eigen (pull request PR-597)
  • Commit e409dbba1: Enable SSE vectorization of Quaternion and cross3() with AVX
  • Commit 6560692c6: Improve EventCount used by the non-blocking threadpool.
  • Commit 0b25a5c43: fix alignment in ploadquad
  • Commit 1dc1677d5: Change licensing of OrderingMethods/Amd.h and SparseCholesky/SimplicialCholesky_impl.h from LGPL to MPL2. Google LLC executed a license agreement with the author of the code from which these files are derived to allow the Eigen project to distribute the code and derived works under MPL2.
  • Commit 0cb4ba98e: update wrt recent changes
  • Commit cca6c207f: AVX512: implement faster ploadquad<Packet16f> thus speeding up GEMM
  • Commit 1c09ee854: bug #1674: workaround clang fast-math aggressive optimizations
  • Commit 7e3084bb6: Fix compilation on ARM.
  • Commit 32502f3c4: bug #1684: add simplified regression test for respective clang's bug (this also reveals the same bug in Apple's clang)
  • Commit 42c23f14a: Speed up col/row-wise reverse for fixed size matrices by propagating compile-time sizes.
  • Commit 4d7f31710: Add a few missing packet ops: cmp_eq for NEON. pfloor for GPU.
  • Commit 2a39659d7: Add fully generic Vector<Type,Size> and RowVector<Type,Size> type aliases.
  • Commit 302377110: Update documentation of Matrix and Array type aliases.
  • Commit 475295b5f: Enable documentation of Array's typedefs
  • Commit 44b54fa4a: Protect c++11 type alias with Eigen's macro, and add respective unit test.
  • Commit 7195f008c: Merged in ra_bauke/eigen (pull request PR-180)
  • Commit 4e8047cdc: Fix compilation with gcc and remove TR1 stuff.
  • Commit 844e5447f: Update documentation regarding alignment issue.
  • Commit edd413c18: bug #1409: make EIGEN_MAKE_ALIGNED_OPERATOR_NEW* macros empty in c++17 mode: - this helps clang 5 and 6 to support alignas in STL's containers. - this makes the public API of our (and users) classes cleaner
  • Commit 3b5deeb54: bug #899: make sparseqr unit test more stable by 1) trying with larger threshold and 2) relax rank computation for rank-deficient problems.
  • Commit 482c5fb32: bug #899: remove "rank-revealing" qualifier for SparseQR and warn that it is not always rank-revealing.
  • Commit 9ac1634fd: Fix conversion warnings
  • Commit 292d61970: Fix C++17 compilation
  • Commit 071629a44: Fix incorrect value of NumDimensions in TensorContraction traits. Reported here: #1671
  • Commit a1646fc96: Commas at the end of enumerator lists are not allowed in C++03
  • Commit 2cfc025bd: fix unit compilation in c++17: std::ptr_fun has been removed.
  • Commit ab78cabd3: Add C++17 detection macro, and make sure throw(xpr) is not used if the compiler is in c++17 mode.
  • Commit 115da6a1e: Fix conversion warnings
  • Commit 7d10c7873: bug #1046: add unit tests for correct propagation of alignment through std::alignment_of
  • Commit 7580112c3: Fix harmless Scalar vs RealScalar cast.
  • Commit e23bf40dc: Add unit test for LinSpaced and complex numbers.
  • Commit 796db94e6: bug #1194: implement slightly faster and SIMD friendly 4x4 determinant.
  • Commit 31b6e080a: Fix regression: .conjugate() was popped out but not re-introduced.
  • Commit c69d0d08d: Set cost of conjugate to 0 (in practice it boils down to a no-op). This is also important to make sure that A.conjugate() * B.conjugate() does not evaluate its arguments into temporaries (e.g., if A and B are fixed and small, or fall back to lazyProduct)
  • Commit 512b74aaa: GEMM: catch all scalar-multiple variants when falling-back to a coeff-based product. Before only s*A*B was caught which was both inconsistent with GEMM, sub-optimal, and could even lead to compilation-errors (https://stackoverflow.com/questions/54738495).
  • Commit ec032ac03: Guard C++11-style default constructor. Also, this is only needed for MSVC
  • Commit 902a7793f: Add possibility to bench row-major lhs and rhs
  • Commit 83309068b: bug #1680: improve MSVC inlining by declaring many trivial constructors and accessors as STRONG_INLINE.
  • Commit 0505248f2: bug #1680: make all "block" methods strong-inline and device-functions (some were missing EIGEN_DEVICE_FUNC)
  • Commit 559320745: bug #1678: Fix lack of __FMA__ macro on MSVC with AVX512
  • Commit d85ae650b: bug #1678: workaround MSVC compilation issues with AVX512
  • Commit f2970819a: bug #1679: avoid possible division by 0 in complex-schur
  • Commit 65e23ca7e: Revert https://bitbucket.org/eigen/eigen/commits/b55b5c7280a0481f01fe5ec764d55c443a8b6496 .
  • Commit efeabee44: Merged in ezhulenev/eigen-01 (pull request PR-590)
  • Commit 7b837559a: Fix signed-unsigned return in RunQueue
  • Commit f0d42d226: Fix signed-unsigned comparison warning in RunQueue
  • Commit 106ba7bb1: Do not generate no-op cast() and conjugate() expressions
  • Commit 8c2f30c79: Speedup Tensor ThreadPool RunQueue::Empty()
  • Commit bdcb5f330: Let's properly use Score instead of std::abs, and remove deprecated FIXME ( a /= b does a/b and not a * (1/b) as it was a long time ago...)
  • Commit 2edfc6807: Fix compilation of empty products of the form: Mx0 * 0xN
  • Commit eb46f34a8: Speed up 2x2 LU by a factor 2, and other small fixed sizes by about 10%. Not sure that's so critical, but this does not complexify the code base much.
  • Commit dada863d2: Enable unit tests of PartialPivLU on fixed size matrices, and increase tested matrix size (blocking was not tested!)
  • Commit ab6e6edc3: Speedup PartialPivLU for small matrices by passing compile-time sizes when available. This change set also makes a better use of Map<>+OuterStride and Ref<> yielding surprising speed up for small dynamic sizes as well. The table below reports times in microseconds for 10 random matrices:
              |        float         |        double        |
      size    | before  after  ratio | before  after  ratio |
      fixed 1 |  0.34    0.11   2.93 |  0.35    0.11   3.06 |
      fixed 2 |  0.81    0.24   3.38 |  0.91    0.25   3.60 |
      fixed 3 |  1.49    0.49   3.04 |  1.68    0.55   3.01 |
      fixed 4 |  2.31    0.70   3.28 |  2.45    1.08   2.27 |
      fixed 5 |  3.49    1.11   3.13 |  3.84    2.24   1.71 |
      fixed 6 |  4.76    1.64   2.88 |  4.87    2.84   1.71 |
      dyn 1   |  0.50    0.40   1.23 |  0.51    0.40   1.26 |
      dyn 2   |  1.08    0.85   1.27 |  1.04    0.69   1.49 |
      dyn 3   |  1.76    1.26   1.40 |  1.84    1.14   1.60 |
      dyn 4   |  2.57    1.75   1.46 |  2.67    1.66   1.60 |
      dyn 5   |  3.80    2.64   1.43 |  4.00    2.48   1.61 |
      dyn 6   |  5.06    3.43   1.47 |  5.15    3.21   1.60 |
  • Commit 21eb97d3e: Add PacketConv implementation for non-vectorizable src expressions
  • Commit 1e36166ed: Optimize TensorConversion evaluator: do not convert same type
  • Commit 953ca5ba2: Spline.h: fix spelling "spang" -> "span"
  • Commit 59998117b: Don't do parallel_pack if we can use thread_local memory in tensor contractions
  • Commit 013cc3a6b: Make GEMM fallback to GEMV for runtime vectors. This is a more general and simpler version of changeset 4c0fa6ce0f81ce67dd6723528ddf72f66ae92ba2
  • Commit fa2fcb489: Backed out changeset 4c0fa6ce0f81ce67dd6723528ddf72f66ae92ba2
  • Commit b3c4344a6: bug #1676: workaround GCC's bug in c++17 mode.
  • Commit 3091c0389: Merged in ezhulenev/eigen-01 (pull request PR-581)
  • Commit 849112708: Do not reduce parallelism too much in contractions with small number of threads
  • Commit eb21bab76: Parallelize tensor contraction only by sharding dimension and use 'thread-local' memory for packing
  • Commit 6d0f6265a: Remove duplicated comment line
  • Commit 690b2c45b: Fix GeneralBlockPanelKernel Android compilation
  • Commit 871e2e533: bug #1674: disable GCC's unsafe-math-optimizations in sin/cos vectorization (results are completely wrong otherwise)
  • Commit e7b481ea7: Merged in rmlarsen/eigen (pull request PR-578)
  • Commit b55b5c728: Speed up row-major matrix-vector product on ARM
  • Commit 4c0fa6ce0: Speed up Eigen matrix*vector and vector*matrix multiplication.
  • Commit 7ef879f6b: GEBP: improves pipelining in the 1pX4 path with FMA. Prior to this change, a product with a LHS having 8 rows was faster with AVX-only than with AVX+FMA. With AVX+FMA I measured a speed up of about x1.25 in such cases.
  • Commit de77bf5d6: Fix compilation with ARM64.
  • Commit d58668692: Workaround lack of support for arbitrary packet-type in Tensor by manually loading half/quarter packets in tensor contraction mapper.
  • Commit eb4c6bb22: Fix conflicts and merge
  • Commit e3622a039: Slightly extend discussions on auto and move the content of the Pit falls wiki page here. http://eigen.tuxfamily.org/index.php?title=Pit_Falls
  • Commit df12fae8b: According to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89101, the previous GCC issue is fixed in GCC trunk (will be gcc 9).
  • Commit 3775926bb: ARM64 & GEBP: add specialization for double +30% speed up
  • Commit be5b0f664: ARM64 & GEBP: Make use of vfmaq_laneq_f32 and workaround GCC's issue in generating good ASM
  • Commit a7779a9b4: Hide some annoying unused variable warnings in g++8.1
  • Commit efe02292a: Add recent gemm related changesets and various cleanups in perf-monitoring
  • Commit 8a06c699d: bug #1669: fix PartialPivLU/inverse with zero-sized matrices.
  • Commit a2a07e62b: Fix compilation with c++03 (local class cannot be template arguments), and make SparseMatrix::assignDiagonal truly protected.
  • Commit f489f4451: bug #1574: implement "sparse_matrix =,+=,-= diagonal_matrix" with smart insertion strategies of missing diagonal coeffs.
  • Commit 803fa7976: Move evaluator<SparseCompressedBase>::find(i,j) to a more general and reusable SparseCompressedBase::lower_bound(i,j) function
  • Commit 53560f918: bug #1672: fix unit test compilation with MSVC by adding overloads of test_is* for long long (and factorize copy/paste code through a macro)
  • Commit c9825b967: Renaming even more `I` identifiers
  • Commit 5a52e35f9: Renaming some more `I` identifiers
  • Commit 71429883e: Fix compilation error in NEON GEBP specialization of madd.
  • Commit 934b8a130: Avoid `I` as an identifier, since it may clash with the C-header complex.h
  • Commit ec8a38797: cleanup
  • Commit 6908ce2a1: More thoroughly check variadic template ctor of fixed-size vectors
  • Commit 237b03b37: PR 574: use variadic template instead of initializer_list to implement fixed-size vector ctor from coefficients.
  • Commit bd6dadcda: Tell doxygen that cxx11 math is available
  • Commit c64d5d382: Bypass inline asm for non compatible compilers.
  • Commit e16913a45: Fix name of tutorial snippet.
  • Commit 80f81f9c4: Cleanup SFINAE in Array/Matrix(initializer_list) ctors and minor doc editing.
  • Commit db152b9ee: PR 572: Add initializer list constructors to Matrix and Array (include unit tests and doc) - {1,2,3,4,5,...} for fixed-size vectors only - {{1,2,3},{4,5,6}} for the general cases - {{1,2,3,4,5,...}} is allowed for both row and column-vector
  • Commit 543529da6: Add more extensive tests of Array ctors, including {} variants
  • Commit 92774f027: Replace host_define.h with cuda_runtime_api.h
  • Commit d18f49cbb: Fix compilation of unit tests with gcc and c++17
  • Commit da0a41b9c: Mask unused-parameter warnings, when building with NDEBUG
  • Commit 2eccbaf3f: Add missing logical packet ops for GPU and NEON.
  • Commit d575505d2: After fixing bug #1557, boostmultiprec_7 failed with NumericalIssue instead of NoConvergence (all that matters here is no Success)
  • Commit ee3662abc: Remove some useless const_cast
  • Commit 0fe6b7d68: Make nestByValue work again (broken since 3.3) and add unit tests.
  • Commit 4b7cf7ff8: Extend reshaped unit tests and remove useless const_cast
  • Commit b57c9787b: Cleanup useless const_cast and add missing broadcast assignment tests
  • Commit be05d0030: Make FullPivLU use conjugateIf<>
  • Commit bba2f0506: Boosttest only available for Boost version >= 1.53.0
  • Commit 15e53d5d9: PR 567: makes all dense solvers inherit SolverBase (LU,Cholesky,QR,SVD). This changeset also includes: * add HouseholderSequence::conjugateIf * define int as the StorageIndex type for all dense solvers * dedicated unit tests, including assertion checking * _check_solve_assertion(): this method can be implemented in derived solver classes to implement custom checks * CompleteOrthogonalDecompositions: add applyZOnTheLeftInPlace, fix scalar type in applyZAdjointOnTheLeftInPlace(), add missing assertions * Cholesky: add missing assertions * FullPivHouseholderQR: Corrected Scalar type in _solve_impl() * BDCSVD: Unambiguous return type for ternary operator * SVDBase: Corrected Scalar type in _solve_impl()
  • Commit 7f32109c1: Add conjugateIf<bool> members to DenseBase, TriangularView, SelfadjointView, and make PartialPivLU use it.
  • Commit 7b35c26b1: Doc: remove link to porting guide
  • Commit 4759d9e86: Doc: add manual page on STL iterators
  • Commit 562985bac: bug #1646: fix false aliasing detection for A.row(0) = A.col(0); This changeset completely disables the detection for vectors, for which our current mechanism cannot detect any positive aliasing anyway.
  • Commit 7401e2541: Fix compilation error for logical packet ops with older compilers.
  • Commit ee550a2ac: Fix flaky test for tensor fft.
  • Commit 0f028f61c: GEBP: fix swapped kernel mode with AVX512 and complex scalars
  • Commit e118ce86f: GEBP: cleanup logic to choose between 4 packets or 1 packet
  • Commit 70e133333: bug #1661: fix regression in GEBP and AVX512
  • Commit ce88e297d: Add a comment stating this doc page is partly obsolete.
  • Commit 729d1291c: bug #1585: update doc on lazy-evaluation
  • Commit c8e40edac: Remove Eigen2ToEigen3 migration page (obsolete since 3.3)
  • Commit aeffdf909: bug #1617: add unit tests for empty triangular solve.
  • Commit 502f71798: bug #1646: disable aliasing detection for empty and 1x1 expression
  • Commit 0b466b693: bug #1633: use proper type for madd temporaries, factorize RhsPacketx4.
  • Commit dbfcceabf: bug #1633: refactor gebp kernel and optimize for neon
  • Commit 2b70b2f57: Make Transform::rotation() an alias to Transform::linear() in the case of an Isometry
  • Commit 2c2c11499: Silence maybe-uninitialized warnings by gcc
  • Commit 6ec6bf0b0: Enable visitor on empty matrices (the visitor is left unchanged), and protect min/maxCoeff(Index*,Index*) on empty matrices by an assertion (+ doc & unit tests)
  • Commit 027e44ed2: bug #1592: makes partial min/max reductions trigger an assertion on inputs with a zero reduction length (+doc and tests)
  • Commit f8bc5cb39: Fix detection of vector-at-time: use Rows/Cols instead of MaxRow/MaxCols. This fixes VectorXd(n).middleCol(0,0).outerSize() which was equal to 1.
  • Commit 32d7232ae: fix always true warning with gcc 4.7
  • Commit 6cf7afa3d: Typo
  • Commit e7d4d4f19: cleanup
  • Commit 7b3aab093: Merged in rmlarsen/eigen (pull request PR-570)
  • Commit 8bf00c2ba: Remove extra <tr>.
  • Commit ec7fe8355: Merge.
  • Commit 2ea4efc0c: Merge.
  • Commit 2c5843dbb: Update documentation.
  • Commit 250dcd1fd: bug #1652: fix position of EIGEN_ALIGN16 attributes in Neon and Altivec
  • Commit 5a59452aa: Merged eigen/eigen into default
  • Commit 3c9e6d206: AVX512: fix pgather/pscatter for Packet4cd and unaligned pointers
  • Commit 61b6eb05f: AVX512 (r)sqrt(double) was mistakenly disabled with clang and others
  • Commit ccddeaad9: fix warning
  • Commit d4881751d: Doc: add Isometry in the list of supported Mode of Transform<>
  • Commit 9d988a1e1: Initialize isometric transforms like affine transforms.
  • Commit 4356a55a6: PR 571: Implements an accurate argument reduction algorithm for huge inputs of sin/cos and call it instead of falling back to std::sin/std::cos. This makes both the small and huge argument cases faster because: - for small inputs this removes the last pselect - for large inputs only the reduction part follows a scalar path, the rest use the same SIMD path as the small-argument case.
  • Commit f56672402: Fix StorageIndex FIXME in dense LU solvers
  • Commit 1c6e6e2c3: Merge.
  • Commit 0ba3b4541: Merged eigen/eigen into default
  • Commit 28ba1b2c3: Add support for inverse hyperbolic functions. Fix cost of division.
  • Commit 89c4001d6: Fix warnings in ptrue for complex and half types.
  • Commit a49d01edb: Fix warnings in ptrue for complex and half types.
  • Commit 1e6d15b55: Fix shorten-64-to-32 warning in TensorContractionThreadPool
  • Commit df29511ac: Fix merge.
  • Commit 8e71ed4cc: Merge.
  • Commit fff5a5b57: Resolve.
  • Commit 9396ace46: Merge.
  • Commit 74882471d: Merged eigen/eigen into default
  • Commit e9936cf2b: Merge.
  • Commit 9005f0111: Replace compiler's alignas/alignof extension by respective c++11 keywords when available. This also fixes a compilation issue with gcc-4.7.
  • Commit 3c9add659: Remove reinterpret_cast from AVX512 complex implementation
  • Commit 0522460a0: bug #1656: Enable failtests only if BUILD_TESTING is enabled
  • Commit 0abe03764: Fix shorten-64-to-32 warning in TensorContractionThreadPool
  • Commit fcfced13e: Rename pones -> ptrue. Use _CMP_TRUE_UQ where appropriate.
  • Commit ce38c342c: merge.
  • Commit a05ec7993: merge
  • Commit e15bb785a: Collapsed revision * Add packet op "pones". Write pnot(a) as pxor(pones(a), a). * Collapsed revision * Simplify a bit. * Undo useless diffs. * Fix typo.
  • Commit f6ba6071c: Fix typo.
  • Commit 8f0444252: Collapsed revision * Collapsed revision * Add packet op "pones". Write pnot(a) as pxor(pones(a), a). * Collapsed revision * Simplify a bit. * Undo useless diffs. * Fix typo.
  • Commit 8f178429b: Collapsed revision * Collapsed revision * Add packet op "pones". Write pnot(a) as pxor(pones(a), a). * Collapsed revision * Simplify a bit. * Undo useless diffs. * Fix typo.
  • Commit 1119c73d2: Collapsed revision * Add packet op "pones". Write pnot(a) as pxor(pones(a), a). * Collapsed revision * Simplify a bit. * Undo useless diffs. * Fix typo.
  • Commit e00521b51: Undo useless diffs.
  • Commit f2767112c: Simplify a bit.
  • Commit cb955df9a: Add packet op "pones". Write pnot(a) as pxor(pones(a), a).
  • Commit cb3c059fa: Merged eigen/eigen into default
  • Commit d812f411c: bug #1654: fix compilation with cuda and no c++11
  • Commit 3492a1ca7: fix plog(+inf) with AVX512
  • Commit 47810cf5b: Add dedicated implementations of predux_any for AVX512, NEON, and Altivec/VSX
  • Commit 3f14e0d19: fix warning
  • Commit aeec68f77: Add missing pcmp_lt and others for AVX512
  • Commit e6b217b8d: bug #1652: implements a much more accurate version of vectorized sin/cos. This new version achieves the same speed for SSE/AVX, and is slightly faster with FMA. Guarantees are as follows: - no FMA: 1ULP up to 3pi, 2ULP up to sin(25966) and cos(18838), fallback to std::sin/cos for larger inputs - FMA: 1ULP up to sin(117435.992) and cos(71476.0625), fallback to std::sin/cos for larger inputs
  • Commit e70ffef96: Optimize evalShardedByInnerDim
  • Commit 055f0b73d: Add support for pcmp_eq and pnot, including for complex types.
  • Commit 190d053e4: Explicitly set fill character when printing aligned data to ostream
  • Commit bc5dd4caf: PR560: Fix the AVX512f only builds
  • Commit 697fba3bb: Fix unit test
  • Commit 60d3fe9a8: One more stupid AVX 512 fix (I don't have direct access to AVX512 machines)
  • Commit 4aa667b51: Add EIGEN_STRONG_INLINE where required
  • Commit 961ff567e: Add missing pcmp_lt_or_nan for AVX512
  • Commit 0f6f75bd8: Implement a faster fix for sin/cos of large entries that also correctly handle INF input.
  • Commit 38d704def: Make sure that psin/pcos return number in [-1,1] for large inputs (though sin/cos on large entries is quite useless because it's inaccurate)
  • Commit 5713fb7fe: Fix plog(+INF): it returned ~87 instead of +INF
  • Commit 6dd93f7e3: Make code compile again for older compilers. See https://stackoverflow.com/questions/7411515/
  • Commit 1024a70e8: gebp: Add new ½ and ¼ packet rows per (peeling) round on the lhs
  • Commit e763fcd09: Introducing "vectorized" byte on unpacket_traits structs
  • Commit efa4c9c40: bug #1615: slightly increase the default unrolling limit to compensate for changeset 101ea26f5e18919972b321b5f7e3ef4e07be3fd6 . This solves a performance regression with clang and 3x3 matrix products.
  • Commit f20c99167: add changesets related to matrix product perf.
  • Commit dd6d65898: Fix shorten-64-to-32 warning. Use regular memcpy if num_threads==0.
  • Commit f582ea357: Fix compilation with expression template scalar type.
  • Commit cfc70dc13: Add regression test for bug #1174
  • Commit 2de8da70f: bug #1557: fix RealSchur and EigenSolver for matrices with only zeros on the diagonal.
  • Commit 72c0bbe2b: Simplify handling of tests that must fail to compile. Each test is now a normal ctest target, and build properties (compiler+flags) are preserved (instead of starting a new build-dir from scratch).
  • Commit 37c91e183: bug #1644: fix warning
  • Commit f159cf3d7: Artificially increase l1-blocking size for AVX512. +10% speedup with current kernels. With a 6pX4 kernel (not committed yet), this provides a +20% speedup.
  • Commit 0a7e7af6f: Properly set the number of registers for AVX512
  • Commit 7166496f7: bug #1643: fix compilation issue with gcc and no optimization
  • Commit 0d9063783: enable spilling workaround on architectures with SSE/AVX
  • Commit cf697272e: Remove debug code.
  • Commit 450dc97c6: Various fixes in polynomial solver and its unit tests: - cleanup noise in imaginary part of real roots - take into account the magnitude of the derivative to check roots. - use <= instead of < at appropriate places
  • Commit 348bb386d: Enable "old" CMP0026 policy (not perfect, but better than dozens of warning)
  • Commit bff90bf27: workaround "may be used uninitialized" warning
  • Commit 81c27325a: bug #1641: fix testing of pandnot and fix pandnot for complex on SSE/AVX/AVX512
  • Commit 426bce752: fix EIGEN_GEBP_2PX4_SPILLING_WORKAROUND for non vectorized type, and non x86/64 target
  • Commit cd25b538a: Fix noise in sparse_basic_3 (numerical cancellation)
  • Commit efaf03bf9: Fix noise in lu unit test
  • Commit 956678a4e: bug #1515: disable gebp's 3pX4 micro kernel for MSVC<=19.14 because of register spilling.
  • Commit 7b6d0ff1f: Enable FMA with MSVC (through /arch:AVX2). To make this possible, I also had to turn the #warning regarding AVX512-FMA to a #error.
  • Commit f233c6194: bug #1637: workaround register spilling in gebp with clang>=6.0+AVX+FMA
  • Commit ae59a7652: bug #1638: add a warning if avx512 is enabled without SSE/AVX FMA
  • Commit 4e7746fe2: bug #1636: fix gemm performance issue with gcc>=6 and no FMA
  • Commit cbf2f4b7a: AVX512f includes FMA but GCC does not define __FMA__ with -mavx512f only
  • Commit 1d683ae2f: Fix compilation with avx512f only, i.e., no AVX512DQ
  • Commit aab749b1c: fix test regarding AVX512 vectorization of complexes.
  • Commit c53eececb: Implement AVX512 vectorization of std::complex<float/double>
  • Commit 3fba59ea5: temporarily re-disable SSE/AVX vectorization of complex<> on AVX512 -> this needs to be fixed though!
  • Commit 1ac2695ef: bug #1636: fix compilation with some ABI versions.
  • Commit 47d8b741b: #elif -> #else to fix GPU build.
  • Commit 8a02883d5: Merged in markdryan/eigen/avx512-contraction-2 (pull request PR-554)
  • Commit acc3459a4: Add help messages in the quick ref/ascii docs regarding slicing, indexing, and reshaping.
  • Commit e2e897298: Fix page nesting
  • Commit c1d356e8b: bug #1635: Use infinity from Numtraits instead of creating it manually.
  • Commit 36f8f6d0b: Fix evalShardedByInnerDim for AVX512 builds
  • Commit b57b31cce: Merged in ezhulenev/eigen-01 (pull request PR-553)
  • Commit 0bb15bb6d: Update checks in ConfigureVectorization.h
  • Commit fd0fbfa9b: Do not disable alignment with EIGEN_GPUCC
  • Commit 919414b9f: bug #785: Make Cholesky decomposition work for empty matrices
  • Commit 0ea7ae721: Add missing padd for Packet8i (it was implicitly generated by clang and gcc)
  • Commit ab4df3e6f: bug #1634: remove double copy in move-ctor of non movable Matrix/Array
  • Commit c78546443: Add packet sin and cos to Altivec/VSX and NEON
  • Commit 69ace742b: Several improvements regarding packet-bitwise operations: - add unit tests - optimize their AVX512f implementation - add missing implementations (half, Packet4f, ...)
  • Commit fa87f9d87: Add psin/pcos on AVX512 -> almost for free, at last!
  • Commit c68bd2fa7: Cleanup
  • Commit f91500d30: Fix pandnot order in AVX512
  • Commit b477d60bc: Extend the generic psin_float code to handle cosine and make SSE and AVX use it (-> this adds pcos for AVX)
  • Commit e19ece822: Disable fma gcc's workaround for gcc >= 8 (based on GEMM benchmarks)
  • Commit 41052f63b: same for pmax
  • Commit 3e95e398b: pmin/pmax on SSE: make sure to use AVX instructions with AVX enabled, and disable gcc workaround for fixed gcc versions
  • Commit aa6097395: Add missing SSE/AVX type-casting in AVX512 mode
  • Commit 48fe78c37: bug #1630: fix linspaced when requesting smaller packet size than default one.
  • Commit 80f1651f3: Use explicit packet type in SSE/PacketMath pldexp
  • Commit a4159dba0: do not read buffers out of bounds -- load only the 4 bytes we know exist here. Could also have done a vld1_lane_f32 but doing so here, without the overhead of initializing the unused lane, would have triggered used-of-uninitialized-value errors in tools such as ASan. Note that this code is sub-optimal before or after this change: we should be reading either 2 or 4 float32 values per load-instruction (2 for ARM in-order cores with an affinity for 8-byte loads; 4 for ARM out-of-order cores able to dual-issue 16-byte load instructions with arithmetic instructions). Before or after this patch, we are only loading 4 bytes of useful data here (even if before this patch, we were technically loading 8, only to use only the 4 first).
  • Commit b131a4db2: bug #1631: fix compilation with ARM NEON and clang, and cleanup the weird pshiftright_and_cast and pcast_and_shiftleft functions.
  • Commit a1a5fbbd2: Update pshiftleft to pass the shift as a true compile-time integer.
  • Commit fa7fd61ed: Unify SSE/AVX psin functions. It is based on the SSE version which is much more accurate, though very slightly slower. This changeset also includes the following required changes: - add packet-float to packet-int type traits - add packet float<->int reinterpret casts - add faster pselect for AVX based on blendv
  • Commit 08edbc8cf: Merged in bjacob/eigen/fixbuild (pull request PR-549)
  • Commit 7b1cb8a44: fix the build on 64-bit ARM when NEON is disabled
  • Commit b5695a600: Unify Altivec/VSX pexp(double) with default implementation
  • Commit 7655a8af6: cleanup
  • Commit 502f92fa1: Unify SSE and AVX pexp for double.
  • Commit 4a347a005: Unify NEON's pexp with generic implementation
  • Commit 5c8406bab: Unify Altivec/VSX's pexp with generic implementation
  • Commit cf8b85d5c: Unify SSE and AVX implementation of pexp
  • Commit c2f35b1b4: Unify Altivec/VSX's plog with generic implementation, and enable it!
  • Commit c24e98e6a: Unify NEON's plog with generic implementation
  • Commit 2c44c4011: First step toward a unification of packet log implementation, currently only SSE and AVX are unified. To this end, I added the following functions: pzero, pcmp_*, pfrexp, pset1frombits functions.
  • Commit 5f6045077: Make SSE/AVX pandnot(A,B) consistent with generic version, i.e., "A and not B"
  • Commit 382279eb7: Extend unit test to recursively check half-packet types and non packet types
  • Commit 0836a715d: bug #1611: fix plog(0) on NEON
  • Commit 95566eeed: Fix typos
  • Commit e3b22a6bd: merge
  • Commit ccabdd88c: Fix reserved usage of double __ in macro names
  • Commit 572d62697: check two ctors
  • Commit 354f14293: Fix double = bool !
  • Commit a7842daef: Fix several uninitialized member from ctor
  • Commit ea60a172c: Add default constructor to Bar to make test compile again with clang-3.8
  • Commit 806352d84: Small typo found by Patrick Huber (pull request PR-547)
  • Commit a47605487: bug #1624: improve matrix-matrix product on ARM 64, 20% speedup
  • Commit c685fe983: Move regression test to right unit test file
  • Commit 4b2cebade: Workaround weird MSVC bug
  • Commit 0ec8afde5: Fixed most conversion warnings in MatrixFunctions module
  • Commit e7e6809e6: ROCm/HIP specific fixes + updates
  • Commit 6a510fe69: Make MaxPacketSize a true upper bound, even for fixed-size inputs
  • Commit 43c987b1c: Add explicit regression test for bug #1622
  • Commit 670d56441: PR 544: Set requestedAlignment correctly for SliceVectorizedTraversals
  • Commit 3dc084504: Fix typo in comment on EIGEN_MAX_STATIC_ALIGN_BYTES
  • Commit 7fddc6a51: typo
  • Commit 449f948b2: help doxygen linking to DenseBase::NullaryExpr
  • Commit 4263f23c2: Improve doc on multi-threading and warn about hyper-threading
  • Commit db529ae4e: doxygen does not like \addtogroup and \ingroup in the same line
  • Commit 72928a2c8: Merged in rmlarsen/eigen2 (pull request PR-543)
  • Commit cda479d62: Remove accidental changes.
  • Commit 719d9aee6: Add parallel memcpy to TensorThreadPoolDevice in Eigen, but limit the number of threads to 4, beyond which we just seem to be wasting CPU cycles as the threads contend for memory bandwidth.
  • Commit 77b447c24: Add optimized version of logistic function for float. As an example, this is about 50% faster than the existing version on Haswell using AVX.
  • Commit c81bdbdad: Add manual doc on STL-compatible iterators
  • Commit 010514691: Fix warning in c++03
  • Commit 93f9988a7: A few small fixes to a) prevent throwing in ctors and dtors of the threading code, and b) supporting matrix exponential on platforms with 113 bits of mantissa for long doubles.
  • Commit 784a3f13c: bug #1619: fix mixing of const and non-const generic iterators
  • Commit db9a9a12b: bug #1619: make const and non-const iterators compatible
  • Commit fbd6e7b02: add missing ref to a.zeta(b)
  • Commit dffd1e11d: Limit the size of the toc
  • Commit a88e0a0e9: Update doxy hacks wrt doxygen 1.8.13/14
  • Commit bd9a00718: Let doxygen see lastN
  • Commit d7c644213: Add and update manual pages for slicing, indexing, and reshaping.
  • Commit a36884847: Recent xcode versions do support EIGEN_HAS_STATIC_ARRAY_TEMPLATE
  • Commit f62a0f69c: Fix max-size in indexed-view
  • Commit bf495859f: Merged in glchaves/eigen (pull request PR-539)
  • Commit 995730fc6: Add option to disable plot generation
  • Commit 4ad359237: Vectorize row-by-row gebp loop iterations on 16 packets as well
  • Commit 9d318b92c: add unit tests for bug #1619
  • Commit 8d7a73e48: bug #1617: Fix SolveTriangular.solveInPlace crashing for empty matrix. This made FullPivLU.kernel() crash when used on the zero matrix. Add unit test for FullPivLU.kernel() on the zero matrix.
  • Commit 66b28e290: bug #1618: Use different power-of-2 check to avoid MSVC warning
  • Commit 07fcdd143: Merged in ezhulenev/eigen-02 (pull request PR-534)
  • Commit 8a977c1f4: Fix cxx11_tensor_{block_access, reduction} tests
  • Commit fb62d6d96: Fix typo in tutorial documentation.
  • Commit b5f077d22: Document EIGEN_NO_IO preprocessor directive
  • Commit 4a40b3785: Collapsed revision (based on pull request PR-325) * Support compiling without IO streams
  • Commit 14054e217: Do not rely on the compiler generating __device__ functions for constexpr in Cuda (via EIGEN_CONSTEXPR_ARE_DEVICE_FUNC). This breaks several targets in the TensorFlow Cuda build, e.g.,
  • Commit 954b4ca9d: Suppress compiler warning about unused global variable.
  • Commit 9caafca55: Merged in rmlarsen/eigen (pull request PR-532)
  • Commit 449ff7467: Fix most Doxygen warnings. Also add links to stable documentation from unsupported modules (by using the corresponding Doxytags file). Manually grafted from d107a371c61b764c73fd1570b1f3ed1c6400dd7e
  • Commit 39fec15d5: Merged eigen/eigen into default
  • Commit 40fa6f98b: bug #1606: Explicitly set the standard before find_package(StandardMathLibrary). Also replace EIGEN_COMPILER_SUPPORT_CXX11 in favor of EIGEN_COMPILER_SUPPORT_CPP11. Grafted manually from a4afa90d161faab385a77f0e2764fb13ff3b9484
  • Commit d8f285852: Only set EIGEN_CONSTEXPR_ARE_DEVICE_FUNC for clang++ if cxx_relaxed_constexpr is available.
  • Commit dda68f56e: Fix GPU build due to gpu_assert not always being defined.
  • Commit 1dcf5a6ed: fix typo in doc
  • Commit 9e96e9193: Move from rvalue arguments in ThreadPool enqueue* methods
  • Commit 217d83981: Reduce thread scheduling overhead in parallelFor
  • Commit d52763bb4: Merged in ezhulenev/eigen-02 (pull request PR-528)
  • Commit 0f780bb0b: Fix float-to-double warning
  • Commit 900c7c61b: Check if it's allowed to squeeze inner dimensions in TensorBlockIO
  • Commit a39e0f743: bug #1612: fix regression in "outer-vectorization" of partial reductions for PacketSize==1 (aka complex<double>)
  • Commit e3b85771d: Show call stack in case of failing sparse solving.
  • Commit d2d570c11: Remove useless (and broken) resize
  • Commit f0fb95135: Iterative solvers: unify and fix handling of multiple rhs. m_info was not properly computed and the logic was repeated in several places.
  • Commit 2747b98cf: DGMRES: fix null rhs, fix restart, fix m_isDeflInitialized for multiple solve
  • Commit d835a0bf5: relax number of iterations checks to avoid false negatives
  • Commit 3a33db4de: merge
  • Commit 0ed811a9c: Suppress unused variable compiler warning in sparse subtest 3.
  • Commit aa110e681: PR 526: Speed up multiplication of small, dynamically sized matrices
  • Commit d9392f9e5: Fix code format
  • Commit 118520f04: Workaround nvcc+msvc compiler bug
  • Commit 24dc07651: Explicitly convert 0 to Scalar for custom types
  • Commit 8214cf189: Make sparse_basic includable from sparse_extra, but disable it since sparse_basic(DynamicSparseMatrix) does not compile at all anyways
  • Commit 43633fbab: Fix warning with AVX512f
  • Commit 97e2c808e: Fix avx512 plog(NaN) to return NaN instead of +inf
  • Commit b3f66d29a: Enable avx512 plog with clang
  • Commit 2ef1b3967: Relaxed fastmath unit test: if std::foo fails, then let's only trigger a warning if numext::foo fails too. A true error will be triggered only if std::foo works but our numext::foo fails.
  • Commit 1d5a6363e: relax numerical tests from equal to approx (x87)
  • Commit f0aa7e40f: Fix regression in changeset 5335659c47d69d3ee1b6f9792fea5998731f9a53
  • Commit ce243ee45: bug #520: add diagmat +/- diagmat operators.
  • Commit 5335659c4: Merged in ezhulenev/eigen-02 (pull request PR-525)
  • Commit eec0dfd68: bug #632: add specializations for res ?= dense +/- sparse and res ?= sparse +/- dense. They are rewritten as two compound assignments to by-pass hybrid dense-sparse iterator.
  • Commit 8e6dc2c81: Fix bug in partial reduction of expressions requiring evaluation
  • Commit 76ceae49c: bug #1609: add inplace transposition unit test
  • Commit 2bf1a31d8: Use void type if stl-style iterators are not supported
  • Commit f3130ee1b: Avoid empty macro arguments
  • Commit e8918743c: Merged in ezhulenev/eigen-01 (pull request PR-523)
  • Commit befcac883: Hide stl-container detection test under #if
  • Commit c0ca8a9fa: Compile time detection for unimplemented stl-style iterators
  • Commit 1dd1f8e45: bug #65: add vectorization of partial reductions along the outer-dimension, for instance: colmajor_mat.rowwise().mean()
  • Commit bfa2a81a5: Make redux_vec_unroller more flexible regarding packet-type
  • Commit c0c3be26e: Extend unit tests for partial reductions
  • Commit 3f2c8b7ff: Fix a lot of Doxygen warnings in Tensor module
  • Commit f6359ad79: Small Doxygen fixes
  • Commit 7a882c05a: Fix compilation on CUDA
  • Commit 93a6192e9: fix mpreal for mpfr<4.0.0
  • Commit d16634c4d: Fix out-of bounds access in TensorArgMax.h.
  • Commit 1a737e1d6: Fix contraction test.
  • Commit e00487f7d: bug #1603: add parenthesis around ternary operator in function body as well as a harmless attempt to make MSVC happy.
  • Commit 2eda9783d: typo
  • Commit c6e2dde71: fix c++11 deprecated warning
  • Commit 6cc9b2c83: fix warning in mpreal.h
  • Commit 649d4758a: merge
  • Commit aa5820056: Unify c++11 usage in doc's examples and snippets
  • Commit e29bfe847: Update included mpreal header to 3.6.5 and fix deprecated warnings.
  • Commit 64b1a1531: Workaround stupid warning
  • Commit c9643f4a6: Disable C++11 deprecated warning when limiting Eigen to C++98
  • Commit 774bb9d6f: fix a doxygen issue
  • Commit 6c3f6cd52: Fix maybe-uninitialized warning
  • Commit bcb7c66b5: Workaround gcc's alloc-size-larger-than= warning
  • Commit 16b2001ec: Fix gcc 8.1 warning: "maybe use uninitialized"
  • Commit 6512c5e13: Implement a better workaround for GCC's bug #87544
  • Commit 409132bb8: Workaround gcc bug making it trigger an invalid warning
  • Commit c6a1ab403: Workaround MSVC compilation issue
  • Commit e21766c6f: Clarify doc of rowwise/colwise/vectorwise.
  • Commit d92f004ab: Simplify API by removing allCols/allRows and reusing rowwise/colwise to define iterators over rows/columns
  • Commit 91613bf2c: Add support for c++11 snippets
  • Commit 3e64b1fc8: Move iterators to internal, improve doc, make unit test c++03 friendly
  • Commit 2b2b4d058: fix unused warning
  • Commit 8a1e98240: add unit tests
  • Commit 5f26f5759: Change the logic of A.reshaped<Order>() to be a simple alias to A.reshaped<Order>(AutoSize,fix<1>). This means that now AutoOrder is allowed, and it always returns a column-vector.
  • Commit 0481900e2: Add pointer-based iterator for direct-access expressions
  • Commit c5f1d0a72: Fix shadow warning
  • Commit b92c71235: Move struct outside of method for C++03 compatibility.
  • Commit 051f9c1af: Make code compile in C++03 mode again
  • Commit b786ce8c7: Fix conversion warning ... again
  • Commit 8c3852816: Factorize RowsProxy/ColsProxy and related iterators using subVector<>(Index)
  • Commit 12487531c: Add templated subVector<Vertical/Horizontal>(Index) aliases to col/row(Index) methods (plus subVectors<>() to retrieve the number of rows/columns)
  • Commit 37e29fc89: Use Index instead of ptrdiff_t or int, fix random-accessors.
  • Commit de2efbc43: bug #1605: workaround ABI issue with vector types (aka __m128) versus scalar types (aka float)
  • Commit b0c66adfb: bug #231: initial implementation of STL iterators for dense expressions
  • Commit 564ca71e3: Merged in deven-amd/eigen/HIP_fixes (pull request PR-518)
  • Commit 94898488a: This commit contains the following (HIP specific) updates:
  • Commit 2088c0897: Merged eigen/eigen into default
  • Commit 31629bb96: Get rid of unused variable warning.
  • Commit bb13d5d91: Fix bug in copy optimization in Tensor slicing.
  • Commit 104e8fa07: Fix a few warnings and rename a variable to not shadow "last".
  • Commit 7c1b47840: Merged in ezhulenev/eigen-01 (pull request PR-514)
  • Commit 524c81f3f: Add tests for evalShardedByInnerDim contraction + fix bugs
  • Commit 86ba50be3: Fix integer conversion warnings
  • Commit e95696acb: Optimize TensorBlockCopyOp
  • Commit 9f33e71e9: Revert code lost in merge
  • Commit a7a3e9f2b: Merge with eigen/eigen default
  • Commit 9f4988959: Remove explicit mkldnn support and redundant TensorContractionKernelBlocking
  • Commit 1e5750a5b: Merged in rmlarsen/eigen4 (pull request PR-511)
  • Commit af3ad4b51: oops, I've been too fast in previous copy/paste
  • Commit 24b163a87: #pragma GCC diagnostic push/pop is not supported prior to gcc 4.6
  • Commit b314376f9: Test mkldnn pack for doubles
  • Commit 22ed98a33: Conditionally add mkldnn test
  • Commit d956204ab: Remove "false &&" left over from test.
  • Commit 3815aeed7: Parallelize tensor contraction over the inner dimension in cases where one or both of the outer dimensions (m and n) are small but k is large. This speeds up individual matmul microbenchmarks by up to 85%.
  • Commit 71cd3fbd6: Support multiple contraction kernel types in TensorContractionThreadPool
  • Commit 0a3356f4e: Don't deactivate BVH test for clang (probably, this was failing for very old versions of clang)
  • Commit 41c3a2ffc: Fix documentation of reshape to vectors.
  • Commit 2c083ace3: Provide EIGEN_OVERRIDE and EIGEN_FINAL macros to mark virtual function overrides
  • Commit 626942d9d: fix alignment issue in ploaddup for AVX512
  • Commit 84a1101b3: Merge with default.
  • Commit 795e12393: Fix logic in diagonal*dense product in a corner case. The problem was for: diag(1x1) * mat(1,n)
  • Commit bac36d099: Demangle Traversal and Unrolling in Redux
  • Commit c696dbcaa: Fix shadowing of last and all
  • Commit e3c828904: Replace unused PREDICATE by corresponding STATIC_ASSERT
  • Commit 1bf12880a: Add reshaped<>() shortcuts when returning vectors and remove the reshaping version of operator()(all)
  • Commit 4291f167e: Add missing plugins to DynamicSparseMatrix -- fix sparse_extra_3
  • Commit 03a0cb2b7: fix unalignedcount for avx512
  • Commit 371068992: Add more debug output
  • Commit 91716f03a: Fix vectorization logic unit test for AVX512
  • Commit b00e48a86: Improve slice-vectorization logic for redux (significant speed-up for reduxion of blocks)
  • Commit a488d5978: merge with default Eigen
  • Commit 47720e797: Doc fixes
  • Commit 3ec298591: Merged indexing cleanup (pull request PR-506)
  • Commit 651e5d486: Fix EIGEN_MAKE_ALIGNED_OPERATOR_NEW_IF_VECTORIZABLE_FIXED_SIZE for AVX512 or AVX with malloc aligned on 8 bytes only. This change also makes it future proof for AVX1024
  • Commit 719e438a2: Collapsed revision * Split cxx11_tensor_executor test * Register test parts with EIGEN_SUFFIXES * Fix EIGEN_SUFFIXES in cxx11_tensor_executor test
  • Commit f0ef3467d: Fix doc
  • Commit 617f75f11: Add indexing namespace
  • Commit 0c56d22e2: Fix shadowing
  • Commit 8e2be7777: Merged eigen/eigen into default
  • Commit 5d2e75932: Initialize BlockIteratorState in a C++03 compatible way.
  • Commit e04faca93: merge
  • Commit d37188b9c: Fix MPrealSupport
  • Commit 3c6dc93f9: Fix GPU support.
  • Commit e0f6d352f: Rename test/array.cpp to test/array_cwise.cpp to avoid conflicts with the array header.
  • Commit eeeb18814: Fix warning
  • Commit 9419f506d: Fix regression introduced by the previous fix for AVX512. It broke the complex-complex case on SSE.
  • Commit a0166ab65: Workaround for spurious "array subscript is above array bounds" warnings with g++4.x
  • Commit e38d1ab4d: Workaround increases required alignment warning
  • Commit c50250cb2: Avoid warning "suggest braces around initialization of subobject". This test is not run in C++03 mode, so no compatibility is lost.
  • Commit 71496b0e2: Fix gebp kernel for real+complex in case only reals are vectorized (e.g., AVX512). This commit also removes "half-packet" from data-mappers: it was not used and conceptually broken anyways.
  • Commit 5a30eed17: Fix warnings in AVX512
  • Commit 2cf6d3050: Disable ignoring attributes warning
  • Commit 44d827438: Cast to longer type.
  • Commit d638b62dd: Silence compiler warning.
  • Commit db9c9df59: Silence more compiler warnings.
  • Commit febd09dcc: Silence compiler warnings in ThreadPoolInterface.h.
  • Commit c3a19527a: Fix doc wrt previous change
  • Commit dfa8439e4: Update reshaped API to use RowMajor/ColMajor directly as integral values instead of introducing RowOrder/ColOrder types. The API changed from A.reshaped(rows,cols,RowOrder) to A.template reshaped<RowOrder>(rows,cols).
  • Commit f67b19a88: Misc. typos. Found via `codespell -q 3 -I ../eigen-word-whitelist.txt`, where the whitelist consists of: als ans cas dum lastr lowd nd overfl pres preverse substraction te uint whch
  • Commit 297ca6231: ease transition by adding placeholders::all/last/end as deprecated
  • Commit 2014c7ae2: Move all, last, end from Eigen::placeholders namespace to Eigen::, and rename end to lastp1 to avoid conflicts with std::end.
  • Commit 82772e8d9: Rename Symbolic namespace to symbolic to be consistent with numext namespace
  • Commit 400512bfa: Merged in ezhulenev/eigen-02 (pull request PR-501)
  • Commit c4627039a: Support static dimensions (aka IndexList) in Tensor::resize(...)
  • Commit 3e8188fc7: bug #1600: initialize m_info to InvalidInput by default, even though m_info is not accessible until it has been initialized (assert)
  • Commit 218a7b984: Enable DSizes type promotion with c++03 compilers
  • Commit 1f0c941c3: Collapsed revision * Merged eigen/eigen into default
  • Commit 03a88c57e: Merged in ezhulenev/eigen-02 (pull request PR-498)
  • Commit 5ca0e4a24: Merged in ezhulenev/eigen-01 (pull request PR-497)
  • Commit a5cd4e9ad: Replace deprecated Eigen::DenseIndex with Eigen::Index in TensorIndexList
  • Commit b311bfb75: bug #1596: fix inclusion of Eigen's header within unsupported modules.
  • Commit 72f19c827: typo
  • Commit 66f056776: Add DSizes index type promotion
  • Commit f313126da: Fix warnings in IndexList array_prod
  • Commit 42705ba57: Fix weird error for building with g++-4.7 in C++03 mode.
  • Commit c2383f95a: Merged in ezhulenev/eigen/fix_dsizes (pull request PR-494)
  • Commit 30290cdd5: Merged in ezhulenev/eigen/moar_eigen_fixes_3 (pull request PR-493)
  • Commit f7d0053cf: Fix DSizes IndexList constructor
  • Commit 601e289d2: Merged in ezhulenev/eigen/moar_eigen_fixes_1 (pull request PR-492)
  • Commit 71070a1e8: Const cast scalar pointer in TensorSlicingOp evaluator
  • Commit 486337572: Explicitly construct tensor block dimensions from evaluator dimensions
  • Commit 14e35855e: Merged in chtz/eigen-maxsizevector (pull request PR-490)
  • Commit 281e63183: Merged in ezhulenev/eigen/indexlist_to_dsize (pull request PR-491)
  • Commit 1b8d70a22: Support reshaping with static shapes and dimensions conversion in tensor broadcasting
  • Commit 007f165c6: bug #1598: Let MaxSizeVector respect alignment of objects and add a unit test. Also revert 8b3d9ed081fc5d4870290649853b19cb5179546e
  • Commit d7378aae8: Provide EIGEN_ALIGNOF macro, and give handmade_aligned_malloc the possibility for alignments larger than the standard alignment.
  • Commit 9b864cdb3: Merged in rmlarsen/eigen3 (pull request PR-480)
  • Commit d0eef5fe6: Don't use bracket syntax in ctor.
  • Commit 6313dde39: Fix merge error.
  • Commit 0db590d22: Backed out changeset 01197e44527941c95f9a63e4f60ab3a989f12cbe
  • Commit b3f4c067d: Merge
  • Commit 2b0701814: Enable vectorized version on GPUs. The underlying bug has been fixed.
  • Commit 53568e354: Merged in ezhulenev/eigen/tiled_evalution_support (pull request PR-444)
  • Commit 01197e445: Fix warnings
  • Commit 1141bcf79: Fix conjugate-gradient for very small rhs
  • Commit 7f3b17e40: MSVC 2015 supports c++11 thread-local-storage
  • Commit d138fe341: Fix static_assert in test to conform to the c++11 standard
  • Commit e289f44c5: Don't vectorize the MeanReducer unless pdiv is available.
  • Commit 55bb7e793: Merge with upstream eigen/default
  • Commit 81b38a155: Fix compilation of tiled evaluation code with c++03
  • Commit 5da960702: Merged eigen/eigen into default
  • Commit 46f88fc45: Use numerically stable tree reduction in TensorReduction.
  • Commit 4827bec77: LLT: correct doc and add missing reference for the return type of rankUpdate
  • Commit 3d057e045: Avoid compilation error in C++11 test when EIGEN_AVOID_STL_ARRAY is set.
  • Commit c6066ac41: Make param name and docs consistent for JacobiRotation::makeGivens
  • Commit edeee16a1: Fix build failures in matrix_power and matrix_exponential tests.
  • Commit c64fe9ea1: Updates to fix HIP-clang specific compile errors.
  • Commit 8b3d9ed08: Use padding instead of alignment attribute, which MaxSizeVector does not respect. This leads to undefined behavior and hard-to-trace bugs.
  • Commit 5927eef61: Enable std::result_of for msvc 2015 and later
  • Commit 3adece482: Fix misleading indentation of errorCode and make it loop-local
  • Commit 7e9c9fbb2: Disable type-limits warnings for g++ < 4.8
  • Commit ba2c8efdc: EIGEN_UNUSED is not supported by g++4.7 (and not portable)
  • Commit ff4e835d6: "sparse_product.cpp" must be included before "sparse_basic.cpp", otherwise EIGEN_SPARSE_CREATE_TEMPORARY_PLUGIN has no effect
  • Commit 023ed6b9a: Product of empty array must be 1 and not 0.
  • Commit c2f4e8c08: Fix integer conversion warning
  • Commit ddbc56438: Fixed a few more shadowing warnings when compiling with g++ (and c++03)
  • Commit 946c3e254: adding EIGEN_DEVICE_FUNC attribute to fix some GPU unit tests that are broken in HIP mode
  • Commit 7ec8b40ad: Collapsed revision * Separating SYCL math function. * Converting function overload to function specialisation. * Applying the suggested design.
  • Commit 20ba2eee6: gcc thinks this may not be initialized
  • Commit 73ca600bc: Fix numerous shadow-warnings for GCC<=4.8
  • Commit ef4d79fed: Disable/ReenableStupidWarnings did not work properly, when included recursively
  • Commit befaf83f5: bug #1590: fix collision with some system headers defining the macro FP32
  • Commit 42f3ee4fb: Old gcc versions have problems with recursive #pragma GCC diagnostic push/pop. Workaround: Don't include "DisableStupidWarnings.h" before including other main-headers
  • Commit c144bb355: Merge with upstream eigen/default
  • Commit 574728867: Disable a bonus unit-test which is broken with gcc 4.7
  • Commit d5ed64512: bug #1573: workaround gcc 4.7 and 4.8 bug
  • Commit b1653d159: Fix some trivial C++11 vs C++03 compatibility warnings
  • Commit 42123ff38: Make unit test C++03 compatible
  • Commit 4b1ad086b: Fix shadow warnings in doc-snippets
  • Commit 117bc5d50: Fix some shadow warnings
  • Commit f155e97ad: Previous fix broke compilation for clang
  • Commit 209b4972e: Fix conversion warning
  • Commit 495f6c3c3: Fix missing-braces warnings
  • Commit 5aaedbece: Fixed more sign-compare and type-limits warnings
  • Commit 8295f02b3: Hide "maybe uninitialized" warning on gcc
  • Commit f7675b826: Fix several integer conversion and sign-compare warnings
  • Commit 949b0ad9c: Merged in rmlarsen/eigen3 (pull request PR-468)
  • Commit 744e2fe0d: Address comments about EIGEN_THREAD_LOCAL.
  • Commit ad4a08fb6: Use Intel cast intrinsics, since MSVC does not allow direct casting. Reported by David Winkler.
  • Commit 8d9bc5cc0: Fix g++ compilation.
  • Commit e9f9d7061: Don't rely on __has_feature for g++. Don't use __thread. Only use thread_local for gcc 4.8 or newer.
  • Commit 668690978: Pad PerThread when we emulate thread_local to prevent false sharing.
  • Commit 6cedc5a9b: rename mu.
  • Commit 6e0464004: Store std::unique_ptr instead of raw pointers in per_thread_map_.
  • Commit e51d9e473: Protect #undef max with #ifdef max.
  • Commit d35880ed9: merge
  • Commit a709c8efb: Replace pointers by values or unique_ptr for better leak-safety
  • Commit 39335cf51: Make MaxSizeVector leak-safe
  • Commit ff8e0ecc2: Updated one more line of code to avoid making the test dependent on cxx11 features.
  • Commit 43d9dd9b2: Removed more dependencies on cxx11.
  • Commit f76c80297: Add missing empty line
  • Commit 41f1cc67b: Assertion depended on a not yet initialized value
  • Commit 4713465ee: Silence double-promotion warning
  • Commit 595cae9b0: Silence logical-op-parentheses warning
  • Commit c9b25fbef: Silence unused parameter warning
  • Commit dbdeceabd: Silence double-promotion warning (when converting double to complex<long double>)
  • Commit 19df4d575: Merged in codeplaysoftware/eigen-upstream-pure/Pointer_type_creation (pull request PR-461)
  • Commit f641cf125: Adding missing at method in Eigen::array
  • Commit ede580ccd: Avoid using the auto keyword to make the tensor block access test more portable
  • Commit e23c8c294: Use actual types instead of the auto keyword to make the code more portable
  • Commit 80f1a76de: removing the noises.
  • Commit d0b01ebbf: Reverting the unintended delete from the code.
  • Commit 161dcbae9: Using PointerType struct and specializing it per device for TensorCustomOp.h
  • Commit f197c3f55: Removed an unused variable (PacketSize) from TensorExecutor
  • Commit 418155690: Fixed the tensor contraction code.
  • Commit b6f96cf7d: Removed dependencies on cxx11 language features from the tensor_block_access test
  • Commit fbb834144: Fixed more compilation errors
  • Commit 6bb3f1b43: Made the tensor_block_access test compile again
  • Commit 43ec0082a: Made the kronecker_product test compile again
  • Commit ab3f48114: Cleaned up the code and make it compile with more compilers
  • Commit fa0bcbf23: merge
  • Commit 15d4f515e: Use plain_assert in destructors to avoid throwing in CXX11 tests where main.h overwrites eigen_assert with a throwing version.
  • Commit aebdb0642: Fix a few compiler warnings in CXX11 tests.
  • Commit 2a98bd9c8: Merged eigen/eigen into default
  • Commit 59bba77ea: Fixed compilation errors with gcc 4.7 and 4.8
  • Commit a97aaa2bc: Merge with upstream.
  • Commit 8ba799805: Merge with upstream
  • Commit 6d6e7b702: merge
  • Commit 9bb75d8d3: Add Barrier.h.
  • Commit 2e1adc032: Merged eigen/eigen into default
  • Commit 8278ae631: Add support for thread local support on platforms that do not support it through emulation using a hash map.
  • Commit 501be70b2: Code cleanup
  • Commit 3d3711f22: Fixed compilation errors.
  • Commit 3ec60215d: Merged in rmlarsen/eigen2 (pull request PR-466)
  • Commit 0f1b2e08a: Call logistic functor from Tensor::sigmoid.
  • Commit d6e283ba9: sigmoid -> logistic
  • Commit 26239ee58: Use NULL instead of nullptr to avoid adding a cxx11 requirement.
  • Commit 3810ec228: Don't use the auto keyword since it's not always supported properly.
  • Commit e6d5be811: Fixed syntax of nested templates chevrons to make it compatible with c++98 mode.
  • Commit 1aa86aad1: Merge with upstream.
  • Commit 35d90e896: Fix BlockAccess enum in CwiseUnaryOp evaluator
  • Commit 855b68896: Merge with eigen/default
  • Commit f2209d06e: Add block evaluation to CwiseUnaryOp and add PreferBlockAccess enum to all evaluators
  • Commit c8ea39867: Avoided language features that are only available in cxx11 mode.
  • Commit 4be428622: Made the code compile with gcc 5.4.
  • Commit eabc7a403: PR 465: Fix issue in RowMajor assignment in plain_matrix_type_row_major::type
  • Commit c49e93440: SuiteSparse defines the macro SuiteSparse_long to control what type is used for 64bit integers. The default value of this macro on non-MSVC platforms is long and __int64 on MSVC. CholmodSupport defaults to using long for the long variants of CHOLMOD functions. This creates problems when SuiteSparse_long is different than long. So the correct thing to do here is to use SuiteSparse_long as the type instead of long.
  • Commit 3a2e1b1fc: Merge with upstream.
  • Commit bfc5091dd: Cast diagonalSize to RealScalar instead of Scalar.
  • Commit 8603d8002: Cast diagonalSize() to Scalar before multiplication. Without this, automatic differentiation in Ceres breaks because Scalar is a custom type that does not support multiplication by Index.
  • Commit cfaedb38c: Fix bug in a test + compilation errors
  • Commit ea8fa5e86: Merge with upstream
  • Commit 8c083bfd0: Properly fixing the PointerType for TensorCustomOp.h, as the output type here should be based on CoeffReturnType, not the Scalar type. Therefore, similar to the reduction and evalTo functions, it should have its own MakePointer class. In this case, for other devices the type defaults to CoeffReturnType and no changes are required in users' code. However, in SYCL, on the device, we can reconstruct the device type.
  • Commit 050bcf612: bug #1584: Improve random (avoid undefined behavior).
  • Commit 1c8b9e10a: Merged with upstream eigen
  • Commit 131ed1191: Merged in codeplaysoftware/eigen-upstream-pure/Fixing_compiler_warning (pull request PR-462)
  • Commit 1285c080b: Merged in codeplaysoftware/eigen-upstream-pure/disabling_assert_in_sycl (pull request PR-459)
  • Commit c4b2845be: Merged in rmlarsen/eigen3 (pull request PR-458)
  • Commit 7124172b8: Merged in codeplaysoftware/eigen-upstream-pure/EIGEN_UNROLL_LOOP (pull request PR-460)
  • Commit 532a0be05: Fixing compiler warning in TensorBlock.h as it was creating a lot of noise at compilation.
  • Commit 67711eaa3: Fixing typo.
  • Commit 3055e3a7c: Creating a pointer type in TensorCustomOp.h
  • Commit 22031ab59: Adding EIGEN_UNROLL_LOOP macro.
  • Commit 908b906d7: Disabling assert inside SYCL kernel.
  • Commit 693fb1d41: Fix init order.
  • Commit 10d286f55: Silenced a couple of compilation warnings.
  • Commit d011d05fd: Fixed compilation errors.
  • Commit 36e7e7dd8: Forward declare NoOpOutputKernel as struct rather than class to be consistent with implementation.
  • Commit fa68342ef: Move sigmoid functor to core.
  • Commit 09c81ac03: bug #1451: fix numeric_limits<AutoDiffScalar<Der>> with a reference as derivative type
  • Commit 43fd42a33: Fix doxy and misc. typos. Found via `codespell -q 3 -I ../eigen-word-whitelist.txt`
  • Commit 2cbd9dd49: cmake: Support source include with add_subdirectory and find_package use. This commit allows the sources of the project to be included in a parent project CMakeLists.txt and support use of "find_package(Eigen3 CONFIG REQUIRED)"
  • Commit a80a29007: Fix 'template argument uses local type'-warnings (when compiled in C++03 mode)
  • Commit 6dcd2642a: bug #1526 - CUDA compilation fails on CUDA 9.x SDK when arch is set to compute_60 and/or above
  • Commit edfb7962f: Use `static const int` instead of `enum` to avoid numerous `local-type-template-args` warnings in C++03 mode
  • Commit ec38f07b7: bug #1595: Don't use C++11's std::isnan() in MIPS/MSA packet math.
  • Commit 1b0373ae1: Replace all using declarations with typedefs in Tensor ops
  • Commit 7f8b53fd0: bug #1580: Fix cuda clang build. STL is not supported, so std::equal_to and std::not_equal breaks compilation. Update the definition of EIGEN_CONSTEXPR_ARE_DEVICE_FUNC to exclude clang. See also PR 450.
  • Commit bcb29f890: Fix initialization order.
  • Commit cf17794ef: Merged in codeplaysoftware/eigen-upstream-pure/SYCL-required-changes (pull request PR-454)
  • Commit 3074b1ff9: Fixing the compilation error.
  • Commit 225fa112a: Merge with upstream.
  • Commit 01358300d: Creating separate SYCL required PR for uncontroversial files.
  • Commit 2bf1cc8cf: Fix 256 bit packet size assumptions in unit tests.
  • Commit dd5875e30: Merged in codeplaysoftware/eigen-upstream-pure/constructor_error_clang (pull request PR-451)
  • Commit 113d8343d: Merged in codeplaysoftware/eigen-upstream-pure/Fixing_visual_studio_error_For_tensor_trace (pull request PR-452)
  • Commit 516d2621b: fixing compilation error for cxx11_tensor_trace.cpp error on Microsoft Visual Studio.
  • Commit 40d6d020a: Fixing ambiguous constructor error for Clang compiler.
  • Commit 62169419a: Fix two regressions introduced in previous merges: bad usage of EIGEN_HAS_VARIADIC_TEMPLATES and linking issue.
  • Commit 64abdf1d7: Fix typo + get rid of redundant member variables for block sizes
  • Commit 93b9e36e1: Merged in paultucker/eigen (pull request PR-431)
  • Commit 385b3ff12: Merged latest changes from upstream/eigen
  • Commit 17221115c: Merged in codeplaysoftware/eigen-upstream-pure/eigen_variadic_assert (pull request PR-447)
  • Commit 0360c3617: Merged in codeplaysoftware/eigen-upstream-pure/separating_internal_memory_allocation (pull request PR-446)
  • Commit c6a5c7071: Correcting the position of allocate_temp/deallocate_temp in TensorDeviceGpu.h
  • Commit 9ca1c0913: Merged in codeplaysoftware/eigen-upstream-pure/new-arch-SYCL-headers (pull request PR-448)
  • Commit 45f75f1ac: Merged in codeplaysoftware/eigen-upstream-pure/using_PacketType_class (pull request PR-449)
  • Commit 90e632fd6: Merged in codeplaysoftware/eigen-upstream-pure/EIGEN_STRONG_INLINE_MACRO (pull request PR-445)
  • Commit af96018b4: Using the suggested modification.
  • Commit b512a9536: Enabling per device specialisation of packetsize.
  • Commit c84509d7c: Adding new arch/SYCL headers, used for SYCL vectorization.
  • Commit 3a197a60e: variadic version of assert which can take a parameter pack as its input.
  • Commit d7a841484: Distinguishing between internal memory allocation/deallocation from explicit user memory allocation/deallocation.
  • Commit 9e219bb3d: Converting ad-hoc inline keyword to EIGEN_STRONG_INLINE MACRO.
  • Commit 83c0a16ba: Add block evaluation support to TensorOps
  • Commit edf46bd7a: Merged in yuefengz/eigen (pull request PR-370)
  • Commit 385f7b8d0: Change getAllocator() to allocator() in ThreadPoolDevice.
  • Commit 6f5b126e6: Fix tensor contraction for AVX512 machines
  • Commit d6568425f: Close branch tiling_3.
  • Commit 678a0dcb1: Merged in ezhulenev/eigen/tiling_3 (pull request PR-438)
  • Commit 679eece87: Speedup trivial tensor broadcasting on GPU by enforcing unaligned loads. See PR 437.
  • Commit 723856dec: bug #1577: fix msvc compilation of unit test, msvc defines ptrdiff_t as long long
  • Commit 966c2a7bb: Rename Index to StorageIndex + use Eigen::Array and Eigen::Map when possible
  • Commit 6913221c4: Add tiled evaluation support to TensorExecutor
  • Commit 7b91c1120: bug #1578: Improve prefetching in matrix multiplication on MIPS.
  • Commit f5cace5e9: Fix two small typos in the documentation
  • Commit 34539c4af: Merged in rmlarsen/eigen1 (pull request PR-441)
  • Commit bc615e458: Re-enable FMA for fast sqrt functions
  • Commit 96b030a8e: Re-enable FMA for fast sqrt functions
  • Commit e47853262: Reduce the number of template specializations of classes related to tensor contraction to reduce binary size.
  • Commit 2ebcb911b: Add pcast packet op for NEON.
  • Commit 397b0547e: Disable static assertions only when necessary and disable double-promotion warnings in that case as well
  • Commit 5e79402b4: fix warnings for doc-eigen-prerequisites
  • Commit 5f79b7f9a: Removed several shadowing types and use global Index typedef everywhere
  • Commit 44ee20133: Rename variable which shadows class name
  • Commit 705f66a9c: Account for missing change on commit "Remove SimpleThreadPool and..."
  • Commit fd4fe7cbc: Fixed issue which made documentation not getting built anymore
  • Commit 636126ef4: Allow to filter out build-error messages
  • Commit d55efa6f0: TensorBlockIO
  • Commit 34a75c3c5: Initial support of TensorBlock
  • Commit 2c2de9da7: Merged in glchaves/eigen (pull request PR-433)
  • Commit 4ca3e48f4: fix typo
  • Commit c747cde69: Add lastN shortcuts to seq/seqN.
  • Commit 02eaaacbc: Move cxx11_tensor_uint128 test under an EIGEN_TEST_CXX11 guarded block
  • Commit 2bf864f1e: Disable type traits for stdlibc++ <= 4.9.3
  • Commit de7067193: Oopps, EIGEN_COMP_MSVC is not available before including Eigen.
  • Commit 56a750b6c: Disable optimization for sparse_product unit test with MSVC 2013, otherwise it takes several hours to build.
  • Commit d4afccde5: Add test coverage for ThreadPoolDevice optional allocator.
  • Commit c58b87472: PR430: Convert count to the reducer type in MeanReducer
  • Commit 2424e3b7a: Pass by const ref.
  • Commit 509a5fa77: Fix IsRelocatable without C++11
  • Commit 2ca259200: Fix determination of EIGEN_HAS_TYPE_TRAITS
  • Commit 5e5987996: Fix stupid error in Quaternion move ctor
  • Commit 4e9848fa8: Actually add optional Allocator* arg to ThreadPoolDevice().
  • Commit b3e7c9132: Add optional Allocator argument to ThreadPoolDevice constructor. When supplied, this allocator will be used in place of internal::aligned_malloc. This permits e.g. use of a NUMA-node specific allocator where the thread-pool is also restricted to a single NUMA-node.
  • Commit 40797dbea: bug #1572: use c++11 atomic instead of volatile if c++11 is available, and disable multi-threaded GEMM on non-x86 without c++11.
  • Commit add575748: Simplify handling and non-splitted tests and include split_test_helper.h instead of re-generating it. This also allows us to modify it without breaking existing build folder.
  • Commit 901c7d31f: Fix usage of EIGEN_SPLIT_LARGE_TESTS=ON: some unit tests, such as indexed_view have to be split unconditionally.
  • Commit f2b52f994: Add the cmake option "EIGEN_DASHBOARD_BUILD_TARGET" to control the build target in dashboard mode (e.g., ctest -D Experimental)
  • Commit 23d82c1ac: Merged in rmlarsen/eigen2 (pull request PR-422)
  • Commit a87cff20d: Fix GeneralizedEigenSolver when requesting for eigenvalues only.
  • Commit 3a9cf4e29: Get rid of alias for m_broadcast.
  • Commit 4222550e1: Optimize the case where broadcasting is a no-op.
  • Commit 4a3952fd5: Relax the condition to not only work on Android.
  • Commit 02a9443db: Clang produces incorrect Thumb2 assembler when using alloca. Don't define EIGEN_ALLOCA when generating Thumb with clang.
  • Commit 20991c320: bug #1571: fix is_convertible<from,to> with "from" a reference.
  • Commit 1920129d7: Remove clang warning
  • Commit 195c9c054: Print more debug info in gpu_basic
  • Commit 06eb24cf4: Introduce gpu_assert for assertion in device-code, and disable them with clang-cuda.
  • Commit 5fd03ddbf: Make EIGEN_TEST_CUDA_CLANG more friendly with OSX
  • Commit 86d9c0255: Forward declaring std::array does not work with all std libs, so let's just include <array>
  • Commit d908afe35: bug #1558: fix a corner case in MINRES when both v_new and w_new vanish.
  • Commit 6e654f337: Reduce number of allocations in TensorContractionThreadPool.
  • Commit 7ccb62374: bug #1569: fix Tensor<half>::mean() on AVX with respective unit test.
  • Commit 1f523e730: Add MIPS changes missing from previous merge.
  • Commit e3c2d6173: Assert that no output kernel is defined for GPU contraction
  • Commit 086ded5c8: Disable type traits for GCC < 5.1.0
  • Commit 79d4129cc: Specify default output kernel for TensorContractionOp
  • Commit 6e5a3b898: Add regression for bugs #1573 and #1575
  • Commit 863580fe8: bug #1432: fix conservativeResize for non-relocatable scalar types. For those we need to by-pass realloc routines and fall-back to allocate as new - copy - delete. The remaining problem is that we don't have any mechanism to accurately determine whether a type is relocatable or not, so currently let's be super conservative using either RequireInitialization or std::is_trivially_copyable
  • Commit 053ed97c7: Generalize ScalarWithExceptions to a full non-copyable and throwing scalar type to be used in other unit tests.
  • Commit a503fc872: bug #1575: fix regression introduced in bug #1573 patch. Move ctor/assignment should not be defaulted.
  • Commit 308725c3c: More clearly disable the inclusion of src/Core/arch/CUDA/Complex.h without CUDA
  • Commit 3875fb05a: Add support for MIPS SIMD (MSA)
  • Commit 44ea5f762: Add unit test for -Tensor<complex> on GPU
  • Commit 12e1ebb68: Remove local Index typedef from unit-tests
  • Commit 63185be8b: Disable eigenvalues test for clang-cuda
  • Commit bec013b2c: fix unused warning
  • Commit 5c73c9223: Fix shadowing typedefs
  • Commit 98728312c: Fix compilation regarding std::array
  • Commit eb3d8f68b: fix unused warning
  • Commit 006e18e52: Cleanup the mess in Eigen/Core by moving CUDA/HIP stuff at more appropriate places (Macros.h), and alignment/vectorization logic is now in util/ConfigureVectorization.h
  • Commit 9a6a43319: Fix cxx11_tensor_fft not building on Windows.
  • Commit b347eb0b1: Fix doc
  • Commit e79c5149b: Fix AVX512 implementations of psqrt
  • Commit 1eff6cf8a: Use device's allocate function instead of internal::aligned_malloc. This would make it easier to track memory usage in device instances.
  • Commit adb134d47: Fix implicit conversion from 0.0 to scalar
  • Commit 937ad1822: add unit test for SimplicialCholesky and Boost multiprec.
  • Commit 6d451cf2b: Add missing consts for rows and cols functions in SparseLU
  • Commit a12b8a8c7: FindEigen3: Set Eigen3_FOUND variable
  • Commit 8bdb214fd: remove double ;;
  • Commit a9060378d: bug #1570: fix warning
  • Commit 6cd6551b2: Add deprecated header files for TensorFlow
  • Commit da0c60407: Merged in deven-amd/eigen (pull request PR-402)
  • Commit a4ea611ca: Remove useless specialization thanks to is_convertible being more robust.
  • Commit 8a40dda5a: Add some basic unit-tests
  • Commit 8ef267ccb: spellcheck
  • Commit 21cf4a1a8: Make is_convertible more robust and conformant to std::is_convertible
  • Commit 8a5955a05: Optimize the product of a householder-sequence with the identity, and optimize the evaluation of a HouseholderSequence to a dense matrix using faster blocked product.
  • Commit d193cc87f: Fix regression in 9357838f94d2907996adadc7e5200376f3561ed4
  • Commit fb3368773: Fix double ;;
  • Commit 876f392c3: Updates corresponding to the latest round of PR feedback
  • Commit 1fe0b7490: deleting hip specific files that are no longer required
  • Commit dec47a649: renaming CUDA* to GPU* for some header files
  • Commit 471cfe5ff: renaming CUDA* to GPU* for some header files
  • Commit 38807a257: merging updates from upstream
  • Commit f00d08cc0: Optimize extraction of Q in SparseQR by exploiting the structure of the identity matrix.
  • Commit 162547609: Add internal::is_identity compile-time helper
  • Commit fe723d612: Fix conversion warning
  • Commit 9357838f9: bug #1543: improve linear indexing for general block expressions
  • Commit de9e31a06: Introduce the macro ei_declare_local_nested_eval to help allocating on the stack local temporaries via alloca, and let outer-products make good use of it. If successful, we should use it everywhere nested_eval is used to declare local dense temporaries.
  • Commit 6190aa563: bug #1567: add optimized path for tensor broadcasting and 'Channel First' shape
  • Commit ec323b7e6: Skip null numerators in triangular-vector-solve (as in BLAS TRSV).
  • Commit 359dd77ec: Fix legitimate "declaration shadows a typedef" warning
  • Commit e2b2c6153: merging from master
  • Commit 1bb6fa99a: merging the CUDA and HIP implementation for the Tensor directory and the unit tests
  • Commit cfdabbcc8: removing the *Hip files from the unsupported/Eigen/CXX11/src/Tensor and unsupported/test directories
  • Commit 7e41c8f1a: renaming *Cuda files to *Gpu in the unsupported/Eigen/CXX11/src/Tensor and unsupported/test directories
  • Commit ee73ae0a8: Merged eigen/eigen into default
  • Commit 90a53ca6f: Fix the Packet16h version of ptranspose
  • Commit 1f54164ec: Fix a few issues with Packet16h
  • Commit f2dc048df: complete implementation of Packet16h (AVX512)
  • Commit a937c5020: palign is not used anymore, so let's relax the unit test
  • Commit 56a33ae57: test product kernel with half-floats.
  • Commit f4d623ffa: Complete Packet8h implementation and test it in packetmath unit test
  • Commit a8ab6060d: Add unit tests for inverse and selfadjoint-eigenvalues on CUDA
  • Commit b8271bb36: fix md5sum of lapack_addons
  • Commit b6cc0961b: updates based on PR feedback
  • Commit ba972fb6b: moving Half headers from CUDA dir to GPU dir, removing the HIP versions
  • Commit d1d22ef0f: syncing this fork with upstream
  • Commit d3a380af4: Merged in mfigurnov/eigen/gamma-der-a (pull request PR-403)
  • Commit f7124b3e4: Extend CUDA support to matrix inversion and selfadjointeigensolver
  • Commit 053712395: bug #1565: help MSVC to generate not too bad ASM in reductions.
  • Commit 6a241bd8e: Implement custom inplace triangular product to avoid a temporary
  • Commit 3ae2083e2: Make is_same_dense compatible with different scalar types.
  • Commit 67ec37f7b: Activate dgmres unit test
  • Commit 047677a08: Fix regression in changeset f05dea6b2326836e5e0243fbaffbece84b833d64 : computeFromHessenberg can take any expression for matrixQ, not only a HouseholderSequence.
  • Commit d62556493: Simplify redux_evaluator using inheritance, and properly rename parameters in reducers.
  • Commit d428a199a: bug #1562: optimize evaluation of small products of the form s*A*B by rewriting them as: s*(A.lazyProduct(B)) to save a costly temporary. Measured speedup from 2x to 5x...
  • Commit a7b313a16: Fix unit test
  • Commit 0cdacf3fa: update comment
  • Commit 54f6eeda9: Merged in net147/eigen (pull request PR-411)
  • Commit 9a81de1d3: Fix order of EIGEN_DEVICE_FUNC and returned type
  • Commit b7689bded: Use std::complex constructor instead of assignment from scalar
  • Commit f9d337780: First step towards a generic vectorised quaternion product
  • Commit ee5864f72: bug #1560 fix product with a 1x1 diagonal matrix
  • Commit 2f62cc68c: merge
  • Commit bda71ad39: Fix typo in pbend for AltiVec.
  • Commit b6ffcd22e: Merged in rmlarsen/eigen2 (pull request PR-409)
  • Commit 4cc32d80f: bug #1555: compilation fix with XLC
  • Commit 5418154a4: Fix oversharding bug in parallelFor.
  • Commit cb4c9a6a9: bug #1531: make dedicated unit testing for NumDimensions
  • Commit d6813fb1c: bug #1531: expose NumDimensions for solve and sparse expressions.
  • Commit 89d65bb9d: bug #1531: expose NumDimensions for compatibility with Tensor
  • Commit f05dea6b2: bug #1550: prevent avoidable memory allocation in RealSchur
  • Commit 7933267c6: fix prototype
  • Commit f4d146187: Fix the way matrix folder is passed to the tests.
  • Commit 522d3ca54: Don't use std::equal_to inside cuda kernels since it's not supported.
  • Commit 7d7bb9153: Missing line during manual rebase of PR-374
  • Commit 30fa3d045: Merge from eigen/eigen
  • Commit d2b0a4a59: Merged in mfigurnov/eigen/fix-bessel (pull request PR-404)
  • Commit 6c71c7d36: Merge from eigen/eigen.
  • Commit c25034710: Fix some warnings in dox examples
  • Commit 37348d03a: Fix int versus Index
  • Commit c723ffd76: Fix warning
  • Commit af7c83b9a: Fix warning
  • Commit 7fe29acee: Fix MSVC warning C4290: C++ exception specification ignored except to indicate a function is not __declspec(nothrow)
  • Commit aa813d417: Fix compilation of special functions without C99 math.
  • Commit 55774b48e: Fix short vs long
  • Commit e5f9f4768: Avoid unnecessary C++11 dependency
  • Commit b3fd93207: Fix typos found using codespell
  • Commit 5172a3284: Updated the stopping criteria in igammac_cf_impl.
  • Commit 4bd158fa3: Derivative of the incomplete Gamma function and the sample of a Gamma random variable.
  • Commit 8fbd47052: Adding support for using Eigen in HIP kernels.
  • Commit e206f8d4a: Merged in mfigurnov/eigen (pull request PR-400)
  • Commit e2ed0cf8a: Add a ThreadPoolInterface* getter for ThreadPoolDevice.
  • Commit 84868da90: Don't run hg on non mercurial clone
  • Commit f21685445: Exponentially scaled modified Bessel functions of order zero and one.
  • Commit 6af1433cb: Doc: add aliasing in common pitfalls.
  • Commit ea9454319: Hyperlink DOIs against preferred resolver
  • Commit 999b552c1: Search for sequential Pastix.
  • Commit eef4b7bd8: Fix handling of path names containing spaces and the like.
  • Commit 647b724a3: Define pcast<> for SSE types even when AVX is enabled. (otherwise floats are silently reinterpreted as int instead of being converted)
  • Commit 49262dfee: Fix compilation and SSE support with PGI compiler
  • Commit 750af0636: Add an option to test with external BLAS library
  • Commit d06a753d1: Make qr_fullpivoting unit test run for fixed-sized matrices
  • Commit f0862b062: Fix internal::is_integral<size_t/ptrdiff_t> with MSVC 2013 and older.
  • Commit 36e413a53: Workaround a MSVC 2013 compilation issue with MatrixBase(Index,int)
  • Commit 725bd9290: fix stupid typo
  • Commit a382bc936: is_convertible<T,Index> does not seem to work well with MSVC 2013, so let's rather use __is_enum(T) for old MSVC versions
  • Commit 4dd767f45: add some internal checks
  • Commit 345c0ab45: check that all integer types are properly handled by mat(i,j)
  • Commit 405859f18: Set EIGEN_IDEAL_MAX_ALIGN_BYTES correctly for AVX512 builds
  • Commit 6293ad3f3: Performance improvements to tensor broadcast operation 1. Added new packet functions using SIMD for NByOne, OneByN cases 2. Modified existing packet functions to reduce index calculations when input stride is non-SIMD 3. Added 4 test cases to cover the new packet functions
  • Commit 7134fa7a2: Fix compilation with MSVC by reverting to char* for _mm_prefetch except for PGI (the latter being the one that has the wrong prototype).
  • Commit e7147f69a: Add tests for sparseQR results (value and size) covering bugs #1522 and #1544
  • Commit b2053990d: Adding EIGEN_DEVICE_FUNC to Products, especially Dense2Dense Assignment specializations. Otherwise causes problems with small fixed size matrix multiplication (call to 0x00 in call_assignment_no_alias in debug mode or trap in release with CUDA 9.1).
  • Commit 9f0c5c366: Make sparse QR result sizes consistent with dense QR, with the following rules:
  • Commit d65590095: bug #1544: Generate correct Q matrix in complex case. Original patch was by Jeff Trull in PR-386.
  • Commit 0371380d5: Merged in rmlarsen/eigen2 (pull request PR-393)
  • Commit b8d36774f: Rename clip2 to clamp.
  • Commit 812480baa: Rename scalar_clip_op to scalar_clip2_op to prevent collision with existing functor in TensorFlow.
  • Commit 1403c2c15: Merged in didierjansen/eigen (pull request PR-360)
  • Commit ad355b3f0: Merged in rmlarsen/eigen2 (pull request PR-392)
  • Commit 0272f2451: Fix "suggest parentheses around comparison" warning
  • Commit afec3021f: Use numext::maxi & numext::mini.
  • Commit b8c8e5f43: Add vectorized clip functor for Eigen Tensors.
  • Commit 6118c6ff4: Enable RawAccess to tensor slices whenever possible. Avoid 32-bit integer overflow in TensorSlicingOp
  • Commit 6e7118265: Fix compilation with NEON+MSVC
  • Commit 097dd4616: Fix unit test for SIMD engine not supporting sqrt
  • Commit 8810baaed: Add multi-threading for sparse-row-major * dense-row-major
  • Commit 2f3287da7: Fix "used uninitialized" warnings
  • Commit 3ffd449ef: Workaround warning
  • Commit e8ca5166a: bug #1428: attempt to make NEON vectorization compilable by MSVC. The workaround is to wrap NEON packet types to make them different c++ types.
  • Commit 6f5935421: fix AVX512 plog
  • Commit e9da464e2: Add specializations of is_arithmetic for long long in c++11
  • Commit a57e6e5f0: workaround MSVC 2013 compilation issue (ambiguous call)
  • Commit 11123175d: typo in doc
  • Commit 5679e439e: bug #1543: fix linear indexing in generic block evaluation (this completes the fix in commit 12efc7d41b80259b996be5781bf596c249c90d3f )
  • Commit 35b31353a: Fix unit test
  • Commit 34e499ad3: Disable -Wshadow when compiling with g++
  • Commit b7b868d1c: fix AVX512 plog
  • Commit 686fb5723: fix const cast in NEON
  • Commit 02d2f1cb4: Cast zeros to Scalar in RealSchur
  • Commit 50633d1a8: Renamed .trans() et al. to .reverseFlag() et al. Adapted documentation of .setReverseFlag()
  • Commit 39c2cba81: Add a specialization of Eigen::numext::conj for std::complex<T> to be used when compiling a cuda kernel. This fixes the compilation of TensorFlow 1.4 with clang 6.0 used as CUDA compiler with libc++.
  • Commit 775766d17: Add parenthesis to fix compiler warnings
  • Commit 42715533f: bug #1493: Make representation of HouseholderSequence consistent and working for complex numbers. Made corresponding unit test actually test that. Also simplify implementation of QR decompositions
  • Commit c9ecfff2e: Add links where to make PRs and report bugs into README.md
  • Commit c8b19702b: Limit test size for sparse Cholesky solvers to EIGEN_TEST_MAX_SIZE
  • Commit 2cbb00b18: No need to make noise, if KLU is found
  • Commit 84dcd998a: Recent Adolc versions require C++11
  • Commit 4d392d93a: Make hypot_impl compile again for types with expression-templates (e.g., boost::multiprecision)
  • Commit 072e111ec: SelfAdjointView<...,Mode> causes a static assert since commit d820ab9edc0b38af4cdb3d545714a0c9083e5a78
  • Commit 7a9089c33: fix linking issue
  • Commit e43ca0320: bug #1520: workaround some -Wfloat-equal warnings by calling std::equal_to
  • Commit b0eda3cb9: Avoid using memcpy for non-POD elements
  • Commit 79266fec7: extend doxygen splitter for huge screens
  • Commit 426052ef6: Update header/footer for doxygen 1.8.13
  • Commit 9c8decffb: Fix javascript hacks for doxygen 1.8.13
  • Commit e79846687: bug #1538: update manual pages regarding BDCSVD.
  • Commit c91906b06: Umfpack: UF_long has been removed in recent versions of suitesparse, and fix a few long-to-int conversions issues.
  • Commit 0050709ea: Merged in v_huber/eigen (pull request PR-378)
  • Commit 8c1652055: Fix code sample output in block(int, int, int, int) doxygen
  • Commit 08008f67e: Add unitTest
  • Commit add15924a: Fix MKL backend for symmetric eigenvalues on row-major matrices.
  • Commit 04b1628e5: Add missing empty line.
  • Commit c2624c031: Fix cmake scripts with no fortran compiler
  • Commit 2f833b1c6: bug #1509: fix computeInverseWithCheck for complexes
  • Commit b903fa74f: Extend list of MSVC versions
  • Commit 403f09cce: Make stableNorm and blueNorm compatible with 2D matrices.
  • Commit 4213b63f5: Factorize code between numext::hypot and scalar_hypot_op functor.
  • Commit 368dd4cd9: Make innerVector() and innerVectors() methods available to all expressions supported by Block.
  • Commit e116f6847: bug #1521: avoid signalling NaN in hypot and make it std::complex<> friendly.
  • Commit 73729025a: bug #1521: add unit test dedicated to numext::hypot
  • Commit 13f5df9f6: Add a note on vec_min vs asm
  • Commit e91e31434: bug #1494: makes pmin/pmax behave on Altivec/VSX as on x86 regarding NaNs
  • Commit 112c89930: comment unreachable code
  • Commit a1292395d: Fix compilation of product with inverse transpositions (e.g., mat * Transpositions().inverse())
  • Commit 8c7b5158a: commit 45e9c9996da790b55ed9c4b0dfeae49492ac5c46 (HEAD -> memory_fix) Author: George Burgess IV <gbiv@google.com> Date: Thu Mar 1 11:20:24 2018 -0800
  • Commit dd4cc6bd9: bug #1527: fix support for MKL's VML (destination was not properly resized)
  • Commit c5b56f1fb: bug #1528: better use numeric_limits::min() instead of 1/highest(), which may underflow.
  • Commit 8d0ffe365: bug #1516: add assertion for out-of-range diagonal index in MatrixBase::diagonal(i)
  • Commit 407e3e262: bug #1532: disable stl::*_negate in C++17 (they are deprecated)
  • Commit 40b4bf3d3: AVX512: _mm512_rsqrt28_ps is available for AVX512ER only
  • Commit 584951ca4: Rename predux_downto4 to be more accurate on its semantic.
  • Commit 67bac6368: protect calls to isnan
  • Commit d43b2f01f: Fix unit testing of predux_downto4 (bad name), and add unit testing of prsqrt
  • Commit 7b0630315: AVX512: fix psqrt and prsqrt
  • Commit 6719409cd: AVX512: add missing pinsertfirst and pinsertlast, implement pblend for Packet8d, fix compilation without AVX512DQ
  • Commit 524119d32: Fix uninitialized output argument.
  • Commit 267a144da: Remove unnecessary define
  • Commit baf9a5a77: Add interface to umfpack_*l_* functions
  • Commit e3912f5e6: Misc. source and comment typos
  • Commit 5deeb19e7: bug #1517: fix triangular product with unit diagonal and nested scaling factor: (s*A).triangularView<UpperUnit>()*B
  • Commit 12efc7d41: Fix linear indexing in generic block evaluation.
  • Commit f4a6863c7: Fix typo
  • Commit 000840cae: Added a move constructor and move assignment operator to Tensor and wrote some tests.
  • Commit 3a2dc3869: Fix weird issue with MSVC 2013
  • Commit c95aacab9: Fix TensorContractionOp evaluators for GPU and SYCL
  • Commit 038b55464: Merged in deven-amd/eigen (pull request PR-425)
  • Commit f124f0796: applying EIGEN_DECLARE_TEST to *gpu* tests
  • Commit dff3a92d5: Remove usage of #if EIGEN_TEST_PART_XX in unit tests that do not require them (splitting can thus be avoided for them)
  • Commit 82f0ce272: Get rid of EIGEN_TEST_FUNC, unit tests must now be declared with EIGEN_DECLARE_TEST(mytest) { /* code */ }. This provides several advantages: - more flexibility in designing unit tests - unit tests can be glued to speed up compilation - unit tests are compiled with same predefined macros, which is a requirement for zapcc
  • Commit 37f4bdd97: Fix VERIFY_EVALUATION_COUNT(EXPR,N) with a complex expression as N
  • Commit 2b2cd8569: bug #1573: add noexcept move constructor and move assignment operator to Quaternion
  • Commit 43206ac4d: Call OutputKernel in evalGemv
  • Commit e204ecdaa: Remove SimpleThreadPool and always use {NonBlocking}ThreadPool
  • Commit b324ed55d: Call OutputKernel in evalGemv
  • Commit 01fd4096d: Fuse computations into the Tensor contractions using output kernel
  • Commit 5539587b1: Some warning fixes
  • Commit 8f55956a5: Update the padding computation for PADDING_SAME to be consistent with TensorFlow.
  • Commit 09a16ba42: bug #1412: fix compilation with nvcc+MSVC
  • Commit 5b3c36792: Fix typos in the contraction example of tensor README
  • Commit f558ad295: Fix incorrect ldvt in LAPACKE call from JacobiSVD
  • Commit 22de74aa7: Disable use of recurrence for computing twiddle factors.
  • Commit 73629f8b6: Fix gcc7 warning
  • Commit 59985cfd2: Disable use of recurrence for computing twiddle factors. Fixes FFT precision issues for large FFTs. https://github.com/tensorflow/tensorflow/issues/10749#issuecomment-354557689
  • Commit f9bdcea02: For cuda 9.1 replace math_functions.hpp with cuda_runtime.h
  • Commit 06bf1047f: Fix compilation of stableNorm with some expressions as input
  • Commit 73214c4bd: Workaround nvcc 9.0 issue. See PR 351. https://bitbucket.org/eigen/eigen/pull-requests/351
  • Commit 31e0bda2e: Fix cmake warning
  • Commit 26a2c6fc1: fix unit test
  • Commit 546ab97d7: Add possibility to overwrite EIGEN_STRONG_INLINE.
  • Commit 9c3aed9d4: Fix packet and alignment propagation logic of Block<Xpr> expressions. In particular, (A+B).col(j) lost vectorisation.
  • Commit 76c7dae60: ignore all *build* sub directories
  • Commit b2cacd189: fix header inclusion
  • Commit 3122477c8: Update the padding computation for PADDING_SAME to be consistent with TensorFlow.
  • Commit 393b7c495: Merged in ncluehr/eigen/float2half-fix (pull request PR-349)
  • Commit aefd5fd5c: Replace __float2half_rn with __float2half
  • Commit d0b028e17: clarify Pastix requirements
  • Commit 3587e481f: silent MSVC warning
  • Commit 3a327cd3c: Merged in ncluehr/eigen/predux_fp16_fix (pull request PR-348)
  • Commit dd6de618c: Fix incorrect integer cast in predux<half2>().
  • Commit 3dc6ff73c: Handle PGI compiler
  • Commit 599a88da2: Disable gcc-specific workaround for Clang to allow build with AVX512
  • Commit 672bdc126: bug #1479: fix failure detection in LDLT
  • Commit 624df5094: Adds missing EIGEN_STRONG_INLINE to support MSVC properly inlining small vector calculations
  • Commit 746a6b7b8: Merged in zzp11/eigen/zzp11/a-small-mistake-quickreferencedox-edited-1510217281963 (pull request PR-346)
  • Commit d2631ef61: Merged in facaiy/eigen/ENH/exp_support_complex_for_gpu (pull request PR-359)
  • Commit 8fcbd6d4c: Merged in dtrebbien/eigen (pull request PR-369)
  • Commit e900b010c: Improve robustness of igamma and igammac to bad inputs.
  • Commit f7d17689a: Add static assertion for fixed sizes Ref<>
  • Commit f6be7289d: Implement better static assertion checking to make sure that the first assertion is a static one and not a runtime one.
  • Commit d820ab9ed: Add static assertion on selfadjoint-view's UpLo parameter.
  • Commit 0c57be407: Move up the specialization of std::numeric_limits
  • Commit 42a833466: ENH: exp supports complex type for cuda
  • Commit 912e9965e: a small mistake QuickReference.dox edited online with Bitbucket
  • Commit 4c03b3511: Fix issue with boost::multiprec in previous commit
  • Commit e9d2888e7: Improve debugging tests and output in BDCSVD
  • Commit e8468ea91: Fix overflow issues in BDCSVD
  • Commit 394961517: Merged in JonasMu/eigen (pull request PR-329)
  • Commit 11ddac57e: Merged in guillaume_michel/eigen (pull request PR-334)
  • Commit a6d875bac: Removed unnecessary #include
  • Commit f16ba2a63: Merged in LaFeuille/eigen-1/LaFeuille/typo-fix-alignmeent-alignment-1505889397887 (pull request PR-335)
  • Commit ee6ad21b2: Merged in henryiii/eigen/henryiii/device (pull request PR-343)
  • Commit 9bb26eb8f: Restore `__device__`
  • Commit 4245475d2: Fixing missing inlines on device functions for newer CUDA cards
  • Commit 8eb4b9d25: Merged in benoitsteiner/opencl (pull request PR-341)
  • Commit 2dd63ed39: Merge
  • Commit f349507e0: Specialize ThreadPoolDevice::enqueueNotification for the case with no args. As an example this reduces binary size of a TensorFlow demo app for Android by about 2.5%.
  • Commit 688451409: Merged in mehdi_goli/upstr_benoit/ComputeCppNewReleaseFix (pull request PR-16)
  • Commit 0e6e027e9: check both z13 and z14 arches
  • Commit 6c3475f11: remove debugging
  • Commit df7644aec: Merged eigen/eigen into default
  • Commit 98e52cc77: rollback 374f750ad4708408a1255a98964719fd598b0659
  • Commit c4ad35856: explicitly set conjugate mask
  • Commit 380d41fd7: added some extra debugging
  • Commit d0b7b9d0d: some Packet2cf pmul fixes
  • Commit df173f562: initial pexp() for 32-bit floats, commented out due to vec_cts()
  • Commit 3dcae2a27: initial pexp() for 32-bit floats, commented out due to vec_cts()
  • Commit c2a224648: fix predux_mul for z14/float
  • Commit 374f750ad: eliminate 'enumeral and non-enumeral type in conditional expression' warning
  • Commit bc30305d2: complete z14 port
  • Commit 0e85a677e: bug #1472: fix warning
  • Commit 857919516: bug #1468 (1/2) : add missing std:: to memcpy
  • Commit f92567fec: Add link to a useful example.
  • Commit 7ad07fc6f: Update documentation for aligned_allocator
  • Commit 7c9b07dc5: Typo fix alignmeent ->alignment
  • Commit 2062ac995: Changes required for new ComputeCpp CE version.
  • Commit 23f8b00bc: clang provides __has_feature(is_enum) (but not <type_traits>) in C++03 mode
  • Commit 0c9ad2f52: std::integral_constant is not C++03 compatible
  • Commit 1b7294f6f: Fix cut-and-paste error.
  • Commit 94e2213b3: Avoid undefined behavior in Eigen::TensorCostModel::numThreads.
  • Commit 6d42309f1: Fix compilation of Vector::operator()(enum) by treating enums as Index
  • Commit ea4e65bf4: Fixed compilation with cuda_clang.
  • Commit a91918a10: Merged in infinitei/eigen (pull request PR-328)
  • Commit 9c353dd14: Add C++11 max_digits10 for half.
  • Commit b35d1ce4a: Implement true compile-time "if" for apply_rotation_in_the_plane. This fixes a compilation issue for vectorized real type with missing vectorization for complexes, e.g. AVX512.
  • Commit 80142362a: Fix mixing types in sparse matrix products.
  • Commit 810b70ad0: Merged in JonasMu/added-an-example-for-a-contraction-to-a--1504265366851 (pull request PR-1)
  • Commit a34fb212c: Close branch JonasMu/added-an-example-for-a-contraction-to-a--1504265366851
  • Commit a991c8036: Added an example for a contraction to a scalar value, e.g. a double contraction of two second order tensors and how you can get the value of the result. I lost one day to get this done so I think it will help some guys. I also added Eigen:: to the IndexPair and array in the same example.
  • Commit a4089991e: Added support for CUDA 9.0.
  • Commit 6d991a959: bug #1464 : Fixes construction of EulerAngles from 3D vector expression.
  • Commit 304ef2957: Handle min/max/inf/etc issue in cuda_fp16.h directly in test/main.h
  • Commit 1affe3d8d: Merged eigen/eigen into default
  • Commit 21633e585: bug #1462: remove all occurrences of the deprecated __CUDACC_VER__ macro by introducing EIGEN_CUDACC_VER
  • Commit 12249849b: Make the threshold from gemm to coeff-based-product configurable, and add some explanations.
  • Commit 39864ebe1: bug #336: improve doc for PlainObjectBase::Map
  • Commit 600e52fc7: Add missing scalar conversion
  • Commit 9deee7992: bug #1457: add setUnit() methods for consistency.
  • Commit bc4dae9ae: bug #1449: fix redux_3 unit test
  • Commit bc91a2df8: bug #1461: fix compilation of Map<const Quaternion>::x()
  • Commit fc39d5954: Merged in dtrebbien/eigen/patch-1 (pull request PR-312)
  • Commit b223918ea: Doc: warn about constness in LLT::solveInPlace
  • Commit 4ce5ec519: initial support for z14
  • Commit e1e71ca4e: initial support for z14
  • Commit 84d7be103: Fixing Argmax that was breaking upstream TensorFlow.
  • Commit f0b154a4b: Code cleanup
  • Commit 575cda76b: Fixed syntax errors generated by xcode
  • Commit 5ac27d5b5: Avoid relying on cxx11 features when possible.
  • Commit c5a241ab9: Merged in benoitsteiner/opencl (pull request PR-323)
  • Commit b7ae4dd9e: Merged in hughperkins/eigen/add-endif-labels-TensorReductionCuda.h (pull request PR-315)
  • Commit 9daed6795: Merged in tntnatbry/eigen (pull request PR-319)
  • Commit 6795512e5: Improved the randomness of the tensor random generator
  • Commit dc524ac71: Fixed compilation warning
  • Commit 62b4634eb: Merged in mehdi_goli/upstr_benoit/TensorSYCLImageVolumePatchFixed (pull request PR-14)
  • Commit c92faf9d8: Merged in mehdi_goli/upstr_benoit/HiperbolicOP (pull request PR-13)
  • Commit 53725c10b: Merged in mehdi_goli/opencl/DataDependancy (pull request PR-10)
  • Commit c010b1736: Fix warning
  • Commit 561f77707: Fix a gcc7 warning about bool * bool in abs2 default implementation.
  • Commit b651ce0ff: Fix a gcc7 warning: Wint-in-bool-context
  • Commit 157040d44: Make sure CMAKE_Fortran_COMPILER is set before checking for Fortran functions
  • Commit 24fe1de9b: merge
  • Commit b240080e6: bug #1436: fix compilation of Jacobi rotations with ARM NEON, some specializations of internal::conj_helper were missing.
  • Commit 3baef62b9: Added missing __device__ qualifier
  • Commit 449936828: Added missing __device__ qualifier
  • Commit b8e805497: Merged in benoitsteiner/opencl (pull request PR-318)
  • Commit 9fbdf0205: Enable Array(EigenBase<>) ctor for compatible scalar types only. This prevents nested arrays to look as being convertible from/to simple arrays.
  • Commit e43d8fe9d: Fix compilation of streaming nested Array, i.e., cout << Array<Array<>>
  • Commit d9d7bd6d6: Fix 1x1 case in Solve expression with EIGEN_DEFAULT_MATRIX_STORAGE_ORDER_OPTION==RowMajor
  • Commit 95ecb2b5d: Make buildtests.in more robust
  • Commit 3f7fb5a6d: Make eigen_monitor_perf.sh more robust
  • Commit 7f42a9334: Merged in alainvaucher/eigen/find-module-imported-target (pull request PR-324)
  • Commit 7cc503f9f: bug #1485: fix linking issue of non template functions
  • Commit 103c0aa6a: Add KLU in the list of third-party sparse solvers
  • Commit 00bc67c37: Move KLU support to official
  • Commit b82cd93c0: KLU: truly disable unimplemented code, add proper static assertions in solve
  • Commit 6365f937d: KLU depends on BTF but not on libSuiteSparse nor Cholmod
  • Commit 8cf63ccb9: Merged in kylemacfarlan/eigen (pull request PR-337)
  • Commit 1495b98a8: Merged in spraetor/eigen (pull request PR-305)
  • Commit fc4532438: Merged in jkflying/eigen-fix-scaling (pull request PR-302)
  • Commit d306b96fb: Merged in carpent/eigen (pull request PR-342)
  • Commit 1b2dcf9a4: Check that Schur decomposition succeeds.
  • Commit 0a1cc7394: bug #1484: restore deleted line for 128 bits long doubles, and improve dispatching logic.
  • Commit f86bb89d3: Add EIGEN_MKL_NO_DIRECT_CALL option
  • Commit 5fa79f96b: Patch from Konstantin Arturov to enable MKL's direct call by default
  • Commit a020d9b13: Use col method for column-major matrix
  • Commit c0e1d510f: Add support for SuiteSparse's KLU routines
  • Commit 6dcf96655: Avoid implicit scalar conversion with accuracy loss in pow(scalar,array)
  • Commit 50e09cca0: fix typo
  • Commit a4fd4233a: Fix compilation with some compilers
  • Commit c3e2afce0: Enable MSVC 2010 workaround from MSVC only
  • Commit 731c8c704: bug #1403: more scalar conversions fixes in BDCSVD
  • Commit 1bbcf1902: bug #1403: fix implicit scalar type conversion.
  • Commit ba5cab576: bug #1405: enable StrictlyLower/StrictlyUpper triangularView as the destination of matrix*matrix products.
  • Commit 90168c003: bug #1414: doxygen, add EigenBase to CoreModule
  • Commit 26f552c18: fix compilation of Half in C++98 (issue introduced in previous commit)
  • Commit 1d59ca245: Fix compilation with gcc 4.3 and ARM NEON
  • Commit fb1ee0408: bug #1410: fix lvalue propagation of Array/Matrix-Wrapper with a const nested expression.
  • Commit 723a59ac2: add regression test for aliasing in product rewriting
  • Commit 8640093af: fix compilation in C++98
  • Commit a7be4cd1b: Fix LeastSquareDiagonalPreconditioner for complexes (issue introduced in previous commit)
  • Commit 498aa95a8: bug #1424: add numext::abs specialization for unsigned integer types.
  • Commit d58882277: Add missing std::numeric_limits specialization for half, and complete NumTraits<half>
  • Commit 682b2ef17: bug #1423: fix LSCG's Jacobi preconditioner for row-major matrices.
  • Commit 4bbc32046: bug #1435: fix aliasing issue in expressions like: A = C - B*A;
  • Commit 9341f258d: Add labels to #ifdef, in TensorReductionCuda.h
  • Commit 1e736b9ea: Merged in mehdi_goli/opencl/SYCLAlignAllocator (pull request PR-7)
  • Commit 9dee55ec3: Merged eigen/eigen into default
  • Commit 0370d3576: Applying Ronnan's comments.
  • Commit 615aff4d6: Merged in a-doumoulakis/opencl (pull request PR-12)
  • Commit c3bd860de: Modification upon request
  • Commit e3f964ed5: Applying Benoit's comment;removing dead code.
  • Commit df90010cd: Merged in mehdi_goli/opencl/CmakeFixForUbuntu16.04 (pull request PR-11)
  • Commit fb853a857: Restore misplaced comment
  • Commit 7a8ba565f: Merge changes from upstream
  • Commit daf99daad: Merged in DuncanMcBain/opencl/default (pull request PR-2)
  • Commit 9ef5c948b: Fixing Cmake for gcc>=5.
  • Commit 0cb3c7c7d: Update FindComputeCpp.cmake with new changes from SDK
  • Commit 2971503fe: Specializing numeric_limits For AutoDiffScalar
  • Commit 26e8f9171: Fix compilation of matrix log with Map as input
  • Commit f2a553fb7: bug #1411: fix usage of alignment information in vectorization of quaternion product and conjugate.
  • Commit e01814260: Make sure CholmodSupport works when included in multiple compilation units (issue was reported on stackoverflow.com)
  • Commit 8508db52a: bug #1417: make LinSpace compatible with std::complex
  • Commit 9aa7c3016: Merge with Benoit.
  • Commit b42d775f1: Temporary branch for synch with upstream
  • Commit 615733381: Merged in mehdi_goli/opencl/FixingCmakeDependency (pull request PR-2)
  • Commit 1500a67c4: Merged in mehdi_goli/opencl/TensorSupportedDevice (pull request PR-6)
  • Commit 76c0fc1f9: Fixing SYCL alignment issue required by TensorFlow.
  • Commit 2d17128d6: Fixing supported device list.
  • Commit 61d7f3664: Fixing Cmake Dependency for SYCL
  • Commit a5226ce4f: Add cmake file FindTriSYCL.cmake
  • Commit 052426b82: Add support for triSYCL
  • Commit 4343db84d: updated warning number for nvcc release 8 (V8.0.61) for the stupid warning message 'calling a __host__ function from a __host__ __device__ function is not allowed'.
  • Commit 9bc0a3573: Fixed nested angle brackets >> issue when compiling with cuda 8
  • Commit 891ac0348: Fix dense * sparse-selfadjoint-view product.
  • Commit 949a2da38: Use scalar_sum_op and scalar_quotient_op instead of operator+ and operator/ in MeanReducer.
  • Commit d9084ac8e: Improve mixing of complex and real in the vectorized path of apply_rotation_in_the_plane
  • Commit f75dfdda7: Fix unwanted Real to Scalar to Real conversions in column-pivoting QR.
  • Commit 0f83aeb6b: Improve cmake scripts for Pastix and BLAS detection.
  • Commit 0d08165a7: Merged in benoitsteiner/opencl (pull request PR-309)
  • Commit 068cc0970: Preserve file naming conventions
  • Commit c302ea7bc: Deleted empty line of code
  • Commit a5a0c8fac: Guard sycl specific code under a EIGEN_USE_SYCL ifdef
  • Commit a1304b95b: Code cleanup
  • Commit 66c63826b: Guard the sycl specific code with EIGEN_USE_SYCL
  • Commit e3e343390: Guard the sycl specific code with a #ifdef EIGEN_USE_SYCL
  • Commit 63840d466: Gate the sycl specific code under a EIGEN_USE_SYCL define
  • Commit bc050ea9f: Fixed compilation error when sycl is enabled.
  • Commit 4910630c9: fix typos in the Tensor readme
  • Commit c1b3d5ecb: Restored code compatibility with compilers that don't support c++11. Gated more sycl code under #ifdef sycl
  • Commit e2d5d4e7b: Restore the old constructors to retain compatibility with non c++11 compilers.
  • Commit 73fcaa319: Gate the sycl specific code under #ifdef sycl
  • Commit bd64ee855: Fixing TensorArgMaxSycl.h; Removing warning related to the hardcoded type of dims to be int in Argmax.
  • Commit 511810797: Issue with mpreal and std::numeric_limits, i.e. digits is not a constant. Added a digits() traits in NumTraits with fallback to static constant. Specialization for mpreal added in MPRealSupport.
  • Commit a91417a7a: Introduces align allocator for SYCL buffer
  • Commit aae19c70a: update has_ReturnType to be more consistent with other has_ helpers
  • Commit f8a622ef3: Merged eigen/eigen into default
  • Commit fd7db52f9: Silenced compilation warning
  • Commit 9597d6f6a: Temporary: Disables cxx11_tensor_argmax_sycl test since it is causing zombie thread
  • Commit c06861d15: Fixes bug in get_sycl_supported_devices() that was reporting unsupported Intel CPU on AMD platform - causing timeouts in that configuration
  • Commit 7f31bb682: Merged in ilya-biryukov/eigen/fix_clang_cuda_compilation (pull request PR-304)
  • Commit 89fd0c388: better check array index before using it
  • Commit 61160a21d: ARM prefetch fixes: Implement prefetch on ARM64. Do not clobber cc on ARM32.
  • Commit f0f359111: Made the reduction code compile with cuda-clang
  • Commit f499fe949: Adding synchronisation to convolution kernel for sycl backend.
  • Commit bfd7bf9c5: Get rid of Init().
  • Commit d56ab0109: Use C++11 ctor forwarding to simplify code a bit.
  • Commit 344c2694a: Make the non-blocking threadpool more flexible and less wasteful of CPU cycles for high-latency use-cases.
  • Commit 1b32a1005: Use name to distinguish name instead of the vendor
  • Commit aadb7405a: Fixing typo in sycl Benchmark.
  • Commit 970ff7829: bug #1401: fix compilation of "cond ? x : -x" with x an AutoDiffScalar
  • Commit 5e9a1e7a7: Adding sycl Benchmarks.
  • Commit e2e3f7853: Fixing potential race condition on sycl device.
  • Commit f84963ed9: Adding TensorIndexTuple and TensorTupleReduceOP backend (ArgMax/Min) for sycl; fixing the address space issue for const TensorMap; converting all discard_write to write due to data mismatch.
  • Commit e5156e4d2: fix typo
  • Commit 5694315fb: remove UTF8 symbol
  • Commit e958c2baa: remove UTF8 symbols
  • Commit d96771852: do not include std header within extern C
  • Commit 659087b62: bug #1400: fix stableNorm with EIGEN_DONT_ALIGN_STATICALLY
  • Commit 1c03d43a5: Fixed compilation with cuda-clang
  • Commit bbe717fa2: Make scaling work with non-square matrices
  • Commit a71943b9a: Made the Tensor code compile with clang 3.9
  • Commit 09ae0e658: Adjusted the EIGEN_DEVICE_FUNC qualifiers to make sure that: * they're used consistently between the declaration and the definition of a function * we avoid calling host only methods from host device methods.
  • Commit 1e2d04665: Silenced a couple of compilation warnings
  • Commit c1d87ec11: Added missing EIGEN_DEVICE_FUNC qualifiers
  • Commit 3a3f040ba: Added missing EIGEN_DEVICE_FUNC qualifiers
  • Commit 7b6194466: Made most of the packet math primitives usable within CUDA kernel when compiling with clang
  • Commit c92406d61: Silenced clang compilation warning.
  • Commit 857adbbd5: Added missing EIGEN_DEVICE_FUNC qualifiers
  • Commit c36bc2d44: Added missing EIGEN_DEVICE_FUNC qualifiers
  • Commit 4a7df114c: Added missing EIGEN_DEVICE_FUNC
  • Commit de7b0fdea: Made the TensorStorage class compile with clang 3.9
  • Commit 765f4cc4b: Deleted extra: EIGEN_DEVICE_FUNC: the QR and Cholesky code isn't ready to run on GPU yet.
  • Commit e993c94f0: Added missing EIGEN_DEVICE_FUNC qualifiers
  • Commit 33443ec2b: Added missing EIGEN_DEVICE_FUNC qualifiers
  • Commit f3e9c4287: Added missing EIGEN_DEVICE_FUNC qualifiers
  • Commit 8296b87d7: Adding sycl backend for TensorCustomOp; fixing the partial lhs modification issue on sycl when the rhs is TensorContraction, reduction or convolution; Fixing the partial modification for memset when sycl backend is used.
  • Commit 4e98a7b2f: bug #1396: add some missing EIGEN_DEVICE_FUNC
  • Commit 478a9f53b: Fix typo.
  • Commit 889c606f8: Added missing EIGEN_DEVICE_FUNC to the SelfCwise binary ops
  • Commit 193939d6a: Added missing EIGEN_DEVICE_FUNC qualifiers to several nullary op methods.
  • Commit ed4dc9d01: Declared the plset, ploadt_ro, and ploaddup packet primitives as usable within a gpu kernel
  • Commit b1fc7c9a0: Added missing EIGEN_DEVICE_FUNC qualifiers.
  • Commit 554116bec: Added EIGEN_DEVICE_FUNC to make the prototype of the EigenBase override match that of DenseBase
  • Commit 34d9fce93: Avoid unnecessary float to double conversions.
  • Commit e0bd6f573: Merged eigen/eigen into default
  • Commit 2fa2b617a: Adding TensorVolumePatchOP.h for sycl
  • Commit 0b7875f13: Converting fixed float type into template type for TensorContraction.
  • Commit 89dfd51fa: Adding Sycl Backend for TensorGenerator.h.
  • Commit 5c68ba41a: typos
  • Commit b0f55ef85: merge
  • Commit d29e9d711: Improve documentation of reshaped
  • Commit 9b6e36501: Fix linking issue.
  • Commit 3d200257d: Add support for automatic-size deduction in reshaped, e.g.:
  • Commit f8179385b: Add missing const version of mat(all).
  • Commit 1e3aa470f: Fix long to int conversion
  • Commit b3fc0007a: Add support for mat(all) as an alias to mat.reshaped(mat.size(),fix<1>);
  • Commit 4f07ac16b: Reducing the number of warnings.
  • Commit 76687f385: bug #1394: fix compilation of SelfAdjointEigenSolver<Matrix>(sparse*sparse);
  • Commit d8b1f6ceb: bug #1380: for Map<> as input of matrix exponential
  • Commit 657282570: bug #1395: fix the use of compile-time vectors as inputs of JacobiSVD.
  • Commit 79ebc8f76: Adding Sycl backend for TensorImagePatchOP.h; adding Sycl backend for TensorInflation.h.
  • Commit 9081c8f6e: Add support for RowOrder reshaped
  • Commit a811a0469: Silent warning.
  • Commit 63798df03: Fix usage of CUDACC_VER
  • Commit deefa54a5: Fix tracking of temporaries in unit tests
  • Commit f8a55cc06: Fix compilation.
  • Commit cbbf88c4d: Use int32_t instead of int in NEON code. Some platforms with 16-bit int support ARM NEON.
  • Commit 582b5e39b: bug #1393: enable Matrix/Array explicit ctor from types with conversion operators (was ok with 3.2)
  • Commit cfa0568ef: Size indices are signed.
  • Commit 91982b91c: Adding TensorLayoutSwapOp for sycl.
  • Commit b1e312edd: Adding TensorPatch.h for sycl backend.
  • Commit 31a25ab22: Merged eigen/eigen into default
  • Commit 0d153ded2: Adding TensorChippingOP for sycl backend; fixing the index value in the verification operation for cxx11_tensorChipping.cpp test
  • Commit 5937c4ae3: Fall back is_integral to std::is_integral in c++11
  • Commit 707343094: Fix overflow and make use of long long in c++11 only.
  • Commit 3453b00a1: Fix vector indexing with uint64_t
  • Commit e7ebe52bf: bug #1391: include IO.h before DenseBase to enable its usage in DenseBase plugins.
  • Commit b3750990d: Workaround some gcc 4.7 warnings
  • Commit 4b22048ce: Fallback Reshaped to MapBase when possible (same storage order and linear access to the nested expression)
  • Commit 83d6a529c: Use Eigen::fix<N> to pass compile-time sizes.
  • Commit c16ee72b2: bug #1392: fix #include <Eigen/Sparse> with mpl2-only
  • Commit e43016367: Forgot to include a file in previous commit
  • Commit 6486d4fc9: Workaround gcc 4.7 issue in c++11.
  • Commit 4a4a72951: Fix previous commits: disable only problematic indexed view methods for old compilers instead of disabling everything. Tested with gcc 4.7 (c++03) and gcc 4.8 (c++03 & c++11)
  • Commit fad776492: Merged eigen/eigen into default
  • Commit 1ef30b809: Fixed bug introduced in previous commit
  • Commit 769208a17: Pulled latest updates from upstream
  • Commit 8b3cc54c4: Added a new EIGEN_HAS_INDEXED_VIEW define that is set to 0 for older compilers that are known to fail to compile the indexed views (I used the define from the indexed_views.cpp test). Only include the indexed view methods when the compiler supports the code. This makes it possible to use Eigen again in complex code bases such as TensorFlow and older compilers such as gcc 4.8
  • Commit a1ff24f96: Fix pruning in (sparse*sparse).pruned() when the result is nearly dense.
  • Commit 0256c5235: Include clang in the list of non strict MSVC (just to be sure)
  • Commit dd58462e6: fixed inlining issue with clang-cl on visual studio (grafted from 7962ac1a5855e8b7a60d5d90e61365b71f5501a5 )
  • Commit fc8fd5fd2: Improve multi-threading heuristic for matrix products with a small number of columns.
  • Commit 0ee97b60c: Adding mean to TensorReductionSycl.h
  • Commit 42bd5c4e7: Fixing TensorReductionSycl for min and max.
  • Commit 4254b3eda: bug #1389: MSVC's std containers do not properly align in 64 bits mode if the requested alignment is larger than 16 bytes (e.g., with AVX)
  • Commit bc128f9f3: Reducing the warnings in Sycl backend.
  • Commit 442e9cbb3: Silenced several compilation warnings
  • Commit 2db75c07a: fixed the ordering of the template and EIGEN_DEVICE_FUNC keywords in a few more places to get more of the Eigen codebase to compile with nvcc again.
  • Commit fcd257039: Replaced EIGEN_DEVICE_FUNC template<foo> with template<foo> EIGEN_DEVICE_FUNC to make the code compile with nvcc8.
  • Commit 84090027c: Disable a part of the unit test for gcc 4.8
  • Commit 0eceea4ef: Define EIGEN_COMP_GNUC to reflect version number: 47, 48, 49, 50, 60, ...
  • Commit ff5305003: Converting ptrdiff_t type to int64_t type in cxx11_tensor_contract_sycl.cpp in order to be the same as other tests.
  • Commit bab29936a: Reducing warnings in Sycl backend.
  • Commit 645a8e32a: Fix compilation of JacobiSVD for vectors type
  • Commit 48a20b7d9: Fixing compiler error on TensorContractionSycl.h; Silencing the compiler unused parameter warning for eval_op_indices in TensorContraction.h
  • Commit 53026d29d: bug #478: fix regression in the eigen decomposition of zero matrices.
  • Commit fbc39fd02: Merge latest changes from upstream
  • Commit 63de19c00: bug #1380: fix matrix exponential with Map<>
  • Commit c86911ac7: bug #1384: fix evaluation of "sparse/scalar" that used the wrong evaluation path.
  • Commit 82ce92419: Fixing the buffer type in memcpy.
  • Commit 24409f3ac: Use fix<> API to specify compile-time reshaped sizes.
  • Commit 9036cda36: Cleanup initial reshape implementation: - reshape -> reshaped - make it compatible with evaluators.
  • Commit 0e89baa5d: import yoco xiao's work on reshape
  • Commit d024e9942: MSVC 1900 release is not c++14 compatible enough for us. The 1910 update seems to be fine though.
  • Commit 83592659b: merge
  • Commit 4a351be16: Fix warning
  • Commit 251ad3e04: Fix unnamed type as template parameter issue.
  • Commit edaa0fc5d: Revert PR-292. After further investigation, the memcpy->memmove change was only good for Haswell on older versions of glibc. Adding a switch for small sizes is perhaps useful for string copies, but also has an overhead for larger sizes, making it a poor trade-off for general memcpy.
  • Commit 25a170357: Merged in ggael/eigen-flexidexing (pull request PR-294)
  • Commit 98dfe0c13: Fix useless ';' warning
  • Commit 28351073d: Fix unnamed type as template argument (ok in c++11 only)
  • Commit 607be65a0: Fix duplicates of array_size between unsupported and Core
  • Commit 7d39c6d50: Merged eigen/eigen into default
  • Commit 5c9ed4ba0: Reverse arguments for pmin in AVX.
  • Commit 850ca961d: bug #1383: fix regression in LinSpaced for integers and high<low
  • Commit 296d24be4: bug #1381: fix sparse.diagonal() used as a rvalue. The problem was that if "sparse" is not const, then sparse.diagonal() must have the LValueBit flag meaning that sparse.diagonal().coeff(i) must return a const reference, const Scalar&. However, sparse::coeff() cannot return a reference for a non-existing zero coefficient. The trick is to return a reference to a local member of evaluator<SparseMatrix>.
  • Commit d06a48959: bug #1383: Fix regression from 3.2 with LinSpaced(n,0,n-1) with n==0.
  • Commit ae3e43a12: Remove extra space.
  • Commit e96c77668: Merged in rmlarsen/eigen2 (pull request PR-292)
  • Commit 3be5ee235: Update copy helper to use fast_memcpy.
  • Commit e6b102022: Adds a fast memcpy function to Eigen. This takes advantage of the following:
  • Commit 7b6aaa344: Fix NaN propagation for AVX512.
  • Commit 5e144bbaa: Make NaN propagation consistent between the pmax/pmin and std::max/std::min. This makes the NaN propagation consistent between the scalar and vectorized code paths of Eigen's scalar_max_op and scalar_min_op.
  • Commit d83db761a: Add support for std::integral_constant
  • Commit bc1020185: Add test for multiple symbols
  • Commit c43d254d1: Fix seq().reverse() in c++98
  • Commit 5783158e8: Add unit test for FixedInt and Symbolic
  • Commit ddd83f82d: Add support for "SymbolicExpr op fix<N>" in C++98/11 mode.
  • Commit 228fef1b3: Extended the set of arithmetic operators supported by FixedInt (-,+,*,/,%,&,|)
  • Commit bb52f74e6: Add internal doc
  • Commit 41c523a0a: Rename fix_t to FixedInt
  • Commit 156e6234f: bug #1375: fix cmake installation with cmake 2.8
  • Commit ba3f97794: bug #1376: add missing assertion on size mismatch with compound assignment operators (e.g., mat += mat.col(j))
  • Commit b0db4eff3: bug #1382: move using std::size_t/ptrdiff_t to Eigen's namespace (still better than the global namespace!)
  • Commit ca79c1545: Add std:: namespace prefix to all (hopefully) instances of size_t/ptrdiff_t
  • Commit 4b607b569: Use Index instead of size_t
  • Commit bf44fed9b: Allows AMD APU
  • Commit 0fe278f7b: bug #1379: fix compilation in sparse*diagonal*dense with openmp
  • Commit 22a172751: bug #1378: fix doc (DiagonalIndex vs Diagonal)
  • Commit 602f8c27f: Reverting back to the previous TensorDeviceSycl.h as the total number of buffer is not enough for tensorflow.
  • Commit 4d302a080: Recover compile-time size from seq(A,B) when A and B are fixed values. (c++11 only)
  • Commit 54f3fbee2: Exploit fixed values in seq and reverse with C++98 compatibility
  • Commit 7691723e3: Add support for fixed-value in symbolic expression, c++11 only for now.
  • Commit 924600a0e: Made sure that enabling avx2 instructions enables avx and sse instructions as well.
  • Commit 77cc4d06c: Removing unused variables
  • Commit 837fdbdcb: Merging with Benoit's upstream.
  • Commit 6bdd15f57: Adding non-dereferenceable pointer track for ComputeCpp backend; Adding TensorConvolutionOp for ComputeCpp; fixing typos. modifying TensorDeviceSycl to use the LegacyPointer class.
  • Commit aa7fb88df: Merged in LaFeuille/eigen (pull request PR-289)
  • Commit e84ed7b6e: Remove dead code
  • Commit f3ccbe041: Add a Symbolic::FixedExpr helper expression to make sure the compiler fully optimizes the usage of last and end.
  • Commit c6f7b3383: Applying Benoit's comment. Embedding synchronisation inside device memcpy so there is no need to externally call synchronise() for device memcopy.
  • Commit 15471432f: Add a .reverse() member to ArithmeticSequence.
  • Commit e4f8dd860: Add missing operator*
  • Commit 198507141: Update all block expressions to accept compile-time sizes passed by fix<N> or fix<N>(n)
  • Commit 5484ddd35: Merge the generic and dynamic overloads of block()
  • Commit 655ba783f: Defer set-to-zero in triangular = product so that no aliasing issue occurs in the common: A.triangularView() = B*A.selfadjointView()*B.adjoint() case that used to work in 3.2.
  • Commit 5e36ec3b6: Fix regression when passing enums to operator()
  • Commit f7852c3d1: Fix -Wunnamed-type-template-args
  • Commit 4f36dcfda: Add a generic block() method compatible with Eigen::fix
  • Commit 71e5b7135: Add a get_runtime_value helper to deal with pointer-to-function hack, plus some refactoring to make the internals more consistent.
  • Commit 59801a325: Add \newin{3.x} doxygen command
  • Commit 23bfcfc15: Add missing overload of get_compile_time for c++98/11
  • Commit edff32c2c: Disambiguate the two versions of fix for doxygen
  • Commit 4989922be: Add support for symbolic expressions as arguments of operator()
  • Commit 12e22a284: typos in doc
  • Commit e70c4c97f: Typo
  • Commit a9232af84: Introduce a variable_or_fixed<N> proxy returned by fix<N>(val) to pass both a compile-time and runtime fallback value in case N means "runtime". This mechanism is used by the seq/seqN functions. The proxy object is immediately converted to pure compile-time (as fix<N>) or pure runtime (i.e., an Index) to avoid redundant template instantiations.
  • Commit 6e9769816: Introduce a EIGEN_HAS_CXX14 macro
  • Commit e46e72238: Adding Tensor ReverseOp; TensorStriding; TensorConversionOp; Modifying Tensor Contractsycl to be located in any place in the expression tree.
  • Commit 23778a15d: Reverting unintentional change to Eigen/Geometry
  • Commit 1b19b80c0: Fix a typo
  • Commit 8245d3c7a: Fix case-sensitivity of file include
  • Commit 752bd92ba: Large code refactoring: - generalize some utilities and move them to Meta (size(), array_size()) - move handling of all and single indices to IndexedViewHelper.h - several cleanup changes
  • Commit f93d1c58e: Make get_compile_time compatible with variable_if_dynamic
  • Commit c020d307a: Make variable_if_dynamic<T> implicitly convertible to T
  • Commit 43c617e2e: merge
  • Commit 152cd57bb: Enable generation of doc for static variables in Eigen's namespace.
  • Commit b1dc0fa81: Move fix and symbolic to their own file, and improve doxygen compatibility
  • Commit 04397f17e: Add 1D overloads of operator()
  • Commit 45199b977: Fix typo
  • Commit 1b5570988: Add doc to seq, seqN, ArithmeticSequence, operator(), etc.
  • Commit 17eac6044: Factorize const and non-const version of the generic operator() method.
  • Commit d072fc4b1: add writeable IndexedView
  • Commit c9d5e5c6d: Simplify Symbolic API: std::tuple is now used internally and automatically built.
  • Commit 407e7b7a9: Simplify symbolic API by using "symbol=value" to associate a runtime value to a symbol.
  • Commit 96e6cf9aa: Fix linking issue.
  • Commit e63678bc8: Fix ambiguous call
  • Commit 8e247744a: Fix linking issue
  • Commit b47a7e5c3: Add doc for IndexedView
  • Commit 87963f441: Fallback to Block<> when possible (Index, all, seq with > increment). This is important to take advantage of the optimized implementations (evaluator, products, etc.), and to support sparse matrices.
  • Commit a98c7efb1: Add a more generic evaluation mechanism and minimalistic doc.
  • Commit 13d954f27: Cleanup Eigen's namespace
  • Commit 9eaab4f9e: Refactoring: move all symbolic stuff into its own namespace
  • Commit acd08900c: Move 'last' and 'end' to their own namespace
  • Commit 1df2377d7: Implement c++98 version of seq()
  • Commit ecd9cc541: Isolate legacy code (we keep it for performance comparison purpose)
  • Commit b50c3e967: Add a minimalistic symbolic scalar type with expression template and make use of it to define the last placeholder and to unify the return type of seq and seqN.
  • Commit 68064e14f: Rename span/range to seqN/seq
  • Commit ad3eef760: Add link to SO
  • Commit 75aef5b37: Fix extraction of compile-time size of std::array with gcc
  • Commit 233dff1b3: Add support for plain arrays for columns and both rows/columns
  • Commit 76e183bd5: Propagate compile-time size for plain arrays
  • Commit 3264d3c76: Add support for plain-array as indices, e.g., mat({1,2,3,4})
  • Commit 831fffe87: Add missing doc of SparseView
  • Commit a875167d9: Propagate compile-time increment and strides. Had to introduce a UndefinedIncr constant for non structured list of indices.
  • Commit e383d6159: MSVC 2015 has all we want about c++11 and MSVC 2017 fails on binder1st/binder2nd
  • Commit fad1fa75b: Propagate compile-time size with "all" and add c++11 array unit test
  • Commit 3730e3ca9: Use "fix" for compile-time values, propagate compile-time sizes for span, clean some cleanup.
  • Commit 60e99ad8d: Add unit test for indexed views
  • Commit ac7e4ac9c: Initial commit to add a generic indexed-based view of matrices. This version already works as a read-only expression. Numerous refactoring, renaming, extension, tuning passes are expected...
  • Commit f3f026c9a: Convert integers to real numbers when computing relative L2 error
  • Commit 0c226644d: LLT: const the arg to solveInPlace() to allow passing .transpose(), .block(), etc.
  • Commit be281e528: LLT: avoid making a copy when decomposing in place
  • Commit e27f17bf5: bug #1453: fix Map with non-default inner-stride but no outer-stride.
  • Commit 21d0a0bcf: bug #1456: add perf recommendation for LLT and storage format
  • Commit 2c3d70d91: Re-enable hidden doc in LLT
  • Commit a6e7a41a5: bug #1455: Cholesky module depends on Jacobi for rank-updates.
  • Commit e6021cc8c: bug #1458: fix documentation of LLT and LDLT info() method.
  • Commit 2810ba194: Clarify MKL_DIRECT_CALL doc.
  • Commit f72784465: use MKL's lapacke.h header when using MKL
  • Commit 8c858bd89: Clarify doc regarding the usage of MKL_DIRECT_CALL
  • Commit b95f92843: Fix support for MKL's BLAS when using MKL_DIRECT_CALL.
  • Commit 89c01a494: Add unit test for has_ReturnType
  • Commit 687bedfca: Make NoAlias and JacobiRotation compatible with CUDA.
  • Commit 1f4b24d2d: Do not preallocate more space than the matrix size (when the sparse matrix boils down to a vector)
  • Commit d580a90c9: Disable BDCSVD preallocation check.
  • Commit 55d718155: Fix laziness of operator* with CUDA
  • Commit cda47c42c: Fix compilation in c++98 mode.
  • Commit a74b9ba7c: Update documentation for CUDA
  • Commit 3182bdbae: Disable vectorization when compiled by nvcc, even if EIGEN_NO_CUDA is defined
  • Commit 9f8136ff7: disable nvcc boolean-expr-is-constant warning
  • Commit bbd97b409: Add a EIGEN_NO_CUDA option, and introduce EIGEN_CUDACC and EIGEN_CUDA_ARCH aliases
  • Commit 2299717fd: Fix and workaround several doxygen issues/warnings
  • Commit 90c5bc8d6: Fixes auto appearance in functor template argument for reduction.
  • Commit ee6f7f6c0: Add doc for sparse triangular solve functions
  • Commit 5165de97a: Add missing snippet files.
  • Commit a0a36ad0e: bug #1336: workaround doxygen failing to include numerous members of MatrixBase in Matrix
  • Commit 29a1a5811: Document selfadjointView
  • Commit a5ebc92f8: bug #1336: fix doxygen issue regarding EIGEN_CWISE_BINARY_RETURN_TYPE
  • Commit 45b289505: Add debug output
  • Commit 5838f078a: Fix inclusion
  • Commit 870256217: bug #1370: add doc for StorageIndex
  • Commit 575c07875: bug #1370: rename _Index to _StorageIndex in SparseMatrix, and add a warning in the doc regarding the 3.2 to 3.3 change of SparseMatrix::Index
  • Commit c4fc2611b: add cmake-option to enable/disable creation of tests * * * disable unsupported/test when tests are disabled * * * rename EIGEN_ENABLE_TESTS to BUILD_TESTING * * * consider BUILD_TESTING in blas
  • Commit d3c5525c2: Added += and + operators to inner iterators
  • Commit 5c2796245: Move common cwise-unary method from MatrixBase/ArrayBase to the common DenseBase class.
  • Commit 4ebf69394: doc: Fix trivial typo in AsciiQuickReference.txt * * * fixup!
  • Commit 8d7810a47: bug #1365: fix another type mismatch warning
  • Commit 97812ff0d: bug #1369: fix type mismatch warning.
  • Commit 7713e20fd: Fix compilation
  • Commit ab69a7f6d: Cleanup because traits<CwiseBinaryOp>::Flags now exposes the correct storage order
  • Commit d32a43e33: Make sure that traits<CwiseBinaryOp>::Flags reports the correct storage order so that methods like .outerSize()/.innerSize() work properly.
  • Commit 713626746: Add missing .outer() member to iterators of evaluators of cwise sparse binary expression
  • Commit fe0ee7239: Fix check of storage order mismatch for "sparse cwiseop sparse".
  • Commit 6b8f637ab: Harmless typo
  • Commit 3eda02d78: Fixed the sycl benchmarking code
  • Commit 8b1c2108b: Reverting asynchronous exec to Synchronous exec regarding random race condition.
  • Commit 354baa0fb: Avoid using horizontal adds since they're not very efficient.
  • Commit d7825b670: Use native AVX512 types instead of Eigen Packets whenever possible.
  • Commit 660da83e1: Pulled latest update from trunk
  • Commit 4236aebe1: Simplified the contraction code
  • Commit 3cfa16f41: Merged in benoitsteiner/opencl (pull request PR-279)
  • Commit 519d63d35: Added support for libxsmm kernel in multithreaded contractions
  • Commit 065722856: Simplified the way we link libxsmm
  • Commit bbca405f0: Pulled latest updates from trunk
  • Commit b91be6022: Automatically include and link libxsmm when present.
  • Commit c6882a72e: Merged in joaoruileal/eigen (pull request PR-276)
  • Commit f9eff17e9: Leverage libxsmm kernels within single-threaded contractions
  • Commit c19fe5e9e: Added support for libxsmm in the eigen makefiles
  • Commit a34d4ebd7: Merged in benoitsteiner/opencl (pull request PR-278)
  • Commit c55ecfd82: Fix for auto appearing in functor template argument.
  • Commit c8c89b5e1: renamed methods umfpackReportControl(), umfpackReportInfo(), and umfpackReportStatus() from UmfPackLU to printUmfpackControl(), printUmfpackInfo(), and printUmfpackStatus()
  • Commit 0f577d474: Merged eigen/eigen into default
  • Commit f2f9df8aa: Remove MSVC warning 4127 - conditional expression is constant from the disabled list as we now have a local workaround.
  • Commit 2b3fc981b: bug #1362: workaround constant conditional warning produced by MSVC
  • Commit 29186f766: Fixed order of initialisation in ExecExprFunctorKernel functor.
  • Commit 94e8d8902: Fix bug #1367: compilation fix for gcc 4.1!
  • Commit e8d6862f1: Properly adjust precision when saving to Market format.
  • Commit e2f4ee1c2: Speed up parsing of sparse Market file.
  • Commit 8245851d1: Matching parameters order between lambda and the functor.
  • Commit 684cfc762: Add transpose, adjoint, conjugate methods to SelfAdjointView (useful to write generic code)
  • Commit 8bd0d3aa3: merge
  • Commit 11f55b297: Optimize storage layout of Cwise* and PlainObjectBase evaluator to remove the functor or outer-stride if they are empty. For instance, sizeof("(A-B).cwiseAbs2()") with A,B Vector4f is now 16 bytes, instead of 48 before this optimization. In theory, evaluators should be completely optimized away by the compiler, but this might help in some cases.
  • Commit 5271474b1: Remove common "noncopyable" base class from evaluator_base to get a chance to get EBO (Empty Base Optimization). Note: we should probably get rid of this class and define a macro instead.
  • Commit 1c024e558: Added some possible temporaries to .hgignore
  • Commit 316673bbd: Clean-up usage of ExpressionTraits in all/any implementation.
  • Commit 548ed30a1: Added an OpenCL regression test
  • Commit 10c6bcdc2: Add support for long indexes and for (real-valued) row-major matrices to CholmodSupport module
  • Commit f5d644b41: Make sure that HyperPlane::transform maintains a unit normal vector in the Affine case.
  • Commit 27ceb43bf: Fixed race condition in the tensor_shuffling_sycl test
  • Commit 923acadfa: Fixed compilation errors with gcc6 when compiling the AVX512 intrinsics
  • Commit 751e097c5: Use 32 registers on ARM64
  • Commit fb1d0138e: Include SSE packet instructions when compiling with avx512 enabled.
  • Commit 95b804c0f: it is now possible to change Umfpack control settings before factorizations; added access to the report functions of Umfpack
  • Commit 8c0e70150: bug #1360: fix sign issue with pmull on altivec
  • Commit fc94258e7: Fix unused warning
  • Commit 0e0d92d34: Merged in benoitsteiner/opencl (pull request PR-275)
  • Commit 9e03dfb45: Made sure EIGEN_HAS_C99_MATH is defined when compiling OpenCL code
  • Commit 70d0172f0: Merged eigen/eigen into default
  • Commit 8910442e1: Fixed memcpy, memcpyHostToDevice and memcpyDeviceToHost for Sycl.
  • Commit 54db66c5d: struct -> class in order to silence compilation warning.
  • Commit 35bae513a: Converting all parallel for lambda to functor in order to prevent kernel duplication name error; adding tensorConcatinationOp backend for sycl.
  • Commit d60cca32e: Transformation methods added to ParametrizedLine class.
  • Commit 7949849eb: refactor common row/column iteration code into its own class
  • Commit d7bc64328: add display of entries to gdb sparse matrix prettyprinter
  • Commit ff424927b: Introduce a simple pretty printer for sparse matrices (no contents)
  • Commit 5ce541863: Correct prettyprinter comment - Quaternions are in fact supported
  • Commit 8f11df266: NumTraits.h: For the values 'ReadCost, AddCost and MulCost', information about value Eigen::HugeCost
  • Commit 7d5303a08: Partly revert changeset 642dddcce29269f266d35e34d34ee83d99a7c116, just in case the x87 issue pops up again
  • Commit 2f7c2459b: Merged in benoitsteiner/opencl (pull request PR-272)
  • Commit c5e854630: Adding asynchandler to sycl queue as lack of it can cause undefined behaviour.
  • Commit 4247d35d4: Fixed bug which (extremely rarely) could end in an infinite loop
  • Commit 642dddcce: Fix nonnull-compare warning
  • Commit 1324ffef2: Reenabled the use of constexpr on OpenCL devices
  • Commit 5d00fdf0e: bug #1363: fix mingw's ABI issue
  • Commit 2c2e21847: Avoid using #define since they can conflict with user code
  • Commit 3beb180ee: Don't call EnvThread::OnCancel by default since it doesn't do anything.
  • Commit 9ff5d0f82: Merged eigen/eigen into default
  • Commit 730eb9fe1: Adding asynchronous execution as it improves the performance.
  • Commit 11b492e99: bug #1358: fix compilation for sparse += sparse.selfadjointView();
  • Commit e67397bfa: bug #1359: fix compilation of col_major_sparse.row() *= scalar (used to work in 3.2.9 though the expression is not really writable)
  • Commit 98d745827: bug #1359: fix sparse /=scalar and *=scalar implementation. InnerIterators must be obtained from an evaluator.
  • Commit 2d4a091be: Adding tensor contraction operation backend for Sycl; adding test for contractionOp sycl backend; adding temporary solution to prevent memory leak in buffer; cleaning up cxx11_tensor_buildins_sycl.h
  • Commit c817ce3ba: bug #1361: fix compilation issue in mat=perm.inverse()
  • Commit a432fc102: Moved the choice of ThreadPool to unsupported/Eigen/CXX11/ThreadPool
  • Commit 8ae68924e: Made ThreadPoolInterface::Cancel() an optional functionality
  • Commit 57acb05ee: Update and extend doc on alignment issues.
  • Commit 76fca2213: Use a more accurate timer to sleep on Linux systems.
  • Commit 4deafd35b: Introduce a portable EIGEN_SLEEP macro.
  • Commit aafa97f4d: Fixed build error with MSVC
  • Commit 2f5b7a199: Reworked the threadpool cancellation mechanism to not depend on pthread_cancel since it turns out that pthread_cancel doesn't work properly on numerous platforms.
  • Commit 3d59a4772: Added a message to ease the detection of platforms on which thread cancellation isn't supported.
  • Commit 28ee8f42b: Added a Flush method to the RunQueue
  • Commit 69ef267a7: Added the new threadpool cancel method to the threadpool interface base class.
  • Commit 7bfff8535: Added support for thread cancellation on Linux
  • Commit 6811e6cf4: Merged in srvasude/eigen/fix_cuda_exp (pull request PR-268)
  • Commit 747202d33: typo
  • Commit bb297abb9: make sure we use the right eigen version
  • Commit 8b4b00d27: fix usage of custom compiler
  • Commit 710559689: Add missing include and use -O3
  • Commit 780f3c1ad: Fix call to convert on linux
  • Commit 3855ab472: Cleanup file structure
  • Commit 59a59fa8e: Update perf monitoring scripts to generate html/svg outputs
  • Commit 769468499: Remove superfluous const's (can cause warnings on some Intel compilers) (grafted from e236d3443c79f38aa721d95e64c275abbb5df10f)
  • Commit f2c506b03: Add a script example to run and upload performance tests
  • Commit 1b4e085a7: generate png file for web upload
  • Commit f725f1ceb: Mention the CMAKE_PREFIX_PATH variable.
  • Commit f90c4aebc: Update monitored changeset lists
  • Commit eb621413c: Revert vec/y to vec*(1/y) in row-major TRSM: - div is extremely costly - this is consistent with the column-major case - this is consistent with all other BLAS implementations
  • Commit 8365c2c94: Fix BLAS backend for symmetric rank K updates.
  • Commit 0c4d05b00: Explain how to choose your favorite Eigen version
  • Commit e049a2a72: Added relocatable cmake support also for CMake before 3.0 and after 2.8.8
  • Commit e6c8b5500: Change comparisons to use Scalar instead of RealScalar.
  • Commit f7d7c33a2: Fix expm1 CUDA implementation (do not shadow exp CUDA implementation).
  • Commit 18481b518: Make CMake config file relocatable
  • Commit c68c8631e: fix compilation of BTL's blaze interface
  • Commit 1ff1d4a12: Add performance monitoring for LLT
  • Commit 09ee7f0c8: Fix small nit where I changed name of plog1p to pexpm1.
  • Commit a0d3ac760: Sync from Head.
  • Commit 218764ee1: Added support for expm1 in Eigen.
  • Commit 66f65ccc3: Ease compiler job to generate clean and efficient code in mat*vec.
  • Commit fe696022e: Operators += and -= do not resize!
  • Commit 18de92329: use numext::abs (grafted from 0a08d4c60b652d1f24b2fa062c818c4b93890c59)
  • Commit e8a6aa518: 1. Add explicit template to abs2 (resolves deduction for some arithmetic types) 2. Avoid signed-unsigned conversion in comparison (warning in case Scalar is unsigned) (grafted from 4086187e49760d4bde72750dfa20ae9451263417)
  • Commit a6b971e29: Fix memory leak in Ref<Sparse>
  • Commit 8640ffac6: Optimize SparseLU::solve for rhs vectors
  • Commit 62acd6790: remove temporary in SparseLU::solve
  • Commit 0db6d5b3f: bug #1356: fix calls to evaluator::coeffRef(0,0) to get the address of the destination by adding a dstDataPtr() member to the kernel. This fixes undefined behavior if dst is empty (nullptr).
  • Commit 91003f3b8: typo
  • Commit 445c01575: extend monitoring benchmarks with transpose matrix-vector and triangular matrix-vectors.
  • Commit e3f613cbd: Improve performance of row-major-dense-matrix * vector products for recent CPUs. This revised version does not bother about aligned loads/stores, and rather processes 8 rows at once for better instruction pipelining.
  • Commit 3abc82735: Clean debugging code
  • Commit 462c28e77: Merged in srvasude/eigen (pull request PR-265)
  • Commit 4465d2040: Add missing generic load methods.
  • Commit 6a5fe8609: Complete rewrite of column-major-matrix * vector product to deliver higher performance on modern CPUs. The previous code had been optimized for Intel core2, for which unaligned loads/stores were prohibitively expensive. This new version exhibits much higher instruction independence (better pipelining) and explicitly leverages FMA. According to my benchmark, on Haswell this new kernel is always faster than the previous one, and sometimes even twice as fast. Even higher performance could be achieved with a better blocking size heuristic and, perhaps, with explicit prefetching. We should also check triangular product/solve to optimally exploit this new kernel (working on vertical panel of 4 columns is probably not optimal anymore).
  • Commit 2bfece5cd: Merged eigen/eigen into default
  • Commit 592acc5bf: Making default numeric_list work with sycl.
  • Commit 8dfb3e00b: merge
  • Commit 4c0d5f3c0: Add perf monitoring for gemv
  • Commit d2718d662: Re-enable A^T*A action in BTL
  • Commit 22f7d398e: bug #1355: Fixed wrong line-endings on two files
  • Commit 27873008d: Clean up SparseCore module regarding ReverseInnerIterator
  • Commit 8c24723a0: typo UIntPtr (grafted from b6f04a2dd4d68fe1858524709813a5df5b9a085b)
  • Commit aeba0d865: fix two warnings (unused typedef, unused variable) and a typo (grafted from a9aa3bcf50d55b63c8adb493a06c903ec34251c6)
  • Commit 181138a1c: fix member order
  • Commit 9f297d57a: Merged in rmlarsen/eigen (pull request PR-256)
  • Commit f95e3b84a: merge
  • Commit 7ff26ddcb: Merged eigen/eigen into default
  • Commit 037b46762: Fix misleading-indentation warnings.
  • Commit 79aa2b784: Adding sycl backend for TensorPadding.h; disabling __unit128 for sycl in TensorIntDiv.h; disabling cashsize for sycl in tensorDeviceDefault.h; adding sycl backend for StrideSliceOP; removing sycl compiler warning for creating an array of size 0 in CXX11Meta.h; cleaning up the sycl backend code.
  • Commit a70393fd0: Cleaned up forward declarations
  • Commit e073de96d: Moved the MemCopyFunctor back to TensorSyclDevice since it's the only caller and it makes TensorFlow compile again
  • Commit fca27350e: Added the deallocate_all() method back
  • Commit e633a8371: Simplified includes
  • Commit 7cd33df4c: Improved formatting
  • Commit fd1dc3363: Merged eigen/eigen into default
  • Commit f5107010e: Updated the Sizes class to work on AMD gpus without requiring a separate implementation
  • Commit e37c2c52d: Added an implementation of numeric_list that works with sycl
  • Commit 8df272af8: Fix selection of product implementation for dynamic size matrices with fixed max size.
  • Commit faa2ff99c: Pulled latest update from trunk
  • Commit df3da0780: Updated customIndices2Array to handle various index sizes.
  • Commit c927af60e: Fix a performance regression in (mat*mat)*vec for which mat*mat was evaluated multiple times.
  • Commit 26fff1c5b: Added EIGEN_STRONG_INLINE to get_sycl_supported_device().
  • Commit ab4ef5e66: bug #1351: fix compilation of random with old compilers
  • Commit 5e3c5c42f: cmake: remove architecture dependency from Eigen3ConfigVersion.cmake
  • Commit 3440b46e2: doc: mention the NO_MODULE option and target availability (grafted from 65f09be8d2aaeda054cce574ea14a74b00507011)
  • Commit a0329f64f: Add a default constructor for the "fake" __half class when not using the __half class provided by CUDA.
  • Commit 577ce7808: Adding TensorShuffling backend for sycl; adding TensorReshaping backend for sycl; cleaning up the sycl backend.
  • Commit 3011dc94e: Call internal::array_prod to compute the total size of the tensor.
  • Commit 02080e2b6: Merged eigen/eigen into default
  • Commit 9fd081cdd: Fixed compilation warnings
  • Commit 9f8fbd943: Merged eigen/eigen into default
  • Commit 67b2c41f3: Avoided unnecessary type conversion
  • Commit 7fe704596: Added missing array_get method for numeric_list
  • Commit 7318daf88: Fixing LLVM error on TensorMorphingSycl.h on GPU; fixing int64_t crash for tensor_broadcast_sycl on GPU; adding get_sycl_supported_devices() on syclDevice.h.
  • Commit 7ad37606d: Fixed the documentation of Scalar Tensors
  • Commit 3be1afca1: Disabled the "remove the call to 'std::abs' since unsigned values cannot be negative" warning introduced in clang 3.5
  • Commit 308961c05: Fix compilation.
  • Commit 21d0286d8: bug #1348: Document EIGEN_MAX_ALIGN_BYTES and EIGEN_MAX_STATIC_ALIGN_BYTES, and reflect in the doc that EIGEN_DONT_ALIGN* are deprecated.
  • Commit b8cc5635d: Removing unsupported device from test case; cleaning the tensor device sycl.
  • Commit 7f6333c32: Merged in tal500/eigen-eulerangles (pull request PR-237)
  • Commit f12b36841: Extend polynomial solver unit tests to complexes
  • Commit 56e5ec07c: Automatically switch between EigenSolver and ComplexEigenSolver, and fix a few Real versus Scalar issues.
  • Commit 924658712: Patch from Oleg Shirokobrod to extend polynomial solver to complexes
  • Commit e340866c8: Fix compilation with gcc and old ABI version
  • Commit a91de27e9: Fix compilation issue with MSVC: MSVC always messes up with shadowed template arguments, for instance in: struct B { typedef float T; }; template<typename T> struct A : B { T g; }; The type of A<double>::g will be float and not double.
  • Commit 74637fa4e: Optimize predux<Packet8f> (AVX)
  • Commit 178c08485: Disable usage of SSE3 _mm_hadd_ps that is extremely slow.
  • Commit 7dd894e40: Optimize predux<Packet4d> (AVX)
  • Commit f3fb0a194: Disable usage of SSE3 haddpd that is extremely slow.
  • Commit 5c516e4e0: cmake: added Eigen3::Eigen imported target (grafted from a287140f7292b9c15719bc6a3a4494ac7874e3cd)
  • Commit 6a84246a6: Fix regression in assignment of sparse block to sparse block.
  • Commit f11da1d83: Made the QueueInterface thread safe
  • Commit ed839c585: Enable the use of constant expressions with clang >= 3.6
  • Commit 6d781e3e5: Merged eigen/eigen into default
  • Commit 79a07b891: Fixed a typo
  • Commit 465ede0f2: Fix compilation issue in mat = permutation (regression introduced in 8193ffb3d38b56c9295f204dc57dc6bac74f58aa)
  • Commit 81151bd47: Fixed merge conflicts
  • Commit 9265ca707: Made it possible to check the state of a sycl device without synchronization
  • Commit 2d1aec15a: Added missing include
  • Commit af67335e0: Added test for cwiseMin, cwiseMax and operator%.
  • Commit 1bdf1b9ce: Merged in benoitsteiner/opencl (pull request PR-253)
  • Commit a357fe1fb: Code cleanup
  • Commit 1c6eafb46: Updated cxx11_tensor_device_sycl to run only on the OpenCL devices available on the host
  • Commit ca754caa2: Only runs the cxx11_tensor_reduction_sycl on devices that are available.
  • Commit dc601d79d: Added the ability to run test exclusively OpenCL devices that are listed by sycl::device::get_devices().
  • Commit 8649e16c2: Enable EIGEN_HAS_C99_MATH when building with the latest version of Visual Studio
  • Commit 110b7f8d9: Deleted unnecessary semicolons
  • Commit b5e3285e1: Test broadcasting on OpenCL devices with 64 bit indexing
  • Commit 164414c56: Merged in ChunW/eigen (pull request PR-252)
  • Commit 37c2c516a: Cleaned up the sycl device code
  • Commit 7335c4920: Fixed the cxx11_tensor_device_sycl test
  • Commit 15e226d7d: adding Benoit changes on the TensorDeviceSycl.h
  • Commit 622805a0c: Modifying TensorDeviceSycl.h to always create buffers of type uint8_t and convert them to the actual type at execution on the device; adding the queue interface class to separate the lifespan of the sycl queue and the buffers created for that queue from Eigen::SyclDevice; modifying sycl tests to support the evaluation of the results for both row major and column major data layout on all different devices that are supported by Sycl{CPU; GPU; and Host}.
  • Commit 5159675c3: Added isnan, isfinite and isinf for SYCL device. Plus test for that.
  • Commit 76b2a3e6e: Allow to construct EulerAngles from 3D vector directly. Using assignment template struct to distinguish between 3D vector and 3D rotation matrix.
  • Commit 927bd62d2: Now testing out (+=, =) in.FUNC() and out (+=, =) out.FUNC()
  • Commit 8193ffb3d: bug #1343: fix compilation regression in mat+=selfadjoint_view. Generic EigenBase2EigenBase assignment was incomplete.
  • Commit cebff7e3a: bug #1343: fix compilation regression in array = matrix_product
  • Commit 7c30078b9: Merged eigen/eigen into default
  • Commit 553f50b24: Added a way to detect errors generated by the opencl device from the host
  • Commit 72a45d32e: Cleanup
  • Commit 4349fc640: Created a test to check that the sycl runtime can successfully report errors (like division by 0). Small cleanup
  • Commit a6a3fd070: Made TensorDeviceCuda.h compile on windows
  • Commit 0d0948c3b: Workaround for error in VS2012 with /clr
  • Commit 004344cf5: Avoid calling log(0) or 1/0
  • Commit a1d5c503f: replace sizeof(Packet) with PacketSize else it breaks for ZVector.Packet4f
  • Commit 672aa97d4: implement float/std::complex<float> for ZVector as well, minor fixes to ZVector
  • Commit 8290e21fb: replace sizeof(Packet) with PacketSize else it breaks for ZVector.Packet4f
  • Commit 7878756de: Fixed existing test.
  • Commit c5130dedb: Specialised basic math functions for SYCL device.
  • Commit f2e8b7325: Enable the use of AVX512 instruction by default
  • Commit 7b09e4dd8: bump default branch to 3.3.90
  • Commit dff9a049c: Optimized the computation of exp, sqrt, ceil and floor for fp16 on Pascal GPUs
  • Commit b5c75351e: Merged eigen/eigen into default
  • Commit 32df1b104: Reduce dispatch overhead in parallelFor by only calling thread_pool.Schedule() for one of the two recursive calls in handleRange. This avoids going through the schedule path to push both recursive calls onto another thread-queue in the binary tree, but instead executes one of them on the main thread. At the leaf level this will still activate a full complement of threads, but will save up to 50% of the overhead in Schedule (random number generation, insertion in queue which includes signaling via atomics).
  • Commit 05e8c2a1d: Adding extra test for non-fixed size to broadcast; Replacing stcl with sycl.
  • Commit f8ca89397: Adding TensorFixsize; adding sycl device memcpy; adding initial stage of slicing.
  • Commit 0ee92aa38: Optimize sparse<bool> && sparse<bool> to use the same path as for coeff-wise products.
  • Commit 2e334f5da: bug #426: move operator && and || to MatrixBase and SparseMatrixBase.
  • Commit a048aba14: Merged in olesalscheider/eigen (pull request PR-248)
  • Commit eedb87f4b: Fix regression in SparseMatrix::ReverseInnerIterator
  • Commit 51fef8740: Make sure not to call numext::maxi on expression templates
  • Commit a5c3f1568: Adding comment to TensorDeviceSycl.h and cleaning the code.
  • Commit f4722aa47: Merged in benoitsteiner/opencl (pull request PR-247)
  • Commit 3be396302: Adding EIGEN_STRONG_INLINE back; using size() instead of dimensions.TotalSize() on Tensor.
  • Commit 12387abad: adding the missing in eigen_assert!
  • Commit 2e704d425: Adding Memset; optimising MecopyDeviceToHost by removing double copying;
  • Commit 75c080b17: Added a test to validate memory transfers between host and sycl device
  • Commit 15eca2432: Euler tests: Tighter precision when no roll exists and clean code.
  • Commit 6f4f12d1e: Add isApprox() and cast() functions.
  • Commit 7402cfd4c: Add safety for near pole cases and test them better.
  • Commit 58f5d7d05: Fix calc bug, docs and better testing.
  • Commit 078a20262: Merge Hongkai Dai correct range calculation, and remove ranges from API. Docs updated.
  • Commit 014d9f1d9: implement euler angles with the right ranges
  • Commit e19b58e67: alias template for matrix and array classes
  • Commit 15f273b63: fix reshape flag and test case
  • Commit b64a09acc: fix reshape's Max[Row/Col]AtCompileTime
  • Commit f8ad87f22: Reshape always non-directly-access
  • Commit 515bbf8bb: Improve reshape test case
  • Commit 009047db2: Fix Reshape traits flag calculate bug
  • Commit 2b8908090: Remove reshape InnerPanel, add test, fix bug
  • Commit 03723abda: Remove useless reshape row/col ctor
  • Commit 342c8e532: Fix Reshape DirectAccessBit bug
  • Commit 24e1c0f2a: add reshape test for const and static-size matrix
  • Commit 150796337: Add unit-test for reshape
  • Commit 497a7b0ce: remove c++11, make c++03 compatible
  • Commit 9c832fad6: add reshape() snippets
  • Commit 1e1d0c15b: add example code for Reshape class
  • Commit fe2ad0647: reshape now supported
  • Commit 7bd58ad0b: add Eigen/src/Core/Reshape.h