Use Kokkos:Parallel in SmootherTake and Prepare COO/CSR Matrix#239

Open
julianlitz wants to merge 18 commits into main from litz_007

@julianlitz
Collaborator

Merge Request - Guideline Checklist

Guidelines for checking the code before resolving the WIP status and before approval, respectively.
As many checkboxes as possible should be ticked.

Checks by code author:

Always to be checked:

  • There is at least one issue associated with the pull request.
  • New code adheres to the coding guidelines.
  • No large data files have been added to the repository. Maximum file size should be on the order of KB, not MB. In particular, avoid adding PDF, Word, or other files that cannot be change-tracked correctly by git.

If functions were changed or functionality was added:

  • Tests for new functionality have been added.
  • A local test was successful.

If new functionality was added:

  • There is appropriate documentation of your work. (use doxygen style comments)

If new third party software is used:

  • Did you pay attention to its license? Please remember to add it to the wiki after successful merging.

If new mathematical methods or epidemiological terms are used:

  • Are new methods referenced? Did you provide further documentation?

Checks by code reviewer(s):

  • Is the code clean of development artifacts e.g., unnecessary comments, prints, ...
  • The ticket goals for each associated issue are reached or problems are clearly addressed (i.e., a new issue was introduced).
  • There are appropriate unit tests and they pass.
  • The git history is clean and linearized for the merge request. All reviewers should squash commits and write a simple and meaningful commit message.
  • Coverage report for new code is acceptable.
  • No large data files have been added to the repository. Maximum file size should be on the order of KB, not MB. In particular, avoid adding PDF, Word, or other files that cannot be change-tracked correctly by git.

@codecov

codecov Bot commented May 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.80%. Comparing base (5952563) to head (ae5c560).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #239      +/-   ##
==========================================
+ Coverage   90.72%   90.80%   +0.08%     
==========================================
  Files          86       86              
  Lines        9473     9556      +83     
==========================================
+ Hits         8594     8677      +83     
  Misses        879      879              

@julianlitz julianlitz changed the title from "Use Kokkos:Parallel in SmootherTake and Prepare COO/CSR" to "Use Kokkos:Parallel in SmootherTake and Prepare COO/CSR Matrix" May 2, 2026
@julianlitz
Collaborator Author

julianlitz commented May 4, 2026

@EmilyBourne If you think that *innerBoundaryMatrix_ptr doesn't work on GPU, then we probably need to delete my custom copy constructor and copy assignment for the COO/CSR matrices.
Otherwise the copy constructor makes a true (deep) copy, and after the Kokkos kernel all values in the original remain untouched, since the kernel only operated on that copy.
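
The effect described in this comment can be reproduced in plain C++ without Kokkos. The sketch below is only an illustration under stated assumptions: `SharedBuffer`, `DeepCopyMatrix`, `ShallowCopyMatrix`, and `first_value_after_kernel` are hypothetical names, and a `std::shared_ptr` stands in for a reference-counted view. It is not the project's COO/CSR matrix code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a reference-counted view: copies share one buffer.
using SharedBuffer = std::shared_ptr<std::vector<double>>;

// Matrix whose custom copy constructor duplicates the data (a "true" copy).
struct DeepCopyMatrix {
    SharedBuffer values;
    explicit DeepCopyMatrix(std::size_t n)
        : values(std::make_shared<std::vector<double>>(n, 0.0)) {}
    DeepCopyMatrix(const DeepCopyMatrix& other)
        : values(std::make_shared<std::vector<double>>(*other.values)) {}
};

// Matrix that relies on the implicit copy constructor (a shallow copy).
struct ShallowCopyMatrix {
    SharedBuffer values;
    explicit ShallowCopyMatrix(std::size_t n)
        : values(std::make_shared<std::vector<double>>(n, 0.0)) {}
};

// Mimics a by-value lambda capture: the "kernel" writes into its own copy
// of the matrix, then the first value of the original is returned.
template <typename Matrix>
double first_value_after_kernel(const Matrix& original) {
    auto kernel = [original]() { (*original.values)[0] = 42.0; };
    kernel();
    return (*original.values)[0];
}

// Returns true when both copy styles behave as described in the comment.
bool copies_behave_as_described() {
    const bool deep_untouched = first_value_after_kernel(DeepCopyMatrix(4)) == 0.0;
    const bool shallow_updated = first_value_after_kernel(ShallowCopyMatrix(4)) == 42.0;
    assert(deep_untouched && shallow_updated);
    return deep_untouched && shallow_updated;
}
```

With the deep-copying constructor the write lands in the captured duplicate and the original stays at zero; with the shallow copy the write is visible through the original handle.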

@EmilyBourne
Collaborator

> @EmilyBourne If you think that *innerBoundaryMatrix_ptr doesn't work on GPU, then we probably need to delete my custom copy constructor and copy assignment for the COO/CSR matrices. Otherwise the copy constructor makes a true (deep) copy, and after the Kokkos kernel all values in the original remain untouched, since the kernel only operated on that copy.

I'm already working on this. I renamed it to copy, so you can still make a full copy if you need one.

@julianlitz
Collaborator Author

I think it's better to mark the copy constructor as = delete so we don't mess things up.

@EmilyBourne
Collaborator

> I think it's better to mark the copy constructor as = delete so we don't mess things up.

That was my first step. But it is impossible to delete the copy constructor and still have things working on GPU, as it is used to copy data to the GPU.

@julianlitz
Collaborator Author

Ah yes, you are right.

@julianlitz
Collaborator Author

julianlitz commented May 5, 2026

@EmilyBourne Claude suggested that the issue was the use of the member Vector instead of AllocatableVector.

I think that in

  • LevelCache
  • PolarGrid
  • COO Matrix
  • CSR Matrix
  • Tridiagonal Matrix

we need to use AllocatableVector as the private member.

"Why AllocatableVector fixes the by-value capture

Kokkos View objects are designed with reference semantics: copying a View does not copy the underlying data, it merely increments a reference count and shares the same data pointer. This is the behaviour you want when capturing a BatchedTridiagonalSolver by value in a Kokkos lambda.

The problem was the Vector<T> type alias:

template <typename T>
using Vector = Kokkos::View<T*, Kokkos::LayoutRight, Kokkos::HostSpace> const;

The const here applies to the View handle itself, not the data it points to. A const member variable cannot be copy-assigned, which causes the compiler-generated copy constructor of BatchedTridiagonalSolver to be deleted. Without a valid copy constructor, passing the solver by value into a lambda no longer invokes Kokkos's shallow-copy mechanism — the object simply cannot be copied correctly.

Switching the private members to AllocatableVector<T>:

template <typename T>
using AllocatableVector = Kokkos::View<T*, Kokkos::LayoutRight, Kokkos::HostSpace>;

removes the const from the handle, allowing the compiler-generated copy constructor to invoke the View copy constructor on each member. This gives you exactly the intended shallow-copy behaviour: passing BatchedTridiagonalSolver by value into a lambda increments the reference count and shares the underlying data, with no allocation or data movement."
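
The claim about const members in the quoted explanation can be checked against standard C++ with type traits. The sketch below is a host-only check under stated assumptions: `Handle`, `SolverWithConstHandle`, and `SolverWithMutableHandle` are hypothetical names, and a `std::shared_ptr` stands in for a `Kokkos::View` handle. In standard C++ a const member deletes the implicit copy assignment operator but leaves the implicit copy constructor available; whether a device compiler adds further restrictions is not covered here.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a Kokkos::View handle: copying shares the buffer.
using Handle = std::shared_ptr<std::vector<double>>;

struct SolverWithConstHandle {
    const Handle diagonal;  // const applies to the handle, not to the data
};

struct SolverWithMutableHandle {
    Handle diagonal;
};

// A const member removes copy assignment, but copy construction remains.
static_assert(std::is_copy_constructible_v<SolverWithConstHandle>);
static_assert(!std::is_copy_assignable_v<SolverWithConstHandle>);
static_assert(std::is_copy_constructible_v<SolverWithMutableHandle>);
static_assert(std::is_copy_assignable_v<SolverWithMutableHandle>);

// Copies the const-handle solver and reports how many handles share the data.
long owners_after_copy() {
    SolverWithConstHandle original{std::make_shared<std::vector<double>>(3, 1.0)};
    SolverWithConstHandle copy = original;   // shallow copy of the handle
    (*copy.diagonal)[0] = 5.0;               // write through the copy
    assert((*original.diagonal)[0] == 5.0);  // visible through the original
    return original.diagonal.use_count();
}
```

On the host, then, both variants are copyable by value and both copies are shallow; the two aliases differ only in whether the solver can be assigned to after construction.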

@julianlitz julianlitz mentioned this pull request May 7, 2026
14 tasks
@EmilyBourne
Collaborator

> Claude suggested that the issue was the use of the member Vector instead of AllocatableVector.

I disagree with Claude.
The copy-constructor does not exist in BatchedTridiagonalSolver because once one constructor is explicitly defined there are no default constructors. Adding:

KOKKOS_DEFAULTED_FUNCTION BatchedTridiagonalSolver(const BatchedTridiagonalSolver&) = default;

should be sufficient to create a BatchedTridiagonalSolver containing a shallow copy of the vectors.
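
A host-only sketch of what the explicitly defaulted copy constructor provides is given below. It assumes a hypothetical `SolverSketch` class with a `std::shared_ptr` in place of the Kokkos vectors, and it leaves out `KOKKOS_DEFAULTED_FUNCTION` so that it compiles without Kokkos. It only illustrates the memberwise shallow copy and says nothing about device compilation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a Kokkos vector handle.
using Handle = std::shared_ptr<std::vector<double>>;

class SolverSketch {
public:
    // User-provided constructor: the implicit default constructor is gone.
    explicit SolverSketch(std::size_t n)
        : main_diagonal_(std::make_shared<std::vector<double>>(n, 2.0)) {}

    // Explicitly defaulted copy constructor: memberwise (shallow) copy.
    // In the PR's suggestion this line also carries KOKKOS_DEFAULTED_FUNCTION.
    SolverSketch(const SolverSketch&) = default;

    double& at(std::size_t i) const { return (*main_diagonal_)[i]; }
    long owners() const { return main_diagonal_.use_count(); }

private:
    const Handle main_diagonal_;
};

static_assert(!std::is_default_constructible_v<SolverSketch>);
static_assert(std::is_copy_constructible_v<SolverSketch>);

// Writes through a copy and reads the value back through the original.
double write_through_copy() {
    SolverSketch solver(4);
    SolverSketch captured = solver;  // what a by-value capture would do
    assert(solver.owners() == 2);    // both objects share one buffer
    captured.at(1) = 7.0;
    return solver.at(1);
}
```

The copy shares storage with the original, so a kernel that writes through the captured solver updates the data the caller still holds.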

Collaborator

@EmilyBourne EmilyBourne left a comment

I've just noticed something that I missed in #162. You are using KOKKOS_LAMBDA in non-static class functions. This will not work on GPU.

See https://kokkos.org/kokkos-core-wiki/API/core/macros-special/host_device_macros.html#kokkos-lambda

When creating lambdas inside of class member functions you may need to use KOKKOS_CLASS_LAMBDA instead.

The copy-constructor also needs to be marked with KOKKOS_DEFAULTED_FUNCTION
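
According to the linked Kokkos documentation, KOKKOS_LAMBDA captures with `[=]`, which inside a member function captures the `this` pointer, while KOKKOS_CLASS_LAMBDA captures with `[=, *this]`, which copies the object into the lambda. The difference can be seen in plain C++17 without Kokkos. The sketch below uses a hypothetical `Counter` class and the explicit captures `[this]` and `[*this]` to make the two behaviours visible on the host. On a GPU the pointer variant would refer to host memory, which is the problem raised in this review comment.

```cpp
#include <cassert>
#include <functional>

class Counter {
public:
    explicit Counter(int start) : value_(start) {}
    void set(int v) { value_ = v; }

    // Captures the this pointer: the lambda reads the live object.
    std::function<int()> reader_via_this() const {
        return [this]() { return value_; };
    }

    // Captures a copy of *this (C++17): the lambda owns a snapshot.
    std::function<int()> reader_via_copy() const {
        return [*this]() { return value_; };
    }

private:
    int value_;
};

// Updates the object after both lambdas exist and compares what they see.
int difference_after_update() {
    Counter c(1);
    auto live = c.reader_via_this();
    auto snapshot = c.reader_via_copy();
    c.set(10);
    assert(live() == 10);     // follows the pointer to the updated object
    assert(snapshot() == 1);  // still holds the value copied at capture time
    return live() - snapshot();
}
```

Because the `*this` capture copies the whole object, the class must be copyable and its copy should be cheap, which ties this point to the shallow-copy discussion earlier in the thread.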
