Sparse Solvers ============== After assembly and boundary-condition condensation you have a sparse system — linear or nonlinear — to solve. TensorMesh delegates this work to `torch-sla `_, a standalone differentiable sparse-linear-algebra library. The same FEM code retargets between CPU and GPU, and between iterative and direct solvers, by changing one keyword argument; nonlinear systems are driven by the same backend through a Newton / Picard / Anderson interface. Install ``torch-sla`` --------------------- .. important:: ``torch-sla`` is a **hard, import-time** dependency of :mod:`tensormesh.sparse`. The module will not import without it. It is also the library where all current and future solver work lands — TensorMesh tracks ``torch-sla`` releases as the canonical sparse-linear-algebra layer and we recommend keeping it up to date. The base wheel ships the CPU stack (SciPy / native PyTorch). GPU backends are opt-in extras — you can install one or both, depending on which CUDA solver you want: .. code-block:: bash pip install torch-sla # CPU stack only pip install "torch-sla[cupy]" # + CuPy backend (iterative + SuperLU) pip install "torch-sla[cudss]" # + cuDSS backend (fastest GPU direct) pip install "torch-sla[all]" # both GPU backends + dev tooling TensorMesh also exposes the GPU extras at its own install layer (``pip install "tensormesh-fem[cupy]"`` etc.) — see :doc:`/getting_started/installation`. Inspect available backends ~~~~~~~~~~~~~~~~~~~~~~~~~~ After installing, ``torch_sla.show_backends()`` prints a status table showing which backends are usable on the current machine and how to install any that are missing: .. code-block:: python >>> import torch_sla >>> torch_sla.show_backends() torch-sla backend status (CUDA: available) scipy [CPU] available eigen [CPU] not available — JIT-compiled C++ extension (requires a C++ compiler) pytorch [CPU/CUDA] available cupy [CUDA] not available — pip install torch-sla[cupy] cudss [CUDA] not available — pip install torch-sla[cudss] What ``torch-sla`` gives TensorMesh: * :class:`~tensormesh.sparse.SparseTensor` — the COO data type that :class:`~tensormesh.sparse.SparseMatrix` extends. Element assembly hands you a :class:`~tensormesh.sparse.SparseMatrix` directly, so every operator method on the parent class (``@``, ``.solve``, ``.nonlinear_solve``, ``.is_symmetric``, …) is available without any extra wrapping. * A unified ``solve`` op with a custom backward pass (an adjoint sparse solve), so gradients flow through every linear system. * A pluggable backend layer covering CPU, GPU, iterative and direct solvers behind a single dispatch entry point. Solving ``A x = b``: the ``.solve()`` method -------------------------------------------- The canonical entry point is ``SparseMatrix.solve`` — inherited unchanged from ``torch_sla.SparseTensor``. Assembly already returns a :class:`~tensormesh.sparse.SparseMatrix`, so this is the only call you normally need: .. code-block:: python K = LaplaceElementAssembler.from_mesh(mesh)() x = K.solve(b) # auto backend + method x = K.solve(b, backend="cudss") # force NVIDIA GPU direct x = K.solve(b, method="bicgstab") # force a specific iterative method **Symmetry / positive-definiteness is auto-detected.** On every call, ``torch-sla`` computes ``is_symmetric()`` and ``is_positive_definite()`` on the matrix, then picks CG / Cholesky for SPD systems and BiCGStab / LU otherwise. You do **not** pass an ``is_spd`` hint; passing one would be ignored. Key keyword arguments (full list in the ``torch-sla`` reference): * ``backend``: ``"auto"`` (CPU → SciPy, CUDA → cuDSS when available, else CuPy / native PyTorch), or one of ``"scipy"``, ``"eigen"``, ``"pytorch"``, ``"cupy"``, ``"cudss"``. * ``method``: ``"auto"`` (chosen from matrix properties + backend), or an iterative method (``"cg"``, ``"bicgstab"``, ``"minres"``, ``"gmres"``, ``"lgmres"``) or a direct factorization (``"lu"``, ``"umfpack"``, ``"cholesky"``, ``"ldlt"``). Not every backend supports every method — see the table below. * ``atol`` (default ``1e-10``), ``tol`` (default ``1e-12``), ``maxiter`` (default ``10000``) for iterative convergence. * ``verbose=True`` prints a one-line summary of the auto-selected backend / method and the detected matrix properties — handy when you want to confirm that ``"auto"`` picked the path you expected: .. code-block:: pycon >>> x = K.solve(b, verbose=True) [torch-sla] solve: n=1024, nnz=7168, dtype=float64, device=cpu, symmetric=True, spd=True, backend=scipy, method=lu Supported backends ------------------ .. list-table:: :header-rows: 1 :widths: 14 12 12 16 46 * - Backend - String - Device - Direct / Iter - Methods * - SciPy - ``"scipy"`` - CPU - both - ``lu``, ``umfpack``, ``cg``, ``bicgstab``, ``gmres``, ``lgmres``, ``minres``, ``qmr``. Default on CPU. * - Eigen - ``"eigen"`` - CPU - iterative - ``cg``, ``bicgstab``. C++ Eigen via pybind11. * - Native PyTorch - ``"pytorch"`` - CPU / GPU - iterative - ``cg``, ``bicgstab`` with Jacobi preconditioning. Pure torch — fully autograd-traced. * - CuPy - ``"cupy"`` - GPU - both - ``lu``, ``cg``, ``cgs``, ``gmres``, ``minres``, ``lsqr``, ``lsmr``. * - cuDSS - ``"cudss"`` - GPU - direct - ``lu``, ``cholesky``, ``ldlt``. NVIDIA cuDSS — fastest GPU path; default on CUDA when memory allows. Batched right-hand sides ------------------------ When ``b`` has shape ``[n_dof, n_batch]``, ``torch-sla`` automatically amortizes the factorization across the batch — one factor pass, ``n_batch`` back-substitutions — instead of running an iterative solver independently per column. This is the workhorse of the :mod:`tensormesh.dataset` ML workflow. .. code-block:: python B = torch.randn(K.shape[0], 64) # 64 right-hand sides X = K.solve(B) # one factorization See :doc:`batched_workflows` for the broader story on batching. Nonlinear systems: ``.nonlinear_solve()`` ----------------------------------------- For problems where the residual ``F(u, params) = 0`` is nonlinear in ``u`` — hyperelasticity, plasticity, phase-field — use ``SparseMatrix.nonlinear_solve`` (inherited from ``torch_sla.SparseTensor``). The matrix is passed automatically into the residual closure, the Jacobian is obtained through autograd, and the backward pass uses the adjoint method. .. code-block:: python from torch_sla import SparseTensor # SparseMatrix also works # Nonlinear PDE: A @ u + u**2 = f def residual(u, A, f): return A @ u + u**2 - f f = torch.randn(n, requires_grad=True) u0 = torch.zeros(n) u = A.nonlinear_solve(residual, u0, f, method='newton') # Gradients flow via the implicit-function theorem loss = u.sum() loss.backward() print(f.grad) # dL/df Key keyword arguments: * ``method``: ``"newton"`` (default, Newton-Raphson with Armijo line search), ``"picard"`` (fixed-point), or ``"anderson"`` (Anderson acceleration). * ``line_search`` (default ``True``) — toggle Armijo backtracking. * ``tol`` / ``atol`` / ``max_iter`` for the outer nonlinear loop. * ``linear_solver`` / ``linear_method`` to pick the backend used for each inner linear solve. The forward pass runs the chosen nonlinear iteration; the backward pass solves a single adjoint system, so gradients of ``u`` w.r.t. any ``requires_grad`` parameter (including ``A.values``) cost roughly one extra linear solve regardless of how many nonlinear iterations the forward took. Lower-level entry points (legacy) --------------------------------- .. deprecated:: 0.x :func:`tensormesh.sparse.spsolve` and :func:`tensormesh.sparse.nonlinear_solve` are TensorMesh-internal fallbacks that pre-date the ``torch-sla`` integration. They are **scheduled for removal**; the canonical solver path is the :class:`~tensormesh.sparse.SparseMatrix` (i.e. ``torch_sla.SparseTensor``) methods above. :func:`tensormesh.sparse.spsolve` is a free function that operates on raw COO arrays ``(edata, row, col, shape, b)`` instead of a :class:`~tensormesh.sparse.SparseMatrix` object. In practice this is rarely useful — assembly produces a :class:`~tensormesh.sparse.SparseMatrix`, condensation produces a :class:`~tensormesh.sparse.SparseMatrix`, and ``K.solve(b)`` covers every FEM workflow in this guide. ``tensormesh.sparse.spsolve`` will be retired once the remaining call sites have moved over to ``SparseMatrix.solve`` / ``torch_sla.spsolve``. :func:`tensormesh.sparse.nonlinear_solve` is the in-tree Newton-Raphson helper. It requires the user to supply an explicit Jacobian closure and lacks line search, Picard / Anderson modes, and the autograd-based Jacobian assembly that ``torch_sla.SparseTensor.nonlinear_solve`` provides. Migrate to ``A.nonlinear_solve(residual, u0, *params)``. What's next ----------- * :doc:`time_integration` — wrap a linear solve in a time-stepping loop for transient problems. * :doc:`batched_workflows` — three different axes of batching, and when each one applies. * :doc:`differentiability` — how the adjoint backward works and how to wire a learnable parameter through a sparse solve.