tensormesh.distributed
Mesh partitioning and parallel assembly across multiple devices,
with integration into torch-sla’s distributed sparse solver. See
Distributed FEM for a worked walkthrough.
DistributedMesh
- class DistributedMesh(mesh: Mesh, num_partitions: int | None = None, method: str = 'coordinate', devices: List[device] | None = None)
Bases: object
Partitioned mesh for multi-GPU parallel assembly.
Wraps partition_mesh to split a global mesh into submeshes, each assigned to a separate device. Each submesh stores an orig_nid mapping (local node index → global node index) in point_data.
- Parameters:
mesh (Mesh) – Global mesh (typically on CPU).
num_partitions (int, optional) – Number of partitions. Defaults to torch.cuda.device_count(), or 2 if no CUDA devices are available.
method (str, optional) – Partitioning method: 'coordinate' (default, fast RCB), 'spectral', or 'metis'.
devices (list of device, optional) – Devices to assign partitions to. Defaults to cuda:0, cuda:1, ... or cpu if CUDA is unavailable.
Examples
>>> import tensormesh as tm
>>> from tensormesh.distributed import DistributedMesh
>>> mesh = tm.Mesh.gen_rectangle(chara_length=0.05)
>>> dmesh = DistributedMesh(mesh, num_partitions=4)
>>> print(dmesh.num_partitions, dmesh.n_global_points)
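The 'coordinate' method is described above as a fast RCB (recursive coordinate bisection), and each submesh carries an orig_nid mapping from local to global node indices. The following is a minimal pure-Python sketch of both ideas, not tensormesh's implementation: rcb_partition and build_orig_nid are hypothetical helper names, the mesh is a hand-built strip of four quads, and the sketch assumes there are at least as many elements as partitions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def rcb_partition(centroids: Sequence[Point], ids: List[int], num_parts: int) -> List[List[int]]:
    """Recursively bisect element ids along the axis with the largest extent."""
    if num_parts == 1:
        return [sorted(ids)]
    # Measure the spread of the remaining centroids along x and y.
    extents = [
        max(centroids[i][ax] for i in ids) - min(centroids[i][ax] for i in ids)
        for ax in range(2)
    ]
    axis = extents.index(max(extents))
    ordered = sorted(ids, key=lambda i: centroids[i][axis])
    # Split proportionally so odd partition counts stay roughly balanced.
    left_parts = num_parts // 2
    cut = len(ordered) * left_parts // num_parts
    return (
        rcb_partition(centroids, ordered[:cut], left_parts)
        + rcb_partition(centroids, ordered[cut:], num_parts - left_parts)
    )


def build_orig_nid(elements: Sequence[Sequence[int]], part: Sequence[int]) -> List[int]:
    """Return the local -> global node index list for one partition."""
    return sorted({node for e in part for node in elements[e]})


# A 4x1 strip of quads: nodes 0..4 on the bottom row, 5..9 on the top row.
elements = [(i, i + 1, i + 6, i + 5) for i in range(4)]
centroids = [(i + 0.5, 0.5) for i in range(4)]

parts = rcb_partition(centroids, list(range(4)), num_parts=2)
orig_nids = [build_orig_nid(elements, p) for p in parts]

print(parts)      # [[0, 1], [2, 3]]
print(orig_nids)  # [[0, 1, 2, 5, 6, 7], [2, 3, 4, 7, 8, 9]]
```

Global nodes 2 and 7 appear in both lists: they sit on the interface between the two partitions, which is why contributions from different submeshes have to be combined after assembly.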
Distributed assembly
- distributed_element_assemble(assembler_cls: Type[ElementAssembler], dmesh: DistributedMesh, quadrature_order: int = 2, project: str = 'reduce', call_kwargs: dict | None = None, **assembler_kwargs) -> DSparseTensor
Assemble element matrix in parallel across multiple devices.
Assemblers are created sequentially (for CUDA thread-safety), then assembly computation runs in parallel threads on separate GPUs.
- Parameters:
assembler_cls (Type[ElementAssembler]) – The assembler class (e.g., LaplaceElementAssembler).
dmesh (DistributedMesh) – Partitioned mesh with device assignments.
quadrature_order (int, optional) – Quadrature order for integration. Default: 2.
project (str, optional) – Projection method: 'reduce' or 'sparse'. Default: 'reduce'.
call_kwargs (dict, optional) – Extra keyword arguments passed to assembler.__call__() (e.g., point_data, scalar_data).
**assembler_kwargs – Extra keyword arguments passed to assembler_cls.from_mesh().
- Returns:
Distributed sparse matrix ready for distributed solve.
- Return type:
DSparseTensor
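The "create sequentially, compute in parallel threads" pattern described for distributed_element_assemble can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's code: ToyAssembler is a hypothetical stand-in for a per-partition assembler, and no GPU work is performed.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


class ToyAssembler:
    """Hypothetical stand-in for an assembler bound to one submesh."""

    def __init__(self, part_id: int, n_elements: int) -> None:
        self.part_id = part_id
        self.n_elements = n_elements

    def __call__(self) -> List[float]:
        # Placeholder for the per-partition assembly kernel.
        return [float(self.part_id)] * self.n_elements


partition_sizes = [3, 2, 4]

# Step 1: build the assemblers one after another in the calling thread.
assemblers = [ToyAssembler(i, n) for i, n in enumerate(partition_sizes)]

# Step 2: run the assembly calls concurrently, one worker per partition.
with ThreadPoolExecutor(max_workers=len(assemblers)) as pool:
    # map() preserves input order, so results[i] belongs to partition i.
    results = list(pool.map(lambda assembler: assembler(), assemblers))

print([len(r) for r in results])  # [3, 2, 4]
```

Keeping construction in a single thread confines any non-thread-safe setup to one place, while the heavier per-partition computation is free to overlap.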
- distributed_element_assemble_to_sparse(assembler_cls: Type[ElementAssembler], dmesh: DistributedMesh, quadrature_order: int = 2, project: str = 'reduce', call_kwargs: dict | None = None, **assembler_kwargs) -> SparseMatrix
Assemble element matrix in parallel, returning a global SparseMatrix.
Same as distributed_element_assemble() but returns a standard SparseMatrix instead of torch-sla’s DSparseTensor.
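Producing one global matrix from per-partition results means mapping local row and column indices back to global ones and summing entries that land on shared nodes. The sketch below shows one way this could look for a 1D chain of unit-length Laplace elements, using a plain dictionary keyed by (row, col). It is an assumption-laden illustration: assemble_local and merge_to_global are hypothetical names, and nothing here is claimed to match what project='reduce' or project='sparse' do internally.

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple

Entries = Dict[Tuple[int, int], float]

# Local 1D Laplace element matrix for a unit-length two-node element.
K_ELEM = [[1.0, -1.0], [-1.0, 1.0]]


def assemble_local(n_local_elements: int) -> Entries:
    """Assemble a chain of elements using local node numbering 0..n."""
    local: Entries = defaultdict(float)
    for e in range(n_local_elements):
        for a in range(2):
            for b in range(2):
                local[(e + a, e + b)] += K_ELEM[a][b]
    return dict(local)


def merge_to_global(local_mats: Sequence[Entries],
                    orig_nids: Sequence[Sequence[int]]) -> Entries:
    """Map local (row, col) pairs to global indices and sum duplicates."""
    global_mat: Entries = defaultdict(float)
    for local, orig_nid in zip(local_mats, orig_nids):
        for (i, j), value in local.items():
            global_mat[(orig_nid[i], orig_nid[j])] += value
    return dict(global_mat)


# Global chain of 4 elements and 5 nodes, split into two partitions
# that share global node 2.
orig_nids = [[0, 1, 2], [2, 3, 4]]
local_mats = [assemble_local(2), assemble_local(2)]
K = merge_to_global(local_mats, orig_nids)

print(K[(2, 2)])  # 2.0: one contribution from each partition
print(len(K))     # 13 nonzeros of the 5x5 tridiagonal matrix
```

The diagonal entry at the shared node receives 1.0 from each side, so the merged matrix matches what a single-process assembly of the whole chain would give.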
- distributed_node_assemble(assembler_cls: Type[NodeAssembler], dmesh: DistributedMesh, quadrature_order: int = 2, project: str = 'reduce', point_data: Dict[str, Tensor] | None = None, call_kwargs: dict | None = None, **assembler_kwargs) -> Tensor
Assemble node vector (RHS) in parallel across multiple devices.
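For a node vector, combining partitions amounts to a scatter-add: each local entry is accumulated into the global slot given by that partition's orig_nid. The following is a minimal sketch of that step in plain Python; scatter_add is a hypothetical helper, and the local vectors are hand-written values for a unit load on a chain of four unit-length elements rather than output from a real NodeAssembler.

```python
from typing import List, Sequence


def scatter_add(local_vecs: Sequence[Sequence[float]],
                orig_nids: Sequence[Sequence[int]],
                n_global_points: int) -> List[float]:
    """Accumulate per-partition nodal values into one global vector."""
    global_vec = [0.0] * n_global_points
    for local, orig_nid in zip(local_vecs, orig_nids):
        for local_idx, value in enumerate(local):
            global_vec[orig_nid[local_idx]] += value
    return global_vec


# Each unit-length element contributes 0.5 to both of its nodes, so an
# interior local node of a two-element partition holds 1.0.
orig_nids = [[0, 1, 2], [2, 3, 4]]
local_vecs = [[0.5, 1.0, 0.5], [0.5, 1.0, 0.5]]

rhs = scatter_add(local_vecs, orig_nids, n_global_points=5)
print(rhs)  # [0.5, 1.0, 1.0, 1.0, 0.5]
```

Global node 2 is shared, so it collects 0.5 from each partition and ends up with the same value as every other interior node.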