#### Four-dimensional tomographic reconstruction by time domain decomposition | 2018

**Viktor V. NikitinMarcus CarlssonFredrik AnderssonRajmund Mokso**

Since the beginnings of tomography, the requirement that the sample does not change during the acquisition of one tomographic rotation is unchanged. We derived and successfully implemented a tomographic reconstruction method which relaxes this decades-old requirement of static samples. In the presented method, dynamic tomographic data sets are decomposed in the temporal domain using basis functions and deploying an L1 regularization technique where the penalty factor is taken for spatial and temporal derivatives. We implemented the iterative algorithm for solving the regularization problem on modern GPU systems to demonstrate its practical use.

#### An iterative domain decomposition, spectral finite element method on non-conforming meshes suitable for high frequency Helmholtz problems | 2018

**Ryan GalaguszSteve McFee**

The purpose of this research is to describe an efficient iterative method suitable for obtaining high accuracy solutions to high frequency time-harmonic scattering problems. The method allows for both refinement of local polynomial degree and non-conforming mesh refinement, including multiple hanging nodes per edge. Rather than globally assemble the finite element system, we describe an iterative domain decomposition method which can use element-wise fast solvers for elements of large degree. Since continuity between elements is enforced through moment equations, the resulting constraint equations are hierarchical. We show that, for high frequency problems, a subset of these constraints should be directly enforced, providing the coarse space in the dual-primal domain decomposition method. The subset of constraints is chosen based on a dispersion criterion involving mesh size and wavenumber. By increasing the size of the coarse space according to this criterion, the number of iterations in the domain decomposition method depends only weakly on the wavenumber. We demonstrate this convergence behaviour on standard domain decomposition test problems and conclude the paper with application of the method to electromagnetic problems in two dimensions. These examples include beam steering by lenses and photonic crystal waveguides, as well as radar cross section computation for dielectric, perfect electric conductor, and electromagnetic cloak scatterers.

#### A monolithic approach to fluid-structure interaction based on a hybrid Eulerian-ALE fluid domain decomposition involving cut elements | 2018

**Benedikt SchottChristoph AgerWolfgang A. Wall**

A novel method for complex fluid-structure interaction (FSI) involving large structural deformation and motion is proposed. The new approach is based on a hybrid fluid formulation that combines the advantages of purely Eulerian (fixed-grid) and Arbitrary-Lagrangian-Eulerian (ALE moving mesh) formulations in the context of FSI. The structure - as commonly given in Lagrangian description - is surrounded by a fine resolved layer of fluid elements based on an ALE-framework. This ALE-fluid patch, which is embedded in an Eulerian background fluid domain, follows the deformation and motion of the structural interface. This approximation technique is not limited to Finite Element Methods, but can can also be realized within other frameworks like Finite Volume or Discontinuous Galerkin Methods. In this work, the surface coupling between the two disjoint fluid subdomains is imposed weakly using a stabilized Nitsche's technique in a Cut Finite Element Method (CutFEM) framework. At the fluid-solid interface, standard weak coupling of node-matching or non-matching finite element approximations can be utilized. As the fluid subdomains can be meshed independently, a sufficient mesh quality in the vicinity of the common fluid-structure interface can be assured. To our knowledge the proposed method is the only method (despite some overlapping domain decomposition approaches that suffer from other issues) that allows for capturing boundary layers and flow detachment via appropriate grids around largely moving and deforming bodies and is able to do this e.g. without the necessity of costly remeshing procedures. In addition it might also help to safe computational costs as now background grids can be much coarser. Various FSI-cases of rising complexity conclude the work.

#### Parallel SPH modeling using dynamic domain decomposition and load balancing displacement of Voronoi subdomains | 2018

**M. S. EgorovaS. A. DyachkovA. N. ParshikovV. V. Zhakhovsky**

A highly adaptive load balancing algorithm for parallel simulations using particle methods, such as molecular dynamics and smoothed particle hydrodynamics (SPH), is developed. Our algorithm is based on the dynamic spatial decomposition of simulated material samples between Voronoi subdomains, where each subdomain with all its particles is handled by a single computational process which is typically run on a single CPU core of a multiprocessor computing cluster. The algorithm displaces the positions of neighbor Voronoi subdomains in accordance with the local load imbalance between the corresponding processes. It results in particle transfers from heavy-loaded processes to less-loaded ones. Iteration of the algorithm puts into alignment the processor loads. Convergence to a well-balanced decomposition from imbalanced one is improved by the usage of multi-body terms in the balancing displacements. The high adaptability of the balancing algorithm to simulation conditions is illustrated by SPH modeling of the dynamic behavior of materials under extreme conditions, which are characterized by large pressure and velocity gradients, as a result of which the spatial distribution of particles varies greatly in time. The higher parallel efficiency of our algorithm in such conditions is demonstrated by comparison with the corresponding static decomposition of the computational domain. Our algorithm shows almost perfect strong scalability in tests using from tens to several thousand processes.

#### Galerkin-Collocation domain decomposition method for arbitrary binary black holes | 2018

**W. BarretoP. C. M. ClementeH. P. de OliveiraB. Rodriguez-Mueller**

We present a new computational framework for the Galerkin-collocation method for double domain in the context of ADM 3+1 approach in numerical relativity. This work enables us to perform high resolution calculations for initial sets of two arbitrary black holes. We use the Bowen-York method for binary systems and the puncture method to solve the Hamiltonian constraint. The nonlinear numerical code solves the set of equations for the spectral modes using the standard Newton-Raphson method, LU decomposition and Gaussian quadratures. We show convergence of our code for the conformal factor and the ADM mass. Thus, we display features of the conformal factor for different masses, spins and linear momenta.

#### Additive domain decomposition operator splittings -- convergence analyses in a dissipative framework | 2018

**Eskil HansenErik Henningsson**

We analyze temporal approximation schemes based on overlapping domain decompositions. As such schemes enable computations on parallel and distributed hardware, they are commonly used when integrating large-scale parabolic systems. Our analysis is conducted by first casting the domain decomposition procedure into a variational framework based on weighted Sobolev spaces. The time integration of a parabolic system can then be interpreted as an operator splitting scheme applied to an abstract evolution equation governed by a maximal dissipative vector field. By utilizing this abstract setting, we derive an optimal temporal error analysis for the two most common choices of domain decomposition based integrators. Namely, alternating direction implicit schemes and additive splitting schemes of first and second order. For the standard first-order additive splitting scheme we also extend the error analysis to semilinear evolution equations, which may only have mild solutions.

#### A Galerkin-Collocation domain decomposition method: application to the evolution of cylindrical gravitational waves | 2018

**W. O. BarretoJ. A. CrespoH. P. de OliveiraE. L. Rodrigues**

We present an innovative Galerkin-Collocation domain decomposition algorithm applied to the evolution of cylindrical unpolarized gravitational waves. We display how the domain decomposition effectiveness was successfully implemented for those initial data with localized high gradients. We have characterized the gravitational radiation using the standard Weyl scalar using the Newman-Penrose tetrad. Thus, it was possible to unveil aspects of the interaction between both gravitational polarization wave modes by generating the corresponding templates at the radiation zone.

#### A linear domain decomposition method for two-phase flow in porous media | 2017

**David SeusFlorin A. RaduChristian Rohde**

This article is a follow up of our submitted paper [11] in which a decomposition of the Richards equation along two soil layers was discussed. A decomposed problem was formulated and a decoupling and linearisation technique was presented to solve the problem in each time step in a fixed point type iteration. This article extends these ideas to the case of two-phase in porous media and the convergence of the proposed domain decomposition method is rigorously shown.

#### Convergence estimates for multigrid algorithms with SSC smoothers and applications to overlapping domain decomposition | 2017

**Eugenio AulisaGiorgio BorniaSara CalandriniGiacomo Capodaglio**

In this paper we study convergence estimates for a multigrid algorithm with smoothers of successive subspace correction (SSC) type, applied to symmetric elliptic PDEs. First, we revisit a general convergence analysis on a class of multigrid algorithms in a fairly general setting, where no regularity assumptions are made on the solution. In this framework, we are able to explicitly highlight the dependence of the multigrid error bound on the number of smoothing steps. For the case of no regularity assumptions, this represents a new addition to the existing theory. Then, we analyze successive subspace correction smoothing schemes for a set of uniform and local refinement applications with either nested or non-nested overlapping subdomains. For these applications, we explicitly derive bounds for the multigrid error, and identify sufficient conditions for these bounds to be independent of the number of multigrid levels. For the local refinement applications, finite element grids with arbitrary hanging nodes configurations are considered. The analysis of these smoothing schemes is cast within the far-reaching multiplicative Schwarz framework.

#### Robust and scalable domain decomposition solvers for unfitted finite element methods | 2017

**Santiago BadiaFrancesc Verdugo**

Unfitted finite element methods, e.g., extended finite element techniques or the so-called finite cell method, have a great potential for large scale simulations, since they avoid the generation of body-fitted meshes and the use of graph partitioning techniques, two main bottlenecks for problems with non-trivial geometries. However, the linear systems that arise from these discretizations can be much more ill-conditioned, due to the so-called small cut cell problem. The state-of-the-art approach is to rely on sparse direct methods, which have quadratic complexity and are thus not well suited for large scale simulations. In order to solve this situation, in this work we investigate the use of domain decomposition preconditioners (balancing domain decomposition by constraints) for unfitted methods. We observe that a straightforward application of these preconditioners to the unfitted case has a very poor behavior. As a result, we propose a {customization} of the classical BDDC methods based on the stiffness weighting operator and an improved definition of the coarse degrees of freedom in the definition of the preconditioner. These changes lead to a robust and algorithmically scalable solver able to deal with unfitted grids. A complete set of complex 3D numerical experiments show the good performance of the proposed preconditioners.

#### Stochastic basis adaptation and spatial domain decomposition for PDEs with random coefficients | 2017

**Ramakrishna TipireddyPanos StinisAlexandre Tartakovsky**

We present a novel uncertainty quantification approach for high-dimensional stochastic partial differential equations that reduces the computational cost of polynomial chaos methods by decomposing the computational domain into non-overlapping subdomains and adapting the stochastic basis in each subdomain so the local solution has a lower dimensional random space representation. The local solutions are coupled using the Neumann-Neumann algorithm, where we first estimate the interface solution then evaluate the interior solution in each subdomain using the interface solution as a boundary condition. The interior solutions in each subdomain are computed independently of each other, which reduces the operation count from $O(N^\alpha)$ to $O(M^\alpha),$ where $N$ is the total number of degrees of freedom, $M$ is the number of degrees of freedom in each subdomain, and the exponent $\alpha>1$ depends on the uncertainty quantification method used. In addition, the localized nature of solutions makes the proposed approach highly parallelizable. We illustrate the accuracy and efficiency of the approach for linear and nonlinear differential equations with random coefficients.

#### Hybrid discontinuous Galerkin discretisation and domain decomposition preconditioners for the Stokes problem | 2017

**Gabriel R. BarrenecheaMichał BosyVictorita DoleanFrédéric NatafPierre-Henri Tournier**

Solving the Stokes equation by an optimal domain decomposition method derived algebraically involves the use of non standard interface conditions whose discretisation is not trivial. For this reason the use of approximation methods such as hybrid discontinuous Galerkin appears as an appropriate strategy: on the one hand they provide the best compromise in terms of the number of degrees of freedom in between standard continuous and discontinuous Galerkin methods, and on the other hand the degrees of freedom used in the non standard interface conditions are naturally defined at the boundary between elements. In this paper we introduce the coupling between a well chosen discretisation method (hybrid discontinuous Galerkin) and a novel and efficient domain decomposition method to solve the Stokes system. We present the detailed analysis of the hybrid discontinuous Galerkin method for the Stokes problem with non standard boundary conditions. This analysis is supported by numerical evidence. In addition, the advantage of the new preconditioners over more classical choices is also supported by numerical experiments.

#### A linear domain decomposition method for partially saturated flow in porous media | 2017

**David SeusKoondanibha MitraIuliu Sorin PopFlorin Adrian RaduChristian Rohde**

The Richards equation is a nonlinear parabolic equation that is commonly used for modelling saturated/unsaturated flow in porous media. We assume that the medium occupies a bounded Lipschitz domain partitioned into two disjoint subdomains separated by a fixed interface $\Gamma$. This leads to two problems defined on the subdomains which are coupled through conditions expressing flux and pressure continuity at $\Gamma$. After an Euler implicit discretisation of the resulting nonlinear subproblems a linear iterative ($L$-type) domain decomposition scheme is proposed. The convergence of the scheme is proved rigorously. In the last part we present numerical results that are in line with the theoretical finding, in particular the unconditional convergence of the scheme. We further compare the scheme to other approaches not making use of a domain decomposition. Namely, we compare to a Newton and a Picard scheme. We show that the proposed scheme is more stable than the Newton scheme while remaining comparable in computational time, even if no parallelisation is being adopted. Finally we present a parametric study that can be used to optimize the proposed scheme.

#### Convergence analysis of domain decomposition based time integrators for degenerate parabolic equations | 2017

**Monika EisenmannEskil Hansen**

Domain decomposition based time integrators allow the usage of parallel and distributed hardware, making them well-suited for the temporal discretization of parabolic systems, in general, and degenerate parabolic problems, in particular. The latter is due to the degenerate equations' finite speed of propagation. In this study, a rigours convergence analysis is given for such integrators without assuming any restrictive regularity on the solutions or the domains. The analysis is conducted by first deriving a new variational framework for the domain decomposition, which is applicable to the two standard degenerate examples. That is, the $p$-Laplace and the porous medium type vector fields. Secondly, the decomposed vector fields are restricted to the underlying pivot space and the time integration of the parabolic problem can then be interpreted as an operators splitting applied to a dissipative evolution equation. The convergence results then follow by employing elements of the approximation theory for nonlinear semigroups.

#### Numerical assessment of two-level domain decomposition preconditioners for incompressible Stokes and elasticity equations | 2017

**Gabriel R. BarrenecheaMichał BosyVictorita Dolean**

Solving the linear elasticity and Stokes equations by an optimal domain decomposition method derived algebraically involves the use of non standard interface conditions. The one-level domain decomposition preconditioners are based on the solution of local problems. This has the undesired consequence that the results are not scalable, it means that the number of iterations needed to reach convergence increases with the number of subdomains. This is the reason why in this work we introduce, and test numerically, two-level preconditioners. Such preconditioners use a coarse space in their construction. We consider the nearly incompressible elasticity problems and Stokes equations, and discretise them by using two finite element methods, namely, the hybrid discontinuous Galerkin and Taylor-Hood discretisations.

#### On overlapping domain decomposition methods for high-contrast multiscale problems | 2017

**Juan GalvisEric ChungYalchin EfendievWing Tat Leung**

We review some important ideas in the design and analysis of robust overlapping domain decomposition algorithms for high-contrast multiscale problems and propose a domain decomposition method better performance in terms of the number of iterations. The main novelty of our approaches is the construction of coarse spaces, which are computed using spectral information of local bilinear forms. We present several approaches to incorporate the spectral information into the coarse problem in order to obtain minimal coarse space dimension. We show that using these coarse spaces, we can obtain a domain decomposition preconditioner with the condition number independent of contrast and small scales. To minimize further the number of iterations until convergence, we use this minimal dimensional coarse spaces in a construction combining them with large overlap local problems that take advantage of the possibility of localizing global fields orthogonal to the coarse space. We obtain a condition number close to 1 for the new method. We discuss possible drawbacks and further extensions.

#### Two-component domain decomposition scheme with overlapping subdomains for parabolic equations | 2017

**Petr N. Vabishchevich**

An iteration-free method of domain decomposition is considered for approximate solving a boundary value problem for a second-order parabolic equation. A standard approach to constructing domain decomposition schemes is based on a partition of unity for the domain under the consideration. Here a new general approach is proposed for constructing domain decomposition schemes with overlapping subdomains based on indicator functions of subdomains. The basic peculiarity of this method is connected with a representation of the problem operator as the sum of two operators, which are constructed for two separate subdomains with the subtraction of the operator that is associated with the intersection of the subdomains. There is developed a two-component factorized scheme, which can be treated as a generalization of the standard Alternating Direction Implicit (ADI) schemes to the case of a special three-component splitting. There are obtained conditions for the unconditional stability of regionally additive schemes constructed using indicator functions of subdomains. Numerical results are presented for a model two-dimensional problem.

#### Coupling parallel adaptive mesh refinement with a nonoverlapping domain decomposition solver | 2017

**Pavel KůsJakub Šístek**

We study the effect of adaptive mesh refinement on a parallel domain decomposition solver of a linear system of algebraic equations. These concepts need to be combined within a parallel adaptive finite element software. A prototype implementation is presented for this purpose. It uses adaptive mesh refinement with one level of hanging nodes. Two and three-level versions of the Balancing Domain Decomposition based on Constraints (BDDC) method are used to solve the arising system of algebraic equations. The basic concepts are recalled and components necessary for the combination are studied in detail. Of particular interest is the effect of disconnected subdomains, a typical output of the employed mesh partitioning based on space-filling curves, on the convergence and solution time of the BDDC method. It is demonstrated using a large set of experiments that while both refined meshes and disconnected subdomains have a negative effect on the convergence of BDDC, the number of iterations remains acceptable. In addition, scalability of the three-level BDDC solver remains good on up to a few thousands of processor cores. The largest presented problem using adaptive mesh refinement has over 10^9 unknowns and is solved on 2048 cores.

#### Parallel energy-stable phase field crystal simulations based on domain decomposition methods | 2017

**Ying WeiChao YangJizu Huang**

In this paper, we present a parallel numerical algorithm for solving the phase field crystal equation. In the algorithm, a semi-implicit finite difference scheme is derived based on the discrete variational derivative method. Theoretical analysis is provided to show that the scheme is unconditionally energy stable and can achieve second-order accuracy in both space and time. An adaptive time step strategy is adopted such that the time step size can be flexibly controlled based on the dynamical evolution of the problem. At each time step, a nonlinear algebraic system is constructed from the discretization of the phase field crystal equation and solved by a domain decomposition based, parallel Newton--Krylov--Schwarz method with improved boundary conditions for subdomain problems. Numerical experiments with several two and three dimensional test cases show that the proposed algorithm is second-order accurate in both space and time, energy stable with large time steps, and highly scalable to over ten thousands processor cores on the Sunway TaihuLight supercomputer.

#### Nonoverlapping domain decomposition preconditioners for discontinuous Galerkin approximations of Hamilton--Jacobi--Bellman equations | 2017

**Iain Smears**

We analyse a class of nonoverlapping domain decomposition preconditioners for nonsymmetric linear systems arising from discontinuous Galerkin finite element approximation of fully nonlinear Hamilton--Jacobi--Bellman (HJB) partial differential equations. These nonsymmetric linear systems are uniformly bounded and coercive with respect to a related symmetric bilinear form, that is associated to a matrix $\mathbf{A}$. In this work, we construct a nonoverlapping domain decomposition preconditioner $\mathbf{P}$, that is based on $\mathbf{A}$, and we then show that the effectiveness of the preconditioner for solving the} nonsymmetric problems can be studied in terms of the condition number $\kappa(\mathbf{P}^{-1}\mathbf{A})$. In particular, we establish the bound $\kappa(\mathbf{P}^{-1}\mathbf{A}) \lesssim 1+ p^6 H^3 /q^3 h^3$, where $H$ and $h$ are respectively the coarse and fine mesh sizes, and $q$ and $p$ are respectively the coarse and fine mesh polynomial degrees. This represents the first such result for this class of methods that explicitly accounts for the dependence of the condition number on $q$; our analysis is founded upon an original optimal order approximation result between fine and coarse discontinuous finite element spaces. Numerical experiments demonstrate the sharpness of this bound. Although the preconditioners are not robust with respect to the polynomial degree, our bounds quantify the effect of the coarse and fine space polynomial degrees. Furthermore, we show computationally that these methods are effective in practical applications to nonsymmetric, fully nonlinear HJB equations under $h$-refinement for moderate polynomial degrees.

#### Space-time balancing domain decomposition | 2017

**Santiago BadiaMarc Olm**

In this work, we propose two-level space-time domain decomposition preconditioners for parabolic problems discretized using finite elements. They are motivated as an extension to space-time of balancing domain decomposition by constraints preconditioners. The key ingredients to be defined are the sub-assembled space and operator, the coarse degrees of freedom (DOFs) in which we want to enforce continuity among subdomains at the preconditioner level, and the transfer operator from the sub-assembled to the original finite element space. With regard to the sub-assembled operator, a perturbation of the time derivative is needed to end up with a well-posed preconditioner. The set of coarse DOFs includes the time average (at the space-time subdomain) of classical space constraints plus new constraints between consecutive subdomains in time. Numerical experiments show that the proposed schemes are weakly scalable in time, i.e., we can efficiently exploit increasing computational resources to solve more time steps in the same {total elapsed} time. Further, the scheme is also weakly space-time scalable, since it leads to asymptotically constant iterations when solving larger problems both in space and time. Excellent {wall clock} time weak scalability is achieved for space-time parallel solvers on some thousands of cores.

#### Nested domain decomposition with polarized traces for the 2D Helmholtz equation | 2016

**Leonardo Zepeda-NúñezLaurent Demanet**

We present a solver for the 2D high-frequency Helmholtz equation in heterogeneous, constant density, acoustic media, with online parallel complexity that scales empirically as $\mathcal{O}(\frac{N}{P})$, where $N$ is the number of volume unknowns, and $P$ is the number of processors, as long as $P = \mathcal{O}(N^{1/5})$. This sublinear scaling is achieved by domain decomposition, not distributed linear algebra, and improves on the $P =\mathcal{O}(N^{1/8})$ scaling reported earlier in [L. Zepeda-N\'u\~nez and L. Demanet, J. Comput. Phys., 308 (2016), pp. 347-388 ]. The solver relies on a two-level nested domain decomposition: a layered partition on the outer level, and a further decomposition of each layer in cells at the inner level. The Helmholtz equation is reduced to a surface integral equation (SIE) posed at the interfaces between layers, efficiently solved via a nested version of the polarized traces preconditioner [L. Zepeda-N\'u\~nez and L. Demanet, J. Comput. Phys., 308 (2016), pp. 347-388.]. The favorable complexity is achieved via an efficient application of the integral operators involved in the SIE.

#### An improved pure source transfer domain decomposition method for Helmholtz equations in unbounded domain | 2016

**Yu DuHaijun Wu**

We propose an improved pure source transfer domain decomposition method (PSTDDM) for solving the truncated perfectly matched layer (PML) approximation in bounded domain of Helmholtz scattering problem. The method is based on the the source transfer domain decomposition method (STDDM) proposed by Chen and Xiang. The idea is decomposing the domain into non-overlapping layers and transferring the sources equivalently layer by layer so that the solution in each layer can be solved using a PML method defined locally outside its two adjacent layers. Furthermore, we divide the domain into non-overlapping blocks and solve the solution in each block by using a PML method defined locally outside its adjacent blocks. The convergence analysis of the method is provided for the case of constant wave number. Finally, numerical examples are provided where the method is used as both a direct solver and an efficient preconditioner in the GMRES method for solving the Helmholtz equation.

#### Improved recovery of admissible stress in domain decomposition methods - application to heterogeneous structures and new error bounds for FETI-DP | 2016

**Augustin Parret-FréaudValentine ReyPierre GosseletChristian Rey**

This paper investigates the question of the building of admissible stress field in a substructured context. More precisely we analyze the special role played by multiple points. This study leads to (1) an improved recovery of the stress field, (2) an opportunity to minimize the estimator in the case of heterogeneous structures (in the parallel and sequential case), (3) a procedure to build admissible fields for FETI-DP and BDDC methods leading to an error bound which separates the contributions of the solver and of the discretization.

#### Basis adaptation and domain decomposition for steady partial differential equations with random coefficients | 2016

**Ramkrishna TipireddyPanos StinisAlexandre Tartakovsky**

We present a novel approach for solving steady-state stochastic partial differential equations (PDEs) with high-dimensional random parameter space. The proposed approach combines spatial domain decomposition with basis adaptation for each subdomain. The basis adaptation is used to address the curse of dimensionality by constructing an accurate low-dimensional representation of the stochastic PDE solution (probability density function and/or its leading statistical moments) in each subdomain. Restricting the basis adaptation to a specific subdomain affords finding a locally accurate solution. Then, the solutions from all of the subdomains are stitched together to provide a global solution. We support our construction with numerical experiments for a steady-state diffusion equation with a random spatially dependent coefficient. Our results show that highly accurate global solutions can be obtained with significantly reduced computational costs.

#### An improved sweeping domain decomposition preconditioner for the Helmholtz equation | 2016

**Christiaan C. Stolk**

In this paper we generalize and improve a recently developed domain decomposition preconditioner for the iterative solution of discretized Helmholtz equations. We introduce an improved method for transmission at the internal boundaries using perfectly matched layers. Simultaneous forward and backward sweeps are introduced, thereby improving the possibilities for parallellization. Finally, the method is combined with an outer two-grid iteration. The method is studied theoretically and with numerical examples. It is shown that the modifications lead to substantial decreases in computation time and memory use, so that computation times become comparable to that of the fastests methods currently in the literature for problems with up to 10^8 degrees of freedom.

#### Strict lower bounds with separation of sources of error in non-overlapping domain decomposition methods | 2016

**Valentine ReyPierre GosseletChristian Rey**

This article deals with the computation of guaranteed lower bounds of the error in the framework of finite element (FE) and domain decomposition (DD) methods. In addition to a fully parallel computation, the proposed lower bounds separate the algebraic error (due to the use of a DD iterative solver) from the discretization error (due to the FE), which enables the steering of the iterative solver by the discretization error. These lower bounds are also used to improve the goal-oriented error estimation in a substructured context. Assessments on 2D static linear mechanic problems illustrate the relevance of the separation of sources of error and the lower bounds' independence from the substructuring. We also steer the iterative solver by an objective of precision on a quantity of interest. This strategy consists in a sequence of solvings and takes advantage of adaptive remeshing and recycling of search directions.

#### SWIFT: Using task-based parallelism, fully asynchronous communication, and graph partition-based domain decomposition for strong scaling on more than 100,000 cores | 2016

**Matthieu SchallerPedro GonnetAidan B. G. ChalkPeter W. Draper**

We present a new open-source cosmological code, called SWIFT, designed to solve the equations of hydrodynamics using a particle-based approach (Smooth Particle Hydrodynamics) on hybrid shared/distributed-memory architectures. SWIFT was designed from the bottom up to provide excellent strong scaling on both commodity clusters (Tier-2 systems) and Top100-supercomputers (Tier-0 systems), without relying on architecture-specific features or specialized accelerator hardware. This performance is due to three main computational approaches: (1) Task-based parallelism for shared-memory parallelism, which provides fine-grained load balancing and thus strong scaling on large numbers of cores. (2) Graph-based domain decomposition, which uses the task graph to decompose the simulation domain such that the work, as opposed to just the data, as is the case with most partitioning schemes, is equally distributed across all nodes. (3) Fully dynamic and asynchronous communication, in which communication is modelled as just another task in the task-based scheme, sending data whenever it is ready and deferring on tasks that rely on data from other nodes until it arrives. In order to use these approaches, the code had to be re-written from scratch, and the algorithms therein adapted to the task-based paradigm. As a result, we can show upwards of 60% parallel efficiency for moderate-sized problems when increasing the number of cores 512-fold, on both x86-based and Power8-based architectures.

#### Space-time domain decomposition for advection-diffusion problems in mixed formulations | 2016

**Thi-Thao-Phuong HoangCaroline JaphetMichel KernJean E. Roberts**

This paper is concerned with the numerical solution of porous-media flow and transport problems , i. e. heterogeneous, advection-diffusion problems. Its aim is to investigate numerical schemes for these problems in which different time steps can be used in different parts of the domain. Global-in-time, non-overlapping domain-decomposition methods are coupled with operator splitting making possible the different treatment of the advection and diffusion terms. Two domain-decomposition methods are considered: one uses the time-dependent Steklov--Poincar{\'e} operator and the other uses optimized Schwarz waveform relaxation (OSWR) based on Robin transmission conditions. For each method, a mixed formulation of an interface problem on the space-time interface is derived, and different time grids are employed to adapt to different time scales in the subdomains. A generalized Neumann-Neumann preconditioner is proposed for the first method. To illustrate the two methods numerical results for two-dimensional problems with strong heterogeneities are presented. These include both academic problems and more realistic prototypes for simulations for the underground storage of nuclear waste.

#### Stochastic domain decomposition for the solution of the two-dimensional magnetotelluric problem | 2016

**Alexander BihloColin G. FarquharsonRonald D. HaynesJ. Concepcion Loredo-Osti**

Stochastic domain decomposition is proposed as a novel method for solving the two-dimensional Maxwell's equations as used in the magnetotelluric method. The stochastic form of the exact solution of Maxwell's equations is evaluated using Monte-Carlo methods taking into consideration that the domain may be divided into neighboring sub-domains. These sub-domains can be naturally chosen by splitting the sub-surface domain into regions of constant (or at least continuous) conductivity. The solution over each sub-domain is obtained by solving Maxwell's equations in the strong form. The sub-domain solver used for this purpose is a meshless method resting on radial basis function based finite differences. The method is demonstrated by solving a number of classical magnetotelluric problems, including the quarter-space problem, the block-in-half-space problem and the triangle-in-half-space problem.

#### Improved non-overlapping domain decomposition algorithms for the eddy current problem | 2016

**Y. BoubendirH. HaddarM. K. Riahi**

A domain decomposition method is proposed based on carefully chosen impedance transmission operators for a hybrid formulation of the eddy current problem. Preliminary analysis and numerical results are provided in the spherical case showing the potential of these conditions in accelerating the convergence rate

#### Fitted and unfitted domain decomposition using penalty free Nitsche method for the Poisson problem with discontinuous material parameters | 2015

**Thomas Boiveau**

In this paper, we study the stability of the non symmetric version of the Nitsche's method without penalty for domain decomposition. The Poisson problem is considered as a model problem. The computational domain is divided into two subdomain that can have different material parameters. In the first half of the paper we are interested in nonconforming domain decomposition, each subdomain is meshed independently of each other. In the second half, we study unfitted domain decomposition, the computational domain has only one mesh and we allow the interface to cut elements of the mesh. The fictitious domain method is used to handle this specificity. We prove $H^1$-convergence and $L^2$-convergence of the error in both cases. Some numerical results are provided to corroborate the theoretical study.

#### Two-level space-time domain decomposition methods for unsteady inverse problems | 2015

**Xiaomao DengXiao-chuan CaiJun Zou**

As the number of processor cores on supercomputers becomes larger and larger, algorithms with high degree of parallelism attract more attention. In this work, we propose a novel space-time coupled algorithm for solving an inverse problem associated with the time-dependent convection-diffusion equation in three dimensions. We introduce a mixed finite element/finite difference method and a one-level and a two-level space-time parallel domain decomposition preconditioner for the Karush-Kuhn-Tucker (KKT) system induced from reformulating the inverse problem as an output least-squares optimization problem in the space-time domain. The new full space approach eliminates the sequential steps of the optimization outer loop and the inner forward and backward time marching processes, thus achieves high degree of parallelism. Numerical experiments validate that this approach is effective and robust for recovering unsteady moving sources. We report strong scalability results obtained on a supercomputer with more than 1,000 processors.

#### A parallel space-time domain decomposition method for unsteady source inversion problems | 2015

**Xiaomao DengXiao-chuan CaiJun Zou**

In this paper, we propose a parallel space-time domain decomposition method for solving an unsteady source identification problem governed by the linear convection-diffusion equation. Traditional approaches require to solve repeatedly a forward parabolic system, an adjoint system and a system with respect to the unknowns. The three systems have to be solved one after another. These sequential steps are not desirable for large scale parallel computing. A space-time restrictive additive Schwarz method is proposed for a fully implicit space-time coupled discretization scheme to recover the time-dependent pollutant source intensity functions. We show with numerical experiments that the scheme works well with noise in the observation data. More importantly it is demonstrated that the parallel space-time Schwarz preconditioner is scalable on a supercomputer with over $10^3$ processors, thus promising for large scale applications.

#### A massively parallel multi-level approach to a domain decomposition method for the optical flow estimation with varying illumination | 2015

**Diane Gilliocq-HirtzZakaria Belhachmi**

We consider a variational method to solve the optical flow problem with varying illumination. We apply an adaptive control of the regularization parameter which allows us to preserve the edges and fine features of the computed flow. To reduce the complexity of the estimation for high resolution images and the time of computations, we implement a multi-level parallel approach based on the domain decomposition with the Schwarz overlapping method. The second level of parallelism uses the massively parallel solver MUMPS. We perform some numerical simulations to show the efficiency of our approach and to validate it on classical and real-world image sequences.

#### A non-conforming domain decomposition approximation for the Helmholtz screen problem with hypersingular operator | 2015

**Norbert HeuerGredy Salmerón**

We present and analyze a non-conforming domain decomposition approximation for a hypersingular operator governed by the Helmholtz equation in three dimensions. This operator appears when considering the corresponding Neumann problem in unbounded domains exterior to open surfaces. We consider small wave numbers and low-order approximations with Nitsche coupling across interfaces. Under appropriate assumptions on mapping properties of the weakly singular and hypersingular operators with Helmholtz kernel, we prove that this method converges almost quasi-optimally. Numerical experiments confirm our error estimate.

#### Low-rank correction methods for algebraic domain decomposition preconditioners | 2015

**Ruipeng LiYousef Saad**

This paper presents a parallel preconditioning method for distributed sparse linear systems, based on an approximate inverse of the original matrix, that adopts a general framework of distributed sparse matrices and exploits the domain decomposition method and low-rank corrections. The domain decomposition approach decouples the matrix and once inverted, a low-rank approximation is applied by exploiting the Sherman-Morrison-Woodbury formula, which yields two variants of the preconditioning methods. The low-rank expansion is computed by the Lanczos procedure with reorthogonalizations. Numerical experiments indicate that, when combined with Krylov subspace accelerators, this preconditioner can be efficient and robust for solving symmetric sparse linear systems. Comparisons with other distributed-memory preconditioning methods are presented.

#### Schur Complement based domain decomposition preconditioners with Low-rank corrections | 2015

**Ruipeng LiYuanzhe XiYousef Saad**

This paper introduces a robust preconditioner for general sparse symmetric matrices, that is based on low-rank approximations of the Schur complement in a Domain Decomposition (DD) framework. In this "Schur Low Rank" (SLR) preconditioning approach, the coefficient matrix is first decoupled by DD, and then a low-rank correction is exploited to compute an approximate inverse of the Schur complement associated with the interface points. The method avoids explicit formation of the Schur complement matrix. We show the feasibility of this strategy for a model problem, and conduct a detailed spectral analysis for the relationship between the low-rank correction and the quality of the preconditioning. Numerical experiments on general matrices illustrate the robustness and efficiency of the proposed approach.

#### Local Exponential Methods: a domain decomposition approach to exponential time integration of PDEs | 2015

**Luca Bonaventura**

A local approach to the time integration of PDEs by exponential methods is proposed, motivated by theoretical estimates by A.Iserles on the decay of off-diagonal terms in the exponentials of sparse matrices. An overlapping domain decomposition technique is outlined, that allows to replace the computation of a global exponential matrix by a number of independent and easily parallelizable local problems. Advantages and potential problems of the proposed technique are discussed. Numerical experiments on simple, yet relevant model problems show that the resulting method allows to increase computational efficiency with respect to standard implementations of exponential methods.

#### Convergence of penalty Robin-Robin domain decomposition methods for unilateral multibody contact problems of elasticity | 2015

**Ivan I. DyyakIhor I. ProkopyshynIvan A. Prokopyshyn**

The paper is devoted to the penalty Robin-Robin domain decomposition methods (DDMs), proposed by us for the solution of unilateral multibody contact problems of elasticity. These DDMs are based on the penalty method for variational inequalities and some stationary and nonstationary iterative methods for nonlinear variational equations. The main result of the paper is that we give the mathematical justification of proposed DDMs and prove theorems on their convergence. We also investigate the numerical efficiency of these methods using the finite element approximations.

#### Stochastic domain decomposition for time dependent adaptive mesh generation | 2015

**Alexander BihloRonald D. HaynesEmily J. Walsh**

The efficient generation of meshes is an important component in the numerical solution of problems in physics and engineering. Of interest are situations where global mesh quality and a tight coupling to the solution of the physical partial differential equation (PDE) is important. We consider parabolic PDE mesh generation and present a method for the construction of adaptive meshes in two spatial dimensions using stochastic domain decomposition that is suitable for an implementation in a multi- or many-core environment. Methods for mesh generation on periodic domains are also provided. The mesh generator is coupled to a time dependent physical PDE and the system is evolved using an alternating solution procedure. The method uses the stochastic representation of the exact solution of a parabolic linear mesh generator to find the location of an adaptive mesh along the (artificial) subdomain interfaces. The deterministic evaluation of the mesh over each subdomain can then be obtained completely independently using the probabilistically computed solutions as boundary conditions. The parallel performance of this general stochastic domain decomposition approach has previously been shown. We demonstrate the approach numerically for the mesh generation context and compare the mesh obtained with the corresponding single domain mesh using a representative mesh quality measure.

#### Strict bounding of quantities of interest in computations based on domain decomposition | 2015

**Valentine ReyPierre GosseletChristian Rey**

This paper deals with bounding the error on the estimation of quantities of interest obtained by finite element and domain decomposition methods. The proposed bounds are written in order to separate the two errors involved in the resolution of reference and adjoint problems : on the one hand the discretization error due to the finite element method and on the other hand the algebraic error due to the use of the iterative solver. Beside practical considerations on the parallel computation of the bounds, it is shown that the interface conformity can be slightly relaxed so that local enrichment or refinement are possible in the subdomains bearing singularities or quantities of interest which simplifies the improvement of the estimation. Academic assessments are given on 2D static linear mechanic problems.

#### A dynamic domain decomposition for a class of second order semi-linear equations | 2015

**Simone CacaceMaurizio Falcone**

We propose a parallel algorithm for the numerical solution of a class of second order semi-linear equations coming from stochastic optimal control problems, by means of a dynamic domain decomposition technique. The new method is an extension of the patchy domain decomposition method presented in a previous work for first order Hamilton-Jacobi-Bellman equations related to deterministic optimal control problems. The semi-Lagrangian scheme underlying the original method is modified in order to deal with (possibly degenerate) diffusion, by approximating the stochastic optimal control problem associated to the equation via discrete time Markov chains. We show that under suitable conditions on the discretization parameters and for sufficiently small values of the diffusion coefficient, the parallel computation on the proposed dynamic decomposition is faster than that on a static decomposition. To this end, we combine the parallelization with some well known techniques in the context of fast-marching-like methods for first order Hamilton-Jacobi equations. Several numerical tests in dimension two are presented, in order to show the features of the proposed method.

#### Diffusion approximations and domain decomposition method of linear transport equations: asymptotics and numerics | 2015

**Qin LiJianfeng LuWeiran Sun**

In this paper we construct numerical schemes to approximate linear transport equations with slab geometry by diffusion equations. We treat both the case of pure diffusive scaling and the case where kinetic and diffusive scalings coexist. The diffusion equations and their data are derived from asymptotic and layer analysis which allows general scattering kernels and general data. We apply the half-space solver in [20] to resolve the boundary layer equation and obtain the boundary data for the diffusion equation. The algorithms are validated by numerical experiments and also by error analysis for the pure diffusive scaling case.

#### Electromagnetic wave propagation and absorption in magnetised plasmas: variational formulations and domain decomposition | 2015

**Aurore BackTakashi HattoriSimon LabrunieJean-Rodolphe RochePierre Bertrand**

We consider a model for the propagation and absorption of electromagnetic waves (in the time-harmonic regime) in a magnetised plasma. We present a rigorous derivation of the model and several boundary conditions modelling wave injection into the plasma. Then we propose several variational formulations, mixed and non-mixed, and prove their well-posedness thanks to a theorem by S\'ebelin et~al. Finally, we propose a non-overlapping domain decomposition framework, show its well-posedness and equivalence with the one-domain formulation. These results appear strongly linked to the spectral properties of the plasma dielectric tensor.

#### A new approach to improve ill-conditioned parabolic optimal control problem via time domain decomposition | 2015

**Mohamed Kamel Riahi**

In this paper we present a new steepest-descent type algorithm for convex optimization problems. Our algorithm pieces the unknown into sub-blocs of unknowns and considers a partial optimization over each sub-bloc. In quadratic optimization, our method involves Newton technique to compute the step-lengths for the sub-blocs resulting descent directions. Our optimization method is fully parallel and easily implementable, we first presents it in a general linear algebra setting, then we highlight its applicability to a parabolic optimal control problem, where we consider the blocs of unknowns with respect to the time dependency of the control variable. The parallel tasks, in the last problem, turn "on" the control during a specific time-window and turn it "off" elsewhere. We show that our algorithm significantly improves the computational time compared with recognized methods. Convergence analysis of the new optimal control algorithm is provided for an arbitrary choice of partition. Numerical experiments are presented to illustrate the efficiency and the rapid convergence of the method.