Lesson 10 of 13 · 20 min

Linear Solvers

After assembly and boundary condition application, we have a system:

$$[K]\{u\} = \{F\}$$

This is a system of linear equations. For a mesh with $n$ nodes and $d$ DOFs per node, we have $N = n \times d$ equations. Solving this efficiently is crucial — it's often the most expensive part of the analysis.

The Challenge

FEA matrices have special properties:

  • Large: Millions of DOFs in industrial problems
  • Sparse: Most entries are zero (nodes only connect to neighbors)
  • Symmetric: $K_{ij} = K_{ji}$ (from energy principles)
  • Positive definite: $\{u\}^T[K]\{u\} > 0$ for all non-zero $\{u\}$
  • Banded: Non-zeros cluster near the diagonal

Visualize sparsity patterns for different mesh sizes. Note how bandwidth depends on node numbering.
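These properties can be checked numerically. Below is a minimal sketch (not part of the lesson's code) that assembles the stiffness matrix of a 1D bar fixed at one end using SciPy; the function name `bar_stiffness` is illustrative:

```python
import numpy as np
import scipy.sparse as sp

def bar_stiffness(n_elements, k=1.0):
    """Stiffness matrix of a 1D bar fixed at node 0 (tridiagonal, SPD)."""
    n = n_elements                        # free DOFs after fixing node 0
    main = 2.0 * k * np.ones(n)
    main[-1] = k                          # free end has only one attached element
    off = -k * np.ones(n - 1)
    return sp.diags([off, main, off], [-1, 0, 1], format="csr")

K = bar_stiffness(1000)
N = K.shape[0]
print(f"{N} DOFs, {K.nnz} non-zeros, density {K.nnz / N**2:.2%}")  # sparse
assert abs(K - K.T).max() == 0                       # symmetric
assert np.all(np.linalg.eigvalsh(K.toarray()) > 0)   # positive definite
```

Even for this tiny 1D problem the density is well under 1%; in 2D/3D the matrix stays just as sparse while the bandwidth grows with the mesh connectivity.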

Direct Solvers

Direct solvers compute the exact solution (to machine precision) through matrix factorization.

LU Decomposition

Factor the matrix as $[K] = [L][U]$:

  • $[L]$ = lower triangular
  • $[U]$ = upper triangular

Then solve in two steps:

  • Forward substitution: $[L]\{y\} = \{F\}$
  • Back substitution: $[U]\{u\} = \{y\}$
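The factor-then-substitute pattern looks like this in SciPy (the 3×3 system is illustrative, not from the lesson):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

K = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])
F = np.array([1.0, 2.0, 3.0])

lu, piv = lu_factor(K)        # one-time factorization: K = P L U
u = lu_solve((lu, piv), F)    # forward + back substitution
assert np.allclose(K @ u, F)
```

The expensive step is the factorization; each subsequent `lu_solve` with a new right-hand side reuses the same factors.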

Cholesky Decomposition

For symmetric positive definite matrices:

$$[K] = [L][L]^T$$

Only need to store $[L]$ — half the memory of LU!
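A matching sketch with SciPy's Cholesky routines (again on an illustrative 3×3 SPD system):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

K = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])   # symmetric positive definite
F = np.array([1.0, 2.0, 3.0])

c, low = cho_factor(K)               # K = L L^T; only one triangle is stored
u = cho_solve((c, low), F)
assert np.allclose(K @ u, F)
```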

Computational Cost

| Operation | Dense matrix | Banded (bandwidth $b$) | Sparse |
|---|---|---|---|
| Factorization | $O(N^3)$ | $O(Nb^2)$ | $O(N \cdot f^2)$ |
| Solve | $O(N^2)$ | $O(Nb)$ | $O(N \cdot f)$ |
| Memory | $O(N^2)$ | $O(Nb)$ | $O(N \cdot f)$ |

Where $f$ = fill-in factor (depends on ordering).

Fill-In Problem

During factorization, zeros can become non-zeros ("fill-in"):

Original:           After factorization:
[x x x x]           [x x x x]
[x x . .]    →      [x x f f]   ← fill-in!
[x . x .]           [x f x f]
[x . . x]           [x f f x]

Here eliminating the first column couples rows 2–4 to each other (they all connect through node 1), turning their zeros into non-zeros.

Good node ordering minimizes fill-in:

  • Cuthill-McKee: Reduces bandwidth
  • Nested Dissection: Minimizes fill-in for 2D/3D meshes
  • Minimum Degree: Greedy local optimization
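SciPy ships a reverse Cuthill–McKee implementation; a sketch on a random symmetric pattern (illustrative, not a real mesh) shows the bandwidth reduction:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import reverse_cuthill_mckee

rng = np.random.default_rng(0)
n = 200
A = sp.random(n, n, density=0.02, random_state=rng, format="csr")
A = (A + A.T + sp.identity(n)).tocsr()       # symmetric pattern, full diagonal

def bandwidth(M):
    M = M.tocoo()
    return int(np.max(np.abs(M.row - M.col)))

perm = reverse_cuthill_mckee(A, symmetric_mode=True)
B = A[perm][:, perm]                         # apply the reordering to rows and columns
print(f"bandwidth: {bandwidth(A)} -> {bandwidth(B)}")
```

The same permutation applied before factorization shrinks both the fill-in and the memory footprint of a banded or sparse direct solver.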

When to Use Direct Solvers

Advantages:

  • Exact solution (no convergence issues)
  • Robust — always works for valid matrices
  • Efficient for multiple right-hand sides (factor once, solve many)
  • Predictable runtime

Disadvantages:

  • Memory intensive for large problems
  • $O(N^{1.5})$ to $O(N^2)$ memory for 3D
  • Can't exploit parallel hardware as well

Best for:

  • Small to medium problems (< 500K DOFs)
  • Problems with many load cases
  • Nonlinear analysis (frequent re-solves)
  • When robustness is critical

Iterative Solvers

Iterative solvers approximate the solution through successive refinements:

$$\{u\}^{(k+1)} = \{u\}^{(k)} + \text{correction}$$

Conjugate Gradient (CG)

The standard for symmetric positive definite systems:

r₀ = F - K·u₀           // Initial residual
p₀ = r₀                  // Initial search direction

for k = 0, 1, 2, ...
    α = (rₖᵀrₖ) / (pₖᵀK·pₖ)    // Step size
    uₖ₊₁ = uₖ + α·pₖ           // Update solution
    rₖ₊₁ = rₖ - α·K·pₖ         // Update residual

    if ||rₖ₊₁|| < tolerance: break

    β = (rₖ₊₁ᵀrₖ₊₁) / (rₖᵀrₖ)  // Direction update
    pₖ₊₁ = rₖ₊₁ + β·pₖ         // New search direction

Key properties:
  • Guaranteed convergence in $N$ iterations (exact arithmetic)
  • Usually converges much faster with preconditioning
  • Only needs matrix-vector products (sparse-friendly)
  • Memory: $O(N)$ — just a few vectors
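The pseudocode above translates almost line-for-line into NumPy. This is a bare-bones sketch, not a production solver; the 2×2 test system is illustrative:

```python
import numpy as np

def conjugate_gradient(K, F, u0=None, tol=1e-10, max_iter=None):
    """Plain CG for a symmetric positive definite K, mirroring the pseudocode."""
    N = len(F)
    u = np.zeros(N) if u0 is None else u0.copy()
    r = F - K @ u                   # initial residual
    p = r.copy()                    # initial search direction
    rs = r @ r
    for _ in range(max_iter or N):
        Kp = K @ p
        alpha = rs / (p @ Kp)       # step size
        u += alpha * p              # update solution
        r -= alpha * Kp             # update residual
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        beta = rs_new / rs          # direction update
        p = r + beta * p            # new search direction
        rs = rs_new
    return u

K = np.array([[4.0, 1.0], [1.0, 3.0]])
F = np.array([1.0, 2.0])
u = conjugate_gradient(K, F)
assert np.allclose(K @ u, F)
```

Note that `K` is only ever used in matrix-vector products, so a sparse matrix can be dropped in unchanged.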

Convergence Rate

Without preconditioning:

$$\|u - u^{(k)}\| \leq 2 \left(\frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1}\right)^k \|u - u^{(0)}\|$$

Where $\kappa = \lambda_{max}/\lambda_{min}$ is the condition number.

Problem: FEA matrices are often ill-conditioned ($\kappa \sim 10^6$ or worse), leading to slow convergence.
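A quick numerical illustration of the bound (the diagonal test matrix is contrived so that $\kappa = 10^6$):

```python
import numpy as np

K = np.diag([1.0, 10.0, 100.0, 1e6])   # deliberately ill-conditioned
eig = np.linalg.eigvalsh(K)
kappa = eig[-1] / eig[0]               # condition number
rate = (np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1)
print(f"kappa = {kappa:.0e}, per-iteration error factor = {rate:.6f}")
```

With $\kappa = 10^6$ the factor is about 0.998 per iteration, so reducing the error by six orders of magnitude takes thousands of iterations — hence preconditioning.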

Preconditioning

Transform the system to improve conditioning:

$$[M]^{-1}[K]\{u\} = [M]^{-1}\{F\}$$

Ideally, $[M] \approx [K]$ but cheap to invert.

Common preconditioners:

| Type | Description | Cost | Effectiveness |
|---|---|---|---|
| Jacobi | $M = \text{diag}(K)$ | $O(N)$ | Low |
| SSOR | Symmetric SOR | $O(N)$ | Moderate |
| ILU(0) | Incomplete LU, no fill | $O(N)$ | Good |
| ILU(k) | Incomplete LU, level-$k$ fill | $O(N \cdot k)$ | Better |
| AMG | Algebraic multigrid | $O(N)$ | Excellent |
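As a sketch of how a preconditioner plugs into a Krylov solver, here is Jacobi-preconditioned CG via SciPy's `LinearOperator` interface; the tridiagonal test matrix is an illustrative stand-in for a real FEA matrix:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg, LinearOperator

n = 500
main = 2.0 + 0.01 * np.arange(n)              # varying diagonal: SPD, diagonally dominant
off = -np.ones(n - 1)
K = sp.diags([off, main, off], [-1, 0, 1], format="csr")
F = np.ones(n)

d_inv = 1.0 / K.diagonal()                    # Jacobi: M = diag(K), trivially invertible
M = LinearOperator((n, n), matvec=lambda v: d_inv * v)

u, info = cg(K, F, M=M)                       # info == 0 means converged
print("converged" if info == 0 else f"cg failed, info = {info}")
```

The solver only ever asks `M` for the action $M^{-1}v$, which is why even sophisticated preconditioners like ILU or AMG slot in through the same interface.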

Multigrid Methods

The most powerful iterative approach:

Idea: Smooth errors on fine grid, correct on coarse grid.
  • Smooth high-frequency errors on fine mesh
  • Restrict residual to coarser mesh
  • Solve (approximately) on coarse mesh
  • Prolongate correction back to fine mesh
  • Smooth again

Complexity: $O(N)$ — optimal!

Types:
  • Geometric Multigrid: Requires mesh hierarchy
  • Algebraic Multigrid (AMG): Constructs hierarchy from matrix

When to Use Iterative Solvers

Advantages:

  • Memory efficient: $O(N)$
  • Parallelizes well
  • Can exploit sparsity perfectly
  • Natural for transient problems

Disadvantages:

  • May not converge for ill-conditioned systems
  • Convergence rate problem-dependent
  • Requires good preconditioner selection
  • Sensitive to matrix properties

Best for:

  • Large problems (> 500K DOFs)
  • 3D solid mechanics
  • When memory is limited
  • Parallel computing environments

Solver Comparison

| Aspect | Direct | Iterative |
|---|---|---|
| Solution | Exact (to machine precision) | Approximate |
| Memory | $O(N^{1.5})$ to $O(N^2)$ | $O(N)$ |
| Time | Predictable | Problem-dependent |
| Robustness | High | Needs tuning |
| Parallelism | Moderate | Excellent |
| Multiple RHS | Efficient | Must re-solve |

Practical Considerations

Matrix Storage Formats

Compressed Sparse Row (CSR):
  • Values array: non-zero values
  • Column indices: column of each value
  • Row pointers: start of each row

Memory: $O(\text{nnz})$ where nnz = number of non-zeros
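The three arrays map directly onto SciPy's CSR constructor (the 3×3 matrix is illustrative):

```python
import numpy as np
import scipy.sparse as sp

# CSR arrays for the matrix
#     [10  0 20]
#     [ 0 30  0]
#     [40  0 50]
values  = np.array([10.0, 20.0, 30.0, 40.0, 50.0])  # non-zero values
col_idx = np.array([0, 2, 1, 0, 2])                 # column of each value
row_ptr = np.array([0, 2, 3, 5])                    # start of each row in `values`
A = sp.csr_matrix((values, col_idx, row_ptr), shape=(3, 3))
print(A.toarray())
```

Row $i$ occupies the slice `values[row_ptr[i]:row_ptr[i+1]]`, which makes row-wise matrix-vector products cheap — exactly what iterative solvers need.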

Parallel Solvers

Domain decomposition:
  • Partition mesh into subdomains
  • Solve each subdomain independently
  • Iterate to satisfy interface conditions

Popular parallel solvers:
  • MUMPS: Parallel direct (MPI)
  • PARDISO: Shared-memory direct
  • PETSc: Iterative framework
  • Trilinos: Comprehensive solver library

Choosing a Solver

if problem_size < 100K_DOFs:
    use Direct (Cholesky/LU)
elif problem_size < 1M_DOFs:
    try Direct first
    if memory_exceeded: use Iterative + AMG
else:  # > 1M DOFs
    use Iterative + AMG preconditioner

if multiple_load_cases:
    prefer Direct (amortize factorization)

if nonlinear:
    Direct often better (frequent refactorization)

Convergence Monitoring

For iterative solvers, always check:

  • Relative residual: $\|F - Ku\| / \|F\| < \epsilon$
  • Energy norm: $\sqrt{u^T K u}$ should stabilize
  • Iteration count: Should be $\ll N$

Warning signs:
  • Iterations > 1000
  • Residual stagnating
  • Residual increasing
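Residual monitoring can be bolted onto SciPy's CG through its per-iteration callback; the 1D Poisson-type matrix below is an illustrative stand-in:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

n = 200
K = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
F = np.ones(n)

residuals = []
def monitor(xk):                       # called once per CG iteration
    residuals.append(np.linalg.norm(F - K @ xk) / np.linalg.norm(F))

u, info = cg(K, F, callback=monitor)
print(f"converged: {info == 0}, iterations: {len(residuals)}")
```

Plotting `residuals` on a log scale immediately reveals stagnation or divergence, the warning signs listed above.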

Common Issues

1. Singular Matrix

Symptom: Direct solver fails, iterative solver diverges.

Causes:
  • Insufficient boundary conditions
  • Duplicate nodes
  • Material with zero stiffness
Fix: Check constraints and mesh

2. Ill-Conditioning

Symptom: Slow convergence, inaccurate results.

Causes:
  • Highly distorted elements
  • Large stiffness ratios
  • Nearly rigid connections
Fix: Improve mesh quality, use preconditioning

3. Memory Overflow

Symptom: Out-of-memory error.

Causes:
  • Dense factorization of sparse matrix
  • Poor node ordering
  • Problem too large
Fix: Use sparse storage, reorder nodes, switch to iterative

Key Takeaways

  • Direct solvers: Exact, robust, but memory-intensive ($O(N^{1.5+})$)
  • Iterative solvers: Memory-efficient ($O(N)$), but need preconditioning
  • Cholesky: Best direct method for symmetric positive definite
  • CG + AMG: Gold standard for large FEA problems
  • Node ordering: Critical for direct solver efficiency
  • Condition number: Determines iterative convergence rate
  • Problem size: < 500K → direct; > 1M → iterative
  • Always monitor residuals and iteration counts

What's Next

We can now solve FEA problems mathematically. But how do we know the results are correct? The next lesson covers Validation & Verification — ensuring our analysis is accurate and trustworthy.