Linear Solvers
After assembly and boundary condition application, we have a system:
$$[K]\{u\} = \{F\}$$
This is a system of linear equations. For a mesh with $n$ nodes and $d$ DOFs per node, we have $N = n \times d$ equations. Solving this efficiently is crucial — it's often the most expensive part of the analysis.
The Challenge
FEA matrices have special properties:
- Large: Millions of DOFs in industrial problems
- Sparse: Most entries are zero (nodes only connect to neighbors)
- Symmetric: $K_{ij} = K_{ji}$ (from energy principles)
- Positive definite: $\{u\}^T[K]\{u\} > 0$ for all non-zero $\{u\}$
- Banded: Non-zeros cluster near the diagonal
Direct Solvers
Direct solvers compute the exact solution (to machine precision) through matrix factorization.
LU Decomposition
Factor the matrix as $[K] = [L][U]$:
- $[L]$ = lower triangular
- $[U]$ = upper triangular
Then solve in two steps:
- Forward substitution: $[L]\{y\} = \{F\}$
- Back substitution: $[U]\{u\} = \{y\}$
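The two-step solve can be sketched with hand-rolled triangular solves — a minimal illustration assuming numpy (production codes use optimized LAPACK/BLAS routines instead):

```python
import numpy as np

def forward_substitution(L, F):
    # Solve [L]{y} = {F} top-down: each y[i] depends only on earlier entries
    y = np.zeros_like(F)
    for i in range(len(F)):
        y[i] = (F[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def back_substitution(U, y):
    # Solve [U]{u} = {y} bottom-up: each u[i] depends only on later entries
    u = np.zeros_like(y)
    for i in reversed(range(len(y))):
        u[i] = (y[i] - U[i, i+1:] @ u[i+1:]) / U[i, i]
    return u

# Build K = L U with known factors, manufacture F so the solution is [1, 2]
L = np.array([[2., 0.], [1., 1.]])
U = np.array([[2., 1.], [0., 3.]])
K = L @ U
F = K @ np.array([1., 2.])
u = back_substitution(U, forward_substitution(L, F))
```

Each substitution costs $O(N^2)$ for a dense factor — the expensive part is producing $[L]$ and $[U]$ in the first place.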
Cholesky Decomposition
For symmetric positive definite matrices:
$$[K] = [L][L]^T$$
Only need to store $[L]$ — half the memory of LU!
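With numpy the factor-then-substitute pattern looks like this (`np.linalg.cholesky` returns the lower-triangular factor):

```python
import numpy as np

K = np.array([[4., 1.], [1., 3.]])   # symmetric positive definite
F = np.array([1., 2.])

Lc = np.linalg.cholesky(K)           # K = L L^T; only L needs storing
y = np.linalg.solve(Lc, F)           # forward substitution: L y = F
u = np.linalg.solve(Lc.T, y)         # back substitution:   L^T u = y
```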
Computational Cost
| Operation | Dense Matrix | Banded (bandwidth b) | Sparse |
|---|---|---|---|
| Factorization | $O(N^3)$ | $O(Nb^2)$ | $O(N \cdot f^2)$ |
| Solve | $O(N^2)$ | $O(Nb)$ | $O(N \cdot f)$ |
| Memory | $O(N^2)$ | $O(Nb)$ | $O(N \cdot f)$ |
Where $f$ = fill-in factor (depends on ordering).
Fill-In Problem
During factorization, zeros can become non-zeros ("fill-in"). The classic example is an "arrow" matrix with a dense first row and column:

```
Original:          After factorization:
[x x x x]          [x x x x]
[x x . .]    →     [x x f f]
[x . x .]          [x f x f]   f = fill-in
[x . . x]          [x f f x]
```

Eliminating the dense first column couples every remaining row, so the factor becomes completely dense. Reversing the node order (arrow pointing the other way) produces zero fill — ordering matters.
Good node ordering minimizes fill-in:
- Cuthill-McKee: Reduces bandwidth
- Nested Dissection: Minimizes fill-in for 2D/3D meshes
- Minimum Degree: Greedy local optimization
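To see bandwidth reduction in action, scipy ships a reverse Cuthill-McKee implementation (a small demo assuming scipy is available; the toy problem is a 1D chain of nodes under a scrambled numbering):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import reverse_cuthill_mckee

def bandwidth(M):
    # Largest distance of any non-zero from the diagonal
    i, j = np.nonzero(M)
    return int(np.max(np.abs(i - j)))

# A 1D chain of 8 nodes numbered in a random order: the matrix is sparse
# but its bandwidth is large until the nodes are renumbered.
n = 8
order = np.random.default_rng(0).permutation(n)
A = np.eye(n)
for a, b in zip(order[:-1], order[1:]):   # connect consecutive chain nodes
    A[a, b] = A[b, a] = 1.0

perm = reverse_cuthill_mckee(csr_matrix(A), symmetric_mode=True)
B = A[np.ix_(perm, perm)]                 # matrix under the new numbering
print(bandwidth(A), "->", bandwidth(B))   # RCM recovers bandwidth 1
```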
When to Use Direct Solvers
Advantages:
- Exact solution (no convergence issues)
- Robust — always works for valid matrices
- Efficient for multiple right-hand sides (factor once, solve many)
- Predictable runtime

Disadvantages:
- Memory intensive for large problems: $O(N^{1.5})$ to $O(N^2)$ for 3D
- Can't exploit parallel hardware as well as iterative methods

Best for:
- Small to medium problems (< 500K DOFs)
- Problems with many load cases
- Nonlinear analysis (frequent re-solves)
- When robustness is critical
Iterative Solvers
Iterative solvers approximate the solution through successive refinements:
$$\{u\}^{(k+1)} = \{u\}^{(k)} + \text{correction}$$
Conjugate Gradient (CG)
The standard for symmetric positive definite systems:
```
r₀ = F - K·u₀                      // Initial residual
p₀ = r₀                            // Initial search direction
for k = 0, 1, 2, ...
    α = (rₖᵀrₖ) / (pₖᵀK·pₖ)        // Step size
    uₖ₊₁ = uₖ + α·pₖ               // Update solution
    rₖ₊₁ = rₖ - α·K·pₖ             // Update residual
    if ||rₖ₊₁|| < tolerance: break
    β = (rₖ₊₁ᵀrₖ₊₁) / (rₖᵀrₖ)      // Direction update
    pₖ₊₁ = rₖ₊₁ + β·pₖ             // New search direction
```
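The loop translates almost line-for-line into Python (a minimal unpreconditioned sketch assuming numpy):

```python
import numpy as np

def conjugate_gradient(K, F, tol=1e-10, max_iter=1000):
    u = np.zeros_like(F)
    r = F - K @ u                    # initial residual
    p = r.copy()                     # initial search direction
    rs = r @ r
    for _ in range(max_iter):
        Kp = K @ p                   # the only matrix-vector product needed
        alpha = rs / (p @ Kp)        # step size
        u = u + alpha * p            # update solution
        r = r - alpha * Kp           # update residual
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p    # new conjugate search direction
        rs = rs_new
    return u

K = np.array([[4., 1.], [1., 3.]])   # small SPD test system
F = np.array([1., 2.])
u = conjugate_gradient(K, F)
```

Note that `K` only appears in products `K @ p` — in a real code this is a sparse matrix-vector multiply, which is why CG never needs the matrix in dense form.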
Key properties:
- Guaranteed convergence in $N$ iterations (exact arithmetic)
- Usually converges much faster with preconditioning
- Only needs matrix-vector products (sparse-friendly)
- Memory: $O(N)$ — just a few vectors
Convergence Rate
Without preconditioning:
$$\|u - u^{(k)}\|_K \leq 2 \left(\frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1}\right)^k \|u - u^{(0)}\|_K$$
Where $\kappa = \lambda_{max}/\lambda_{min}$ is the condition number of $[K]$ and $\|\cdot\|_K$ is the energy norm.
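Plugging numbers into the bound shows why the condition number matters so much — a quick numerical check (assuming numpy; `cg_bound_factor` is just a name for the bracketed factor above):

```python
import numpy as np

def cg_bound_factor(kappa):
    # Per-iteration error-reduction factor from the CG bound
    return (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)

# Iterations for the bound 2·rho^k to drop below 1e-6 grow like sqrt(kappa)
for kappa in (1e2, 1e6):
    rho = cg_bound_factor(kappa)
    k = int(np.ceil(np.log(1e-6 / 2.0) / np.log(rho)))
    print(f"kappa = {kappa:.0e}: rho = {rho:.6f}, ~{k} iterations")
```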
Problem: FEA matrices are often ill-conditioned ($\kappa \sim 10^6$ or worse), leading to slow convergence.
Preconditioning
Transform the system to improve conditioning:
$$[M]^{-1}[K]\{u\} = [M]^{-1}\{F\}$$
Ideally, $[M] \approx [K]$ but cheap to invert.
Common preconditioners:
| Type | Description | Cost | Effectiveness |
|---|---|---|---|
| Jacobi | $M = \text{diag}(K)$ | $O(N)$ | Low |
| SSOR | Symmetric SOR | $O(N)$ | Moderate |
| ILU(0) | Incomplete LU, no fill | $O(N)$ | Good |
| ILU(k) | Incomplete LU, level-k fill | $O(N \cdot k)$ | Better |
| AMG | Algebraic Multigrid | $O(N)$ | Excellent |
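As a concrete example, Jacobi — the simplest entry in the table — drops into CG with only a few extra lines, because "inverting" a diagonal $[M]$ is elementwise division (a sketch assuming numpy; `pcg_jacobi` is my name for it):

```python
import numpy as np

def pcg_jacobi(K, F, tol=1e-10, max_iter=500):
    Minv = 1.0 / np.diag(K)          # M = diag(K): inverse is elementwise
    u = np.zeros_like(F)
    r = F - K @ u
    z = Minv * r                     # preconditioned residual
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Kp = K @ p
        alpha = rz / (p @ Kp)
        u = u + alpha * p
        r = r - alpha * Kp
        if np.linalg.norm(r) < tol:
            break
        z = Minv * r                 # apply preconditioner each iteration
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return u

rng = np.random.default_rng(1)
B = rng.standard_normal((20, 20))
K = B @ B.T + 20.0 * np.eye(20)      # random SPD test matrix
F = rng.standard_normal(20)
u = pcg_jacobi(K, F)
```

The same structure accommodates any preconditioner: replace the `Minv * r` lines with an ILU or AMG solve.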
Multigrid Methods
The most powerful iterative approach:
Idea: Smooth errors on the fine grid, correct them on the coarse grid:
- Smooth high-frequency errors on fine mesh
- Restrict residual to coarser mesh
- Solve (approximately) on coarse mesh
- Prolongate correction back to fine mesh
- Smooth again
Two variants:
- Geometric Multigrid: Requires mesh hierarchy
- Algebraic Multigrid (AMG): Constructs hierarchy from matrix
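The five steps above can be demonstrated on a toy problem: a two-grid cycle for the 1D Poisson equation (a geometric-multigrid sketch assuming numpy; real AMG builds the grid hierarchy and transfer operators automatically from the matrix):

```python
import numpy as np

def poisson_matrix(n, h):
    # Tridiagonal [K] for -u'' = f with u(0) = u(1) = 0
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def weighted_jacobi(A, f, u, sweeps=3, omega=2.0 / 3.0):
    # Damps high-frequency error components (the "smoother")
    Dinv = 1.0 / np.diag(A)
    for _ in range(sweeps):
        u = u + omega * Dinv * (f - A @ u)
    return u

n_fine = 15                          # interior fine-grid points (h = 1/16)
A = poisson_matrix(n_fine, 1.0 / (n_fine + 1))
n_coarse = (n_fine - 1) // 2         # coarse grid keeps every other point

# Linear-interpolation prolongation P and full-weighting restriction R
P = np.zeros((n_fine, n_coarse))
for j in range(n_coarse):
    P[2 * j, j], P[2 * j + 1, j], P[2 * j + 2, j] = 0.5, 1.0, 0.5
R = 0.5 * P.T
Ac = R @ A @ P                       # Galerkin coarse-grid operator

f = np.ones(n_fine)
u = np.zeros(n_fine)
for _ in range(20):                  # two-grid V-cycles
    u = weighted_jacobi(A, f, u)     # 1. smooth on fine mesh
    rc = R @ (f - A @ u)             # 2. restrict residual to coarse mesh
    ec = np.linalg.solve(Ac, rc)     # 3. solve on coarse mesh
    u = u + P @ ec                   # 4. prolongate correction back
    u = weighted_jacobi(A, f, u)     # 5. smooth again
```

Each cycle reduces the error by a roughly constant factor independent of the mesh size — the property that makes multigrid $O(N)$ overall.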
When to Use Iterative Solvers
Advantages:
- Memory efficient: $O(N)$
- Parallelizes well
- Can exploit sparsity perfectly
- Natural for transient problems

Disadvantages:
- May not converge for ill-conditioned systems
- Convergence rate is problem-dependent
- Requires good preconditioner selection
- Sensitive to matrix properties

Best for:
- Large problems (> 500K DOFs)
- 3D solid mechanics
- When memory is limited
- Parallel computing environments
Solver Comparison
| Aspect | Direct | Iterative |
|---|---|---|
| Solution | Exact | Approximate |
| Memory | $O(N^{1.5})$ to $O(N^2)$ | $O(N)$ |
| Time | Predictable | Problem-dependent |
| Robustness | High | Needs tuning |
| Parallelism | Moderate | Excellent |
| Multiple RHS | Efficient | Must re-solve |
Practical Considerations
Matrix Storage Formats
Compressed Sparse Row (CSR):
- Values array: non-zero values
- Column indices: column of each value
- Row pointers: start of each row
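scipy's `csr_matrix` exposes exactly these three arrays (a small illustration, assuming scipy is available):

```python
import numpy as np
from scipy.sparse import csr_matrix

K = np.array([[4., 0., 1.],
              [0., 3., 0.],
              [1., 0., 5.]])
A = csr_matrix(K)

print(A.data)     # [4. 1. 3. 1. 5.]  non-zero values, row by row
print(A.indices)  # [0 2 1 0 2]       column index of each value
print(A.indptr)   # [0 2 3 5]         where each row starts in data/indices
```

Row `i` occupies positions `indptr[i]` to `indptr[i+1]` in `data` and `indices`, so a sparse matrix-vector product visits only the stored non-zeros.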
Parallel Solvers
Domain decomposition:
- Partition mesh into subdomains
- Solve each subdomain independently
- Iterate to satisfy interface conditions
Popular libraries:
- MUMPS: Parallel direct (MPI)
- PARDISO: Shared-memory direct
- PETSc: Iterative framework
- Trilinos: Comprehensive solver library
Choosing a Solver
```
if problem_size < 100K_DOFs:
    use Direct (Cholesky/LU)
elif problem_size < 1M_DOFs:
    try Direct first
    if memory_exceeded: use Iterative + AMG
else:  # > 1M DOFs
    use Iterative + AMG preconditioner

if multiple_load_cases:
    prefer Direct (amortize factorization)
if nonlinear:
    Direct often better (frequent refactorization)
```
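The decision sketch above translates directly into a helper function (the thresholds are the rules of thumb from this lesson, not hard limits, and `choose_solver` is my name for it):

```python
def choose_solver(n_dofs, multiple_load_cases=False, nonlinear=False,
                  memory_exceeded=False):
    # Many load cases / nonlinear re-solves amortize one factorization
    if (multiple_load_cases or nonlinear) and not memory_exceeded:
        return "direct"
    if n_dofs < 100_000:
        return "direct"                      # Cholesky/LU
    if n_dofs < 1_000_000 and not memory_exceeded:
        return "direct"                      # try direct first
    return "iterative + AMG"

print(choose_solver(50_000))                 # direct
print(choose_solver(5_000_000))              # iterative + AMG
```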
Convergence Monitoring
For iterative solvers, always check:
- Relative residual: $\|F - Ku\| / \|F\| < \epsilon$
- Energy norm: $\sqrt{u^T K u}$ should stabilize
- Iteration count: Should be $\ll N$
Warning signs:
- Iterations > 1000
- Residual stagnating
- Residual increasing
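With scipy's CG, a callback makes this monitoring straightforward (a sketch assuming scipy; the callback receives the current iterate, and the tridiagonal test matrix stands in for an assembled $[K]$):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg

# 1D Poisson-type stiffness matrix as a stand-in for an assembled [K]
N = 50
K = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(N, N), format="csr")
F = np.ones(N)

history = []
def monitor(uk):
    # Track the relative residual at every iteration
    history.append(np.linalg.norm(F - K @ uk) / np.linalg.norm(F))

u, info = cg(K, F, callback=monitor)   # info == 0 means converged
print(f"info={info}, iterations={len(history)}, "
      f"final relative residual={history[-1]:.2e}")
```

A history that stagnates or grows is the cue to revisit the preconditioner or check the matrix for the issues below.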
Common Issues
1. Singular Matrix
Symptom: Direct solver fails; iterative diverges.
Causes:
- Insufficient boundary conditions
- Duplicate nodes
- Material with zero stiffness
2. Ill-Conditioning
Symptom: Slow convergence, inaccurate results.
Causes:
- Highly distorted elements
- Large stiffness ratios
- Nearly rigid connections
3. Memory Overflow
Symptom: Out-of-memory error.
Causes:
- Dense factorization of a sparse matrix
- Poor node ordering
- Problem too large
Key Takeaways
- Direct solvers: Exact, robust, but memory-intensive ($O(N^{1.5})$ or more)
- Iterative solvers: Memory-efficient ($O(N)$), but need preconditioning
- Cholesky: Best direct method for symmetric positive definite
- CG + AMG: Gold standard for large FEA problems
- Node ordering: Critical for direct solver efficiency
- Condition number: Determines iterative convergence rate
- Problem size: < 500K → direct; > 1M → iterative
- Always monitor residuals and iteration counts
What's Next
We can now solve FEA problems mathematically. But how do we know the results are correct? The next lesson covers Validation & Verification — ensuring our analysis is accurate and trustworthy.