Linear Solvers
After assembly and boundary condition application, we have a system:
$$[K]\{u\} = \{F\}$$
This is a system of linear equations. For a mesh with $n$ nodes and $d$ DOFs per node, we have $N = n \times d$ equations. Solving this efficiently is crucial — it's often the most expensive part of the analysis.
The Challenge
FEA matrices have special properties:
- Large: Millions of DOFs in industrial problems
- Sparse: Most entries are zero (nodes only connect to neighbors)
- Symmetric: $K_{ij} = K_{ji}$ (from energy principles)
- Positive definite: $\{u\}^T[K]\{u\} > 0$ for all non-zero $\{u\}$
- Banded: Non-zeros cluster near the diagonal
Direct Solvers
Direct solvers compute the exact solution (to machine precision) through matrix factorization.
LU Decomposition
Factor the matrix as $[K] = [L][U]$:
- $[L]$ = lower triangular
- $[U]$ = upper triangular
Then solve in two steps:
- Forward substitution: $[L]\{y\} = \{F\}$
- Back substitution: $[U]\{u\} = \{y\}$
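The two-step solve can be sketched with hand-rolled triangular solves — a minimal illustration assuming numpy (production codes use optimized LAPACK/BLAS routines instead):

```python
import numpy as np

def forward_substitution(L, F):
    # Solve [L]{y} = {F} top-down: each y[i] depends only on earlier entries
    y = np.zeros_like(F)
    for i in range(len(F)):
        y[i] = (F[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def back_substitution(U, y):
    # Solve [U]{u} = {y} bottom-up: each u[i] depends only on later entries
    u = np.zeros_like(y)
    for i in reversed(range(len(y))):
        u[i] = (y[i] - U[i, i+1:] @ u[i+1:]) / U[i, i]
    return u

# Build K = L U with known factors, manufacture F so the solution is [1, 2]
L = np.array([[2., 0.], [1., 1.]])
U = np.array([[2., 1.], [0., 3.]])
K = L @ U
F = K @ np.array([1., 2.])
u = back_substitution(U, forward_substitution(L, F))
```

Each substitution costs $O(N^2)$ for a dense factor — the expensive part is producing $[L]$ and $[U]$ in the first place.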
Cholesky Decomposition
For symmetric positive definite matrices:
$$[K] = [L][L]^T$$
Only need to store $[L]$ — half the memory of LU!
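With numpy the factor-then-substitute pattern looks like this (`np.linalg.cholesky` returns the lower-triangular factor):

```python
import numpy as np

K = np.array([[4., 1.], [1., 3.]])   # symmetric positive definite
F = np.array([1., 2.])

Lc = np.linalg.cholesky(K)           # K = L L^T; only L needs storing
y = np.linalg.solve(Lc, F)           # forward substitution: L y = F
u = np.linalg.solve(Lc.T, y)         # back substitution:   L^T u = y
```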
Computational Cost
| Operation | Dense Matrix | Banded (bandwidth b) | Sparse |
|---|---|---|---|
| Factorization | $O(N^3)$ | $O(Nb^2)$ | $O(N \cdot f^2)$ |
| Solve | $O(N^2)$ | $O(Nb)$ | $O(N \cdot f)$ |
| Memory | $O(N^2)$ | $O(Nb)$ | $O(N \cdot f)$ |
Where $f$ = fill-in factor (depends on ordering).
Fill-In Problem
During factorization, zeros can become non-zeros ("fill-in"). The classic example is an "arrow" matrix with a dense first row and column:

```
Original:          After factorization:
[x x x x]          [x x x x]
[x x . .]    →     [x x f f]
[x . x .]          [x f x f]   f = fill-in
[x . . x]          [x f f x]
```

Eliminating the dense first column couples every remaining row, so the factor becomes completely dense. Reversing the node order (arrow pointing the other way) produces zero fill — ordering matters.
Good node ordering minimizes fill-in:
- Cuthill-McKee: Reduces bandwidth
- Nested Dissection: Minimizes fill-in for 2D/3D meshes
- Minimum Degree: Greedy local optimization
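To see bandwidth reduction in action, scipy ships a reverse Cuthill-McKee implementation (a small demo assuming scipy is available; the toy problem is a 1D chain of nodes under a scrambled numbering):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import reverse_cuthill_mckee

def bandwidth(M):
    # Largest distance of any non-zero from the diagonal
    i, j = np.nonzero(M)
    return int(np.max(np.abs(i - j)))

# A 1D chain of 8 nodes numbered in a random order: the matrix is sparse
# but its bandwidth is large until the nodes are renumbered.
n = 8
order = np.random.default_rng(0).permutation(n)
A = np.eye(n)
for a, b in zip(order[:-1], order[1:]):   # connect consecutive chain nodes
    A[a, b] = A[b, a] = 1.0

perm = reverse_cuthill_mckee(csr_matrix(A), symmetric_mode=True)
B = A[np.ix_(perm, perm)]                 # matrix under the new numbering
print(bandwidth(A), "->", bandwidth(B))   # RCM recovers bandwidth 1
```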
When to Use Direct Solvers
Advantages:
- Exact solution (no convergence issues)
- Robust — always works for valid matrices
- Efficient for multiple right-hand sides (factor once, solve many)
- Predictable runtime

Disadvantages:
- Memory intensive for large problems: $O(N^{1.5})$ to $O(N^2)$ for 3D
- Can't exploit parallel hardware as well as iterative methods

Best for:
- Small to medium problems (< 500K DOFs)
- Problems with many load cases
- Nonlinear analysis (frequent re-solves)
- When robustness is critical
Iterative Solvers
Iterative solvers approximate the solution through successive refinements:
$$\{u\}^{(k+1)} = \{u\}^{(k)} + \text{correction}$$
Conjugate Gradient (CG)
The standard for symmetric positive definite systems:
```
r₀ = F - K·u₀                      // Initial residual
p₀ = r₀                            // Initial search direction
for k = 0, 1, 2, ...
    α = (rₖᵀrₖ) / (pₖᵀK·pₖ)        // Step size
    uₖ₊₁ = uₖ + α·pₖ               // Update solution
    rₖ₊₁ = rₖ - α·K·pₖ             // Update residual
    if ||rₖ₊₁|| < tolerance: break
    β = (rₖ₊₁ᵀrₖ₊₁) / (rₖᵀrₖ)      // Direction update
    pₖ₊₁ = rₖ₊₁ + β·pₖ             // New search direction
```
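The loop translates almost line-for-line into Python (a minimal unpreconditioned sketch assuming numpy):

```python
import numpy as np

def conjugate_gradient(K, F, tol=1e-10, max_iter=1000):
    u = np.zeros_like(F)
    r = F - K @ u                    # initial residual
    p = r.copy()                     # initial search direction
    rs = r @ r
    for _ in range(max_iter):
        Kp = K @ p                   # the only matrix-vector product needed
        alpha = rs / (p @ Kp)        # step size
        u = u + alpha * p            # update solution
        r = r - alpha * Kp           # update residual
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p    # new conjugate search direction
        rs = rs_new
    return u

K = np.array([[4., 1.], [1., 3.]])   # small SPD test system
F = np.array([1., 2.])
u = conjugate_gradient(K, F)
```

Note that `K` only appears in products `K @ p` — in a real code this is a sparse matrix-vector multiply, which is why CG never needs the matrix in dense form.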
Key properties:
- Guaranteed convergence in $N$ iterations (exact arithmetic)
- Usually converges much faster with preconditioning
- Only needs matrix-vector products (sparse-friendly)
- Memory: $O(N)$ — just a few vectors
Convergence Rate
Without preconditioning:
$$\|u - u^{(k)}\|_K \leq 2 \left(\frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1}\right)^k \|u - u^{(0)}\|_K$$
Where $\kappa = \lambda_{max}/\lambda_{min}$ is the condition number of $[K]$ and $\|\cdot\|_K$ is the energy norm.
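Plugging numbers into the bound shows why the condition number matters so much — a quick numerical check (assuming numpy; `cg_bound_factor` is just a name for the bracketed factor above):

```python
import numpy as np

def cg_bound_factor(kappa):
    # Per-iteration error-reduction factor from the CG bound
    return (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)

# Iterations for the bound 2·rho^k to drop below 1e-6 grow like sqrt(kappa)
for kappa in (1e2, 1e6):
    rho = cg_bound_factor(kappa)
    k = int(np.ceil(np.log(1e-6 / 2.0) / np.log(rho)))
    print(f"kappa = {kappa:.0e}: rho = {rho:.6f}, ~{k} iterations")
```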
Problem: FEA matrices are often ill-conditioned ($\kappa \sim 10^6$ or worse), leading to slow convergence.
Preconditioning
Transform the system to improve conditioning:
$$[M]^{-1}[K]\{u\} = [M]^{-1}\{F\}$$
Ideally, $[M] \approx [K]$ but cheap to invert.
Common preconditioners:
| Type | Description | Cost | Effectiveness |
|---|---|---|---|
| Jacobi | $M = \text{diag}(K)$ | $O(N)$ | Low |
| SSOR | Symmetric SOR | $O(N)$ | Moderate |
| ILU(0) | Incomplete LU, no fill | $O(N)$ | Good |
| ILU(k) | Incomplete LU, level-k fill | $O(N \cdot k)$ | Better |
| AMG | Algebraic Multigrid | $O(N)$ | Excellent |
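As a concrete example, Jacobi — the simplest entry in the table — drops into CG with only a few extra lines, because "inverting" a diagonal $[M]$ is elementwise division (a sketch assuming numpy; `pcg_jacobi` is my name for it):

```python
import numpy as np

def pcg_jacobi(K, F, tol=1e-10, max_iter=500):
    Minv = 1.0 / np.diag(K)          # M = diag(K): inverse is elementwise
    u = np.zeros_like(F)
    r = F - K @ u
    z = Minv * r                     # preconditioned residual
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Kp = K @ p
        alpha = rz / (p @ Kp)
        u = u + alpha * p
        r = r - alpha * Kp
        if np.linalg.norm(r) < tol:
            break
        z = Minv * r                 # apply preconditioner each iteration
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return u

rng = np.random.default_rng(1)
B = rng.standard_normal((20, 20))
K = B @ B.T + 20.0 * np.eye(20)      # random SPD test matrix
F = rng.standard_normal(20)
u = pcg_jacobi(K, F)
```

The same structure accommodates any preconditioner: replace the `Minv * r` lines with an ILU or AMG solve.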
Multigrid Methods
The most powerful iterative approach:
Idea: Smooth errors on the fine grid, correct them on the coarse grid:
- Smooth high-frequency errors on fine mesh
- Restrict residual to coarser mesh
- Solve (approximately) on coarse mesh
- Prolongate correction back to fine mesh
- Smooth again
Two variants:
- Geometric Multigrid: Requires mesh hierarchy
- Algebraic Multigrid (AMG): Constructs hierarchy from matrix
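The five steps above can be demonstrated on a toy problem: a two-grid cycle for the 1D Poisson equation (a geometric-multigrid sketch assuming numpy; real AMG builds the grid hierarchy and transfer operators automatically from the matrix):

```python
import numpy as np

def poisson_matrix(n, h):
    # Tridiagonal [K] for -u'' = f with u(0) = u(1) = 0
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def weighted_jacobi(A, f, u, sweeps=3, omega=2.0 / 3.0):
    # Damps high-frequency error components (the "smoother")
    Dinv = 1.0 / np.diag(A)
    for _ in range(sweeps):
        u = u + omega * Dinv * (f - A @ u)
    return u

n_fine = 15                          # interior fine-grid points (h = 1/16)
A = poisson_matrix(n_fine, 1.0 / (n_fine + 1))
n_coarse = (n_fine - 1) // 2         # coarse grid keeps every other point

# Linear-interpolation prolongation P and full-weighting restriction R
P = np.zeros((n_fine, n_coarse))
for j in range(n_coarse):
    P[2 * j, j], P[2 * j + 1, j], P[2 * j + 2, j] = 0.5, 1.0, 0.5
R = 0.5 * P.T
Ac = R @ A @ P                       # Galerkin coarse-grid operator

f = np.ones(n_fine)
u = np.zeros(n_fine)
for _ in range(20):                  # two-grid V-cycles
    u = weighted_jacobi(A, f, u)     # 1. smooth on fine mesh
    rc = R @ (f - A @ u)             # 2. restrict residual to coarse mesh
    ec = np.linalg.solve(Ac, rc)     # 3. solve on coarse mesh
    u = u + P @ ec                   # 4. prolongate correction back
    u = weighted_jacobi(A, f, u)     # 5. smooth again
```

Each cycle reduces the error by a roughly constant factor independent of the mesh size — the property that makes multigrid $O(N)$ overall.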
When to Use Iterative Solvers
Advantages:
- Memory efficient: $O(N)$
- Parallelizes well
- Can exploit sparsity perfectly
- Natural for transient problems

Disadvantages:
- May not converge for ill-conditioned systems
- Convergence rate is problem-dependent
- Requires good preconditioner selection
- Sensitive to matrix properties

Best for:
- Large problems (> 500K DOFs)
- 3D solid mechanics
- When memory is limited
- Parallel computing environments
Solver Comparison
| Aspect | Direct | Iterative |
|---|---|---|
| Solution | Exact | Approximate |
| Memory | $O(N^{1.5})$ to $O(N^2)$ | $O(N)$ |
| Time | Predictable | Problem-dependent |
| Robustness | High | Needs tuning |
| Parallelism | Moderate | Excellent |
| Multiple RHS | Efficient | Must re-solve |
Practical Considerations
Matrix Storage Formats
Compressed Sparse Row (CSR):
- Values array: non-zero values
- Column indices: column of each value
- Row pointers: start of each row
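scipy's `csr_matrix` exposes exactly these three arrays (a small illustration, assuming scipy is available):

```python
import numpy as np
from scipy.sparse import csr_matrix

K = np.array([[4., 0., 1.],
              [0., 3., 0.],
              [1., 0., 5.]])
A = csr_matrix(K)

print(A.data)     # [4. 1. 3. 1. 5.]  non-zero values, row by row
print(A.indices)  # [0 2 1 0 2]       column index of each value
print(A.indptr)   # [0 2 3 5]         where each row starts in data/indices
```

Row `i` occupies positions `indptr[i]` to `indptr[i+1]` in `data` and `indices`, so a sparse matrix-vector product visits only the stored non-zeros.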
Parallel Solvers
Domain decomposition:
- Partition mesh into subdomains
- Solve each subdomain independently
- Iterate to satisfy interface conditions
Popular libraries:
- MUMPS: Parallel direct (MPI)
- PARDISO: Shared-memory direct
- PETSc: Iterative framework
- Trilinos: Comprehensive solver library
Choosing a Solver
```
if problem_size < 100K_DOFs:
    use Direct (Cholesky/LU)
elif problem_size < 1M_DOFs:
    try Direct first
    if memory_exceeded: use Iterative + AMG
else:  # > 1M DOFs
    use Iterative + AMG preconditioner

if multiple_load_cases:
    prefer Direct (amortize factorization)
if nonlinear:
    Direct often better (frequent refactorization)
```
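The decision sketch above translates directly into a helper function (the thresholds are the rules of thumb from this lesson, not hard limits, and `choose_solver` is my name for it):

```python
def choose_solver(n_dofs, multiple_load_cases=False, nonlinear=False,
                  memory_exceeded=False):
    # Many load cases / nonlinear re-solves amortize one factorization
    if (multiple_load_cases or nonlinear) and not memory_exceeded:
        return "direct"
    if n_dofs < 100_000:
        return "direct"                      # Cholesky/LU
    if n_dofs < 1_000_000 and not memory_exceeded:
        return "direct"                      # try direct first
    return "iterative + AMG"

print(choose_solver(50_000))                 # direct
print(choose_solver(5_000_000))              # iterative + AMG
```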
Convergence Monitoring
For iterative solvers, always check:
- Relative residual: $\|F - Ku\| / \|F\| < \epsilon$
- Energy norm: $\sqrt{u^T K u}$ should stabilize
- Iteration count: Should be $\ll N$
Warning signs:
- Iterations > 1000
- Residual stagnating
- Residual increasing
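With scipy's CG, a callback makes this monitoring straightforward (a sketch assuming scipy; the callback receives the current iterate, and the tridiagonal test matrix stands in for an assembled $[K]$):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg

# 1D Poisson-type stiffness matrix as a stand-in for an assembled [K]
N = 50
K = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(N, N), format="csr")
F = np.ones(N)

history = []
def monitor(uk):
    # Track the relative residual at every iteration
    history.append(np.linalg.norm(F - K @ uk) / np.linalg.norm(F))

u, info = cg(K, F, callback=monitor)   # info == 0 means converged
print(f"info={info}, iterations={len(history)}, "
      f"final relative residual={history[-1]:.2e}")
```

A history that stagnates or grows is the cue to revisit the preconditioner or check the matrix for the issues below.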
Common Issues
1. Singular Matrix
Symptom: Direct solver fails; iterative diverges.
Causes:
- Insufficient boundary conditions
- Duplicate nodes
- Material with zero stiffness
2. Ill-Conditioning
Symptom: Slow convergence, inaccurate results.
Causes:
- Highly distorted elements
- Large stiffness ratios
- Nearly rigid connections
3. Memory Overflow
Symptom: Out-of-memory error.
Causes:
- Dense factorization of a sparse matrix
- Poor node ordering
- Problem too large
Key Takeaways
- Direct solvers: Exact, robust, but memory-intensive ($O(N^{1.5})$ or more)
- Iterative solvers: Memory-efficient ($O(N)$), but need preconditioning
- Cholesky: Best direct method for symmetric positive definite
- CG + AMG: Gold standard for large FEA problems
- Node ordering: Critical for direct solver efficiency
- Condition number: Determines iterative convergence rate
- Problem size: < 500K → direct; > 1M → iterative
- Always monitor residuals and iteration counts
What's Next
We can now solve FEA problems mathematically. But how do we know the results are correct? The next lesson covers Validation & Verification — ensuring our analysis is accurate and trustworthy.