Cross-Section Data Pipeline
Key Facts
Read this before modifying the cross-section pipeline.
421-group microscopic XS from GENDF (GXS) files → HDF5 via data/micro_xs/
Isotope dataclass: sig_t, sig_c, sig_f, sig_el, sig_inel, nu, chi (421 groups)
Sigma-zero iteration: data/macro_xs/sigma_zeros.py (self-shielding)
Mixture dataclass: macroscopic XS with SigS[l][g_from, g_to] convention
Consistency: \(\Sigma_t = \Sigma_c + \Sigma_f + \sum_{g'} \Sigma_{s,g \to g'}\)
load_isotope() auto-selects HDF5 or fallback .m parser
Verification uses synthetic XS from derivations/_xs_library.py (regions A/B/C/D), NOT this pipeline
Overview
Every solver in ORPHEUS relies on multi-group microscopic cross sections
for the 12 nuclides in the 421-energy-group JEFF-3.1 library. This
chapter documents the complete data pipeline from the authoritative IAEA
source files to the internal Isotope dataclass:
GENDF format: the IAEA distributes processed nuclear data in the GENDF (Groupwise ENDF) format, which uses fixed-width 80-column records inherited from punched-card conventions.
Parsing: the gendf.py module reads GENDF files directly, bypassing the MATLAB CSV intermediary.
Scattering assembly: elastic, inelastic, and thermal scattering matrices are combined with careful treatment of thermal-group boundaries.
HDF5 serialisation: the parsed data is stored in compressed HDF5 files for fast loading at runtime.
Loading: load_isotope() provides a uniform API that auto-selects the HDF5 backend when available.
The 12 nuclides in the library are:
| Nuclide | GXS File | Temperatures (K) | Sigma-zeros |
|---|---|---|---|
| H-1 | | 294, 350, 400, 450, 500, 550, 600, 650 | 1 |
| O-16 | | 294, 600, 900, 1200, 1500, 1800 | 6 |
| B-10 | | 294, 600, 900, 1200 | 4 |
| B-11 | | 294, 600, 900, 1200 | 4 |
| Na-23 | | 294, 600, 900, 1200 | 4 |
| U-235 | | 294, 600, 900, 1200, 1500, 1800 | 10 |
| U-238 | | 294, 600, 900, 1200, 1500, 1800 | 10 |
| Zr-90 | | 294, 600, 900, 1200 | 4 |
| Zr-91 | | 294, 600, 900, 1200 | 4 |
| Zr-92 | | 294, 600, 900, 1200 | 4 |
| Zr-94 | | 294, 600, 900, 1200 | 4 |
| Zr-96 | | 294, 600, 900, 1200 | 4 |
The GENDF Format
Source
The GENDF (Groupwise ENDF) files are obtained from the IAEA Nuclear Data Services:
These are the 421-group JEFF-3.1 processed nuclear data files. Each file contains all reaction cross sections and transfer matrices for one nuclide at multiple temperatures.
Record Layout
Every line in a GENDF file is exactly 80 characters wide, following the ENDF-6 format [ENDF102] inherited from the punched-card era:
Columns 1-66: 6 data fields, 11 characters each
Columns 67-70: MAT number (material identifier)
Columns 71-72: MF number (file type)
Columns 73-75: MT number (reaction type)
Columns 76-80: line sequence number
For example, the first data record of H-1 looks like:
 1.001000+3 9.991673-1          0          1         -1          1 125 1451    1
|---11----||---11----||---11----||---11----||---11----||---11----|MAT|MF|MT|SEQ|
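The column layout above maps directly onto fixed-width string slicing. The following is a minimal sketch, not the code in gendf.py: the function name `split_record` and its return layout are hypothetical, and the sample record is rebuilt from the H-1 values shown above with explicit padding.

```python
def split_record(line):
    """Split one 80-column GENDF record into its fixed-width parts."""
    line = line.rstrip("\n").ljust(80)
    # Columns 1-66: six data fields of 11 characters each (kept as raw text).
    fields = [line[i:i + 11] for i in range(0, 66, 11)]
    mat = int(line[66:70])   # columns 67-70: material identifier
    mf = int(line[70:72])    # columns 71-72: file type
    mt = int(line[72:75])    # columns 73-75: reaction type
    seq = int(line[75:80])   # columns 76-80: line sequence number
    return fields, mat, mf, mt, seq


# Rebuild the H-1 example with explicit padding so the widths are exact.
values = ["1.001000+3", "9.991673-1", "0", "1", "-1", "1"]
record = "".join(f"{v:>11}" for v in values) + f"{125:>4}{1:>2}{451:>3}{1:>5}"

fields, mat, mf, mt, seq = split_record(record)
print(len(record), mat, mf, mt, seq)
```

The sketch assumes every control field is populated; records with blank MAT, MF, MT, or sequence columns would need extra handling before the `int()` calls.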
Compact Float Notation
Data fields use a compact Fortran notation where the E in scientific
notation is omitted. The exponent sign immediately follows the mantissa:
1.001000+3 → 1.001000E+3 = 1001.0
9.991673-1 → 9.991673E-1 = 0.9991673
2.407191-7 → 2.407191E-7 = 2.407191×10⁻⁷
The parser _parse_gendf_field in gendf.py handles this
by inserting E before any + or - sign that follows a digit:
s = re.sub(r"(\d)([+-])", r"\1E\2", s)
return float(s)
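As a self-contained illustration of the same substitution, the sketch below wraps the regular expression in a standalone function. The name `parse_compact_float` and the treatment of blank fields as zero are assumptions made for this example, not a description of gendf.py.

```python
import re


def parse_compact_float(field):
    """Convert a GENDF field such as '9.991673-1' to a Python float."""
    s = field.strip()
    if not s:
        # Assumption for this sketch: a blank field is treated as zero.
        return 0.0
    # Insert 'E' between a digit and a following exponent sign.
    s = re.sub(r"(\d)([+-])", r"\1E\2", s)
    return float(s)


for text in ["1.001000+3", "9.991673-1", "2.407191-7", "-1", "0"]:
    print(text, "->", parse_compact_float(text))
```

A leading minus sign on the mantissa is left alone because it is not preceded by a digit, so integer-like fields such as `-1` pass through unchanged.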
MF and MT Numbers
The MF (file) and MT (reaction) numbers identify the type of data in each section:
MF=1 — General information:
| MT | Content |
|---|---|
| 451 | Header: temperatures, sigma-zero base points, energy group boundaries (422 values for 421 groups) |
MF=3 — Cross sections (sigma-zero dependent, one value per group):
| MT | Reaction |
|---|---|
| 1 | Total cross section (does not include upscattering) |
| 18 | Fission |
| 102 | Radiative capture \((n,\gamma)\) |
| 107 | \((n,\alpha)\) |
| 452 | Total \(\bar{\nu}\) (average neutrons per fission) |
MF=6 — Transfer matrices (group-to-group scattering):
| MT | Reaction |
|---|---|
| 2 | Elastic scattering (sigma-zero dependent) |
| 16 | \((n,2n)\) reaction |
| 18 | Fission spectrum \(\chi(g)\) |
| 51–91 | Discrete inelastic scattering levels |
| 221 | Free-gas thermal scattering |
| 222 | Thermal scattering for H bound in water (\(S(\alpha,\beta)\)) |
MF=3 Record Structure
Each MF=3 section begins with a header record followed by per-group data records. The structure for a section with \(N_\ell\) Legendre components and \(N_{\sigma_0}\) sigma-zero values is:
Record 1 (section header):
[ZA, AWR, NL, N_sig0, LRFLAG, NG, MAT, MF, MT, 1]
For each group g = 1, ..., NG:
Record (group header):
[TEMP, 0, NL, N_sig0, NW, IG, MAT, MF, MT, line]
Record(s) (data):
NW = 2 × NL × N_sig0 words packed 6 per line
The first half of the NW words contains flux weights; the second half contains the cross-section values organised as:
This is the Legendre-0 component for each sigma-zero. Higher Legendre components follow in the same block.
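The split between flux weights and cross-section values can be sketched as follows. This is an illustration only: the function name `split_mf3_words` is hypothetical, and the ordering inside each half (Legendre index varying fastest, sigma-zero index slowest) is an assumption chosen to match the MF=6 loop order described later in this chapter.

```python
def split_mf3_words(words, nl, n_sig0):
    """Split the NW = 2*NL*N_sig0 words of one group into flux and XS parts."""
    half = nl * n_sig0
    if len(words) != 2 * half:
        raise ValueError("expected 2 * NL * N_sig0 words")
    flux, xs = words[:half], words[half:]
    # Assumed ordering: Legendre index fastest, sigma-zero index slowest.
    by_sig0 = [xs[k * nl:(k + 1) * nl] for k in range(n_sig0)]
    # Legendre-0 value for each sigma-zero.
    p0 = [row[0] for row in by_sig0]
    return flux, by_sig0, p0


# Synthetic words for NL = 2 and N_sig0 = 3 (not library data).
words = [float(i) for i in range(12)]
flux, by_sig0, p0 = split_mf3_words(words, nl=2, n_sig0=3)
print(p0)
```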
MF=6 Record Structure
Transfer matrices in MF=6 are stored per source group in a sparse representation. For each source group \(g\):
Record (group header):
[TEMP(?), 0, NG2, IG2LO, NW, IG, MAT, MF, MT, line]
Record(s) (data):
NW words packed 6 per line
where:
NG2: number of secondary (target) groups with non-zero values
IG2LO: 1-based index of the lowest non-zero target group
NW: total words to read (includes flux weights)
IG: 1-based source group index
The data layout per source group is:
Flux weights: \(N_\ell \times N_{\sigma_0}\) values (skipped)
Transfer values: for each target group from IG2LO to IG2LO + NG2 - 2, and for each sigma-zero and Legendre order:
for i_to = IG2LO to IG2LO + NG2 - 2:
    for i_sig0 = 1 to N_sig0:
        for i_lgn = 1 to N_lgn:
            sigma_s(IG → i_to, Legendre=i_lgn, sig0=i_sig0)
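The loop order above can be written out as a short routine. This is a sketch under the stated layout, not the gendf.py implementation; the function name `unpack_mf6_group` and the tuple format it returns are hypothetical.

```python
def unpack_mf6_group(words, ig, ig2lo, ng2, n_lgn, n_sig0):
    """Return (ifrom, ito, i_sig0, i_lgn, value) tuples for one source group."""
    # The leading N_lgn * N_sig0 words are flux weights and are skipped.
    pos = n_lgn * n_sig0
    out = []
    # Target groups run from IG2LO to IG2LO + NG2 - 2 (1-based, inclusive).
    for i_to in range(ig2lo, ig2lo + ng2 - 1):
        for i_sig0 in range(1, n_sig0 + 1):
            for i_lgn in range(1, n_lgn + 1):
                out.append((ig, i_to, i_sig0, i_lgn, words[pos]))
                pos += 1
    return out


# Synthetic record: 2 Legendre orders, 1 sigma-zero, NG2 = 3, so two target groups.
words = [9.0, 9.0, 0.1, 0.2, 0.3, 0.4]
entries = unpack_mf6_group(words, ig=7, ig2lo=5, ng2=3, n_lgn=2, n_sig0=1)
for entry in entries:
    print(entry)
```

Collecting 1-based triplets like these makes it straightforward to build a sparse matrix afterwards by subtracting 1 from both group indices.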
Scattering Matrix Assembly
The scattering matrix \(\Sigma_{\mathrm{s},\ell}^{(\sigma_0)}\) is assembled from three separate GENDF sections. This is one of the most delicate parts of the pipeline.
Thermal-Group Boundary
The energy group structure uses a thermal cutoff at group index 95 (corresponding to \(E \approx 4\) eV). Below this energy, the free-atom elastic scattering model breaks down because the target atoms are bound in a lattice or molecule (thermal motion affects scattering).
The GENDF file provides free-atom elastic data and two alternative thermal models:
MT=2 — free-atom elastic scattering (valid above ~4 eV)
MT=221 — free-gas thermal scattering (all isotopes except H-1)
MT=222 — \(S(\alpha,\beta)\) thermal scattering for H bound in water (H-1 only)
Assembly Algorithm
The scattering matrix is built in three stages:
Elastic (MT=2): Extract the elastic scattering transfer matrix. Zero out all entries where the source group \(g \le 95\) (the thermal range), because thermal scattering replaces elastic in that range. Add \(10^{-30}\) to all values (matching the MATLAB convention to avoid exact zeros in sparse matrices).
vals[thermal_mask] = 0.0
vals += 1e-30
sigS[lgn][sig0] = sparse(ifrom-1, ito-1, vals, NG, NG)
Inelastic (MT=51–91): For each discrete inelastic level that exists, extract the transfer matrix and add it to sigS. Inelastic scattering is sigma-zero independent (same values for all sigma-zero variants), so the first sigma-zero’s data is used for all.
Thermal (MT=221 or MT=222): Extract the thermal scattering kernel and add it to sigS. This replaces the zeroed elastic entries in the thermal range. Like inelastic, thermal scattering is sigma-zero independent.
The final scattering matrix structure is a list of lists: sigS[legendre_order][sig0_index], each a csr_matrix(NG, NG). Three Legendre orders (P0, P1, P2) are always stored.
Important
The elastic scattering data for groups \(g > 95\) (epithermal and fast) does depend on sigma-zero. Each sigma-zero variant has different elastic values at these groups. For groups \(g \le 95\), the elastic is zeroed and replaced by the sigma-zero-independent thermal kernel.
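The assembly stages for a single Legendre order and a single sigma-zero can be sketched in plain Python. To keep the example dependency-free, a dict keyed by 1-based (source, target) group pairs stands in for the sparse matrix; the function name `assemble_scattering` and the sample values are hypothetical.

```python
THERMAL_CUTOFF = 95  # source groups g <= 95 are treated as thermal


def assemble_scattering(elastic, inelastic_levels, thermal):
    """Combine transfer matrices given as {(ifrom, ito): value} dicts."""
    sig_s = {}
    # Elastic: zero the thermal source groups, then add the 1e-30 floor.
    for (ifrom, ito), val in elastic.items():
        if ifrom <= THERMAL_CUTOFF:
            val = 0.0
        sig_s[(ifrom, ito)] = val + 1e-30
    # Inelastic: add every discrete level that is present.
    for level in inelastic_levels:
        for key, val in level.items():
            sig_s[key] = sig_s.get(key, 0.0) + val
    # Thermal: add the kernel (MT=221 or MT=222) that replaces thermal elastic.
    for key, val in thermal.items():
        sig_s[key] = sig_s.get(key, 0.0) + val
    return sig_s


# Synthetic values, not library data.
elastic = {(90, 90): 20.0, (200, 199): 3.0, (200, 200): 1.0}
inelastic = [{(200, 150): 0.5}]
thermal = {(90, 90): 18.0, (90, 91): 2.0}
sig_s = assemble_scattering(elastic, inelastic, thermal)
print(sorted(sig_s.items()))
```

In this example the elastic value of 20.0 at source group 90 is discarded and the entry ends up holding only the thermal kernel value, while the entries at source group 200 keep their elastic values.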
Reactions Not Included
The GENDF files for heavy isotopes (U-235, U-238) contain MF=6 entries for \((n,3n)\) (MT=17) and \((n,4n)\) (MT=37). These are not extracted, matching the original MATLAB code which only processes MT=51–91. The impact is negligible for thermal reactor applications (threshold energies are 6–15 MeV).
Total Cross Section
The total cross section \(\Sigma_{\mathrm{t},g}(\sigma_0)\) is computed from the components rather than read from MF=3 MT=1:
This approach is used because:
MF=3 MT=1 does not include upscattering (stated in the MATLAB source: “note that mf=3 mt=1 does not include upscatters”).
Computing from components ensures self-consistency between the total and the reaction rates used by the solver.
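A minimal sketch of a component-based total follows, using the consistency relation quoted in Key Facts (capture plus fission plus the P0 scattering row sum). Whether the pipeline also adds other channels such as \((n,\alpha)\) or \((n,2n)\) is not stated in this section, so they are left out here; the function name `total_from_components` and the data are hypothetical.

```python
def total_from_components(sig_c, sig_f, sig_s_p0):
    """Total XS per group as capture + fission + P0 scattering row sum.

    sig_c and sig_f are lists of length NG; sig_s_p0 maps 0-based
    (g_from, g_to) pairs to P0 transfer values.
    """
    ng = len(sig_c)
    out_scatter = [0.0] * ng
    for (g_from, _g_to), val in sig_s_p0.items():
        # Sum over all target groups, so upscattering is included.
        out_scatter[g_from] += val
    return [sig_c[g] + sig_f[g] + out_scatter[g] for g in range(ng)]


# Synthetic 3-group data, not library data.
sig_c = [1.0, 2.0, 3.0]
sig_f = [0.5, 0.0, 0.25]
sig_s_p0 = {(0, 0): 4.0, (0, 1): 1.0, (1, 0): 0.5, (1, 1): 2.0, (2, 2): 1.0}
sig_t = total_from_components(sig_c, sig_f, sig_s_p0)
print(sig_t)
```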
sigT Consistency Issue (Historical)
Warning
The legacy MATLAB .m data files contain a systematic discrepancy
in the stored sigT values. This section documents the issue for
future reference.
The MATLAB convertCSVtoM.m script computes sigT from
full-precision intermediate variables and writes it with %13.6e
format (6 decimal places in scientific notation). It independently
truncates all component cross sections (sigC, sigF, sigS) to the same
format.
When the .m file is loaded and sigT is recomputed from the
stored (truncated) components, the result differs from the stored
sigT by a constant offset of 10–30 barns for heavy isotopes:
| Isotope | .m sigT[0,0] | Recomputed | Offset |
|---|---|---|---|
| U-235 (294K) | 15,523.0 | 15,504.2 | 18.8 |
| U-238 (600K) | 108.14 | 77.87 | 30.27 |
| Zr-90 (600K) | (offset) | (recomputed) | 8.3 |
| H-1 (294K) | (matches) | (matches) | ~0 |
The offset is constant across all energy groups and all sigma-zero rows for a given isotope/temperature. Isotopes with \(N_{\sigma_0} = 1\) (like H-1) show no discrepancy.
Root cause: The full-precision sigS row sums differ from the
truncated-then-resummed version. Although each individual truncation
error is \(O(10^{-7})\) relative, the scattering matrices have
thousands of non-zero entries per row at resonance energies, and the
accumulation of truncation errors is significant.
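The mechanism can be illustrated by round-tripping values through the %13.6e text format and comparing a total truncated once against a sum of truncated components. The row below is synthetic, so this sketch shows only how the two sums diverge; it does not reproduce the offsets listed in the table.

```python
def truncate(x):
    """Round-trip a value through the %13.6e text format."""
    return float("%13.6e" % x)


# Synthetic row of a scattering matrix (illustrative values, not library data).
row = [1.23456789e-3 * (1.0 + 0.001 * i) for i in range(2000)]

full_sum = sum(row)
stored_total = truncate(full_sum)             # total truncated once
resummed = sum(truncate(v) for v in row)      # components truncated, then summed

max_rel_err = max(abs(truncate(v) - v) / v for v in row)
print("stored total   :", stored_total)
print("resummed total :", resummed)
print("difference     :", stored_total - resummed)
print("max relative truncation error per entry:", max_rel_err)
```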
Impact on sigma-zero iterations: The sigma-zero solver interpolates
sigT(\sigma_0) from the tabulated values. Using the GENDF-computed
sigT instead of the .m file’s stored sigT shifts the
converged sigma-zeros, which propagates to different interpolated
cross sections and ultimately a ~0.4% shift in the PWR-like mixture’s
\(k_\infty\) (1.01771 vs 1.01357).
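The quoted shift follows directly from the two values. A short check, with the variable names chosen only for this example:

```python
k_a, k_b = 1.01771, 1.01357

# Relative difference between the two multiplication factors, in percent.
rel_shift_percent = (k_a - k_b) / k_b * 100.0
print(f"{rel_shift_percent:.2f} %")  # prints 0.41 %
```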
HDF5 Storage Format
Each nuclide is stored in a single HDF5 file (e.g., U_235.h5) with
one group per temperature:
/{temp_K}K/
    @aw    : scalar (atomic weight in amu)
    @temp  : scalar (temperature in K)
    eg     : (NG+1,)        # energy group boundaries (eV)
    sig0   : (N_sig0,)      # sigma-zero base points (barns)
    sigC   : (N_sig0, NG)   # radiative capture
    sigL   : (N_sig0, NG)   # (n,alpha)
    sigF   : (N_sig0, NG)   # fission
    sigT   : (N_sig0, NG)   # total
    nubar  : (NG,)          # average neutrons per fission
    chi    : (NG,)          # fission spectrum (normalised to 1)
    sig2/
        row  : (nnz,) int32     # COO row indices
        col  : (nnz,) int32     # COO column indices
        data : (nnz,) float64   # COO values
    sigS/
        L{j}_S{k}/              # Legendre order j, sigma-zero k
            row  : (nnz,)
            col  : (nnz,)
            data : (nnz,)
Dense arrays use gzip compression (level 4). Sparse matrices are stored as COO triplets to avoid scipy-specific formats.
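The COO triplet representation and the group naming can be sketched without any HDF5 or scipy dependency. The helper names below (`dense_to_coo`, `coo_to_dense`, `sigs_group_path`) are hypothetical and only mirror the layout shown above; they are not the pipeline's own functions.

```python
def dense_to_coo(matrix):
    """Return (row, col, data) lists for the non-zero entries of a 2-D list."""
    row, col, data = [], [], []
    for i, values in enumerate(matrix):
        for j, val in enumerate(values):
            if val != 0.0:
                row.append(i)
                col.append(j)
                data.append(val)
    return row, col, data


def coo_to_dense(row, col, data, shape):
    """Rebuild the dense 2-D list from COO triplets."""
    n_rows, n_cols = shape
    matrix = [[0.0] * n_cols for _ in range(n_rows)]
    for i, j, val in zip(row, col, data):
        matrix[i][j] += val
    return matrix


def sigs_group_path(temp_k, legendre, sig0_index):
    """HDF5 group path for one scattering matrix, following the layout above."""
    return f"/{temp_k}K/sigS/L{legendre}_S{sig0_index}"


m = [[0.0, 1.5, 0.0], [0.0, 0.0, 0.0], [2.0, 0.0, 3.0]]
row, col, data = dense_to_coo(m)
print(row, col, data)
print(sigs_group_path(600, 0, 3))
```

Three flat arrays per matrix map onto three plain HDF5 datasets, which is what lets the files be read without any scipy-specific container format.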
File Sizes
| Element | Temperatures | HDF5 Size (MB) |
|---|---|---|
| H-1 | 8 | 12.3 |
| U-235 | 6 | 50.0 |
| U-238 | 6 | 37.8 |
| O-16 | 6 | 10.8 |
| Zr isotopes (×5) | 4 each | ~11 each |
Data Loading API
The load_isotope function provides a uniform API:
from data.micro_xs import load_isotope
iso = load_isotope("U_235", 600)
# iso.sigC — shape (10, 421), capture XS for 10 sigma-zeros
# iso.sigS[0][0] — csr_matrix(421, 421), P0 scattering at sig0=0
# iso.eg — shape (422,), energy group boundaries in eV
The loader reads from the HDF5 files in data/micro_xs/{name}.h5.
Conversion Script
To regenerate the HDF5 files from the GENDF sources:
cd data/micro_xs
python convert_gxs_to_hdf5.py
This processes all 12 .GXS files and writes the corresponding
.h5 files. Runtime is approximately 2–3 minutes on a modern
laptop.
Validation
The HDF5 data pipeline was validated by running both homogeneous reactor cases and comparing against the MATLAB reference:
| Case | Python \(k_\infty\) | MATLAB \(k_\infty\) | Match |
|---|---|---|---|
| Aqueous (H₂O + U-235, 294K) | 1.03596 | 1.03596 | Yes |
| PWR-like (UO₂ + Zry + H₂O+B, 600K) | 1.01357 | 1.01357 | Yes |
Per-component validation for H-1 at 294K (1 sigma-zero, simplest case):
| Quantity | Max diff (GXS vs .m) | Status |
|---|---|---|
| | 0 | Exact |
| | 0 | Exact |
| | 0 | Exact |
| | 0 | Exact |
| | 0 | Exact |
| | 0 | Exact |
| | 0 | Exact |
| | 77,627 = 77,627 | Exact |
Per-component validation for U-235 at 294K (10 sigma-zeros):
| Quantity | Max diff (GXS vs .m) | Status |
|---|---|---|
| | 0 | Exact |
| | 0 | Exact |
| | 0 | Exact |
| | \(9.6 \times 10^{-7}\) | Negligible |
| | 6,067 = 6,067 | Exact |
See also
Homogeneous Infinite-Medium Reactor: first consumer of the XS pipeline; demonstrates the full path from load_isotope() to \(k_\infty\).
Verification Suite: verification uses synthetic cross sections (regions A/B/C/D), not this pipeline.
Collision Probability Method, Discrete Ordinates Method (SN), Method of Characteristics (MOC), Monte Carlo Neutron Transport: all transport solvers consume Mixture objects from this pipeline.
References
M.A. Kellett, O. Bersillon, R.W. Mills, “The JEFF-3.1/-3.1.1 Radioactive Decay Data and Fission Yields Sub-libraries”, OECD/NEA, 2009.
[ENDF102] ENDF-6 format manual: BNL-NCS-44945 (Rev. 2012).