Lethal and sub-viable knockout mouse lines require whole-embryo 3D imaging to connect genotype to phenotype (Dickinson et al., 2016; Cacheiro et al., 2022). In-class samples (e.g., homozygous knockouts) are often far scarcer than wildtype or normative samples. Such extreme subject-level imbalance degrades both statistical anatomy and deep learning, often yielding saliency maps that highlight noise rather than lesion-specific signal (Adebayo et al., 2018; Buda et al., 2018; Johnson & Khoshgoftaar, 2019). We therefore asked whether focal loss (Lin et al., 2017), combined with model-capacity control and seed ensembling, can stabilize explanations without compromising classification accuracy.
Exencephaly is a neural tube defect characterized by incomplete closure of the cephalic neural tube and dorsally exposed, disorganized neural tissue (Greene & Copp, 2014; Noden & de Lahunta, 1985). In our late-gestation screen, approximately 10% of embryos had exencephaly, creating severe class imbalance and underscoring the need for interpretable automation.
As a test case for imbalance-aware, interpretable phenotyping, we analyzed 253 diceCT scans of E15.5 embryos (24 with exencephaly). A self-supervised transformer was fine-tuned under three regimes: cross-entropy (CE-Large), focal loss at equal capacity (FL-Large), and focal loss at reduced capacity (FL-Small). Five random seeds per regime yielded 15 models. Saliency was computed with Integrated Gradients, and explanation quality was measured by saliency entropy (sparsity), cross-seed Dice/Jaccard similarity (reproducibility), and expert visual inspection.
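To make the difference between the cross-entropy and focal-loss regimes concrete, the following is a minimal per-sample sketch of the binary focal loss of Lin et al. (2017), FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The function names are illustrative, and the defaults gamma = 2 and alpha = 0.25 are those reported by Lin et al.; the values used in this study are not stated here, so they should be read as assumptions.

```python
import math


def cross_entropy(p_pos, label, eps=1e-12):
    """Binary cross-entropy for one sample.

    p_pos: predicted probability of the positive class (e.g., exencephaly).
    label: 1 for positive, 0 for negative.
    """
    p_t = p_pos if label == 1 else 1.0 - p_pos
    p_t = min(max(p_t, eps), 1.0)  # clamp to avoid log(0)
    return -math.log(p_t)


def focal_loss(p_pos, label, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one sample (Lin et al., 2017).

    gamma and alpha are assumed defaults, not values from this study.
    The factor (1 - p_t) ** gamma down-weights well-classified samples,
    so hard (often minority-class) samples dominate the gradient.
    """
    p_t = p_pos if label == 1 else 1.0 - p_pos
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)  # clamp to avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # An easy positive (p = 0.9) versus a hard positive (p = 0.1).
    for p in (0.9, 0.1):
        print(p, cross_entropy(p, 1), focal_loss(p, 1))
```

With gamma = 0 the modulating factor disappears and the focal loss reduces to alpha-weighted cross-entropy, which is a convenient sanity check when swapping one loss for the other.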
All 15 models achieved near-perfect phenotype recognition on held-out data (mean accuracy 0.996 ± 0.002), with some seeds and regimes reaching 1.000. Focal loss reduced saliency entropy by up to 1.5 bits and doubled cross-seed Dice overlap. Ensemble-averaged heat-maps show that both focal-loss regimes concentrate attribution on the malformed cranial vault while suppressing spurious body-wide signals.
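The explanation metrics reported above can be illustrated with a small sketch operating on flattened saliency maps. It rests on several assumptions that are not specified here: entropy is taken over absolute attributions normalized to sum to one, maps are binarized by keeping a top fraction of voxels before computing overlap, and the ensemble map is a plain voxel-wise mean across seeds. All function names and the `fraction` parameter are hypothetical.

```python
import math


def saliency_entropy_bits(saliency):
    """Shannon entropy in bits of |saliency| normalized to a distribution.

    Lower values indicate sparser, more concentrated attribution.
    """
    mags = [abs(s) for s in saliency]
    total = sum(mags)
    if total == 0:
        raise ValueError("saliency map is all zeros")
    return -sum((m / total) * math.log2(m / total) for m in mags if m > 0)


def top_fraction_mask(saliency, fraction=0.05):
    """Indices of the top `fraction` of voxels by absolute attribution."""
    k = max(1, int(round(fraction * len(saliency))))
    order = sorted(range(len(saliency)),
                   key=lambda i: abs(saliency[i]), reverse=True)
    return set(order[:k])


def dice(a, b):
    """Dice coefficient between two index sets."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard(a, b):
    """Jaccard index between two index sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def ensemble_mean(maps):
    """Voxel-wise mean of equally sized saliency maps (one per seed)."""
    n = len(maps)
    return [sum(values) / n for values in zip(*maps)]


if __name__ == "__main__":
    seed_a = [0.0, 0.1, 0.9, 0.8, 0.0, 0.0, 0.1, 0.0]
    seed_b = [0.1, 0.0, 0.7, 0.9, 0.0, 0.2, 0.0, 0.0]
    mask_a = top_fraction_mask(seed_a, fraction=0.25)
    mask_b = top_fraction_mask(seed_b, fraction=0.25)
    print(saliency_entropy_bits(seed_a), dice(mask_a, mask_b),
          jaccard(mask_a, mask_b), ensemble_mean([seed_a, seed_b]))
```

Under this definition a uniform map over N voxels has entropy log2(N) bits and a single-voxel map has zero, so a reduction of 1.5 bits corresponds to roughly a 2.8-fold smaller effective attribution footprint (2^1.5 ≈ 2.83).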
Focal loss, modest capacity, and seed ensembling within a modified M3T transformer yielded sparse, reproducible, anatomically focused attribution while preserving perfect sensitivity. This supports trustworthy high-throughput phenotyping in severely imbalanced embryo screens. The workflow relies only on standard atlas registration and image pre-processing, requires no voxel-level annotations, and is readily adaptable to other structural malformations and developmental stages.
Source
Trustworthy detection of exencephaly in high-throughput micro-CT embryo scree…
https://www.biorxiv.org/content/10.1101/2025.08.12.669840v1?rss=1