To address these gaps, we conducted EWAS in three independent cohorts comprising judicially or socially certified CM cases and matched controls. Our primary aim was to identify robust epigenetic signatures associated with CM. We conducted EWAS separately in each cohort, followed by meta-analysis to identify common methylation markers. We further examined associations between these markers and CM-related biological indicators, and evaluated their utility in classifying CM exposure. To our knowledge, this is the first EWAS of CM employing multiple independent cohorts analyzed in parallel using a unified analytical pipeline, enabling subsequent meta-analysis and synthesis of findings. Our study thus provides a critical step toward identifying reliable epigenetic biomarkers for early detection and prevention of CM, overcoming limitations of previous research.
This study included three Japanese cohorts comprising a total of 226 children for genome-wide DNA methylation analyses (Table 1). The cohorts consisted of judicial autopsy cases, children sheltered in residential childcare facilities, and typically developing (TD) children raised by biological families recruited from the local community as controls. Children in residential childcare facilities had been legally removed from their biological parents by Child Protection Services or equivalent authorities, and most had documented histories of physical, emotional, or sexual abuse, or neglect prior to placement. Participants with documented maltreatment histories were classified as the CM group (ICD-10-CM Code T74). Psychosocial difficulties and depressive symptoms were assessed using the Strengths and Difficulties Questionnaire (SDQ) [14] and the Depression Self-Rating Scale for Children (DSRS-C) [15], respectively.
Twenty-six children whose deaths were judicially authenticated by a forensic pathologist (M.N.) between 2000 and 2021 were included. Of these, 15 cases (CM) had causes of death attributed to child abuse or neglect, and the remaining 11 cases (TD) were due to fatal accidents or illnesses (Supplementary Table S1). Thymus weight records were available for 24 cases (CM:15, TD:9). Whole blood samples from 18 cases (CM:11, TD:7) had been stored at -20 °C. Formalin-fixed paraffin-embedded (FFPE) brain tissues or formalin-immersed brain blocks were preserved for 24 cases (CM:13, TD:11). Prefrontal cortex tissues were selected for methylation analysis, as this region was consistently preserved across all cases. The study protocol was approved by the Ethics Committee of the University of Fukui (approval no. 20200030) and the Research Ethics Review Board of Hiroshima University (approval no. E-2032), and was conducted in accordance with the Declaration of Helsinki.
One hundred twenty-two children aged 0-9 years participated in this cohort between 2017 and 2021. Participants underwent assessments of social cognitive function using gaze pattern analysis and provided buccal mucosa samples [16,17,18]. Genome-wide methylation analysis was conducted on 85 participants (CM:36, TD:49) who passed the quality control (QC) procedures described below, had no repeated measurements, and completed cognitive assessments using either the Wechsler Intelligence Scale for Children-Fourth Edition (WISC-IV), Kyoto Scale of Psychological Development (KSPD), or equivalent developmental scales (Table 1). The study protocol was approved by the Ethics Committee of the University of Fukui (approval nos. 20140142, 20150068, and 20190107) and conducted in accordance with the Declaration of Helsinki. Written informed consent was obtained from all parents or childcare facility directors.
Two hundred thirty-seven children and adolescents aged 6-18 years (CM:83, TD:154) participated in this cohort between 2013 and 2022, undergoing brain MRI scans [4, 5, 19,20,21,22] (Supplementary Table S2). Group comparisons of brain gray matter (GM) structures were conducted using the full dataset. Saliva samples were collected from 141 participants. Genome-wide methylation analysis was performed on a subset of 123 participants (CM:61, TD:62) who passed the QC procedures described below, had no repeated measurements, and completed full-scale IQ (FSIQ) assessments (Table 1). The study protocol was approved by the Ethics Committee of the University of Fukui (approval nos. 20110104, 20130157, 20138031, 20150068, 20190107, 20210004, 20220034, and 20220039) and conducted in accordance with the Declaration of Helsinki. Written informed consent was obtained from all parents or childcare facility directors.
For the Judicial Autopsy Cases, DNA was extracted from whole blood samples using the AllPrep DNA/RNA/miRNA Universal Kit (QIAGEN, Venlo, Netherlands). Brain DNA was extracted from FFPE tissues using the High Pure FFPET DNA Isolation Kit (Roche, Basel, Switzerland), starting from approximately 10 mg tissue blocks or six slices (12 µm thickness, 3 cm²) until a total yield of 250 ng DNA was obtained. A modified pre-processing protocol was employed to improve extraction efficiency, including two five-minute ethanol washes, an overnight PBS wash (50 °C, 600 rpm), and overnight lysis (56 °C, 600 rpm) [23, 24].
In the Toddler Social Cognition cohort, buccal swab samples were collected using commercially available cotton swabs, with either one swab (CM:13, TD:16) or four swabs (CM:23, TD:33) per individual [16, 18]. DNA was extracted using the QIAamp DNA Mini Kit (QIAGEN, Venlo, Netherlands). The first 16 TD swabs had unintentionally been stored at room temperature for an average of 461 ± 25 (SD) days before DNA extraction; thus, storage duration was included as a covariate in subsequent analyses. In the Adolescent Brain Imaging cohort, saliva samples were collected using the Oragene Discover OGR-500 kit (DNA Genotek Inc., Ottawa, Canada), and DNA was extracted using the prepIT®·L2P reagent (DNA Genotek) [4, 5, 25]. A total of 119 individuals (CM:58, TD:61) underwent both brain MRI and saliva collection and were available for imaging epigenetics analysis. Within this group, 72 saliva samples (CM:38, TD:34) were collected on the day of brain imaging or within several days of it, whereas the remaining 47 samples (CM:20, TD:27) were collected on dates that differed from the imaging date. DNA concentration was quantified using the Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific Inc., Pittsburgh, PA, USA).
Genome-wide DNA methylation was assessed using the Infinium HumanMethylationEPIC BeadChip Kit (Illumina). DNA samples (500 ng for peripheral and 250 ng for brain tissues) were bisulfite-converted using the EZ DNA Methylation™ Kit (Zymo Research). Standardized QC procedures for peripheral tissues (blood, buccal mucosa, and saliva) were conducted separately for each cohort, including rigorous probe and sample filtering, data normalization, and correction for batch effects (see Supplementary Methods). For brain samples, customized QC criteria were applied to accommodate variations in sample quality (see Supplementary Methods).
Thymus weight ratio, reflecting the severity and duration of child abuse or neglect [26], was calculated relative to age-specific normal ranges established by the Medico-Legal Society of Japan [27] (see Supplementary Methods).
Social cognitive function was assessed using gaze fixation time on the eye region measured by the eye-tracking system (Gazefinder®), as previously described [17, 18, 28] (see Supplementary Methods).
Structural MRI data were acquired from 237 participants using 3-Tesla scanners, and images were preprocessed using Statistical Parametric Mapping software (SPM12). Regional differences in gray matter volume (GMV) between groups were analyzed using voxel-based morphometry (VBM), adjusting for age, FSIQ, scanner type, and total GMV (see Supplementary Methods).
Two sets of analyses -- differentially methylated probe (DMP) analysis and the Gene-based Association test for Multiple Traits (GAMuT) -- were performed for each cohort to identify methylation sites and genes associated with CM.
For the DMP analysis, we applied a multiple linear regression model using the limma package [29], with DNA methylation as the dependent variable and group (CM vs. TD) as the independent variable, adjusting for relevant covariates. Specifically, covariates included age, sex, days after death, and estimated cell-type proportions (CD8T, CD4T, NK, B, and monocytes) for the Judicial Autopsy Cases; age, sex, IQ or DQ, days until DNA extraction, and buccal cell proportions for the Toddler Social Cognition cohort; and age, sex, FSIQ, and buccal cell proportions for the Adolescent Brain Imaging cohort. Genome-wide analyses often exhibit inflated test statistics (reflected by a genomic inflation factor λ greater than 1), typically due to residual confounding, which raises the type I error rate and produces false-positive findings. To correct for this inflation, we applied the bacon method [30] to the DMP analysis results. Storey's q-value < 0.05 was considered statistically significant. For the top 20 CpGs identified in these DMP analyses, we conducted association analyses with the endophenotypes and clinical measures described above for each cohort.
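As a rough illustration of the inflation diagnostic mentioned above, λ can be computed from two-sided P-values as the median observed χ² statistic (1 df) divided by its expected median under the null. The following Python sketch uses only the standard library. It is not the bacon procedure [30], which estimates an empirical null distribution, and the function name `genomic_inflation` is hypothetical.

```python
from statistics import NormalDist, median

def genomic_inflation(p_values):
    """Genomic inflation factor (lambda) from two-sided P-values in (0, 1].

    Illustrative diagnostic only; this is not the bacon correction.
    """
    nd = NormalDist()
    # Convert each two-sided P-value to a 1-df chi-square statistic (z squared).
    chi2 = [nd.inv_cdf(p / 2) ** 2 for p in p_values]
    # Median of a 1-df chi-square distribution under the null (about 0.455).
    expected_median = nd.inv_cdf(0.75) ** 2
    return median(chi2) / expected_median

# Evenly spaced P-values mimic a well-calibrated null, so lambda is close to 1.
null_p = [(i + 0.5) / 1001 for i in range(1001)]
print(round(genomic_inflation(null_p), 3))
```

Values of λ well above 1 would suggest that test statistics are systematically inflated and that a correction such as bacon is warranted.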
GAMuT is a recently developed genome-wide analytical method designed to examine epigenetic associations across multiple phenotype domains simultaneously [31]. In this study, we conducted gene-based analyses to estimate epigenetic associations across maltreatment-type domains. Each CpG site was assigned to its closest gene using hiAnnotator [32] and Ensembl gene predictions, following the approach described by Hüls et al. [31]. All CpG sites within 20 kb of the nearest gene were included. In addition to the group variable, each maltreatment domain (physical abuse [PA], emotional abuse [EA], sexual abuse [SA], and neglect [NG]) was tested individually or aggregated into a composite maltreatment measure. Covariates included in the GAMuT model were identical to those used in the DMP analyses. To control for multiple testing, we adopted a significance threshold of P < 5.0E-05, consistent with the suggestive threshold used by Hüls et al. [31]. Genes meeting this threshold for the group variable, the composite measure, or any individual maltreatment domain were identified. As a secondary analysis, we performed DMP analyses for each probe within these significant genes, testing associations with group and individual maltreatment domains. Probes with P < 0.001 in at least one domain were considered significant, following previously established criteria [31]. GAMuT analyses were not conducted for the Judicial Autopsy Cases, as multiple-trait maltreatment information was unavailable.
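The CpG-to-gene assignment step can be sketched as a nearest-interval lookup with a 20 kb cutoff. The Python example below is a simplified stand-in and not the hiAnnotator implementation [32]. The tuple layout, the gene names, the coordinates, and the inclusive treatment of the 20 kb boundary are all assumptions made for illustration.

```python
def assign_nearest_gene(cpg_chrom, cpg_pos, genes, max_dist=20_000):
    """Return the name of the closest gene within max_dist bp, or None.

    genes: iterable of (name, chrom, start, end) tuples (hypothetical layout).
    """
    best_name, best_dist = None, None
    for name, chrom, start, end in genes:
        if chrom != cpg_chrom:
            continue
        if start <= cpg_pos <= end:
            dist = 0  # CpG lies inside the gene body
        else:
            dist = min(abs(cpg_pos - start), abs(cpg_pos - end))
        if best_dist is None or dist < best_dist:
            best_name, best_dist = name, dist
    # Keep the assignment only if the closest gene is within the window
    # (boundary treated as inclusive here, which is an assumption).
    if best_dist is not None and best_dist <= max_dist:
        return best_name
    return None

# Hypothetical gene coordinates, for illustration only.
example_genes = [
    ("GENE_A", "chr1", 1_000, 2_000),
    ("GENE_B", "chr1", 50_000, 60_000),
]
print(assign_nearest_gene("chr1", 2_500, example_genes))   # 500 bp from GENE_A
print(assign_nearest_gene("chr1", 25_000, example_genes))  # no gene within 20 kb
```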
For brain methylation data, multiple regression analyses were performed individually for each probe meeting the customized QC criteria, using a model similar to that described above for blood methylation, except without adjustment for cell-type proportions.
A meta-analysis was conducted on 816,366 CpGs common to all three cohorts using the weighted sum of z-scores method [33]. Multiple testing correction was performed using Storey's q-value. To assess potential heterogeneity across cohorts, we calculated heterogeneity metrics (Cochran's Q-statistic and I²) for significant CpG sites identified in the meta-analysis. For the CpGs identified as significant in the meta-analysis, we conducted association analyses with maltreatment history (described in detail below), as well as with the endophenotypes and clinical measures described above for each cohort.
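For a single CpG, the weighted sum of z-scores and the two heterogeneity metrics can be sketched as follows. This is a minimal Python illustration using only the standard library, assuming square-root-of-sample-size weights (a common choice for this method, though the weights actually used are not stated here) and inverse-variance weights for Cochran's Q. The function names and the z-scores in the usage lines are hypothetical; the sample sizes are the per-cohort methylation sample counts reported above.

```python
from math import sqrt
from statistics import NormalDist

def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-cohort z-scores; weights are sqrt(n) (an assumption)."""
    weights = [sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    z_meta = numerator / sqrt(sum(w * w for w in weights))
    # Two-sided P-value from the standard normal distribution.
    p_meta = 2 * NormalDist().cdf(-abs(z_meta))
    return z_meta, p_meta

def cochran_q_i2(betas, standard_errors):
    """Cochran's Q and I-squared (percent) from per-cohort effect estimates."""
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2

# Hypothetical z-scores for one CpG across the three cohorts (n = 18, 85, 123).
z, p = weighted_z_meta([2.1, 1.8, 2.5], [18, 85, 123])
print(f"z = {z:.2f}, P = {p:.2g}")
```

Under this weighting, larger cohorts contribute more to the combined statistic, while I² expresses the share of between-cohort variation that exceeds what chance alone would produce.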
To obtain a comprehensive overview of biological functions enriched among the top-ranked CpGs identified in the meta-analysis, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were conducted on the top 1500 CpG sites. GO enrichment results were visualized using a GO semantic similarity matrix [34], clustering significantly enriched GO terms (P < 0.05). KEGG pathways with P < 0.05 were considered significant.
Candidate CpGs previously reported in five independent EWAS examining associations with CM [8, 10,11,12,13] were evaluated within the analytical framework of the meta-analysis.
Because maltreatment history variables (age at exposure, type, and duration of maltreatment) inherently presume the presence of CM, analyses examining their associations were restricted to the CM-associated CpGs identified in our meta-analysis. To evaluate sensitive developmental periods and the relative importance of maltreatment types, we conducted random forest regression analyses as previously described [4] (see Supplementary Methods).
Validation analyses were conducted using a publicly available dataset comprising blood DNA methylation data (EPIC array) from institutionalized and family-raised children (GSE118940) [10]. Logistic regression and methylation risk scores (MRS) [35] were used to evaluate the predictive utility of the identified CpGs (see Supplementary Methods).
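A methylation risk score is commonly formed as a weighted sum of methylation values at selected CpGs. The Python sketch below illustrates that general form together with a rank-based area under the ROC curve (AUC). It is not the exact procedure in the Supplementary Methods: the CpG identifiers, weights, and beta values are hypothetical, and the use of AUC as the performance measure is an assumption.

```python
def methylation_risk_score(betas, weights):
    """Weighted sum of methylation beta values over the weighted CpGs.

    betas:   dict mapping CpG ID -> beta value for one sample
    weights: dict mapping CpG ID -> weight (e.g. a discovery effect size)
    """
    return sum(w * betas[cpg] for cpg, w in weights.items())

def auc(case_scores, control_scores):
    """Probability that a random case scores higher than a random control."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5  # ties count as half
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical CpG IDs, weights, and beta values, for illustration only.
weights = {"cg_example_1": 2.0, "cg_example_2": -1.0}
sample = {"cg_example_1": 0.5, "cg_example_2": 0.2}
print(round(methylation_risk_score(sample, weights), 3))
```

An AUC of 0.5 corresponds to chance-level discrimination between groups, and values approaching 1 indicate that the score separates cases from controls well.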
To assess correlations between peripheral tissue methylation and brain methylation, the DMPs identified in each cohort and the meta-analysis were cross-referenced with the AMAZE-CpG database [36], which we previously developed for Japanese and Asian populations. This database provides Spearman's correlation coefficients (rho) and associated P-values calculated from paired samples of live brain and peripheral tissues (blood, buccal, and saliva) obtained from 19 Japanese individuals.
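For reference, Spearman's rho between paired peripheral and brain methylation values at one CpG is the Pearson correlation of their ranks. The Python sketch below shows the calculation with average ranks for ties. It does not query the AMAZE-CpG database [36], and the function names and the paired values in the usage lines are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(vx * vy)

# Hypothetical paired beta values (peripheral tissue vs. brain) at one CpG.
peripheral = [0.10, 0.25, 0.40, 0.55]
brain = [0.12, 0.35, 0.30, 0.60]
print(round(spearman_rho(peripheral, brain), 2))
```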