Malgorzata Dawidowska
Abstract: EP322
Type: E-Poster Presentation
Session title: Acute lymphoblastic leukemia - Biology & Translational Research
Background
T-cell acute lymphoblastic leukemia (T-ALL) is a highly heterogeneous and aggressive hematological malignancy. The genomic landscape of T-ALL has been characterized largely by whole exome sequencing (WES) and RNA-seq. Whole genome sequencing (WGS) not only extends the analysis beyond the coding sequence but also offers more uniform sequencing coverage than WES. Here we focus on protein-coding genes affected by mutations detected by WGS in pediatric T-ALL.
Aims
To provide novel insights into the genomic landscape of pediatric T-ALL by WGS.
Methods
We performed WGS on 64 T-ALL samples collected at diagnosis (Dg) and 24 matched remission samples (Rem). The TruSeq Nano DNA library kit and the Illumina HiSeq X Ten platform were used to generate 150 bp paired-end reads. WGS coverage was 60x for Dg and 30x for Rem samples. FastQC and FastQ Screen were used for quality control of the sequencing data. Reads were aligned to GRCh38 using BWA-MEM, with an average alignment rate of 99.6%. Mapped reads were de-duplicated using MarkDuplicates (Picard tools) and processed with BaseRecalibrator (GATK v4.1.0.0). The average read duplication rate was 10.3%; accounting for this, the true mean coverage across the entire genome was 72x for Dg samples (range 54x-82x) and 36x for Rem samples (range 31x-45x). Somatic variants were identified with Mutect2 (v4.1.4.0) and annotated using Variant Effect Predictor (VEP, v97). Potential germline variants were removed if their allele frequency (AF) exceeded 0.05 in the non-Finnish European (NFE) population of the gnomAD database or in the POLGENOM database (126 whole genomes of healthy, long-lived Polish individuals). We then focused on single nucleotide variants (SNVs) and indels affecting the coding sequence with moderate or high predicted impact. Variants and affected genes were analyzed for potential involvement in processes classified by cancer-related terms selected from the KEGG, Gene Ontology and Reactome databases. Affected genes were checked against previous WES, WGS and RNA-seq studies reporting genes mutated in T-ALL patients [1-5].
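The population-frequency and impact filters described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the `Variant` record, its field names, and the toy values are hypothetical, and two choices are assumptions, namely that a variant is discarded when its AF exceeds the threshold in either database and that a variant absent from a database (AF of `None`) is not treated as germline.

```python
from dataclasses import dataclass
from typing import List, Optional

AF_THRESHOLD = 0.05                  # AF cut-off stated in the Methods
KEPT_IMPACTS = {"MODERATE", "HIGH"}  # VEP impact classes retained


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one annotated somatic call."""
    sample: str
    gene: str
    impact: str                      # VEP impact: HIGH, MODERATE, LOW, MODIFIER
    gnomad_nfe_af: Optional[float]   # None if absent from gnomAD NFE
    polgenom_af: Optional[float]     # None if absent from POLGENOM


def is_potential_germline(v: Variant, threshold: float = AF_THRESHOLD) -> bool:
    """Assumption: common in EITHER database means potential germline."""
    return any(
        af is not None and af > threshold
        for af in (v.gnomad_nfe_af, v.polgenom_af)
    )


def filter_variants(variants: List[Variant]) -> List[Variant]:
    """Keep rare variants with moderate or high predicted impact."""
    return [
        v for v in variants
        if not is_potential_germline(v) and v.impact in KEPT_IMPACTS
    ]


# Toy input with made-up values, for illustration only.
toy = [
    Variant("Dg01", "GENE_A", "HIGH", 0.0001, None),     # kept
    Variant("Dg01", "GENE_B", "MODERATE", 0.20, 0.15),   # common: removed
    Variant("Dg02", "GENE_C", "LOW", None, None),        # low impact: removed
    Variant("Dg02", "GENE_D", "MODERATE", 0.01, 0.08),   # common in POLGENOM: removed
]
kept = filter_variants(toy)
print([v.gene for v in kept])
```

In practice the same logic would be applied to the annotated VCF rather than to in-memory records; the sketch only makes the filter criteria explicit.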
Results
We identified 5535 somatic SNVs and indels affecting 4117 protein-coding genes. In total, we identified 3817 genes not previously reported as affected in T-ALL studies [1-5]. Of these, 764 genes (20%) were mutated in ≥2 Dg samples (≥2/64; frequency >3%) and 54 genes (1.4%) were mutated in ≥5 Dg samples (≥5/64; frequency >7%). Among the 'novel' T-ALL-mutated genes with >10% frequency, we identified 5 members of the Neuroblastoma Breakpoint Family (NBPF), which have putative roles in several cancer types. Functional classification of all mutated genes according to KEGG terms revealed several processes affected in T-ALL patients, including: signal transduction (in 100% of T-ALL samples), transport and catabolism (97%), cell growth and death (86%), signaling molecules and interaction (81%), immune system (80%), translation (70%), folding, sorting and degradation (61%), transcription (48%), replication and repair (47%), and membrane transport (38%).
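The recurrence thresholds quoted above follow from counting, for each gene, the number of distinct Dg samples carrying at least one mutation and dividing by 64. The short Python sketch below shows that calculation; the function names and the toy calls (`GENE_A`, `GENE_B`) are hypothetical and are not data from the study.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

N_DG_SAMPLES = 64  # number of diagnostic samples in the cohort


def gene_recurrence(
    calls: Iterable[Tuple[str, str]], n_samples: int = N_DG_SAMPLES
) -> Dict[str, Tuple[int, float]]:
    """Map each gene to (number of mutated samples, fraction of cohort).

    `calls` holds (sample, gene) pairs; several variants in the same gene
    within one sample count once.
    """
    samples_per_gene = defaultdict(set)
    for sample, gene in calls:
        samples_per_gene[gene].add(sample)
    return {
        gene: (len(samples), len(samples) / n_samples)
        for gene, samples in samples_per_gene.items()
    }


def genes_mutated_in_at_least(
    recurrence: Dict[str, Tuple[int, float]], min_samples: int
) -> List[str]:
    """Genes mutated in at least `min_samples` samples, sorted by name."""
    return sorted(g for g, (n, _) in recurrence.items() if n >= min_samples)


# Thresholds used in the Results, expressed as percentages of 64 samples.
print(f"2/64 = {2 / N_DG_SAMPLES:.1%}")  # just above 3%
print(f"5/64 = {5 / N_DG_SAMPLES:.1%}")  # just above 7%

# Toy calls with made-up identifiers, for illustration only.
toy_calls = [
    ("Dg01", "GENE_A"), ("Dg01", "GENE_A"), ("Dg02", "GENE_A"),
    ("Dg03", "GENE_B"),
]
rec = gene_recurrence(toy_calls)
print(genes_mutated_in_at_least(rec, 2))
```

Counting samples rather than variants keeps a gene with several hits in a single patient from appearing recurrent.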
Conclusion
High-coverage WGS (>60x), performed in a relatively large number of T-ALL Dg samples, identified numerous recurrently mutated genes not previously reported as affected in T-ALL. These findings provide further insights into the genomic landscape of this heterogeneous malignancy. The results of this study form the basis for investigating the biological and clinical relevance of novel genes in T-ALL.
References
1/ De Keersmaecker et al. Nat Genet. 2013; 2/ Li et al. PLoS Med. 2016; 3/ Liu et al. Nat Genet. 2017; 4/ Chen et al. Proc Natl Acad Sci USA. 2018; 5/ Kimura et al. Cancer Sci. 2019
Keyword(s): Genomics, T-ALL