Alison Tang (UC Santa Cruz) describes her lab’s studies on full-length transcript characterization of the mutated SF3B1 transcriptome in chronic lymphocytic leukemia

Sponsored content brought to you by Oxford Nanopore

Chronic lymphocytic leukemia (CLL) is the most common form of blood cancer in adults, with an estimated incidence of up to 5.5 per 100,000 people worldwide. Alison Tang, a graduate student in the lab of Angela Brooks at the University of California, Santa Cruz, has been studying SF3B1, a gene that encodes a splicing factor and is one of the most frequently mutated genes in CLL (and other cancers). Mutations in SF3B1 are associated with poor patient prognosis, but how they actually contribute to CLL progression is not well understood.*

This gene has been thoroughly studied with traditional short-read sequencing technologies, which have allowed the identification of aberrant splicing across the transcriptome associated with SF3B1 mutations.

Many groups have found a disproportionate increase in aberrant intron 3’-end splicing associated with these mutations. Tang points to data from Catherine Wu’s lab at Dana-Farber Cancer Institute, with whom the Brooks lab is collaborating. A 2016 publication from Wu’s group reported transcriptomic characterization, using short reads, of primary CLL cells to identify transcripts and pathways affected by SF3B1 mutations. They found a much greater number of aberrant intron 3’ splice sites than were seen in cells with wild-type SF3B1 (Wang, L. et al. Cancer Cell 2016;30:750-763).

However, the inherent challenges of short sequencing reads limit their application to the analysis of splice junctions alone, rather than entire isoforms. That in turn limits understanding of the functional consequences of these aberrant splicing changes.

Going Long

To overcome these challenges, Tang applied long-read nanopore sequencing to enable unambiguous identification and analysis of full-length RNA transcripts, using nine flow cells on the high-throughput Oxford Nanopore PromethION platform.

“With short-read sequencing you are only able to look at these events in isolation,” she says. “With long-read sequencing, you get a view of the longer transcriptional context of the aberrant splicing events and you can better predict their functional consequences.”

Tang’s team used some of the patient samples from Wu’s 2016 study to see how much more information they could get from sequencing the full transcriptome of CLL samples with and without the SF3B1 mutation. In all, the samples comprised B cells from three healthy donors, three CLL patients with wild-type SF3B1, and three CLL patients with the SF3B1 K700E mutation. Notably, the K700E hotspot mutation in the HEAT-repeat domain of SF3B1 has been shown to be associated with altered splice sites at the 3’ end of introns.

Tang’s study produced 149 million cDNA reads that passed QC and were then processed with the group’s Full-Length Alternative Isoform analysis of RNA (FLAIR) pipeline. Tang says this pipeline leverages the full-length transcript sequencing data afforded by the nanopore platform.

In this workflow, the raw sequencing reads are aligned to the human reference genome prior to correction using annotated splice junctions. The corrected reads are then grouped by their splice junction chain before each group is collapsed to form a consensus sequence for each individual transcript. Finally, the raw sequencing reads are reassigned to the collapsed isoforms. Isoforms that pass a given coverage threshold are retained and used as a high-confidence transcript reference.
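The grouping and filtering steps described above can be illustrated with a minimal sketch. This is a toy illustration under stated assumptions, not FLAIR's actual code: the function names, the read coordinates, and the `min_support` threshold are all hypothetical, and the alignment, splice junction correction, and consensus-building steps are omitted.

```python
from collections import defaultdict

def junction_chain(exons):
    """Return the ordered splice junctions implied by a read's exon blocks.

    `exons` is a list of (start, end) genomic intervals sorted by position;
    each junction runs from the end of one block to the start of the next.
    """
    return tuple((exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1))

def collapse_by_junction_chain(reads, min_support=3):
    """Group reads that share a junction chain; keep well-supported groups."""
    groups = defaultdict(list)
    for read_id, exons in reads.items():
        chain = junction_chain(exons)
        if chain:  # unspliced reads have no chain and are skipped in this toy
            groups[chain].append(read_id)
    # Hypothetical coverage threshold: keep chains seen in >= min_support reads.
    return {chain: ids for chain, ids in groups.items() if len(ids) >= min_support}

# Hypothetical corrected alignments: read ends vary, splice junctions agree.
reads = {
    "r1": [(100, 200), (300, 400), (500, 600)],
    "r2": [(90, 200), (300, 400), (500, 650)],
    "r3": [(100, 200), (500, 600)],  # skips the middle exon
    "r4": [(110, 200), (300, 400), (500, 620)],
    "r5": [(100, 600)],              # unspliced
}

isoforms = collapse_by_junction_chain(reads, min_support=3)
for chain, ids in isoforms.items():
    print(chain, ids)
```

In this toy data, reads r1, r2, and r4 differ at their ends but share one junction chain, so they collapse into a single candidate isoform, while the exon-skipping read r3 is dropped for lack of support.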

Alternative splicing patterns can be very complex, Tang points out. First, these are cancer samples, which are notorious for being heterogeneous. In addition, the effects of splicing factor mutations are themselves complex. The team found that when they ran short-read alternative splicing callers on their data, the callers would often miss or mis-call some of the splicing events they were expecting.


As a result, the team developed a new splicing caller, FLAIR-diffSplice, which calls the four main types of alternative splicing events (alternative 3’ and 5’ splice sites, intron retention, and exon skipping). The caller also identifies which isoforms support inclusion of an event and which support its exclusion. With the quantification of each isoform in each patient, users can then perform statistical tests to determine which splicing events differ between patient groups.
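As a rough illustration of how inclusion and exclusion counts can feed a statistical test, the sketch below computes a per-sample inclusion fraction for one event and compares two patient groups with an exact permutation test. Everything here is an assumption made for illustration: the read counts are invented, the function names are hypothetical, and the permutation test is a stand-in chosen for simplicity, not necessarily the test that FLAIR-diffSplice users would apply.

```python
from itertools import combinations
from statistics import mean

def inclusion_fraction(inclusion, exclusion):
    """Fraction of reads assigned to isoforms that include the event."""
    total = inclusion + exclusion
    return inclusion / total if total else float("nan")

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    hits = total = 0
    # Enumerate every way of relabelling the pooled samples into two groups.
    for idx in combinations(range(len(pooled)), len(group_a)):
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in idx]
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            hits += 1
        total += 1
    return hits / total

# Hypothetical (inclusion, exclusion) read counts for one alternative
# 3' splice site event, three samples per group.
mutant_counts = [(30, 70), (25, 75), (40, 60)]
wildtype_counts = [(5, 95), (10, 90), (8, 92)]

mutant = [inclusion_fraction(i, e) for i, e in mutant_counts]
wildtype = [inclusion_fraction(i, e) for i, e in wildtype_counts]

p_value = permutation_p_value(mutant, wildtype)
print(f"mean difference = {mean(mutant) - mean(wildtype):.2f}, p = {p_value:.2f}")
```

With three samples per group there are only 20 possible relabellings, so the smallest attainable two-sided p-value is 2/20 = 0.1, even when the groups separate perfectly as they do in this invented data. That limit is one reason small cohorts are usually analyzed with model-based tests on the counts rather than a permutation test.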

This caller identified the same alternative SF3B1 3’ splice site choice as was detected using a separate short-read sequencing-based analysis, confirming the validity of the approach. This pipeline, Tang says, “will correct and validate the splice junctions in the reads.” The team was also able to incorporate the short-read sequences they had for these samples. Based on these data, FLAIR built a set of isoforms that were representative of the reads.

Focusing on intron retention, Tang says that, even though the nanopore reads in their study were relatively short (approximately 1 kb), they made intron retention “much more obvious.”

Tang also notes that thanks to the ability of nanopore technology to sequence full-length transcripts, “isoform productivity can be more confidently assessed.” The team defined unproductive isoforms as those with a premature termination codon 55 nucleotides or more upstream of the most 3’ splice junction. Demonstrating the validity of their nanopore sequencing-based productivity assessment, Tang shared data showing complete concordance with previous studies of the highly characterized isoforms of SRSF1.
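The 55-nucleotide rule can be sketched in a few lines. This is a simplified illustration, not the team's implementation: the function names and toy sequences are hypothetical, the coding start position is assumed to be known, and the distance is measured from the end of the stop codon to the last splice junction, which is one possible reading of the definition above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_stop_end(seq, cds_start):
    """Return the transcript index just past the first in-frame stop codon."""
    for i in range(cds_start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i + 3
    return None  # no in-frame stop codon found

def is_unproductive(seq, exon_lengths, cds_start, min_distance=55):
    """Apply the 55-nt rule to a spliced transcript sequence.

    `exon_lengths` lists exon sizes in 5'-to-3' order, so the most 3' splice
    junction sits at sum(exon_lengths[:-1]) in transcript coordinates.
    """
    if len(exon_lengths) < 2:
        return False  # single-exon transcript: no splice junction to test
    stop_end = first_stop_end(seq, cds_start)
    if stop_end is None:
        return False  # this toy does not classify transcripts without a stop
    last_junction = sum(exon_lengths[:-1])
    return last_junction - stop_end >= min_distance

# Toy transcript 1: stop codon ends 60 nt upstream of the last junction.
exon1 = "ATG" + "GCC" * 3 + "TAA" + "A" * 60
exon2 = "C" * 30
premature = is_unproductive(exon1 + exon2, [len(exon1), len(exon2)], cds_start=0)

# Toy transcript 2: stop codon lies in the last exon.
exon1 = "ATG" + "GCC" * 20
exon2 = "GCC" * 2 + "TAA" + "C" * 20
normal = is_unproductive(exon1 + exon2, [len(exon1), len(exon2)], cds_start=0)

print(premature, normal)
```

A check like this depends on knowing every exon boundary of the transcript and the full sequence between the start codon and the last junction, which is why it benefits from full-length reads.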

When the team examined the productivity of transcripts from the mutant SF3B1 cell line, they found that expression of unproductive isoforms was decreased in comparison with the other cell types tested. Tang has also shown that genes with down-regulated unproductive intron retention are associated with kinase signaling pathways, which might support tumor proliferation.

Summarizing her work, Tang says that nanopore sequencing, combined with the FLAIR analysis workflow, enables the study of “differential isoform usage, coordinated splicing events, and isoform productivity prediction.” Her group is now developing downstream analysis pipelines for productivity, differential splicing, and isoform usage.


*Nanopore Community Meeting, hosted by Oxford Nanopore Technologies; New York: December 5–6, 2019.

