Identification of differentially expressed splice variants by the proteogenomic pipeline Splicify

Malgorzata A. Komor, Thang V. Pham, Annemieke C. Hiemstra, Sander R. Piersma, Anne S. Bolijn, Tim Schelfhorst, Pien M. Delis-Van Diemen, Marianne Tijssen, Robert P. Sebra, Meredith Ashby, Gerrit A. Meijer, Connie R. Jimenez, Remond J.A. Fijneman

Research output: Contribution to journal > Article > Academic > peer-review

Abstract

Proteogenomics, i.e. comprehensive integration of genomics and proteomics data, is a powerful approach for identifying novel protein biomarkers. This is especially the case for proteins that differ structurally between disease and control conditions. As tumor development is associated with aberrant splicing, we focus on this rich source of cancer-specific biomarkers. To this end, we developed a proteogenomic pipeline, Splicify, which can detect differentially expressed protein isoforms. Splicify is based on integrating RNA massive parallel sequencing data and tandem mass spectrometry proteomics data to identify protein isoforms resulting from differential splicing between two conditions. Proof of concept was obtained by applying Splicify to RNA sequencing and mass spectrometry data obtained from the colorectal cancer cell line SW480, before and after siRNA-mediated downmodulation of the splicing factors SF3B1 and SRSF1. These analyses revealed 2172 and 149 differentially expressed isoforms, respectively, with peptide confirmation upon knock-down of SF3B1 and SRSF1 compared with their controls. Splice variants identified included RAC1, OSBPL3, MKI67, and SYK. One additional sample was analyzed by PacBio Iso-Seq full-length transcript sequencing after SF3B1 downmodulation. This analysis verified the alternative splicing identified by Splicify and in addition identified novel splicing events that were not represented in the human reference genome annotation. Therefore, Splicify offers a validated proteogenomic data analysis pipeline for identification of disease-specific protein biomarkers resulting from mRNA alternative splicing. Splicify is publicly available on GitHub (https://github.com/NKI-TGO/SPLICIFY) and suitable to address basic research questions using pre-clinical model systems as well as translational research questions using patient-derived samples, e.g. to identify clinically relevant biomarkers.

Original language: English
Pages (from-to): 1850-1863
Number of pages: 14
Journal: Molecular and Cellular Proteomics
Volume: 16
Issue number: 10
DOIs: 10.1074/mcp.TIR117.000056
Publication status: Published - 1 Oct 2017

Cite this

Komor, Malgorzata A. ; Pham, Thang V. ; Hiemstra, Annemieke C. ; Piersma, Sander R. ; Bolijn, Anne S. ; Schelfhorst, Tim ; Delis-Van Diemen, Pien M. ; Tijssen, Marianne ; Sebra, Robert P. ; Ashby, Meredith ; Meijer, Gerrit A. ; Jimenez, Connie R. ; Fijneman, Remond J.A. / Identification of differentially expressed splice variants by the proteogenomic pipeline Splicify. In: Molecular and Cellular Proteomics. 2017 ; Vol. 16, No. 10. pp. 1850-1863.
@article{f05cd4be26c44cb9b9c1739d3267254f,
title = "Identification of differentially expressed splice variants by the proteogenomic pipeline {Splicify}",
abstract = "Proteogenomics, i.e. comprehensive integration of genomics and proteomics data, is a powerful approach for identifying novel protein biomarkers. This is especially the case for proteins that differ structurally between disease and control conditions. As tumor development is associated with aberrant splicing, we focus on this rich source of cancer-specific biomarkers. To this end, we developed a proteogenomic pipeline, Splicify, which can detect differentially expressed protein isoforms. Splicify is based on integrating RNA massive parallel sequencing data and tandem mass spectrometry proteomics data to identify protein isoforms resulting from differential splicing between two conditions. Proof of concept was obtained by applying Splicify to RNA sequencing and mass spectrometry data obtained from the colorectal cancer cell line SW480, before and after siRNA-mediated downmodulation of the splicing factors SF3B1 and SRSF1. These analyses revealed 2172 and 149 differentially expressed isoforms, respectively, with peptide confirmation upon knock-down of SF3B1 and SRSF1 compared with their controls. Splice variants identified included RAC1, OSBPL3, MKI67, and SYK. One additional sample was analyzed by PacBio Iso-Seq full-length transcript sequencing after SF3B1 downmodulation. This analysis verified the alternative splicing identified by Splicify and in addition identified novel splicing events that were not represented in the human reference genome annotation. Therefore, Splicify offers a validated proteogenomic data analysis pipeline for identification of disease-specific protein biomarkers resulting from mRNA alternative splicing. Splicify is publicly available on GitHub (https://github.com/NKI-TGO/SPLICIFY) and suitable to address basic research questions using pre-clinical model systems as well as translational research questions using patient-derived samples, e.g. to identify clinically relevant biomarkers.",
author = "Komor, {Malgorzata A.} and Pham, {Thang V.} and Hiemstra, {Annemieke C.} and Piersma, {Sander R.} and Bolijn, {Anne S.} and Tim Schelfhorst and {Delis-Van Diemen}, {Pien M.} and Marianne Tijssen and Sebra, {Robert P.} and Meredith Ashby and Meijer, {Gerrit A.} and Jimenez, {Connie R.} and Fijneman, {Remond J.A.}",
year = "2017",
month = "10",
day = "1",
doi = "10.1074/mcp.TIR117.000056",
language = "English",
volume = "16",
pages = "1850--1863",
journal = "Molecular and Cellular Proteomics",
issn = "1535-9476",
publisher = "American Society for Biochemistry and Molecular Biology Inc.",
number = "10",

}

Identification of differentially expressed splice variants by the proteogenomic pipeline Splicify. / Komor, Malgorzata A.; Pham, Thang V.; Hiemstra, Annemieke C.; Piersma, Sander R.; Bolijn, Anne S.; Schelfhorst, Tim; Delis-Van Diemen, Pien M.; Tijssen, Marianne; Sebra, Robert P.; Ashby, Meredith; Meijer, Gerrit A.; Jimenez, Connie R.; Fijneman, Remond J.A.

In: Molecular and Cellular Proteomics, Vol. 16, No. 10, 01.10.2017, p. 1850-1863.


TY - JOUR

T1 - Identification of differentially expressed splice variants by the proteogenomic pipeline Splicify

AU - Komor, Malgorzata A.

AU - Pham, Thang V.

AU - Hiemstra, Annemieke C.

AU - Piersma, Sander R.

AU - Bolijn, Anne S.

AU - Schelfhorst, Tim

AU - Delis-Van Diemen, Pien M.

AU - Tijssen, Marianne

AU - Sebra, Robert P.

AU - Ashby, Meredith

AU - Meijer, Gerrit A.

AU - Jimenez, Connie R.

AU - Fijneman, Remond J.A.

PY - 2017/10/1

Y1 - 2017/10/1

AB - Proteogenomics, i.e. comprehensive integration of genomics and proteomics data, is a powerful approach for identifying novel protein biomarkers. This is especially the case for proteins that differ structurally between disease and control conditions. As tumor development is associated with aberrant splicing, we focus on this rich source of cancer-specific biomarkers. To this end, we developed a proteogenomic pipeline, Splicify, which can detect differentially expressed protein isoforms. Splicify is based on integrating RNA massive parallel sequencing data and tandem mass spectrometry proteomics data to identify protein isoforms resulting from differential splicing between two conditions. Proof of concept was obtained by applying Splicify to RNA sequencing and mass spectrometry data obtained from the colorectal cancer cell line SW480, before and after siRNA-mediated downmodulation of the splicing factors SF3B1 and SRSF1. These analyses revealed 2172 and 149 differentially expressed isoforms, respectively, with peptide confirmation upon knock-down of SF3B1 and SRSF1 compared with their controls. Splice variants identified included RAC1, OSBPL3, MKI67, and SYK. One additional sample was analyzed by PacBio Iso-Seq full-length transcript sequencing after SF3B1 downmodulation. This analysis verified the alternative splicing identified by Splicify and in addition identified novel splicing events that were not represented in the human reference genome annotation. Therefore, Splicify offers a validated proteogenomic data analysis pipeline for identification of disease-specific protein biomarkers resulting from mRNA alternative splicing. Splicify is publicly available on GitHub (https://github.com/NKI-TGO/SPLICIFY) and suitable to address basic research questions using pre-clinical model systems as well as translational research questions using patient-derived samples, e.g. to identify clinically relevant biomarkers.

UR - http://www.scopus.com/inward/record.url?scp=85030325349&partnerID=8YFLogxK

U2 - 10.1074/mcp.TIR117.000056

DO - 10.1074/mcp.TIR117.000056

M3 - Article

VL - 16

SP - 1850

EP - 1863

JO - Molecular and Cellular Proteomics

JF - Molecular and Cellular Proteomics

SN - 1535-9476

IS - 10

ER -