
metaprotr: R package for metaproteomics analysis

05 / 02 / 2021

tech project

metaprotr began at the start of the COVID-19 pandemic. While working from home in early 2020, I was tasked with developing a reusable script to analyze and manipulate metaproteomics data. Metaproteomics involves identifying the proteins from all the microorganisms present in a sample. For example, a human fecal extract contains >70,000 proteins associated with >1,500 microorganisms.

In collaboration with Catherine Juste (FINE, INRAE of Jouy-en-Josas) and Céline Henry (PAPPSO, INRAE of Jouy-en-Josas), we decided to build an R package as the final brick in the mass spectrometry workflow, to mine the metaproteomics data generated over the previous five years at the PAPPSO proteomics facility.

This R package contains a set of tools for the descriptive analysis of metaproteomics data. These tools make it possible to cluster peptide and protein abundances, expressed as spectral counts, and to manipulate them as groups of metaproteins. The results can be displayed with multiple visualization functions that portray the global metaproteome landscape and differentiate samples or conditions in terms of metaprotein abundance, taxonomic level, and/or functional annotation.
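
To make the idea of grouping spectral counts into metaproteins more concrete, here is a minimal sketch in base R. It does not use any metaprotr function: the toy data, the column names, and the peptide-to-metaprotein mapping are all hypothetical and only illustrate the kind of aggregation described above.

```r
# Hypothetical toy data: spectral counts per peptide and per sample.
# Column names and the peptide-to-metaprotein mapping are illustrative
# assumptions, not the package's actual data format.
counts <- data.frame(
  peptide     = c("pepA", "pepB", "pepC", "pepD"),
  metaprotein = c("mp1", "mp1", "mp2", "mp2"),
  sample_1    = c(10, 4, 0, 7),
  sample_2    = c(3, 5, 2, 0)
)

# Sum the spectral counts of all peptides assigned to the same metaprotein.
by_metaprotein <- aggregate(
  cbind(sample_1, sample_2) ~ metaprotein,
  data = counts,
  FUN = sum
)

# Store the result as a matrix (rows = metaproteins, columns = samples).
abundance <- as.matrix(by_metaprotein[, c("sample_1", "sample_2")])
rownames(abundance) <- as.character(by_metaprotein$metaprotein)

# Relative abundance within each sample, so that samples with different
# total spectral counts can be compared.
relative <- sweep(abundance, 2, colSums(abundance), "/")

print(abundance)
print(relative)
```

A matrix like `relative` is the sort of input that could then feed a heatmap or an ordination plot to compare samples or conditions.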

The tools can be combined into flexible analytical pipelines that are easy to apply to metaproteomics studies. A pipeline for analyzing human gut metaproteomics data was developed as a minimalist R script that can hopefully be reused and adapted for future metaproteome analyses.

# metaproteomics #  R #  mass spec

check the source