RNA-seq Analysis is Easy as 1-2-3 With Limma

F1000Res. 2016 Jun 17;5:ISCB Comm J-1408.

doi: 10.12688/f1000research.9005.3. eCollection 2016.

RNA-seq analysis is easy as 1-2-3 with limma, Glimma and edgeR

  • PMID: 27441086
  • PMCID: PMC4937821
  • DOI: 10.12688/f1000research.9005.3

Free PMC article

Charity W Law et al. F1000Res.

Abstract

The ability to easily and efficiently analyse RNA-sequencing data is a key strength of the Bioconductor project. Starting with counts summarised at the gene-level, a typical analysis involves pre-processing, exploratory data analysis, differential expression testing and pathway analysis with the results obtained informing future experiments and validation studies. In this workflow article, we analyse RNA-sequencing data from the mouse mammary gland, demonstrating use of the popular edgeR package to import, organise, filter and normalise the data, followed by the limma package with its voom method, linear modelling and empirical Bayes moderation to assess differential expression and perform gene set testing. This pipeline is further enhanced by the Glimma package which enables interactive exploration of the results so that individual samples and genes can be examined by the user. The complete analysis offered by these three packages highlights the ease with which researchers can turn the raw counts from an RNA-sequencing experiment into biological insights using Bioconductor.

Keywords: RNA sequencing; data analysis; gene expression.

Conflict of interest statement

No competing interests were disclosed.

Figures

Figure 1.

The density of log-CPM values for raw pre-filtered data (A) and post-filtered data (B) is shown for each sample. Dotted vertical lines mark the log-CPM threshold (equivalent to a CPM value of about 0.2) used in the filtering step.
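
The CPM threshold in this caption can be illustrated with a minimal Python sketch. It is not edgeR's implementation: the function names, the `offset` pseudo-count, and the `min_samples` rule are assumptions made for illustration, and edgeR's own log-CPM calculation uses a different prior-count scheme.

```python
import math

def cpm(counts, lib_sizes):
    """Counts per million for a genes x samples table (one list per gene)."""
    return [[c * 1e6 / n for c, n in zip(row, lib_sizes)] for row in counts]

def log_cpm(counts, lib_sizes, offset=0.5):
    # offset is an illustrative pseudo-count (in CPM units) that avoids log2(0);
    # it is an assumption of this sketch, not the value edgeR uses.
    return [[math.log2(v + offset) for v in row] for row in cpm(counts, lib_sizes)]

def keep_expressed(counts, lib_sizes, min_cpm=0.2, min_samples=3):
    # Keep a gene if its CPM exceeds min_cpm in at least min_samples samples.
    # min_samples=3 is a hypothetical choice for this toy example.
    return [sum(v > min_cpm for v in row) >= min_samples
            for row in cpm(counts, lib_sizes)]

# Toy data: 3 genes x 4 samples, with made-up library sizes.
counts = [[0, 0, 1, 0], [10, 20, 15, 30], [500, 800, 650, 700]]
lib_sizes = [1_000_000, 2_000_000, 1_500_000, 2_500_000]
print(keep_expressed(counts, lib_sizes))  # [False, True, True]
```

The first gene is dropped because it passes the CPM cut-off in only one sample; the other two pass in all four.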

Figure 2.

Example data: Boxplots of log-CPM values showing expression distributions for unnormalised data (A) and normalised data (B) for each sample in the modified dataset, where the counts in samples 1 and 2 have been scaled to 5% and 500% of their original values, respectively.

Figure 3.

MDS plots of log-CPM values over dimensions 1 and 2 with samples coloured and labelled by sample group (A) and over dimensions 3 and 4 with samples coloured and labelled by sequencing lane (B). Distances on the plot correspond to the leading fold-change, which by default is the average (root-mean-square) log2-fold-change for the 500 genes most divergent between each pair of samples.
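
The leading fold-change described in this caption can be sketched in a few lines of Python. This is an illustration of the definition only, not limma's code; the function name `leading_log_fc` is hypothetical, and the inputs are assumed to be log-CPM values for two samples over the same genes.

```python
import math

def leading_log_fc(x, y, top=500):
    """Root-mean-square log2 fold change over the `top` most divergent genes.

    x and y are log-CPM values for two samples, in the same gene order.
    If fewer than `top` genes are supplied, all of them are used.
    """
    squared = sorted(((a - b) ** 2 for a, b in zip(x, y)), reverse=True)[:top]
    return math.sqrt(sum(squared) / len(squared))

# Toy example with four genes, keeping only the two most divergent ones.
sample_a = [1.0, 2.0, 3.0, 4.0]
sample_b = [1.0, 4.0, 3.0, 1.0]
print(leading_log_fc(sample_a, sample_b, top=2))
```

Because the top genes are chosen separately for each pair of samples, the genes that drive one distance can differ from those that drive another.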

Figure 4.

Means (x-axis) and variances (y-axis) of each gene are plotted to show the dependence between the two before voom is applied to the data (A) and how the trend is removed after voom precision weights are applied to the data (B). The plot on the left is created within the voom function, which extracts residual variances from fitting linear models to log-CPM transformed data. Variances are then rescaled to quarter-root variances (or square-root of standard deviations) and plotted against the mean expression of each gene. The means are log2-transformed mean-counts with an offset of 2. The plot on the right is created using plotSA, which plots log2 residual standard deviations against mean log-CPM values. The average log2 residual standard deviation is marked by a horizontal blue line. In both plots, each black dot represents a gene and a red curve is fitted to these points.

Figure 5.

Venn diagram showing the number of genes DE in the comparison between basal versus LP only (left), basal versus ML only (right), and the number of genes that are DE in both comparisons (center). The number of genes that are not DE in either comparison is marked in the bottom-right.

Figure 6.

Interactive mean-difference plot generated using Glimma. Summary data (log-FCs versus log-CPM values) are shown in the left panel which is linked to the individual values per sample for a selected gene in the right panel. A table of results is also displayed below these figures, along with a search bar to allow users to look up a particular gene using the annotation information available, e.g. the Gene symbol identifier Clu.

Figure 7.

Heatmap of log-CPM values for the top 100 genes DE in basal versus LP. Expression across each gene (or row) has been scaled so that mean expression is zero and standard deviation is one. Samples with relatively high expression of a given gene are marked in red and samples with relatively low expression are marked in blue. Lighter shades and white represent genes with intermediate expression levels. Samples and genes have been reordered by the method of hierarchical clustering. A dendrogram is shown for the sample clustering.
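
The row scaling mentioned in this caption can be sketched as follows. The function name `scale_row` is hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption of this sketch rather than something the caption states.

```python
import statistics

def scale_row(values):
    """Centre one gene's log-CPM values to mean 0 and scale to SD 1."""
    mu = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator), an assumption
    return [(v - mu) / sd for v in values]

# One gene measured in three samples.
print(scale_row([1.0, 2.0, 3.0]))  # [-1.0, 0.0, 1.0]
```

Scaling each row separately means the colours compare samples within a gene; they do not show which genes are more highly expressed overall.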

Figure 8.

Barcode plot of LIM_MAMMARY_LUMINAL_MATURE_UP (red bars, top of plot) and LIM_MAMMARY_LUMINAL_MATURE_DN (blue bars, bottom of plot) gene sets in the LP versus ML contrast. For each set, an enrichment line that shows the relative enrichment of the vertical bars in each part of the plot is displayed. The experiment of Lim et al. (2010) is very similar to the current one, with the same sorting strategy used to obtain the different cell populations, except that microarrays were used instead of RNA-seq to profile gene expression. Note that the inverse correlation (the up gene set is down and the down gene set is up) is a result of the way the contrast has been set up (LP versus ML) – if reversed, the directionality would agree.

References

    1. Robinson MD, McCarthy DJ, Smyth GK: edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26(1):139–40. doi: 10.1093/bioinformatics/btp616
    2. Ritchie ME, Phipson B, Wu D, et al.: limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43(7):e47. doi: 10.1093/nar/gkv007
    3. Huber W, Carey VJ, Gentleman R, et al.: Orchestrating high-throughput genomic analysis with Bioconductor. Nat Methods. 2015;12(2):115–21. doi: 10.1038/nmeth.3252
    4. Su S, Law CW, Ah-Cann C, et al.: Glimma: interactive graphics for gene expression analysis. Bioinformatics. 2017;33(13):2050–2052. doi: 10.1093/bioinformatics/btx094
    5. Sheridan JM, Ritchie ME, Best SA, et al.: A pooled shRNA screen for regulators of primary mammary stem and progenitor cells identifies roles for Asap1 and Prox1. BMC Cancer. 2015;15(1):221. doi: 10.1186/s12885-015-1187-z

Grant support

This work was funded by the National Health and Medical Research Council (NHMRC) (Fellowship GNT1058892 and Program GNT1054618 to GKS, Project GNT1050661 to MER and GKS and Fellowship GNT1104924 to MER), Victorian State Government Operational Infrastructure Support and Australian Government NHMRC IRIISS.


Source: https://pubmed.ncbi.nlm.nih.gov/27441086/
