RNA-Seq is now firmly established as the "tool of choice" for exploring the transcriptome, but many challenges remain in improving the reliability and reproducibility of the method. On the statistical side, the choice between parametric and non-parametric tests is a point of controversy. In 2011, Robert Tibshirani argued that although parametric tests are widely used in the analysis of sequencing data, they remain very sensitive to data dispersion. Poisson and negative binomial models are very useful, but they can be affected by outliers, and a possible solution is to apply a non-parametric statistical test. This opinion seems to be quite widely accepted among biostatisticians, and many methods are being developed to implement non-parametric approaches in software packages.
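To make the outlier argument concrete, here is a minimal Python sketch (standard library only). It is not taken from Tibshirani's paper or from any package: the expression values are invented, and Welch's t statistic is used only as a simple stand-in for a mean-based parametric statistic, not as a Poisson or negative binomial model. It compares how one extreme sample affects a mean-based statistic and a rank-based one (the Mann-Whitney U behind the Wilcoxon rank-sum test).

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic: a mean-based (parametric) two-group comparison."""
    return (mean(x) - mean(y)) / sqrt(variance(x) / len(x) + variance(y) / len(y))


def rank_sum_u(x, y):
    """Mann-Whitney U for x: number of pairs (xi, yj) with xi > yj; ties count 0.5."""
    return sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)


# Hypothetical expression estimates for one gene in two conditions.
control = [10, 12, 11, 13, 12, 11]
treated = [20, 22, 21, 23, 24, 22]
# Same data, but the last treated sample is replaced by an extreme value.
treated_outlier = [20, 22, 21, 23, 24, 2200]

t_clean = welch_t(treated, control)
t_outlier = welch_t(treated_outlier, control)
u_clean = rank_sum_u(treated, control)
u_outlier = rank_sum_u(treated_outlier, control)

print(f"Welch t:        clean={t_clean:.2f}  with outlier={t_outlier:.2f}")
print(f"Mann-Whitney U: clean={u_clean:.0f}  with outlier={u_outlier:.0f}")
```

In this toy dataset the outlier inflates the variance so much that the t statistic collapses, while U does not change, because the extreme value is still simply the largest rank.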
Although I suggest exploring Tibshirani's paper and the method it proposes, today I would like to focus on rSeqNP, an R package built on a non-parametric approach and recently published in Bioinformatics. The package does not work on raw reads: it needs an already processed dataset. This is because it works on the expression estimates of all the genes and their isoforms for each sample in the RNA-Seq study; the outputs of rSeq, RSEM and Cuffdiff are accepted.
The package offers different methods for different kinds of analysis, as explained in the following table (Shi et al., Bioinformatics 2015).
In simulation analyses, the package showed a well-controlled type I error rate and good statistical power for moderate sample sizes and effect sizes. In the supplementary data, you can find a demo analysis on real data (Leng et al., 2013) and a comparison with the EBSeq package. rSeqNP can also detect alternative splicing, by computing an overall score for each gene, the gene-level differential score (GDS).
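For readers unfamiliar with how a type I error rate is checked by simulation, the idea can be sketched in a few lines of Python. This is not the simulation design of the rSeqNP authors, and it does not call the package. The log-normal "expression" values, the group size of 5, the seed and all function names are assumptions made for illustration. Both groups are drawn from the same distribution, so every rejection is a false positive, and the rejection rate estimates the type I error of an exact two-sided Wilcoxon rank-sum test.

```python
import random
from itertools import combinations


def null_rank_sums(n_x, n_y):
    """Rank sums of group x for every possible assignment of ranks (no ties)."""
    ranks = range(1, n_x + n_y + 1)
    return [sum(c) for c in combinations(ranks, n_x)]


def rank_sum(x, y):
    """Sum of the ranks of x in the pooled sample (assumes no tied values)."""
    pooled = sorted(x + y)
    return sum(pooled.index(v) + 1 for v in x)


def exact_p_value(w, null_sums, center):
    """Two-sided exact p-value: share of null rank sums at least as far from the center as w."""
    extreme = sum(abs(s - center) >= abs(w - center) for s in null_sums)
    return extreme / len(null_sums)


def estimate_type_i_error(n_x=5, n_y=5, n_sim=2000, alpha=0.05, seed=1):
    """Fraction of null datasets in which the test rejects at level alpha."""
    rng = random.Random(seed)
    null_sums = null_rank_sums(n_x, n_y)
    center = n_x * (n_x + n_y + 1) / 2  # expected rank sum under the null
    rejections = 0
    for _ in range(n_sim):
        # Both groups come from the same skewed distribution: no true difference.
        x = [rng.lognormvariate(3.0, 1.0) for _ in range(n_x)]
        y = [rng.lognormvariate(3.0, 1.0) for _ in range(n_y)]
        if exact_p_value(rank_sum(x, y), null_sums, center) <= alpha:
            rejections += 1
    return rejections / n_sim


rate = estimate_type_i_error()
print(f"Estimated type I error at alpha = 0.05: {rate:.3f}")
```

A "well-controlled" test is one for which this estimated rate does not exceed the nominal level. With only 5 samples per group the exact test is conservative, because the attainable p-values are discrete, so the estimate should fall below 0.05.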
The package and its documentation are freely available for download at http://www-personal.umich.edu/~jianghui/rseqnp/.