Pôle Rhône-Alpes de Bioinformatique Site Doua


One code to find them all

Version 1.0

CORRECTION: The handling of FASTA files has been corrected; if you had issues with scaffold names, they should now disappear (thanks to H. Lopez for helping to find the bug).
CORRECTION: One Code now returns correct reverse complement sequences; there was a bug affecting reverse complement sequences composed of joined subparts (thanks to M. Seidl for finding the bug).
NEW: A small script to sum all *copynumber* outputs is available here (thanks to P. Koch for the suggestion!).

One code to find them all is a set of Perl scripts that extract useful information about transposable elements from RepeatMasker output files, retrieve their sequences, and compute some quantitative information.
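To give a rough idea of what parsing RepeatMasker output involves, here is a minimal sketch in Python (not the Perl code distributed here). It assumes the standard whitespace-separated `.out` layout with a header followed by one line per hit; the names `Hit`, `parse_rm_out`, and `bases_per_family`, as well as the sample lines, are made up for illustration.

```python
from collections import Counter
from typing import Iterable, Iterator, NamedTuple


class Hit(NamedTuple):
    """One RepeatMasker hit (only a subset of the columns is kept)."""
    score: int
    query: str
    start: int
    end: int
    strand: str
    repeat: str
    family: str


def parse_rm_out(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield hits from the lines of a RepeatMasker .out file."""
    for line in lines:
        fields = line.split()
        # Skip blank lines and header lines (their first field is not a number).
        if len(fields) < 14 or not fields[0].isdigit():
            continue
        yield Hit(
            score=int(fields[0]),
            query=fields[4],
            start=int(fields[5]),
            end=int(fields[6]),
            # RepeatMasker writes "C" for hits on the complementary strand.
            strand="-" if fields[8] == "C" else "+",
            repeat=fields[9],
            family=fields[10],
        )


def bases_per_family(hits: Iterable[Hit]) -> Counter:
    """Sum the number of query bases covered by each repeat class/family."""
    totals = Counter()
    for hit in hits:
        totals[hit.family] += hit.end - hit.start + 1
    return totals


# Invented sample lines, for illustration only.
SAMPLE = """\
   SW   perc perc perc  query     position in query    matching  repeat       position in repeat
score   div. del. ins.  sequence  begin end   (left)   repeat    class/family begin end (left) ID

 1306   15.6  6.2  0.0  chr2L      1001  1200 (8800) +  ROO_LTR   LTR/Pao          1  200  (228)  1
 2500    2.1  0.0  0.0  chr2L      1201  1700 (8300) +  ROO_I     LTR/Pao          1  500 (8000)  1
  800   10.0  1.0  1.0  chr2L      5000  5099 (4901) C  DNAREP1_DM RC/Helitron  (10)  100     1   2
"""

if __name__ == "__main__":
    for family, bases in bases_per_family(parse_rm_out(SAMPLE.splitlines())).items():
        print(family, bases)
```

This sketch only counts covered bases per family; the actual scripts go further, for instance by assembling fragments into copies and retrieving their sequences.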

Download the code

Perl scripts (Linux)
Tutorial

Dictionaries for LTR-retrotransposons

The script matches the internal and LTR subparts of LTR retrotransposons, but the output files need to be manually checked and completed. Here are some manually curated dictionaries, kindly provided by those who made them; if you have made one and want to make it available to the community, please contact us and we will add it to the list.

Organism                  Repeat library version        Contributor(s)
Drosophila melanogaster   RM library release 20061006   E. Lerat, LBBE, Univ. Lyon 1
Homo sapiens              RM library release 20120124   E. Lerat, LBBE, Univ. Lyon 1
Arabidopsis thaliana      RM library release 20071204   A. Haudry & E. Lerat, LBBE, Univ. Lyon 1
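The role of such a dictionary can be sketched as follows, in Python and under simplifying assumptions. This is not the algorithm or the file format used by One code to find them all (see the tutorial for those): `LTR_DICT`, `group_ltr_elements`, and the `max_gap` threshold are hypothetical names introduced only to show how a mapping from internal parts to LTR parts lets adjacent hits be joined into one element.

```python
# Hypothetical dictionary: internal part name -> names of its LTR parts.
LTR_DICT = {"ROO_I": {"ROO_LTR"}}


def group_ltr_elements(hits, ltr_dict, max_gap=100):
    """Join neighbouring LTR and internal hits of the same element.

    hits: list of (name, start, end) tuples on one sequence, sorted by start.
    max_gap: assumed maximum distance (in bases) between parts to be joined.
    """
    # Reverse lookup: LTR part name -> internal part name.
    internal_of = {
        ltr: internal for internal, ltrs in ltr_dict.items() for ltr in ltrs
    }
    groups = []
    for name, start, end in hits:
        # Identify the element by its internal part name; None if unknown.
        family = name if name in ltr_dict else internal_of.get(name)
        last = groups[-1] if groups else None
        if (
            family is not None
            and last is not None
            and last["family"] == family
            and start - last["end"] <= max_gap
        ):
            # Same element and close enough: extend the current group.
            last["parts"].append(name)
            last["end"] = end
        else:
            groups.append(
                {"family": family, "start": start, "end": end, "parts": [name]}
            )
    return groups


if __name__ == "__main__":
    # Invented coordinates, for illustration only.
    example = [
        ("ROO_LTR", 1001, 1200),
        ("ROO_I", 1201, 1700),
        ("ROO_LTR", 1701, 1900),
        ("DNAREP1_DM", 5000, 5099),
    ]
    for group in group_ltr_elements(example, LTR_DICT):
        print(group)
```

Without the dictionary entry, the three ROO hits would be reported as three unrelated fragments, which is why incomplete dictionaries need manual curation.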


If you use One code to find them all in a published work, please cite the following reference:

Bailly-Bechet M., Haudry A. & Lerat E. (2014) “One code to find them all”: a Perl tool to conveniently parse RepeatMasker output files. Mobile DNA 5:13.