ePoPE Download and Supplemental Material


ePoPE: efficient Prediction of Paralog Evolution

ePoPE is a dynamic programming algorithm designed to efficiently trace back the last common ancestor of a gene family and to automatically annotate the gains and losses of its paralogs at the inner nodes of a given phylogenetic tree.
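To illustrate the kind of dynamic programming described above, here is a minimal Sankoff-style parsimony sketch in Python. It is not ePoPE's actual implementation (ePoPE itself is written in C): the toy tree, the leaf copy numbers, the upper bound on copies, and the unit gain and loss costs are all assumptions made for illustration.

```python
from math import inf

# Hypothetical toy tree: inner nodes map to their children. Leaves carry the
# observed number of paralogs. All names and numbers are illustrative only.
TREE = {"root": ["A", "n1"], "n1": ["B", "C"]}
LEAF_COUNTS = {"A": 0, "B": 2, "C": 3}
MAX_COPIES = 3   # assumed upper bound on the copy number at any node
GAIN_COST = 1    # assumed cost per gained copy
LOSS_COST = 1    # assumed cost per lost copy


def edge_cost(parent, child):
    """Cost of changing the copy number from parent to child on one edge."""
    if child >= parent:
        return GAIN_COST * (child - parent)
    return LOSS_COST * (parent - child)


def sankoff_up(node, table):
    """Fill table[node][k]: minimal subtree cost if node carries k copies."""
    if node not in TREE:  # leaf: only the observed count is allowed
        table[node] = [0 if k == LEAF_COUNTS[node] else inf
                       for k in range(MAX_COPIES + 1)]
        return
    for child in TREE[node]:
        sankoff_up(child, table)
    table[node] = [
        sum(min(edge_cost(k, j) + table[child][j]
                for j in range(MAX_COPIES + 1))
            for child in TREE[node])
        for k in range(MAX_COPIES + 1)
    ]


def sankoff_down(node, k, table, labels):
    """Trace back one optimal labelling, given that node carries k copies."""
    labels[node] = k
    for child in TREE.get(node, []):
        best = min(range(MAX_COPIES + 1),
                   key=lambda j: edge_cost(k, j) + table[child][j])
        sankoff_down(child, best, table, labels)


def reconstruct():
    """Return (minimal total cost, copy number assigned to every node)."""
    table, labels = {}, {}
    sankoff_up("root", table)
    root_k = min(range(MAX_COPIES + 1), key=lambda k: table["root"][k])
    sankoff_down("root", root_k, table, labels)
    return table["root"][root_k], labels
```

On this toy input, `reconstruct()` assigns zero copies to the root and two copies to `n1`, so `n1` plays the role of the family's last common ancestor: two copies are gained on the edge leading to `n1` and one more on the edge leading to `C`, for a total cost of 3. Ties between equally cheap labellings are broken in favour of the smaller copy number, which is a property of this sketch only.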

ePoPE is implemented in the standard C programming language and is available under the GNU General Public License.


Latest release including an improved partition function back recursion and a new summarizing script.

All outcomes for the Sankoff and partition function variants and the final annotated metazoan tree.

Former release including a new partition function variant and a Newick tree parser.

Former release including minor bug fixes and an additional option in the summarize script to adjust the file endings of the individual runs.

Original release:


  1. Installation + Tutorial for 1 alignment
  2. Tutorial for >1 alignment
  3. Usage information
  4. Supplemental Material


Download the program and save it into a directory of your choice.

Unzip the file:

$ tar -xzvf ePoPE_1.0.tar.gz

Install the program:

$ make

Test the installation with the example:

$ ./ePoPE -i example/example.stk -t example/example.tree.dat -p example/ -o example/example.ePoPE.out

You may copy the ePoPE binary into your bin directory or set an alias to your installation.

Short tutorial for more than 1 alignment

ePoPE can be applied to a set of multiple sequence alignments. If the respective gene families are related, in other words if they all belong to a certain class of genes such as miRNAs or snoRNAs, the output of these ePoPE runs can be summarized using the provided Perl script:

	   $ ./ -h
      -d DIR -o FILE [-e STR]

               -help   Prints a brief help message and exits.

               -e STR  String defining the ending of the output files from your
                       individual ePoPE runs. (Default: "ePoPE.out"). [OPTIONAL]

               -d DIR  the directory that contains the data output files of GLparaPred.
                       The ending of the files is the ending you provide with -e
                       option. [REQUIRED]

               -o FILE the output file for the final summarized tree data. [REQUIRED]
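The summarize script expects the outputs of the individual runs in one directory, all sharing the same file ending. A minimal Python sketch of preparing such a directory follows. It only uses the `-i`, `-t`, and `-o` options documented on this page. The `.stk` extension, the location of the binary at `./ePoPE`, and the function names are assumptions for illustration.

```python
import subprocess
from pathlib import Path


def build_epope_commands(aln_dir, tree_file, out_dir, ending="ePoPE.out"):
    """Return one ePoPE argument list per Stockholm alignment in aln_dir.

    Assumes alignments end in '.stk' and that the binary is './ePoPE'.
    """
    commands = []
    for aln in sorted(Path(aln_dir).glob("*.stk")):
        # Every output gets the common ending later passed to the script via -e.
        out_file = Path(out_dir) / f"{aln.stem}.{ending}"
        commands.append(["./ePoPE", "-i", str(aln), "-t", str(tree_file),
                         "-o", str(out_file)])
    return commands


def run_all(commands):
    """Run the prepared commands one after another, stopping on failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Calling `run_all(build_epope_commands("alignments", "treefile", "runs"))` would then fill the hypothetical directory `runs` with one output file per alignment, and that directory can be passed to the summarize script with `-d`.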


The summarized output is written to the file specified with the -o option. It is a space-separated file containing a list of nodes with all computed labels. It can therefore be opened in any text editor or even imported into a spreadsheet for further analysis.

The summarized ePoPE output can be visualized by calling ePoPE again. First, the summarized output needs to be sorted:

$ sort -nk 2 ePoPE.summarize.out > ePoPE.summarize.sort.out
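Where the `sort` utility is not available, the same numeric ordering by the second column can be approximated in Python. This is a sketch under assumptions: it expects every non-empty line to have a numeric second field, and ties may be ordered differently than by `sort`. The function names are hypothetical.

```python
def sort_by_second_column(lines):
    """Sort whitespace-separated records numerically by their second field."""
    records = [line for line in lines if line.strip()]  # skip empty lines
    return sorted(records, key=lambda line: float(line.split()[1]))


def sort_file(src, dst):
    """Read src, sort its records by the second column, and write dst."""
    with open(src) as handle:
        lines = handle.read().splitlines()
    with open(dst, "w") as handle:
        handle.write("\n".join(sort_by_second_column(lines)) + "\n")
```

For example, `sort_file("ePoPE.summarize.out", "ePoPE.summarize.sort.out")` would produce the sorted file used in the next step. Sorting numerically rather than as text matters because, as text, a value of 10 would be placed before 2.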

Then call the program with the sorted output and the same tree that was used when applying ePoPE to the individual alignments:

./ePoPE -c ePoPE.summarize.sort.out -t treefile -p --type all -o ePoPE.summarize.out.ePoPE.out

The file is the final summarized PostScript file.


Calling ePoPE without any arguments or with the option -h prints the help message.

$ ./ePoPE
	  | ePoPE 1.0                                        |
	  |                                                  |
	  | ePoPE - efficient Prediction of Paralog Evolution |

	  ePoPE predicts a maximal parsimony solution of gain and loss events of a gene family with paralogs.

	  Usage: ePoPE [ arguments ] -i ALNFILE -t TREEFILE

	  arguments: [-w WEIGHTFILE]
          [-o OUTFILE] [-p PS-OUTFILE]
          [-c COLLECTFILE]
          [-h,--help] [-v,--version]
          [--type TYPE]

	  -i FILE              Input alignment FILE in CLUSTALW/STOCKHOLM format. [REQUIRED]
	  -t FILE              Input tree FILE see example.tree.dat format. [REQUIRED]
	  -w FILE              Input weight array FILE. [OPTIONAL]

	  -o FILE              Output FILE for tree data. Default is 'INFILE.dat'. [OPTIONAL]

	  -p FILE              Output FILE for PS-tree data. Default is ''. [OPTIONAL]

	  -c FILE              FILE is a collection of calls to ePoPE with the same tree on a set of gene families
	                       created via ''. This option forces ePoPE to draw this summarized
	                       tree. You must provide the tree file you used for the single ePoPE calls with -t option.

	  --type TYPE          TYPE is one of {genes, gainFam, lossFam, gain, loss, all}. It selects the type of values that
	                       are plotted in the tree. Default: 'all'. [OPTIONAL]

	  -h,--help            Show this help message.
	  -v,--version         Show version information.

	  Example call:

	  ./ePoPE -i example/example.stk  -t example/example.tree.dat -p example/ -o example/example.ePoPE.out --type all

	  Please feel free to contact me for comments, bug-reports, etc.

	  ePoPE 1.0

	  Author:  Jana Hertel

	  Date:    November, 2014

Jana Hertel