Programming for Evolutionary Biology
         March 17th - April 1st 2012
             Leipzig, Germany




Introduction to Unix systems
Extra: writing simple pipelines
           with make
         Giovanni Marco Dall'Olio
         Universitat Pompeu Fabra
             Barcelona (Spain)
GNU/make
   make is a tool to store command-line instructions 
     and re-execute them quickly, along with all their 
     parameters
   It is a declarative programming language
   It belongs to a class of software called 'automated 
       build tools'
Simplest Makefile example
   The simplest Makefile contains just the name of a task and 
      the commands associated with it:




   print_hello is a makefile 'rule': it stores the commands
    needed to say 'Hello, world!' to the screen.
Simplest Makefile example
   [Figure: an annotated Makefile rule. Labels: the target of the
      rule; the commands associated with the rule; the indentation
      before each command is a tabulation (not 8 spaces)]
Simplest Makefile example
   Create a file on your 
      computer and save it as 
      'Makefile'.
   Write these instructions in it:

      print_hello:
          echo 'Hello, world!!'

   The indentation before 'echo' must be a tabulation (<Tab> key)
   Then, open a terminal and type:
    make -f Makefile print_hello
Simplest Makefile example - explanation
   When invoked, the program 'make' looks for a file in the 
     current directory called 'Makefile'
   When we type 'make print_hello', it executes the rule 
     (target) called 'print_hello' in the makefile
   It then shows the commands executed and their output
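A minimal sketch of the steps above, assuming GNU make and a POSIX shell are available. The scratch directory and the use of printf to write the Makefile are choices made for this illustration (printf turns \t into the tab that make requires).

```shell
# Work in a scratch directory so no existing files are touched.
cd "$(mktemp -d)"

# Write the Makefile: one target, one command indented with a real tab (\t).
printf 'print_hello:\n\techo "Hello, world!"\n' > Makefile

# make first shows the command it runs, then the command's output.
out=$(make print_hello)
echo "$out"
```

The first line printed is the command itself, the second is its output.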
Tip1: the 'Makefile' file
   The '-f' option allows you to define the file which 
     contains the instructions for make
   If you omit this option, GNU make will look for a file 
       called 'GNUmakefile', 'makefile' or 'Makefile' (in this 
       order) in the current directory

      make -f Makefile all

      is equivalent to:

      make all
A slightly longer example
   You can add as many 
     commands as you like 
     to a rule
   For example, this 
      'print_hello' rule 
      contains 5 commands
   Note: ignore the '@' 
     prefix for now, it only 
     disables verbose mode 
     (explained later)
A more complex example
Make - advantages
   Make allows you to save shell commands along 
     with their parameters and re-execute them;
   It allows you to use command-line tools, which are 
       more flexible;
   Combined with revision control software, it 
     makes it possible to reproduce all the operations 
     applied to your data;
Second part



A closer look at make syntax (target and
               commands)
The target syntax
   Makefile syntax:
         <target>: (prerequisites)
            <commands associated with the rule>
The target syntax
   The target of a rule can be either a title for the task, or a file 
      name.
   Every time you call a make rule (example: 'make all'), the 
      program looks for a file with the same name as the target (e.g. 
      'all', 'clean', 'inputdata.txt', 'results.txt')
   The rule is executed only if that file doesn't exist (or, as we 
      will see later, if it is older than its prerequisites).
Filename as target names
   In this makefile, we have two rules: 'testfile.txt' and 'clean'
   When we call 'make testfile.txt', make checks if a file called
      'testfile.txt' already exists.
   The commands associated with the rule 'testfile.txt' are
      executed only if that file doesn't exist already
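A minimal sketch of such a makefile, assuming GNU make and a POSIX shell. The original figure is not available, so the commands inside the two rules are invented for this illustration.

```shell
cd "$(mktemp -d)"

# Two rules: one named after a file, one named after a task.
{
  printf 'testfile.txt:\n'
  printf '\techo "test contents" > testfile.txt\n'
  printf 'clean:\n'
  printf '\trm -f testfile.txt\n'
} > Makefile

first=$(make testfile.txt 2>&1)    # the file is missing: the command runs
second=$(make testfile.txt 2>&1)   # the file exists: make reports it is up to date
echo "$first"
echo "$second"

make clean                         # no file called 'clean' exists, so this always runs
```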
Multiple target definition
   A target can also be a 
      list of files
   You can retrieve the 
      matched target with 
      the special variable 
      $@
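A minimal sketch of a rule with a list of targets, assuming GNU make and a POSIX shell. The file names are invented for this illustration.

```shell
cd "$(mktemp -d)"

# One rule, three possible targets; $@ is whichever one make is building.
{
  printf 'a.txt b.txt c.txt:\n'
  printf '\techo "this file is $@" > $@\n'
} > Makefile

make b.txt      # only b.txt is requested, so only b.txt is created
cat b.txt
```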
Special characters
   The % character can be used as a wildcard
   For example, a rule with the target:
     %.txt:
         ....
     would match any target whose name ends with '.txt':
          'make 1.txt', 'make 2.txt', etc.
   We will be able to retrieve the part matched by % 
     (the 'stem') with '$*'
Special character % / creating more than one file at a time
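A minimal sketch of a % rule, assuming GNU make and a POSIX shell. The original figure is not available, so the rule body is invented for this illustration. Note that inside a printf format string a literal % is written %%.

```shell
cd "$(mktemp -d)"

# '%.txt' matches any target ending in .txt; $* holds the part matched by %.
{
  printf '%%.txt:\n'
  printf '\techo "stem is $*" > $@\n'
} > Makefile

make 1.txt 2.txt 3.txt   # three files created with a single rule
cat 2.txt
```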
Makefile – cluster support
   Note that in the previous 
      example we created three 
      files at the same time, by 
      executing the command 
      'touch' three times
   If we use the '-j' option when 
       invoking make, the three 
       processes will be launched 
       in parallel
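A minimal sketch of the '-j' option, assuming GNU make and a POSIX shell. The 'all' rule and the one-second sleep are invented for this illustration, to make the effect of running in parallel noticeable.

```shell
cd "$(mktemp -d)"

# 'all' needs three files; each takes about one second to create.
{
  printf 'all: 1.txt 2.txt 3.txt\n'
  printf '%%.txt:\n'
  printf '\tsleep 1; touch $@\n'
} > Makefile

# Allow up to 3 jobs at once: the three files are built side by side
# instead of one after the other.
make -j 3 all
ls
```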
The commands syntax
   Makefile syntax:
         <target>: (prerequisites)
            <commands associated with the rule>
Disabling verbose mode
    You can disable verbose mode for a line by 
      adding '@' at its beginning: make then runs the 
      command without printing it first
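A minimal sketch comparing a command with and without '@', assuming GNU make and a POSIX shell. The rule names 'loud' and 'quiet' are invented for this illustration.

```shell
cd "$(mktemp -d)"

{
  printf 'loud:\n'
  printf '\techo "hello"\n'
  printf 'quiet:\n'
  printf '\t@echo "hello"\n'
} > Makefile

loud=$(make loud)     # prints the command, then its output
quiet=$(make quiet)   # prints only the output
printf 'loud prints:\n%s\n' "$loud"
printf 'quiet prints:\n%s\n' "$quiet"
```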
Skipping errors
   The modifier '-' tells make to ignore errors returned 
     by a command
   Example: 
          'mkdir /var' will cause an error (the '/var' directory 
             already exists) and make GNU make stop
          '-mkdir /var' will cause an error anyway, but 
             GNU make will ignore it and carry on
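A minimal sketch of the '-' modifier, assuming GNU make and a POSIX shell. To avoid touching /var, it uses a scratch directory called 'existing_dir'; that name and the rule names are invented for this illustration.

```shell
cd "$(mktemp -d)"
mkdir existing_dir    # mkdir on this directory will now fail

{
  printf 'strict:\n'
  printf '\tmkdir existing_dir\n'
  printf 'lenient:\n'
  printf '\t-mkdir existing_dir\n'
} > Makefile

# Record whether each invocation of make succeeded.
make strict  >/dev/null 2>&1 && strict=ok  || strict=failed
make lenient >/dev/null 2>&1 && lenient=ok || lenient=failed
echo "strict: $strict, lenient: $lenient"   # prints "strict: failed, lenient: ok"
```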
Moving through directories
   A big issue with make is that every line is executed as a 
      different shell process.
   So, this:

      lsvar:
         cd /var
         ls 
    Won't work (it will list only the files in the current 
     directory, not /var)
   The solution is to put everything in a single process:
    lsvar:
          (cd /var; ls)
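A minimal sketch of the problem and the fix, assuming GNU make and a POSIX shell. It uses a scratch directory 'sub' instead of /var; the directory, file and rule names are invented for this illustration.

```shell
cd "$(mktemp -d)"
mkdir sub
touch sub/inside.txt

{
  printf 'separate:\n'
  printf '\t@cd sub\n'
  printf '\t@ls\n'
  printf 'together:\n'
  printf '\t@cd sub && ls\n'
} > Makefile

separate=$(make separate)   # 'cd' is forgotten: ls runs in the current directory
together=$(make together)   # same shell process: ls runs inside sub
echo "separate lines list: $separate"
echo "single line lists: $together"
```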
Third part




Prerequisites and conditional execution
The commands syntax
   Makefile syntax:
         <target>: (prerequisites)
            <commands associated with the rule>
   We will now look at the 'prerequisites' part of a make 
     rule, which we skipped before
Real Makefile-rule syntax
   Complete syntax for a Makefile rule:
          <target>: <list of prerequisites>
             <commands associated to the rule>


   Example:
          result1.txt: data1.txt data2.txt
            cat data1.txt data2.txt > result1.txt
            @echo 'result1.txt has been calculated'


   Prerequisites are files (or rules) that need to exist already in 
      order to create the target file.
   If 'data1.txt' and 'data2.txt' don't exist, the rule 'result1.txt' will 
       exit with an error (no rule to create them)
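A minimal sketch of this rule in action, assuming GNU make and a POSIX shell. The contents of the two data files are invented for this illustration.

```shell
cd "$(mktemp -d)"

{
  printf 'result1.txt: data1.txt data2.txt\n'
  printf '\tcat data1.txt data2.txt > result1.txt\n'
} > Makefile

# The data files do not exist and no rule creates them: make stops with an error.
msg=$(make result1.txt 2>&1) || true
echo "$msg"

# Once the prerequisites exist, the rule runs.
echo "one" > data1.txt
echo "two" > data2.txt
make result1.txt
cat result1.txt
```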
Piping Makefile rules together
   You can pipe two Makefile rules together by 
     defining prerequisites
   The rule 'result1.txt' depends on the rule 'data1.txt', 
     which should be executed first
   Let's look at this example again: what happens if 
      we remove the file 'result1.txt' we just created?
   The second time we run the 'make result1.txt' 
      command, it is not necessary to create data1.txt, 
      because that file already exists
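A minimal sketch of two rules piped together, assuming GNU make and a POSIX shell. The original figure is not available, so the commands inside the rules are invented for this illustration.

```shell
cd "$(mktemp -d)"

{
  printf 'data1.txt:\n'
  printf '\techo "raw data" > data1.txt\n'
  printf 'result1.txt: data1.txt\n'
  printf '\ttr a-z A-Z < data1.txt > result1.txt\n'
} > Makefile

first=$(make result1.txt)    # runs both rules: data1.txt first, then result1.txt
rm result1.txt
second=$(make result1.txt)   # data1.txt still exists: only result1.txt is rebuilt
echo "$first"
echo "---"
echo "$second"
```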
Other pipe example
      all: result1.txt result2.txt

      result1.txt: data1.txt calculate_result.py
        python calculate_result.py --input data1.txt

      result2.txt: data2.txt
        cut -f 1,3 data2.txt > result2.txt

   'make all' will calculate result1.txt and result2.txt if 
     they don't exist already (or if they are older than 
     their prerequisites)
Conditional execution by modification date
   We have seen how make can be used to create a 
     file, if it doesn't exist.

      file.txt:
         # if file.txt doesn't exist, then create it:
         echo 'contents of file.txt' > file.txt

   We can do better: create or update a file only if it is 
     missing or older than its prerequisites
Conditional execution by modification date
   Let's have a better look at this example:

      result1.txt: data1.txt calculate_result.py
        python calculate_result.py --input data1.txt

   A great feature of make is that it executes a rule not 
     only if the target file doesn't exist, but also if it 
     has a 'last modification date' earlier than any of its 
     prerequisites
Conditional execution by modification date
      result1.txt: data1.txt
          @sed 's/b/B/i' data1.txt > result1.txt
          @echo 'result1.txt has been calculated'

   In this example, result1.txt will be recalculated 
      every time 'data1.txt' is modified

      $: touch data1.txt calculate_result.py
      $: make result1.txt
      result1.txt has been calculated
      $: make result1.txt
      make: 'result1.txt' is up to date.
      $: touch data1.txt
      $: make result1.txt
      result1.txt has been calculated
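A minimal sketch that replays this session, assuming GNU make and a POSIX shell. It swaps sed for tr to stay portable, and the one-second sleep is added only to make sure the timestamps differ.

```shell
cd "$(mktemp -d)"

{
  printf 'result1.txt: data1.txt\n'
  printf '\t@tr a-z A-Z < data1.txt > result1.txt\n'
  printf '\t@echo "result1.txt has been calculated"\n'
} > Makefile

echo "abc" > data1.txt
run1=$(make result1.txt 2>&1)   # target missing: the rule runs
run2=$(make result1.txt 2>&1)   # target not older than data1.txt: nothing to do
sleep 1                         # make sure the next timestamp is clearly later
touch data1.txt
run3=$(make result1.txt 2>&1)   # data1.txt is now newer: the rule runs again
printf '%s\n' "$run1" "$run2" "$run3"
```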
Conditional execution - applications
   This 'conditional execution by modification date 
     comparison' feature of make is very useful
   Let's say you discover an error in one of your input 
     data files: you will be able to repeat the analysis by 
     executing only the operations needed
   You can also use it to re-calculate results every time 
     you modify a script:

      result.txt: scripts/calculate_result.py
        python scripts/calculate_result.py > result.txt
Another example
Fourth part




Variables and functions
Variables and functions
   You may have already noticed that Make's syntax is 
     really old :)
   In fact, it is a language about 40 years old
   It uses special variables like $@, $^, and it can be 
       worse than perl!!! 
   (perl developers, please don't get mad at me :-) )
Variables
      Variables are declared with a '=' and by convention 
        are upper case.
      They are referenced by wrapping their name in '$()'
      [Figure: a Makefile in which WORKING_DIR is a variable]
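A minimal sketch of a variable in use, assuming GNU make and a POSIX shell. The original figure is not available, so the value of WORKING_DIR and the rule 'show_dir' are invented for this illustration.

```shell
cd "$(mktemp -d)"

# WORKING_DIR is declared once with '=' and referenced as $(WORKING_DIR).
{
  printf 'WORKING_DIR = results\n'
  printf 'show_dir:\n'
  printf '\t@mkdir -p $(WORKING_DIR)\n'
  printf '\t@echo "working in $(WORKING_DIR)"\n'
} > Makefile

out=$(make show_dir)
echo "$out"    # prints "working in results"
```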
Special variables - $@
   Make uses some custom variables, with a syntax 
     similar to perl
   '$@' always corresponds to the target name:

     $: cat >Makefile

     %.txt:
        echo $@

     $: make filename.txt
     echo filename.txt
     filename.txt

   Here $@ took the value of 'filename.txt'
Other special variables


$@           The rule's target
$<           The rule's first prerequisite
$?           All the rule's out-of-date prerequisites
$^           All prerequisites
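A minimal sketch that prints these four variables, assuming GNU make and a POSIX shell. The file names are invented for this illustration; the sleep only makes sure the timestamps differ.

```shell
cd "$(mktemp -d)"
touch a.txt b.txt

# Each line of report.txt records the value of one special variable.
{
  printf 'report.txt: a.txt b.txt\n'
  printf '\t@echo "target: $@"         > $@\n'
  printf '\t@echo "first prereq: $<"  >> $@\n'
  printf '\t@echo "all prereqs: $^"   >> $@\n'
  printf '\t@echo "out of date: $?"   >> $@\n'
} > Makefile

make report.txt
cat report.txt              # target missing, so every prerequisite counts as out of date

sleep 1; touch b.txt        # now only b.txt is newer than report.txt
make report.txt
tail -n 1 report.txt        # prints "out of date: b.txt"
```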
Functions
   Usually you don't want to declare functions in 
     make, but there are some built-in utilities that can 
     be useful 
   Most frequently used functions:
       $(addprefix <prefix>, list)
         → add a prefix to each item of a space-separated list 

      example:
       FILES = file1 file2 file3
       $(addprefix /home/user/data/, $(FILES))
         → /home/user/data/file1 /home/user/data/file2 /home/user/data/file3
   $(addsuffix) works similarly
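A minimal sketch of both functions, assuming GNU make and a POSIX shell. The variable names PATHS and NAMES and the rule 'show' are invented for this illustration.

```shell
cd "$(mktemp -d)"

{
  printf 'FILES = file1 file2 file3\n'
  printf 'PATHS = $(addprefix /home/user/data/,$(FILES))\n'
  printf 'NAMES = $(addsuffix .txt,$(FILES))\n'
  printf 'show:\n'
  printf '\t@echo $(PATHS)\n'
  printf '\t@echo $(NAMES)\n'
} > Makefile

out=$(make show)
echo "$out"
```

The first line shows every item with the prefix added, the second every item with the suffix added.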
Full makefile example
INPUTFILES = lower_DAF lower_maf upper_maf lower_daf upper_daf
RESULTSDIR = ./results
RESULTFILES = $(addprefix $(RESULTSDIR)/,$(addsuffix _filtered.txt,$(INPUTFILES)))

help:
        @echo 'type "make all" to calculate results'

all: $(RESULTFILES)

$(RESULTSDIR)/%_filtered.txt: data/%.txt src/filter_genes.py
        python src/filter_genes.py --genes data/Genes.txt --window $< --output $@

   It looks very complicated, but in the end
    you always use the same Makefile structure
Fifth part




Testing, discussion, other examples and
              alternatives
Testing a makefile
   make -n: only shows the commands to be executed, 
      without running them
   You can pass variables to make:
     $: make say_hello MYNAME="Giovanni"
     hello, Giovanni
   Strongly suggested: use Revision Control 
      Software with support for branching (git, hg, 
      bazaar) and create a branch for testing
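A minimal sketch of both testing aids, assuming GNU make and a POSIX shell. The slide does not show the 'say_hello' rule, so its body, the default value of MYNAME and the rule 'make_file' are invented for this illustration.

```shell
cd "$(mktemp -d)"

{
  printf 'MYNAME = world\n'
  printf 'say_hello:\n'
  printf '\t@echo "hello, $(MYNAME)"\n'
  printf 'make_file:\n'
  printf '\ttouch created.txt\n'
} > Makefile

# A variable given on the command line overrides the one set in the Makefile.
greeting=$(make say_hello MYNAME="Giovanni")
echo "$greeting"     # prints "hello, Giovanni"

# -n prints the commands without running them: created.txt is not created.
dry=$(make -n make_file)
echo "$dry"          # prints "touch created.txt"
```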
Another complex Makefile example
         # make masked sequence
         myseq.m: myseq
           rmask myseq > myseq.m

         # run blast on masked seq
         blastout: mydb.psq myseq.m
            blastx mydb myseq.m > blastout
            echo "ran blast!"

         # index blastable db
         mydb.psq: mydb
           formatdb -p T mydb

         # rules follow this pattern:
         target: subtarget1, ..., subtargetN
             shell command 1
             shell command 2...

   our starting point is the file myseq, the end point 
      is the blast results blastout
   we first want to mask out any repeats using rmask to 
      create myseq.m
   we then blastx myseq.m against a protein db called 
      mydb
   before blastx is run the protein db must be 
      indexed using formatdb
(slide taken from biomake web site)
The "make" command
   # run blast on masked seq
   blastout: mydb.psq myseq.m
      blastx mydb myseq.m > blastout
      echo "ran blast!"

   # index blastable db
   mydb.psq: mydb
     formatdb -p T mydb

   # make masked sequence
   myseq.m: myseq
     rmask myseq > myseq.m

   % make blastout
   formatdb -p T mydb
   rmask myseq > myseq.m
   blastx mydb myseq.m > blastout

   % make blastout
   make: 'blastout' is up to date

   % cat newseqs >> mydb
   % make blastout
   formatdb -p T mydb
   blastx mydb myseq.m > blastout

   make uses unix file modification timestamps when 
      checking dependencies
   if a subtarget is more recent than the goal target, 
      then the action is re-executed
(slide taken from biomake web site)
BioMake and alternatives
   BioMake is an alternative to make, designed for 
     use in bioinformatics
   Developed to annotate the Drosophila 
     melanogaster genome (Berkeley university)
   Cleaner syntax, derived from prolog
   Separates the rule's name from the name of the 
      target files
A BioMake example
       formatdb(DB)
          req: DB
          run: formatdb DB
          comment: prepares blastdb for blasting (wublast)
       rmask(Seq)
          flat: masked_seqs/Seq.masked
          req: Seq
          srun: RepeatMasker -lib $(LIB) Seq
          comment: masks out repeats from input sequence
       mblastx(Seq,DB)
          flat: blast_results/Seq.DB.blastx
          req: formatdb(DB) rmask(Seq)
          srun: blastx -filter SEG+XNU DB rmask(Seq)
          comment: this target is for the results of running blastx on
                       a masked input genomic sequence (wublast)
(slide taken from biomake web site)
Other alternatives
   There are many other alternatives to make:
           BioMake (prolog?)
           o/q/dist/etc.. make
           Ant (Java)
           Scons (python)
           Paver (python)
           Waf (python)
   This list is biased because I am a python programmer :)
   These tools are more oriented to software development
Conclusions
   Make is very basic for bioinformatics
   It is useful for the simpler tasks:
            Logging the operations made to your data files
            Working with clusters
            Avoiding re-calculations
            Applying a pipeline to different datasets
   It is installed in almost any unix system and has a standard 
       syntax (interchangeable, reproducible)
   Study it and understand its logic. Use it in the most basic way, 
      without worrying about prerequisites and special variables. 
      Later you can look for easier tools (biomake, rake, taverna, 
Suggested readings
   Software Carpentry for bioinformatics 
         http://swc.scipy.org/lec/build.html
   A Makefile is a pipeline
        http://www.nodalpoint.org/2007/03/18/a_pipeline_is_a_makefil
   BioMake and SKAM 
        http://skam.sourceforge.net/
   BioWiki Make Manifesto 
        http://biowiki.org/MakefileManifesto
   Discussion on the BIP mailing list
         http://www.mail-archive.com/biology-in-python@lists.idyll.org
   GNU Make manual by R. Stallman and R. McGrath
          http://theory.uwinnipeg.ca/gnu/make/make_toc.html 
