

  • CGPDB
    UC Davis


    Step 6: Extracting sequences from a FASTA file using fastacmd

    To extract the subset of sequences listed in Lycopersicon_esculentum.with_hits.IDs from the Lycopersicon_esculentum.fasta file, the fastacmd program (included with the NCBI BLAST standalone distribution) was used:

    $ fastacmd -d Lycopersicon_esculentum.fasta -i Lycopersicon_esculentum.with_hits.IDs -o Lycopersicon_esculentum.with_hits.fasta

    The FASTA headers of Lycopersicon_esculentum.with_hits.fasta were then modified in place with a Perl find-and-replace regular expression that strips the leading "lcl|" prefix:

    $ perl -p -i -e 's/^\>lcl\|/\>/' Lycopersicon_esculentum.with_hits.fasta



    So far, we have reduced the number of ESTs in the Lycopersicon esculentum set from 150193 to 32887.
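
    One way to confirm such counts is to count the header lines (those starting with ">") in each FASTA file. The following is a minimal sketch; count_seqs is a hypothetical helper name, and the two temporary files hold made-up records in place of the real input and output files.

```shell
# count_seqs (hypothetical helper): number of FASTA records = number of header lines
count_seqs() {
    grep -c '^>' "$1" || true
}

# Made-up stand-ins for the full set and the extracted subset
all=$(mktemp)
hits=$(mktemp)
printf '>a\nAC\n>b\nGT\n>c\nTT\n>d\nAA\n' > "$all"
printf '>a\nAC\n>c\nTT\n' > "$hits"

before=$(count_seqs "$all")
after=$(count_seqs "$hits")
echo "reduced from $before to $after sequences"
rm -f "$all" "$hits"
```

    Run against Lycopersicon_esculentum.fasta and Lycopersicon_esculentum.with_hits.fasta, the same helper would be expected to report the two numbers quoted above, provided the files match those used here.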