CGPDB
UC Davis
Step 6: Extraction of sequences from fasta file using fastacmd
To extract a subset of sequences from the Lycopersicon_esculentum.fasta
file, the fastacmd program (included with the NCBI BLAST
standalone distribution) was used:
$ fastacmd -d Lycopersicon_esculentum.fasta \
    -i Lycopersicon_esculentum.with_hits.IDs -o Lycopersicon_esculentum.with_hits.fasta
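As a quick sanity check after the extraction, one could compare the number of requested IDs with the number of records written. The sketch below is only an illustration: it uses small made-up files in a scratch directory in place of the real ones, and it assumes the ID list holds one ID per line. The file names and IDs (EST_A, EST_B) are placeholders.

```shell
# Work in a scratch directory so no real data is touched.
tmpdir=$(mktemp -d)
ids="$tmpdir/demo.with_hits.IDs"        # hypothetical ID list, one ID per line
fasta="$tmpdir/demo.with_hits.fasta"    # hypothetical extracted FASTA file

printf 'EST_A\nEST_B\n' > "$ids"
printf '>lcl|EST_A\nACGT\n>lcl|EST_B\nGGCC\n' > "$fasta"

# Count non-empty ID lines and FASTA header lines.
n_ids=$(grep -c . "$ids")
n_seqs=$(grep -c '^>' "$fasta")

if [ "$n_ids" -eq "$n_seqs" ]; then
    echo "OK: $n_seqs sequences extracted for $n_ids IDs"
else
    echo "Mismatch: $n_ids IDs but $n_seqs sequences" >&2
fi
```

On the real data, the same two counts would be taken from Lycopersicon_esculentum.with_hits.IDs and Lycopersicon_esculentum.with_hits.fasta.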
Then the FASTA headers in the Lycopersicon_esculentum.with_hits.fasta file
were modified with a Perl find/replace regular expression that removes the
leading lcl| prefix from each header line:
$ perl -p -i -e 's/^\>lcl\|/\>/' Lycopersicon_esculentum.with_hits.fasta
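To see what this substitution does, it can be tried on a throwaway file first. The sketch below assumes headers of the form ">lcl|ID description"; the file name, IDs, and descriptions are made up for illustration.

```shell
# Build a tiny demo file in a scratch directory (contents are hypothetical).
tmpdir=$(mktemp -d)
fasta="$tmpdir/demo.with_hits.fasta"
printf '>lcl|EST_A example one\nACGT\n>lcl|EST_B example two\nGGCC\n' > "$fasta"

# Same in-place substitution as in the command above.
perl -p -i -e 's/^\>lcl\|/\>/' "$fasta"

# Count header lines that still carry the lcl| prefix.
# grep -c exits non-zero when the count is 0, hence the "|| true".
remaining=$(grep -c '^>lcl|' "$fasta" || true)
echo "headers still prefixed: $remaining"

# Show the rewritten headers.
grep '^>' "$fasta"
```

Because the pattern is anchored with ^, only a prefix at the very start of a line is removed; sequence lines are left untouched.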
So far, we have reduced the number of ESTs in the Lycopersicon esculentum
set from 150193 to 32887.
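The size of this reduction can be worked out directly from the two counts quoted above. The following is a minimal sketch; the variable names are arbitrary, and the commented grep line is one assumed way of obtaining such counts from a FASTA file (one header line per sequence).

```shell
before=150193   # ESTs in the original set (from the text)
after=32887     # ESTs remaining at this point (from the text)

# A count like these could be obtained by counting header lines, e.g.:
#   grep -c '^>' Lycopersicon_esculentum.with_hits.fasta

removed=$((before - after))
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
pct=$(LC_ALL=C awk -v a="$after" -v b="$before" 'BEGIN { printf "%.1f", 100 * a / b }')

echo "removed $removed ESTs; kept $after of $before ($pct%)"
```

In other words, roughly one in five of the original ESTs remains in the working set.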