Step 17: "DIS" output validation
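The A_vs_BC_substitutions.good file is easy to sort, for example by its third
column, using the UNIX sort command:
$ sort -k 3,3nr A_vs_BC_substitutions.good > A_vs_BC_substitutions.good.sorted
where -k 3,3 selects the third column as the sort key, n requests numeric comparison,
and r reverses the order (descending). Older versions of sort accepted the form
"sort -n +2 -r" for the same purpose, but the +2 column notation is obsolete and
current GNU sort no longer interprets it as a sort key.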
The A_vs_BC_substitutions.good file
contains data for more than 800 high priority SNP candidates, where the polymorphism occurs
two or more times at the same position of the consensus. About 2000 candidates are of the
lowest priority. Analysis of all output files shows that our method revealed
more than 1000 SNP/INDEL candidates of high priority (high level of confidence) in the example
tomato assembly.
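The sort by third column described above can also be reproduced in Python, the language
of the other scripts in this pipeline. The following is only a minimal sketch: it assumes
that the columns are separated by whitespace and that the chosen column is numeric in every
non-blank line. The function names sort_by_column and sort_file are hypothetical and are
not part of the distributed scripts.

```python
def sort_by_column(lines, column=2, descending=True):
    """Sort whitespace-delimited rows numerically by one column.

    `column` is a 0-based index, so the default of 2 means the third column.
    Assumption: every non-blank row has a numeric value in that column;
    a non-numeric value raises ValueError instead of being treated as 0.
    """
    rows = [line for line in lines if line.strip()]  # drop blank lines
    return sorted(rows,
                  key=lambda line: float(line.split()[column]),
                  reverse=descending)


def sort_file(in_path, out_path, column=2, descending=True):
    """Read in_path, sort its rows by `column`, and write them to out_path."""
    with open(in_path) as src:
        rows = sort_by_column(src, column, descending)
    with open(out_path, "w") as dst:
        dst.writelines(rows)
```

For example, sort_file("A_vs_BC_substitutions.good", "A_vs_BC_substitutions.good.sorted")
would write the rows ordered by descending third column, under the assumptions stated above.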
The assembly and the polymorphic sites can be viewed and validated with
Py_ContigViewer.
The viewer is written in Python and should work on any computer platform with a Python
interpreter. Its input files are the "Alignment Files" generated by the
Python_CAP3_ContigExtractor_Feb_27_2004.py script.
Several examples are available on the "step 14" web page.
Note that the first two contigs, numbered 2491 and 2942, belong to the "high priority" group.
The third contig, 415, appears to have been assembled with paralogs. The last contig, number 9,
belongs to the low priority group.