


  • CGPDB
    UC Davis


    Step 17: "DIS" output validation


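    The A_vs_BC_substitutions.good file is easy to sort, for example by its third column, with the UNIX sort command:


    $ sort -k3,3 -n -r A_vs_BC_substitutions.good > A_vs_BC_substitutions.good.sorted


    where -k3,3 selects the third column as the sort key, -n requests numeric comparison, and -r reverses the order (descending). Older versions of sort accepted the obsolete form +2 for the third column (sort -n +2 -r); current implementations may reject it or treat +2 as a file name.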
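
    The same ordering can be sketched in Python, which may be convenient where the UNIX sort command is not available. This is a minimal sketch, not part of the original pipeline: the function name sort_by_column is hypothetical, the file is assumed to be whitespace-delimited with a numeric third column, and the demo rows are made up for illustration. Rows with equal values may come out in a different order than with sort.

```python
def sort_by_column(lines, column=2, descending=True):
    """Sort whitespace-delimited rows numerically by a zero-based column.

    Assumption: the column holds numbers. Rows where it is missing or
    non-numeric are treated as zero, similar to what sort -n does.
    """
    def key(line):
        fields = line.split()
        try:
            return float(fields[column])
        except (IndexError, ValueError):
            return 0.0

    return sorted(lines, key=key, reverse=descending)


if __name__ == "__main__":
    # Hypothetical rows; the real layout of the .good file may differ.
    demo = ["Contig1 120 2", "Contig2 45 7", "Contig3 300 1"]
    for row in sort_by_column(demo):
        print(row)
```

    To apply it to the real file, read the lines of A_vs_BC_substitutions.good into a list and write the returned list to the output file.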

    This file contains data for more than 800 high-priority SNP candidates, in which the polymorphism occurs two or more times at the same position of the consensus. About 2000 further candidates are of the lowest priority. Taken together, the output files show that the method revealed more than 1000 high-priority (high-confidence) SNP/INDEL candidates in the example tomato assembly.
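
    The split into priority groups can be illustrated with a short Python sketch. It rests on an assumption that the text does not confirm: that the third column of the file holds the number of times the polymorphism is seen at a consensus position. The function name split_by_priority and the threshold parameter are hypothetical.

```python
def split_by_priority(lines, column=2, threshold=2):
    """Split rows into (high, low) priority lists.

    Assumption: the zero-based column holds an integer occurrence count,
    and a count of at least `threshold` marks a high-priority candidate.
    """
    high, low = [], []
    for line in lines:
        fields = line.split()
        if len(fields) <= column:
            continue  # skip blank or short rows
        try:
            count = int(fields[column])
        except ValueError:
            continue  # skip header or malformed rows
        if count >= threshold:
            high.append(line)
        else:
            low.append(line)
    return high, low
```

    Calling len() on each returned list gives the number of candidates in each group, which can be compared with the figures quoted above.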

    The assembly and the polymorphic sites can be viewed and validated with Py_ContigViewer. The viewer is written in Python and should work on any computer platform that has a Python interpreter. Its input files are the "Alignment Files" generated by the Python_CAP3_ContigExtractor_Feb_27_2004.py script. Several examples are available on the "step 14" web page. Note that the first two contigs, numbered 2491 and 2942, belong to the "high priority" group. The third contig, 415, appears to have been assembled with paralogs. The last contig, number 9, belongs to the low priority group.