Python and Tcl/Tk scripts and tools to process and analyze DNA sequences and related data
GenBank2Fasta_UniExtractor_124.tcl -
GenBank to FASTA file converter; besides extracting sequences, this parser extracts additional useful information from the GenBank file and places it into the header lines of the FASTA file.
GenBank2Fasta_UniExtractor_126.tcl - current version, minor bug fixes.
seqs_processor_and_translator_bin_V124_AGCT.py -
DNA sequence processor and translator; it does translation in 6 frames in batch mode.
Brief description is here
seqs_processor_and_translator_bin_V126_AGCT.py - current version; it adds a new function: splitting sequences into multiple FASTA files.
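Translation in six frames means translating the three forward reading frames of a sequence plus the three frames of its reverse complement. The following is a minimal Python sketch of that idea and not the code of seqs_processor_and_translator_bin_V126_AGCT.py. The function names, the choice of the standard genetic code, and the 'X' placeholder for codons with non-ACGT characters are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper-cased)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; incomplete tail is dropped."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    # Assumption: codons with ambiguous bases (e.g. N) are reported as 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(seq):
    """Return a dict mapping frame names (+1..+3, -1..-3) to protein strings."""
    frames = {}
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(template[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translation("ATGGCCTAA").items():
        print(name, protein)
```

For batch mode, the same function could be applied to every record of a FASTA file in turn.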
tcl_blast_parser_123_V038.tcl -
NCBI BLAST parser.
Detailed description is here
tcl_blast_parser_123_V039.tcl - current version
tcl_blast_parser_123_V041.tcl - current version - to find common query overlap
SeqsExtractorFromBlastX_V124.py -
Extraction of ORFs (open reading frames) from a BLAST-X report.
BLAST EST sequences against a protein reference database and extract the EST fragments that correspond to the BLAST-X alignments.
SeqsExtractorFromBlastX_V126.py - current version (with no_hits counter).
SeqsExtractorFromTclBlast_V001.py -
Extraction of a sub-region from a BLAST report (BLAST-X) if the hit ID matches the query ID.
seqs_subgroup_extr_001.py - sequence subgroup extractor (1)
seqs_subgroup_extr_003.py - sequence subgroup extractor (3)
Both extract a subset of sequences from a FASTA file based on a list of gene IDs:
version (1) - full size sequence extraction
version (3) - extraction of defined fragment
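The two extraction modes above can be sketched in Python as follows. This is only an illustration and not the code of seqs_subgroup_extr_001.py or seqs_subgroup_extr_003.py. The function names, the use of the first header token as the sequence ID, the 1-based inclusive coordinates, and the naming of extracted fragments are all assumptions.

```python
def read_fasta(lines):
    """Yield (sequence ID, sequence) pairs from an iterable of FASTA lines."""
    seq_id, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            # Assumption: the ID is the first whitespace-delimited header token.
            fields = line[1:].split()
            seq_id, chunks = (fields[0] if fields else ""), []
        elif line and seq_id is not None:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def extract_subset(lines, wanted):
    """Yield selected records.

    wanted maps an ID to None (full-size sequence) or to a (start, end)
    tuple in 1-based inclusive coordinates (defined fragment).
    """
    for seq_id, seq in read_fasta(lines):
        if seq_id not in wanted:
            continue
        region = wanted[seq_id]
        if region is None:
            yield seq_id, seq
        else:
            start, end = region
            # Hypothetical naming scheme for the extracted fragment.
            yield f"{seq_id}_{start}_{end}", seq[start - 1:end]


if __name__ == "__main__":
    fasta = [">gene1 demo", "ACGTACGT", "AAAA", ">gene2", "GGGGCCCC"]
    for name, seq in extract_subset(fasta, {"gene1": None, "gene2": (3, 6)}):
        print(f">{name}\n{seq}")
```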
seqs_drobilka_003_mod.py - sequence splitter into overlapping fragments.
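Splitting a sequence into overlapping fragments amounts to sliding a fixed-size window along it with a step smaller than the window. A minimal sketch, assuming hypothetical parameters size and overlap (the actual options of seqs_drobilka_003_mod.py are not documented here), and assuming the last fragment may be shorter than the others:

```python
def split_overlapping(seq, size, overlap):
    """Split seq into fragments of length size that share overlap characters.

    The final fragment may be shorter than size (an assumption of this sketch).
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    fragments = []
    for start in range(0, len(seq), step):
        fragments.append(seq[start:start + size])
        if start + size >= len(seq):
            break  # this window already reaches the end of the sequence
    return fragments


if __name__ == "__main__":
    for fragment in split_overlapping("ACGTACGTAC", size=4, overlap=2):
        print(fragment)
```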
seqs_trimmer_2007_03_20.py -
EST sequence trimmer. It's weird; use it at your own risk.
seqs_processor_ultra_polyA_V009.py -
Sequence masking based on a BLAST-N search against the
Vector_M_PolyAAA.fasta vector database.
It's weird too; use it at your own risk.
redundancy_elimination_005.py -
redundancy elimination for sequences in FASTA file by Travis Kleeburg.
read more here
qsep_002M.py - quality scores extractor from Phred output and trimmed sequences
Scripts to process CAP3 alignments:
Python_CAP3_ContigExtractor_Uni_2007_03_19.py
Python_CAP3_MM_Finder_Uni_2007_03_19.py
Python_CAP3_MM_Finder_Uni_2007_03_24f.py - current experimental version
Python_CAP3_MM_Finder_Uni_2007_03_24h.py - current experimental version
Python_CAP3_contig_poly_DIS_Uni_2007_03_19.py
Python_CAP3_ClipInfoExtractor_Uni_2007_03_19.py
Detailed description is here
Manipulation of CAP3 derivative files:
getcontig.py -
post-processing of the so-called CAP3 Info file produced by the Python_CAP3_ContigExtractor_Uni_2007_03_19.py script
countContig.py -
estimation of CAP3 contig complexity based on the CAP3 Info file produced by the Python_CAP3_ContigExtractor_Uni_2007_03_19.py script
read more here
SequenceTrimmer.py - trims low-quality regions from a CAP3 alignment
detailed description is here
Scripts for Genetic Maps
addDuplMarker.py - adds duplicated markers to a non-redundant map
Instructions are here
MadMapper - current versions:
Python_MadMapper_V248_RECBIT_012NR.py - clustering
Python_MadMapper_V248_XDELTA_117.py - map construction
Python_MadMapper_V248_XDELTA_119.py - map construction (current version; variable column ID with pairwise data)
py_matrix_2D_V248_RECBIT.py - map visualization
MadMapper details here
MadMapper clustering based on numerical data
Python_UniCluster_V011.py - really 'beta' ...
Scripts to manipulate tab-delimited tables
tableRotation_2007_03_21.py
tableSort_2007_03_21.py
Read more here
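As a rough illustration of these two table operations, the sketch below transposes a tab-delimited table and sorts its rows by one column. It assumes that "rotation" means swapping rows and columns, that sorting is a plain string comparison, and that every row contains the sort column. The function names and parameters are hypothetical and are not taken from tableRotation_2007_03_21.py or tableSort_2007_03_21.py.

```python
from itertools import zip_longest


def parse_table(text):
    """Split tab-delimited text into a list of rows, skipping empty lines."""
    return [line.split("\t") for line in text.splitlines() if line]


def rotate_table(text):
    """Transpose the table; short rows are padded with empty cells."""
    columns = zip_longest(*parse_table(text), fillvalue="")
    return "\n".join("\t".join(column) for column in columns)


def sort_table(text, column=0, header=True):
    """Sort rows by the given column (as strings), keeping the header on top."""
    rows = parse_table(text)
    head, body = (rows[:1], rows[1:]) if header else ([], rows)
    body.sort(key=lambda row: row[column])
    return "\n".join("\t".join(row) for row in head + body)


if __name__ == "__main__":
    table = "id\tx\ty\nb\t1\t2\na\t3\t4"
    print(rotate_table(table))
    print(sort_table(table))
```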
Pixelirator - graphical data display for tab-delimited tables
Scripts for Affymetrix Chip design
seqs_processor_and_translator_bin_V027_AGCT_Affy_V05.py - to generate Affy submission
seqs_processor_and_translator_bin_V027_AGCT_N2A.py - to convert 'N' to 'A' in fasta file
AffyProbeSetSorter-006.py
TkLife_Search_07M_Affy_05_off1_100L_ContigViewerTest.tcl
TkLife_Search_12M_AffySuper_25_off1_300L_025_035_25M.tcl
z-xlog-run-affy-chip.txt
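The 'N' to 'A' conversion listed above for seqs_processor_and_translator_bin_V027_AGCT_N2A.py can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, and leaving header lines untouched and also converting lower-case 'n' are assumptions rather than documented behavior of that script.

```python
def n_to_a(lines):
    """Yield FASTA lines with N replaced by A in sequence lines only."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Header lines are passed through unchanged (assumption).
            yield line
        else:
            yield line.replace("N", "A").replace("n", "a")


if __name__ == "__main__":
    for out in n_to_a([">seqN", "ACGNNT"]):
        print(out)
```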
email: akozik@atgc.org
last modified: May 14 2007