User:Wayne Decatur/Sequence analysis tools
From Proteopedia
Revision as of 22:00, 6 November 2018
Have not Categorized Yet
- Plasmapper
- Online Schematic plasmid drawing tool
- Online restriction mapper
- NEBcutter
- old web cutter
- Sequence Manipulation Suite: Restriction Map - has at the left side links to other tools they have
- Biotools at UMASS MED (formerly included EMBOSS)
- `cons` alignment consensus program and many others at EMBOSS explorer website
- Links to many EMBOSS portals, servers and mirrors under 'Servers'
- MUSCLE: MUltiple Sequence Comparison by Log-Expectation
- Archaeopteryx for the visualization of annotated phylogenetic trees.
- Netprimer
- Genomicus: Genomes in Evolution - "genome browser that enables users to navigate in genomes in several dimensions: linearly along chromosome axes, transversaly across different species, and chronologically along evolutionary time."
- SeqTrace- "is an application for viewing and processing DNA sequencing chromatograms (trace files). SeqTrace makes it easy to quickly generate high-quality finished sequences from a large number of trace files. SeqTrace can automatically identify, align, and compute [contig] consensus sequences from matching forward and reverse traces, filter low-quality base calls, and perform end trimming of finished sequences. The finished DNA sequences can then be exported to common sequence file formats, such as FASTA. " Written in Python.
- CAP3 Sequence Assembly Program - online web server for making contigs from DNA sequences. "form allows you to assemble a set of contiguous sequences (contigs) with the CAP3 program."
- Nucleobytes - DNA editor and 4peaks sequence chromatogram viewer along with other Mac software
- PaxDb: Protein Abundance Across Organisms
- PrePPI: database of predicted and experimentally determined protein-protein interactions (PPIs) for yeast and human.
- T-profiler - for scoring the activity of pre-defined groups of yeast genes using gene expression data **As of May 2016 it was not accepting uploads.**
- g:Profiler - for characterizing and manipulating gene lists of high-throughput genomics. Handles yeast and many other organisms.
- ProViz - a web-based visualization tool to investigate the functional and evolutionary features of protein sequences.
- ProDy Project - "ProDy is a free and open-source Python package for protein structural dynamics analysis". Looks like it does protein sequence analysis too and working with PDB files.
BLAST+
- Blast-binder - Launchable Jupyter environment for running command line-based BLAST via Binder. That page also links to the main BLAST resources. The launched notebooks illustrate ways to easily work with the output in Python.
Circos
- Circos - binderized Circos, so it is available in a browser: one click launches a Jupyter environment for Circos via Binder. That page also links to the main Circos resources. The launched notebooks illustrate ways to easily work with the output in Python.
Converters
- ALTER (ALignment Transformation EnviRonment) - complex interface but offers lots of options for output. I used it as part of my workflow to get closer to special NEXUS format (or intermediate) for performing maximum likelihood phylogenetic analysis of large sets of sequences.
- Sequence conversion Provided by bugaco.com - a lot of conversion choices with easy interface. When I had interleaved clustal format it converted nicely to a straight fasta listing for the sequence for every organism.
- Reformat utility of Max Planck Institute for Developmental Biology Bioinformatics Toolkit converts sequences or multiple sequence alignments to various forms.
- Format Converter - converts nucleotide and protein sequences in various formats to a lot of other formats.
- Three to One converts three letter amino acid sequence translations to single letter translations.
- One to Three converts single letter amino acid sequence translations to three letter translations.
- ConvertSeq folder at github - my own converter scripts.
- g:Convert - Gene ID Converter. Handles yeast and a very large list of other organisms.
- seqmagick-An imagemagick-like frontend to Biopython SeqIO, can convert from fasta to phylip, etc.
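The three-letter/one-letter conversions listed above are simple enough to sketch in plain Python. This is only a minimal illustration, not the code behind any of the listed web tools; the function names `three_to_one` and `one_to_three` are my own, and it assumes unseparated three-letter codes covering just the 20 standard amino acids.

```python
# Standard three-letter to one-letter amino acid codes (20 common residues only).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
# Reverse lookup built from the table above.
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def three_to_one(seq3):
    """Convert unseparated three-letter codes, e.g. 'MetAlaGly', to 'MAG'."""
    if len(seq3) % 3 != 0:
        raise ValueError("length must be a multiple of 3")
    # Normalize case so 'MET', 'met' and 'Met' are all accepted.
    codes = [seq3[i:i + 3].capitalize() for i in range(0, len(seq3), 3)]
    return "".join(THREE_TO_ONE[code] for code in codes)


def one_to_three(seq1):
    """Convert one-letter codes, e.g. 'MAG', to 'MetAlaGly'."""
    return "".join(ONE_TO_THREE[aa] for aa in seq1.upper())
```

Unknown codes (stop symbols, ambiguity codes such as `X`) raise a `KeyError` here rather than being guessed at.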
Random sequence generators
- http://users-birc.au.dk/biopv/php/fabox/random_sequence_generator.php
- http://www.bioinformatics.org/sms2/random_dna.html
- http://www.faculty.ucr.edu/~mmaduro/random.htm
- http://molbiol.ru/eng/scripts/01_16.html
- also see resources listed at the bottom of my gene expression page at my simulated_data repo.
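When a web form is not convenient, a random DNA sequence can also be generated locally. The following is a rough sketch using only the Python standard library; the function name `random_dna` and its `gc_content` and `seed` parameters are made up for this illustration and do not correspond to any of the sites listed above.

```python
import random


def random_dna(length, gc_content=0.5, seed=None):
    """Return a random DNA string of the given length.

    gc_content is the expected (not exact) fraction of G+C bases.
    Passing a seed makes the result reproducible.
    """
    if not 0.0 <= gc_content <= 1.0:
        raise ValueError("gc_content must be between 0 and 1")
    rng = random.Random(seed)
    at_weight = (1.0 - gc_content) / 2  # weight for each of A and T
    gc_weight = gc_content / 2          # weight for each of C and G
    bases = rng.choices(
        "ACGT",
        weights=[at_weight, gc_weight, gc_weight, at_weight],
        k=length,
    )
    return "".join(bases)
```

For example, `random_dna(100, gc_content=0.38, seed=1)` gives the same 100-base sequence on every run, which helps when the sequence feeds a test.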
Sequence shufflers
- http://emboss.sourceforge.net/ - shuffleseq from EMBOSS shuffles a set of sequences maintaining composition.
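The idea of shuffling while maintaining composition can be shown in a few lines of Python. This is a sketch of the general technique only, not EMBOSS shuffleseq's implementation; `shuffle_sequence` is a hypothetical helper name.

```python
import random


def shuffle_sequence(seq, seed=None):
    """Return seq with its residues in random order.

    Every residue is kept exactly once, so the composition
    (count of each letter) is unchanged; only the order differs.
    """
    rng = random.Random(seed)
    residues = list(seq)
    rng.shuffle(residues)  # in-place Fisher-Yates shuffle
    return "".join(residues)
```

Note this preserves only single-residue composition; dinucleotide frequencies are not maintained.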
Orthology
- EggNOG - A database of orthologous groups and functional annotation
Pattern Matching
- patmatch-binder- Launchable Jupyter environment for running command line-based PatMatch via Binder. That page also links to other sequence pattern matching resources. The launched notebooks illustrate ways to easily work with the output in Python.
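For quick checks that do not need PatMatch itself, degenerate nucleotide patterns can be searched with Python's `re` module. The sketch below is not PatMatch and does not use PatMatch's pattern syntax; it assumes plain IUPAC nucleotide codes, searches one strand only, and `find_pattern` is a name invented for this example.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_pattern(pattern, sequence):
    """Return (start, match) pairs for every hit, including overlapping ones.

    Positions are 0-based offsets into the given strand.
    """
    regex = "".join(IUPAC[base] for base in pattern.upper())
    # A lookahead consumes no characters, so overlapping hits are all reported.
    lookahead = re.compile("(?=(" + regex + "))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence.upper())]
```

Searching the reverse strand would need a second call on the reverse complement of the sequence.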
Some sequence analysis but mostly OTHER
- BioCyc Database Collection - "BioCyc is a collection of 3530 Pathway/Genome Databases (PGDBs), with tools for understanding their data. Cellular Overview image generated by Pathway Tools. Explore Metabolic Maps for Thousands of Organisms. RouteSearch: Search for Paths through the Metabolic Network. Cross-Organism Search form generated by Pathway Tools. New: Search All of BioCyc for Genes, Proteins, Pathways. Search all of BioCyc or designated taxonomic groups for named genes, proteins, metabolites, pathways. Multiple Sequence Alignment results generated by Pathway Tools using MUSCLE. PatMatch query and results by Pathway Tools. SmartTable display generated by Pathway Tools. Metabolomics Data Analysis. Cellular Overview Omics Viewer image generated by Pathway Tools. Gene Expression Data Analysis. Multi-Genome Browser. Comparative Genome Analysis."
Good E. coli database
- EcoProDB - E. coli protein database (EcoProDB) integrates protein information identified on 2-D gels along with other resources to provide the comparative platform for the expression levels of many heterogeneous proteins under different genetic and environmental conditions using the interactive interface and search mechanism.
NGS
- HOMER - "Software for motif discovery and next-gen sequencing analysis". Nice in that it actually explains some of the details and advantages of the browsers and file types.
Nucleic acid system building
- NUPACK - "NUPACK is a growing software suite for the analysis and design of nucleic acid systems."
Fungal Genome Resources
http://1000.fungalgenomes.org/home/
http://fungidb.org/fungidb/ (about it –> http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3245123/)
http://genome.jgi.doe.gov/programs/fungi/1000fungalgenomes.jsf <— nice graphic of situation related to 1000 fungal genomes project
http://genome.jgi-psf.org/programs/fungi/index.jsf
http://fungi.ensembl.org/index.html
http://en.wikipedia.org/wiki/List_of_sequenced_fungi_genomes <– how current is it???
For genomic arrangement (synteny) comparisons/Fungal Genomics Resources
Synteny Viewer listed under every SGD gene on Sequence tab, near bottom of page
http://www.genomicus.biologie.ens.fr/genomicus-fungi-19.01/cgi-bin/search.pl
Yeast Gene Order Browser (YGOB)
RNA Structure Analysis
- Infernal - A downloadable program for sequence analysis using profiles of RNA sequence based on Rfam-associated covariance models and secondary structure consensus. The program can generate covariance models from RNA alignments as well. Binaries are available for Mac, Windows, and Linux. (E. P. Nawrocki and S. R. Eddy, Infernal 1.1: 100-fold faster RNA homology searches, Bioinformatics 29:2933-2935 (2013). PMID: 24008419)
Sequence Logo Generation
Installable software for fine-tuning sequence alignments
- SEQOTRON - Mac software for adjusting sequence alignments by hand. Unfortunately it discards the conservation data if it is present in the input. I haven't found a way to put it back in the output other than using the `cons` alignment consensus program, one of many programs at the EMBOSS explorer website.
Windows equivalent is here but I have NOT tried it.
Python-based utilities
- seqmagick-An imagemagick-like frontend to Biopython SeqIO. For example, it can convert from fasta to phylip, remove gaps from a fasta-formatted sequence, and describe all FASTA files in the current directory. Requires Biopython.
- See also the 'Binder'/notebook-related items earlier on this page, as I have usually worked out Python code to shuttle the output of other command-line based software into Python, and the notebook-related items here (https://github.com/fomightez/sequencework), as I sometimes demonstrate script usage in launchable notebooks.
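As an illustration of one operation mentioned above, removing gaps from FASTA-formatted sequences, here is a minimal standard-library sketch. It is not seqmagick's code and does not use Biopython; the helper names `read_fasta`, `remove_gaps`, and `to_fasta` are hypothetical, and it assumes `-` and `.` are the gap characters.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def remove_gaps(records, gap_chars="-."):
    """Return records with all gap characters deleted from each sequence."""
    table = str.maketrans("", "", gap_chars)
    return [(header, seq.translate(table)) for header, seq in records]


def to_fasta(records, width=60):
    """Format records as FASTA text, wrapping sequence lines at width."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"
```

Chaining the three, as in `to_fasta(remove_gaps(read_fasta(text)))`, turns an aligned FASTA file's contents into ungapped sequences. For real work a tested parser such as the one in Biopython is the safer choice.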
My own sequence work-related code
- Sequence manipulation Python code
- Working with UGENE software analysis software
- Working with Yeastmine
- See also the 'Binder'/notebook-related items earlier on this page, as I have usually worked out Python code to shuttle the output of other command-line based software into Python, and the notebook-related items here (https://github.com/fomightez/sequencework), as I sometimes demonstrate script usage in launchable notebooks.