User:Wayne Decatur/Sequence analysis tools

From Proteopedia

(Difference between revisions)

Revision as of 22:00, 6 November 2018

1 Have not Categorized Yet
2 BLAST+
3 Circos
4 Converters
5 Random sequence generators
6 Sequence shufflers
7 Orthology
8 Pattern Matching
9 Some sequence analysis but mostly OTHER
10 Good E. coli database
11 NGS
12 Nucleic acid system building
13 Fungal Genome Resources
14 For genomic arrangement (synteny) comparisons/Fungal Genomics Resources
15 RNA Structure Analysis
16 Sequence Logo Generation
17 Installable software for fine-tuning sequence alignments
18 Python-based utilities
19 My own sequence work-related code

Have not Categorized Yet

Plasmapper
Online Schematic plasmid drawing tool
Online restriction mapper
NEBcutter
old web cutter
Sequence Manipulation Suite: Restriction Map - has at the left side links to other tools they have
Biotools at UMASS MED (formerly included EMBOSS)
`cons` alignment consensus program and many others at EMBOSS explorer website
Links to many EMBOSS portals, servers and mirrors under 'Servers'
MUSCLE: MUltiple Sequence Comparison by Log-Expectation
Archaeopteryx for the visualization of annotated phylogenetic trees.
Netprimer
Genomicus: Genomes in Evolution - "genome browser that enables users to navigate in genomes in several dimensions: linearly along chromosome axes, transversaly across different species, and chronologically along evolutionary time."
SeqTrace - "is an application for viewing and processing DNA sequencing chromatograms (trace files). SeqTrace makes it easy to quickly generate high-quality finished sequences from a large number of trace files. SeqTrace can automatically identify, align, and compute [contig] consensus sequences from matching forward and reverse traces, filter low-quality base calls, and perform end trimming of finished sequences. The finished DNA sequences can then be exported to common sequence file formats, such as FASTA." Written in Python.
CAP3 Sequence Assembly Program - online web server for making contigs from DNA sequences. "form allows you to assemble a set of contiguous sequences (contigs) with the CAP3 program."
Nucleobytes - DNA editor and 4peaks sequence chromatogram viewer along with other mac software
PaxDb: Protein Abundance Across Organisms
PrePPI: database of predicted and experimentally determined protein-protein interactions (PPIs) for yeast and human.
T-profiler - for scoring the activity of pre-defined groups of yeast genes using gene expression data **As of May 2016 it was not accepting uploads.**
g:Profiler - for characterizing and manipulating gene lists of high-throughput genomics. Handles yeast and many other organisms.
ProViz - a web-based visualization tool to investigate the functional and evolutionary features of protein sequences.
ProDy Project - "ProDy is a free and open-source Python package for protein structural dynamics analysis". It appears to handle protein sequence analysis and working with PDB files as well.

BLAST+

Blast-binder - Launchable Jupyter environment for running command line-based BLAST via Binder. That page also links to the main BLAST resources. The launched notebooks illustrate ways to easily work with the output in Python.

Circos

Circos - Binderized Circos, available in a browser with one click that launches a Jupyter environment for Circos via Binder. That page also links to the main Circos resources. The launched notebooks illustrate ways to easily work with the output in Python.

Converters

ALTER (ALignment Transformation EnviRonment) - complex interface but offers lots of options for output. I used it as part of my workflow to get closer to special NEXUS format (or intermediate) for performing maximum likelihood phylogenetic analysis of large sets of sequences.
Sequence conversion Provided by bugaco.com - a lot of conversion choices with easy interface. When I had interleaved clustal format it converted nicely to a straight fasta listing for the sequence for every organism.
Reformat utility of Max Planck Institute for Developmental Biology Bioinformatics Toolkit converts sequences or multiple sequence alignments to various forms.
Format Converter - converts nucleotide and protein sequences in various formats to a lot of other formats.
Three to One converts three letter amino acid sequence translations to single letter translations.
One to Three converts single letter amino acid sequence translations to three letter translations.
ConvertSeq folder at github - my own converter scripts.
g:Convert - Gene ID Converter. Handles yeast and a very large list of other organisms.
seqmagick-An imagemagick-like frontend to Biopython SeqIO, can convert from fasta to phylip, etc.
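The Three to One and One to Three conversions listed above are simple table lookups, so they are easy to do locally as well. The following is a minimal Python sketch, not the code behind either linked tool; the function names `three_to_one` and `one_to_three` are my own, and the table covers only the 20 standard amino acids.

```python
# Hand-written lookup table for the 20 standard amino acids (an assumption of
# this sketch; ambiguity codes and stop symbols are deliberately left out).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def three_to_one(seq3):
    """Convert a three-letter translation such as 'MetAlaGly' to 'MAG'."""
    seq3 = "".join(seq3.split())  # drop spaces and line breaks
    if len(seq3) % 3 != 0:
        raise ValueError("length is not a multiple of three")
    codes = [seq3[i:i + 3].capitalize() for i in range(0, len(seq3), 3)]
    return "".join(THREE_TO_ONE[code] for code in codes)


def one_to_three(seq1):
    """Convert a single-letter translation such as 'MAG' to 'MetAlaGly'."""
    seq1 = "".join(seq1.split()).upper()
    return "".join(ONE_TO_THREE[residue] for residue in seq1)


if __name__ == "__main__":
    print(three_to_one("Met Ala Gly"))
    print(one_to_three("MAG"))
```

Residues outside the table raise a `KeyError`, which seemed safer for a sketch than silently guessing.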

Random sequence generators

http://users-birc.au.dk/biopv/php/fabox/random_sequence_generator.php
http://www.bioinformatics.org/sms2/random_dna.html
http://www.faculty.ucr.edu/~mmaduro/random.htm
http://molbiol.ru/eng/scripts/01_16.html
also see resources listed at the bottom of my gene expression page at my simulated_data repo.
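When a web form is not convenient, a random DNA sequence can also be produced with a few lines of Python. This is only a sketch using the standard library, not the method of any generator linked above; the function name `random_dna` and its `gc_content` and `seed` parameters are assumptions for illustration.

```python
import random


def random_dna(length, gc_content=0.5, seed=None):
    """Return a random DNA string of the given length.

    gc_content is the expected fraction of G+C (hypothetical parameter);
    seed makes the result reproducible.
    """
    if not 0.0 <= gc_content <= 1.0:
        raise ValueError("gc_content must be between 0 and 1")
    rng = random.Random(seed)
    at_weight = (1.0 - gc_content) / 2  # weight for each of A and T
    gc_weight = gc_content / 2          # weight for each of G and C
    weights = [at_weight, gc_weight, gc_weight, at_weight]
    return "".join(rng.choices("ACGT", weights=weights, k=length))


if __name__ == "__main__":
    print(random_dna(60, gc_content=0.4, seed=1))
```

Passing the same `seed` gives the same sequence each time, which helps when the random sequence serves as a control in a notebook.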

Sequence shufflers

http://emboss.sourceforge.net/ - shuffleseq from EMBOSS shuffles a set of sequences maintaining composition.
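Shuffling while maintaining composition means permuting the residues of each sequence, so the counts of each letter stay the same and only the order changes. The sketch below shows that idea in plain Python; it is a stand-in for illustration and not the EMBOSS shuffleseq implementation, and the name `shuffle_sequence` is my own.

```python
import random
from collections import Counter


def shuffle_sequence(seq, seed=None):
    """Return a random permutation of seq with identical composition."""
    rng = random.Random(seed)
    residues = list(seq)
    rng.shuffle(residues)  # in-place shuffle of the residue list
    return "".join(residues)


if __name__ == "__main__":
    original = "ATGGCGTACGCTTAA"
    shuffled = shuffle_sequence(original, seed=7)
    print(shuffled)
    # Composition is unchanged even though the order differs.
    print(Counter(original) == Counter(shuffled))
```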

Orthology

EggNOG - A database of orthologous groups and functional annotation

Pattern Matching

patmatch-binder- Launchable Jupyter environment for running command line-based PatMatch via Binder. That page also links to other sequence pattern matching resources. The launched notebooks illustrate ways to easily work with the output in Python.

Some sequence analysis but mostly OTHER

BioCyc Database Collection - "BioCyc is a collection of 3530 Pathway/Genome Databases (PGDBs), with tools for understanding their data. Cellular Overview image generated by Pathway Tools. Explore Metabolic Maps for Thousands of Organisms. RouteSearch: Search for Paths through the Metabolic Network. Cross-Organism Search form generated by Pathway Tools. New: Search All of BioCyc for Genes, Proteins, Pathways. Search all of BioCyc or designated taxonomic groups for named genes, proteins, metabolites, pathways. Multiple Sequence Alignment results generated by Pathway Tools using MUSCLE. PatMatch query and results by Pathway Tools. SmartTable display generated by Pathway Tools. Metabolomics Data Analysis. Cellular Overview Omics Viewer image generated by Pathway Tools. Gene Expression Data Analysis. Multi-Genome Browser. Comparative Genome Analysis."

Good E. coli database

EcoProDB - The E. coli protein database (EcoProDB) integrates protein information identified on 2-D gels along with other resources to provide a comparative platform for the expression levels of many heterogeneous proteins under different genetic and environmental conditions, using an interactive interface and search mechanism.

NGS

HOMER - "Software for motif discovery and next-gen sequencing analysis". Nice in that it actually explains some of the details and advantages of the browsers and file types.

Nucleic acid system building

NUPACK - "NUPACK is a growing software suite for the analysis and design of nucleic acid systems."

Fungal Genome Resources

http://fungalgenomes.org/

http://1000.fungalgenomes.org/home/

http://fungidb.org/fungidb/ (about it –> http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3245123/)

http://genome.jgi.doe.gov/programs/fungi/1000fungalgenomes.jsf <— nice graphic of situation related to 1000 fungal genomes project

http://genome.jgi-psf.org/programs/fungi/index.jsf

http://www.broadinstitute.org/scientific-community/science/projects/fungal-genome-initiative/fungal-genomics

http://fungi.ensembl.org/index.html

http://en.wikipedia.org/wiki/List_of_sequenced_fungi_genomes <– how current is it???

S. cerevisiae complexes

For genomic arrangement (synteny) comparisons/Fungal Genomics Resources

Synteny Viewer listed under every SGD gene on Sequence tab, near bottom of page

http://www.genomicus.biologie.ens.fr/genomicus-fungi-19.01/cgi-bin/search.pl

Yeast Gene Order Browser (YGOB)

RNA Structure Analysis

Infernal - A downloadable program for sequence analysis using profiles of RNA sequence based on Rfam-associated covariance models and secondary structure consensus. The program can generate covariance models from RNA alignments as well. Binaries are available for Mac, Windows, and Linux. (E. P. Nawrocki and S. R. Eddy, Infernal 1.1: 100-fold faster RNA homology searches, Bioinformatics 29:2933-2935 (2013). PMID: 24008419)

Sequence Logo Generation

Installable software for fine-tuning sequence alignments

SEQOTRON - Mac software for adjusting sequence alignments by hand. Unfortunately it discards the conservation data if it is present in the input. I haven't found a way to put it back in the output other than to use `cons` alignment consensus program and many others at EMBOSS explorer website.

Windows equivalent is here but I have NOT tried it.

Python-based utilities

seqmagick-An imagemagick-like frontend to Biopython SeqIO. For example, it can convert from fasta to phylip, remove gaps from a fasta-formatted sequence, and describe all FASTA files in the current directory. Requires Biopython.
see also earlier on this page 'Binder'/notebook-related items as I usually have worked out Python code to shuttle other command-line based software output to Python and notebook-related items here as I sometimes demonstrate script usage in launchable notebooks
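For a quick check without installing anything, the gap-removal task mentioned above for seqmagick can be approximated in plain Python. This sketch does not use seqmagick or Biopython and does not reproduce their behavior exactly; the function name `degap_fasta` and the choice of `-` and `.` as gap characters are assumptions.

```python
def degap_fasta(text, gap_chars="-."):
    """Strip gap characters from the sequence lines of FASTA-formatted text.

    Header lines (starting with '>') are passed through unchanged, and
    sequence lines that become empty after gap removal are dropped.
    """
    strip_gaps = str.maketrans("", "", gap_chars)
    cleaned_lines = []
    for line in text.splitlines():
        if line.startswith(">"):
            cleaned_lines.append(line)
        else:
            cleaned = line.translate(strip_gaps)
            if cleaned:
                cleaned_lines.append(cleaned)
    return "\n".join(cleaned_lines)


if __name__ == "__main__":
    aligned = ">seq1\nAC-GT\n--\n>seq2\nA.CGT"
    print(degap_fasta(aligned))
```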

My own sequence work-related code

Sequence manipulation Python code
Working with UGENE software analysis software
Working with Yeastmine
see also earlier on this page 'Binder'/notebook-related items as I usually have worked out Python code to shuttle other command-line based software output to Python and notebook-related items here as I sometimes demonstrate script usage in launchable notebooks

Proteopedia Page Contributors and Editors

Wayne Decatur

@@ Line 124: / Line 124: @@
 * [https://github.com/fhcrc/seqmagick seqmagick-An imagemagick-like frontend to Biopython SeqIO]. For example, it can  convert from fasta to phylip, remove gaps from a fasta-formatted sequence, and  describe all FASTA files in the current directory. Requires Biopython.
-* see also on [https://github.com/fomightez/sequencework this page] 'Binder'/notebook-related items as I usually have worked out Python code to shuttle other command-line based software output to Python or demonstrate the scripts use
+* see also earlier on this page  'Binder'/notebook-related items as I usually have worked out Python code to shuttle other command-line based software output to Python and notebook-related items [https://github.com/fomightez/sequencework here] as I sometimes demonstrate script usage in launchable notebooks
 ==My own sequence work-related code==
@@ Line 130: / Line 130: @@
 * [https://github.com/fomightez/UGENE_help Working with UGENE software analysis software]
 * [https://github.com/fomightez/yeastmine Working with Yeastmine]
-* see also on [https://github.com/fomightez/sequencework this page] 'Binder'/notebook-related items as I usually have worked out Python code to shuttle other command-line based software output to Python or demonstrate the scripts use
+* see also earlier on this page  'Binder'/notebook-related items as I usually have worked out Python code to shuttle other command-line based software output to Python and notebook-related items [https://github.com/fomightez/sequencework here] as I sometimes demonstrate script usage in launchable notebooks