Chordomics: a visualisation tool for linking function to phylogeny in microbiomes

Bioinformatics Oxford Journals - Fri, 20/09/2019 - 5:30am
Abstract
Summary: The overarching aim of microbiome analysis is to uncover the links between microbial phylogeny and function in order to access ecosystem functioning. This can be done using several experimental strategies targeting different biomolecules, including DNA (metagenomics), RNA (metatranscriptomics) and proteins (metaproteomics). Despite the importance of linking microbial function to phylogeny there are currently no visualisation tools that effectively integrate this information. Chordomics is a Shiny-based application for linked -omics data analysis, allowing users to visualise microbial function and phylogeny on a single plot and compare datasets across time and environments.
Availability and Implementation: Chordomics is available on GitHub: https://github.com/kevinmcdonnell6/chordomics; software is coded in R and JavaScript and a demonstration version is available at https://kmcd.shinyapps.io/chordomics/.
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends

Automatic discovery of 100-miRNA signature for cancer classification using ensemble feature selection

BMC Bioinformatics - Wed, 18/09/2019 - 5:30am
MicroRNAs (miRNAs) are noncoding RNA molecules heavily involved in human tumors, and few of them circulate in the human body. Finding a tumor-associated signature of miRNA, that is, the minimum miRNA entit...
Categories: Bioinformatics Trends

Adverse drug reaction detection via a multihop self-attention mechanism

BMC Bioinformatics - Wed, 18/09/2019 - 5:30am
The adverse reactions that are caused by drugs are potentially life-threatening problems. Comprehensive knowledge of adverse drug reactions (ADRs) can reduce their detrimental impacts on patients. Detecting AD...
Categories: Bioinformatics Trends

A novel protein descriptor for the prediction of drug binding sites

BMC Bioinformatics - Wed, 18/09/2019 - 5:30am
Binding sites are the pockets of proteins that can bind drugs; the discovery of these pockets is a critical step in drug design. With the help of computers, protein pocket prediction can save manpower and fin...
Categories: Bioinformatics Trends

AssessORF: combining evolutionary conservation and proteomics to assess prokaryotic gene predictions

Bioinformatics Oxford Journals - Wed, 18/09/2019 - 5:30am
Abstract
Motivation: A core task of genomics is to identify the boundaries of protein coding genes, which may cover over 90% of a prokaryote's genome. Several programs are available for gene finding, yet it is currently unclear how well these programs perform and whether any offers superior accuracy. This is in part because there is no universal benchmark for gene finding and, therefore, most developers select their own benchmarking strategy.
Results: Here, we introduce AssessORF, a new approach for benchmarking prokaryotic gene predictions based on evidence from proteomics data and the evolutionary conservation of start and stop codons. We applied AssessORF to compare gene predictions offered by GenBank, GeneMarkS-2, Glimmer, and Prodigal on genomes spanning the prokaryotic tree of life. Gene predictions were 88–95% in agreement with the available evidence, with Glimmer performing the worst but no clear winner. All programs were biased towards selecting start codons that were upstream of the actual start. Given these findings, there remains considerable room for improvement, especially in the detection of correct start sites.
Availability: AssessORF is available as an R package via the Bioconductor package repository.
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends

MUFold-SSW: A new web server for predicting protein secondary structures, torsion angles, and turns

Bioinformatics Oxford Journals - Wed, 18/09/2019 - 5:30am
Abstract
Motivation: Protein secondary structure and backbone torsion angle prediction can provide important information for predicting protein 3D structures and protein functions. Our new methods MUFold-SS, MUFold-Angle, MUFold-BetaTurn, and MUFold-GammaTurn, developed based on advanced deep neural networks, achieved state-of-the-art performance for predicting secondary structures, backbone torsion angles, beta-turns, and gamma-turns, respectively. An easy-to-use web service will provide the community a convenient way to use these methods for research and development.
Results: MUFold-SSW, a new web server, is presented. It provides predictions of protein secondary structures, torsion angles, beta-turns and gamma-turns for a given protein sequence. This server implements MUFold-SS, MUFold-Angle, MUFold-BetaTurn, and MUFold-GammaTurn, which performed well for both easy targets (proteins with weak sequence similarity in PDB) and hard targets (proteins without detectable similarity in PDB) in various experimental tests, achieving results better than or comparable with those of existing methods.
Availability: MUFold-SSW is accessible at http://mufold.org/mufold-ss-angle.
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends

Set Cover Based Methods for Motif Selection

Bioinformatics Oxford Journals - Tue, 17/09/2019 - 5:30am
Abstract
Motivation: De novo motif discovery algorithms find statistically over-represented sequence motifs that may function as transcription factor binding sites. Current methods often report large numbers of motifs, making it difficult to perform further analyses and experimental validation. The motif selection problem seeks to identify a minimal set of putative regulatory motifs that characterize sequences of interest (e.g. ChIP-Seq binding regions).
Results: In this study, the motif selection problem is mapped to variants of the set cover problem that are solved via tabu search and by relaxed integer linear programming (RILP). The algorithms are employed to analyze 349 ChIP-Seq experiments from the ENCODE project, yielding a small number of high quality motifs that represent putative binding sites of primary factors and cofactors. Specifically, when compared to the motifs reported by Kheradpour and Kellis, the set cover based algorithms produced motif sets covering 35% more peaks for 11 TFs and identified 4 more putative cofactors for 6 TFs. Moreover, a systematic evaluation using nested cross-validation revealed that the RILP algorithm selected fewer motifs and was able to cover 6% more peaks and 3% fewer background regions, which reduced the error rate by 7%.
Availability: The source code of the algorithms and all the datasets are available at https://github.com/YichaoOU/Set_cover_tools.
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends
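
To illustrate the set cover framing described in the motif selection abstract above, here is a minimal sketch. It uses the classic greedy set cover heuristic, not the tabu search or RILP solvers the authors report, and the function name, motif names and peak IDs are all hypothetical. Each motif is treated as the set of peaks it hits, and the motif covering the most still-uncovered peaks is picked repeatedly.

```python
def greedy_motif_cover(motif_hits, peaks):
    """Greedy set cover (illustrative stand-in, not the paper's solvers).

    motif_hits: dict mapping a motif name to the set of peak IDs it hits.
    peaks: iterable of all peak IDs that should be covered.
    Returns (selected motifs in pick order, peaks no motif can cover).
    """
    uncovered = set(peaks)
    selected = []
    while uncovered and motif_hits:
        # Pick the motif that covers the most uncovered peaks;
        # sorting the names makes tie-breaking deterministic.
        best = max(sorted(motif_hits),
                   key=lambda m: len(motif_hits[m] & uncovered))
        gained = motif_hits[best] & uncovered
        if not gained:
            break  # the remaining peaks are hit by no motif
        selected.append(best)
        uncovered -= gained
    return selected, uncovered


# Hypothetical toy data: four candidate motifs, six peaks.
toy_hits = {
    "M1": {1, 2, 3},
    "M2": {3, 4},
    "M3": {4, 5},
    "M4": {1},
}
selected, uncovered = greedy_motif_cover(toy_hits, range(1, 7))
print(selected, uncovered)
```

On this toy input the heuristic selects M1 and then M3, and peak 6 stays uncovered because no motif hits it. The greedy heuristic gives no guarantee of a minimal motif set; it only shows how motif selection reduces to choosing covering sets.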

SCSsim: an integrated tool for simulating single-cell genome sequencing data

Bioinformatics Oxford Journals - Tue, 17/09/2019 - 5:30am
Abstract
Motivation: Allele dropout (ADO) and unbalanced amplification of alleles are the main technical issues of single-cell sequencing (SCS), and effectively emulating these issues is necessary for reliably benchmarking SCS-based bioinformatics tools. Unfortunately, currently available sequencing simulators do not model the whole-genome amplification involved in the SCS technique and are therefore not suited for generating SCS datasets. We develop a new software package (SCSsim) that can efficiently simulate SCS datasets in a parallel fashion with minimal user intervention. SCSsim first constructs the genome sequence of a single cell by mimicking a complement of genomic variations in a user-controlled manner, then amplifies the genome according to the MALBAC technique, and finally yields sequencing reads from the amplified products based on inferred sequencing profiles. Comprehensive evaluation in simulating different ADO rates, variation detection efficiency and genome coverage demonstrates that SCSsim is a very useful tool for mimicking single-cell sequencing data with high efficiency.
Availability: SCSsim is freely available at https://github.com/qasimyu/scssim.
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends

Subtype-Specific Transcriptional Regulators in Breast Tumors Subjected to Genetic and Epigenetic Alterations

Bioinformatics Oxford Journals - Mon, 16/09/2019 - 5:30am
Abstract
Motivation: Breast cancer consists of multiple distinct tumor subtypes, and results from epigenetic and genetic aberrations that give rise to distinct transcriptional profiles. Despite previous efforts to understand transcriptional deregulation through transcription factor networks, the transcriptional mechanisms leading to subtypes of the disease remain poorly understood.
Results: We used a sophisticated computational search of thousands of expression datasets to define extended signatures of distinct breast cancer subtypes. Using ENCODE ChIP-seq data of surrogate cell lines and motif analysis we observed that these subtypes are determined by a distinct repertoire of lineage-specific transcription factors. Furthermore, specific pattern and abundance of copy number and DNA methylation changes at these TFs and targets, compared to other genes and to normal cells were observed. Overall, distinct transcriptional profiles are linked to genetic and epigenetic alterations at lineage-specific transcriptional regulators in breast cancer subtypes.
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends

Biogenesis mechanisms of circular RNA can be categorized through feature extraction of a machine learning model

Bioinformatics Oxford Journals - Mon, 16/09/2019 - 5:30am
Abstract
Motivation: In recent years, multiple circular RNA biogenesis mechanisms have been discovered. While each reported mechanism has been experimentally verified in different circular RNAs, no single biogenesis mechanism has been proposed that can universally explain the biogenesis of all tens of thousands of discovered circular RNAs. Under the hypothesis that human circular RNAs can be categorized according to different biogenesis mechanisms, we designed a contextual regression model trained to predict the formation of circular RNA from a random genomic locus on the human genome, with potential biogenesis factors of circular RNA as the features of the training data.
Results: After achieving high prediction accuracy, we found through the feature extraction technique that the examined human circular RNAs can be categorized into seven subgroups, according to the presence of the following sequence features: RNA editing sites, simple repeat sequences, self-chains, RNA binding protein binding sites and CpG islands within the flanking regions of the circular RNA back-spliced junction sites. These results support all of the previously reported biogenesis mechanisms of circRNA and solidify the idea that multiple biogenesis mechanisms co-exist for different subsets of human circRNAs. Furthermore, we uncover a potential new link between circRNA biogenesis and flanking CpG islands. We have also identified RNA binding proteins putatively correlated with circRNA biogenesis.
Availability: Scripts and tutorial are available at https://github.com/chl556/Contextual_Regression_for_CircRNA. This program is under GNU General Public License v3.0.
Categories: Bioinformatics Trends

A purely bioinformatic pipeline for the prediction of mammalian odorant receptor gene enhancers

BMC Bioinformatics - Sat, 14/09/2019 - 5:30am
In most mammals, a vast array of genes coding for chemosensory receptors mediates olfaction. Odorant receptor (OR) genes generally constitute the largest multifamily (> 1100 intact members in the mouse). From ...
Categories: Bioinformatics Trends

HH-suite3 for fast remote homology detection and deep protein annotation

BMC Bioinformatics - Sat, 14/09/2019 - 5:30am
HH-suite is a widely used open source software suite for sensitive sequence similarity searches and protein fold recognition. It is based on pairwise alignment of profile Hidden Markov models (HMMs), which rep...
Categories: Bioinformatics Trends

Towards pixel-to-pixel deep nucleus detection in microscopy images

BMC Bioinformatics - Sat, 14/09/2019 - 5:30am
Nucleus detection is a fundamental task in microscopy image analysis and supports many other quantitative studies such as object counting, segmentation, tracking, etc. Deep neural networks are emerging as a powerful too...
Categories: Bioinformatics Trends

Identifying protein complexes based on an edge weight algorithm and core-attachment structure

BMC Bioinformatics - Sat, 14/09/2019 - 5:30am
Protein complex identification from protein-protein interaction (PPI) networks is crucial for understanding cellular organization principles and functional mechanisms. In recent decades, numerous computational...
Categories: Bioinformatics Trends

A multiscale mathematical model of cell dynamics during neurogenesis in the mouse cerebral cortex

BMC Bioinformatics - Sat, 14/09/2019 - 5:30am
Neurogenesis in the murine cerebral cortex involves the coordinated divisions of two main types of progenitor cells, whose numbers, division modes and cell cycle durations set up the final neuronal output. To ...
Categories: Bioinformatics Trends

PPNID: a reference database and molecular identification pipeline for plant-parasitic nematodes

Bioinformatics Oxford Journals - Sat, 14/09/2019 - 5:30am
Abstract
Motivation: The phylum Nematoda comprises the most cosmopolitan and abundant metazoans on Earth and plant-parasitic nematodes represent one of the most significant nematode groups, causing severe losses in agriculture. In practice, the demand for accurate nematode identification is high in ecological, agricultural, taxonomic and phylogenetic research. Despite its importance, morphological diagnosis is often a difficult task due to phenotypic plasticity and the absence of clear diagnostic characters, while molecular identification is very difficult due to the problematic database and complex genetic background.
Results: The present study attempts to make up for currently available databases by creating a manually-curated database including all up-to-date authentic barcoding sequences. To facilitate the laborious process associated with the interpretation and identification of a given query sequence, we developed an automatic software pipeline for rapid species identification. The incorporated alignment function facilitates the examination of mutation distribution and therefore also reveals nucleotide autapomorphies, which are important in species delimitation. The implementation of genetic distance, plot and maximum likelihood phylogeny analysis provides more powerful optimality criteria than similarity searching and facilitates species delimitation using evolutionary or phylogenetic species concepts. The pipeline streamlines several functions to facilitate more precise data analyses, and the subsequent interpretation is easy and straightforward.
Availability: The pipeline was written in vb.net, developed on Microsoft Visual Studio 2017 and designed to work in any Windows environment. The PPNID is distributed under the GNU General Public License (GPL). The executable file along with tutorials is available at https://github.com/xueqing4083/PPNID.
Categories: Bioinformatics Trends

DeepMSPeptide: peptide detectability prediction using deep learning

Bioinformatics Oxford Journals - Sat, 14/09/2019 - 5:30am
Abstract
Summary: Protein detection and quantification using high-throughput proteomic technologies is still challenging due to the stochastic nature of peptide selection in the mass spectrometer, the difficulties in the statistical analysis of the results and the presence of degenerated peptides. However, restricting the analysis to those peptides that could be detected by mass spectrometry (MS), also called proteotypic peptides, increases the accuracy of the results. Several approaches have been applied to predict peptide detectability based on the physicochemical properties of the peptides. In this manuscript we present DeepMSPeptide, a bioinformatic tool that uses a deep learning method to predict proteotypic peptides exclusively based on the peptide amino acid sequences.
Availability and implementation: DeepMSPeptide is available at https://github.com/vsegurar/DeepMSPeptide.
Categories: Bioinformatics Trends

When the levee breaks: a practical guide to sketching algorithms for processing the flood of genomic data

Genome Biology - BiomedCentral - Fri, 13/09/2019 - 5:30am
Considerable advances in genomics over the past decade have resulted in vast amounts of data being generated and deposited in global archives. The growth of these archives exceeds our ability to process their ...
Categories: Bioinformatics Trends
