Comparison of pathway and gene-level models for cancer prognosis prediction

BMC Bioinformatics - Fri, 28/02/2020 - 5:30am
Cancer prognosis prediction is valuable for patients and clinicians because it allows them to appropriately manage care. A promising direction for improving the performance and interpretation of expression-bas...
Categories: Bioinformatics Trends

2DImpute: Imputation in Single Cell RNA-Seq Data from Correlations in Two Dimensions

Bioinformatics Oxford Journals - Fri, 28/02/2020 - 5:30am
Summary: We developed 2DImpute, an imputation method for correcting false zeros (known as dropouts) in single-cell RNA sequencing (scRNA-seq) data. It prevents excessive correction by predicting which zeros are false and imputing their values using the interrelationships between both genes and cells in the expression matrix. By applying 2DImpute to datasets from various scRNA-seq protocols, we showed that it outperforms several leading imputation methods.
Availability and Implementation: The R package of 2DImpute is freely available at GitHub (https://github.com/zky0708/2DImpute).
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends

Genome Detective Coronavirus Typing Tool for rapid identification and characterization of novel coronavirus genomes

Bioinformatics Oxford Journals - Fri, 28/02/2020 - 5:30am
Summary: Genome Detective is a web-based, user-friendly software application to quickly and accurately assemble all known virus genomes from next generation sequencing datasets. This application allows the identification of phylogenetic clusters and genotypes from assembled genomes in FASTA format. Since its release in 2019, we have produced a number of typing tools for emergent viruses that have caused large outbreaks, such as Zika and Yellow Fever Virus in Brazil. Here, we present the Genome Detective Coronavirus Typing Tool, which can accurately identify the novel severe acute respiratory syndrome (SARS) related coronavirus (SARS-CoV-2) sequences isolated in China and around the world. The tool can accept up to 2,000 sequences per submission, and the analysis of a new whole genome sequence takes approximately one minute. The tool has been tested and validated with hundreds of whole genomes from ten coronavirus species, and correctly classified all of the SARS-related coronavirus (SARSr-CoV) and all of the available public data for SARS-CoV-2. The tool also allows tracking of new viral mutations as the outbreak expands globally, which may help to accelerate the development of novel diagnostics, drugs and vaccines to stop the COVID-19 disease.
Availability: https://www.genomedetective.com/app/typingtool/cov
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends

rMSIproc: an R package for mass spectrometry imaging data processing

Bioinformatics Oxford Journals - Fri, 28/02/2020 - 5:30am
Summary: Mass spectrometry imaging (MSI) can reveal biochemical information directly from a tissue section. MSI generates a large quantity of complex spectral data, which is still challenging to translate into relevant biochemical information. Here we present rMSIproc, an open-source R package that implements a full data processing workflow for MSI experiments performed using TOF or FT-based mass spectrometers. The package provides a novel strategy for spectral alignment and recalibration, which allows multiple datasets to be processed simultaneously. This enables confident statistical analysis of multiple datasets from one or several experiments. rMSIproc is designed to work with files larger than the computer memory capacity, and the algorithms are implemented using a multi-threading strategy. rMSIproc is a powerful tool that takes full advantage of modern computer systems to realize the full potential of MSI.
Availability and Implementation: rMSIproc is freely available at https://github.com/prafols/rMSIproc
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends

PSORTm: a bacterial and archaeal protein subcellular localization prediction tool for metagenomics data

Bioinformatics Oxford Journals - Fri, 28/02/2020 - 5:30am
Motivation: Many methods for microbial protein subcellular localization (SCL) prediction exist; however, none is readily available for analysis of metagenomic sequence data, despite growing interest from researchers studying microbial communities in humans, agri-food relevant organisms, and other environments (for example, for identification of cell-surface biomarkers for rapid protein-based diagnostic tests). We also wished to identify new markers of water quality from freshwater samples collected from pristine vs pollution-impacted watersheds.
Results: We report PSORTm, the first bioinformatics tool designed for prediction of diverse bacterial and archaeal protein SCL from metagenomics data. PSORTm incorporates components of PSORTb, one of the most precise and widely used protein SCL predictors, with an automated classification by cell envelope. An evaluation using 5-fold cross validation with in silico fragmented sequences with known localization showed that PSORTm maintains PSORTb’s high precision, while sensitivity increases proportionately with metagenomic sequence fragment length. PSORTm’s read-based analysis was similar to PSORTb-based analysis of metagenome-assembled genomes (MAGs); however, the latter requires non-trivial manual classification of each MAG by cell envelope, and cannot make use of unassembled sequences. Analysis of the watershed samples revealed the importance of normalization and identified potential biomarkers of water quality. This method should be useful for examining a wide range of microbial communities, including human microbiomes, and other microbiomes of medical, environmental, or industrial importance.
Availability and Implementation: Documentation, source code, and docker containers are available for running PSORTm locally at https://www.psort.org/psortm/ (freely available, open source software under GNU General Public License Version 3).
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends

CRAMER: A lightweight, highly customisable web-based genome browser supporting multiple visualisation instances

Bioinformatics Oxford Journals - Fri, 28/02/2020 - 5:30am
Summary: In recent years the ability to generate genomic data has increased dramatically, along with the demand for easily personalised and customisable genome browsers for effective visualisation of diverse types of data. Despite the large number of web-based genome browsers available nowadays, none of the existing tools provides a means of creating multiple visualisation instances without manual set up on the deployment server side. The Cranfield Genome Browser (CRAMER) is an open-source, lightweight and highly customisable web application for interactive visualisation of genomic data. Once deployed, CRAMER supports seamless creation of multiple visualisation instances in parallel while allowing users to control and customise multiple tracks. The application is deployed on a Node.js server and is supported by a MongoDB database, which stores all customisations made by the users, allowing quick navigation between instances. Currently, the browser supports visualising a large number of file formats for genome annotation, variant calling, reads coverage and gene expression. Additionally, the browser supports direct JavaScript coding for personalised tracks, providing a whole new level of customisation both functionally and visually. Tracks can be added via direct file upload or processed in real-time via links to files stored remotely on an FTP repository. Furthermore, additional tracks can be added by users via simple drag and drop to an existing visualisation instance.
Availability and Implementation: CRAMER is implemented in JavaScript and is publicly available on GitHub at https://github.com/FadyMohareb/cramer. The application is released under an MIT licence and can be deployed on any server running Linux or Mac OS.
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends

DNA4mC-LIP: a linear integration method to identify N4-methylcytosine site in multiple species

Bioinformatics Oxford Journals - Fri, 28/02/2020 - 5:30am
Motivation: DNA N4-methylcytosine (4mC) is a crucial epigenetic modification. However, knowledge about its biological functions is limited. Effective and accurate identification of 4mC sites will help to reveal its biological functions and mechanisms. Since experimental methods are costly and ineffective, a number of machine learning based approaches have been proposed to detect 4mC sites. Although these methods yielded acceptable accuracy, there is still room to improve the prediction performance and the stability of existing methods in practical applications.
Results: In this work, we first systematically assessed the existing methods on an independent dataset. We then proposed DNA4mC-LIP, a linear integration method that combines existing predictors to identify 4mC sites in multiple species. The results obtained on the independent dataset demonstrated that DNA4mC-LIP outperformed existing methods for identifying 4mC sites. To support the scientific community, a web server for DNA4mC-LIP was developed. We anticipate that DNA4mC-LIP could serve as a powerful computational technique for identifying 4mC sites and facilitate the interpretation of the 4mC mechanism.
Availability: http://i.uestc.edu.cn/DNA4mC-LIP/.
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends
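
The linear integration idea described in the DNA4mC-LIP entry above can be illustrated with a minimal sketch. This is not the authors' implementation: the predictor names, weights, and decision threshold are hypothetical, and the snippet only shows how scores from several existing predictors might be combined as a weighted sum.

```python
from typing import Mapping


def linear_integration(scores: Mapping[str, float],
                       weights: Mapping[str, float]) -> float:
    """Combine per-predictor scores into one score via a weighted sum."""
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing scores for: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


def is_4mc_site(scores: Mapping[str, float],
                weights: Mapping[str, float],
                threshold: float = 0.5) -> bool:
    """Call a site positive when the combined score reaches the threshold.

    The threshold of 0.5 is an assumption made for this illustration.
    """
    return linear_integration(scores, weights) >= threshold


# Hypothetical predictors and weights; real weights would be fitted on data.
weights = {"predictor_a": 0.5, "predictor_b": 0.3, "predictor_c": 0.2}
scores = {"predictor_a": 0.9, "predictor_b": 0.4, "predictor_c": 0.1}

combined = linear_integration(scores, weights)
print(round(combined, 2), is_4mc_site(scores, weights))
```

In practice the weights would be estimated from training data rather than set by hand, and each predictor's scores may need to be rescaled to a common range before they are combined.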

Bio2Rxn: sequence-based enzymatic reaction predictions by a consensus strategy

Bioinformatics Oxford Journals - Fri, 28/02/2020 - 5:30am
Summary: The development of sequencing technologies has generated large amounts of protein sequence data. The automated prediction of the enzymatic reactions of uncharacterized proteins is a major challenge in the field of bioinformatics. Here, we present Bio2Rxn as a web-based tool to provide putative enzymatic reaction predictions for uncharacterized protein sequences. Bio2Rxn adopts a consensus strategy by incorporating six types of enzyme prediction tools. It allows for the efficient integration of these computational resources to maximize the accuracy and comprehensiveness of enzymatic reaction predictions, which facilitates the characterization of the functional roles of target proteins in metabolism. Bio2Rxn further links the enzyme function prediction with more than 300,000 enzymatic reactions, which were manually curated by more than 100 people over the past 9 years from more than 580,000 publications.
Availability: Bio2Rxn is available at: http://design.rxnfinder.org/bio2rxn/.
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends
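
As a rough illustration of what a consensus strategy across several prediction tools can look like, the sketch below uses simple majority voting. It is an assumption made for illustration, not Bio2Rxn's actual procedure: the tool names, the `min_votes` parameter, and the example labels are all hypothetical.

```python
from collections import Counter
from typing import Mapping, Optional


def consensus_prediction(predictions: Mapping[str, Optional[str]],
                         min_votes: int = 2) -> Optional[str]:
    """Return the label predicted by the most tools, or None if support is too low.

    Tools that return no prediction (None) do not vote. If two labels tie,
    the one encountered first in `predictions` is chosen.
    """
    votes = Counter(label for label in predictions.values() if label is not None)
    if not votes:
        return None
    label, count = votes.most_common(1)[0]
    return label if count >= min_votes else None


# Hypothetical tool outputs; the strings are placeholder enzyme class labels.
predictions = {
    "tool_a": "1.1.1.1",
    "tool_b": "1.1.1.1",
    "tool_c": "2.7.1.1",
    "tool_d": None,
}
print(consensus_prediction(predictions))
```

Requiring a minimum number of agreeing tools trades coverage for precision: a higher `min_votes` yields fewer but better supported predictions.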

Regenerating zebrafish fin epigenome is characterized by stable lineage-specific DNA methylation and dynamic chromatin accessibility

Genome Biology - BiomedCentral - Thu, 27/02/2020 - 5:30am
Zebrafish can faithfully regenerate injured fins through the formation of a blastema, a mass of proliferative cells that can grow and develop into the lost body part. After amputation, various cell types contr...
Categories: Bioinformatics Trends

A rare codon-based translational program of cell proliferation

Genome Biology - BiomedCentral - Thu, 27/02/2020 - 5:30am
The speed of translation elongation is primarily determined by the abundance of tRNAs. Thus, the codon usage influences the rate with which individual mRNAs are translated. As the nature of tRNA pools and modi...
Categories: Bioinformatics Trends

Ensemble learning for classifying single-cell data and projection across reference atlases

Bioinformatics Oxford Journals - Thu, 27/02/2020 - 5:30am
Summary: Single-cell data are being generated at an accelerating pace. How best to project data across single-cell atlases is an open problem. We developed a boosted learner that overcomes the greatest challenge with status quo classifiers: low sensitivity, especially when dealing with rare cell types. By comparing novel and published data from distinct scRNA-seq modalities that were acquired from the same tissues, we show that this approach preserves cell-type labels when mapping across diverse platforms.
Availability: https://github.com/diazlab/ELSA
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends

StackCPPred: A Stacking and Pairwise Energy Content-based Prediction of Cell-Penetrating Peptides and Their Uptake Efficiency

Bioinformatics Oxford Journals - Thu, 27/02/2020 - 5:30am
Motivation: Cell-penetrating peptides (CPPs) are a vehicle for transporting pharmacologically active molecules, such as short interfering RNAs, nanoparticles, plasmid DNAs, and small peptides, into living cells, thus offering great potential as future therapeutics. Existing experimental techniques for identifying CPPs are time-consuming and expensive. Thus, predicting CPPs from peptide sequences with computational methods can help to annotate candidates and guide the experimental process quickly. Many machine learning-based methods have recently emerged for identifying CPPs. Although considerable progress has been made, existing methods still have low feature representation capabilities, thereby limiting further performance improvements.
Results: We propose a method called StackCPPred, which introduces three feature methods based on the pairwise energy content of the residue: RECM-composition, PseRECM, and RECM-DWT. These features are used to train stacking-based machine learning methods to effectively predict CPPs. On the CPP924 and CPPsite3 datasets with jackknife validation, StackCPPred achieved 94.5% and 78.3% accuracy, which was 2.9% and 5.8% higher than the state-of-the-art CPP predictors, respectively. StackCPPred can be a powerful tool for predicting CPPs and their uptake efficiency, facilitating hypothesis-driven experimental design and accelerating their applications in clinical therapy.
Availability: Source code and data can be downloaded from https://github.com/Excelsior511/StackCPPred.
Supplementary information: Supplementary data are available online at Bioinformatics.
Categories: Bioinformatics Trends

ConfID: an analytical method for conformational characterization of small molecules using molecular dynamics trajectories

Bioinformatics Oxford Journals - Thu, 27/02/2020 - 5:30am
Motivation: The conformational space of small molecules can be vast and difficult to assess. Molecular dynamics simulations of free ligands in solution have been applied to predict conformational populations, but their characterization is often based on clustering algorithms or manual efforts.
Results: Here, we introduce ConfID, an analytical tool for conformational characterization of small molecules using molecular dynamics trajectories. The evolution of conformational sampling and population frequencies throughout trajectories is calculated to check for sampling convergence while allowing relevant conformational transitions to be mapped. The tool is designed to track conformational transition events and calculate time-dependent properties for each conformational population detected.
Availability: Toolkit and documentation are freely available at http://sbcb.inf.ufrgs.br/confid
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends

Allosteric inhibition of CRISPR-Cas9 by bacteriophage-derived peptides

Genome Biology - BiomedCentral - Wed, 26/02/2020 - 5:30am
CRISPR-Cas9 has been developed as a therapeutic agent for various infectious and genetic diseases. In many clinically relevant applications, constitutively active CRISPR-Cas9 is delivered into human cells with...
Categories: Bioinformatics Trends

Connecting mathematical models to genomes: Joint estimation of model parameters and genome-wide marker effects on these parameters

Bioinformatics Oxford Journals - Wed, 26/02/2020 - 5:30am
Motivation: Parameters of mathematical models used in biology may be genotype-specific and regarded as new traits. Therefore, an accurate estimation of these parameters and the association mapping on the estimated parameters can lead to important findings regarding the genetic architecture of biological processes. In this study, a statistical framework for a joint analysis of model parameters and genome-wide marker effects on these parameters was proposed and evaluated.
Results: In the simulation analyses based on different types of mathematical models, the joint analysis inferred the model parameters and identified the responsible genomic regions more accurately than the independent analysis. The joint analysis of real plant data provided interesting insights into photosensitivity, which were uncovered by the independent analysis.
Availability and implementation: The statistical framework is provided by the R package GenomeBasedModel available at https://github.com/Onogi/GenomeBasedModel. All R and C++ scripts used in this study are also available at the site.
Supplementary information: Supplementary information is provided on the journal website.
Categories: Bioinformatics Trends

Discovering and interpreting transcriptomic drivers of imaging traits using neural networks

Bioinformatics Oxford Journals - Wed, 26/02/2020 - 5:30am
Motivation: Cancer heterogeneity is observed at multiple biological levels. To improve our understanding of these differences and their relevance in medicine, approaches to link organ- and tissue-level information from diagnostic images and cellular-level information from genomics are needed. However, these “radiogenomic” studies often use linear, shallow models, depend on feature selection, or consider one gene at a time to map images to genes. Moreover, no study has systematically attempted to understand the molecular basis of imaging traits based on the interpretation of what the neural network has learned. These current studies are thus limited in their ability to understand the transcriptomic drivers of imaging traits, which could provide additional context for determining clinical outcomes.
Results: We present an approach based on neural networks that takes high-dimensional gene expressions as input and performs nonlinear mapping to an imaging trait. To interpret the models, we propose gene masking and gene saliency to extract learned relationships from radiogenomic neural networks. In glioblastoma patients, our models outperform comparable classifiers (>0.10 AUC) and our interpretation methods were validated using a similar model to identify known relationships between genes and molecular subtypes. We found that tumor imaging traits had specific transcription patterns, e.g., edema and genes related to cellular invasion, and ten radiogenomic traits were significantly predictive of survival. We demonstrate that neural networks can model transcriptomic heterogeneity to reflect differences in imaging and can be used to derive radiogenomic traits with clinical value.
Availability and implementation: https://github.com/novasmedley/deepRadiogenomics.
Supplementary information: Available at Bioinformatics online.
Categories: Bioinformatics Trends
