HeMoQuest: a webserver for qualitative prediction of transient heme binding to protein motifs

BMC Bioinformatics - Fri, 27/03/2020 - 5:30am
The notion of heme as a regulator of many physiological processes via transient binding to proteins is one that is recently being acknowledged. The broad spectrum of the effects of heme makes it important to i...
Categories: Bioinformatics Trends

debCAM: a Bioconductor R package for fully unsupervised deconvolution of complex tissues

Bioinformatics Oxford Journals - Fri, 27/03/2020 - 5:30am
Summary: We develop a fully unsupervised deconvolution method to dissect complex tissues into molecularly distinctive tissue or cell subtypes based on bulk expression profiles. We implement an R package, deconvolution by Convex Analysis of Mixtures (debCAM) that can automatically detect tissue/cell-specific markers, determine the number of constituent sub-types, calculate subtype proportions in individual samples, and estimate tissue/cell-specific expression profiles. We demonstrate the performance and biomedical utility of debCAM on gene expression, methylation, proteomics, and imaging data. With enhanced data preprocessing and prior knowledge incorporation, debCAM software tool will allow biologists to perform a more comprehensive and unbiased characterization of tissue remodeling in many biomedical contexts.
Availability and implementation: http://bioconductor.org/packages/debCAM
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends

CNV-BAC: Copy Number Variation Detection in Bacterial Circular Genome

Bioinformatics Oxford Journals - Fri, 27/03/2020 - 5:30am
Motivation: Whole genome sequencing (WGS) is widely used for copy number variation (CNV) detection. However, for most bacteria, their circular genome structure and high replication rate make reads more enriched near the replication origin. CNV detection based on read depth could be seriously influenced by such replication bias.
Results: We show that the replication bias is widespread using ~200 bacterial WGS data. We develop CNV-BAC that can properly normalize the replication bias as well as other known biases in bacterial WGS data and can accurately detect CNVs. Simulation and real data analysis show that CNV-BAC achieves the best performance in CNV detection compared with available algorithms.
Availability and implementation: CNV-BAC is available at https://github.com/XiDsLab/CNV-BAC.
Categories: Bioinformatics Trends

Characterisation of genetic regulatory effects for osteoporosis risk variants in human osteoclasts

Genome Biology - BiomedCentral - Thu, 26/03/2020 - 5:30am
Osteoporosis is a complex disease with a strong genetic contribution. A recently published genome-wide association study (GWAS) for estimated bone mineral density (eBMD) identified 1103 independent genome-wide...
Categories: Bioinformatics Trends

DeepMILO: a deep learning approach to predict the impact of non-coding sequence variants on 3D chromatin structure

Genome Biology - BiomedCentral - Thu, 26/03/2020 - 5:30am
Non-coding variants have been shown to be related to disease by alteration of 3D genome structures. We propose a deep learning method, DeepMILO, to predict the effects of variants on CTCF/cohesin-mediated insu...
Categories: Bioinformatics Trends

Diverse genetic mechanisms underlie worldwide convergent rice feralization

Genome Biology - BiomedCentral - Thu, 26/03/2020 - 5:30am
Worldwide feralization of crop species into agricultural weeds threatens global food security. Weedy rice is a feral form of rice that infests paddies worldwide and aggressively outcompetes cultivated varietie...
Categories: Bioinformatics Trends

CRiSP: Accurate Structure Prediction of Disulfide-Rich Peptides with Cystine-Specific Sequence Alignment and Machine Learning

Bioinformatics Oxford Journals - Thu, 26/03/2020 - 5:30am
Motivation: High-throughput sequencing discovers many naturally occurring disulfide-rich peptides or cystine-rich peptides (CRPs) with diversified bioactivities. However, their structure information, which is very important to peptide drug discovery, is still very limited.
Results: We have developed a CRP-specific structure prediction method called CRiSP, based on a customized template database with cystine-specific sequence alignment and three machine-learning predictors. The modeling accuracy is significantly better than several popular general-purpose structure modeling methods, and our CRiSP can provide useful model quality estimations.
Availability: The CRiSP server is freely available on the website at http://wulab.com.cn/CRISP.
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends

A Cas12a ortholog with stringent PAM recognition followed by low off-target editing rates for genome editing

Genome Biology - BiomedCentral - Wed, 25/03/2020 - 5:30am
AsCas12a and LbCas12a nucleases are reported to be promising tools for genome engineering with protospacer adjacent motif (PAM) TTTV as the optimal. However, the C-containing PAM (CTTV, TCTV, TTCV, etc.) recog...
Categories: Bioinformatics Trends

High-resolution Repli-Seq defines the temporal choreography of initiation, elongation and termination of replication in mammalian cells

Genome Biology - BiomedCentral - Tue, 24/03/2020 - 5:30am
DNA replication in mammalian cells occurs in a defined temporal order during S phase, known as the replication timing (RT) programme. Replication timing is developmentally regulated and correlated with chromat...
Categories: Bioinformatics Trends

HiChIP-Peaks: A HiChIP peak calling algorithm

Bioinformatics Oxford Journals - Tue, 24/03/2020 - 5:30am
Motivation: HiChIP is a powerful tool to interrogate 3D chromatin organization. Current tools to analyse chromatin looping mechanisms using HiChIP data require the identification of loop anchors to work properly. However, current approaches to discover these anchors from HiChIP data are not satisfactory, having either a very high false discovery rate or strong dependence on sequencing depth. Moreover, these tools do not allow quantitative comparison of peaks across different samples, failing to fully exploit the information available from HiChIP datasets.
Results: We develop a new tool based on a representation of HiChIP data centred on the re-ligation sites to identify peaks from HiChIP datasets, which can subsequently be used in other tools for loop discovery. This increases the reliability of these tools and improves recall rate as sequencing depth is reduced. We also provide a method to count reads mapping to peaks across samples, which can be used for differential peak analysis using HiChIP data.
Availability: HiChIP-Peaks is freely available at https://github.com/ChenfuShi/HiChIP_peaks
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends

Cancer subtype classification and modeling by pathway attention and propagation

Bioinformatics Oxford Journals - Tue, 24/03/2020 - 5:30am
Motivation: Biological pathways are important curated knowledge of biological processes. Thus, cancer subtype classification based on pathways will be very useful to understand differences in biological mechanisms among cancer subtypes. However, pathways include only a fraction of the entire gene set (only 1/3 of human genes in KEGG), and pathways are fragmented. For this reason, there are few computational methods that use pathways for cancer subtype classification.
Results: We present an explainable deep learning model with attention mechanism and network propagation for cancer subtype classification. Each pathway is modeled by a graph convolutional network. Then, a multi-attention based ensemble model combines several hundred pathways in an explainable manner. Lastly, network propagation on the pathway-gene network explains why gene expression profiles in subtypes are different. In experiments with five TCGA cancer data sets, our method achieved very good classification accuracies and, additionally, identified subtype-specific pathways and biological functions.
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends

Brewery: Deep Learning and deeper profiles for the prediction of 1D protein structure annotations

Bioinformatics Oxford Journals - Tue, 24/03/2020 - 5:30am
Motivation: Protein Structural Annotations are essential abstractions to deal with the prediction of Protein Structures. Many increasingly sophisticated Protein Structural Annotations have been devised in the last few decades. However, the need for annotations that are easy to compute, process and predict has not diminished. This is especially true for protein structures that are hardest to predict, such as novel folds.
Results: We propose Brewery, a suite of ab initio predictors of 1D Protein Structural Annotations. Brewery uses multiple sources of evolutionary information to achieve state-of-the-art predictions of Secondary Structure, Structural Motifs, Relative Solvent Accessibility and Contact Density.
Availability: The web server, standalone program, Docker image and training sets of Brewery are available at http://distilldeep.ucd.ie/brewery/.
Categories: Bioinformatics Trends

Annotation of tandem mass spectrometry data using stochastic neural networks in shotgun proteomics

Bioinformatics Oxford Journals - Tue, 24/03/2020 - 5:30am
Motivation: The ability of score functions to separate correct from incorrect peptide-spectrum matches in database-searching-based spectrum identification is hindered by many superfluous peaks belonging to unexpected fragmentation ions or by missing peaks of anticipated fragmentation ions.
Results: Here, we present a new method, called BoltzMatch, to learn score functions using a particular type of stochastic neural network, called restricted Boltzmann machines, in order to enhance their discrimination ability. BoltzMatch learns chemically explainable patterns among peak pairs in the spectrum data, and it can augment peaks depending on their semantic context or even reconstruct missing peaks of expected ions during its internal scoring mechanism. As a result, BoltzMatch achieved 50% and 33% more annotations on high- and low-resolution MS2 data than XCorr at a 0.1% false discovery rate in our benchmark; conversely, XCorr yielded the same number of spectrum annotations as BoltzMatch, albeit with 4-6 times more errors. In addition, BoltzMatch alone yields 14% more annotations than Prosit (which runs with Percolator), and BoltzMatch with Percolator yields 32% more annotations than Prosit at 0.1% FDR level in our benchmark.
Availability: BoltzMatch is freely available at: https://github.com/kfattila/BoltzMatch
Supporting information: Supplementary materials are available at Bioinformatics Online.
Categories: Bioinformatics Trends

Automatic identification of relevant genes from low-dimensional embeddings of single cell RNAseq data

Bioinformatics Oxford Journals - Tue, 24/03/2020 - 5:30am
Dimensionality reduction is a key step in the analysis of single-cell RNA sequencing data. It produces a low-dimensional embedding for visualization and as a calculation base for downstream analysis. Nonlinear techniques are most suitable to handle the intrinsic complexity of large, heterogeneous single cell data. However, with no linear relation between gene and embedding coordinate, there is no way to extract the identity of genes driving any cell’s position in the low-dimensional embedding, making it more difficult to characterize the underlying biological processes.
In this paper, we introduce the concepts of local and global gene relevance to compute an equivalent of principal component analysis loadings for non-linear low-dimensional embeddings. Global gene relevance identifies drivers of the overall embedding, while local gene relevance identifies those of a defined subregion. We apply our method to single-cell RNAseq datasets from different experimental protocols and to different low dimensional embedding techniques. This shows our method’s versatility to identify key genes for a variety of biological processes.
To ensure reproducibility and ease of use, our method is released as part of destiny 3.0, a popular R package for building diffusion maps from single-cell transcriptomic data. It is readily available through Bioconductor.
Categories: Bioinformatics Trends

Resolving single-cell heterogeneity from hundreds of thousands of cells through sequential hybrid clustering and NMF

Bioinformatics Oxford Journals - Tue, 24/03/2020 - 5:30am
Motivation: The rapid proliferation of single-cell RNA-Sequencing (scRNA-Seq) technologies has spurred the development of diverse computational approaches to detect transcriptionally coherent populations. While the complexity of the algorithms for detecting heterogeneity has increased, most require significant user-tuning, are heavily reliant on dimension reduction techniques and are not scalable to ultra-large datasets. We previously described a multi-step algorithm, Iterative Clustering and Guide-gene selection (ICGS), which applies intra-gene correlation and hybrid clustering to uniquely resolve novel transcriptionally coherent cell populations from an intuitive graphical user interface.
Results: We describe a new iteration of ICGS that outperforms state-of-the-art scRNA-Seq detection workflows when applied to well-established benchmarks. This approach combines multiple complementary subtype detection methods (HOPACH, sparse-NMF, cluster “fitness”, SVM) to resolve rare and common cell-states, while minimizing differences due to donor or batch effects. Using data from multiple cell atlases, we show that the PageRank algorithm effectively down-samples ultra-large scRNA-Seq datasets, without losing extremely rare or transcriptionally similar yet distinct cell-types and while recovering novel transcriptionally distinct cell populations. We believe this new approach holds tremendous promise in reproducibly resolving hidden cell populations in complex datasets.
Availability and implementation: ICGS2 is implemented in Python. The source code and documentation are available at: http://altanalyze.org.
Supplementary information: Supplementary data are available at Bioinformatics online.
Categories: Bioinformatics Trends
