https://www.biorxiv.org/content/10.1101/2023.09.24.559168v1

 

GET: a foundation model of transcription across human cell types


 


 

Summary

The paper "GET: a foundation model of transcription across human cell types" introduces the General Expression Transformer (GET), a computational model of transcriptional regulation across a broad range of human fetal and adult cell types. The key points are summarized below:

 

 

Key Findings

  1. Model Overview: GET is a foundation model that leverages chromatin accessibility data and sequence information to learn transcriptional regulatory syntax, allowing it to predict gene expression across both seen and unseen cell types with high accuracy.
  2. Comprehensive Data Utilization: The model uses data from about 213 human cell types, combining single-cell ATAC-seq (for chromatin accessibility) and RNA-seq data to train and fine-tune its predictions.
  3. Advanced Predictive Abilities: GET adapts to new cell types and conditions while keeping its gene expression predictions accurate. It outperforms existing models in identifying cis-regulatory elements and upstream regulators, and in predicting lentivirus-based massively parallel reporter assay (MPRA) readouts.
  4. Interpretable Insights: By interpreting GET, researchers can uncover detailed regulatory insights and transcription factor interactions, providing valuable information for nearly every gene in the studied cell types.
  5. Practical Applications: The model's capabilities extend to identifying distal regulatory regions missed by other models, understanding specific transcription factor interactions related to diseases like lymphoma, and designing synthetic biology applications.
  6. Open Access and Integration: The GET model, along with its training and application data, is made available to the scientific community, ensuring that it can be used as a powerful tool in further research.

Overall, GET is a significant advance in computational biology: a single model for studying gene regulation that carries over to cell types it was not trained on.

 
The GET model’s universal applicability and exceptional accuracy
 
 
Key Words
  • Transcriptional regulation
  • Gene expression
  • Computational model
  • Foundation model
  • Human cell types
  • Chromatin accessibility
  • RNA sequencing (RNA-seq)
  • Single-cell ATAC-seq
  • Gene regulation
  • Transcription factors
  • Machine learning
  • Predictive modeling
  • Regulatory elements

https://genomebiology.biomedcentral.com/articles/10.1186/s13059-024-03217-7

 

Bento: a toolkit for subcellular analysis of spatial transcriptomics data - Genome Biology


 


 

Summary

The paper "Bento: a toolkit for subcellular analysis of spatial transcriptomics data" presents Bento, a Python toolkit for analyzing spatial transcriptomics data at the subcellular level. The key findings and components of the study are summarized below:

 

 

Key Findings

  1. Bento Toolkit: Bento facilitates subcellular analysis by ingesting molecular coordinates and segmentation boundaries. It enables defining subcellular domains, annotating localization patterns, and quantifying gene-gene colocalization.
  2. Integration with Existing Tools: Bento is part of the scverse ecosystem, so it works alongside other single-cell analysis tools such as Scanpy and Squidpy.
  3. Functional Demonstrations: The toolkit's utility is demonstrated on datasets from several platforms, including MERFISH, seqFISH+, and Molecular Cartography, where it is used to characterize subcellular components and interactions.
  4. Novel Analyses Introduced: The study introduces three novel subcellular analyses:
    • RNAforest: A method for annotating RNA subcellular localization patterns using a multilabel classification approach.
    • RNAcoloc: A technique for quantifying RNA colocalization in a compartment-specific manner, leveraging the Colocation Quotient metric.
    • RNAflux: An unsupervised method for semantic segmentation of subcellular domains, identifying and characterizing consistent subcellular domains across cells.
  5. Application Examples: Bento's capabilities are showcased in several scenarios, including analyzing localization changes in human iPSC-derived cardiomyocytes upon drug treatment, highlighting its potential in biomedical research.
  6. Versatility and Scalability: Bento is designed to be both versatile and scalable, capable of handling diverse types of spatial transcriptomics data and integrating with various data analysis platforms.
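To make the colocation quotient mentioned under RNAcoloc more concrete, here is a minimal sketch of its basic nearest-neighbor form: the fraction of source-type points whose nearest neighbor is of the target type, divided by the fraction expected if types were assigned at random. This is an illustration only, not Bento's implementation. The function name `colocation_quotient`, the toy coordinates, and the labels `A`, `B`, `C` are all hypothetical, and Bento's compartment-specific version may define neighborhoods differently.

```python
import math
from collections import Counter


def colocation_quotient(points, labels, source, target):
    """Nearest-neighbor colocation quotient CLQ(source -> target).

    Hypothetical helper for illustration; not part of Bento's API.
    Values above 1 suggest the target type sits near the source type
    more often than expected by chance; values below 1 suggest less often.
    """
    n = len(points)
    counts = Counter(labels)
    n_source = counts[source]
    # When source == target, a point cannot be its own neighbor,
    # so the pool of possible targets shrinks by one.
    n_target = counts[target] - (1 if source == target else 0)
    if n < 2 or n_source == 0 or n_target <= 0:
        return float("nan")

    hits = 0
    for i, (point, label) in enumerate(zip(points, labels)):
        if label != source:
            continue
        # Nearest neighbor among all other points (ties go to the lowest index).
        nearest = min(
            (k for k in range(n) if k != i),
            key=lambda k: math.dist(point, points[k]),
        )
        if labels[nearest] == target:
            hits += 1

    observed = hits / n_source      # share of source points whose neighbor is target
    expected = n_target / (n - 1)   # share expected under random labeling
    return observed / expected


# Toy molecule coordinates inside one cell (made-up values).
points = [(0, 0), (0.1, 0), (5, 5), (5.1, 5), (10, 0), (10, 0.1)]
labels = ["A", "B", "A", "B", "C", "C"]

clq_ab = colocation_quotient(points, labels, "A", "B")
clq_ac = colocation_quotient(points, labels, "A", "C")
print(clq_ab, clq_ac)
```

In this toy layout every `A` molecule sits next to a `B` molecule, so the `A` to `B` quotient is well above 1, while the `A` to `C` quotient is 0. Note that the measure is asymmetric: the quotient from `A` to `B` need not equal the one from `B` to `A`.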

Overall, Bento gives researchers a toolkit for detailed subcellular analysis of spatial transcriptomics data, supporting the study of how molecules are organized and interact within cells.
