January 30, 2024 PAO-01-24-NI-03
The human body comprises many different tissues and organs, each of which is composed of a diversity of cell types that perform myriad functions. In addition, cells located within different regions of a given tissue or organ exhibit different properties and functions, which are in part determined by their location, context, and interactions with surrounding cells.1 Examples of observable differences on the genomic level include chromosome structures, nuclear bodies, chromatin states, and gene expression.2 Tumors, meanwhile, are also heterogeneous with respect to cellular makeup and functionality, and the interactions of tumor cells with each other and the tumor microenvironment can be dependent on their locations.3
Understanding the genetic makeup at the single-cell level and the behavior of that genetic material, therefore, only provides a portion of the information needed to truly understand biological activity.1 Placing that information in a spatial context can reveal in-depth biological insights regarding the interactions between cells and, in the case of tumors, the tumor microenvironment. Making the most of this approach requires detailed information regarding the location of cells within tissues and the location of genetic sequences within the complex three-dimensional structure of the genome of individual cells.4
In a general sense, spatial genomics is the study of the three-dimensional structure of the genome.4 Analyses take place within cells in specific locations, making it possible to observe differences in these structures across cell types and cell states. That information can be translated into a view of the differentiating features of the structure, accessibility, and ultimately function of each cell’s genome and its expression and be related to disease pathways and much more.
The study of spatial genomics or spatial transcriptomics is made possible by advances in imaging and synthetic biology technologies combined with evolving computational data analysis capabilities. Because structure determines function in molecular biology, spatial genomics is needed to provide information about form and gene expression, both of which are essential to proper health and functioning of the human body.5
Spatial genomics provides information on “the relationship between differential gene expression, genome organization, and the effect of epigenomes while maintaining the positional context.” The emerging discipline integrates several different types of data, including data related to DNA, RNA, and other molecules in the cell, which are collected on cells within intact tissues, thus allowing the connection to be made between a particular epigenotype and its phenotype.
The term “spatial genomics” first appeared in 2009, and work in this area has rapidly evolved since then. In fact, in 2020, Nature Methods selected spatially resolved transcriptomics as the experimental method of the year.6 Researchers leverage immunofluorescence techniques (labeling of proteins or RNA), gene expression, and next-generation sequencing to understand the activity of a particular cell in a specific tissue and how cells interact with one another and to identify specific cellular and molecular processes.5 The approach allows for single-cell data to be collected with knowledge of the cellular environment of that individual cell, which is critical knowledge that is lacking from traditional genomics methods.
There is some discussion over what exactly should be considered spatial genomics. It has been proposed for clarification that spatial genomics be limited to the study of how genome sequences and 3D structures (spatial positioning of the genome in the nucleus) vary between cells and cell types in their native tissue context, while the study of the spatial positioning of chromatin in the nucleus and relative to nuclear landmarks should be considered as a separate field — 3D genomics.7 In this approach, spatial transcriptomics is considered yet another area of research.
Spatial transcriptomics techniques, including high-resolution, image-based in situ hybridization (ISH), in situ sequencing (ISS), and in situ capturing, provide localized information on gene expression within different cells.5 Single-cell analyses are combined with sequencing technologies and barcoding of mRNA transcripts to obtain detailed information within the relevant spatial context.
To collect spatial genomic data, spatially localized probes are used to bind to nucleic acids in tissue sections, or photoactive in situ hybridization probes are used for direct detection within specific regions of tissues. Detected DNA sequences are identified and analyzed with respect to expression levels. Anywhere from 1,000 to over 20,000 protein-encoding genes can be quantitatively analyzed in a single experiment, and the data used to map the transcriptional activity of specific tissue sections. Integrated spatial genomics includes additional information obtained using a variety of other analytical methods that provide epigenomic, genomic, and proteomic information at the cellular level.8
Tissue-sectioning technologies can be important to successful spatial genomics analyses. One recent advance in this area is laser capture microdissection (LCM), which allows collection of very thin subsections (60–700 μm in diameter) from an initial tissue section, thereby allowing access to cells from a highly specific region of the tissue and investigation of transcriptomes on the cellular level.1
Sectioning can be avoided through the use of in situ hybridization techniques, in which RNA molecules can be visualized using complementary fluorescent probes. Two examples of low-throughput methods are single-molecule fluorescent in situ hybridization (smFISH) and ouroboros smFISH (osmFISH).1 Sequential FISH (seqFISH) leverages RNA barcoding and multiple rounds of hybridization to reduce spectral overlap, but it is time consuming and can lead to the compounding of errors.
Multiplexed error-robust FISH (MERFISH) is an smFISH method that includes combinatorial labeling, successive rounds of sequential hybridization imaging, and error-robust encoding.1 It is more time-efficient than seqFISH and can be used for high-throughput analysis (simultaneous detection of up to 10,000 RNA molecules). This technique is offered commercially as the MERSCOPE from Vizgen and in the form of the GenePS platform from Spatial Genomics.
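To illustrate what error-robust encoding means in practice, the following is a minimal Python sketch rather than any vendor's implementation. It assumes a hypothetical four-gene codebook of 8-bit binary barcodes (one bit per imaging round) in which every pair of barcodes differs in at least four positions. Under that assumption, a single misread bit can be corrected, while a two-bit error is detected and the readout is rejected. The gene names, barcodes, and function names are illustrative only.

```python
from itertools import combinations
from typing import Optional

# Hypothetical codebook: one bit per imaging round (1 = spot seen in that round).
CODEBOOK = {
    "GENE_A": "11110000",
    "GENE_B": "00001111",
    "GENE_C": "11001100",
    "GENE_D": "00110011",
}


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_distance(codebook: dict) -> int:
    """Smallest pairwise Hamming distance between any two barcodes."""
    return min(hamming(a, b) for a, b in combinations(codebook.values(), 2))


def decode(readout: str, codebook: dict = CODEBOOK, max_errors: int = 1) -> Optional[str]:
    """Return the gene whose barcode is within max_errors bits of the readout.

    Returns None when no barcode is close enough, i.e. the readout is rejected.
    """
    best_gene, best_dist = None, max_errors + 1
    for gene, barcode in codebook.items():
        d = hamming(readout, barcode)
        if d < best_dist:
            best_gene, best_dist = gene, d
    return best_gene


print(min_distance(CODEBOOK))  # minimum pairwise distance of this codebook
print(decode("11110100"))      # one bit flipped relative to GENE_A
print(decode("11111100"))      # two bits away from both GENE_A and GENE_C
```

Because the minimum distance here is 4, no readout can sit within one bit of two different barcodes, which is why single-bit correction is unambiguous in this toy example.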
In situ sequencing (ISS) methods use single-stranded DNA padlock probes with sequences complementary to cDNA (generated by reverse transcription of mRNA molecules) using gap-targeted or barcode-targeted sequencing.1 They are advantageous because they do not require transcript extraction. Because of the need for the padlock probes, however, ISS typically requires prior knowledge of the tissue being evaluated. Fluorescent in situ RNA sequencing (FISSEQ) was developed to analyze tissues with no previous data. Expansion sequencing (ExSeq) is available for both targeted and non-targeted detection of RNA molecules.
Spatial barcoding involves binding probes to mRNA that contain spatial barcodes and unique molecular identifiers (UMIs) representing the coordinates of each array.1 Reverse transcription and enzymatic tissue removal generate cDNA hybridized with nucleotides that can be visualized. The company 10x Genomics offers a commercial variation of this technology with a resolution of 55 μm under the name 10x Genomics Visium. There are too many methods to review them all, and new versions continue to be introduced that achieve better resolution with simpler protocols. Some techniques available today can provide subcellular resolutions and thus highly refined spatial distribution information.
An advantage of spatial barcoding is the ability to collect gene expression and spatial location information at the same time.1 Choosing the appropriate method with the right resolution for any given experiment is essential, though, as low resolution might not provide sufficient information, while high resolution can make the analysis too complicated. Even so, spatial barcoding techniques are widely used to learn about tissue and tumor structures.
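The bookkeeping behind spatial barcoding can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a real pipeline: each sequenced read is assumed to carry a spatial barcode, a UMI, and a gene assignment; the barcode-to-coordinate table, the barcode strings, and the function name `count_transcripts` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical lookup: spatial barcode -> (row, col) of the capture spot on the array.
SPOT_COORDS = {
    "AAAC": (0, 0),
    "CCGT": (0, 1),
    "GGTA": (1, 0),
}

# Hypothetical sequenced reads: (spatial barcode, UMI, gene).
READS = [
    ("AAAC", "UMI1", "ACTB"),
    ("AAAC", "UMI1", "ACTB"),   # duplicate of the same molecule: same UMI, counted once
    ("AAAC", "UMI2", "ACTB"),
    ("CCGT", "UMI3", "GAPDH"),
    ("TTTT", "UMI4", "ACTB"),   # barcode not on the array: discarded
]


def count_transcripts(reads, spot_coords):
    """Map reads to spot coordinates and count unique UMIs per (spot, gene)."""
    umis = defaultdict(set)
    for barcode, umi, gene in reads:
        coord = spot_coords.get(barcode)
        if coord is None:
            continue  # read does not match any known spot
        umis[(coord, gene)].add(umi)
    return {key: len(seen) for key, seen in umis.items()}


counts = count_transcripts(READS, SPOT_COORDS)
for (coord, gene), n in sorted(counts.items()):
    print(coord, gene, n)
```

The spatial barcode supplies the location and the UMI supplies the molecule count, which is how expression and position are obtained from the same read.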
Initial spatial genomics analyses based on FISH methods involving direct visualization using a fluorescence microscope were not suited for high-throughput application. Different approaches have been developed to overcome the limitations of these techniques, with combinatorial barcoding as leveraged in DNA MERFISH and DNA seqFISH+ (versions of RNA techniques described above) opening the door to rapidly analyzing hundreds to thousands of targets.
These methods can be expensive, as the number of probes required increases with the number of targets. Fortunately, advances are being made in this area as well.4 Technologies such as Oligonucleotide Fluorescent In Situ Sequencing (OligoFISSEQ) combine sequencing-by-synthesis, sequencing-by-ligation, and sequencing-by-hybridization methods to read spatial information from barcoded probes.
Initially, probes were prepared via column-based synthesis, random polymerase chain reaction (PCR) amplification, or fragmentation of clone-based DNA.4 These costly methods have since been replaced with massively parallel DNA synthesis. In some cases (such as the platform from Twist Bioscience), DNA can be synthesized at a rate of 1 million 300-base oligos per chip while maintaining precision and uniformity. This type of rapid production is necessary to support spatial genomics analyses using combinatorial barcoding and in situ sequencing methods, which can require up to thousands of ~150-bp oligo probes per genomic target.
Other advances including high-definition DNA FISH (HD-FISH) and Oligopaint FISH offer more cost-effective means for generating FISH probes.5 Today, high-throughput multiplexed FISH methods leveraging microfluidic systems for sequential rounds of hybridization and imaging can visualize many DNA loci across hundreds to thousands of single cells. Researchers also have access to computational tools (OligoMiner, iFISH, ProbeDealer, Chorus2, and PaintSHOP) for designing oligo probes.
For integrated spatial genomics, many different techniques are combined to generate comprehensive genetic information. For instance, DNA seqFISH+, RNA and intron seqFISH, and multiplexed immunofluorescence can be used to generate a super-resolved localization image, identify all DNA loci, and assign the chromosomal identity to each locus.9
The information gained from spatial genomics and transcriptomics studies can help link structure to function in both healthy and diseased tissues, providing new insights into physiological activities related to diseases.1 Information about cells within tissues and how they communicate with one another can help researchers better understand highly complex organs, such as the brain, and the clinical significance of changes within them. Better understanding of disease microenvironments and observation of changes over time during disease progression can help identify new drug targets.
Integrated spatial genomics in particular can provide wide-ranging information to support disease research, including “insights into spatiotemporal patterns, marker genes, cellular interaction networks and developmental trajectories.”1
The massive quantities of complex data collected during integrated spatial genomics studies can only be processed using advanced bioinformatics tools. Typical analyses include identification of spatially variable genes, clustering analysis, spatial decomposition and gene imputation, identification of cellular interactions, identification of spatial copy number variation and region annotation, and determination of spatial trajectories.1
These various methods typically involve a preprocessing step followed by downstream analysis. Preprocessing of the data helps ensure higher quality results by normalizing data and removing low-quality spots and genes.1 Specific parameters for preprocessing are generally modified depending on the tissue sample, the goals of the analysis, and so on. Downstream processing involves numerous types of analyses to identify different spots and genes, gene expression patterns, and many other features of the tissue sample.
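The preprocessing step described above can be sketched as follows. This is a minimal, dependency-free Python illustration under stated assumptions, not the procedure of any particular software package: the thresholds (`min_spot_counts`, `min_gene_spots`), the scaling factor of 10,000, the log transform, and the toy count table are all assumptions chosen for the example, and in practice such parameters are tuned per tissue and per study.

```python
import math


def preprocess(counts, min_spot_counts=10, min_gene_spots=2, scale=10_000):
    """Filter low-quality spots and genes, then normalize each spot.

    counts: dict mapping spot id -> dict mapping gene -> raw count.
    Returns the same structure with log1p(count / spot_total * scale) values.
    """
    # 1. Drop spots whose total raw count is below the threshold.
    kept = {s: g for s, g in counts.items() if sum(g.values()) >= min_spot_counts}

    # 2. Drop genes detected in too few of the remaining spots.
    detected = {}
    for genes in kept.values():
        for gene, n in genes.items():
            if n > 0:
                detected[gene] = detected.get(gene, 0) + 1
    good_genes = {g for g, n in detected.items() if n >= min_gene_spots}

    # 3. Normalize each spot by its total over the retained genes, then log-transform.
    result = {}
    for spot, genes in kept.items():
        filtered = {g: n for g, n in genes.items() if g in good_genes}
        total = sum(filtered.values())
        if total == 0:
            continue  # nothing left to normalize for this spot
        result[spot] = {g: math.log1p(n / total * scale) for g, n in filtered.items()}
    return result


# Hypothetical raw counts for three spots.
RAW = {
    "spot1": {"ACTB": 30, "GAPDH": 10, "RARE1": 1},
    "spot2": {"ACTB": 20, "GAPDH": 20},
    "spot3": {"ACTB": 2},  # too few counts: removed as a low-quality spot
}

normalized = preprocess(RAW)
print(sorted(normalized))           # spots that survive filtering
print(sorted(normalized["spot1"]))  # genes that survive filtering
```

Normalizing by each spot's total makes spots with different capture efficiency comparable before any downstream analysis is run.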
Given that integrated spatial genomics involves the simultaneous collection of large quantities of many different types of genetic and related, complementary data, there is great need for effective data integration solutions.10 Such solutions must align the data collected using hybridization, multiplexed fluorescence imaging, and in situ sequencing methods, among other omics techniques.1 Doing so would provide much more comprehensive biological insights than can be obtained by single analyses performed independently. Currently, however, multiple software packages must be used in a linked fashion, as there are few tools available that provide truly integrated analysis solutions.
Spatial PrOtein and Transcriptome Sequencing (SPOTS) is a new method developed by researchers at Weill Cornell Medicine, New York-Presbyterian, and the New York Genome Center for the creation of data-rich maps of organs and tumors, including information about cell types, cell activities, and cell–cell interactions.11 The technique integrates spatial transcriptomics with protein markers to provide improved resolution of differential gene expression analysis across tissue regions.
The method builds on technology from 10x Genomics but uses slides with probes containing molecular positional barcodes that bind to mRNA and designer antibodies that bind to both proteins of interest and the probe molecules. The locations of the mRNAs and proteins can then be precisely mapped, and the maps compared with traditional pathology images of the sample. While the initial resolution of the technique is at least several cells, the researchers are working to reduce that to single-cell capability.
The field of integrated spatial genomics is catching on as researchers realize the tremendous benefits of combining single-cell data regarding gene expression and cellular interactions for cells of both the same and different types with spatial context. Spatial transcriptomics by itself is advancing at breakneck speed as researchers seek to address limitations in resolution, sensitivity, and throughput — as well as integration with various spatial genomic technologies.
Further advances will make it possible to achieve not only high resolution and sensitivity but analysis of larger-scale tissue specimens that more accurately reflect organ structures.1 Similarly, advances in bioinformatics tools will enable more rapid and accurate analyses of ever more complex integrated data sets. Greater data integration will also facilitate progress in integrated spatial genomics.
Currently, spatially resolved transcriptomics technology lies at the heart of tissue-mapping efforts by groups such as the Human Cell Atlas12 and the BRAIN Initiative Cell Census Network (BICCN).13 One market research firm estimates the global spatial genomics and transcriptomics market to be expanding at a compound annual growth rate of 10.8%, from $1.5 billion in 2022 to $2.5 billion by 2027.14 The seemingly limitless potential of the field to provide novel biological insights of value in drug discovery and development and related applications has clearly been recognized, and exciting advances can be anticipated in the near and longer terms.
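As a quick arithmetic check on the market figures quoted above, the short Python sketch below applies the standard compound annual growth rate formula. It assumes five compounding years between 2022 and 2027; the function names are illustrative.

```python
def project(start_value, annual_rate, years):
    """Value after compounding start_value at annual_rate for the given years."""
    return start_value * (1 + annual_rate) ** years


def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by a start value and an end value."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the text, in billions of US dollars; five years is an assumption.
projected_2027 = project(1.5, 0.108, 5)
implied_rate = cagr(1.5, 2.5, 5)

print(f"Projected 2027 market: ${projected_2027:.2f} billion")
print(f"Implied CAGR: {implied_rate:.1%}")
```

Under that assumption, the quoted growth rate and the two market sizes are mutually consistent to within rounding.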
Dr. Challener is an established industry editor and technical writing expert in the areas of chemistry and pharmaceuticals. She writes for various corporations and associations, as well as marketing agencies and research organizations, including That’s Nice and Nice Insight.