Title: | Visualization and Analysis of Spatial Heterogeneity in Spatially-Resolved Gene Expression |
Version: | 1.2.2 |
Description: | Visualization and analysis of spatially resolved transcriptomics data. The 'spatialGE' R package provides methods for visualizing and analyzing spatially resolved transcriptomics data, such as 10X Visium, CosMx, or csv/tsv gene expression matrices. It includes tools for spatial interpolation, autocorrelation analysis, tissue domain detection, gene set enrichment, and differential expression analysis using spatial mixed models. |
License: | MIT + file LICENSE |
Copyright: | ins/COPYRIGHT |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
Imports: | arrow, BiocParallel, concaveman, ComplexHeatmap, data.table, DelayedArray, DelayedMatrixStats, dynamicTreeCut, dplyr, EBImage, ggforce, ggplot2, ggpolypath, ggrepel, gstat, GSVA, hdf5r, jpeg, jsonlite, khroma (≥ 1.6.0), magrittr, Matrix, MASS, methods, parallel, png, RColorBrewer, Rcpp (≥ 1.0.7), readr, readxl, rlang, scales, sctransform, sfsmisc, sf, sp, spaMM, spdep, stats, stringr, tibble, tidyr, utils, uwot, wordspace |
LinkingTo: | Rcpp, RcppEigen, RcppProgress |
Suggests: | curl, geoR, ggpubr, httr, janitor, kableExtra, knitr, msigdbr, progress, rmarkdown, rpart, R.utils, scSpatialSIM, spatstat, SeuratObject, tidyverse, testthat (≥ 3.0.0) |
VignetteBuilder: | knitr |
Config/testthat/edition: | 3 |
NeedsCompilation: | yes |
Packaged: | 2025-06-03 14:47:52 UTC; oospina1 |
Author: | Oscar Ospina |
Maintainer: | Oscar Ospina <oscar.ospina@jhmi.edu> |
Repository: | CRAN |
Date/Publication: | 2025-06-04 07:50:02 UTC |
STclust: Detect clusters of spots/cells
Description
Perform unsupervised spatially-informed clustering on the spots/cells of a ST sample
Usage
STclust(
x = NULL,
samples = NULL,
ws = 0.025,
dist_metric = "euclidean",
linkage = "ward.D2",
ks = "dtc",
topgenes = 2000,
deepSplit = FALSE,
cores = NULL,
verbose = TRUE
)
Arguments
x |
an STlist with normalized expression data |
samples |
a vector with strings or a vector with integers indicating the samples to run STclust |
ws |
a double (0-1) indicating the weight to be applied to spatial distances. Defaults to 0.025 |
dist_metric |
the distance metric to be used. Defaults to 'euclidean'. Other
options are the same as in |
linkage |
the linkage method applied to hierarchical clustering. Passed to
|
ks |
the range of k values to assess. Defaults to |
topgenes |
the number of genes with highest spot-to-spot expression variation. The
variance is calculated via |
deepSplit |
a logical or integer (1-4), to be passed to |
cores |
an integer indicating the number of cores to use in parallelization (Unix only) |
verbose |
either logical or an integer (0, 1, or 2) to increase verbosity |
Details
The function takes an STlist and calculates euclidean distances between cells or spots
based on the x,y spatial locations, and the expression of the top variable genes
(Seurat::FindVariableFeatures
). The resulting distances are weighted by
applying 1-ws
to the gene expression distances and ws
to the spatial distances.
Hierarchical clustering is performed on the sum of the weighted distance matrices.
The STclust
method allows for identification of tissue niches/domains that are
spatially cohesive.
Value
an STlist with cluster assignments
Examples
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
#' zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
unzip(zipfile=zip_tmp, exdir=thrane_tmp)
# Generate the file paths to be passed to the STlist function
count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='counts')
coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='mapping')
clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='clinical')
# Create STlist
library('spatialGE')
melanoma <- STlist(rnacounts=count_files,
spotcoords=coord_files,
samples=clin_file)
melanoma <- transform_data(melanoma)
melanoma <- STclust(melanoma, ws=c(0, 0.025))
STplot(melanoma, ws=0.025, samples='ST_mel1_rep2', ptsize=1)
}, error = function(e) {
message("Could not run example. Are you connected to the internet?")
return(NULL)
})
STdiff: Differential gene expression analysis for spatial transcriptomics data
Description
Tests for differentially expressed genes using linear models with or without spatial covariance structures
Usage
STdiff(
x = NULL,
samples = NULL,
annot = NULL,
w = NULL,
k = NULL,
deepSplit = NULL,
topgenes = 5000,
pval_thr = 0.05,
pval_adj = "fdr",
test_type = "mm",
sp_topgenes = 0.2,
clusters = NULL,
pairwise = FALSE,
verbose = 1L,
cores = NULL
)
Arguments
x |
an STlist |
samples |
an integer indicating the spatial samples to be included in the DE tests.
Numbers follow the order in |
annot |
a column name in |
w |
the spatial weight used in STclust. Required if |
k |
the k value used in STclust, or |
deepSplit |
the deepSplit value if used in STclust. Required if |
topgenes |
an integer indicating the top variable genes to select from each sample based on variance (default=5000). If NULL, all genes are selected. |
pval_thr |
cut-off of adjusted p-values to define differentially expressed genes from
non-spatial linear models. A proportion of genes ( |
pval_adj |
Method to adjust p-values. Defaults to |
test_type |
one of |
sp_topgenes |
Proportion of differentially expressed genes from non-spatial
linear models (and controlled by |
clusters |
cluster name(s) to test DE genes, as opposed to all clusters. |
pairwise |
whether or not to carry tests on a pairwise manner. The default is
|
verbose |
either logical or an integer (0, 1, or 2) to increase verbosity |
cores |
Number of cores to use in parallelization. If |
Details
The method tests for differentially expressed genes between groups of spots/cells
(e.g., clusters) in a spatial transcriptomics sample. Specifically, the function
tests for genes with significantly higher or lower gene expression in one group of
spots/cells with respect to the rest of spots/cells in the sample. The method first
runs non-spatial linear models on the genes to detect differentially expressed genes.
Then spatial linear models with exponential covariance structure are fit on a
subset of genes detected as differentially expressed by the non-linear models (sp_topgenes
).
If running on clusters detected via STclust, the user can specify the assignments
using the same parameters (w
, k
, deepSplit
). Otherwise, the assignments are
specified by indicating one of the column names in x@spatial_meta
. The function
uses spaMM::fitme
and is computationally expensive even on HPC environments.
To run the STdiff using the non-spatial approach (faster), set sp_topgenes=0
.
Value
a list with one data frame per sample with results of differential gene expression analysis
STdiff_volcano: Generates volcano plots from STdiff results
Description
Generates volcano plots of differential expression results from STdiff
Usage
STdiff_volcano(
x = NULL,
samples = NULL,
clusters = NULL,
pval_thr = 0.05,
color_pal = NULL
)
Arguments
x |
the output of |
samples |
samples to create plots |
clusters |
names of the clusters to generate comparisons |
pval_thr |
the p-value threshold to color genes with differential expression |
color_pal |
the palette to color genes by significance |
Details
The function generated volcano plots (p-value vs. log-fold change) for
genes tested with STdiff
. Colors can be customized to show significance from
spatial and non-spatial models
Value
a list of ggplot objects
STenrich
Description
Test for spatial enrichment of gene expression sets in ST data sets
Usage
STenrich(
x = NULL,
samples = NULL,
gene_sets = NULL,
score_type = "avg",
reps = 1000,
annot = NULL,
domain = NULL,
num_sds = 1,
min_units = 20,
min_genes = 5,
pval_adj_method = "BH",
seed = 12345,
cores = NULL,
verbose = TRUE
)
Arguments
x |
an STlist with transformed gene expression |
samples |
a vector with sample names or indexes to run analysis |
gene_sets |
a named list of gene sets to test. The names of the list should identify the gene sets to be tested |
score_type |
Controls how gene set expression is calculated. The options are the average expression among genes in a set ('avg'), or a GSEA score ('gsva'). The default is 'avg' |
reps |
the number of random samples to be extracted. Default is 1000 replicates |
annot |
name of the annotation within |
domain |
the domain to restrict the analysis. Must exist within the spot/cell
categories included in the selected annotation (i.e., |
num_sds |
the number of standard deviations to set the minimum gene set expression threshold. Default is one (1) standard deviation |
min_units |
Minimum number of spots with high expression of a pathway for that gene set to be considered in the analysis. Defaults to 20 spots or cells |
min_genes |
the minimum number of genes of a gene set present in the data set for that gene set to be included. Default is 5 genes |
pval_adj_method |
the method for multiple comparison adjustment of p-values.
Options are the same as that of |
seed |
the seed number for the selection of random samples. Default is 12345 |
cores |
the number of cores used during parallelization. If NULL (default), the number of cores is defined automatically |
verbose |
either logical or an integer (0, 1, or 2) to increase verbosity |
Details
The function performs a randomization test to assess if the sum of
distances between cells/spots with high expression of a gene set is lower than
the sum of distances among randomly selected cells/spots. The cells/spots are
considered as having high gene set expression if the average expression of genes in a
set is higher than the average expression plus num_sds
times the standard deviation.
Control over the size of regions with high expression is provided by setting the
minimum number of cells/spots (min_units
). This method is a modification of
the method devised by Hunter et al. 2021 (zebrafish melanoma study).
Value
a list of data frames with the results of the test
STgradient: Tests of gene expression spatial gradients
Description
Calculates Spearman's coefficients to detect genes showing expression spatial gradients
Usage
STgradient(
x = NULL,
samples = NULL,
topgenes = 2000,
annot = NULL,
ref = NULL,
exclude = NULL,
out_rm = FALSE,
limit = NULL,
distsumm = "min",
min_nb = 3,
robust = TRUE,
nb_dist_thr = NULL,
log_dist = FALSE,
cores = NULL,
verbose = TRUE
)
Arguments
x |
an STlist with transformed gene expression |
samples |
the samples on which the test should be executed |
topgenes |
the number of high-variance genes to be tested. These genes are selected in descending order of variance as caclulated using Seurat's vst method |
annot |
the name of a column in |
ref |
one of the tissue domains in the column specified in |
exclude |
optional, a cluster/domain to exclude from the analysis |
out_rm |
logical (optional), remove gene expression outliers defined by
the interquartile method. This option is only valid when |
limit |
limite the analysis to spots/cells with distances to |
distsumm |
the distance summary metric to use in correlations. One of |
min_nb |
the minimum number of immediate neighbors a spot or cell has to
have in order to be included in the analysis. This parameter seeks to reduce the
effect of isolated |
robust |
logical, whether to use robust regression ( |
nb_dist_thr |
a numeric vector of length two indicating the tolerance interval to assign
spots/cells to neighborhoods. The wider the range of the interval, the more likely
distinct neighbors to be considered. If NULL, |
log_dist |
logical, whether to apply the natural logarithm to the spot/cell distances. It applies to all distances a constant (1e-200) to avoid log(0) |
cores |
the number of cores used during parallelization. If NULL (default), the number of cores is defined automatically |
verbose |
logical, whether to print text to console |
Details
The STgradient
function fits linear models and calculates Spearman coefficients
between the expression of a gene and the minimum or average distance of spots or
cells to a reference tissue domain. In other wordsm the STgradient
function
can be used to investigate if a gene is expressed higher in spots/cells closer to
a specific reference tissue domain, compared to spots/cells farther from the
reference domain (or viceversa as indicated by the Spearman's cofficient).
Value
a list of data frames with the results of the test
SThet: Computes global spatial autocorrelation statistics on gene expression
Description
Computes the global spatial autocorrelation statistics Moran's I and/or Geary's C for a set of genes
Usage
SThet(
x = NULL,
genes = NULL,
samples = NULL,
method = "moran",
k = NULL,
overwrite = TRUE,
cores = NULL,
verbose = TRUE
)
Arguments
x |
an STlist |
genes |
a vector of gene names to compute statistics |
samples |
the samples to compute statistics |
method |
The spatial statistic(s) to estimate. It can be set to 'moran', 'geary' or both. Default is 'moran' |
k |
the number of neighbors to estimate weights. By default NULL, meaning that spatial weights will be estimated from Euclidean distances. If an positive integer is entered, then the faster k nearest-neighbors approach is used. Please keep in mind that estimates are not as accurate as when using the default distance-based method. |
overwrite |
logical indicating if previous statistics should be overwritten. Default to FALSE (do not overwrite) |
cores |
integer indicating the number of cores to use during parallelization.
If NULL, the function uses half of the available cores at a maximum. The parallelization
uses |
verbose |
logical, whether to print text to console |
Details
The function computes global spatial autocorrelation statistics (Moran's I and/or
Geary's C) for the requested genes and samples. Then computation uses the
package spdep
. The calculated statistics are stored in the STlist, which can
be accessed with the get_gene_meta
function. For visual comparative analysis,
the function compare_SThet
can be used afterwards.
Value
an STlist containing spatial statistics
Examples
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
#' zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
unzip(zipfile=zip_tmp, exdir=thrane_tmp)
# Generate the file paths to be passed to the STlist function
count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='counts')
coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='mapping')
clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='clinical')
# Create STlist
library('spatialGE')
melanoma <- STlist(rnacounts=count_files,
spotcoords=coord_files,
samples=clin_file)
melanoma <- transform_data(melanoma)
melanoma <- SThet(melanoma, genes=c('MLANA', 'TP53'), method='moran')
get_gene_meta(melanoma, sthet_only=TRUE)
}, error = function(e) {
message("Could not run example. Are you connected to the internet?")
return(NULL)
})
STlist: Creation of STlist objects for spatial transcriptomics analysis
Description
Creates an STlist object from one or multiple spatial transcriptomic samples.
Usage
STlist(
rnacounts = NULL,
spotcoords = NULL,
samples = NULL,
cores = NULL,
verbose = TRUE
)
Arguments
rnacounts |
the count data which can be provided in one of these formats:
|
spotcoords |
the cell/spot coordinates. Not required if inputs are Visium or Xenium (spaceranger or xeniumranger outputs).
|
samples |
the sample names/IDs and (optionally) metadata associated with
each spatial sample.
The following options are available for
|
cores |
integer indicating the number of cores to use during parallelization.
If NULL, the function uses half of the available cores at a maximum. The parallelization
uses |
verbose |
logical, whether to print text to console |
Details
Objects of the S4 class STlist are the starting point of analyses in spatialGE
.
The STlist contains data from one or multiple samples (i.e., tissue slices), and
results from most spatialGE
's functions are stored within the object.
Raw gene counts and spatial coordinates. Gene count data have genes in rows and sampling units (e.g., cells, spots) in columns. Spatial coordinates have sampling units in rows and three columns: sample unit IDs, Y position, and X position.
Visium outputs from Space Ranger. The Visium directory must have the directory structure resulting from
spaceranger count
, with either a count matrix represented in MEX files or a h5 file. The directory should also contain aspatial
sub-directory, with the spatial coordinates (tissue_positions_list.csv
), and optionally the high resolution tissue image and scaling factor filescalefactors_json.json
.Xenium outputs from Xenium Ranger. The Xenium directory must have the directory structure resulting from the
xeniumranger
pipeline, with either a cell-feature matrix represented in MEX files or a h5 file. The directory should also contain a parquet file, with the spatial coordinates (cells.parquet
).CosMx-SMI outputs. Two files are required to process SMI outputs: The
exprMat
andmetadata
files. Both files must contain the "fov" and "cell_ID" columns. In addition, themetadata
files must contain the "CenterX_local_px" and "CenterY_local_px" columns.Seurat object (V4). A Seurat V4 object produced via
Seurat::Load10X_Spatial
.
Optionally, the user can input a path to a file containing a table of sample-level metadata (e.g., clinical outcomes, tissue type, age). This sample metadata file should contain sample IDs in the first column partially matching the file names of the count/coordinate file paths or Visium directories. Note: The sample ID of a given sample cannot be a substring of the sample ID of another sample. For example, instead of using "tissue1" and "tissue12", use "tissue01" and "tissue12".
The function uses parallelization if run in a Unix system. Windows users will experience longer times depending on the number of samples.
Value
an STlist object containing the counts and coordinates, and optionally
the sample metadata, which can be used for downstream analysis with spatialGE
Examples
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
#' zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
unzip(zipfile=zip_tmp, exdir=thrane_tmp)
# Generate the file paths to be passed to the STlist function
count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='counts')
coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='mapping')
clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='clinical')
# Create STlist
library('spatialGE')
melanoma <- STlist(rnacounts=count_files,
spotcoords=coord_files,
samples=clin_file)
melanoma
}, error = function(e) {
message("Could not run example. Are you connected to the internet?")
return(NULL)
})
Definition of an STlist object class.
Description
Definition of an STlist object class.
Slots
counts
per spot RNA counts
spatial_meta
per spot x,y coordinates
gene_meta
per gene statistics (e.g., average expression, variance, Moran's I)
sample_meta
dataframe with metadata per sample
tr_counts
transfromed per spot counts
gene_krige
results from kriging on gene expression
misc
Parameters and images from ST data
STplot: Plots of gene expression, cluster memberships, and metadata in spatial context
Description
Generates a plot of the location of spots/cells within an spatial sample, and colors them according to gene expression levels or spot/cell-level metadata
Usage
STplot(
x,
samples = NULL,
genes = NULL,
plot_meta = NULL,
ks = "dtc",
ws = NULL,
deepSplit = NULL,
color_pal = NULL,
data_type = "tr",
ptsize = NULL,
txsize = NULL
)
Arguments
x |
an STlist |
samples |
a vector of numbers indicating the ST samples to plot, or their
sample names. If vector of numbers, it follow the order of samples in |
genes |
a vector of gene names or a named list of gene sets. In the latter case, the averaged expression of genes within the sets is plotted |
plot_meta |
a column name in |
ks |
the k values to plot or 'dtc' to plot results from |
ws |
the spatial weights to plot samples if |
deepSplit |
a logical or positive number indicating the |
color_pal |
a string of a color palette from |
data_type |
one of 'tr' or 'raw', to plot transformed or raw counts respectively |
ptsize |
a number specifying the size of the points. Passed to the |
txsize |
a number controlling the size of the text in the plot title and legend title. Passed to the |
Details
The function takes an STlist and plots the cells or spots in their spatial context.
The users can color the spots/cells according to the expression of selected genes,
cluster memberships, or any spot/cell level metadata included in x@spatial_meta
.
The function also can average expression of gene sets.
Value
a list of plots
Examples
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
#' zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
unzip(zipfile=zip_tmp, exdir=thrane_tmp)
# Generate the file paths to be passed to the STlist function
count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='counts')
coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='mapping')
clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='clinical')
# Create STlist
library('spatialGE')
melanoma <- STlist(rnacounts=count_files,
spotcoords=coord_files,
samples=clin_file)
melanoma <- transform_data(melanoma)
STplot(melanoma, gene='MLANA', samples='ST_mel1_rep2', ptsize=1)
}, error = function(e) {
message("Could not run example. Are you connected to the internet?")
return(NULL)
})
STplot_interpolation: Visualize gene expression surfaces
Description
Produces a gene expression surface from kriging interpolation of ST data.
Usage
STplot_interpolation(
x = NULL,
genes = NULL,
top_n = 10,
samples = NULL,
color_pal = "BuRd"
)
Arguments
x |
an STlist containing results from |
genes |
a vector of gene names (one or several) to plot. If 'top', the 10 genes with highest standard deviation from each spatial sample are plotted. |
top_n |
an integer indicating how many top genes to perform kriging. Default is 10. |
samples |
a vector indicating the spatial samples to plot. If vector of numbers,
it follows the order of |
color_pal |
a color scheme from |
Details
This function produces a gene expression surface plot via kriging for one or several genes and spatial samples
Value
a list of plots
Examples
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
#' zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
unzip(zipfile=zip_tmp, exdir=thrane_tmp)
# Generate the file paths to be passed to the STlist function
count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='counts')
coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='mapping')
clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='clinical')
# Create STlist
library('spatialGE')
melanoma <- STlist(rnacounts=count_files,
spotcoords=coord_files,
samples=clin_file)
melanoma <- transform_data(melanoma)
melanoma <- gene_interpolation(melanoma, genes=c('MLANA', 'COL1A1'), samples='ST_mel1_rep2')
kp = STplot_interpolation(melanoma, genes=c('MLANA', 'COL1A1'), samples='ST_mel1_rep2')
ggpubr::ggarrange(plotlist=kp)
}, error = function(e) {
message("Could not run example. Are you connected to the internet?")
return(NULL)
})
compare_SThet: Compares spatial autocorrelation statistics across samples
Description
Plots the spatial autocorrelation statistics of genes across samples and colors samples acording to sample metadata.
Usage
compare_SThet(
x = NULL,
samplemeta = NULL,
genes = NULL,
color_by = NULL,
categorical = TRUE,
color_pal = "muted",
ptsize = 1
)
Arguments
x |
an STlist. |
samplemeta |
a string indicating the name of the variable in the clinical data frame. If NULL, uses sample names |
genes |
the name(s) of the gene(s) to plot. |
color_by |
the variable in |
categorical |
logical indicating whether or not to treat |
color_pal |
a string of a color palette from khroma or RColorBrewer, or a vector with colors with enough elements to plot categories. |
ptsize |
a number specifying the size of the points. Passed to the |
Details
This function takes the names of genes and their Moran's I or Geary's C computed for multiple samples and to provide a muti-sample comparison. Samples in the plot can be colored according to sample metadata to explore potential associations between spatial distribution of gene expression and sample-level data.
Value
a list of plots
dim: Prints the dimensions of count arrays within an STList object.
Description
Returns the number of genes and spots for each array within an STList object
Usage
## S4 method for signature 'STlist'
dim(x)
Arguments
x |
an STList object to show summary from. |
Details
This function takes an STList and prints the number of genes (rows) and spots (columns) of each spatial array within that object.
per_unit_counts: Generates distribution plots of spot/cell meta data or gene expression
Description
Generates violin plots, boxplots, or density plots of variables in the spatial meta data or of gene expression
Usage
distribution_plots(
x = NULL,
plot_meta = NULL,
genes = NULL,
samples = NULL,
data_type = "tr",
color_pal = "okabeito",
plot_type = "violin",
ptsize = 0.5,
ptalpha = 0.5
)
Arguments
x |
an STlist |
plot_meta |
vector of variables in |
genes |
vector of genes to plot expression distribution. If used in conjunction
with |
samples |
samples to include in the plot. Default (NULL) includes all samples |
data_type |
one of 'tr' or 'raw', to plot transformed or raw counts |
color_pal |
a string of a color palette from |
plot_type |
one of "violin", "box", or "density" (violin plots, box plots, or
density plots respectively). If |
ptsize |
the size of points in the plots |
ptalpha |
the transparency of points (violin/box plot) or curves (density plots) |
Details
The function allows to visualize the distribution of spot/cell total counts, total genes, or expression of specific genes across all samples for comparative purposes. It also allows grouping of gene expression values by categorical variables (e.g., clusters).
Value
a list containing ggplot2
objects
filter_data: Filters cells/spots, genes, or samples
Description
Filtering of spots/cells, genes or samples, as well as count-based filtering
Usage
filter_data(
x = NULL,
spot_minreads = 0,
spot_maxreads = NULL,
spot_mingenes = 0,
spot_maxgenes = NULL,
spot_minpct = 0,
spot_maxpct = NULL,
gene_minreads = 0,
gene_maxreads = NULL,
gene_minspots = 0,
gene_maxspots = NULL,
gene_minpct = 0,
gene_maxpct = NULL,
samples = NULL,
rm_tissue = NULL,
rm_spots = NULL,
rm_genes = NULL,
rm_genes_expr = NULL,
spot_pct_expr = "^MT-"
)
Arguments
x |
an STlist |
spot_minreads |
the minimum number of total reads for a spot to be retained |
spot_maxreads |
the maximum number of total reads for a spot to be retained |
spot_mingenes |
the minimum number of non-zero counts for a spot to be retained |
spot_maxgenes |
the maximum number of non-zero counts for a spot to be retained |
spot_minpct |
the minimum percentage of counts for features defined by |
spot_maxpct |
the maximum percentage of counts for features defined by |
gene_minreads |
the minimum number of total reads for a gene to be retained |
gene_maxreads |
the maximum number of total reads for a gene to be retained |
gene_minspots |
he minimum number of spots with non-zero counts for a gene to be retained |
gene_maxspots |
the maximum number of spots with non-zero counts for a gene to be retained |
gene_minpct |
the minimum percentage of spots with non-zero counts for a gene to be retained |
gene_maxpct |
the maximum percentage of spots with non-zero counts for a gene to be retained |
samples |
samples (as in |
rm_tissue |
sample (as in |
rm_spots |
vector of spot/cell IDs to remove. Removes spots/cells in |
rm_genes |
vector of gene names to remove from STlist. Removes genes in |
rm_genes_expr |
a regular expression that matches genes to remove. Removes genes in |
spot_pct_expr |
a expression to use with |
Details
This function provides options to filter elements in an STlist. It can remove
cells/spots or genes based on raw counts (x@counts
). Users can input an
regular expression to query gene names and calculate percentages (for example %
mtDNA genes). The function also can filter entire samples. Note that the function
removes cells/spots, genes, and/or samples in the raw counts, transformed counts,
spatial variables, gene variables, and sample metadata. Also note that the function
filters in the following order:
Samples (
rm_tissue
)Spots (
rm_spots
)Genes (
rm_genes
)Genes matching
rm_genes_expr
Min and max counts
Value
an STlist containing the filtered data
Examples
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
#' zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
unzip(zipfile=zip_tmp, exdir=thrane_tmp)
# Generate the file paths to be passed to the STlist function
count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='counts')
coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='mapping')
clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='clinical')
# Create STlist
library('spatialGE')
melanoma <- STlist(rnacounts=count_files,
spotcoords=coord_files,
samples=clin_file)
melanoma <- filter_data(melanoma, spot_minreads=2000)
}, error = function(e) {
message("Could not run example. Are you connected to the internet?")
return(NULL)
})
gene_interpolation: Spatial interpolation of gene expression
Description
Performs spatial interpolation ("kriging") of transformed gene counts
Usage
gene_interpolation(
x = NULL,
genes = "top",
top_n = 10,
samples = NULL,
cores = NULL,
verbose = TRUE
)
Arguments
x |
an STlist with transformed RNA counts |
genes |
a vector of gene names or 'top'. If 'top' (default), interpolation of
the 10 genes ( |
top_n |
an integer indicating how many top genes to perform interpolation. Default is 10. |
samples |
the spatial samples for which interpolations will be performed. If NULL (Default), all samples are interpolated. |
cores |
integer indicating the number of cores to use during parallelization.
If NULL, the function uses half of the available cores at a maximum. The parallelization
uses |
verbose |
either logical or an integer (0, 1, or 2) to increase verbosity. |
Details
This function takes an STlist and a vector of gene names and generates spatial
interpolation of gene expression values via "kriging". If genes='top', then
the 10 genes (default) with the highest standard deviation for each ST sample
are interpolated. The resulting interpolations can be visualized via the
STplot_interpolation
function
Value
x a STlist including spatial interpolations.
Examples
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
#' zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
unzip(zipfile=zip_tmp, exdir=thrane_tmp)
# Generate the file paths to be passed to the STlist function
count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='counts')
coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='mapping')
clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='clinical')
# Create STlist
library('spatialGE')
melanoma <- STlist(rnacounts=count_files,
spotcoords=coord_files,
samples=clin_file)
melanoma <- transform_data(melanoma)
melanoma <- gene_interpolation(melanoma, genes=c('MLANA', 'COL1A1'), samples='ST_mel1_rep2')
kp = STplot_interpolation(melanoma, genes=c('MLANA', 'COL1A1'))
ggpubr::ggarrange(plotlist=kp)
}, error = function(e) {
message("Could not run example. Are you connected to the internet?")
return(NULL)
})
get_gene_meta: Extract gene-level metadata and statistics
Description
Extracts gene-level metadata and spatial statistics (if already computed)
Usage
get_gene_meta(x = NULL, sthet_only = FALSE)
Arguments
x |
an STlist |
sthet_only |
logical, return only genes with spatial statistics |
Details
This function extracts data from the x@gene_meta
slot, optionally subsetting
only to those genes for which spatial statistics (Moran's I or Geary's C, see SThet
)
have been calculated. The output is a data frame with data from all samples in the
STlist
Value
a data frame with gene-level data
Examples
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
#' zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
unzip(zipfile=zip_tmp, exdir=thrane_tmp)
# Generate the file paths to be passed to the STlist function
count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='counts')
coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='mapping')
clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='clinical')
# Create STlist
library('spatialGE')
melanoma <- STlist(rnacounts=count_files,
spotcoords=coord_files,
samples=clin_file)
melanoma <- transform_data(melanoma)
melanoma <- SThet(melanoma, genes=c('MLANA', 'TP53'), method='moran')
get_gene_meta(melanoma, sthet_only=TRUE)
}, error = function(e) {
message("Could not run example. Are you connected to the internet?")
return(NULL)
})
load_images: Place tissue images within STlist
Description
Loads the images from tissues to the appropriate STlist slot.
Usage
load_images(x = NULL, images = NULL)
Arguments
x |
an STlist |
images |
a string indicating a folder to load images from |
Details
This function looks for .PNG
or .JPG
files within a folder matching the
sample names in an existing STlist. Then, loads the images to the STlist which
can be used for plotting along with other spatialGE plots.
Value
an STlist with images
plot_counts: Generates plots for the distribution of counts
Description
Generates density plots, violin plots, and/or boxplots for the distribution of count values
Usage
plot_counts(
x = NULL,
samples = NULL,
data_type = "tr",
plot_type = "density",
color_pal = "okabeito",
cvalpha = 0.5,
distrib_subset = 0.5,
subset_seed = 12345
)
Arguments
x |
an STlist |
samples |
samples to include in the plot. Default (NULL) includes all samples |
data_type |
one of |
plot_type |
one or several of |
color_pal |
a string of a color palette from |
cvalpha |
the transparency of the density plots |
distrib_subset |
the proportion of spots/cells to plot. Generating these plots can be time consuming due to the large amount of elements to plot. This argument provides control on how many randomly values to show to speed plotting |
subset_seed |
related to |
Details
The function allows to visualize the distribution counts across all genes and spots
in the STlist. The user can select between density plots, violin plots, or box
plots as visualization options. Useful for assessment of the effect of filtering and
data transformations and to assess zero-inflation. To plot counts or genes per
spot/cell, the function distribution_plots
should be used instead.
Value
a list of ggplot objects
Examples
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
#' zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
unzip(zipfile=zip_tmp, exdir=thrane_tmp)
# Generate the file paths to be passed to the STlist function
count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='counts')
coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='mapping')
clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='clinical')
# Create STlist
library('spatialGE')
melanoma <- STlist(rnacounts=count_files,
spotcoords=coord_files,
samples=clin_file)
cp <- plot_counts(melanoma, data_type='raw', plot_type=c('violin', 'box'))
ggpubr::ggarrange(plotlist=cp)
}, error = function(e) {
message("Could not run example. Are you connected to the internet?")
return(NULL)
})
plot_image: Generate a ggplot object of the tissue image
Description
Creates ggplot objects of the tissue images when available within the STlist
Usage
plot_image(x = NULL, samples = NULL)
Arguments
x |
an STlist |
samples |
a vector of numbers indicating the ST samples to plot, or their
sample names. If vector of numbers, it follow the order of |
Details
If the STlist contains tissue images in the @misc
slot, the plot_image
function
can be used to generate ggplot objects. These ggplot objects can be plotted next to
quilt plots (STplot
function) for comparative analysis.
Value
a list of plots
pseudobulk_dim_plot: Plot PCA of pseudobulk samples
Description
Generates a PCA plot after computation of "pseudobulk" counts
Usage
pseudobulk_dim_plot(
x = NULL,
color_pal = "muted",
plot_meta = NULL,
dim = "pca",
pcx = 1,
pcy = 2,
ptsize = 5
)
Arguments
x |
an STlist with pseudobulk PCA results in the |
color_pal |
a string of a color palette from khroma or RColorBrewer, or a
vector of color names or HEX values. Each color represents a category in the
variable specified in |
plot_meta |
a string indicating the name of the variable in the sample metadata to color points in the PCA plot |
dim |
one of |
pcx |
integer indicating the principal component to plot in the x axis |
pcy |
integer indicating the principal component to plot in the y axis |
ptsize |
the size of the points in the PCA plot. Passed to the |
Details
Generates a Principal Components Analysis plot to help in initial data exploration of
differences among samples. The points in the plot represent "pseudobulk" samples.
This function follows after usage of pseudobulk_samples
.
Value
a ggplot object
Examples
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
#' zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
unzip(zipfile=zip_tmp, exdir=thrane_tmp)
# Generate the file paths to be passed to the STlist function
count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='counts')
coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='mapping')
clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='clinical')
# Create STlist
library('spatialGE')
melanoma <- STlist(rnacounts=count_files,
spotcoords=coord_files,
samples=clin_file)
melanoma <- pseudobulk_samples(melanoma)
pseudobulk_dim_plot(melanoma, plot_meta='patient')
}, error = function(e) {
message("Could not run example. Are you connected to the internet?")
return(NULL)
})
pseudobulk_heatmap: Heatmap of pseudobulk samples
Description
Generates a heatmap plot after computation of "pseudobulk" counts
Usage
pseudobulk_heatmap(
x = NULL,
color_pal = "muted",
plot_meta = NULL,
hm_display_genes = 30
)
Arguments
x |
an STlist with pseudobulk counts in the |
color_pal |
a string of a color palette from khroma or RColorBrewer, or a
vector of color names or HEX values. Each color represents a category in the
variable specified in |
plot_meta |
a string indicating the name of the variable in the sample metadata to annotate heatmap columns |
hm_display_genes |
number of genes to display in heatmap, selected based on decreasing order of standard deviation across samples |
Details
Generates a heatmap of transformed "pseudobulk" counts to help in initial data
exploration of differences among samples. Each column in the heatmap represents a
"pseudobulk" sample. Rows are genes, with the number of genes displayed controlled by
the hm_display_genes
argument. This function follows after usage of pseudobulk_samples
.
Value
a ggplot object
Examples
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
#' zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
unzip(zipfile=zip_tmp, exdir=thrane_tmp)
# Generate the file paths to be passed to the STlist function
count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='counts')
coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='mapping')
clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='clinical')
# Create STlist
library('spatialGE')
melanoma <- STlist(rnacounts=count_files,
spotcoords=coord_files,
samples=clin_file)
melanoma <- pseudobulk_samples(melanoma)
hm <- pseudobulk_heatmap(melanoma, plot_meta='BRAF_status', hm_display_genes=30)
}, error = function(e) {
message("Could not run example. Are you connected to the internet?")
return(NULL)
})
pseudobulk_samples: Aggregates counts into "pseudo bulk" samples
Description
Aggregates spot/cell counts into "pseudo bulk" samples for data exploration
Usage
pseudobulk_samples(x = NULL, max_var_genes = 5000, calc_umap = FALSE)
Arguments
x |
an STlist. |
max_var_genes |
number of most variable genes (standard deviation) to use in pseudobulk analysis |
calc_umap |
logical, whether to calculate UMAP embeddings in addition to PCs |
Details
This function takes an STlist and aggregates the spot/cell counts into "pseudo bulk" counts by summing all counts from all cell/spots for each gene. Then performs Principal Component Analysis (PCA) to explore non-spatial sample-to-sample variation
Value
an STlist with appended pseudobulk counts and PCA coordinates
Examples
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
#' zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
unzip(zipfile=zip_tmp, exdir=thrane_tmp)
# Generate the file paths to be passed to the STlist function
count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='counts')
coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='mapping')
clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='clinical')
# Create STlist
library('spatialGE')
melanoma <- STlist(rnacounts=count_files,
spotcoords=coord_files,
samples=clin_file)
melanoma <- pseudobulk_samples(melanoma)
pseudobulk_dim_plot(melanoma)
}, error = function(e) {
message("Could not run example. Are you connected to the internet?")
return(NULL)
})
show: Prints overview of STList oject.
Description
Prints overview/summary of STList oject.
Usage
## S4 method for signature 'STlist'
show(object)
Arguments
object |
an STList object to show summary from. |
Details
This function takes an STList and prints a the number of spatial arrays in that object and other information about the object.
spatial_metadata: Prints the names of the available spot/cell annotations
Description
returns the names of the annotations in the x@spatial_meta
slot.
Usage
spatial_metadata(x)
Arguments
x |
an STList object |
Value
a list of character vectors containing the column names of x@spatial_meta
summarize_STlist: Generates a data frame with summary statistics
Description
Produces a data frame with counts per gene and counts per ROI/spot/cell
Usage
summarize_STlist(x = NULL)
Arguments
x |
an STlist |
Details
The function creates a table with counts per gene and counts per region of interest (ROI), spot, or cell in the samples stored in the STlist
Value
a data frame
Examples
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
#' zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
unzip(zipfile=zip_tmp, exdir=thrane_tmp)
# Generate the file paths to be passed to the STlist function
count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='counts')
coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='mapping')
clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='clinical')
# Create STlist
library('spatialGE')
melanoma <- STlist(rnacounts=count_files,
spotcoords=coord_files,
samples=clin_file) # Only first two samples
summarize_STlist(melanoma)
}, error = function(e) {
message("Could not run example. Are you connected to the internet?.")
return(NULL)
})
summary: Prints overview of STList oject.
Description
Prints overview/summary of STList oject.
Usage
## S4 method for signature 'STlist'
summary(object)
Arguments
object |
an STList object to show summary from. |
Details
This function takes an STList and prints a the number of spatial arrays in that object and other information about the object.
tissue_names: Prints the names of the tissue samples in the STlist
Description
returns a character vector with the names of tissue samples in the STlist.
Usage
tissue_names(x)
Arguments
x |
an STList object |
Value
a character vector with the sample names in the STlist object
transform_data: Transformation of spatial transcriptomics data
Description
Applies data transformation methods to spatial transcriptomics samples within an STlist
Usage
transform_data(
x = NULL,
method = "log",
scale_f = 10000,
sct_n_regr_genes = 3000,
sct_min_cells = 5,
cores = NULL
)
Arguments
x |
an STlist with raw count matrices. |
method |
one of |
scale_f |
the scale factor used in logarithmic transformation |
sct_n_regr_genes |
the number of genes to be used in the regression model
during SCTransform. The function |
sct_min_cells |
The minimum number of spots/cells to be used in the regression
model fit by |
cores |
integer indicating the number of cores to use during parallelization.
If NULL, the function uses half of the available cores at a maximum. The parallelization
uses |
Details
This function takes an STlist with raw counts and performs data transformation.
The user has the option to select between log transformation after library size
normalization (method='log'
), or SCTransform (method='sct'
). In the case of
logarithmic transformation, a scaling factor (10^4 by default) is applied. The
function uses parallelization using "forking" (not available in Windows OS).
Note that the method sct
returns a matrix with less genes as filtering is
done for low expression genes.
Value
x an updated STlist with transformed counts.
Examples
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
#' zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
unzip(zipfile=zip_tmp, exdir=thrane_tmp)
# Generate the file paths to be passed to the STlist function
count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='counts')
coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='mapping')
clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
full.names=TRUE, pattern='clinical')
# Create STlist
library('spatialGE')
melanoma <- STlist(rnacounts=count_files,
spotcoords=coord_files,
samples=clin_file)
melanoma <- transform_data(melanoma)
}, error = function(e) {
message("Could not run example. Are you connected to the internet?")
return(NULL)
})