Abstract
PathwaySpace is an R package that creates landscape images from graphs containing vertices (nodes), edges (lines), and a signal associated with the vertices. The package processes the signal using a convolution algorithm that takes the graph’s topology into account when projecting the signal onto a 2D space. PathwaySpace has various applications, such as visualizing network data in a graphical format that highlights the relationships and signal strengths between vertices. It can be particularly useful for understanding how signals propagate through complex networks. By combining graph theory, signal processing, and visualization, the PathwaySpace package provides a novel way of representing network signals.
Package: PathwaySpace 1.0.0
For a given igraph object containing vertices, edges, and a
signal associated with the vertices, PathwaySpace performs a
convolution operation, which involves a weighted combination of
neighboring node signals based on the graph structure. Figure
1 illustrates the convolution operation problem. Each vertex’s
signal is placed at a specific position in the 2D space. The x and y
coordinates of this space correspond
either to vertex-signal positions (e.g. red, green, and blue
lollipops in Fig.1A) or null-signal positions for which
no signal information is available (question marks in
Fig.1A). Our model considers the vertex-signal
positions as source points (or transmitters) and the null-signal
positions as end points (or receivers). The signal values from
vertex-signal positions are then projected to the null-signal positions
according to a decay function, which will control how the signal values
attenuate as they propagate across the 2D space. Available decay
functions include linear, exponential, and Weibull functions
(Fig.1B). For a given null-signal position, a k-nearest
neighbors (kNN) algorithm is used to define the contributing vertices
for signal convolution. The convolution operation combines the signals
from these contributing vertices, considering their distances and signal
strengths, and applies the decay function to model the attenuation of
the signal. Users can adjust both the decay function’s parameters and
the value of k in the kNN algorithm. These parameters control how the
signal decays, allowing users to explore different scenarios and observe
how varying parameters influence the landscape image. The resulting
image forms geodesic paths in which the signal has been projected from
vertex- to null-signal positions, using a density metric to measure the
signal intensity along these paths.
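To make the role of the decay function more concrete, the following base R sketch defines linear, exponential, and Weibull-type decays and evaluates them at a few distances. The function names (linear_decay, exp_decay, weibull_decay), their arguments, and the exact formulas are assumptions made for illustration; they are not the functions exported by PathwaySpace and may differ from the package's own parameterization.

```r
# Sketch only: hypothetical parameterizations, not the package's own functions.
# 'd' is the distance from a signal source and 'pdist' is a distance unit.

linear_decay <- function(d, signal = 1, pdist = 1) {
  # Signal drops linearly and reaches zero at d == pdist
  signal * pmax(0, 1 - d / pdist)
}

exp_decay <- function(d, signal = 1, pdist = 1, decay = 0.001) {
  # 'decay' is the fraction of signal remaining at d == pdist
  signal * decay^(d / pdist)
}

weibull_decay <- function(d, signal = 1, pdist = 1, decay = 0.001, shape = 2) {
  # Same as exp_decay() when shape == 1; 'shape' bends the curve otherwise
  signal * decay^((d / pdist)^shape)
}

# Compare the three decays at increasing distances from a source
d <- seq(0, 1, by = 0.25)
round(cbind(d,
            linear = linear_decay(d),
            exponential = exp_decay(d),
            weibull = weibull_decay(d)), 4)
```

All three curves start at the full signal value at the source; they differ in how quickly the signal fades with distance.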
#--- Load required packages for this section
library(PathwaySpace)
library(igraph)
library(ggplot2)
This section will create an igraph object containing a
binary signal associated to each vertex. The graph layout is configured
manually to ensure that users can easily view all the relevant arguments
needed to prepare the input data for the PathwaySpace package.
The igraph make_star() function creates a star-like graph and the V()
function is used to set attributes for the vertices. The PathwaySpace
package will require that all vertices have x, y, and name attributes.
# Make a 'toy' undirected igraph
gtoy1 <- make_star(5, mode="undirected")
# Assign xy coordinates to each vertex
V(gtoy1)$x <- c(0, 1.5, -4, -4, -9)
V(gtoy1)$y <- c(0, 0, 4, -4, 0)
# Assign a name to each vertex (here, from n1 to n5)
V(gtoy1)$name <- paste0("n", 1:5)
Our gtoy1 graph is now ready for the PathwaySpace package. We can check
its layout using the plot.igraph() function. Alternatively, to lay out
and visualize large graphs we suggest the RedeR package.
# Check the graph layout
plot.igraph(gtoy1)
Next, we will create a PathwaySpace-class object using the
buildPathwaySpace() constructor. This function will check the validity
of the igraph object. It will also calculate pairwise distances between
vertices, subsequently required by the signal projection methods. Note
that for this example we adjusted mar = 0.2. This argument sets the
outer margins as a fraction of the 2D image space on which the
convolution operation will project the signal.
# Run the PathwaySpace constructor
pspace1 <- buildPathwaySpace(gtoy1, mar = 0.2)
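As an illustration of the two ideas mentioned above (pairwise distances and margins), the base R sketch below recomputes the pairwise Euclidean distances for the coordinates assigned to gtoy1 and shows one possible way to fit a layout into a unit square with a margin. The helper rescale_xy() is hypothetical; how the constructor handles margins internally is an assumption here, not a description of the package code.

```r
# Sketch only: base R illustration, not the buildPathwaySpace() internals.
# Coordinates assigned to the 'gtoy1' vertices earlier in this section
xy <- cbind(x = c(0, 1.5, -4, -4, -9), y = c(0, 0, 4, -4, 0))
rownames(xy) <- paste0("n", 1:5)

# Pairwise Euclidean distances between vertices
dmat <- as.matrix(dist(xy))
round(dmat, 2)

# Hypothetical margin handling: rescale coordinates into the unit square,
# leaving a fraction 'mar' free on each side (aspect ratio preserved)
rescale_xy <- function(xy, mar = 0.2) {
  rng <- max(apply(xy, 2, function(v) diff(range(v))))  # widest axis extent
  ctr <- apply(xy, 2, function(v) mean(range(v)))       # layout centre
  0.5 + sweep(xy, 2, ctr) / rng * (1 - 2 * mar)
}
xy_scaled <- rescale_xy(xy, mar = 0.2)
range(xy_scaled)
```

With mar = 0.2 the widest axis of the layout spans the central 60% of the image space in this sketch, leaving room for the signal to fade out around the outermost vertices.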
As a default behavior, the buildPathwaySpace() constructor initializes
the signal of each vertex as 0. We can use the length(), names(), and
vertexSignal() accessors to get and set vertex signals in the
PathwaySpace object; for example, in order to get vertex names and
signal values:
# Check the number of vertices in the PathwaySpace object
length(pspace1)
## [1] 5
# Check vertex names
names(pspace1)
## [1] "n1" "n2" "n3" "n4" "n5"
# Check signal (initialized with '0')
vertexSignal(pspace1)
## n1 n2 n3 n4 n5
## 0 0 0 0 0
…and for setting new signal values in PathwaySpace objects:
# Set new signal to all vertices
vertexSignal(pspace1) <- c(1, 3, 2, 3, 2)
# Set a new signal to the 1st vertex
vertexSignal(pspace1)[1] <- 2
# Set a new signal to vertex "n1"
vertexSignal(pspace1)["n1"] <- 4
# Check updated signal values
vertexSignal(pspace1)
## n1 n2 n3 n4 n5
## 4 3 2 3 2
Following that, we will use the circularProjection() function to project
the network signals, using the weibullDecay() function with default
settings. We set knn = 1, defining the contributing vertices for signal
convolution. In this case, each null-signal position will receive the
projection from a single vertex-signal position (i.e. from the nearest
signal source in the pathway space). We then create a landscape image
using the plotPathwaySpace() function.
# Run network signal projection
pspace1 <- circularProjection(pspace1, knn = 1, pdist = 0.4)
# Plot a PathwaySpace image
plotPathwaySpace(pspace1, marks = TRUE)
The pdist term determines a distance unit for the signal convolution
related to the pathway space. This distance unit will affect the extent
over which the convolution operation projects the signal in the pathway
space. Next, we reassess the same PathwaySpace object using knn = 2. The
user can also customize a few arguments in the plotPathwaySpace()
function, which is a wrapper to create dedicated ggplot graphics for
PathwaySpace-class objects.
# Re-run the network signal projection with 'knn = 2'
pspace1 <- circularProjection(pspace1, knn = 2, pdist = 0.4)
# Plot the PathwaySpace image
plotPathwaySpace(pspace1, marks = c("n3","n4"), theme = "th2")
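The effect of knn and pdist can also be explored outside the package with a small base R approximation. The sketch below projects three toy sources onto a square grid, keeping only the knn nearest sources for each grid cell. The function project_grid(), the exponential decay form, and the use of the mean to combine contributions are all assumptions made for illustration; they do not reproduce the circularProjection() code.

```r
# Sketch only: a base R approximation, not the circularProjection() code.
# Toy sources in a unit square (hypothetical positions and signals)
src <- data.frame(x = c(0.3, 0.7, 0.5), y = c(0.3, 0.3, 0.7),
                  signal = c(1, 0.6, 0.8))

project_grid <- function(src, knn = 1, pdist = 0.4, nrc = 50, decay = 0.001) {
  gx <- seq(0, 1, length.out = nrc)
  grid <- expand.grid(x = gx, y = gx)
  # Distance from every grid cell (rows) to every source (columns)
  d <- sqrt(outer(grid$x, src$x, "-")^2 + outer(grid$y, src$y, "-")^2)
  # Decayed contribution of each source, with distances in units of 'pdist'
  contrib <- sweep(decay^(d / pdist), 2, src$signal, "*")
  # Keep the knn nearest sources per cell and average them (assumed aggregation)
  z <- vapply(seq_len(nrow(d)), function(i) {
    nn <- order(d[i, ])[seq_len(knn)]
    mean(contrib[i, nn])
  }, numeric(1))
  # Rows index x and columns index y, as expected by image()
  matrix(z, nrow = nrc, ncol = nrc)
}

z1 <- project_grid(src, knn = 1)
z2 <- project_grid(src, knn = 2)
# image(z1); image(z2)  # uncomment to view the two landscapes
```

In this sketch a larger pdist stretches the distance unit, so the signal reaches further from each source, while a larger knn blends contributions from neighboring sources.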
The decay function used in the signal projection was passed to the
circularProjection() function by the decay_fun argument. The user can
pass additional arguments to the decay function using the ... argument,
for example:
# Re-run the network signal projection, passing 'shape' to the decay function
pspace1 <- circularProjection(pspace1, knn = 2, pdist = 0.2, shape = 2)
# Plot the PathwaySpace image
plotPathwaySpace(pspace1, marks = "n1", theme = "th2")
In this case, we set the shape of a 3-parameter Weibull function. This
parameter allows a projection to take a variety of shapes. When
shape = 1 the Weibull decay follows an exponential decay, and when
shape > 1 the projection is first convex, then concave with an inflexion
point along the decay path.
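The influence of the shape parameter can be checked numerically. The sketch below assumes a Weibull-type decay of the form signal * decay^((d / pdist)^shape); this form, the function name weibull_decay(), and its default values are assumptions for illustration and may not match the package's weibullDecay() exactly. Under this form, shape = 1 reduces to an exponential decay, and for shape > 1 the point of steepest decline moves away from the source.

```r
# Sketch only: assumes a decay of the form signal * decay^((d / pdist)^shape)
weibull_decay <- function(d, signal = 1, pdist = 1, decay = 0.001, shape = 1) {
  signal * decay^((d / pdist)^shape)
}

d <- seq(0, 1, length.out = 1001)

# shape = 1 reduces to a plain exponential decay
is_exponential <- isTRUE(all.equal(weibull_decay(d, shape = 1), 0.001^d))

# shape = 2: locate the steepest decline (the inflection point) numerically
y <- weibull_decay(d, shape = 2)
d_steepest <- d[which.min(diff(y))]

# Analytic inflection point for this form, with lambda = -log(decay):
# (d / pdist)^shape = (shape - 1) / (shape * lambda)
lambda <- -log(0.001)
d_analytic <- ((2 - 1) / (2 * lambda))^(1 / 2)

c(numeric = d_steepest, analytic = d_analytic)
```

For shape = 1 the steepest decline is at the source itself, which is why the exponential curve has no inflection point along the decay path.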
In this section we will project the network signal using a polar
coordinate system. This representation may be useful for certain types
of data, for example, to highlight patterns of signal propagation on
directed graphs, especially to explore the orientation aspect of signal
flow. To demonstrate this feature we will use the gtoy2 directed graph,
already available in the PathwaySpace package.
# Load a pre-processed directed igraph object
data("gtoy2", package = "PathwaySpace")
# Check the graph layout
plot.igraph(gtoy2)
# Build a PathwaySpace for the 'gtoy2' igraph
pspace2 <- buildPathwaySpace(gtoy2, mar = 0.2)
# Set '1s' as vertex signal
vertexSignal(pspace2) <- 1
# Run the network signal projection using polar coordinates
pspace2 <- polarProjection(pspace2, knn = 2, theta = 45, shape = 2)
# Plot the PathwaySpace image
plotPathwaySpace(pspace2, theme = "th2", marks = TRUE)
Note that this projection emphasizes signals along the edges of the
network. In order to also consider the direction of edges, next we set
directional = TRUE.
# Re-run the network signal projection using 'directional = TRUE'
pspace2 <- polarProjection(pspace2, knn = 2, theta = 45, shape = 2,
  directional = TRUE)
# Plot the PathwaySpace image
plotPathwaySpace(pspace2, theme = "th2", marks = c("n1","n3","n4","n5"))
This updated PathwaySpace polar projection emphasizes the signal flow in a defined direction (see the directional pattern of the igraph plot at the top of this section). However, when interpreting the results, users must be aware that this method may introduce distortions. For example, depending on the network’s structure, the polar projection may not capture all aspects of a directed graph, such as cyclic dependencies, feedforward and feedback loops, or other intricate edge interplays.
PathwaySpace accepts binary, integer, and numeric signal types,
including NAs. When a vertex is assigned NA, it will be excluded from
the signal projection and not evaluated by the convolution algorithm.
Logical values are also allowed, but they will be treated as binary.
Next, we show the projection of a signal that includes negative values,
using the pspace1 object created previously.
# Set a negative signal to vertices "n3" and "n4"
vertexSignal(pspace1)[c("n3","n4")] <- c(-2, -4)
# Check updated signal vector
vertexSignal(pspace1)
# n1 n2 n3 n4 n5
# 4 3 -2 -4 2
# Re-run the network signal projection
pspace1 <- circularProjection(pspace1, knn = 2, shape = 2)
# Plot the PathwaySpace image
plotPathwaySpace(pspace1, bg.color = "white", font.color = "grey20",
  marks = TRUE, mark.color = "magenta", theme = "th2")
Note that the original signal vector was rescaled to [-1, +1]. If the
signal vector is >=0, then it will be rescaled to [0, 1]; if the signal
vector is <=0, it will be rescaled to [-1, 0]; and if the signal vector
is in (-Inf, +Inf), then it will be rescaled to [-1, +1]. To override
this signal processing, simply set the rescale argument to FALSE in the
projection functions.
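A rule of this kind can be sketched in a few lines of base R. The helper rescale_signal() below is hypothetical and assumes that rescaling divides by the largest absolute value, which keeps zero fixed and preserves signs; the package may use a different mapping internally, so treat this only as one way to obtain the three target intervals.

```r
# Sketch only: one possible rescaling rule, not the package's internal code.
# Dividing by the largest absolute value maps non-negative vectors into
# [0, 1], non-positive vectors into [-1, 0], and mixed vectors into [-1, +1].
rescale_signal <- function(x) {
  if (all(is.na(x))) return(x)      # nothing to rescale
  m <- max(abs(x), na.rm = TRUE)    # largest absolute signal value
  if (m == 0) return(x)             # all zeros stay as they are
  x / m                             # NA entries remain NA
}

# Signal vector used in this section
rescale_signal(c(n1 = 4, n2 = 3, n3 = -2, n4 = -4, n5 = 2))

# A non-negative vector with a missing value
rescale_signal(c(0, 2, NA, 4))
```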
In order to enhance clarity and make it less likely for viewers to
miss important details of large graphs, in this section we introduce
visual elements to large PathwaySpace images. We will use an
igraph object with n = 12990 vertices to create a
large PathwaySpace object, upon which we will project binary
signals from a relatively small number of vertices. This example will
emphasize clusters of vertices forming summits, but it might
also come at the cost of reduced clarity in displaying the graph’s
overall structure, particularly in regions far from the summit areas. In
order to balance between emphasizing clusters and maintaining the
visibility of the entire graph structure, we will outline graph
silhouettes as decoration elements in the PathwaySpace
image.
#--- Load required packages for this section
library(PathwaySpace)
library(RGraphSpace)
library(igraph)
library(ggplot2)
Next, we will load an igraph object with n = 12990 vertices, containing
gene interaction data available from the Pathway Commons database
(version 12) (Rodchenkov et al. 2019).
# Load a large igraph object
data("PCv12_pruned_igraph", package = "PathwaySpace")
# Check number of vertices
length(PCv12_pruned_igraph)
# [1] 12990
# Check vertex names
head(V(PCv12_pruned_igraph)$name)
# [1] "A1BG" "AKT1" "CRISP3" "GRB2" "PIK3CA" "PIK3R1"
# Get top-connected nodes for visualization
top10hubs <- igraph::degree(PCv12_pruned_igraph)
top10hubs <- names(sort(top10hubs, decreasing = TRUE)[1:10])
head(top10hubs)
# [1] "GNB1" "TRIM28" "RPS27A" "CTNNB1" "TP53" "ACTB"
Depending on the graphics devices available in the current R session,
rendering a large graph can take a while. To visualize the graph layout,
next we use the plotGraphSpace() function from the RGraphSpace package
to optimize plotting.
## Visualize the graph layout labeled with 'top10hubs' nodes
plotGraphSpace(PCv12_pruned_igraph, marks = top10hubs,
  mark.color = "blue", theme = "th3")
We will also load gene sets from the MSigDB collection (Liberzon et al. 2015), which are subsequently used to project a binary signal in the PathwaySpace image.
# Load a list with Hallmark gene sets
data("Hallmarks_v2023_1_Hs_symbols", package = "PathwaySpace")
# There are 50 gene sets in "hallmarks"
length(hallmarks)
# [1] 50
# We will use the 'HALLMARK_P53_PATHWAY' (n=200 genes) for demonstration
length(hallmarks$HALLMARK_P53_PATHWAY)
# [1] 200
We now follow the PathwaySpace pipeline as explained in the previous
sections, that is, using the buildPathwaySpace() constructor to
initialize a new PathwaySpace object with the Pathway Commons
interactions.
# Run the PathwaySpace constructor
pspace_PCv12 <- buildPathwaySpace(g=PCv12_pruned_igraph, nrc=500)
# Note: 'nrc' sets the number of rows and columns of the
# image space, which will affect the image resolution (in pixels)
…and now we mark the HALLMARK_P53_PATHWAY genes in the PathwaySpace object.
# Intersect Hallmark genes with the PathwaySpace
hallmarks <- lapply(hallmarks, intersect, y = names(pspace_PCv12) )
# After intersection, the 'HALLMARK_P53_PATHWAY' dropped to n=173 genes
length(hallmarks$HALLMARK_P53_PATHWAY)
# [1] 173
# Set a binary signal (1s) to 'HALLMARK_P53_PATHWAY' genes
vertexSignal(pspace_PCv12) <- 0
vertexSignal(pspace_PCv12)[ hallmarks$HALLMARK_P53_PATHWAY ] <- 1
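The intersect-then-assign pattern used above can be tried on a toy example without loading the large graph. In the sketch below, vertex_names and gene_sets are made-up stand-ins for the PathwaySpace vertex names and the Hallmark list; only base R functions are used.

```r
# Sketch only: toy names standing in for the real vertex names and gene sets
vertex_names <- c("TP53", "MDM2", "CDKN1A", "AKT1", "GRB2")
gene_sets <- list(SET_A = c("TP53", "MDM2", "BAX"),
                  SET_B = c("AKT1", "PIK3CA"))

# Keep only the genes that are present among the vertices
gene_sets <- lapply(gene_sets, intersect, y = vertex_names)
lengths(gene_sets)

# Build a named binary signal: 1 for members of SET_A, 0 otherwise
signal <- setNames(numeric(length(vertex_names)), vertex_names)
signal[gene_sets$SET_A] <- 1
signal
```

Intersecting first avoids indexing the signal vector with names that are not in the graph, which would otherwise append unwanted entries to a plain named vector.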
…and run the circularProjection() function.
# Run network signal projection
pspace_PCv12 <- circularProjection(pspace_PCv12)
plotPathwaySpace(pspace_PCv12, title="HALLMARK_P53_PATHWAY",
  marks = top10hubs, mark.size = 2, theme = "th3")
Note that this image emphasizes groups of vertices forming summits, but it misses the outline of the graph structure, which faded with the signal that reaches the furthermost points of the network.
Next, we will decorate the PathwaySpace image with graph silhouettes.
# Add silhouettes
pspace_PCv12 <- silhouetteMapping(pspace_PCv12)
plotPathwaySpace(pspace_PCv12, title="HALLMARK_P53_PATHWAY",
  marks = top10hubs, mark.size = 2, theme = "th3")
The summits represent regions within the graph that exhibit signal
values that are notably higher than the baseline level. These regions
may be of interest for downstream analyses. One potential downstream
analysis is to determine which vertices projected the original input
signal. This could provide insights into the communities within these
summit regions. One may also wish to explore other vertices within the
summits, by querying associations with the original input gene set. In
order to extract vertices within summits, next we use the
summitMapping() function, which also decorates summits with contour
lines.
# Mapping summits
pspace_PCv12 <- summitMapping(pspace_PCv12, minsize = 50)
plotPathwaySpace(pspace_PCv12, title="HALLMARK_P53_PATHWAY", theme = "th3")
# Extracting summits from a PathwaySpace
summits <- getPathwaySpace(pspace_PCv12, "summits")
class(summits)
# [1] "list"
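To give an intuition for what extracting vertices within summits involves, the base R sketch below thresholds a toy landscape and looks up which vertices fall inside the resulting mask. The threshold value, the toy landscape, and the vertex table are all hypothetical, and the procedure is a simplification that does not reproduce the summitMapping() algorithm (for example, it ignores the minsize argument and does not separate distinct summits).

```r
# Sketch only: hypothetical thresholding, not the summitMapping() algorithm
nrc <- 20
gx <- seq(0, 1, length.out = nrc)

# Toy landscape with a single peak centred at (0.7, 0.3); rows index x
z <- outer(gx, gx, function(x, y) exp(-((x - 0.7)^2 + (y - 0.3)^2) / 0.02))

# Cells above a (hypothetical) threshold form the summit mask
summit_mask <- z > 0.5

# Toy vertices with coordinates in the unit square
vertices <- data.frame(name = c("v1", "v2", "v3"),
                       x = c(0.70, 0.72, 0.10),
                       y = c(0.30, 0.35, 0.90))

# Map each vertex to its nearest grid cell and test membership in the mask
ix <- pmin(nrc, pmax(1, round(vertices$x * (nrc - 1)) + 1))
iy <- pmin(nrc, pmax(1, round(vertices$y * (nrc - 1)) + 1))
in_summit <- vertices$name[summit_mask[cbind(ix, iy)]]
in_summit
```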
This will be incorporated into the PathwaySpace documentation following the acceptance of Ellrott et al. (2023).
If you use PathwaySpace, please cite:
The Cancer Genome Atlas Analysis Network. PathwaySpace: Spatial projection of network signals along geodesic paths. R package, 2023.
Ellrott et al. (under review)
Castro MA, Wang X, Fletcher MN, Meyer KB, Markowetz F (2012). “RedeR: R/Bioconductor package for representing modular structures, nested networks and multiple levels of hierarchical associations.” Genome Biology, 13(4), R29. https://bioconductor.org/packages/RedeR/
Cardoso MA, Rizzardi LEA, Kume LW, Groeneveld C, Trefflich S, Morais DAA, Dalmolin RJS, Ponder BAJ, Meyer KB, Castro MAA. “TreeAndLeaf: an R/Bioconductor package for graphs and trees with focus on the leaves.” Bioinformatics, 38(5):1463-1464, 2022. https://bioconductor.org/packages/TreeAndLeaf/
Csardi G and Nepusz T. “The Igraph Software Package for Complex Network Research.” InterJournal, ComplexSystems:1695, 2006. https://igraph.org
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/Sao_Paulo
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] PathwaySpace_1.0.0 RGraphSpace_1.0.6 ggplot2_3.5.1 igraph_2.0.3
##
## loaded via a namespace (and not attached):
## [1] gtable_0.3.5 jsonlite_1.8.9 dplyr_1.1.4 compiler_4.4.2
## [5] highr_0.11 Rcpp_1.0.13 tidyselect_1.2.1 jquerylib_0.1.4
## [9] scales_1.3.0 yaml_2.3.10 fastmap_1.2.0 R6_2.5.1
## [13] generics_0.1.3 knitr_1.48 ggrepel_0.9.6 tibble_3.2.1
## [17] munsell_0.5.1 bslib_0.8.0 pillar_1.9.0 rlang_1.1.4
## [21] utf8_1.2.4 cachem_1.1.0 RANN_2.6.2 xfun_0.47
## [25] sass_0.4.9 cli_3.6.3 withr_3.0.1 magrittr_2.0.3
## [29] digest_0.6.37 grid_4.4.2 rstudioapi_0.16.0 lifecycle_1.0.4
## [33] vctrs_0.6.5 evaluate_1.0.0 glue_1.8.0 fansi_1.0.6
## [37] colorspace_2.1-1 rmarkdown_2.28 tools_4.4.2 pkgconfig_2.0.3
## [41] htmltools_0.5.8.1