sfplot is a Python package for spatial structure analysis in spatial omics data. It implements the Search-and-Find Plot (SFplot) / Cell-GPS workflow used in our manuscript to quantify multiscale tissue architecture, compute cophenetic distance-based structure maps, and analyze cell-cell or transcript-cell spatial relationships.
This repository is being maintained as the code companion for manuscript review and future reuse.
- Computes searcher-findee distance matrices from spatial coordinates and cell labels.
- Builds cophenetic distance matrices and StructureMap heatmaps from
AnnDataobjects or plainpandastables. - Loads 10x Xenium outputs and prepares them for downstream spatial analysis.
- Provides a table-bundle Xenium loader that assembles
cells.parquet, official*_cell_groups.csv, andcell_feature_matrix.h5without requiring thespatialdata_iodependency chain. - Supports transcript-by-cell analysis for locating transcripts relative to cell types.
- Includes memory-optimized workflows for large datasets.
- Provides plotting utilities such as clustered heatmaps, circular dendrograms, and related summary figures.
src/sfplot/: core implementation.tests/: package tests and smoke checks.docs/: Sphinx documentation.sfplot-manuscript/: manuscript-specific notebooks, figures, and derived outputs.benchmarking/: benchmarking-related material.segmentation_methods/: supporting segmentation workflows.
Install directly from GitHub:
pip install git+https://github.com/hutaobo/sfplot.gitFor local development or reviewer inspection:
git clone https://github.com/hutaobo/sfplot.git
cd sfplot
pip install -e .The package requires Python 3.9 or later.
The minimal input is a table with spatial coordinates and a cell-type column.
import pandas as pd
from sfplot import compute_cophenetic_distances_from_df, plot_cophenetic_heatmap
df = pd.DataFrame(
{
"x": [0, 1, 5, 6],
"y": [0, 1, 5, 6],
"celltype": ["A", "A", "B", "B"],
}
)
row_coph, col_coph = compute_cophenetic_distances_from_df(
df=df,
x_col="x",
y_col="y",
celltype_col="celltype",
)
plot_cophenetic_heatmap(
row_coph,
matrix_name="row_coph",
output_dir="output",
output_filename="StructureMap_example.pdf",
sample="Example",
)from sfplot import load_xenium_data, load_xenium_table_bundle, compute_cophenetic_distances_from_adata
# Legacy `spatialdata_io` route
adata = load_xenium_data("/path/to/xenium/run", normalize=False)
# Stable table-bundle route used for the Atera Xenium benchmark
adata = load_xenium_table_bundle("/path/to/xenium/run", normalize=False)
row_coph, col_coph = compute_cophenetic_distances_from_adata(
adata,
cluster_col="Cluster",
output_dir="output",
)load_xenium_data: load and preprocess Xenium data.load_xenium_table_bundle: load Xenium data fromcells.parquet+*_cell_groups.csv+cell_feature_matrix.h5whenspatialdata_iois unavailable or unstable.compute_cophenetic_distances_from_df: compute structure matrices from a coordinate table.compute_weighted_searcher_findee_distance_matrix_from_df: weighted searcher-findee kernel for entity, pathway, or LR analysis.compute_weighted_cophenetic_distances_from_df: weighted StructureMap wrapper over the weighted kernel.compute_cophenetic_distances_from_adata: compute structure matrices fromAnnData.compute_entity_to_cell_topology: generalizet_and_cfrom transcripts to arbitrary weighted entities.compute_entity_structuremap: build StructureMap-style topology among arbitrary weighted entities.plot_cophenetic_heatmap: generate StructureMap and related clustered heatmaps.transcript_by_cell_analysis: analyze transcript-to-cell spatial structure at scale.ligand_receptor_topology_analysis: score senderβreceiver ligand-receptor candidates using topology, structure compatibility, and local contact.ligand_receptor_target_consistency: add a NicheNet-style downstream target-consistency layer.compute_pathway_activity_matrix: compute rank-based or weighted pathway activities per cell.pathway_topology_analysis: analyze pathway-to-cell and pathway-to-pathway spatial topology.compute_cophenetic_distances_from_df_memory_opt: memory-aware alternative for large tables.plot_circular_dendrogram_pycirclize: circular dendrogram visualization.
- The package code used for the manuscript is in
src/sfplot/. - Manuscript-facing notebooks and generated figure assets are kept in
sfplot-manuscript/. - Raw experimental datasets are not bundled in this repository because of size and distribution constraints. The code expects standard spatial omics outputs such as Xenium folders or tabular coordinate inputs.
- When an
sfplot_tbc_formal_wta/results-style directory is already available, the LR and pathway topology extensions are designed to reuse itst_and_c_result_*.csvandStructureMap_table_*.csvoutputs as the preferred gene-level topology anchors before falling back to recomputation. load_xenium_datadepends on a compatiblespatialdata_io/spatialdata/ome_zarr/zarrstack. For reviewer reproduction on environments where that stack is mismatched, preferload_xenium_table_bundle.- A short repository walkthrough is available in REVIEWER_GUIDE.md.
Sphinx documentation sources are available in docs/.
If you use this repository in connection with manuscript review, please cite the associated manuscript:
Cophenetic Spatial Topology Embedding reveals multiscale tissue architecture in spatial omics
This project is released under the MIT License. See LICENSE.