Document Type

Article

Publication Date

8-2-2025

Comments

This article is the author’s final published version in Bioinformatics, Volume 41, Issue 8, 2025, Article number btaf430.

The published version is available at https://doi.org/10.1093/bioinformatics/btaf430. Copyright © The Author(s) 2025.

Abstract

MOTIVATION: Evaluation of single-cell protein expression from immunohistochemistry images is increasingly used in biomedical research. Many proteins are used solely for phenotyping cells in the tumor microenvironment. Other proteins, whose expression levels are quantitatively meaningful, serve as so-called functional protein biomarkers. However, few methods and software tools are available for utilizing the entire distribution of single-cell expression levels.

RESULTS: We present the R package hyper.gam, providing a supervised learning framework for deriving biomarkers based on single-cell distribution quantiles. The single-cell data are first converted into sample quantile functions, which are then used as predictors in scalar-on-function regression models to estimate the integrand surface. The estimated integrand surface defines the quantile index predictors based on the single-cell expression levels in a new test set. The package features a user-friendly interface and visual tools for exploring the estimated integrand surfaces. Our tools are motivated by the need for biomarkers that take into account heterogeneous protein expression levels in a tissue, but they can be applied to other types of single-cell data.
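
The two data-side steps described above (turning each sample's single-cell values into a sample quantile function, then collapsing that function into a scalar quantile index by integrating over an integrand surface) can be sketched as follows. This is a minimal illustration in Python, not the hyper.gam interface: the function names `sample_quantiles` and `quantile_index`, the surface `example_surface`, and the expression values are all hypothetical, and the surface is simply assumed to be given here, whereas the package estimates it with a scalar-on-function regression model.

```python
# Illustrative sketch only; none of these names come from the hyper.gam package.

def sample_quantiles(values, probs):
    """Empirical quantiles, interpolating linearly between order statistics."""
    xs = sorted(values)
    n = len(xs)
    out = []
    for p in probs:
        h = (n - 1) * p              # fractional position in the sorted sample
        lo = int(h)
        hi = min(lo + 1, n - 1)
        out.append(xs[lo] + (h - lo) * (xs[hi] - xs[lo]))
    return out


def quantile_index(values, integrand, probs):
    """Trapezoidal approximation of the integral of integrand(p, Q(p)) over p."""
    q = sample_quantiles(values, probs)
    f = [integrand(p, qp) for p, qp in zip(probs, q)]
    total = 0.0
    for i in range(1, len(probs)):
        total += 0.5 * (f[i] + f[i - 1]) * (probs[i] - probs[i - 1])
    return total


# A common probability grid, so every sample yields a curve of equal length
# regardless of how many cells it contains.
probs = [i / 100 for i in range(1, 100)]


# Hypothetical integrand surface (assumption): weights upper quantiles more.
def example_surface(p, q):
    return p * q


# Made-up expression levels for the cells of one sample.
cells = [0.2, 1.5, 0.7, 3.1, 0.9, 2.4, 1.1]
index = quantile_index(cells, example_surface, probs)
```

Evaluating every sample on the same probability grid is what allows samples with different cell counts to enter one regression model as predictors of equal length.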

AVAILABILITY AND IMPLEMENTATION: R package hyper.gam and vignette are available at https://CRAN.R-project.org/package=hyper.gam and https://CRAN.R-project.org/package=hyper.gam/vignettes/applications.html.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Language

English
