Introduction: Immune cells play a prominent role in keeping tumors suppressed, but how the distribution of these cells within a tumor’s microenvironment affects clinical outcome remains poorly understood. The long-term goal of this project is to study how the spatial distributions of different immune cells are associated with clinical outcome. The first objective is to develop an algorithm for identifying different types of immune cells.

Methods: The data motivating this project include spatial localization information (x-y coordinates) and expression levels of immune cell CD markers quantified by immunofluorescence immunohistochemistry (IF-IHC) in ~1,500 cases of invasive breast cancer. Treating CD marker expression in cancer cells as background noise, we compute an upper nonparametric tolerance limit for that expression. Stroma cells with CD expression above this tolerance limit are classified as immune cells of the corresponding CD marker type.
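The thresholding step described in the Methods can be sketched in Python as follows. This is a minimal illustration, not the project's actual program: it assumes the standard order-statistic construction of a one-sided nonparametric upper tolerance limit, and the function names and the (x, y, expression) tuple layout are hypothetical.

```python
import math


def upper_tolerance_rank(n, content=0.99, confidence=0.95):
    """Return the 1-based rank k of the order statistic used as a one-sided
    upper tolerance limit, or None if the sample is too small.

    k is the smallest rank with P(Binomial(n, content) <= k - 1) >= confidence.
    """
    log_p = math.log(content)
    log_q = math.log1p(-content)
    cdf = 0.0
    for i in range(n):  # i plays the role of k - 1
        # Binomial pmf at i, computed in log space so large n does not overflow.
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1)
                   - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        cdf += math.exp(log_pmf)
        if cdf >= confidence:
            return i + 1
    return None


def upper_tolerance_limit(background, content=0.99, confidence=0.95):
    """Upper tolerance limit from background (cancer-cell) expression values."""
    values = sorted(background)
    k = upper_tolerance_rank(len(values), content, confidence)
    if k is None:
        raise ValueError("too few background cells for the requested limit")
    return values[k - 1]


def classify_immune(stroma_cells, limit):
    """Keep coordinates of stroma cells whose expression exceeds the limit.

    stroma_cells is assumed to be an iterable of (x, y, expression) tuples.
    """
    return [(x, y) for x, y, expression in stroma_cells if expression > limit]
```

Under this construction, a 95%-confidence, 99%-content limit requires at least 299 background cells; with fewer, no order statistic qualifies and the sketch raises an error rather than returning a threshold.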

Results: We have developed a Python program that quickly processes a dataset of x-y coordinates of cells that took up IHC stain and creates a dataset of the coordinates of cells identified as true immune cells. We additionally evaluated multiple parameter choices for the tolerance interval and concluded that a combination of 95% confidence and 99% content leaves only a minuscule chance of including stroma cells that are not immune while retaining enough data for analysis. Exploratory analysis of the spatial point patterns of the identified immune cell populations and their association with progression-free survival is in progress.

Discussion: We have developed an algorithm for identifying different types of immune cells based on type-specific CD markers quantified using IF-IHC. This tool enables further study of the spatial arrangement of immune cells in tumor tissue and of its relationship to clinical outcome.