Some 90 percent of patients who die from cancer die not from the initial tumor but from secondary sites, or metastases. Thus, being able to predict where cancerous cells will travel through a patient’s bloodstream would help clinicians stay a step ahead of the disease. However, modeling how tumor cells interact with red blood cells (RBCs) in a patient’s cardiovascular system raises complex challenges, the first of which involves sampling.
Why is sampling necessary?
The locations of RBCs can change the trajectory of a cancer cell. A single simulation of cellular interactions may therefore not be representative of the cell’s true trajectory, since its results would depend on one particular arrangement of RBCs. Rather, you need to conduct simulations with a wide variety of red-blood-cell distributions to derive accurate results.
Running simulations for every possible distribution of RBCs would be impossible—some of our largest models to date have had more than 500 million cells in them. Capturing every possible orientation of all of those cells is so far intractable. Taking the average of a sample of distributions also wouldn’t work, especially in the case of cancer, where an outlier RBC could be the one that leads to a secondary tumor.
Thus, researchers must rely on smart metrics to take an adequate sample to generate valid conclusions. The key question becomes, Which metrics should we use to select distributions of RBCs to create an accurate sampling? Our team provides answers to this question in a recent conference paper.
Metrics for sample selection
In the course of our work, we found several metrics that helped select effective samples of RBC distributions. These metrics cut the number of necessary simulations by an order of magnitude while leading to accurate and valid results. Rather than running a thousand simulations for a particular RBC count and geometry, we showed that only about 72 simulations may be necessary.
Our first identified metric was the Jaccard index, which is a way of measuring how much two distributions overlap, and thus how much they deviate from each other. If your sample relies on configurations that are almost identical, then the result will be biased. Using the Jaccard index ensures that you get a wide range of sufficiently distinct layouts. We are currently working on a paper that explains how best to use the Jaccard index in sample selection.
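To make the idea concrete, here is a minimal sketch of how a Jaccard index could be computed for two cell layouts. It assumes, purely for illustration, that each layout is reduced to the set of grid voxels containing a cell center; the function names, the voxel representation, and the similarity threshold are hypothetical and are not taken from our paper.

```python
import numpy as np

def occupied_voxels(centers, voxel_size):
    """Map cell centers (N x 3) to the set of grid voxels they fall in.

    Assumption for illustration: a layout is summarized by which voxels
    contain at least one cell center.
    """
    idx = np.floor(np.asarray(centers, dtype=float) / voxel_size).astype(int)
    return {tuple(row) for row in idx.tolist()}

def jaccard_index(a, b):
    """Return |A intersect B| / |A union B|: 1.0 for identical sets, 0.0 for disjoint."""
    union = a | b
    if not union:
        return 1.0  # two empty layouts are treated as identical
    return len(a & b) / len(union)

def select_distinct(layouts, max_similarity):
    """Greedily keep layouts whose Jaccard index with every kept layout
    stays below max_similarity (a hypothetical threshold)."""
    kept = []
    for layout in layouts:
        if all(jaccard_index(layout, k) < max_similarity for k in kept):
            kept.append(layout)
    return kept
```

A greedy filter like select_distinct is only one possible way to use the index: it discards any layout that is too similar to one already chosen, so near-duplicates do not dominate the sample.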
Another metric, called the radial distribution function, makes sure researchers haven’t inadvertently introduced patterns into the distributions. Any structure you’ve added needs to be washed out. In other words, this metric ensures that your attempt to be random really is random.
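As a rough illustration, the radial distribution function g(r) compares how many pairs of points sit a distance r apart with how many you would expect if the points were placed uniformly at random. The sketch below assumes point-like cell centers in a periodic cubic box, which is a simplification chosen for this example rather than the setup of our simulations; the function name and parameters are hypothetical. Under those assumptions, g(r) close to 1 at all distances suggests no leftover structure, while peaks or gaps point to a pattern.

```python
import numpy as np

def radial_distribution(points, box_length, n_bins=20, r_max=None):
    """Estimate g(r) for points in a periodic cubic box of side box_length.

    Uses all pairwise distances, so memory grows as N^2; this is meant
    for small illustrative samples only.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    if r_max is None:
        r_max = box_length / 2.0  # largest radius valid with minimum image

    # Pairwise displacements, wrapped to the nearest periodic image.
    delta = points[:, None, :] - points[None, :, :]
    delta -= box_length * np.round(delta / box_length)
    dist = np.sqrt((delta ** 2).sum(axis=-1))

    # Count each unordered pair once.
    pair_dist = dist[np.triu_indices(n, k=1)]
    counts, edges = np.histogram(pair_dist, bins=n_bins, range=(0.0, r_max))

    # Expected pair counts per spherical shell for uniformly random points.
    shell_volume = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    expected = 0.5 * n * (n - 1) * shell_volume / box_length ** 3

    centers = 0.5 * (edges[1:] + edges[:-1])
    return centers, counts / expected
```

For real cells, which have finite size, g(r) would drop toward zero at separations smaller than a cell diameter, so the comparison with 1 is only meaningful beyond that range.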
The bigger implications
While this work has important implications for understanding the factors that influence cell transport in the body, which can be used to guide drug design or therapy, its reach is broader. These findings describe metrics that can distinguish between distributions in any granular material, from scaffolds for wound healing to concrete for buildings.
When working with granular materials, accounting for the full ensemble of potential configurations is a daunting task. Quantitative metrics are needed to compare these configurations and drive sampling to ensure all options are adequately covered.