1 Introduction
Glioblastoma is a highly aggressive brain cancer characterized by the complexity of tissue microstructure and vasculature within the tumor. Previous research shows that multiple intratumoral subregions exist and show distinct treatment responses. This intratumoral heterogeneity, which leads to discrepancies in tumor composition among patients, poses significant challenges to individualized treatment and outcome prediction [1].
Magnetic resonance imaging (MRI) is the standard tool for disease management. As an emerging technique, radiomics can extract useful features from MRI for diagnosis and outcome prediction using machine learning (ML) techniques [2]. However, the robustness of radiomics is challenged by the instability of ML algorithms. Further, the clinical interpretability of radiomics needs to be improved for clinical translation.
In this paper, we propose a semiautomatic ML pipeline to identify tumor subregions and predict patient survival using physiological MRI, i.e., perfusion MRI (pMRI) and diffusion MRI (dMRI), which provide quantitative measures of different aspects of tumor biology. Potential drawbacks of the implemented unsupervised clustering, such as instability and unreliability, were addressed by incorporating a mathematical stability measure and an autoencoder neural network. Moreover, a Bayesian optimization framework was introduced to both speed up the fine-tuning process and improve the robustness of the clustering algorithm. To further enhance interpretability, we designed a clinically sensible feature set that measures the interrelation among the coexisting subregions for survival prediction.
2 Problem formulation
We used the following quantitative maps, calculated from the pMRI and dMRI that have been co-registered to the post-contrast T1-weighted images: $v = (x, y, z)$ describes the 3D coordinate of each voxel; $r(v)$ represents the relative cerebral blood volume (rCBV), calculated from pMRI; and $p(v)$, $q(v)$ denote the isotropic and anisotropic components of dMRI, respectively. The MRI of the $i$-th patient can thus be denoted as $X_i = \{(r(v), p(v), q(v))\}_v$. Additionally, the patient survival $y_i$ can be collected along with $X_i$.
Given the training data $\{X_i\}$ and its corresponding machine learning (ML) 'feature' representation, one may apply various ML algorithms to train a prediction model. However, this can be problematic from the clinical perspective due to the model's unstable and unreliable black-box nature, e.g., small changes in the training data may result in significant differences in both the model and its predictions.
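As a minimal sketch of how the per-patient input could be assembled, the snippet below stacks three co-registered maps into a voxel-by-feature matrix; the function name, array layout, and binary tumor mask are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def voxel_feature_matrix(rcbv, iso, aniso, mask):
    """Stack co-registered 3-D maps (rCBV, isotropic and anisotropic dMRI
    components) into an (n_voxels, 3) feature matrix over voxels inside
    the binary tumor mask."""
    # boolean indexing flattens each map to the masked voxels, in the same order
    return np.stack([rcbv[mask], iso[mask], aniso[mask]], axis=1)
```

Pooling these matrices over all patients yields the data set that the unsupervised clustering in Section 2.1 operates on.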
2.1 Subregion segmentation via unsupervised learning
Unsupervised clustering algorithms have been successfully applied to segment clinically interpretable tumor subregions [3]. In this paper, partition-based clustering (K-means) and model-based clustering (the Gaussian mixture model, GMM) [4] were adopted for subregion clustering. Instead of fixing the K-means and GMM hyperparameters (e.g., the cluster number $k$), we treated them as unknown variables to be optimized. The goal was to cluster the data set pooled over patients and obtain $k$ clusters (subregions).
2.1.1 Cluster stability
An obvious drawback of generic clustering algorithms lies in their instability: the variance of clustering results over repeated trials can be significant. To offset this uncertainty, we evaluated cluster stability by measuring pairwise cluster distances [5, 6]. The stability of a clustering algorithm under a certain choice of hyperparameters can be quantitatively assessed and therefore easily integrated as a loss function to assist hyperparameter optimization. Specifically, given only one data set $X$
, we considered a Hamming distance (see [5] for details) to compute the distance between two clusterings $C$ and $C'$:

$$d(C, C') = \min_{\pi} \frac{1}{n} \sum_{v=1}^{n} \delta\big(\pi(C(v)) \neq C'(v)\big) \qquad (1)$$

where $d(\cdot,\cdot)$ denotes the distance function, $n$ is the total number of voxels, $\delta(\cdot)$ represents the Dirac delta function [7] that returns 1 when the inequality condition is satisfied and 0 otherwise, and $\pi$ denotes the repeated permutations of the cluster labels of $X$ that guarantee the generalization of the stability measure [8]. Eventually, the expected distance between two clusterings over a predefined number of permutations of $X$ was adopted as a stability score $s$, normalized to $[0, 1]$, with lower values indicating more stable clusterings. See Algorithm 1 for pseudocode.
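The pairwise distance in Equation (1) can be sketched as follows; this is a toy NumPy version (brute-force over label permutations, so only practical for small $k$), not the paper's Algorithm 1. Averaging this distance over repeated clustering runs gives the stability score $s$:

```python
import numpy as np
from itertools import permutations

def hamming_clustering_distance(c1, c2, k):
    """Fraction of voxels on which two clusterings disagree, minimised
    over relabelings of c1 (cluster labels are arbitrary)."""
    best = 1.0
    for perm in permutations(range(k)):
        # relabel c1 under this permutation and count mismatches with c2
        relabeled = np.array([perm[label] for label in c1])
        best = min(best, float(np.mean(relabeled != np.asarray(c2))))
    return best  # in [0, 1]; 0 means identical partitions
```

For example, the partitions `[0,0,1,1]` and `[1,1,0,0]` are identical up to relabeling, so their distance is 0.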
2.1.2 Autoencoder for state-space conversion
In addition to the stability analysis, an autoencoder that enables the conversion from a $d$-dimensional space to an $h$-dimensional one was introduced to facilitate reliable clustering. An autoencoder is an artificial neural network (ANN) that aims to learn a representation of a set of input data in an unsupervised manner [9]. A typical autoencoder has three neural-network layers: the input layer, the hidden layer, and the output layer [10]. The node number $h$ in the hidden layer represents the new dimensionality after the conversion, and the number of nodes is identical in the input and output layers. We assume both $k$ and $h$ are part of the unknown variable set to be optimized.
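A minimal sketch of such a three-layer autoencoder, written in plain NumPy with a tanh hidden layer and squared reconstruction error, is shown below; the architecture details (activation, optimizer, learning rate) are assumptions for illustration, not the paper's exact network:

```python
import numpy as np

def train_autoencoder(X, h, epochs=1500, lr=0.05, seed=0):
    """Single-hidden-layer autoencoder (d -> h -> d) trained by plain
    gradient descent on mean squared reconstruction error.
    Returns the h-dimensional codes and the final reconstruction MSE."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, h)); b1 = np.zeros(h)   # encoder weights
    W2 = rng.normal(0, 0.1, (h, d)); b2 = np.zeros(d)   # decoder weights
    for _ in range(epochs):
        Z = np.tanh(X @ W1 + b1)      # encode: new h-dimensional state
        Xhat = Z @ W2 + b2            # decode: reconstruction in d dimensions
        err = Xhat - X
        # backpropagate the squared-error gradient through both layers
        gW2 = Z.T @ err / n; gb2 = err.mean(0)
        dZ = (err @ W2.T) * (1 - Z**2)
        gW1 = X.T @ dZ / n; gb1 = dZ.mean(0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    Z = np.tanh(X @ W1 + b1)
    return Z, float(np.mean((Z @ W2 + b2 - X) ** 2))
```

The hidden codes `Z` then replace the raw voxel features as the clustering input.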
In this paper, we also included $q$, the quantile threshold that distinguishes outlier data points from the majority, as an unknown variable. Thus we have the hyperparameter set $\theta = (k, h, q)$. Hyperparameter tuning of $\theta$ is nontrivial due to its enormous joint search space. In practice, tuning ML algorithms also requires machine-learning experience, which may be infeasible for clinical experts. Moreover, partial derivatives and gradients are unlikely to exist over this type of hyperparameter search space, which makes automatic tuning a challenging task. Therefore, we introduced Bayesian optimization (BO), a sequential optimization technique based on Bayes' theorem and Gaussian processes (GP), to learn a nonparametric representation of the underlying black box.
2.2 Bayesian optimization
BO aims to approximate the search-space contour of $\theta$ by casting the covariance matrix of the GP in light of the data. It adopts an exploration-exploitation scheme to find the most probable candidate $\theta$ for surrogate-function evaluation. As a result, an efficient search for suboptimal hyperparameter solutions becomes feasible under the BO framework; see [11] for details.
Formally, the proposed clustering process can be considered a black-box system with input $\theta$ and output $s$, where $s$ is the stability score. Thus the training data of BO is $D = \{(\theta_j, s_j)\}_{j=1}^{m}$, where $m$ is the number of data points and each $s_j$ can be computed by Equation (1). BO aims to minimize the (surrogate) function mapping $f: \Theta \to S$, where $\Theta$ and $S$ denote the input and output space, respectively. We defined the GP as $f \sim \mathcal{GP}(\mu, K)$, where $\mu$ is the mean function vector and $K$ is a covariance matrix composed of a predefined kernel function evaluated over the data points. A powerful GP requires a carefully designed kernel function and associated hyperparameters; see [12] for an overview of GPs and kernel functions. In this paper, we adopted the Matern 5/2 kernel to compose the covariance matrix $K$:
$$k(\theta, \theta') = \sigma_f^2 \left(1 + \sqrt{5}\,r + \frac{5}{3} r^2\right) \exp\!\left(-\sqrt{5}\,r\right) \qquad (2)$$

where $r^2 = \sum_{d=1}^{D} (\theta_d - \theta'_d)^2 / \ell_d^2$, $\sigma_f$ denotes the signal standard deviation, $D$ is the dimension of $\theta$, and $\ell_d$ describes the per-dimension length scale.
BO introduces a so-called acquisition function to suggest the next candidate $\theta$ to be evaluated by $f$. There are typically three types of acquisition functions: probability of improvement (PI), expected improvement (EI), and lower confidence bound (LCB) [11]. This paper employed the EI strategy, which searches for new candidates that maximize the expected improvement over the current best sample. Suppose $s^{+} = \min_j s_j$ is the best value so far; since lower stability scores are better, EI searches for a new $\theta$ that maximizes the expected value of $\max(0, s^{+} - f(\theta))$. With posterior mean $\mu(\theta)$, posterior standard deviation $\sigma(\theta)$, and $z = (s^{+} - \mu(\theta)) / \sigma(\theta)$, the EI acquisition function can be expressed as a function of $z$:

$$\mathrm{EI}(\theta) = \big(s^{+} - \mu(\theta)\big)\,\Phi(z) + \sigma(\theta)\,\phi(z) \qquad (3)$$

where $\Phi$ denotes the CDF of the standard normal distribution and $\phi$ its PDF. To conclude, BO constructs a surrogate model described by $\mu$ and $K$ with the objective of minimizing the stability score $s$.
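The two ingredients above, the Matern 5/2 kernel and the EI criterion for a minimization problem, can be sketched directly from their formulas; this is an illustrative stdlib/NumPy version, not the paper's BO implementation:

```python
import math
import numpy as np

def matern52(t1, t2, sigma_f=1.0, ell=1.0):
    """Matern 5/2 kernel between two hyperparameter vectors (Eq. (2)),
    with a shared length scale ell for simplicity."""
    r = np.linalg.norm((np.asarray(t1, float) - np.asarray(t2, float)) / ell)
    return sigma_f**2 * (1 + math.sqrt(5) * r + 5 * r**2 / 3) * math.exp(-math.sqrt(5) * r)

def expected_improvement(mu, sigma, s_best):
    """EI (Eq. (3)) for *minimising* the stability score: improvement is
    any predicted value below the current best s_best."""
    if sigma == 0.0:
        return max(s_best - mu, 0.0)
    z = (s_best - mu) / sigma
    Phi = 0.5 * (1 + math.erf(z / math.sqrt(2)))        # standard normal CDF
    phi = math.exp(-z**2 / 2) / math.sqrt(2 * math.pi)  # standard normal PDF
    return (s_best - mu) * Phi + sigma * phi
```

Note that EI is always non-negative, and at identical inputs the kernel returns the signal variance $\sigma_f^2$, as expected.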
2.3 Clinical survival prediction
Once stable subregions were identified by clustering, we continued to characterize each subregion and their global distribution using radiomic features. To extract clinically meaningful features for each patient, we measured spatial radiomic features based on the clusters.
In previous studies, radiomic features were extracted from grey-level images [13], including first-order statistics and second-order spatial-distribution features. In particular, the Haralick texture features obtained from the grey-level co-occurrence matrix (GLCM) are widely used. However, features designed for grey-level images may not be suitable for tumor subregion analysis.
We instead proposed two types of features, namely the multi-regional co-occurrence matrix (MRCM) and the multi-regional run-length matrix (MRRLM). Specifically, MRCM summarizes the subregional volumes and their coexistence patterns, while MRRLM describes the distribution pattern of multiple subregions that differ in size. In this paper, 10 features were extracted, including subregion proportion, joint energy, entropy, an informational measure of correlation, category diversity, short-run emphasis (SRE), long-run emphasis (LRE), run-length non-uniformity, run variance, and run entropy. Eventually, a sample-clustering algorithm, along with survival analysis, was applied to identify higher-risk and lower-risk patient subgroups. Here we present the complete framework and corresponding algorithm for any glioblastoma cohort, as Algorithm 2 describes.
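To make the MRCM idea concrete, the toy sketch below computes three of the listed features (subregion proportion, joint energy, and joint entropy) from a 2-D subregion label map, using horizontally adjacent voxel pairs as the co-occurrence rule; the exact neighborhood definition and normalization in the paper may differ:

```python
import numpy as np

def mrcm_features(labels, k):
    """Toy MRCM-style features from a 2-D subregion label map:
    per-subregion proportions plus joint energy and joint entropy of the
    co-occurrence matrix of horizontally adjacent label pairs."""
    labels = np.asarray(labels)
    proportions = np.bincount(labels.ravel(), minlength=k) / labels.size
    # count co-occurrences of horizontally adjacent subregion labels
    M = np.zeros((k, k))
    for a, b in zip(labels[:, :-1].ravel(), labels[:, 1:].ravel()):
        M[a, b] += 1
    P = M / M.sum()                         # normalise to a joint distribution
    joint_energy = float((P**2).sum())      # sum of squared probabilities
    nz = P[P > 0]
    joint_entropy = float(-(nz * np.log2(nz)).sum())
    return proportions, joint_energy, joint_entropy
```

For a map split evenly into two blocky subregions, the proportions are 0.5 each, and the co-occurrence matrix is dominated by same-label pairs, reflecting spatial compactness.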
3 Results
Three unsupervised learning algorithms were compared under the proposed framework: K-means, GMM, and autoencoder-enhanced K-means (AE-K-means) with a quantile threshold. Figure 1(a) shows the stability performance of K-means clustering with hyperparameters tuned by the optimal (minimum) point of the GP surrogate model from BO. A lower stability score (SS) indicates better clustering stability. As shown in Figure 1(a), the SS of the manually tuned initial points scatters over a wide range, while the ground truth learned by the GP model fluctuates within a small range, as the red crosses show. This illustrates that the GP model can effectively learn the underlying black-box process to minimize the SS given several initial points and a few BO steps. A clear decreasing trend of the GP-estimated minimum can be observed over BO steps, and the bar chart further indicates a decreasing gap between the SS (in normalized form) of the GP model and that of the actual clustering process.
Thus, with a few BO exploration steps, the GP surrogate model can determine suboptimal hyperparameters for a given algorithm. Furthermore, we compared the ground truth under the GP-suggested hyperparameters for the three algorithms in Figure 1(c), demonstrating the effectiveness of the autoencoder in facilitating reliable clustering via dimensionality conversion.
Fig. 2 shows that two patient subgroups were identified with distinct survival probabilities, which validates the effectiveness of the proposed framework. Fig. 3 shows the subregions identified in two case examples, where different colors represent the different subregions identified in each patient. Intuitively, these subregions overlap substantially with the proliferating, necrotic, and edematous tumor areas, respectively.
Limitation: it is challenging to guarantee that the BO-optimized solution is globally optimal: (a) the GP first needs to converge, which requires a number of iterations; (b) even when the GP has converged, it is only one possible representation of the clustering process, which may not be absolutely optimal.
In addition, based on the acquired subregion images, we deployed the methods in framework (b) to extract the devised clinical features and analyzed those features with survival analysis. The results are shown in Fig. 4: the spatial feature (Group 1) compartment showed a significant result (P = 0.0085) in distinguishing patients into higher-risk and lower-risk subgroups.
4 Conclusion
This paper is an interdisciplinary work that introduces an explainable framework to identify stable intratumoral subregions while predicting patient survival. A Bayesian optimization technique was applied to learn the black box of the unsupervised learning process, with the clustering stability score as the objective function. The global minimum of the trained Gaussian process surrogate model was then used as the suboptimal hyperparameter solution for reliable clustering. Based on the subregional MRI features, higher-risk and lower-risk patient subgroups could be derived, which validates the utility of the proposed framework and shows potential for future precision treatment.
References
 [1] Chao Li, Shuo Wang, Pan Liu, Turid Torheim, Natalie R Boonzaier, Bart RJ van Dijken, CarolaBibiane Schönlieb, Florian Markowetz, and Stephen J Price, “Decoding the interdependence of multiparametric magnetic resonance imaging to reveal patient subgroups correlated with survivals,” Neoplasia, vol. 21, no. 5, pp. 442–449, 2019.
 [2] E Sala, E Mema, Y Himoto, H Veeraraghavan, JD Brenton, A Snyder, B Weigelt, and HA Vargas, “Unravelling tumour heterogeneity using nextgeneration imaging: radiomics, radiogenomics, and habitat imaging,” Clinical radiology, vol. 72, no. 1, pp. 3–10, 2017.

 [3] M Angulakshmi and GG Lakshmi Priya, “Brain tumour segmentation from mri using superpixels based spectral clustering,” Journal of King Saud University - Computer and Information Sciences, 2018.
 [4] Eva Patel and Dharmender Singh Kushwaha, “Clustering cloud workloads: Kmeans vs gaussian mixture model,” Procedia Computer Science, vol. 171, pp. 158–167, 2020.
 [5] Ulrike Von Luxburg, Clustering stability: an overview, Now Publishers Inc, 2010.
 [6] Marina Meilă, “Comparing clusterings by the variation of information,” in Learning theory and kernel machines, pp. 173–187. Springer, 2003.
 [7] Lin Zhang, “Dirac delta function of matrix argument,” arXiv preprint arXiv:1607.02871, 2016.
 [8] Tilman Lange, Volker Roth, Mikio L Braun, and Joachim M Buhmann, “Stabilitybased validation of clustering solutions,” Neural computation, vol. 16, no. 6, pp. 1299–1323, 2004.
 [9] Geoffrey E Hinton and Ruslan R Salakhutdinov, “Reducing the dimensionality of data with neural networks,” science, vol. 313, no. 5786, pp. 504–507, 2006.

 [10] Mark A Kramer, “Nonlinear principal component analysis using autoassociative neural networks,” AIChE journal, vol. 37, no. 2, pp. 233–243, 1991.
 [11] Jasper Snoek, Hugo Larochelle, and Ryan P Adams, “Practical bayesian optimization of machine learning algorithms,” in Advances in neural information processing systems, 2012, pp. 2951–2959.
 [12] Eric Brochu, Vlad M Cora, and Nando De Freitas, “A tutorial on bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning,” arXiv preprint arXiv:1012.2599, 2010.
 [13] Joonsang Lee, Rajan Jain, Kamal Khalil, Brent Griffith, Ryan Bosca, Ganesh Rao, and Arvind Rao, “Texture feature ratios from relative cbv maps of perfusion mri are associated with patient survival in glioblastoma,” American Journal of Neuroradiology, vol. 37, no. 1, pp. 37–43, 2016.