Summary
Nuclear magnetic resonance (NMR) relaxation responses in porous media provide a sensitive probe of the microstructure and yet are influenced by a number of factors which are not easily detangled. Low-field transverse relaxation measurements can be carried out quickly and are frequently used as pore size distributions, while adding longitudinal relaxation measurements provides additional insights into surface properties and fluid content. Here we present an inverse solution workflow extracting related intrinsic physical parameters of the system by fitting experiment and numerical simulation(s). An efficient NMR forward solver for the simultaneous calculation of and responses is introduced, which honors existing inequality relationships between and parameters. We propose a Bayesian optimization approach that jointly identifies - and -related properties satisfying physical constraints by simultaneously fitting and experiments to simulations. This dual-task inverse solution workflow (DT-ISW) identifies the solution by minimizing the sum of the L2 norm of the fitting residuals of both and distributions into a single objective and jointly models the two highly correlated objectives with high efficiency using the vector-valued Gaussian process (GP) kernel for transfer learning. A multimodal search strategy is used to identify nonunique solution sets of the problem. The workflow is demonstrated on Bentheimer sandstone, identifying five intrinsic physical parameters. The performance of the joint DT-ISW (DT-ISW-J) is compared to a sequential DT-ISW (DT-ISW-S) approach as well as an independent single-task ISW (ST-ISW) of the and responses. Both dual-task versions converge more than two times faster than the single-task version. DT-ISW-J equally minimizes the L2 norm of and fitting residuals whereas DT-ISW-S only preferentially minimizes the objective assigned higher importance. A Pareto optimal solution (POS) is provided to allow operators to subjectively balance the preference of and data fits for the slightly conflicting objectives. The ability to extract five intrinsic physical parameters simultaneously provides new techniques for tracking wettability alteration and assessing the influence of clay amount and distribution on petrophysical property estimates.
Introduction
NMR relaxation and diffusion measurements as probe of the microstructure of porous media have a long history. Focusing on relaxation responses, this includes the estimation of pore size distributions (Prammer 1994; Kenyon 1997), (pseudo) capillary pressure curves (Marschall et al. 1995; Slijkerman et al. 2001), wettability index (Fleury and Deflandre 2003; Looyestijn and Hofman 2005; Chen et al. 2006; Al-Garadi et al. 2022), and permeability (Timur 1968; Straley et al. 1987; Banavar and Schwartz 1987; Sen et al. 1990; Borgia et al. 1998; Dunn et al. 1999). Length scale interpretations of NMR relaxation responses from saturated rocks are based on the assumption of the fast diffusion limit (Brownstein and Tarr 1979) (i.e., there is a direct correspondence between a specific relaxation time component and pore size). The assumption is valid if a range of conditions are fulfilled, including fast diffusion and diffusionally isolated pores, insignificant internal gradients (absence of paramagnetic impurities), and uniform mineralogy. Under those conditions, the relationship between relaxation time and pore size may be described by a mono-exponential decay with relaxation rate (Torrey 1956)
where T2 and are transverse relaxation time and bulk transverse relaxation time, respectively; is transverse surface relaxivity; and are surface area and volume of an individual pore, respectively; is effective diffusion coefficient; is the gyromagnetic ratio; is the average or effective internal gradient; and is echo time. Summing over all pores results in the magnetization decay of the sample. Assuming a pore shape (e.g., spherical, and constant surface relaxivity), the magnetization decay can then be converted to a pore size. Practically, the prediction of pore sizes requires a calibration of field data to NMR and mercury intrusion porosimetry laboratory measurements using surface relaxivity as the constant conversion parameter. In literature, such a parameter is often (and correctly) called an effective surface relaxivity (Chen et al. 2006) or apparent surface relaxivity (Müller-Petke et al. 2015; Costabel et al. 2018) to emphasize the averaged nature of the parameter (Marschall et al. 1995). Depending on the method utilized for estimation of the surface relaxivity (e.g., using surface area measurements by gas adsorption analysis, accounting for surface roughness or not, utilizing short time limit restricted diffusion, by Bayesian magnetic resonance derived pore size distribution or derived by utilizing nonground eigenstates following Brownstein-Tarr theory), the literature-reported surface relaxivity values vary by an order of magnitude. For the benchmark rock we use in this work, Bentheimer sandstone, the reported values of spread from 8 μm/s (Holland et al. 2013) to 26 μm/s (Afrough et al. 2019).
The application of the concept of constant effective surface relaxivity to realistic heterogeneous rocks leads to errors of various magnitude. Toumelin et al. (2003b) demonstrated possible errors approaching 50% in estimating immobile fluid fraction due to unaccounted diffusional coupling, Arns et al. (2006) reported a 30% spread of permeability estimates due to relaxivity heterogeneity, while Benavides et al. (2017) showed a very significant narrowing of the predicted pore size distribution (twice as narrow compared to a known reference on a standard log scale) if a uniform relaxivity value is utilized. Approaching this problem by introducing computational methods utilizing intrinsic phase-specific surface properties may allow one to quantify those uncertainties and in general enhance the interpretation of NMR relaxation responses.
Additional information about fluid content or compartmentalization of the pore space can be gained from T1-T2 correlation experiments (Song et al. 2002; Tinni et al. 2014). Pseudo T1-T2 correlations may be derived by utilizing independent T1 and T2 measurements (Williamson et al. 2018). As mentioned above, it is well known that NMR relaxation of fluids in porous media is very sensitive to the wetting properties of solid phases. The close relationship between affinity and surface relaxivity was utilized to develop an NMR wettability index under the assumption of uniform wetting properties. The concept was extended for the more practical case of heterogeneous wettability of rock [e.g., we demonstrated in Shikhov et al. (2018) the approach to determine fractional wetting properties of grains surfaces using a combination of transverse relaxation techniques, though the reference data for the rock of interest at two end wetting states in either case is required, limiting applications to laboratory core analysis]. Approaches relying on both types of relaxation (T1 and T2) may not require such calibration, allowing estimates of arbitrary wet rocks at field conditions (e.g., well logs). Korb et al. (2018) utilized a combination of 2D T1-T2 and T1 field-dependent relaxation (dispersion) measurements to probe the wetting properties of fluids confined in the pore space of rocks without prior knowledge of endpoint wettability state responses. Gizatullin et al. (2019) practically demonstrated a high sensitivity of certain types of aromatic compounds to wetting conditions of sandstones and carbonates for T1 dispersion measurements. This suggests that future NMR methods of formation wettability characterization from well logs are likely to be based on a combination of T2 and T1 field-dependent measurements. The interpretation of such measurements would require an ability to separate the responses from internal gradient effects when present and efficient extraction of multiple longitudinal and transverse surface relaxivity values and . The current work presents a significant step toward this goal.
Numerical simulations based on digitized materials offer a way to test such methods and potentially overcome some of their limitations. Basic random walk methods on digitized media were introduced in several studies (Mendelson 1990; Wilkinson et al. 1991; Bergman et al. 1995). Straley and Schwartz (1996) compared experiments on glass beads to numerical simulations with reasonable accuracy. Hidajat et al. (2002) introduced a first-passage random walk method for NMR responses and correlated T2 log mean relaxation time to lattice Boltzmann calculations of permeability for synthetic random structures. Øren et al. (2002) modeled the NMR T2 response of Fontainebleau sandstone utilizing a random walk method with fixed timestep. Toumelin et al. (2003b) considered a periodic array of spheres to capture the response of vuggy carbonates at different water saturations conceptually, where the microporosity itself also formed a periodic array of spheres. More algorithmic detail was given in Toumelin et al. (2003a)—the approach captures dephasing in the presence of an external constant background gradient and utilizes the first-passage algorithm. Valckenborg et al. (2003) extended the modeling of NMR responses to nonuniform fields. Song et al. (2000) measured the magnetization decay due to diffusion in the internal field, a technique that probes higher eigenmodes of the NMR T2 relaxation response. It was first modeled by Arns (2004) on a series of micro-computed tomography (CT) images. Interestingly, the associated length scale estimates indicated a larger pore size as well as a wider pore size distribution than would have resulted from the direct use of the ground states. Arns et al. (2005) compared a range of length scale measures for permeability estimation including NMR-derived ones, showing strong correlations between NMR-based estimates and direct numerical simulation for Fontainebleau sandstone, in agreement with Sen et al. (1990). Numerical techniques operating on 3D representations of microstructure extracted from micro-CT images have since been refined further and were applied to higher-dimensional NMR measurements (e.g., -T2) (Arns et al. 2011). Internal field effects were accounted for by application of a dipole approximation with random walkers accumulating an effective phase; as in Arns (2004), a part of the dephasing behavior is thus accounted for explicitly, rather than by an effective relaxation rate.
Given the significant advances in numerical techniques, a standing problem is the determination of the correct intrinsic parameters associated with relaxation at surfaces or smaller-scale microstructure where bulk relaxation rates of effective materials may be specified. However, determination for those parameters using forward simulations requires an extensive search in the multidimensional solution space, posing an ill-conditioned inverse problem, the solution of which may lead to new NMR relaxation response interpretation techniques directly utilizing intrinsic physical properties. This is particularly so if both T1 and T2 responses are considered. There are two major issues related to the relaxation time inverse problem: The micro-CT image-based NMR simulation is expensive to evaluate and the solution sets are nonunique. For the computational burden, Bayesian optimization has become a powerful strategy for identifying the extrema of the expensive objective functions with a limited budget (Mockus 1994; Jones et al. 1998; Snoek et al. 2012), and we have developed ST-ISW demonstrating the efficiency and reproducibility of Bayesian optimization for T2 relaxation time inverse problem in which multiple solution sets are identified simultaneously (Li et al. 2021). The key advantage of Bayesian optimization is that it models the objective function using GPs or kriging known in geostatistics as the cheap proxy, with which the most promising candidate is identified and proposed for exact function evaluation so that the computational cost is significantly reduced. GPs as a surrogate or proxy model have been applied to solving history matching problems (Horowitz et al. 2013; Sayyafzadeh 2015; Rana et al. 2018; Dachanuwattana et al. 2019; Li et al. 2015) and to recovering physical, geometrical, or structural parameters where expensive measurements or simulations are involved (Vargas-Hernández et al. 2019; Deng et al. 2020). For the identification of nonunique solution sets under box constraints, the objective function is intrinsically multimodal. Li et al. (2021) recovered both the global optimal solution and (major) local optimal solutions using a multimodal search strategy, comprising a multistart Box-constrained limited-memory Broyden–Fletcher–Goldfarb–Shanno algorithm (L-BFGS-B) optimizer searching for local optimum solutions and a global optimizer social-learning particle swarm optimizer for global optimum solutions. With this technique, more than one physically valid solution set is identified by computationally reproducing the transverse relaxation time (T2) measurement using the NMR simulation. From the mathematical point of view, the solutions are considered more physically valid if the agreement between the simulated longitudinal relaxation time (T1) distribution arising from the identified physical quantities and the measured T1 distribution can also be achieved. From an NMR forward solver point of view, it is easy to output T1 and T2 relaxation responses with little extra computational effort, making it desirable to develop an inverse solution workflow that can use this advantage. However, compared to ST-ISW for T2 of Bentheimer sandstone where the clay regions effective transverse relaxation time, effective diffusion coefficient, and transverse surface relaxivity were determined, here the corresponding two longitudinal relaxation parameters would also be required, leading to a five-parameter inverse problem.
Incorporating the matching of T1 distributions in addition to T2 distributions following ST-ISW presented in Li et al. (2021) is straightforward: Both T1 and T2 fitting residuals can be jointly minimized by taking the sum of the two L2 norms of the fitting residuals as the objective. The inverse problem can then be solved by ST-ISW, except that incorporation of the T1 distribution leads to additional physical constraints, which are helpful for the identification of the most realistic solution sets. This leads to a strategy mainly focused on extending the expected improvement (EI) or entropy search acquisition functions to the constrained optimization scenarios (e.g., the expected constrained improvement; Schonlau et al. 1998), and the predictive entropy search with constraints (Hernández-Lobato et al. 2015). The family of EI acquisition functions is superior for multimodal optimization as it automatically balances exploration and exploitation by taking into account both value information and uncertainty information. In this study, we identify the nonunique solution sets by multimodal maximization of the expected constrained improvement acquisition function using the multistart sequential quadratic programming (SQP).
While conceptually simple and easy to implement, ST-ISW fails to account for correlations between the two objectives, which may lead to a waste in computational resources. Historically, the (spatial) correlation between related objectives are mostly studied in geostatistics [e.g., Goovaerts (1997) used the cokriging approach to account for the spatial correlations between the concentration of the heavy metals that are expensive to evaluate and the variables that are easy to evaluate, for example, pH levels]. Cokriging with a linear model of coregionalization (LCM) is widely applied in geostatistics (Babak and Deutsch 2009; Ochie and Rotimi 2018; Al-Mudhafar 2018) and is then extensively studied and applied in transfer learning (Bonilla et al. 2007; Álvarez et al. 2012; Swersky et al. 2013). In this study, we incorporate a transfer learning ability to ST-ISW using LCM which jointly models the T1 and T2 objectives by leveraging information from each other. We propose two approaches that differentiate themselves by whether the two objectives (matching T1 and T2 distributions) are considered of equal importance: DT-ISW-S and DT-ISW-J, indicating if the two objectives are optimized sequentially or jointly; here DT stands for the dual-task. In sequential optimization, unknown physical quantities related to the objective with higher importance are identified first, setting the bounds for another objective, whereas in joint optimization all unknowns are identified simultaneously. DT-ISW-S is formulated for the scenario where one objective is considered of higher importance and it does not matter if the other objective strictly matches, whereas DT-ISW-J is for the scenario where the two objectives are considered of equal importance and good matches of the two objectives are desirable. We demonstrate the performance of DT-ISW-S and DT-ISW-J, with ST-ISW as a baseline by the identification of the five parameters of a Bentheimer sandstone sample: one shared quantity (i.e., the effective diffusion coefficient in clay regions), together with four unique quantities (i.e., surface relaxivity of quartz and effective relaxation time in clay regions for T1 and T2, respectively). We also provide the separate optimization (SEP) as the reference for which the physical quantities related to T1 and T2 are separately identified. Finally, we provide the POS, which allows operators to balance the preference of T1 and T2 data fits for the slightly conflicting objectives.
This paper is organized as follows: An overview of DT-ISW is presented, followed by a detailed description of the materials, NMR measurements, and NMR forward simulation. Next, a detailed description of DT-ISW for efficient modeling of the two related objectives using transfer learning is provided. Finally, the performances of DT-ISW-S and DT-ISW-J are investigated and compared against ST-ISW and SEP for a Bentheimer sandstone sample, determining five T1 and T2 relaxation-related intrinsic physical properties simultaneously.
The Dual-Task Inverse Solution Workflow
There are five components in the developed DT-ISW for the inverse problem of T1, T2 parameter estimation, namely (a) the observables, (b) the forward NMR solver, (c) the cost-function, (d) the biobjective Bayesian optimization, and (e) the solution analysis, as presented in Fig. 1.
Flow chart of DT-ISW for the estimation of intrinsic physical parameters from and relaxation responses.
Flow chart of DT-ISW for the estimation of intrinsic physical parameters from and relaxation responses.
We briefly summarize these components in the itemized list below, while detailed descriptions follow in the next section.
(a) Measure NMR T1 and T2 relaxation decays and .
(b) Obtain the digitized representation of the sample via CT imaging (first iteration only), numerically derive the decays and for the chosen sets of unknown intrinsic parameters and (e.g., surface relaxivities of the resolved phase: quartz, effective diffusion coefficient of clay, effective relaxation times of clay regions only resolved on scanning electron miscroscope images) while honoring known quantities like porosity , volumetric magnetic susceptibility , bulk physical properties of water, and matching the physical kernel (see Eqs. 2 and 3) and regularization parameter .
(c) Acquire the probability density distribution for both T1 and T2 using both simulated and measured decays [i.e., , , , and ]. Evaluate the cost-function and calculate the sum of the L2 norm of the fitting residuals for both T1 and T2 distributions.
(d) Jointly model the T1 and T2 objective functions using the vector-valued GP with correlation captured using the LCM, followed by maximum likelihood estimation of the model hyperparameters; then, the EI acquisition function is maximized to identify the promising candidates and . Finally, solutions within sparsely explored regions are evaluated using exact simulation.
(e) When the optimization budget is depleted, the objective functions are modeled with the updated model hyperparameters followed by a division of the solution space into various unique partitions (UPs) associated with different modes of the objective function using the solution space partitioning introduced in Li et al. (2021). Finally, for each UP, the top three solutions and the POS are reported.
Materials, Measurements, and Forward Simulations
Sample Characterization
Rock Sample
NMR measurements were performed on a Bentheimer sandstone core plug (which is hereafter labeled as BHG2) fully saturated with 3 wt% NaCl brine (30,000 ppm). It was cored out of a rock block sourced from Bad Bentheim quarry (Lower Saxony, Germany) exhibiting 23.9% porosity. This sandstone is a remarkably clean quartz arenite, as five analyzed samples exhibit a narrow range of 97.94 to 98.07 wt% of silicon dioxide determined by X-ray fluorescence spectroscopy. The corresponding mineralogy associations are 95.67% quartz, 1.35% kaolinite, 2.39% feldspar, and 0.03% iron-rich components. Owing to the very small iron and manganese content, the rock is diamagnetic. The bulk volumetric magnetic susceptibility of three finely powdered samples (about 1 g each) measured using magnetic susceptibility balance (Sherwood Scientific, Cambridge, UK) is −7.45 × 10–6 (SI units).
Digitized Image
A 5-mm diameter core of Bentheimer sandstone was imaged on the UNSW Tyree X-ray CT facility’s HeliScan Micro-CT set equipped with a GE Phoenix NanoFocus X-ray source and a 3,072 × 3,072 flatbed detector. Radiographs were acquired over 12 hours with 80 kV X-ray energy using a 0.75-mm aluminum filter in double helix mode. The resulting spatial resolution of the 3D image was μm. Following reconstruction, the gray-scale tomogram was segmented into five phases: (macro) pore space, quartz, clay region (a microporous effective phase), feldspar, and an iron-rich dense mineral phase. The segmentation first identified pore space, intermediate intensity clay region, and high density (quartz and denser) in a standard converging active contour approach (Sheppard et al. 2004)—see Li et al. (2021) for the actual sample—followed by a further partitioning of the solid phase into quartz, feldspar, and an iron-rich phase. Feldspar and quartz grains were separated using grain partitioning by a watershed algorithm (Sheppard et al. 2004) followed by a thresholding based on average grain intensity, while the high-intensity iron-rich phase was extracted directly based on intensity. For reference, a slice through the tomogram and resultant phase segmentation is given in Fig. 2.
Slice through the (a) gray-scale tomogram and (b) corresponding five-phase segmentation. Domain size: 1,2403 voxels, voxelsize ε = 2.16 μm. Black: pore; green: clay; red: quartz; blue: feldspar yellow: high-density phase.
Slice through the (a) gray-scale tomogram and (b) corresponding five-phase segmentation. Domain size: 1,2403 voxels, voxelsize ε = 2.16 μm. Black: pore; green: clay; red: quartz; blue: feldspar yellow: high-density phase.
Statistics of the Bentheimer sandstone segmentations as well as magnetic susceptibility values for the calculation of the internal magnetic field at lattice resolution ε that follows is listed in Table 1.
Statistics of the Bentheimer sandstone segmentations. Upper: Porosity from simulation () is calculated as (Shikhov et al. 2018), domain size is in voxel, denotes voxel size, and fi are phase fractions (, resolved porosity; , clay region; , quartz; , feldspar; , high density). Lower: Measured average volumetric magnetic susceptibilities , the known values for NaCl brine, clay region, quartz, feldspar, and the inferred value for high density using material balance. Magnetic susceptibilities are expressed in μSI.
Sample . | (μm) . | Domain . | . | . | . | . | . | . |
---|---|---|---|---|---|---|---|---|
BHG2 | 2.226 | 1,2403 | ||||||
0.2373 | 0.2252 | 0.0241 | 0.7369 | 0.0126 | 0.0011 | |||
−7.45 | −9 | −12 | −12 | 1,900 |
Sample . | (μm) . | Domain . | . | . | . | . | . | . |
---|---|---|---|---|---|---|---|---|
BHG2 | 2.226 | 1,2403 | ||||||
0.2373 | 0.2252 | 0.0241 | 0.7369 | 0.0126 | 0.0011 | |||
−7.45 | −9 | −12 | −12 | 1,900 |
Since we focus on NMR relaxation at low external field strength of 0.047 T and given the small amount of feldspar, we assume that the two phases, feldspar and quartz, share the same surface relaxivity and volumetric magnetic susceptibility, so that the increased magnetic susceptibility is assigned to the iron content in the high-density phase, leading to a higher , with the clay region considered as a mixture of kaolinite and NaCl brine, as shown in Table 1.
T1 and T2 Reference Data
Here we measured T1 relaxation time using the standard inversion recovery pulse sequence (Carr and Purcell 1954). It is formed by a 180° pulse which inverts the magnetization, followed by recovery at time delay , and concluded by a 90° pulse that projects the attenuated magnetization on the transverse plane for the acquisition of the free induction decay. By varying time intervals in subsequent scans, a set of free induction decays with different amplitudes is obtained. T1-inversion recovery measurements were performed with the following set of parameters: 64 time increments were spaced logarithmically between 0.2 and 10,000 ms. The signal was acquired as an average of four scans following a phase cycling sequence to suppress systematic noise due to hardware imbalances. Recording the initial value of each free induction decay as a function of the recovery time results in a mono-, or in a general case, in a multiexponential decay described by the following expression with the T1-inversion recovery kernel as (Callaghan 1994; Song et al. 2002; Wang et al. 2017)
where is the longitudinal magnetization decay as a function of time , is noise assumed white and additive, and our objective is the estimation of the probability density function of T1 [i.e., ].
The standard method to measure transverse relaxation time T2 is the Carr, Purcell, Meiboom, and Gill (CPMG) pulse sequence (Carr and Purcell 1954; Meiboom and Gill 1958). It consists of a 90° pulse that projects the initial magnetization on the transverse plane, followed by a series of 180° refocusing pulses separated by a time interval of (interecho time). Each 180° pulse produces an attenuated signal (echo) with the maximum occurring at time after the pulse. By acquiring the top of each echo, the exponential (or multiexponential, in a more general case of porous media) signal decay is obtained with the decaying rate equal to the inverse of the transverse relaxation time T2, thus giving an kernel for the multiexponential problem as (Callaghan 1994; Song et al. 2002)
where is the transverse magnetization decay as a function of time and is additive white noise, and the objective is the estimation of the probability density function of T2 [i.e., ]. All NMR experiments were performed using the Magritek 2 MHz Rock Core Analyzer. Being equipped with the P54 RF probe (54-mm bore opening), it is capable of achieving signal-to-noise ratio in excess of 50 from four scans for the most saturated rock samples of about 20 cm3 in volume. The CPMG measurements were recorded with an echo spacing of 0.2 ms and 50,000 pulses per echo-train; the echo amplitude was determined by averaging 16 points on top of each echo.
The resultant magnetization decays are converted to relaxation time distributions by solving the respective Fredholm integrals of the first kind (i.e., Eqs. 2 and 3) for the probability density functions and . Both are nonnegative least squares problems, for which various numerical solvers are available (Bertsekas 1982; Hansen 1992; Lawson and Hanson 1995). For consistency, all experimental and simulated decays use exactly the same kernels with 256 logarithmically spaced bins for both T1 and T2 and are inverted using the algorithm of Lawson and Hanson (1995).
Joint T1 and T2 Simulation
The NMR simulations are carried out via a random walk method. Some modifications to our previous implementations have been made over time. Arns (2004) utilized a fixed displacement and fixed timestep random walk following Mendelson (1990) and Bergman et al. (1995) by directly using the natural resolution of the underlying micro-CT image, with the addition of accounting for dephasing. Arns et al. (2011) evolved that approach to one with fixed displacement, but variable timestep while also implementing a higher time resolution through a logical subgridding approach: Given the tomogram voxel size ε, random walks are carried out on a lattice with resolution , where is the step length of the random walk and ls is a grid refinement factor. Furthermore, the approach allows for the existence of effective properties (e.g., voxels which are microporous and exhibit an effective diffusion coefficient). Alhwety et al. (2016) utilized the approach for a dual-scale synthetic laminated sphere pack, tracking the magnetization exchange of different microporosity domains with distance maps embedded into the NMR simulations for spatial analysis. SayedAkram et al. (2016) utilized effective medium approaches for describing the motion and NMR relaxation in Indiana limestone exhibiting significant microporosity, comparing experiment and simulation under mixed-wet conditions. Shikhov et al. (2017) measured and modeled the NMR T1-z spatially resolved NMR relaxation response for a set of sandstones under centrifuge desaturation conditions resulting in good agreement between experiment and simulation. More recently, Zheng et al. (2018) considered the case of flow propagators and exchange of magnetization between microporous and macroporous regions of a glass-bead sample. The effective medium approach for nonresolved porosity was also utilized in several studies (Cui et al. 2021; Li et al. 2021). In this work, modifications are added, allowing the simultaneous extraction of T1 and T2 responses, which adds significant additional analysis options to the approach while also enabling the efficient determination of intrinsic parameters enabling these applications. For this reason, we summarize our method in the following in more detail.
Consider a microstructure given by a 3D regular grid from a micro-CT image segmented into phases. Then the material properties at voxel scale relevant for NMR response simulation can be presented by a set of arrays , , , , . . Here, is longitudinal relaxation time, is transverse relaxation time, is volumetric susceptibility, is longitudinal surface relaxivity, is transverse surface relaxivity, and is the effective diffusion coefficient. We note that the array entries represent either intrinsic properties for fully resolved voxels or effective properties. For relaxation-related parameters, we have the element-wise conditions and . For a single fluid and a fully saturated sample considered here, random walks start only in voxels where porosity is greater than zero and with a probability proportional to that of porosity. Consequently, the introduced arrays are sparse—not all entries are meaningful and if a full notation is utilized for convenience, those entries may be set to zero. To translate the material properties into timesteps describing the motion and movement and stepwise signal damping factors, we introduce a set of additional quantities, namely , , , and as timestep, movement probability, and T1 and T2 signal damping factors. A random walker at step count with position in a voxel of phase and internal clock ts attempts a step in a random direction (one of six nearest neighbors). If the attempted motion is successful, the random walker changes to position of phase with timestep and signal damping factor following Bergman et al. (1995) as
advancing the clock to . Here, the terms in round brackets correspond to survival probabilities for T1 and T2 surface relaxation processes, respectively, while the exponential terms account for bulk relaxation. The surface relaxivities are defined in the usual way (e.g., independently for a T1 or T2 experiment/simulation). This implies , as a T1 surface relaxation will also destroy phase coherence. When jointly simulating T1 and T2 relaxation, the relaxation already accounted for by T1 processes therefore needs to be corrected for in the T2 process, leading to the prefactor in Eq. 5. If the attempted motion is not successful due to , the damping factors and are still applied for the attempted move. However, the timestep is taken to be and the internal field (see below) is considered to be the one of the current position. The above formulation allows one to write the signal decay of an individual walker carrying out steps with initial signal strength M0 as a product, while recording the T1 and T2 decays simultaneously
Here, is the accumulated phase shift due to diffusion in an inhomogeneous (offset) internal magnetic field , with total magnetic field . The latter is calculated in the dipole approximation by convolution of the susceptibility field with the dipole field as
Details of the implementation are given in Arns et al. (2011). Following the random path of a walker over steps and weighting the phase accumulation due to diffusion in the internal field by the timestep for changing locations, the total phase shift after steps can be written as
and denote the location of the start and later positions of the random walker. The CPMG sequence is implemented by starting random walks with phase and switching the sign of at times , , where is the echo spacing and is the total number of echoes recorded. Recording at times , , and averaging over random walks result in the magnetization decays and . We note that is an integer multiple of . As timesteps are allowed to vary, individual timesteps are split into two where necessary by linear interpolation for accurate timestepping of the CPMG sequence. Finally, a comment on the choice of pmov: We assume continuity of the pore space (maximal correlation) at interfaces. This leads to for , for , and otherwise, where denotes the calculated porosity of phase from simulation. Furthermore, we determine the off-diagonal entries of by taking the harmonic mean of the respective nonzero diagonal entries. This is the simplest assumption (e.g., treating the interface as two parallel half-spaces).
Minimization of the T1, T2 Objectives
The specific problem of minimization of the sum of the L2 norm of the fitting residuals for both T1 and T2 distributions can be formulated as:
where and are the lower and upper bounds of the prespecified search domain and and are L2 norm of the fitting residuals for T1 and T2 distributions, respectively, which may be expressed as
with weights for all data points of the T1 and T2 distribution being equally 1. Each lower-dimension component comprises the unique components and the shared component , and therefore Eq. 11 specifies the complete solution. The constraints Eqs. 14 and 15 apply because in most practical situations for saturated porous media, T2 relaxation rate is faster than T1 due to near-surface interactions specific to transverse relaxation. An important nature of the problem is that the NMR simulator is adapted in a way that both and can be simulated simultaneously in a single function evaluation, enabling individual modeling of the two objectives to better reveal the structure of the true objective.
Scalar-Valued GPs
GP is a distribution over functions and is characterized by its mean function and covariance function as (Rasmussen and Williams 2005; Li et al. 2021)
where is a candidate in of dimension , is the process evaluated at location . In GP, the prior distribution over is expressed as
where denotes normal distribution, denotes the mean vector for observations, and denotes observed and predicted candidates, respectively, and denotes the covariance matrix for , with the entry expressed by . Using the Gaussian likelihood function to accommodate for the noise in observations yields . As we are interested in prediction of a single candidate, the joint distribution of the observations and a single prediction can be expressed as
where is the covariance between observed and predicted candidates, is the variance of the predicted candidate, is the mean for , and is the identity matrix. By conditioning the predictions on observations from Eq. 21, the posterior distribution for a single predicted candidate becomes
where denotes normal distribution, is a vector of model hyperparameters for the scalar-valued GP, and
Vector-Valued GPs
The vector-valued GP indicates GP with multiple outputs. We use to represent the number of correlated objective functions to be modeled, with applied throughout this study. To model the correlations between the outputs of , the GP kernel in Eq. 19 has to be replaced by a vector-valued GP kernel, so that the vector-valued function can be expressed as (Bonilla et al. 2007; Swersky et al. 2013)
where denotes the stacked mean function for each output and denotes a matrix-valued function whose entries express the correlation not only between the input pairs and , but also between the outputs pairs and . Similar as in Li et al. (2021), the null hypothesis is assumed as there is usually no prior knowledge about our problem; as a result, we use a constant mean function for each objective throughout this study (i.e., ), with a stationary covariance function: parameterized squared exponential for each objective. The squared exponential kernel is parameterized using automatic relevance determination, that is,
where is the signal amplitude, accommodates different levels of sensitivity against each component of , with being a vector of positive values, referred to as length scales. Automatic relevance determination has been incorporated to infer the sensitivity against each dimension of the input, which has been used successfully for removing irrelevant input associated with a very large inferred length scale (Williams and Rasmussen 1995; Rasmussen and Williams 2005; Wipf and Nagarajan 2007). is set to 1 when applied to vector-valued GP.
Various coregionalization models have been proposed in literature (Goovaerts 1997; Gelfand et al. 2004; Majumdar and Gelfand 2007; Krupskii and Joe 2015), and here we use the standard LCM originated from geostatistics to model , with the complete derivation of LCM shown in Appendix A. Assume that there are distinct observed candidates and observations for each objective [i.e.,]. Using LCM, the vector-valued GP kernel for the latent variable can be expressed as (Álvarez et al. 2012)
where is the Kronecker product between matrices, is the number of separable kernels, is a covariance matrix measuring the relationship between the outputs, and is an covariance matrix measuring the relationship between inputs, with the entry specified by . Typically and Eq. 26 degrades to intrinsic coregionalization model if . Both matrices are symmetric and positive semidefinite. We use a full-rank LCM (i.e., ) throughout this study.
Eq. 26 is called the sum of separable kernels as the covariance function is expressed as the sum of product of two covariance functions. It is observed from Eq. 26 that for each latent function, the correlation within tasks and between tasks are decoupled. Similar to the scalar-valued GP case, we use a Gaussian likelihood to account for the noisy observations (i.e., , where is a diagonal matrix accommodating task-dependent noise levels, and is a vector of noise variances). Similar to the case of scalar-valued GP, the posterior distribution of mean and covariance for a single predicted candidate can be expressed as
where concatenates outputs for , is a vector of model hyperparameters where stands for vector-valued GP, and
where denotes the stacked constant mean function for each objective, is the covariance matrix for the latent variable evaluated at all pairs of observed candidates and predicted candidate, is the covariance matrix for the predicted candidate, concatenates outputs, and is the covariance matrix for tasks with variances on the diagonal. Fig. 3 shows a comparison of the prediction and uncertainty quantification between scalar-valued GP and vector-valued GP.
Prediction and uncertainty quantification for the two correlated objectives with correlation coefficient being 0.8 using scalar-valued GP (a–b) and vector-valued GP (c–d) given seven unique observations for each objective. Left axis displays prediction, uncertainty quantification, whereas right axis displays EI. The seven candidates are deliberately selected so that each objective contains six observations on one-half plane and one observation on the other. The peaks of the red dot-dashed curve on the EI axis indicate the local maxima of the EI acquisition function for that objective. The vector-valued GP with transfer learning kernel predicts much more closer to the ground truth than the scalar-valued GP using the knowledge from the related objective.
Prediction and uncertainty quantification for the two correlated objectives with correlation coefficient being 0.8 using scalar-valued GP (a–b) and vector-valued GP (c–d) given seven unique observations for each objective. Left axis displays prediction, uncertainty quantification, whereas right axis displays EI. The seven candidates are deliberately selected so that each objective contains six observations on one-half plane and one observation on the other. The peaks of the red dot-dashed curve on the EI axis indicate the local maxima of the EI acquisition function for that objective. The vector-valued GP with transfer learning kernel predicts much more closer to the ground truth than the scalar-valued GP using the knowledge from the related objective.
It is clear that given unevenly distributed observations for both tasks, the scalar-valued GP cannot truly reflect the unexplored region, whereas the vector-valued GP predicts the trend in unexplored region using knowledge from the related objective, leading to a more informative EI guiding the search.
Proposal of Promising Candidates
Proposed by Mockus (1975), the EI acquisition function has been widely applied in Bayesian optimization with the standard GP, balancing exploration and exploitation by taking information on value and local uncertainty (Huang et al. 2006; Forrester et al. 2006; Quan et al. 2013). It is defined as the amount of improvement that the evaluation of a candidate is expected to induce.
Extending the EI to the multiobjective case is straightforward. For vector-valued GP, the EI acquisition function associated with the component can be expressed as
where is the current best observed candidate for the objective and is the short-hand symbol for the Gaussian distributed variable given by Eq. 27. Eqs. 27 and 29 yield
where and are the cumulative distribution function and the probability density function of the standard normal distribution respectively.
In each iteration, the choice of the promising candidate is based on maximization of the EI acquisition function. The order of maximization requires a choice: Sometimes the T1 data is only used for verification of the nonunique solution sets obtained from the T2 inverse problem, whereas sometimes T1 and T2 data are considered of equal importance and it might be desirable to have both T1 and T2 distributions fitted of equal quality. Formally, we propose two routines for maximization of the EI acquisition function corresponding to these two scenarios:
• is first maximized, followed by maximization of in a subspace constrained by Eqs. 14 and 15. For clarity we denote DT-ISW with sequential maximization of as DT-ISW-S.
• and are jointly maximized by maximization of the sum of and . For clarity, we denote DT-ISW with joint maximization of as DT-ISW-J.
Throughout this study, is assigned more importance as the T2 decays acquired using CPMG measurement consist of significantly more data points than the T1 decays acquired during T1 inversion recovery measurement, making T2 distributions and the associated objective function more reliable. Maximization of under linear constraints can be tackled using SQP (Boggs and Tolle 1995). The initial guesses are placed randomly in the feasible domain, ensuring that both the global optimal solution and major local optimal solutions are identified. The performance of DT-ISW-J and DT-ISW-S is compared in the next section.
Solution Analysis
After the optimization budget is depleted, the model hyperparameters are updated and the T1, T2 objectives are jointly modeled under the new set of hyperparameters. The solution space of Eq. 10 is then divided into various UPs corresponding to various modes of the objective function. The solution with the lowest fitness value in each UP is referred to as the local minimum (LM). For details, see Li et al. (2021). As the two objectives involve noisy measurements which are never noise-free, it is expected that they will be slightly conflicting and there is no solution that simultaneously optimizes each objective precisely. To account for this, we provide the POS of the T1 and T2 data fits for the slightly conflicting objectives, as shown in Fig. 4.
Demonstration of the Pareto front (red line) and the POS (solid squares) in the case of two conflicting objectives. Candidate F is dominated by Candidates B and C because both B and C achieve lower fitness values on both objectives. In addition to the LM [i.e., solution C which achieves the lowest sum of and ], the POS (i.e., solutions A, B, D, and E) are also reported.
Demonstration of the Pareto front (red line) and the POS (solid squares) in the case of two conflicting objectives. Candidate F is dominated by Candidates B and C because both B and C achieve lower fitness values on both objectives. In addition to the LM [i.e., solution C which achieves the lowest sum of and ], the POS (i.e., solutions A, B, D, and E) are also reported.
POS is defined as the set of solutions that is not dominated by any other solutions. Generally speaking, dominates if beats or ties for each of the two objectives; mathematically, it is expressed as , , and , .
Results
In this study, we demonstrate the performance of the three approaches: DT-ISW-J, DT-ISW-S, and ST-ISW on Bentheimer sandstone, identifying five parameters by computationally producing T1 and T2 distributions almost identical with experiments using the dual-output NMR simulator. The individual optimization for the T1 and T2 are presented as reference methods. As was done in Li et al. (2021), practically the base-10 logarithm of the effective relaxation time and of the effective diffusion coefficient are used since they usually vary across orders of magnitude, and the search for is limited to a domain bounded by the following constraints:
0 /(μm/s), /(μm/s) 16
−3 , 0
−6.28 −4
and the corresponding 5D solution space is rather large so that a grid search or random search is computationally intractable. We compare the performance of the following three approaches discussed in the previous section for the T1, T2 inverse problem: DT-ISW-S, DT-ISW-J, and ST-ISW. As the reference method, the SEP minimizes T1 and T2 independently in the 3D solution space as was done in Li et al. (2021), ignoring that is a shared value. We summarize the description of the three approaches and the reference in Table 2 and note that the determined values and are the lattice-based discrete surface relaxivities of quartz, whereas the actual continuum values are times higher, and is 3/2 following Bergman et al. (1995). The optimization budget is 180 function evaluations.
Summary of the three approaches for the T1, T2 physical parameters identification problem. The social-learning particle swarm optimizer (SL-PSO) is applied under box constraints. SEP does not take any additional information from T1 into account and is used as a reference demonstrating the quality of the solution without constraints.
Method . | Order of Optimization . | Transfer Learning . | Constraints . | Optimizer . |
---|---|---|---|---|
DT-ISW-J | Joint minimization | Yes | Linear | SQP |
DT-ISW-S | Sequential minimization | Yes | Box | SL-PSO and L-BFGS-B |
ST-ISW | Joint minimization | No | Linear | SQP |
SEP | – | No | Box | SL-PSO and L-BFGS-B |
Method . | Order of Optimization . | Transfer Learning . | Constraints . | Optimizer . |
---|---|---|---|---|
DT-ISW-J | Joint minimization | Yes | Linear | SQP |
DT-ISW-S | Sequential minimization | Yes | Box | SL-PSO and L-BFGS-B |
ST-ISW | Joint minimization | No | Linear | SQP |
SEP | – | No | Box | SL-PSO and L-BFGS-B |
Comparison of the Speed of Convergence
Fig. 5a displays the best fitness value so far for , together with its two components and in Figs. 5b and 5c for the three approaches, with SEP plotted as a reference. In terms of the objective , both DT-ISW-S and DT-ISW-J are more efficient in the first 60 function evaluations, with the speed of convergence almost twice as fast as ST-ISW. In the following 120 steps, DT-ISW-J maintains the same rate of convergence until Step 83, where it finds the lowest fitness value, whereas DT-ISW-S continues to minimize and find a lower at Step 152. ST-ISW maintains the same convergence speed until the optimization budget is depleted. Minimization of is very similar to the case of minimization of : In the first 32 steps, DT-ISW-J immediately reduces , whereas due to low priority assigned to T1 fitting, DT-ISW-S cannot reduce since a good proposal of is subjected to the valid proposal of . DT-ISW-S becomes effective since Step 32, and the overall rate of convergence is comparable to DT-ISW-J. However, ST-ISW, with a lower rate of convergence, steadily minimizes to the same level as DT-ISW-S, which is slightly higher than 1×10−3. All approaches identified lower than the reference methods.
Minimum fitness value plots for (a) and its two components; (b) ; (c) associated with the three approaches and the baseline listed in Table 2. For SEP, there are only individual optimizations of and and thus does not appear in (a).
Minimum fitness value plots for (a) and its two components; (b) ; (c) associated with the three approaches and the baseline listed in Table 2. For SEP, there are only individual optimizations of and and thus does not appear in (a).
In terms of , DT-ISW-J, DT-ISW-S, and SEP-T2 maintain a similar speed of convergence until the fitness values fall below 1 before Step 70. In the following steps, both SEP and DT-ISW-S achieve lower fitness values because they can freely evaluate in the solution space without constraints. Though at a slower rate of convergence, ST-ISW steadily reduces to the level of 1 until its last step. It can be inferred from Fig. 5 that the speed of convergence for both DT-ISW-S and DT-ISW-J are more than twice as fast as ST-ISW. DT-ISW-J achieves good T1 and T2 distribution agreement, whereas DT-ISW-S achieves better agreement for T2 and slightly worse agreement for T1.
Multimodal Solution Analysis by Solution Space Partitioning
As is described in the previous section, ST-ISW caters for the multimodal nature of the objective functions, with solution space partitioning applied in the solution analysis to cluster all solutions to different UPs. Here we apply the same solution analysis procedure to both DT-ISW-S and DT-ISW-J. Table 3 shows identified parameter values of the LMs for the three approaches and the reference. LMs with above 1 are not listed. All three approaches identified a consistent value of , and all are capable of achieving lower values than the reference. SEP-T2 is affected by the problem of nonunique solutions—although achieving a lower , the LM 1 identified by SEP-T2 is less physically valid than LM 2. As a comparison, all approaches identified at least 2 UPs, but from Table 3 we see that for each approach there is only one UP within which the solutions are both mathematically sound and physically valid.
The top three candidates in terms of [or and if does not exist] within each UP, identified using the three approaches and the reference method SEP. UPs are sorted in ascending order of the best identified within the UP. The T1 and T2 distributions corresponding to the LMs are displayed in Fig. 6. Numbers in the R column indicate the rankings of the solutions out of 180 evaluated candidates in terms of , whereas numbers in the Step column indicate at which step those solutions are proposed. and are the lattice-based discrete surface relaxivity of quartz. Values of , , and are expressed in log10 scale.
UP . | R . | (μm/s) . | (μm/s) | . | . | . | . | . | . | Step . |
---|---|---|---|---|---|---|---|---|---|---|
DT-ISW-J | ||||||||||
1 | 1 | 4.557 | 7.886 | −1.739 | −1.739 | −5.127 | 19.72 | 5.82 | 25.55 | 83 |
2 | 5.749 | 8.232 | −1.775 | −1.853 | −4.891 | 16.69 | 9.85 | 26.54 | 108 | |
3 | 4.667 | 7.830 | −1.811 | −1.811 | −5.035 | 10.45 | 17.52 | 27.97 | 84 | |
2 | 35 | 4.852 | 7.206 | −2.520 | −2.656 | −4.258 | 29.99 | 17.31 | 47.30 | 168 |
36 | 4.392 | 7.219 | −2.606 | −2.635 | −4.306 | 28.67 | 19.19 | 47.86 | 62 | |
40 | 4.852 | 7.193 | −2.575 | −2.701 | −4.203 | 25.73 | 25.79 | 51.51 | 171 | |
DT-ISW-S | ||||||||||
1 | 1 | 4.937 | 7.888 | −1.797 | −1.797 | −5.026 | 14.35 | 4.01 | 18.36 | 152 |
2 | 5.045 | 8.005 | −1.742 | −1.742 | −5.035 | 12.38 | 8.24 | 20.62 | 127 | |
3 | 4.899 | 7.811 | −1.773 | −1.773 | −5.058 | 16.04 | 5.49 | 21.53 | 158 | |
2 | 53 | 5.770 | 8.689 | −2.581 | −2.581 | −4.000 | 30.25 | 18.88 | 49.13 | 92 |
54 | 5.656 | 8.517 | −2.614 | −2.614 | −4.000 | 29.64 | 19.53 | 49.18 | 100 | |
55 | 5.810 | 8.725 | −2.571 | −2.571 | −4.000 | 30.53 | 18.95 | 49.48 | 93 | |
ST-ISW | ||||||||||
1 | 1 | 5.155 | 7.113 | −1.794 | −2.040 | −5.001 | 15.87 | 10.36 | 26.23 | 159 |
2 | 5.134 | 7.092 | −1.774 | −2.006 | −5.045 | 13.96 | 14.51 | 28.47 | 150 | |
3 | 4.761 | 6.996 | −1.753 | −1.966 | −5.106 | 16.02 | 13.80 | 29.82 | 180 | |
2 | 17 | 5.163 | 7.988 | −2.738 | −2.738 | −4.000 | 29.65 | 23.88 | 53.53 | 166 |
19 | 6.291 | 8.490 | −2.469 | −2.614 | −4.000 | 35.61 | 20.00 | 55.61 | 158 | |
23 | 5.976 | 7.946 | −2.599 | −2.790 | −4.000 | 35.54 | 24.76 | 60.30 | 131 | |
SEP-T1 | ||||||||||
1 | 1 | 3.951 | - | −2.075 | - | −4.956 | 14.90 | - | - | 162 |
2 | 4.508 | - | −1.882 | - | −5.121 | 15.23 | - | - | 154 | |
3 | 4.612 | - | −1.842 | - | −5.135 | 15.45 | - | - | 177 | |
2 | 21 | 4.720 | - | −2.518 | - | −4.275 | 25.49 | - | - | 144 |
23 | 5.509 | - | −2.672 | - | −4.000 | 30.12 | - | - | 46 | |
26 | 4.860 | - | −2.863 | - | −4.000 | 38.66 | - | - | 99 | |
SEP-T2 | ||||||||||
1 | 1 | - | 7.288 | - | −2.068 | −4.907 | - | 2.91 | - | 115 |
2 | - | 7.471 | - | −2.032 | −4.895 | - | 5.11 | - | 139 | |
3 | - | 7.579 | - | −2.017 | −4.883 | - | 6.47 | - | 180 | |
2 | 78 | - | 6.552 | - | −1.958 | −5.247 | - | 23.18 | - | 94 |
99 | - | 7.986 | - | −1.691 | −5.236 | - | 39.85 | - | 164 | |
117 | - | 5.546 | - | −2.321 | −5.138 | - | 95.09 | - | 6 |
UP . | R . | (μm/s) . | (μm/s) | . | . | . | . | . | . | Step . |
---|---|---|---|---|---|---|---|---|---|---|
DT-ISW-J | ||||||||||
1 | 1 | 4.557 | 7.886 | −1.739 | −1.739 | −5.127 | 19.72 | 5.82 | 25.55 | 83 |
2 | 5.749 | 8.232 | −1.775 | −1.853 | −4.891 | 16.69 | 9.85 | 26.54 | 108 | |
3 | 4.667 | 7.830 | −1.811 | −1.811 | −5.035 | 10.45 | 17.52 | 27.97 | 84 | |
2 | 35 | 4.852 | 7.206 | −2.520 | −2.656 | −4.258 | 29.99 | 17.31 | 47.30 | 168 |
36 | 4.392 | 7.219 | −2.606 | −2.635 | −4.306 | 28.67 | 19.19 | 47.86 | 62 | |
40 | 4.852 | 7.193 | −2.575 | −2.701 | −4.203 | 25.73 | 25.79 | 51.51 | 171 | |
DT-ISW-S | ||||||||||
1 | 1 | 4.937 | 7.888 | −1.797 | −1.797 | −5.026 | 14.35 | 4.01 | 18.36 | 152 |
2 | 5.045 | 8.005 | −1.742 | −1.742 | −5.035 | 12.38 | 8.24 | 20.62 | 127 | |
3 | 4.899 | 7.811 | −1.773 | −1.773 | −5.058 | 16.04 | 5.49 | 21.53 | 158 | |
2 | 53 | 5.770 | 8.689 | −2.581 | −2.581 | −4.000 | 30.25 | 18.88 | 49.13 | 92 |
54 | 5.656 | 8.517 | −2.614 | −2.614 | −4.000 | 29.64 | 19.53 | 49.18 | 100 | |
55 | 5.810 | 8.725 | −2.571 | −2.571 | −4.000 | 30.53 | 18.95 | 49.48 | 93 | |
ST-ISW | ||||||||||
1 | 1 | 5.155 | 7.113 | −1.794 | −2.040 | −5.001 | 15.87 | 10.36 | 26.23 | 159 |
2 | 5.134 | 7.092 | −1.774 | −2.006 | −5.045 | 13.96 | 14.51 | 28.47 | 150 | |
3 | 4.761 | 6.996 | −1.753 | −1.966 | −5.106 | 16.02 | 13.80 | 29.82 | 180 | |
2 | 17 | 5.163 | 7.988 | −2.738 | −2.738 | −4.000 | 29.65 | 23.88 | 53.53 | 166 |
19 | 6.291 | 8.490 | −2.469 | −2.614 | −4.000 | 35.61 | 20.00 | 55.61 | 158 | |
23 | 5.976 | 7.946 | −2.599 | −2.790 | −4.000 | 35.54 | 24.76 | 60.30 | 131 | |
SEP-T1 | ||||||||||
1 | 1 | 3.951 | - | −2.075 | - | −4.956 | 14.90 | - | - | 162 |
2 | 4.508 | - | −1.882 | - | −5.121 | 15.23 | - | - | 154 | |
3 | 4.612 | - | −1.842 | - | −5.135 | 15.45 | - | - | 177 | |
2 | 21 | 4.720 | - | −2.518 | - | −4.275 | 25.49 | - | - | 144 |
23 | 5.509 | - | −2.672 | - | −4.000 | 30.12 | - | - | 46 | |
26 | 4.860 | - | −2.863 | - | −4.000 | 38.66 | - | - | 99 | |
SEP-T2 | ||||||||||
1 | 1 | - | 7.288 | - | −2.068 | −4.907 | - | 2.91 | - | 115 |
2 | - | 7.471 | - | −2.032 | −4.895 | - | 5.11 | - | 139 | |
3 | - | 7.579 | - | −2.017 | −4.883 | - | 6.47 | - | 180 | |
2 | 78 | - | 6.552 | - | −1.958 | −5.247 | - | 23.18 | - | 94 |
99 | - | 7.986 | - | −1.691 | −5.236 | - | 39.85 | - | 164 | |
117 | - | 5.546 | - | −2.321 | −5.138 | - | 95.09 | - | 6 |
The and distributions corresponding to the LMs listed in Table 3 associated with (a–b) DT-ISW-J, (c–d) DT-ISW-S, (e–f) ST-ISW, and (g–h) reference fittings.
The and distributions corresponding to the LMs listed in Table 3 associated with (a–b) DT-ISW-J, (c–d) DT-ISW-S, (e–f) ST-ISW, and (g–h) reference fittings.
Fig. 6 shows the fit corresponding to the LMs in Table 3. From appearance, the fittings for the T1 and T2 distributions are equally good for all three methods, but DT-ISW-J achieves good fit significantly faster using only 83 steps, as shown in Fig. 5 and Table 3. Apart from the good-enough-fit criteria, the physical validity should also be considered for selection of the best solutions, so that all candidates in Table 3 are checked if the parameters reach boundaries and if the effective diffusion coefficient is larger than bulk water (we use under ambient conditions).
Identification of the POS
Given Table 3, it is natural to nominate the solution corresponding to LM 1 as the final solution for the inverse problem of T1, T2 parameter estimation. However, we observe that the two objectives are slightly conflicting. Taking DT-ISW-J as an example, the LM 1 achieves the lowest overall with a significantly lower value of compared with the second-best solution; however, the value of LM 1 is inferior to the second-best solution. Consequently, there is no best solution as no solution simultaneously minimizes both and . The situation is also true for DT-ISW-J and ST-ISW, which implies that there is slight conflict between the two objectives—improvement of cannot be achieved without harming and vice versa. The conflict is also revealed in the result of SEP from Table 3 that if and are individually optimized, and will neither recover the same value of nor satisfy the constraints Eqs. 14 and 15. As a result, in addition to providing the LM 1 from the three approaches, we analyze the trade-off between the conflicting objectives and provide the POS to further balance the preference of T1 and T2 data fits.
The slight conflicts between the two objectives, which exists for all UPs and manifest themselves in log scale, are illustrated in Fig. 7. As expected, for DT-ISW-J and ST-ISW the and are equally minimized, whereas for DT-ISW-S the is much more preferentially minimized. The identified POS as well as the corresponding T1 and T2 fits are shown in Table 4 and Fig. 8. Comparing Table 4 with Table 3, we see that the POS does not always coincide with the top three solutions identified using solution space partitioning, and the POS gives us more flexibility to trade-off the conflicting objectives. Practically, we place slightly higher confidence in due to the larger number of acquired points in the acquisition sequence and nominate the candidate with the lowest under the condition that the is no more than 2 as the final solution. Final solutions are in bold in Table 3. It is observed from Fig. 7 that candidates evaluated by either DT-ISW-J or DT-ISW-S are more clustered to the bottom-left, whereas candidates evaluated by ST-ISW seem to be evenly distributed in log scale. The following subsection discusses the evolution of the clusterization pattern of POS.
Minimization of and using (a) DT-ISW-J, (b) DT-ISW-S, and (c) ST-ISW. Candidates in the bottom-left corner are preferred. Solid dots are the POS and the circles are the dominated solutions. POS are listed in Table 4.
Minimization of and using (a) DT-ISW-J, (b) DT-ISW-S, and (c) ST-ISW. Candidates in the bottom-left corner are preferred. Solid dots are the POS and the circles are the dominated solutions. POS are listed in Table 4.
The and distributions corresponding to the POS in UP 1 listed in Table 4 associated with (a–b) DT-ISW-J, (c–d) DT-ISW-S, and (e–f) ST-ISW, respectively. POS corresponding to UP 2 of all three approaches do not satisfy the physical validity so that they are not displayed. In each plot, the increases whereas the decreases from POS 1 to POS 4.
The and distributions corresponding to the POS in UP 1 listed in Table 4 associated with (a–b) DT-ISW-J, (c–d) DT-ISW-S, and (e–f) ST-ISW, respectively. POS corresponding to UP 2 of all three approaches do not satisfy the physical validity so that they are not displayed. In each plot, the increases whereas the decreases from POS 1 to POS 4.
The POS corresponding to the solid dots in Fig. 7. Within each UP, the solutions are sorted in descending order of . The T1 and T2 distributions corresponding to solutions in LM 1 are displayed in Fig. 8. Numbers in the R column indicate the rankings of the solutions out of 180 evaluated candidates in terms of , whereas numbers in the Step column indicate at which step those solutions are proposed. and are the lattice-based discrete surface relaxivity of quartz. Values of , , and are expressed in log10 scale.
UP . | R . | (μm/s) . | (μm/s) . | . | . | . | . | . | . | Step . |
---|---|---|---|---|---|---|---|---|---|---|
DT-ISW-J | ||||||||||
1 | 1 | 4.557 | 7.886 | −1.739 | −1.739 | −5.127 | 19.72 | 5.82 | 25.55 | 83 |
2 | 5.749 | 8.232 | −1.775 | −1.853 | −4.891 | 16.69 | 9.85 | 26.54 | 108 | |
3 | 4.667 | 7.830 | −1.811 | −1.811 | −5.035 | 10.45 | 17.52 | 27.97 | 84 | |
8 | 4.600 | 7.293 | −1.855 | −1.909 | −5.027 | 7.95 | 25.26 | 33.22 | 94 | |
2 | 35 | 4.852 | 7.206 | −2.520 | −2.656 | −4.258 | 29.99 | 17.31 | 47.30 | 168 |
36 | 4.392 | 7.219 | −2.606 | −2.635 | −4.306 | 28.67 | 19.19 | 47.86 | 62 | |
40 | 4.852 | 7.193 | −2.575 | −2.701 | −4.203 | 25.73 | 25.79 | 51.51 | 171 | |
70 | 4.851 | 7.191 | −2.583 | −2.730 | −4.192 | 23.34 | 46.27 | 69.61 | 180 | |
DT-ISW-S | ||||||||||
1 | 1 | 4.937 | 7.888 | −1.797 | −1.797 | −5.026 | 14.35 | 4.01 | 18.36 | 152 |
2 | 5.045 | 8.005 | −1.742 | −1.742 | −5.035 | 12.38 | 8.24 | 20.62 | 127 | |
20 | 5.046 | 7.998 | −1.730 | −1.730 | −5.046 | 11.10 | 21.15 | 32.25 | 128 | |
2 | 109 | 4.899 | 8.870 | −2.554 | −2.554 | −4.000 | 79.58 | 18.79 | 98.37 | 44 |
53 | 5.770 | 8.689 | −2.581 | −2.581 | −4.000 | 30.25 | 18.88 | 49.13 | 92 | |
54 | 5.656 | 8.517 | −2.614 | −2.614 | −4.000 | 29.64 | 19.53 | 49.18 | 100 | |
169 | 4.745 | 15.663 | −2.103 | −2.219 | −4.754 | 15.08 | 2,683.22 | 2,698.30 | 164 | |
ST-ISW | ||||||||||
1 | 4 | 4.823 | 7.025 | −1.762 | −1.979 | −5.086 | 19.84 | 10.15 | 29.99 | 179 |
1 | 5.155 | 7.113 | −1.794 | −2.040 | −5.001 | 15.87 | 10.36 | 26.23 | 159 | |
2 | 5.134 | 7.092 | −1.774 | −2.006 | −5.045 | 13.96 | 14.51 | 28.47 | 150 | |
5 | 5.137 | 7.115 | −1.813 | −2.068 | −4.974 | 11.61 | 19.12 | 30.73 | 173 | |
2 | 19 | 6.291 | 8.490 | −2.469 | −2.614 | −4.000 | 35.61 | 20.00 | 55.61 | 158 |
17 | 5.163 | 7.988 | −2.738 | −2.738 | −4.000 | 29.65 | 23.88 | 53.53 | 166 | |
29 | 5.294 | 7.092 | −2.681 | −3.000 | −4.000 | 28.68 | 47.36 | 76.05 | 85 |
UP . | R . | (μm/s) . | (μm/s) . | . | . | . | . | . | . | Step . |
---|---|---|---|---|---|---|---|---|---|---|
DT-ISW-J | ||||||||||
1 | 1 | 4.557 | 7.886 | −1.739 | −1.739 | −5.127 | 19.72 | 5.82 | 25.55 | 83 |
2 | 5.749 | 8.232 | −1.775 | −1.853 | −4.891 | 16.69 | 9.85 | 26.54 | 108 | |
3 | 4.667 | 7.830 | −1.811 | −1.811 | −5.035 | 10.45 | 17.52 | 27.97 | 84 | |
8 | 4.600 | 7.293 | −1.855 | −1.909 | −5.027 | 7.95 | 25.26 | 33.22 | 94 | |
2 | 35 | 4.852 | 7.206 | −2.520 | −2.656 | −4.258 | 29.99 | 17.31 | 47.30 | 168 |
36 | 4.392 | 7.219 | −2.606 | −2.635 | −4.306 | 28.67 | 19.19 | 47.86 | 62 | |
40 | 4.852 | 7.193 | −2.575 | −2.701 | −4.203 | 25.73 | 25.79 | 51.51 | 171 | |
70 | 4.851 | 7.191 | −2.583 | −2.730 | −4.192 | 23.34 | 46.27 | 69.61 | 180 | |
DT-ISW-S | ||||||||||
1 | 1 | 4.937 | 7.888 | −1.797 | −1.797 | −5.026 | 14.35 | 4.01 | 18.36 | 152 |
2 | 5.045 | 8.005 | −1.742 | −1.742 | −5.035 | 12.38 | 8.24 | 20.62 | 127 | |
20 | 5.046 | 7.998 | −1.730 | −1.730 | −5.046 | 11.10 | 21.15 | 32.25 | 128 | |
2 | 109 | 4.899 | 8.870 | −2.554 | −2.554 | −4.000 | 79.58 | 18.79 | 98.37 | 44 |
53 | 5.770 | 8.689 | −2.581 | −2.581 | −4.000 | 30.25 | 18.88 | 49.13 | 92 | |
54 | 5.656 | 8.517 | −2.614 | −2.614 | −4.000 | 29.64 | 19.53 | 49.18 | 100 | |
169 | 4.745 | 15.663 | −2.103 | −2.219 | −4.754 | 15.08 | 2,683.22 | 2,698.30 | 164 | |
ST-ISW | ||||||||||
1 | 4 | 4.823 | 7.025 | −1.762 | −1.979 | −5.086 | 19.84 | 10.15 | 29.99 | 179 |
1 | 5.155 | 7.113 | −1.794 | −2.040 | −5.001 | 15.87 | 10.36 | 26.23 | 159 | |
2 | 5.134 | 7.092 | −1.774 | −2.006 | −5.045 | 13.96 | 14.51 | 28.47 | 150 | |
5 | 5.137 | 7.115 | −1.813 | −2.068 | −4.974 | 11.61 | 19.12 | 30.73 | 173 | |
2 | 19 | 6.291 | 8.490 | −2.469 | −2.614 | −4.000 | 35.61 | 20.00 | 55.61 | 158 |
17 | 5.163 | 7.988 | −2.738 | −2.738 | −4.000 | 29.65 | 23.88 | 53.53 | 166 | |
29 | 5.294 | 7.092 | −2.681 | −3.000 | −4.000 | 28.68 | 47.36 | 76.05 | 85 |
Sensitivity Analysis
The inferred parameterization of LCM is displayed in Table 5, corresponding to the steps when the LM 1 is identified in Table 3. Taking DT-ISW-J as an example, both and can be described as a linear combination of samples from two GPs, u1 and u2, in addition to the constant mean and Gaussian noise u3:
Hyperparameters inferred at the step when the LM 1 is identified for the three approaches and the reference shown in Table 3. and are the Cholesky factor of and , respectively; , , are vectors of constant mean functions, noise amplitude, and length scales, respectively. Entries on the first and second line are associated with and . Detailed description on parameterization can be found in Appendix B.
Approach . | . | . | . | . | . |
---|---|---|---|---|---|
DT-ISW-J | |||||
DT-ISW-S | |||||
ST-ISW | – | ||||
SEP-T1 | – | ||||
SEP-T2 | – |
Approach . | . | . | . | . | . |
---|---|---|---|---|---|
DT-ISW-J | |||||
DT-ISW-S | |||||
ST-ISW | – | ||||
SEP-T1 | – | ||||
SEP-T2 | – |
where , , and , respectively, and
It is clear from Table 5 that DT-ISW-J considers the two objective functions as the combination of a major component with larger length scales (u2) capturing the trends of the objective functions and a secondary component with smaller length scales (u1) catering for the local variations of the objective functions. Meanwhile, the amplitude of the secondary component is less than one-fourth of the main component. This is the same for DT-ISW-S. As introduced before, since automatic relevance determination determines the sensitivity of the objective function against each component of the input ; we immediately conclude from Eq. 33 that an increase or decrease of 10.075 in /(μm/s) or ρ2,q/(μm/s) is equivalent to an increase or decrease of 2.316 in or or a change of 3.197 in in terms of the reduction in . Therefore, comparison of the sensitivity can be expressed as: , > > , . Ideally, the unknowns should be chosen so that is equally sensitive to all unknowns. Practically, a physical quantity is considered redundant or insensitive if the associated inferred length scale is too large. For the chosen experimental settings and sample this is the case for surface relaxivities of the minority phases and the spatial distribution of susceptibility. Namely, feldspar is considered identical with quartz as the change in is negligible even though the properties of feldspar change moderately. Similarly, the volumetric magnetic susceptibilities of the constituent materials are regarded as fixed parameters in the presence of short and small .
Comparing the Efficiency Dynamically
Another observation is that for DT-ISW-J and DT-ISW-S, most function evaluations are spent in the promising region where both and are low, whereas ST-ISW selects significantly more candidates in the nonpromising region for exploration. To illustrate how the three approaches efficiently minimize the two objectives by trading off exploration and exploitation, we show in Figs. 9a through 9d the top 10 candidates corresponding to each UP identified using the three approaches at 60, 100, 140, and 180 steps. The four plots clearly show that all approaches successfully identified the promising region with various speeds of convergence.
Evolution of the POS at (a) 60, (b) 100, (c) 140, and (d) 180 steps, with the top 10 observed candidates corresponding to each UP identified using the three approaches, DT-ISW-J, DT-ISW-S, and ST-ISW, plotted. Evolution of the UP 2 is also displayed alongside with UP 1 for each approach though the solutions are unrealistic.
Evolution of the POS at (a) 60, (b) 100, (c) 140, and (d) 180 steps, with the top 10 observed candidates corresponding to each UP identified using the three approaches, DT-ISW-J, DT-ISW-S, and ST-ISW, plotted. Evolution of the UP 2 is also displayed alongside with UP 1 for each approach though the solutions are unrealistic.
In summary, DT-ISW-S and DT-ISW-J are more than twice as efficient as ST-ISW mainly due to the following: (1) Both DT-ISW-S and DT-ISW-J simplify the regression by decomposition of the 5D problem into a dual 3D problem; (2) both DT-ISW-S and DT-ISW-J incorporate transfer learning ability, so that the T1 and T2 objectives can leverage information from each other. It is worth noting that both DT-ISW-J and ST-ISW equally minimize and , whereas DT-ISW-S is more powerful in minimization of due to the higher priority assigned to . For optimization problems involving multiple objectives, the multimodal search strategy together with the Pareto solution analysis gives us more flexibility in trading off the conflicting objectives.
Discussion
Nonunique Solution Sets
One major concern of the inverse problem is that many different sets of property estimates may provide satisfactory data fits, requiring additional knowledge to identify the most probable and realistic solution set. The number of nonunique solution sets is positively correlated to the number of main minerals or main unknowns but is usually reduced by the accompanying constraints. If the main phase physical properties are dependent [e.g., via (in)equality constraints], incorporation of multiple measurements not only significantly speeds up the convergence but also reduces the number of nonunique solution sets and increases the confidence level of the results (e.g., the workflow for simultaneous fitting of T1 and T2 presented in this work identifies to two or less unique solution sets compared with our previous work of fitting T2 distributions under various signal-to-noise ratios in which usually two to five unique solution sets are identified) (Li et al. 2021). If they are independent, the number of nonunique solution sets will increase, but there is still very high chance that the global optimal solution set is captured due to application of the multistart SQP optimizer under a larger optimization budget.
Extension to Higher Fields
In this paper, internal magnetic field effects on NMR relaxation resulting from phase susceptibility contrast are incorporated directly into the simulation of T2 relaxation responses. Furthermore, their influence on relaxation is small due to the chosen small interecho time ( μs) and the relatively low NMR proton frequency of 2 MHz. At higher fields, T2 is sensitive to internal field inhomogeneities, necessitating an accurate calculation of the internal magnetic field given the spatial distribution of the volumetric magnetic susceptibility. A first step toward this goal was recently presented by Cui et al. (2022), enabling a high-resolution multiscale calculation of the internal magnetic field. The latter may require an even more sophisticated characterization workflow to assign mineral phases in segmented images and the direct measurement of the susceptibility of some of the individual phases to constrain modeling. The advantage would be the access to T1 frequency dispersion [in particular ] for wettability characterization and the fact that multiple T1 measurements at different field strength or T2 measurements with varying interecho time can further constrain the inversion for unknown intrinsic surface properties.
Application to Other Rocks
The samples considered in this work are outcrops of the Bentheimer sandstone formation, which contains a relatively minute portion of clay minerals and is relatively homogeneous. Many sandstone reservoir formations exhibit a higher degree of heterogeneity and thus wider distribution in pore sizes but also a more complex mineralogy. This poses challenges in terms of balancing field of view and resolution in the micro-CT or generally the 3D image acquisition and processing part of the presented workflow. None of these are insurmountable however: for NMR responses, the self-diffusion coefficient of relevant fluids and gases, and hence traveled distance for relevant relaxation times, implies that responses of domains larger than about 1 mm3 are independent. Thus, heterogeneity in pore and grain sizes can be dealt with by a decomposition of larger cores (full core or core plugs) into smaller domains, potentially combined with a microstructure typing regime to enable a multiscale image acquisition workflow, for example, Jiang and Arns (2020a,-2021). NMR responses can then be derived as ensemble averages over respective domains. Regional variation in clay content or mineralogy may cause local apparent surface relaxivities; decomposing these into intrinsic mineral-specific ones as demonstrated in this work and accounting for local variations in clay distribution should provide an efficient pathway for samples with more complex mineralogy. Given the very efficient workflow presented, multiscale solvers such as Cui et al. (2022) can be utilized in this approach despite their high computational cost.
It is clear that for reservoir rocks the wettability state and wettability alteration for future enhanced oil recovery scenarios are important considerations. The presented workflow, in particular when combined with pore-scale modeling of fluid flow, may provide significant additional insight into the distribution of surfaces with different wetting properties. This may lead to enhanced interpretations of NMR T1 and T2 spectra or higher-dimensional NMR measurements like T1-T2 correlation maps at partial fluid saturations.
The application of the presented technique to long cored sections of reservoirs would benefit from transferring knowledge from already processed cores when similarities between core samples exist. This leads to the concept of transfer learning for study of rock heterogeneity, which is out of scope for the work presented here (see Li et al. 2022).
Finally, we comment on future applications to carbonate rocks, which are often mono-mineralic but exhibit a complex multiscale pore structure. Here the spatial distribution of wettability and connectivity of the pore space may be considered the major target variables in an inverse solution workflow.
Conclusions
In this paper, we introduced a DT-ISW enabling the extraction of multiple intrinsic physical properties associated with T1 and T2 relaxation responses by joint minimization of the deviation between simulated and measured T1 and T2 distributions. An LCM is used in the implementation of transfer learning. This, together with the adaption of the NMR simulator providing both T1 and T2 responses simultaneously, makes the implementation highly efficient.
The workflow is demonstrated on Bentheimer sandstone, identifying five parameters by computationally producing T1 and T2 distributions almost identical with experiments using the dual-output simulator. We compared the performance of a joint optimization (DT-ISW-J) of the L2 norm of T1 and T2 fitting residuals with a sequential implementation (DT-ISW-S) and a single-task optimization (ST-ISW) and draw the following specific conclusions:
DT-ISW-J and DT-ISW-S achieve twice the speed of convergence of ST-ISW due to the statistical modeling of the two highly correlated objectives aided by transfer learning using LCM.
DT-ISW-J and ST-ISW equally minimize the L2 norm of T1 and T2 fitting residuals, whereas DT-ISW-S preferentially minimizes the objective assigned higher importance.
A POS allows balancing T1 and T2 data fits as fitting T1 and T2 responses can result in conflicting objectives.
For both T1 and T2, measured and simulated relaxation time distributions are in excellent agreement over the entire distribution function including short components. This suggests that the conventional interpretation of NMR relaxation time distributions can be carried out on simulated responses, including cutoff analysis for permeability correlations or estimation of bound water. Given the excellent agreement to experiment and availability of micro-CT images, more complex mappings to petrophysical parameters may be introduced.
The method extracts the intrinsic surface relaxivities associated with mineral-fluid contacts, not the heuristic surface relaxivities typically extracted when mapping to external length scale measurements (e.g., mercury intrusion capillary measurements). It also accounts for internal gradients directly. This can be of significant interest when interpreting T1 and T2 measurements at different field strength.
While the study is focused on identification of parameters governing T1 and T2 relaxation, the method can be easily extended to correlation measurements (e.g., T1-T2 and -T2). In turn, respective simulations should provide significant insights into the cause of various features of these distributions and thus improve their interpretation.
Nomenclature
EI acqusitions function related to the T1 and T2 objectives, respectively
- B or B(r)
total magnetic field (at location r), Gauss
- B0
applied static magnetic field, Gauss
- Bdip
dipole field, Gauss
- Bi or Bi (r)
inhomogeneous (offset) internal magnetic field (at location r), Gauss
- Bq′
Q × Q covariance matrix measuring the relationship between the Q outputs of the objective q′
- Bz
z component of the total magnetic field, Gauss
- D
length of the input x, equivalent to the number of unknowns
- De,c
effective diffusion coefficient of water in the clay region, cm2/s
- De or De,pq
diffusion coefficient (from phase p to phase q), cm2/s
- Dobs
observation set
- f(obj) or f(obj) (x)
objective function (as a function of x)
- fp, fc, fq, ff , fh
phase fraction of bulk pore, clay, quartz, feldspar, and high-density phase, respectively, calculated from CT images
objective functions associated with T1 and T2 (as a function of and , respectively
- f(x)
scalar-valued GP
- f(x)
vector-valued GP
- G
average or effective internal gradient strength, Gauss
Gaussian process; for scalar-valued , the function is distributed as a Gaussian process with mean function m(x) and covariance function ; for vector-valued GP, , the function is distributed as a Gaussian process with mean function m(x) and covariance function
- I
identity matrix
- kSE, or kSE(xp, xq)
parameterized squared exponential kernel (for candidates xp and xq)
covariance function for scalar-valued GP
- K or Kf (X, X)
covariance matrix for the (noise-free) latent function values f for vector-valued GP
covariance matrix for the (noise-free) latent function values f for scalar-valued GP
covariance matrix for the (noise-free) latent function values f for the objective q′ for vector-valued GP
variance of predicted candidate in scalar-valued GP
covariance between predicted and observed candidates in scalar-valued GP
- Kf (x∗, x∗)
covariance of predicted candidate for vector-valued GP
- Kf (x∗, X)
covariance between predicted and observed candidates for vector-valued GP
covariance function for vector-valued GP, with its entries expressing the correlation not only between the input pairs x and x′ but also between the outputs pairs and
- ls
grid refinement factor
- m0
scalar-valued constant mean function
- m0
vector-valued constant mean function
constant mean for a single predicted candidate
vector of constant mean for a single predicted candidate
- m1,exp and mt,exp
measured longitudinal and transverse relaxation decay, respectively
- m1,sim and mt,sim
simulated longitudinal and transverse relaxation decay, respectively
T1 and T2 magnetization decay as function of t, respectively
- m(x)
scalar-valued mean function
vector-valued mean function, equivalent to
- M0
initial signal strength for a random walker
overall T1 and T2 magnetization decays, respectively
recorded T1 and T2 decays for a random walker after Ns time steps, respectively
- n
number of observed candidates
- nE
echo count
- N
number of phases
- NE
total number of echoes
- Ns
total number of steps carried out by a random walker
normal distribution: a Gaussian (normal) distribution with mean vector μ and covariance matrix Σ
- pmov or pmov,pq
movement probability (traversing from phase p to phase q)
- Q
number of correlated objective functions
total number of seperable kernels
- r0 and rj
location of the start and later positions of a random walker, respectively, µm
- rs and rs+1
position of a random walker at step count s and s + 1, respectively, µm
- s
step count
- S
surface area of an individual pore, µm2
T1 and T2 signal damping factors (traversing from phase p to phase q), respectively
- t
elapsed time, seconds
- tE
echo time, seconds
- ts and ts+1
internal clock for a random walker at step count s and s + 1, respectively, seconds
- T1b and T2b
T1 and T2 relaxation time, respectively, seconds
- T1e,c, T1b,w
effective T1 of the clay region and bulk water, respectively, seconds
- T2e,c, T2b,w
effective T2 of the clay region and bulk water, respectively, seconds
- V
volume of an individual pore, μm3
- w
an individual walker
- x
input for the objective function,
- x*
a single predicted candidate
- xl and xu
lower and upper bounds of the pre-specified search domain
the current best observed candidate for the qth objective function
- x(shared)
shared component of
lower-dimension component of x, i {1, 2}
unique component of
input for the objective function associated with T1 and T2, respectively
- X
matrix of observed candidates
solution space
- χν
volumetric magnetic susceptibility, µSI
- χν,w, χν,c, χν,q, χν,f, χν,h
volumetric magnetic susceptibility of water, clay, quartz, feldspar and high-density phase, respectively, µSI
- y
vector of observations
- y or yq
prediction (for the objective q)
- y*
vector of predictions
- y*
prediction for a single candidate
- γ
gyromagnetic ratio, 2.6753×104, rad / (Gauss⋅s)
- ε
lattice resolution, or voxelsize, µm
- εw
step length of a random walk, µm
- Δt or Δtpq
timestep for a random walker (traversing from phase p to phase q), seconds
total phase shift for a random walker, rad
vector of model hyperparameters for scalar-valued GP (in log10 scale)
vector of model hyperparameters for vector-valued GP (in log10 scale)
Q × Q predictive covariance matrix for a single observed candidate using vector-valued GP
Q × Q diagonal matrix accommodating task-dependent noise levels,
probability density function of T1 and T2, inverted from simulated and experimental decay, respectively
probability density function of T1 and T2, inverted from simulated decay and conditioned on the input , respectively
- ηn (t)
white noise
- Λ
matrix of characteristic length scales, Λ = diag (ℓ)−2
- λ
regularization parameter
predictive mean for a single predicted candidate using scalar-valued GP
predictive mean for a single predicted candidate using vector-valued GP
- ρ1,q and ρ2,q
T1 and T2 surface relaxivity of quartz, respectively, µm/ s
longitudinal and transverse surface relaxivities derived by adjusting the dominant mode of the NMR relaxation time measurements to an external length scale reference measurement, respectively, µm/ s
- ρ1 and ρ2, or ρ1,pq and ρ2,pq
longitudinal and transverse surface relaxivity (traversing from phase p to phase q), respectively, µm/s
signal amplitude of covariance function
noise variance of covariance function
vector of noise variance,
predictive variance for a single predicted candidate using scalar-valued GP
- τ
time delay in inversion-recovery pulse sequence, seconds
- τE
interecho time, seconds
- φ(x)
probability density function
- Φ(x)
cumulative distribution function
porosity from experiment
accumulated phase shift due to diffusion in Bi, rad
porosity from simulation
calculated porosity from simulation, related to phase p and q, respectively
Data Statement
The tomogram data that support the findings of this study are made available online at the Digital Rocks Portal (https://doi.org/10.17612/1J6K-SH07).
Acknowledgments
CHA acknowledges the Australian Research Council for funding through ARC discovery project DP200103548. This work was supported by computational resources provided by the Australian Government through gadi under the National Computational Merit Allocation Scheme (grant m65).
Appendix A—Derivation of LCM
Given the data set , the jth outputs are indeed expressed as the sum of the mean function and linear combinations of independent random functions, that is,
where the latent functions are independent samples from a specific GP following , is the scalar coefficient specifying relative contribution associating with each for the jth task. The independence assumption can be relaxed when the objective functions are expressed through a linear combination of nonorthogonal latent functions (Vargas-Guzmán et al. 2002). The significance of nonorthogonality is that in some real cases, a linear combination may be required to couple several correlated attributes in a single model to include negative cross-covariance between components. Since the modeled objectives are usually considered positively correlated, expressing the objective function by nonorthogonal latent functions are not thoroughly discussed here and we leave it to the interested readers for in-depth studies. Here we implement automatic relevance determination to the covariance function . Under independency assumption, the covariance between is given by if , and Eq. (A.1) can be rewritten by grouping together the latent functions which share the same covariance, that is,
Eq. (A-2) can be viewed as linear combinations of groups of functions within which the independent functions share the same covariance. Under independency assumption, the covariance between any pair of can expressed as
which exactly specifies the entries of . The multitask kernel can be then expressed as
where specifies entries of the coregionalization matrix .
Appendix B—Parameterization of LCM
Compared with parameterization of the scalar-valued GP in Li et al. (2021), the vector-valued GP requires inference on the additional correlation matrix . To enforce positive definiteness and also suggested by Swersky et al. (2013), we represent using its Cholesky factor , where , as
whereas enforcing that the diagonal of is nonnegative. Assuming constant mean functions for each task with a full-rank LCM, we give the parameterization of the vector-valued GP in Table B-1.
Parameterization of the vector-valued GP modeled by full-rank LCM.
Task . | . | . | . | . |
---|---|---|---|---|
1 | ||||
2 | ||||
… | … | … | … | … |
Task . | . | . | . | . |
---|---|---|---|---|
1 | ||||
2 | ||||
… | … | … | … | … |
In addition to the mean and noise variance hyperparameters, LCM calls for hyperparameters in the Cholesky factor and in length scales. In summary, the parameterization of the vector-valued GP using LCM can be expressed as and practically the following parameterization scheme is adopted, that is,
The full-rank LCM () and reduced-rank LCM () are suitable for objectives with moderate and high similarities respectively, and in this study the full-rank LCM is adopted given no information about the intertask correlation.
In ST-ISW, we adopted a fully Bayesian treatment on model hyperparameters (Snoek et al. 2012; Li et al. 2021). However, such treatment soon becomes intractable due to the cubic growth of the computational burden with the number of observed candidates. As a result, the model hyperparameters are determined by maximum likelihood estimation. We use the multistart L-BFGS-B to ensure that the hyperparameter settings is globally optimum.
Article History
Original SPE manuscript received for review 22 December 2021. Revised manuscript received for review 26 April 2022. Paper (SPE 210563) peer approved 28 April 2022.