Deep learning methods are advanced machine learning (ML) techniques for data processing, information retrieval, pattern recognition, and diagnostics. Deep learning algorithms are increasingly being adopted in remote sensing applications.
This study applies novel deep-learning-assisted methods to synthesize the NMR-T2 distribution response of fluid-filled porous subsurface earth formations around the wellbore by processing conventional, easy-to-acquire subsurface measurements in the absence of a downhole NMR logging tool.
Four types of deep learning neural networks are applied to NMR T2 synthesis: a Variational Autoencoder (VAE) assisted neural network (VAE-NN), a Generative Adversarial Network (GAN) assisted neural network (GAN-NN), a Long Short-Term Memory (LSTM) network, and a Variational Autoencoder with a convolutional layer (VAEc). NMR T2 data has both spatial and temporal characteristics. Each NMR T2 distribution is a 64-dimensional vector, which contains spatial features. In addition, the NMR T2 data contains T2 relaxation times that are inverted from raw NMR logs. These two characteristics inspired us to use neural networks that are suitable for spatial data (VAE, VAEc) and a neural network that is suitable for time series (LSTM).
The four models are implemented to extract the complex relationships between the NMR-T2 distribution log and other easy-to-acquire well logs. Two types of logs are used as inputs: 12 easy-to-acquire raw logs and 13 inverted formation mineral and fluid composition logs. The 12 raw logs comprise five array-induction four-foot resistivity logs (AF10, AF20, AF30, AF60, AF90), caliper (DCAL), compressional sonic (DTCO), shear sonic (DTSM), gamma ray (GR), neutron porosity (NPOR), photoelectric factor (PEFZ), and formation density (RHOZ). The inverted compositional logs are formation fluid and mineral compositions inverted from resistivity, neutron, density, GR, and dielectric measurements. They comprise 10 mineral composition logs (anhydrite, calcite, chlorite, dolomite, halite, kerogen, illite, K-feldspar, montmorillonite, and quartz) and 3 fluid composition logs (free water, free oil, and bound water). Not all of the 12 raw logs and 13 compositional logs are used as inputs: some are removed during dimensionality reduction because they are highly correlated with other logs. The four models applied in this study contain unsupervised ML algorithms with generative abilities. Training the models requires exposure to a limited NMR-T2 distribution dataset prior to prediction.
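The removal of highly correlated logs can be illustrated with a minimal sketch. The text does not state which criterion or threshold was used, so the greedy Pearson-correlation filter below, the function names `pearson` and `drop_correlated`, the 0.95 threshold, and the toy values are all assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        return 0.0  # a constant log has no defined correlation; treat as 0
    return cov / (sx * sy)

def drop_correlated(logs, threshold=0.95):
    """Greedily keep logs whose |r| with every already-kept log is <= threshold.

    logs: dict mapping log name -> list of depth samples. Insertion order
    decides which member of a highly correlated pair survives.
    """
    kept = []
    for name, values in logs.items():
        if all(abs(pearson(values, logs[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Toy values, made up for illustration (not data from the study).
logs = {
    "AF10": [1.0, 2.0, 3.0, 4.0, 5.0],
    "AF20": [1.1, 2.1, 3.0, 4.2, 5.1],   # nearly a copy of AF10
    "GR":   [80.0, 20.0, 55.0, 30.0, 70.0],
}
selected = drop_correlated(logs)
print(selected)
```

In this toy case the second resistivity curve is dropped because it nearly duplicates the first, while the gamma-ray curve is retained.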
An autoencoder is a type of deep neural network that is trained to reproduce its input as its output by implementing an encoder followed by a decoder (Goodfellow et al., 2016). The encoder ends in a latent layer of lower dimension than the preceding layers. The encoder projects the input data onto the latent layer, and the decoder then reconstructs the input from the resulting latent vector. With this bottleneck structure, an autoencoder learns to extract the most important information as the input passes through the latent layer. An autoencoder is therefore an effective way to project data from a high dimension to a lower dimension by identifying the most dominant features and characteristics. A variational autoencoder is a specific form of autoencoder in which the latent vectors are constrained to follow a Gaussian distribution (Kingma and Welling, 2013), which adds uncertainty to the latent variable. The VAE arranges learned features with similar shapes close to each other in the projected latent space, thereby reducing the loss in the reproduction of the input.
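The Gaussian constraint on the latent vector can be made concrete with a short sketch of the two ingredients commonly used in VAE training: the reparameterized sample and the closed-form KL divergence to a standard normal prior. This is a generic illustration, not the study's implementation; the function names and the latent size of 2 are assumptions.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))

def mse(x, x_hat):
    """Mean squared reconstruction error between input and reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

rng = random.Random(0)                    # fixed seed for repeatability
mu, log_var = [0.5, -1.0], [0.0, -2.0]    # hypothetical encoder outputs
z = reparameterize(mu, log_var, rng)      # latent vector passed to the decoder
kl = kl_to_standard_normal(mu, log_var)   # regularization term of the loss
print(z, kl)
```

A training loss would typically add the KL term to a reconstruction term such as `mse`, so the encoder is pulled toward the Gaussian prior while the decoder is pulled toward faithful reconstruction.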
A GAN is composed of two neural networks, a generator and a discriminator, and aims to learn from training data and produce data that are similar to the training data (Goodfellow et al., 2014). GANs have shown great potential in image generation (Radford et al., 2015) and text-to-image synthesis (Reed et al., 2016). In this case, the generator learns to generate T2 distributions that the discriminator cannot distinguish from the measured T2 distributions in the training dataset. With a proper architecture and training process, the generator gains the ability to generate data that are very similar to those in the training dataset. The GAN-NN learns from the NMR T2 distributions measured in the formation under investigation and then predicts T2 based on matrix composition and fluid saturations.
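The adversarial objective can be sketched with plain binary cross-entropy losses. The text does not specify the loss used in the GAN-NN, so the non-saturating generator loss shown here is an assumption, and the discriminator scores are made-up numbers rather than outputs of a trained network.

```python
import math

def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def discriminator_loss(d_real, d_fake):
    """Mean BCE with measured samples labeled 1 and generated samples labeled 0."""
    losses = [bce(p, 1) for p in d_real] + [bce(p, 0) for p in d_fake]
    return sum(losses) / len(losses)

def generator_loss(d_fake):
    """Non-saturating loss: low when the discriminator scores fakes as real."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)

# Hypothetical discriminator scores (probability that a T2 distribution is measured).
d_real = [0.9, 0.8]   # scores on measured T2 distributions
d_fake = [0.2, 0.3]   # scores on generated T2 distributions
print(discriminator_loss(d_real, d_fake), generator_loss(d_fake))
```

When the discriminator outputs 0.5 for every sample, both losses equal ln 2, which corresponds to a discriminator that can no longer tell generated data from measured data.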
A CNN (Convolutional Neural Network) is inspired by the complex arrangement of the biological visual cortex, where a set of neurons is sensitive to a specific visual feature, such as an angle, curvature, or edge. A convolutional layer in a CNN contains a combination of mathematical filters, each of which is sensitive to a specific feature of an image. When multiple convolutional layers are combined, a CNN can learn features of varying visual complexity. A Variational Autoencoder is a special type of generative neural network that aims to reproduce the input of the network. It comprises two computational frameworks, an encoder followed by a decoder. The encoder encodes the input into a latent layer, whereas the decoder decodes the information in the latent layer to reconstruct the input. As a generative model, the VAE has been applied to various generative modeling problems such as image generation (Dosovitskiy and Brox, 2016) and text generation (Bowman et al., 2015). We assume that combining a VAE with a convolutional layer improves the learning ability of the VAE.
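To show what a single convolutional filter does to a 64-bin T2 vector, the sketch below slides a hand-picked edge-detecting kernel over a synthetic peak. In a trained VAEc the filter weights would be learned; the kernel, the triangular peak, and the function names here are assumptions chosen only to make the behavior visible.

```python
def conv1d(signal, kernel):
    """Valid 1D cross-correlation (what deep learning libraries call convolution)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def relu(values):
    """Rectified linear activation applied element-wise."""
    return [max(0.0, v) for v in values]

# Synthetic 64-bin "T2 distribution": one triangular peak centered on bin 20.
t2 = [max(0.0, 1.0 - abs(i - 20) / 5.0) for i in range(64)]

# Hypothetical filter that responds to rising amplitude (a leading edge).
edge_kernel = [-1.0, 0.0, 1.0]

features = relu(conv1d(t2, edge_kernel))
print(len(features))
```

The resulting feature map is non-zero only on the rising flank of the peak, which is the sense in which a filter is sensitive to one specific local feature.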
An LSTM is a type of recurrent neural network (RNN) with an improved ability to handle the vanishing gradient problem of RNNs. With its ability to ‘memorize’ values over long spans, the LSTM excels at handling sequences, such as text, where values at different positions may be strongly dependent on one another. LSTM applications include machine translation (Cho et al., 2014; Sutskever et al., 2014), natural language generation (Wen et al., 2015), and time-series prediction. The joint application of LSTM and CNN enables video description (Venugopalan et al., 2014), image captioning (Vinyals et al., 2015), and text-to-image generation (Reed et al., 2016). The NMR T2 distribution is the distribution of T2 relaxation times. We assume that the LSTM can handle the dependencies among the NMR T2 relaxation time values better than the other models.
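The gating mechanism behind the LSTM's memory can be sketched as a single cell stepped along the 64 T2 bins. This uses the standard LSTM equations with random, untrained weights; the hidden size of 4, the scalar input per step, and the helper names are assumptions for illustration and do not reflect the architecture used in the study.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_params(hidden, rng):
    """Random (input weights, recurrent weights, bias) for each of the four gates."""
    def one_gate():
        wx = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
        wh = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(hidden)]
        return wx, wh, [0.0] * hidden
    return {name: one_gate() for name in ("i", "f", "o", "g")}

def lstm_step(x, h, c, params):
    """One LSTM step for scalar input x and state lists h (hidden), c (cell)."""
    def gate(name, act):
        wx, wh, b = params[name]
        return [act(wx[k] * x + sum(wh[k][j] * h[j] for j in range(len(h))) + b[k])
                for k in range(len(h))]
    i = gate("i", sigmoid)      # input gate: how much new information to write
    f = gate("f", sigmoid)      # forget gate: how much old cell state to keep
    o = gate("o", sigmoid)      # output gate: how much cell state to expose
    g = gate("g", math.tanh)    # candidate cell values
    c_new = [f[k] * c[k] + i[k] * g[k] for k in range(len(h))]
    h_new = [o[k] * math.tanh(c_new[k]) for k in range(len(h))]
    return h_new, c_new

def run_lstm(sequence, hidden=4, seed=0):
    """Feed a sequence through the cell one value at a time; return final states."""
    params = init_params(hidden, random.Random(seed))
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return h, c

# Synthetic 64-bin sequence standing in for a T2 distribution.
sequence = [max(0.0, 1.0 - abs(i - 20) / 5.0) for i in range(64)]
h_final, c_final = run_lstm(sequence)
print(h_final)
```

Because the cell state is updated additively through the forget and input gates, information from early bins can persist to later bins, which is the property the text relies on.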
The NMR-T2 synthesis processes of VAE, VAEc, and GAN-NN are similar. First, a deep neural network is pre-trained to extract the features of the NMR-T2 distribution. NMR T2 distributions have inherent characteristics: all of them resemble combinations of Gaussian distributions. In this step, the models learn the general characteristics of NMR T2. Second, a simple multilayer neural network is connected to the pre-trained deep neural network to predict the NMR-T2 distribution from conventional logs. The two-step training process makes the prediction results stable. Instead of predicting each value of the NMR T2 distribution separately, our neural network predicts the complete NMR-T2 spectrum, thereby providing greater constraint on the synthesis. To the best of our knowledge, no similar study attempted to synthesize the NMR-T2 distribution prior to our work published in various journals (Li and Misra, 2017a; Li and Misra, 2017b; Li and Misra, 2018; Li et al., 2019; Misra and He, 2019a; Misra and Li, 2019). Six shallow models are also trained and tested with the same data for comparison with the deep models.
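The observation that NMR T2 distributions resemble combinations of Gaussian distributions can be illustrated by building a synthetic 64-bin spectrum. The log-spaced bin range of 0.3 ms to 3000 ms, the placement of the Gaussians in log10(T2), and the two example components are assumptions; the text gives only the vector length of 64.

```python
import math

def t2_bins(n_bins=64, t2_min_ms=0.3, t2_max_ms=3000.0):
    """Log-spaced T2 relaxation-time bins in ms (range is an assumed choice)."""
    lo, hi = math.log10(t2_min_ms), math.log10(t2_max_ms)
    step = (hi - lo) / (n_bins - 1)
    return [10 ** (lo + k * step) for k in range(n_bins)]

def gaussian_mixture_t2(bins_ms, components):
    """Sum of Gaussians in log10(T2).

    components: list of (amplitude, center_ms, width_in_decades).
    """
    spectrum = []
    for t in bins_ms:
        x = math.log10(t)
        spectrum.append(sum(a * math.exp(-0.5 * ((x - math.log10(mu)) / w) ** 2)
                            for a, mu, w in components))
    return spectrum

bins_ms = t2_bins()
# Hypothetical bimodal spectrum: a small fast-relaxing peak and a larger slow one.
components = [(0.6, 3.0, 0.25), (1.0, 200.0, 0.30)]
spectrum = gaussian_mixture_t2(bins_ms, components)
peak_ms = bins_ms[spectrum.index(max(spectrum))]
print(len(spectrum), peak_ms)
```

A pre-trained generative network that has learned this family of smooth, multi-peaked shapes only needs a few latent values to select one, which is one way to read the claim that predicting the whole spectrum constrains the synthesis.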