Deep Transfer-Learning Based Lithium-Ion Battery Fault Diagnosis

Fault detection in lithium-ion batteries (LIBs) is essential to ensure proper battery life and operation.

Introduction

Background
Objectives & Assumptions
Contributions
Summary

Another contribution of this work is the use of partial voltage curves for fault detection. The main goal of this research is to develop a fault diagnosis framework that can be used in various LIBs.

Literature Review

Introduction

Fault Diagnosis Algorithms Review

Results from their experiments demonstrated the advantages of the technique for detecting external soft short-circuit faults. Results from their experiments demonstrated the effectiveness of the model and of fault severity detection.

Summary

LSTMs

Long short-term memory (LSTM) neural networks are a subset of recurrent neural networks (RNNs), which in turn are a subset of artificial neural networks, generally classified under the umbrella of deep learning/machine learning [38]. RNNs, unlike more traditional neural networks such as multilayer perceptrons (MLPs) and convolutional neural networks (CNNs), can handle sequential data. The governing equations for an LSTM cell are shown in (1), where x, C, Q, and h are the input vector, cell state, candidate values, and hidden state, and f, I, and o are the activations of the forget, input, and output gates, respectively.
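Equation (1) is not reproduced in this text. As a point of reference, the widely used LSTM formulation can be written with the symbols defined above. The weight matrices W, the bias vectors b, the time index t, the sigmoid function σ, and the element-wise product ⊙ are notation assumed here, and the thesis's own equation may differ in detail.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) \\
I_t &= \sigma\left(W_I\,[h_{t-1}, x_t] + b_I\right) \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) \\
Q_t &= \tanh\left(W_Q\,[h_{t-1}, x_t] + b_Q\right) \\
C_t &= f_t \odot C_{t-1} + I_t \odot Q_t \\
h_t &= o_t \odot \tanh\left(C_t\right)
\end{aligned}
```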

CNNs

Finally, the pooling layer is used to reduce the dimensionality/spatial size of the inputs to later layers, either to help with the vanishing gradient problem or because the following layers expect inputs of a different dimension. The formula for the pooling layer is shown in (4), where W1 and W2 are the input and output widths, H1 and H2 are the input and output heights, D1 and D2 are the input and output depths, F is the kernel size, and S is the stride length. The underlying techniques that make ResNets a viable candidate for image recognition are residual blocks and skip connections.
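Equation (4) is not reproduced in this text. The sketch below assumes the standard pooling output-size relation, W2 = (W1 - F)/S + 1 and H2 = (H1 - F)/S + 1 with the depth unchanged (D2 = D1), and no padding. The function name `pool_output_shape` is hypothetical, and the floor division mirrors the default behaviour of most deep learning frameworks when the window does not tile the input exactly.

```python
def pool_output_shape(w1, h1, d1, f, s):
    """Return (W2, H2, D2) for a pooling layer without padding.

    w1, h1, d1: input width, height, and depth
    f: kernel (window) size, s: stride length
    Assumes the standard relation W2 = (W1 - F) / S + 1 (floored).
    """
    if f > w1 or f > h1:
        raise ValueError("kernel size must not exceed the input size")
    if s <= 0:
        raise ValueError("stride must be positive")
    w2 = (w1 - f) // s + 1
    h2 = (h1 - f) // s + 1
    d2 = d1  # pooling acts on each channel independently
    return w2, h2, d2


if __name__ == "__main__":
    # A 2x2 window with stride 2 halves the spatial size of a 224x224 input.
    print(pool_output_shape(224, 224, 64, f=2, s=2))
```

With a 2x2 window and a stride of 2, each spatial dimension is halved while the depth is preserved, which is the usual way pooling shrinks feature maps between convolutional stages.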

Traditional deep neural networks without skip connections suffer from this problem because too little information about the images propagates to the deeper layers of the network. ResNet addresses this with skip connections that carry the input of a block directly to its output, allowing all layers to be used effectively, which in turn leads to better image classification performance.

Transfer Learning

The primary method of utilizing transfer learning is one in which there is a source domain with a large number of data points and a target domain with a smaller number of data points. The data representations for the target and source domains are usually the same, although this may not be the case in scenarios where feature extraction is used. In this thesis, the data representation is images of size 224×224 pixels, the source domain is the ImageNet dataset [43], and the target domain is the voltage curve dataset generated from the full and partial charge voltage curves of lithium-ion batteries.

Summary

Methodology

Experimental Setup & Induced Faults

A close-up of the battery, along with the thermocouple and heat pad used to induce the thermal fault, is shown in Figure 4.3b. The researchers found that an increase in temperature leads to an increase in the cell's ionic conductivity and, conversely, that a decrease in temperature leads to a decrease in ionic conductivity. As the thermal fault generates additional heat for the battery cell, the battery temperature rises and the cell's internal resistance decreases, leading to an associated change in the voltage curve.

This change is an increase in the time it takes to complete the constant current (CC) part of the charge and a slight reduction in the overall voltage due to the lower internal resistance. The use of an external resistance to simulate an internal short circuit is a technique used by other researchers, as in [22]. For the healthy class, data are obtained from the battery under normal operating conditions, and this class serves as a baseline.

For each fault, the battery is discharged to a random value between 40% and 60% SOC and then charged to 100% SOC at a rate below 1 C under a constant current constant voltage (CCCV) charging protocol. To create the data set, the partial charge voltage data are plotted and labeled according to the fault class from which they were created. This is verified by the results of the validation experiments, which show similar classification performance.

Figure 4.1: Experimental Process Overview (Voltage Charge Curve)

RES152

Voltage Monitoring & LSTM Prediction

Residual Generation

Training Optimization & Fault Classification

Normalization adjusts the pixel values of the battery images to the equivalent values on the ImageNet scale. Cross-validation involves rearranging the data points contained in the test set and the training set to ensure that the network has the opportunity to "learn" from each instance in the data set. The final optimizations were the Adam optimizer [50] and the one-cycle fit policy [51].
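The two preprocessing steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the thesis's implementation: the per-channel mean and standard deviation are the commonly published ImageNet statistics, the fold count is arbitrary, and the names `normalize_pixel` and `k_fold_indices` are hypothetical.

```python
# Commonly published ImageNet per-channel statistics (RGB, pixel values in [0, 1]).
# Assumed here; the source does not list the values it used.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Map one 8-bit (R, G, B) pixel onto the ImageNet scale."""
    return tuple(
        (channel / 255.0 - mean) / std
        for channel, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) so every sample is held out once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [i for f in range(k) if f != held_out for i in folds[f]]
        yield train, test


if __name__ == "__main__":
    print(normalize_pixel((255, 255, 255)))
    for train, test in k_fold_indices(10, 5):
        print(len(train), test)
```

Each sample appears in exactly one held-out fold and in the training portion of every other fold, which is what gives the network a chance to learn from every instance.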

The loss function used to set the parameters of the network is Cross-Entropy Loss. Cross-Entropy Loss provides the ability to map a prediction probability to each image in the dataset. This allows for fine-grained tuning of the network's weights and biases to maximize the likelihood of the correct label for the image.
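The loss equation itself is not reproduced in this text. A common batch-averaged form of the cross-entropy loss is sketched below for reference. For the multi-class case it assumes that y_i is a one-hot label vector and that ŷ_i is the vector of predicted class probabilities, so that the product is a dot product; the thesis's own equation may be written differently.

```latex
L = -\frac{1}{m} \sum_{i=1}^{m} y_i \cdot \log\left(\hat{y}_i\right)
```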

L represents the value of the cross-entropy loss, i represents an instance in each batch, m represents the number of samples in the batch, yi represents the actual label of instance i, and 𝑦̂𝑖 represents the predicted label of instance i. TP represents the true positives, FP represents the false positives, TN represents the true negatives, and FN represents the false negatives. The figures contain representative images of the voltage curves of the four faults, along with a zoomed-in section to show the dynamics of the partial voltage curve during charging.
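The metric formulas built on TP, FP, TN, and FN are not reproduced in this text. The sketch below assumes the standard definitions of accuracy, precision, recall, and F1 score; which of these the thesis reports is not stated here, and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard metrics from confusion-matrix counts for one class.

    Returns 0.0 for any metric whose denominator is zero.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(classification_metrics(tp=8, fp=2, tn=9, fn=1))
```

For a multi-class problem such as the four fault classes, these counts would be taken per class in a one-versus-rest fashion from the confusion matrix.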

Figure 4.6: Voltage Charge Image Classification Process

Summary

Results and Discussion

Introduction

Experimental Results using Plotted Voltage Charge Curves

Various problems seem to occur during network training, such as the vanishing gradient phenomenon. It can be assumed that the network is stuck in a local minimum despite efforts to optimize the training. As for HEALTHY and ISC, the network encounters some difficulty in distinguishing between these two classes.

As such, the network cannot differentiate one case from the other, as evidenced by the results from the confusion matrix. It can be reasoned from these results that the network is robust and can perform well on never-before-seen data. As in the training results, the network struggles with the HEALTHY and THERM classes.

As with the training set and validation set 1, the network confuses the HEALTHY class with the THERM class. From these results, it can be concluded that the mode of fault simulation also plays a role in the network's ability to accurately discriminate between fault classes. The third contribution of this work is the ability to quickly and accurately train the network with a relatively small amount of data (voltage charge cycles).

Figure 5.1: Accuracy Plot for Training and Testing Set (Voltage Charge Curve)

Experimental Results using Plotted Voltage Residuals

A summary of the parameters used for the different algorithms in the comparative study is presented in Table 4.3. The batteries used for the validation set from the full charging case and the batteries used in validation set 1 are identical, so the network performance is comparable. However, the decrease in network accuracy can be attributed to two factors.

This is expected because the batteries used to generate this data set are different from those of validation set 1. The validation set 1 confusion matrix shows a phenomenon similar to that of the full charge validation set confusion matrix. From the confusion matrix of validation set 2, the network again makes a classification error between the HEALTHY and THERM cases.

For validation set 1, although reduced, there is still a case where the THERM case is confused with the HEALTHY case. Furthermore, for validation set 2, there is also a case where the ISC case is mistaken for a HEALTHY case. This illustrates the benefits of transfer learning for both the image classification CNN and the voltage prediction LSTM.

Figure 5.8: Accuracy Plot, Full Charge, Validation Set (Voltage Residual Curve)

Summary

For the confusion matrices of validation sets 1 and 2, a trend similar to the previous two experiments begins to develop. This speaks to the effectiveness of transfer learning due to the small training dataset size used for voltage prediction LSTM training and image classification CNN training.

Figure 5.10: Accuracy Plot, Partial Charge (Random Initialization) (Voltage Residual Curve)

Conclusion and Future Works

Conclusion

Future Works

The third contribution of this work is the ability of the network to be used for different battery chemistries. Real-time experimental results from simulations demonstrate the novelty and effectiveness of the developed framework. In [5], the authors created a fault diagnosis system for current and voltage sensors via modeling the battery pack as a hybrid system, combined with the use of an unscented particle filter.

It estimates the cell terminal voltage and generates residuals, which are then used for fault detection. Their experiments gave promising results and showed the effectiveness of the proposed method. The technique was validated on cell data from a typical flight cycle, which confirmed the effectiveness of the model.
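The residual idea described above can be sketched as the difference between the measured and the estimated terminal voltage, with a sample flagged when the residual magnitude exceeds a limit. This is a minimal illustration only: the fixed threshold, the sample values, and the function names `voltage_residuals` and `flag_faults` are assumptions, not details taken from the cited work.

```python
def voltage_residuals(measured, estimated):
    """Return measured minus estimated terminal voltage, sample by sample."""
    if len(measured) != len(estimated):
        raise ValueError("measured and estimated must have the same length")
    return [m - e for m, e in zip(measured, estimated)]


def flag_faults(residuals, threshold):
    """Mark samples whose residual magnitude exceeds the (assumed) threshold."""
    return [abs(r) > threshold for r in residuals]


if __name__ == "__main__":
    # Made-up voltages in volts, for illustration only.
    measured = [3.70, 3.75, 3.80]
    estimated = [3.70, 3.74, 3.60]
    residuals = voltage_residuals(measured, estimated)
    print(flag_faults(residuals, threshold=0.05))
```

A fixed threshold is the simplest possible decision rule; the approach summarized in this thesis instead plots the residuals and classifies the resulting images.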

Results of their experiments showed that the proposed technique can detect potential thermal runaway cells days before the fault occurs. The experimental results demonstrated the effectiveness of the technique, but a major drawback, like other data-driven techniques discussed, is the amount of data required for validation. The results of the experiment confirmed the effectiveness of the fault diagnosis scheme in terms of its ability to diagnose the fault and also detect the severity of the fault.

The technique accounts for battery degradation and for the masking effects of developed faults. A review of modern fault diagnosis algorithms based on machine learning is performed in [37]. To the best of the author's knowledge, no attempt has been made to use partial voltage charge curves for fault diagnosis.

Further investigation shows that this confusion is largely due to the way the ISC fault is simulated.