Wind turbine fatigue loads statistical estimation from standard signals


UNIVERSIDAD POLITÉCNICA DE MADRID
ESCUELA TÉCNICA SUPERIOR DE INGENIERÍA AERONÁUTICA Y DEL ESPACIO

MASTER'S DEGREE IN AERONAUTICAL ENGINEERING
MASTER'S THESIS

Wind turbine fatigue loads statistical estimation from standard signals

Darío PÉREZ CAMPUZANO
Aircraft specialty

Tutors: Enrique GÓMEZ DE LAS HERAS CARBONELL, Cristóbal José GALLEGO CASTILLO

September 2016


Author: Darío PÉREZ CAMPUZANO (Aircraft specialty)
Professional Tutor: Enrique GÓMEZ DE LAS HERAS CARBONELL, Wind turbine loads leader, Gamesa
Academic Tutor: Cristóbal José GALLEGO CASTILLO, Aerospace vehicles dept., ETSIAE UPM

This thesis was supported by Gamesa. Their cooperation is hereby gratefully acknowledged.


"Invest yourself in everything you do. There's fun in being serious."
John Coltrane

[Illustration: Regression towards mediocrity in hereditary stature, Francis Galton (1822-1911).]

Acknowledgements

First of all, I would like to express my regard for all the colleagues with whom I have shared so many moments at University, especially Ana María, Alberto and Paraíso. I met Ana María during my first days in Madrid and the others later, but all of them have been very important at different stages of these last years. Without their attention and unconditional support this long road would have been much tougher. I would also like to thank Cristina for her consideration: were it not for her advice, the title of this thesis would look somewhat different.

Obviously, I cannot ignore the encouragement given by my family throughout my life. Their backing of my decisions has always been undeniable. I owe to you and to the education you gave me almost everything that I have achieved, hence a large share of this thesis is yours.

I also feel lucky and proud to have worked with Cristóbal and Álvaro from the Aerospace Vehicles Department. They introduced my classmates and me to the research field and encouraged our creative potential within engineering. I am also beholden to them for having shared much of their knowledge in areas such as forecasting, estimation and neural networks, to name but three.

Last but not least, I must express my gratitude to the GAMESA company and all its employees for their welcome, help and suggestions during the development of this project. Enrique deserves special recognition: his technical assistance and support were essential for several aspects related to wind turbine behavior. Furthermore, he was in charge of reviewing this report, and his comments have enriched this thesis and made it much more coherent and understandable.

Surely I am forgetting a lot of people who deserve a huge space on this small page. Unfortunately, I must conclude here, since this report must be sent to the copy shop in a couple of hours. For those who appear here and for those who should: gracias.

[Illustration: Human retinal neuron, Santiago Ramón y Cajal (1852-1934).]

Summary (Resumen)

Fatigue loads are a critical element in many aeronautical applications, and wind turbines (WTs) are no exception. Their evolution over the service life of the machine frequently determines its useful life, and knowledge of their behavior can therefore have a great impact on the Cost of Energy (CoE). However, measuring them is often complicated or costly, and estimations based on other known signals can be made instead.

This thesis aims to design a load estimation model that uses standard signals as input variables, by means of statistical and machine learning procedures. It includes an input selection stage and the design and evaluation of the model configurations, which are based on Artificial Neural Networks (ANNs). The complete process is described in the following lines.

First, the data from the simulations are gathered and examined. The Damage Equivalent Loads (DEL) (fatigue indicators that act as targets) and the stats (statistical parameters of the standard signals that constitute possible model inputs) are computed and their cross relationships analyzed. Second, the dimensions of the inputs are reduced because of their large volume. To this end, several input subsets formed by small numbers of stats are built using two different filters: Correlation (COR) and Principal Component Analysis (PCA). Once this is done, the estimation power of each subset is compared across a wide variety of ANN configurations. Finally, an innovative input selection method is developed using a genetic optimization.

The results of this process allow several observations. On the one hand, the estimation results seem promising, since all of them fall below 4%. On the other hand, multilayer networks with between 30 and 60 neurons seem an appropriate configuration for this purpose. In addition, the genetic optimization also performs well without the need for the preliminary work carried out by the filters (which are in fact influenced by certain a priori assumptions).

To conclude, in the wind energy field, where useful life is an important factor because of its impact on the CoE, this method can promote a more efficient use of structural components and increase their service life.

Keywords: wind turbine, loads, fatigue, estimation, machine learning, data mining, input selection, filter, correlation, PCA, ANN, supervised learning, training, genetic algorithms.

[Illustration: Evolutionary tree, Charles Darwin (1809-1882).]

Abstract

Fatigue loads represent a critical element in several aeronautical applications, and Wind Turbines (WTs) are no exception. Their evolution over the machine service life usually determines its lifespan, hence knowledge of their behavior can have a relevant economic impact on the Cost of Energy (CoE). Nevertheless, measuring them is frequently difficult or expensive, and estimations derived from other known signals can be carried out instead.

This thesis aims to design a load estimation model that uses standard signals as inputs, by means of statistical and machine learning procedures. It includes an input selection stage and the design and assessment of the model configurations, which are based on Artificial Neural Networks (ANNs). The whole process is described in the following lines.

First of all, data from simulations are gathered and examined. The Damage Equivalent Load (DEL) (the fatigue indicator that acts as target) and the stats (statistical parameters of the standard signals that constitute possible model inputs) are computed and their cross relationships analyzed. Secondly, the input dimensions are reduced because of their great volume. To this end, several input subsets formed by small numbers of stats are built using two different filters: Correlation (COR) and Principal Component Analysis (PCA). Once this is done, the estimation power of each subset is compared across a great variety of ANN configurations. Finally, an innovative input selection method is developed using a genetic optimization.

The outcome of this process allows several observations. On the one hand, the estimation results seem promising, since all estimation errors fall below 4%. On the other hand, multilayer nets with between 30 and 60 neurons seem a suitable configuration for these purposes. Furthermore, the genetic optimization also performs well without the need for the preliminary work carried out by the filters (which is in fact biased by some a priori assumptions).

To sum up, within the wind power field, where lifespan is an important factor because of its influence on the CoE, this method may promote a more efficient usage of structural parts and enhance their service life.

Keywords: Wind Turbine (WT), loads, fatigue, estimation, machine learning, data mining, input selection, filter, correlation, PCA, ANN, supervised learning, training, genetic algorithms.
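To make the notion of stats more concrete, the following Python sketch computes a set of statistical parameters from a single standard-signal time series. It is a minimal sketch under stated assumptions, not the implementation used in this thesis: the function name `signal_stats`, the sampling-frequency argument `fs`, the population standard deviation, the non-excess kurtosis and the periodogram-based estimate of the spectral moments are all choices made here for illustration. The keys (av, sd, sk, ku, mx, mn, rg) follow the abbreviations of the List of Symbols, and the test signal is synthetic.

```python
import numpy as np


def signal_stats(x, fs, orders=(0, 1, 2)):
    """Return a dict of stats for one time series (illustrative sketch).

    x      : 1-D array with the sampled signal
    fs     : sampling frequency in Hz (hypothetical parameter name)
    orders : orders n of the spectral moments lambda_n to compute
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    sd = x.std()                      # assumption: population std (ddof=0)
    z = (x - mean) / sd               # standardized signal
    stats = {
        "av": mean,
        "sd": sd,
        "sk": np.mean(z ** 3),        # skewness
        "ku": np.mean(z ** 4),        # assumption: non-excess kurtosis
        "mx": x.max(),
        "mn": x.min(),
        "rg": x.max() - x.min(),
    }

    # One-sided periodogram of the mean-removed signal, scaled so that
    # its integral over frequency equals the signal variance.
    n = x.size
    spec = np.fft.rfft(x - mean)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    psd = np.abs(spec) ** 2 / (fs * n)
    psd[1:] *= 2.0                    # fold negative frequencies
    if n % 2 == 0:
        psd[-1] /= 2.0                # Nyquist bin is not duplicated
    df = fs / n

    # n-th order spectral moment: sum of f^n * S(f) * df
    for k in orders:
        stats[f"lambda{k}"] = float(np.sum(freqs ** k * psd) * df)
    return stats


# Synthetic example: offset sinusoid, 1.5 Hz, sampled at 100 Hz for 10 s
t = np.arange(1000) / 100.0
x = 2.0 + 3.0 * np.sin(2.0 * np.pi * 1.5 * t)
stats = signal_stats(x, fs=100.0)
```

With this scaling, the zeroth-order spectral moment equals the variance of the signal, which gives a quick sanity check on any alternative PSD estimator that might be substituted for the periodogram.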


Contents

List of Figures
List of Tables
List of Acronyms
List of Symbols

1 Introduction
  1.1 Wind power, fatigue and estimation
  1.2 Scope
  1.3 Methodology and structure
  1.4 Literature review

2 Problem statement
  2.1 General approach
    2.1.1 Standard signals - loads scheme
    2.1.2 Machine learning
  2.2 Model variables
    2.2.1 Standard signals (input)
    2.2.2 Statistical parameters (stats)
    2.2.3 Loads (output)
    2.2.4 Fatigue indicators (DELs)
    2.2.5 Variables summary table
  2.3 System implementation - Practical issues

3 Dimensionality reduction
  3.1 Input selection methods
    3.1.1 Input types (variables and features)
    3.1.2 Input selection techniques
    3.1.3 Selected approach
  3.2 Scatter plots
  3.3 Correlation filter
    3.3.1 Method description
    3.3.2 Results
  3.4 Principal Component Analysis filter
    3.4.1 Considerations using PCA
    3.4.2 Stopping rule
    3.4.3 Method description
    3.4.4 Results
  3.5 Wind related inputs
    3.5.1 Method description
    3.5.2 Results
  3.6 Subsets summary table

4 Wrapper results
  4.1 Exploration
    4.1.1 Method description
    4.1.2 Damage Equivalent Loads results
    4.1.3 Subsets estimation comparison
    4.1.4 General conclusions
    4.1.5 Best models retraining
    4.1.6 Robustness analysis (loss of input signal)
  4.2 Genetic Optimization
    4.2.1 Method description
    4.2.2 Results
  4.3 Summary tables

5 Wind estimation
  5.1 Output wind signals
  5.2 Input selection
    5.2.1 Correlation filter
    5.2.2 Principal Component Analysis filter
    5.2.3 Subsets summary table
  5.3 Exploration wrapper
    5.3.1 Wind properties results
    5.3.2 General conclusions

6 Conclusions and perspectives
  6.1 Conclusions
  6.2 Future work
    6.2.1 Transferability

A Aeroelastic simulation
  A.1 GAMESA G114 - 2.5 MW
  A.2 NEREA
    A.2.1 Preprocess
    A.2.2 Postprocess

B Spectral moments
  B.1 Formulation

C Scatter plots
  C.1 Blade root flapwise bending moment DEL
  C.2 Fixed shaft torque DEL
  C.3 Tower base longitudinal bending moment DEL

D Correlation coefficients analysis
  D.1 Fundamentals
  D.2 Coefficients description and formulation
    D.2.1 Pearson's r
    D.2.2 Spearman's ρ
    D.2.3 Kendall's τ
  D.3 Comparison

E Correlation filter results
  E.1 Load estimation case
    E.1.1 Loads - stats correlation
    E.1.2 Subsets
  E.2 Wind estimation
    E.2.1 Wind properties - stats correlation
    E.2.2 Subsets

F Error indicators

G Models background
  G.1 Linear regression
  G.2 Artificial Neural Networks
    G.2.1 Fundamentals
    G.2.2 Architecture
    G.2.3 Cross-validation
    G.2.4 Training

H Optimization background
  H.1 Fundamentals
  H.2 Algorithms
    H.2.1 Genetic algorithms


List of Figures

2.1 Types of problem approaches: possible inputs and outputs.
2.2 Time series for input signals and 4 wind velocities.
2.3 Histograms for input signals and 4 wind velocities.
2.4 Electric power ($P$) and generator speed ($\omega$) theoretical curves and histograms.
2.5 Time series for target loads and 4 wind velocities.
2.6 Histograms for target loads and 4 wind velocities.

3.1 Types of estimation model inputs: variables and features.
3.2 Input selection procedures. Adapted from [Guyon and Elisseeff, 2005].
3.3 Scatter plot example: $\dot{\theta}$ stats vs. $T^S_{m_1}$. Four velocity groups are colored.
3.4 Scatter plots of DEL against wind speed and Turbulence Intensity (TI).
3.5 Scatter plots of DEL against wind speed standard deviation and mean.
3.6 Spearman IN-OUT coefficient. Top: Complete matrix. Bottom: $\dot{\theta}$ - $M^y_{m_1}$ detail.
3.7 Variable ranking IN-OUT. Coefficients: △ Pearson / × Spearman.
3.8 Pearson IN-IN coefficient.
3.9 COR95. Variable final selection. Coefficients: △ Pearson / × Spearman. Dashed line indicates the last variable included in the subset ($IO_{lim}$).
3.10 Scree test application example. The inflection point indicates the first discarded Principal Component (PC).
3.11 Complete PCA. The first seven eigenvalues are accompanied by the original stat whose linear factor is the biggest. Dashed lines indicate the last PC included in the subset for each stopping rule.
3.12 Partial PCA. Dashed lines indicate the last PC included in the subset for each stopping rule.
3.13 Wind inputs. Variable final selection. Coefficients: △ Pearson / × Spearman.

4.1 Scheme of wrappers designed in this work: exploration and optimization.
4.2 Exploration results: $M^f_{m_1}$. Layers: 1 ( ) / 2 (△) / 3 ( ). Net type: FFN (◦) / CFN (•).
4.3 Exploration results: $M^f_{m_2}$. Layers: 1 ( ) / 2 (△) / 3 ( ). Net type: FFN (◦) / CFN (•).
4.4 Exploration results: $T^S_{m_1}$. Layers: 1 ( ) / 2 (△) / 3 ( ). Net type: FFN (◦) / CFN (•).
4.5 Exploration results: $T^S_{m_2}$. Layers: 1 ( ) / 2 (△) / 3 ( ). Net type: FFN (◦) / CFN (•).
4.6 Exploration results: $M^y_{m_1}$. Layers: 1 ( ) / 2 (△) / 3 ( ). Net type: FFN (◦) / CFN (•).
4.7 Exploration results: $M^y_{m_2}$. Layers: 1 ( ) / 2 (△) / 3 ( ). Net type: FFN (◦) / CFN (•).
4.8 Exploration time: $T^S_{m_2}$. Layers: 1 ( ) / 2 (△) / 3 ( ). Net type: FFN (◦) / CFN (•).
4.9 Box and whiskers symbol scheme.
4.10 Target - estimation comparison.
4.11 Output histograms comparison.
4.12 Estimation Percentage Error (PE) comparison.
4.13 Box-and-whiskers Percentage Error (PE) comparison.
4.14 Best models parameters for all subsets and loads.
4.15 Training algorithms stopping criteria, epochs and time reached during retraining.
4.16 Probability of appearance of each of the variables from the initial subset COR90 in the individuals of the last generation population. Selected components of the resulting subset are indicated.

5.1 Spearman IN-OUT coefficient. Top: Complete matrix. Bottom: $\dot{\theta}$ - $\sigma_1$ detail.
5.2 COR95. Variable final selection. Coefficients: △ Pearson / × Spearman. Dashed line indicates the last variable included in the subset ($IO_{lim}$).
5.3 Partial PCA. Dashed lines indicate the last PC included in the subset for each stopping rule.
5.4 Exploration results: Density. Layers: 1 ( ) / 2 (△) / 3 ( ). Net type: FFN (◦) / CFN (•).
5.5 Exploration results: Upflow. Layers: 1 ( ) / 2 (△) / 3 ( ). Net type: FFN (◦) / CFN (•).
5.6 Exploration results: Windshear. Layers: 1 ( ) / 2 (△) / 3 ( ). Net type: FFN (◦) / CFN (•).
5.7 Exploration results: TI. Layers: 1 ( ) / 2 (△) / 3 ( ). Net type: FFN (◦) / CFN (•).
5.8 Exploration results: $\sigma_1$. Layers: 1 ( ) / 2 (△) / 3 ( ). Net type: FFN (◦) / CFN (•).
5.9 Exploration results: $V_m$. Layers: 1 ( ) / 2 (△) / 3 ( ). Net type: FFN (◦) / CFN (•).
5.10 Best models parameters for all subsets and wind properties.

A.1 GAMESA G114 - 2.5 MW.
A.2 Nacelle and hub detail.

C.1 Scatter plots legend. Colors for each wind velocity group.

D.1 Correlation coefficients for 4 arbitrary data sets.
D.2 Correlation coefficients for Anscombe's Quartet data sets.

E.1 COR95 (Load). Subset stats. Coefficients: △ Pearson / × Spearman. Dashed line indicates the last variable included ($IO_{lim}$).
E.2 COR85 (Load). Subset stats. Coefficients: △ Pearson / × Spearman. Dashed line indicates the last variable included ($IO_{lim}$).
E.3 COR64 (Load). Subset stats. Coefficients: △ Pearson / × Spearman. Dashed line indicates the last variable included ($IO_{lim}$).
E.4 COR66 (Load). Subset stats. Coefficients: △ Pearson / × Spearman. Dashed line indicates the last variable included ($IO_{lim}$).
E.5 COR95 (Wind). Subset stats. Coefficients: △ Pearson / × Spearman. Dashed line indicates the last variable included ($IO_{lim}$).
E.6 COR85 (Wind). Subset stats. Coefficients: △ Pearson / × Spearman. Dashed line indicates the last variable included ($IO_{lim}$).
E.7 COR64 (Wind). Subset stats. Coefficients: △ Pearson / × Spearman. Dashed line indicates the last variable included ($IO_{lim}$).
E.8 COR66 (Wind). Subset stats. Coefficients: △ Pearson / × Spearman. Dashed line indicates the last variable included ($IO_{lim}$).

G.1 Linear regression from noisy data.
G.2 Artificial neuron scheme.
G.3 Neural Network scheme example (fully connected Feed-Forward Network (FFN)).
G.4 FFN (top) and Cascade-Forward Network (CFN) (bottom) schemes.
G.5 Three-way data splits method. Sets built for cross-validation.
G.6 Three-way data splits training example [Max. validation checks = 10].

H.1 Pareto front: 2 multiobjective optimizations (one for each connections configuration: CFN and FFN). Estimation error ($M^f_{m_1}$ Normalized Root Mean Squared Error (NRMSE)) and net size (neurons) are used as objectives. Neurons, layers and inputs constitute the design variables. Connections type (CFN and FFN) has been used as a parameter for each optimization (although both are shown together).

List of Tables

1.1 Works directly related with this project and their referred topics.

2.1 Load case for the Time series (TS) and histograms used as example.
2.2 Statistical parameters applied to standard signals.
2.3 Wöhler slopes for each material. WT components and loads affected.
2.4 Model variables: standard signals and loads.

3.1 Input subsets obtained from input selection filters.

4.1 Exploration Master Table.
4.2 Best overall models for each load.
4.3 Optimization Master Table.
4.4 Optimization results.
4.5 Wind related inputs selected during the optimization.
4.6 Results review: $M^f_{m_1}$.
4.7 Results review: $T^S_{m_1}$.
4.8 Results review: $M^y_{m_1}$.

5.1 Input subsets obtained from input selection filters.
5.2 Best overall models for each wind property.

A.1 GAMESA G114 - 2.5 MW specifications.
A.2 TI-$V_m$ combinations.

C.1 Variables shown for each load.

D.1 Anscombe's Quartet series parameters.

E.1 Spearman (left) and Pearson (right) coefficients: $M^f_{m_1}$-Stats
E.2 Ranking following the maximum coefficient: $M^f_{m_1}$-Stats
E.3 Spearman (left) and Pearson (right) coefficients: $M^f_{m_2}$-Stats
E.4 Ranking following the maximum coefficient: $M^f_{m_2}$-Stats
E.5 Spearman (left) and Pearson (right) coefficients: $T^S_{m_1}$-Stats
E.6 Ranking following the maximum coefficient: $T^S_{m_1}$-Stats
E.7 Spearman (left) and Pearson (right) coefficients: $T^S_{m_2}$-Stats
E.8 Ranking following the maximum coefficient: $T^S_{m_2}$-Stats
E.9 Spearman (left) and Pearson (right) coefficients: $M^y_{m_1}$-Stats
E.10 Ranking following the maximum coefficient: $M^y_{m_1}$-Stats
E.11 Spearman (left) and Pearson (right) coefficients: $M^y_{m_2}$-Stats
E.12 Ranking following the maximum coefficient: $M^y_{m_2}$-Stats
E.13 COR95 (Load). Subset original and backup variables: $M^f_{m_1}$
E.14 COR95 (Load). Subset original and backup variables: $T^S_{m_1}$
E.15 COR95 (Load). Subset original and backup variables: $M^y_{m_1}$
E.16 COR85 (Load). Subset original and backup variables: $M^f_{m_1}$
E.17 COR85 (Load). Subset original and backup variables: $T^S_{m_1}$
E.18 COR85 (Load). Subset original and backup variables: $M^y_{m_1}$
E.19 COR64 (Load). Subset original and backup variables: $M^f_{m_1}$
E.20 COR64 (Load). Subset original and backup variables: $T^S_{m_1}$
E.21 COR64 (Load). Subset original and backup variables: $M^y_{m_1}$
E.22 Spearman (left) and Pearson (right) coefficients: Density-Stats
E.23 Ranking following the maximum coefficient: Density-Stats
E.24 Spearman (left) and Pearson (right) coefficients: Upflow-Stats
E.25 Ranking following the maximum coefficient: Upflow-Stats
E.26 Spearman (left) and Pearson (right) coefficients: Windshear-Stats
E.27 Ranking following the maximum coefficient: Windshear-Stats
E.28 Spearman (left) and Pearson (right) coefficients: TI-Stats
E.29 Ranking following the maximum coefficient: TI-Stats
E.30 Spearman (left) and Pearson (right) coefficients: $\sigma_1$-Stats
E.31 Ranking following the maximum coefficient: $\sigma_1$-Stats
E.32 Spearman (left) and Pearson (right) coefficients: $V_m$-Stats
E.33 Ranking following the maximum coefficient: $V_m$-Stats
E.34 COR95 (Wind). Subset original and backup variables: Density
E.35 COR95 (Wind). Subset original and backup variables: Upflow
E.36 COR95 (Wind). Subset original and backup variables: Windshear
E.37 COR95 (Wind). Subset original and backup variables: TI
E.38 COR95 (Wind). Subset original and backup variables: $\sigma_1$
E.39 COR95 (Wind). Subset original and backup variables: $V_m$
E.40 COR85 (Wind). Subset original and backup variables: Density
E.41 COR85 (Wind). Subset original and backup variables: Upflow
E.42 COR85 (Wind). Subset original and backup variables: Windshear
E.43 COR85 (Wind). Subset original and backup variables: TI
E.44 COR85 (Wind). Subset original and backup variables: $\sigma_1$
E.45 COR85 (Wind). Subset original and backup variables: $V_m$
E.46 COR64 (Wind). Subset original and backup variables: Density
E.47 COR64 (Wind). Subset original and backup variables: Upflow
E.48 COR64 (Wind). Subset original and backup variables: Windshear
E.49 COR64 (Wind). Subset original and backup variables: TI
E.50 COR64 (Wind). Subset original and backup variables: $\sigma_1$
E.51 COR64 (Wind). Subset original and backup variables: $V_m$

F.1 Error indicators.

G.1 Simulations sets built in this work (different for each load).

List of Acronyms

ANN  Artificial Neural Network
LR  Linear Regression
BR  Bayesian Regularization
CFN  Cascade-Forward Network
CoE  Cost of Energy
COR  Correlation
DAS  Data Acquisition System
DEL  Damage Equivalent Load
FFN  Feed-Forward Network
IMU  Inertial Measurement Unit
LDD  Load Duration Distribution
LM  Levenberg-Marquardt
MSE  Mean Squared Error
NRMSE  Normalized Root Mean Squared Error
PC  Principal Component
PCA  Principal Component Analysis
PE  Percentage Error
PLC  Programmable Logic Controller
PSD  Power Spectral Density
RFC  Rainflow Counting Method
SCADA  Supervisory Control And Data Acquisition
SVD  Singular Value Decomposition
TI  Turbulence Intensity
TRL  Technology Readiness Level
TS  Time Series
WT  Wind Turbine


List of Symbols

Standard signals
$P$  Generator electric power
$T$  Generator torque
$T^d$  Generator torque demand
$\omega$  Generator velocity
$\Omega$  Rotor velocity
$\dot{\Omega}$  Rotor acceleration
$\Omega_S$  Low speed shaft velocity
$\theta$  Blade pitch
$\dot{\theta}$  Blade pitch rate
$\dot{\theta}^d$  Blade pitch rate demand
$a_x$  Nacelle longitudinal acceleration
$a_y$  Nacelle lateral acceleration
$a_n$  Nacelle nod acceleration
$a_r$  Nacelle roll acceleration

Statistical parameters
av  Average
sd  Standard deviation
sk  Skewness
ku  Kurtosis
mx  Maximum
mn  Minimum
rg  Range
$\lambda_n$  nth-order spectral moment

Loads
$M^f$  Blade root flapwise bending moment
$T^S$  Shaft torque
$M^y$  Tower base longitudinal bending moment

DEL
$m_1$  First Wöhler exponent
$m_2$  Second Wöhler exponent

Wind properties
$V_m$  Mean wind velocity
$\sigma_1$  Wind velocity standard deviation

Correlation
Spe  Spearman correlation coefficient
Prs  Pearson correlation coefficient
$II_{lim}$  In-In limit (COR filter)
$IO_{lim}$  In-Out limit (COR filter)

Chapter 1. Introduction

1.1. Wind power, fatigue and estimation

Wind has been used as an energy source for several millennia. Boat propulsion and grain milling were two of its first applications. Its renewable and clean nature, along with its strength and persistence in some geographical areas, are its main advantages. These have encouraged humanity to improve wind exploitation, and that progress has led to current WTs, which are installed worldwide with a cumulative capacity of 500 GW. Nowadays, each of their components has a sophisticated design and gathers a collection of high-technology devices. Furthermore, their energy production (and therefore their size and complexity) is growing spectacularly and reaches higher levels every day.

Nevertheless, maximizing energy production raises a great variety of technical obstacles. As an illustration, huge efforts have been made to extract the maximum kinetic energy from the wind by manufacturing aerodynamically efficient blade airfoils. Another common way of enhancing energy production is lifespan extension. This forces engineers to consider different aspects during WT development, service and maintenance, such as equipment lifespan, environment evolution or the structural loading of parts over the years.

Fatigue is closely related to that last item. From aircraft passenger window frames to wind turbine tower bases, it has long been a serious structural problem in the aeronautical industry. For many decades, several design modifications have been carried out to reduce its effect, but in recent years a further step is being taken: load monitoring guided by cumulative damage reduction. This technique consists of observing loads in real time and determining the control law accordingly, with a view to reducing structural damage and hence lengthening the component service life. Nonetheless, it has a handicap: component loads are usually expensive, difficult or impossible to assess¹. Fortunately, some alternatives to direct measurement are available, such as physical or statistical estimation.

A certain tradition of statistical estimation exists in the wind power industry, although it is mainly related to forecasting². In fact, wind power forecasting is a key piece of today's electricity markets. In this case, load estimation is not a forecasting problem, since the estimation targets are present or past events. Instead of future values, the goal is to predict values that are unknown because of their measurement cost or complexity. However, the principles are fundamentally the same. These responses are to be calculated from other known or measured signals. More specifically, this study aims to estimate fatigue load indicators taking WT standard signals³ as model input variables.

For these estimation purposes, machine learning is a crucial piece of this research. This term comprises several methods and algorithms developed to give computers the ability to learn without being explicitly programmed. One of their drawbacks is their computational cost, but this is being overcome by increasing processing capabilities. These techniques have reached very high levels, and the field is growing spectacularly worldwide in diverse areas such as artificial intelligence or image recognition, to name but two.

¹ Usually, measured strains are translated into loads using the structure rigidity. Strains are measured using strain gauges or more cutting-edge devices such as Fiber Bragg Grating (FBG) sensors. These consist of a short optical fiber that transmits or reflects certain light wavelengths. They perform better than strain gauges, but their cost is exorbitant. Neither of these solutions seems suitable enough for common WTs. See [Hau, 2006] for more information on WT sensors.
² This science attempts to estimate a particular aspect of the future from relationships established using past and present data and the interpretation of their patterns or trends.
³ Standard signals are a group of known parameters used inside the machine computer, of which sensor and control signals are the most relevant. See section 2.2.1 for more information.

1.2. Scope

Since only simulation data is handled in this work, the target level of development is TRL⁴ 3-4. This includes analytical and laboratory demonstrations. For this reason no practical implementation is applied here. Other authors have accomplished relevant environment demonstrations (TRL 5) by using real machine signals in their studies (see table 1.1).

A very important matter throughout this project is transferability. This refers to the capacity of the method to be transferred to another framework or situation; in other words, its degree of generalization. This flexibility is borne in mind throughout the study. There are three kinds of transferability:

• Estimation problems. The process is capable of working with any kind of input and target variables, serving as a baseline for any estimation problem. For instance, for the prediction of wind turbine loads, diverse approaches can be implemented, such as estimating mechanical loads from standard signals or unknown loads from other measured loads. As an example, a wind estimation problem is carried out in chapter 5.
• Simulation-real environment. Big differences between the laboratory and the real environment are likely. While the simulator only takes certain parameters into account, real machines suffer from an assorted list of disturbances related to wind, machine and soil properties. This probably leads to a more complex estimation scenario and worse results than the synthetic calculations.
• Machines and farms. Once the method is implemented in a turbine, the effects of moving to another one in the same or a different farm have to be analyzed. Wake interference or manufacturing faults can be a problem between machines in the same farm, whilst completely different wind and soil conditions can be found in another wind farm.

These three transferability types are intended to be achieved, especially the first two. In fact, the last one can only be properly assessed using field data. The second one, although it also needs measured signals, is addressed by designing a process that can easily be repeated with those signals. Finally, since these procedures are planned to be as interchangeable between problems as possible, adaptability between estimation problems is also pursued. This is confirmed by the wind estimation carried out in chapter 5. In addition, while the last two transferability types are studied in other works, problem versatility is a new topic introduced in this research.

In accordance with all the aspects mentioned above, the global goal is to create an automatic methodology that requires minimal a priori or manual decisions. By doing this it is possible to achieve a repeatable, traceable, extrapolable and adaptable procedure.

Another important remark is that this is initial research on a relatively innovative topic in the industry. Besides the chosen findings and alternatives, other possible ways of carrying out the different steps must be considered and mentioned. Hence, interesting topics are explored in more depth and bibliographic reviews are presented throughout the thesis. Doors are left open for future modification or rebuilding.

More specifically, it must be pointed out that only DEL loads are estimated here. Three different loads with two Wöhler slopes are analyzed (see section 2.2.3). Modern wind turbines behave very differently depending on the operational region in which they work at each moment. These regions are usually divided into below and above rated (below and above the nominal velocity at which nominal power is reached). For this reason, dividing the estimation models into regions could seem a good idea. In this case, few simulation cases fall in the below-rated state and they have a small impact on fatigue loads, so region separation has not been carried out. In addition, this segregation should be captured by the model itself.

⁴ Technology Readiness Level (TRL) is a way, established by NASA, of measuring the development or maturity of any kind of technological innovation. It comprises 9 levels, from basic principles observation (TRL 1) to successful mission operations (TRL 9). Later on, several organizations embraced this system with slight modifications (US Government, European Commission, ESA...).

1.3. Methodology and structure

The data used in this project comes from the three-bladed horizontal axis WT GAMESA G114 2.5 MW, which is a representative model for mid-range power. Experiments are calculated using the NEREA aeroelastic simulator, the company software for these purposes. A deeper description of the machine and the simulator is given in appendix A. An essential part of this research was the software developed, which includes all the calculations and algorithms. Two main programs are used for this task:

• Scilab. In the beginning, data from NEREA output text files was read and post-processed via Scilab scripts. These results were later used as input data for the models.
• Matlab. The core of the work was carried out using this software. The most important toolboxes used were the Statistics and Machine Learning Toolbox, the Neural Network Toolbox and the Optimization Toolbox.

Chapters in this work are strictly ordered following the developed procedure. Nevertheless, they can be consulted separately if specific information is required. Wherever it has been considered convenient, diagrams are displayed to ease understanding and interpretation. The fundamental layout is briefly described in the following paragraph.

Firstly, the general approach used to solve the estimation problem is presented in the second chapter. The evaluated standard signals and loads are also described and analyzed in a preliminary examination. In the next step (chapter 3), an input selection process is designed to counterbalance the large dimensions of the problem data. There, a great variety of input selection methods are mentioned. Several model parameters and input sets are assessed and compared in chapter 4. Later, an equivalent procedure is carried out in chapter 5, in this case taking the wind properties as the final estimation targets. Finally, the conclusions drawn and possible improvements, additions or future developments are presented (chapter 6).

Appendices are self-contained sources of deeper information related to different aspects of the main study. They describe the characteristics of the aeroelastic simulations from which the data of this project comes (appendix A) and the way the spectral moments are computed (appendix B). Some results concerning the input selection process are available in appendices C (scatter plots) and E (correlation and subsets). Finally, theoretical background on the correlation coefficients (appendix D), different error indicators (appendix F), estimation models (appendix G) and optimization fundamentals (appendix H) can also be consulted.

1.4. Literature review

Since interesting and general collections of references on statistical wind power forecasting and fatigue load estimation can be found in the second chapter of [Gallego, 2013] and section 1.2 of [Cosack, 2010] respectively, only recent works closely related to this one are cited in this section.

Firstly, a group of works that delve into statistical load estimation is discussed. All of them use ANN-based models for this purpose. Those that use fatigue indicators as targets must be distinguished from those that use direct loads.

• [Cosack, 2010] Dr. Cosack's thesis has served as a fantastic baseline and contains a lot of interesting information related to this topic. In fact, its outline shares some perspectives with the one followed here. There, ANNs are first trained with simulated data and later with measured data from real WTs. In addition to fatigue equivalent loads, distributions are also estimated. Chapters 3 and 4 deserve special consideration: a disturbance detectability study and a comparison of estimation models are carried out there, respectively. A brief summary of this thesis can be found in [Cosack, 2009], presented at the 2009 conference of the European Wind Energy Association (EWEA).
• [Hofemann et al., 2011] Load TS (not fatigue indicators) are directly estimated from standard signals. Once again, the models are ANNs. Transferability between turbines is one of the main objectives of this project. Work presented at the 2011 EWEA conference.
• [Smolka et al., 2012] Following in the footsteps of her coauthor Cosack, the author estimates fatigue loads using ANNs. In this case, measured data from an offshore prototype is used. Results for some loads improve when the sea state is included among the model inputs. Lecture from the 2012 conference of Research at Alpha Ventus (RAVE).
• [Vera-Tudela and Kühn, 2014] A very interesting comparison between input selection methods for an ANN-based fatigue load estimation model is carried out. Similarly to the research carried out in the third chapter of this thesis, five different approaches are evaluated there.
• [Vera-Tudela et al., 2015] A stochastic model following the Langevin approach is developed for load reconstruction. The frequency results seem to be better than those achieved with ANN models. Slides from the 2015 RAVE conference.

The second group comprises projects that estimate WT loads using approaches other than ANNs, such as physical modeling or state estimation using a database.

• [Nelson and Manuel, 2003] A curious, though somewhat outdated, study where inflow parameters such as wind components or Reynolds stresses are compared in order to rank their relevance when used as load estimation inputs. Linear regressions are used as models. Not much importance is given to its results here, because it only includes inflow parameters and these are not used in this work as possible model inputs.
• [Hau, 2006] A physical approach to load estimation in which linearized models and state-space theory are applied. Illustrative and detailed chapters are dedicated to WT component loads and to both available and possible future sensors, where these topics are explained in depth. A continuation of this project is carried out in [Jasniewicz, 2011].
• [Ochoa, 2012] This research studies the feasibility of using models based on state estimation with a database for blade load estimation. The influence of turbulence on the blade lifespan is also studied.
• [Koopman, 2013] A homogeneous massless beam is used as the physical model for tower base load estimation, using tower top accelerations as input. Tower top deflection is also estimated from that acceleration.

Table 1.1 contains a summary of these works. The terms statistics, physic and state refer to the different approaches researched for load estimation (see chapter 4 of [Cosack, 2010]). It is also worth noting that two of the studies carry out an input selection analysis and that all of them use real environment measurements in their calculations.

Table 1.1: Works directly related with this project and their referred topics.
Columns: Reference, Fatigue, Stat., Physic, State, IN sel., ANN, Measures.
Rows (first group): [Cosack, 2010], [Hofemann et al., 2011], [Smolka et al., 2012], [Vera-Tudela and Kühn, 2014], [Vera-Tudela et al., 2015].
Rows (second group): [Nelson and Manuel, 2003], [Hau, 2006], [Jasniewicz, 2011], [Ochoa, 2012], [Koopman, 2013].


Chapter 2. Problem statement

The problem is introduced and explained at length in this chapter. Possible approaches are outlined and discussed. Later, the inputs (statistical parameters of standard signals) and outputs (equivalent loads) of the estimation model are briefly described and analyzed. At the end of the chapter, some comments on the impact of a practical implementation are made.

2.1. General approach

In this particular case of fatigue load estimation, quite a few approaches can be implemented. First of all, as stated in section 1.2, only simulation data is handled, but the future use of measured data must be considered for transferability. Given that, in the case of simulations, the available information consists of 10-minute time series of various signals, the approaches compared below are available.

Even though the desired model is intended to estimate load indicators directly from standard signals, another approach could be followed. First, wind conditions could be estimated from standard signals and then, by means of a transfer function (which could be another concatenated estimation model or a simulator), load indicators could be estimated from the wind. This is discarded a priori due to the increase in complexity as well as the possible accumulation of error.

Other promising procedures could also be revisited in the future. That is the case of estimation problems using load measurements as model inputs. Although not so common, some turbines include load sensors not only in test prototypes but also in mass-produced machines¹. This allows the design of original estimation methods such as:

• Load-load. Other loads can be estimated from blade root moments. Tower loads in particular are strongly correlated with them.
• Load-wind. Turbulence shows a strong relationship with the standard deviation of blade root loads. Other wind properties could also be analyzed.

2.1.1. Standard signals - loads scheme

Figure 2.1 represents the conceivable approaches to the direct relationship between standard signals and load indicators, which is the approach chosen in this case.

¹ The GAMESA 5 MW model includes strain gauges in its blade roots. With that information, control algorithms are modified for the sake of load reduction. One of these procedures is Individual Pitch Control (IPC), which sends a different pitch demand to each blade. Its relevance grows as wind turbulence increases and heterogeneous conditions are experienced across the rotor swept area.

Figure 2.1: Types of problem approaches: possible inputs and outputs. [Diagram labels: original raw data (standard signals TS) from wind turbine measurements or wind turbine simulation; possible inputs (predictors): time series, stats; estimation model; possible outputs (targets): load TS, load stats, load distribution, equivalent load (DEL); model validation. The elements marked "This work" are wind turbine simulation, stats and equivalent load (DEL).]

Both inputs and outputs are organized in figure 2.1 from most information (TS at the top) to least information (bottom). Only the relative position of load stats and load distribution is arguable, because they offer different kinds of information. It must be clear that the method is based on supervised learning (see section 2.1.2), hence the desired estimation outputs from the training experiments must be supplied to the model in the training phase. The various approaches are discussed in the next paragraphs.

• Input². These are the signals or variables that work as the estimation model predictors. They represent the known x variable. The desired targets must be estimated with calculations based on them.
  – Time series (TS). The TS themselves can be introduced as model inputs.
  – Statistical parameters (Stats). These can be calculated from the 10-minute simulations or measurements. They are explained in section 2.2.2.
• Output³. These are the signals or variables that work as the estimation model targets. They represent the unknown y variable. Their estimated value must minimize the error with respect to their real value (which is normally unknown).
  – Time series (TS). Same as before.
  – Stats. Same as before.
  – Load distributions (range or magnitude). These represent the occurrence probability of load cycles or the exposure time for each load range or magnitude, respectively. They are explained in section 2.2.3.
  – Equivalent load (range or magnitude). This is a single value that represents the equivalent damage suffered in a TS, at the price of less descriptive information. It is explained in section 2.2.3.

² Input variables in estimation jargon can be referred to as regressors, covariates, predictors or independent/exogenous/explanatory variables. They should not be confused with the terms variable and feature used in chapter 3, which denote two different kinds of model inputs.
³ Output variables in estimation jargon can be referred to as regressands, targets or dependent/endogenous/response/criterion variables.
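As an illustration of the stats input described above, the following minimal Python sketch computes the seven statistical parameters named in the List of Symbols (av, sd, sk, ku, mx, mn, rg) for one time series. It is not the Scilab post-processing used in this work: the function name `signal_stats`, the population standard deviation (ddof = 0) and the non-excess kurtosis convention are assumptions made for the example, and the spectral moments are left out.

```python
import numpy as np

def signal_stats(x):
    """Return av, sd, sk, ku, mx, mn and rg for a 1-D time series.

    Assumptions of this sketch: population standard deviation (ddof=0)
    and non-excess kurtosis (equal to 3 for a Gaussian signal).
    A constant signal (sd = 0) would give undefined sk and ku.
    """
    x = np.asarray(x, dtype=float)
    av = x.mean()
    sd = x.std()
    z = (x - av) / sd  # standardized samples
    return {
        "av": av,
        "sd": sd,
        "sk": np.mean(z ** 3),
        "ku": np.mean(z ** 4),
        "mx": x.max(),
        "mn": x.min(),
        "rg": x.max() - x.min(),
    }

# Tiny hand-checkable example
example = signal_stats([1.0, 2.0, 3.0, 4.0, 5.0])
```

Applying such a function to every standard signal of a 10-minute case and concatenating the results would give one input vector per case, which is the form of input assumed by the stats approach.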

The estimation based on TS is very similar to the approach used in forecasting problems. There, TS are directly estimated from previous observations. This does not seem suitable in this case because of the large amount of data to be handled. However, it could be a good method for a very fine real-time load estimation. Nor does the selection of load stats as the target appear to fulfill the imposed requirements: another model (transfer function) would be necessary to translate load stats into the final load indicators (distributions or equivalent loads). Due to the data quantity and the fatigue specifications, the stats - equivalent load approach is the one selected. Load distributions are a promising target, but a more complex method would be required for their estimation. Both load indicators (distributions and equivalent loads) are estimated in [Cosack, 2010] from standard signal stats.

It is also important to notice that one model is designed here for each single target. A single ANN could calculate several outputs, but it might need to grow too large to obtain good results (especially if very diverse outputs are desired). This is due to the different inputs and connections that the estimation of each target requires.

2.1.2. Machine learning

The estimation model is intended to be based on machine learning procedures. This branch of computer science is usually defined as the ability of a computer to learn to perform a task without being explicitly programmed to do so. The most common goals of machine learning are forecasting, estimation, outlier detection, pattern recognition, clustering and control optimization, among others. ANNs are the tool that has enabled the breathtaking evolution of machine learning. Although it is constantly changing owing to its relatively recent arrival, this field is frequently divided into three main learning methods:

• Supervised learning. During training, a set of known inputs and outputs is given to the model. In this period the model parameters are optimized to minimize the estimation error. The applications of this learning method are countless. It can be employed whenever a representative set of paired inputs and outputs is available. Due to the universal capability of models such as ANNs to represent any kind of non-linear relationship, results are usually very competitive.
• Unsupervised learning. Targets are not supplied to the learning algorithm, hence the inputs are referred to as unlabeled data. The model searches for intrinsic structures, patterns or relationships among its inputs. This can be used for feature learning or clustering. Variables or individuals are divided into groups with similar properties. Outliers can also be identified. Examples are the NASA remote sensing unsupervised classification, which can be used for discerning ocean water types from Earth observations [Romero et al., 2015], or the search for patterns and structure in financial TS for trading decisions [Millin, 2010].
• Reinforcement learning. The learning agent interacts dynamically with the environment. At each time step the program observes the current state, applies one of the possible actions and is rewarded or punished depending on whether the new state satisfies some a priori assumptions. The final goal is to maximize the cumulative reward. Contrary to supervised learning, where data samples are available from the beginning, here they become available as the algorithm executes actions. This kind of training is being investigated massively for control purposes due to its direct application. Several examples can be found in the wind field; here are some from three different continents: [Anderson, 2010], [Sedighizadeh and Rezazadeh, 2008], [Hentschel, 2016].
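The supervised learning idea described in the list above can be sketched with the simplest model used in this work for comparison, a linear regression. The Python example below is only an illustration under stated assumptions: the data is synthetic (three made-up input features and a made-up linear target with noise), the 150/50 train/test split is arbitrary, and the helper names `fit_linear` and `predict` are hypothetical. The thesis itself uses Matlab toolboxes, not this code.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic paired data: 200 cases, 3 input features, 1 target (assumption)
n_cases = 200
X = rng.normal(size=(n_cases, 3))
true_weights = np.array([1.5, -0.7, 0.3])
y = X @ true_weights + 2.0 + rng.normal(scale=0.05, size=n_cases)

# Known input/output pairs are split into a training and a test set
train, test = slice(0, 150), slice(150, n_cases)

def fit_linear(X, y):
    """Least-squares fit of y = X w + b; returns [w..., b]."""
    A = np.column_stack([X, np.ones(len(X))])  # add a bias column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Evaluate the fitted linear model on new inputs."""
    return X @ coef[:-1] + coef[-1]

coef = fit_linear(X[train], y[train])                    # training phase
mse_test = np.mean((predict(coef, X[test]) - y[test]) ** 2)  # unseen cases
```

The parameters are adjusted using only the training pairs, and the error is then measured on cases the model has not seen, which is the basic pattern that the ANN models follow as well.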

Another, newer learning type is deep learning. This technique involves both supervised and unsupervised learning. In other words, it carries out both the feature learning and the data classification, and is therefore also called end-to-end learning. Its development is closely related to Convolutional Neural Networks (CNNs), since they are the basic tool for deep learning. These nets include a first stage made up of filtering layers, after which common layers are in charge of the classification process.

All these techniques usually include an initial training stage. Model performance is fully dependent on the model configuration (basically the net parameters: architecture, weights, biases, etc. in an ANN; see appendix G). It must be adjusted so that the desired results are obtained. This is achieved via training algorithms whose main goal is to reduce the error while fulfilling certain requirements or constraints. When output targets are available, they must be supplied to the model as known outputs during the training phase.

In addition, machine learning is tightly related to artificial intelligence and data mining. The latter is another important aspect of this work. As can be seen in chapter 3, the search for valuable information inside a set of several variables is a key pre-process in estimation problems.

In this project, considering the particular aim and the available data, the decisions taken regarding possible methods or procedures are the following.

• Learning technique: supervised. Loads are the main target of the estimation, and ready-to-use sets are extracted from simulations or, eventually, from measured data. This involves an initial training stage, as commented before.
• Estimation model: ANN. These are universal approximation models based on small, interconnected cells (see appendix G). Due to their great flexibility, they are expected to provide acceptable results. Linear Regressions (LRs) are also used for comparison purposes.
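To make the ANN choice above more concrete, the following Python sketch trains a very small feed-forward network (one tanh hidden layer, linear output) on a toy non-linear target. Everything in it is an assumption made for illustration: the network size, the learning rate, the toy target y = x² and the function names `train_ffn` and `predict`. In particular, plain gradient descent is used here for brevity, whereas the List of Acronyms points to Levenberg-Marquardt and Bayesian regularization training in the actual work.

```python
import numpy as np

def train_ffn(X, y, hidden=8, lr=0.02, epochs=3000, seed=0):
    """Train a one-hidden-layer network by full-batch gradient descent on the MSE."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    history = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)            # hidden layer activations
        y_hat = (H @ W2 + b2).ravel()       # linear output layer
        err = y_hat - y
        history.append(np.mean(err ** 2))   # MSE before this update
        # Backpropagation of the MSE gradient
        g_out = (2.0 / len(y)) * err[:, None]
        g_hid = (g_out @ W2.T) * (1.0 - H ** 2)
        W2 -= lr * (H.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (X.T @ g_hid)
        b1 -= lr * g_hid.sum(axis=0)
    return (W1, b1, W2, b2), history

def predict(params, X):
    """Evaluate the trained network on new inputs."""
    W1, b1, W2, b2 = params
    return (np.tanh(X @ W1 + b1) @ W2 + b2).ravel()

# Toy non-linear relationship that a linear regression cannot represent
X = np.linspace(-2.0, 2.0, 100)[:, None]
y = X[:, 0] ** 2
params, history = train_ffn(X, y)
```

The `history` list records the training MSE at every epoch, so comparing its first and last entries shows how much the training stage has reduced the error.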
2.2. Model variables

This section contains a description of all the initial variables studied, including standard signals and component loads. Their stats and fatigue indicators, respectively, which are computed as model parameters, are also specified in the corresponding sections. A summary table is located after them to give a general idea at a glance.

2.2.1. Standard signals (input)

Standard signals (or rather their stats, as shown in the diagram of figure 2.1) are the variables selected as estimation model inputs. They can usually be obtained from the machine Programmable Logic Controller (PLC), the Supervisory Control And Data Acquisition (SCADA) system or different DAS⁴. A deeper discussion of this issue can be found in the practical implementation section 2.3. Their values are mainly provided by diverse sensors located in the wind turbine components and gathered in the central control of the machine. A very detailed description of these sensors is included in chapter 3 of [Hau, 2006]. The list below contains all the signals included in this research. They are divided into two main groups: firstly, signals that are already employed in real environments for control purposes or similar tasks and, secondly, signals that could be measured in the future.

⁴ Data Acquisition System (DAS) usually refers to the group of devices that provides measured information from sensors. Commonly, though not necessarily, this term is used for measuring subsystems in test prototypes. It comprises all the necessary equipment from the physical measurement of the sensor to the final digital signal.

• Currently available signals. The following list comprises signals that are already in use in the majority of modern installed turbines. They are categorized according to the component where they are obtained or measured.
  – Generator
    ∗ Electric power ($P$). Final power generated once losses have been discounted.
    ∗ Torque ($T$). Electromagnetic torque in the generator.
    ∗ Torque demand ($T^d$). The wind turbine control demands a given generator torque for each wind velocity or turbine condition. The corresponding torque demand is then obtained (assuming proportional control):
      $\dot{T}_e = K_{Pg}\left[T_e - T_e^d(V)\right]$  (2.1)
    ∗ Velocity ($\omega$). Angular velocity measured in the electric generator, at the end of the drive train.
  – Rotor
    ∗ Velocity ($\Omega$). Hub angular velocity.
    ∗ Acceleration ($\dot{\Omega}$). Hub angular acceleration.
  – Low speed shaft
    ∗ Velocity ($\Omega_S$). Shaft angular velocity measured between the hub and the multiplier gearbox.
  – Blade
    ∗ Pitch angle ($\theta$). Angular position of the blades with respect to their longitudinal axis. When the blades are in their design position, the pitch equals zero.
    ∗ Pitch rate ($\dot{\theta}$). Derivative of the pitch angle.
    ∗ Pitch rate demand ($\dot{\theta}^d$). The wind turbine control demands a given pitch for each wind velocity or turbine condition. The corresponding pitch rate demand is then obtained (assuming proportional control):
      $\dot{\theta}^d = K_{P\theta}\left[\theta - \theta^d(V)\right]$  (2.2)
  – Nacelle
    ∗ Longitudinal acceleration ($a_x$). Nacelle linear fore-aft acceleration.
    ∗ Lateral acceleration ($a_y$). Nacelle linear right-left acceleration.
• Potential signals. The installation of IMUs⁵ inside the nacelle with the aim of measuring new possible signals is under investigation. Two accelerations show promising potential and are included in the study, even though their use is not always feasible.
  – Nacelle
    ∗ Nod acceleration ($a_n$). Angular acceleration due to the nacelle rotation around the transverse axis (equivalent to aircraft pitch).
    ∗ Roll acceleration ($a_r$). Angular acceleration due to the nacelle rotation around the longitudinal axis (equivalent to aircraft roll).

⁵ An Inertial Measurement Unit (IMU) is a common piece of equipment in traditional aircraft and UAVs. It usually contains sensors such as accelerometers, gyroscopes and magnetometers. Its information is crucial for guidance, navigation and control purposes.

(36) 12. CHAPTER 2. PROBLEM STATEMENT.

As the reader may have noticed, two signals appear together with their demand counterparts. Including both signals (real and demand) can be interesting because their meaning is reasonably different. While the former represents the actual state of the component, the latter gives an idea of the control situation. Normally their behavior is very similar, but under certain circumstances their values can diverge, providing valuable information about the control status.

Another important remark concerns the three signals related to the shaft (rotor velocity, generator speed and low speed shaft velocity). All of them measure angular velocity somewhere in the drive train. Nonetheless, they show small differences from one another owing to the stiffness of the components.

A final comment must be made about these signals. Those related to the blades have been studied for only one of the three blades. This statement also applies to the flapwise moment that is used as target load and explained some pages ahead. The use of Multiblade Coordinates (MBC, which avoid the azimuthal dependency by changing from rotating to non-rotating coordinates) does not seem very useful in this research, although further studies could be carried out.

A sample of standard signals TS can be observed in the plots shown in figure 2.2 on the facing page. The first 200 seconds of a 10-minute simulation are presented. The chosen load case parameters are gathered in table 2.1.

Table 2.1: Load case for the TS and histograms used as example.

Density   Upflow   Windshear   TI   Vm [m/s]
1.15      10       0.2         22   [7 9 15 23]

From the 10 velocities simulated (from 7 m/s to 25 m/s; see appendix A), four are chosen to represent the complete behavior:

• Vm = 7 m/s. This is a clearly below-rated velocity and the slowest case included in the study. Even at this lowest velocity, the generator speed is already in its linear region (see figure 2.4 on page 15).
• Vm = 9 m/s. Although the nominal velocity is Vmn = 11 m/s, the machine reaches a relevant point at Vm = 9 m/s: at this velocity the maximum shaft velocity is reached, as figure 2.4 on page 15 shows. This fact can also be checked in the corresponding TS.
• Vm = 15 m/s. A common above-rated state. It represents a very stable region as long as turbulence drives the wind speed neither down to rated velocity nor up to extremely high values. For this reason the plots show nominal behavior for most of the 10-minute simulation.
• Vm = 23 m/s. An extremely high velocity at which safety stops occur several times. Their peaks can be clearly observed in the electrical power (P) or generator torque (T) graphs.

This information is also plotted as histograms (figure 2.3 on page 14). This format lets the viewer draw some conclusions about, for instance, probability distributions or possibly related variables.

(37)

[Figure 2.2: plots against Time [s] (0 to 200 s) of P [MW], T [kNm], T^d [kNm], ω [rpm], Ω [rpm], Ω̇ [deg/s²], Ω_S [rpm], θ [deg], θ̇ [deg/s], θ̇^d [deg/s], a_x [m/s²], a_y [m/s²], a_n [deg/s²] and a_r [deg/s²] for Vm = 7, 9, 15 and 23 m/s.]

Figure 2.2: Time series for input signals and 4 wind velocities.

(38)

[Figure 2.3: histograms (Probability [%]) of P [MW], T [kNm], T^d [kNm], ω [rpm], Ω [rpm], Ω̇ [deg/s²], Ω_S [rpm], θ [deg], θ̇ [deg/s], θ̇^d [deg/s], a_x [m/s²], a_y [m/s²], a_n [deg/s²] and a_r [deg/s²] for Vm = 7, 9, 15 and 23 m/s.]

Figure 2.3: Histograms for input signals and 4 wind velocities.

(39)

Looking at the histograms of figure 2.3 on the facing page, three groups of variables can be distinguished according to the appearance of their probability distributions:

• Zero-mean variables. This group contains the accelerations and the pitch rate (this seems logical, since pitch is not expected to grow indefinitely). As their histograms show, their probability distribution looks normal. Their standard deviation increases with wind velocity and no skewness is perceived at first sight.
• Drive-train variables. Shaft and generator speeds (and consequently their torques) and electric power show a particular behavior. For some velocities they seem to exhibit bimodal distributions, whose two peaks correspond to different regimes. In the electrical power case, the highest peak corresponds to its nominal value (PN = 2.5 MW). In the generator case, the larger peak is caused by the maximum angular speed (ωmax = 1680 rpm) and the smaller one by the minimum (ωmin = 1100 rpm). In both cases the smaller peak cannot be properly observed because simulations for wind velocities below 7 m/s have not been carried out (a margin must be considered because TI forces lower speeds).

Figure 2.4 tries to clarify this behavior. There, the ideal curves for electric power and generator speed are displayed along with sketches of their possible histograms. These distributions could indicate that other statistical parameters, such as quantiles, would be desirable in order to increase the estimation power of the inputs. However, this has not been done in this research.

[Figure 2.4: ideal curves of P [MW] and ω [rpm] against Vm [m/s] with the simulation cases marked, next to sketches of their histograms (Probability [%]).]

Figure 2.4: Electric power (P) and generator speed (ω) theoretical curves and histograms.
• Pitch. This variable clearly depends on wind velocity. Its distribution has a very high positive skewness because its peak is reached at low velocities: in this regime the pitch control is not activated for a large share of the time and the angle stays at zero.

2.2.2. Statistical parameters (stats).

As stated before, some stats are calculated from the standard signals obtained in 10-minute simulations. These are the model inputs from which loads must be estimated. A summary of these parameters can be found in table 2.2 on the next page.
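As an illustration only, the parameters of table 2.2 could be computed for one signal along the lines of the following Python sketch. It is not the procedure used in this work: the spectral moments are obtained here by direct numerical integration of a one-sided periodogram rather than by the series expansion of [Holm, 1983], frequencies are taken in Hz and the kurtosis is the non-excess one. The function name signal_stats, the sampling frequency argument fs and the dictionary keys l1 to l4 (standing for λ1 to λ4) are assumptions made for the example.

```python
import numpy as np


def signal_stats(x, fs):
    """Return 11 statistical parameters of one signal sampled at fs [Hz].

    Keys follow the symbols of table 2.2; l1..l4 stand for the spectral
    moments lambda_1..lambda_4 (hypothetical naming).
    """
    x = np.asarray(x, dtype=float)
    n = x.size

    # Time-domain parameters
    av = x.mean()
    dev = x - av
    sd = np.sqrt(np.mean(dev**2))
    sk = np.mean(dev**3) / sd**3   # skewness
    ku = np.mean(dev**4) / sd**4   # kurtosis (equals 3 for a normal signal)
    mx, mn = x.max(), x.min()
    stats = {"av": av, "sd": sd, "sk": sk, "ku": ku,
             "mx": mx, "mn": mn, "rg": mx - mn}

    # One-sided periodogram of the zero-mean signal, scaled so that
    # sum(psd) * df equals the variance (zeroth spectral moment).
    psd = np.abs(np.fft.rfft(dev))**2 / (fs * n)
    psd[1:] *= 2.0          # fold the negative frequencies
    if n % 2 == 0:
        psd[-1] /= 2.0      # the Nyquist bin has no negative twin
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    df = fs / n

    # Spectral moments: integral of f**k * PSD(f) for k = 1..4
    for k in range(1, 5):
        stats[f"l{k}"] = np.sum(freqs**k * psd) * df
    return stats


if __name__ == "__main__":
    fs = 50.0                              # assumed sampling frequency [Hz]
    t = np.arange(int(600 * fs)) / fs      # one 10-minute record
    demo = signal_stats(np.sin(2 * np.pi * 0.2 * t), fs)
    for name, value in demo.items():
        print(f"{name}: {value:.4f}")
```

Applying such a function to each input signal of a 10-minute simulation would give one row of candidate model inputs per simulation.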

(40)

Table 2.2: Statistical parameters applied to standard signals.

Parameter                 Symbol
Average                   av
Standard deviation        sd
Skewness                  sk
Kurtosis                  ku
Maximum                   mx
Minimum                   mn
Range                     rg
First spectral moment     λ1
Second spectral moment    λ2
Third spectral moment     λ3
Fourth spectral moment    λ4

The first four parameters are derived from the first to fourth statistical moments of the signal. The following group (mx, mn and rg) is related to the signal range. Finally, spectral moments are frequency-weighted integrals of the Power Spectral Density (PSD) of the signals. The zeroth moment is the area under the spectral curve, that is, the variance when the PSD is normalized (λ0 = σ²). Higher order moments give more weight to higher frequencies. Negative order spectral moments could also be used if lower frequencies are expected to retain relevant information. In this work they are calculated by means of a series expansion derived in [Holm, 1983]. This method is analyzed and compared in appendix B and in appendix G of [Cosack, 2010].

The combination of 14 input signals with these 11 statistical parameters gives a total of 154 possible model inputs. This large number of variables requires an initial dimensionality reduction, which is carried out in chapter 3.

2.2.2.1. Potential statistical parameters.

Other statistical parameters that have not been used in this work could be used if further development is needed. None of the following is mentioned in the load estimation literature consulted:

• Quantiles. Some histograms shown in figure 2.3 on page 14 do not look like a normal probability distribution. In fact, for the so-called drive-train variables the distribution resembles a bimodal one. For this reason different quantiles could be considered if this dispersion is to be taken into account.
• Frequency magnitude. Sometimes, instead of the whole spectrum, only the magnitude at a desired frequency gives relevant information.
Among these distinctive frequencies, 1P, 2P or 3P⁶ could be highlighted. There are two ways of computing this magnitude: directly from the previously calculated PSD (a CPU-expensive process) or by means of the optimized Goertzel algorithm⁷.

⁶ In wind engineering it is usual to refer to the rotor frequency (once per revolution) as 1P. Its harmonics are therefore designated as 2P, 3P... These values are sometimes related to physical phenomena such as structural natural frequencies or rotor imbalances. For example, 1P is the first excitation frequency and 3P is the blade passing frequency for a three-bladed rotor.

⁷ The Goertzel algorithm is applied in tone detection (call progress decoding, frequency response measurements...). It uses trigonometric calculations in order to obtain the magnitude and phase (basic Goertzel) or only the magnitude (optimized Goertzel) of a target frequency. A more detailed description can be found in [Banks, 2002].
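The magnitude-only (optimized) Goertzel recursion mentioned above can be sketched as follows. This is a minimal illustration, not code from this work: the function name goertzel_amplitude, the 15 rpm rotor speed, the 20 Hz sampling frequency and the synthetic two-tone signal are all assumptions made for the example. The 1P frequency in Hz is simply the rotor speed in rpm divided by 60.

```python
import math


def goertzel_amplitude(x, fs, f_target):
    """Single-sided amplitude of x at the DFT bin nearest to f_target [Hz].

    Magnitude-only Goertzel recursion: one multiplication per sample and
    no phase information.
    """
    n = len(x)
    k = round(f_target * n / fs)       # nearest DFT bin index
    w = 2.0 * math.pi * k / n
    coeff = 2.0 * math.cos(w)

    s_prev, s_prev2 = 0.0, 0.0
    for sample in x:
        s = sample + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s

    # |X[k]|^2 from the last two states of the recursion
    power = s_prev**2 + s_prev2**2 - coeff * s_prev * s_prev2
    return 2.0 * math.sqrt(max(power, 0.0)) / n


if __name__ == "__main__":
    fs = 20.0                  # assumed sampling frequency [Hz]
    rotor_rpm = 15.0           # assumed rotor speed
    f_1p = rotor_rpm / 60.0    # 1P frequency [Hz]
    t = [i / fs for i in range(2000)]
    # Synthetic signal with a 1P and a 3P component (made-up amplitudes)
    signal = [1.0 * math.sin(2 * math.pi * f_1p * ti)
              + 0.3 * math.sin(2 * math.pi * 3 * f_1p * ti) for ti in t]
    for order in (1, 2, 3):
        amp = goertzel_amplitude(signal, fs, order * f_1p)
        print(f"{order}P amplitude: {amp:.3f}")
```

Compared with evaluating a full spectrum, the recursion only touches the bins of interest, which is why it suits a small set of target frequencies such as 1P, 2P and 3P.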

(41)

2.2.3. Loads (output).

Loads (or rather their DEL, as shown in figure 2.1 on page 8) are the variables selected as estimation model outputs. For the model training, they are calculated via simulations. If training is to be carried out with real data, their values will be obtained from sensors such as strain gauges. These load measurements are usually carried out only on one test prototype of each wind turbine model during a limited examination period. A very detailed description of these sensors is included in chapter 4 of [Hau, 2006]. The complexity and cost of these measurements (also illustrated in that chapter) is one of the reasons why obtaining loads by means of estimation methods is cheaper.

The following loads are used as targets in this work:

• Blade root flapwise bending moment (M^f).
• Fixed shaft torque (T^S).
• Tower base longitudinal bending moment (M^y).

These three loads, together with two more (tilt moment and yaw moment), form a set that describes rather well the global loading situation of the wind turbine. As was done in 2.2.1 with the standard signals, the TS and histograms of the loads are plotted in figures 2.5 and 2.6 for the same wind velocities explained before.

[Figure 2.5: plots of M^f [MNm], T^S [MNm] and M^y [MNm] against Time [s] (0 to 200 s) for Vm = 7, 9, 15 and 23 m/s.]

Figure 2.5: Time series for target loads and 4 wind velocities.