• No se han encontrado resultados

Data-driven prognostics using a combination of constrained K-means clustering, fuzzy modeling and LOF-based score

N/A
N/A
Protected

Academic year: 2020

Share "Data-driven prognostics using a combination of constrained K-means clustering, fuzzy modeling and LOF-based score"

Copied!
11
0
0

Texto completo

(1)

Contents lists available at ScienceDirect

Neurocomputing

journal homepage: www.elsevier.com/locate/neucom

Data-driven

prognostics

using

a

combination

of

constrained

K-means

clustering,

fuzzy

modeling

and

LOF-based

score

Alberto Diez-Olivan

a,∗

, Jose A. Pagan

b

, Ricardo Sanz

c

, Basilio Sierra

d

aTecnaliaResearch&Innovation,IndustrialSystemsUnit,Donostia-SanSebastián,Spain bDieselEngineFactory,DiagnoseEngineeringandProductDevelopment,Cartagena,Spain cAutonomousSystemsLaboratory,UniversidadPolitécnicadeMadrid,Spain

dDepartmentofComputerSciencesandArtificialIntelligence,UPV/EHU,Spain

a

r

t

i

c

l

e

i

n

f

o

Articlehistory: Received26July2016 Revised10December2016 Accepted4February2017 Availableonline22February2017

CommunicatedbyShenYin

Keywords:

Behaviorcharacterization Conditionmonitoring Constrainedk-meansclustering Fuzzymodeling

Localoutlierfactor Machinelearning

a

b

s

t

r

a

c

t

Today,failuremodescharacterizationandearlydetectionisakeyissueincomplexassets.Thisisdueto thenegativeimpactofcorrectiveoperationsandtheconservativestrategiesusuallyputinpractice, fo-cusedonpreventivemaintenance.Inthispaperanomalydetectionissueisaddressedinnewmonitoring sensordatabycharacterizingandmodelingoperationalbehaviors.Thelearningframeworkisperformed onthe basisofamachinelearningapproachthatcombinesconstrainedK-means clusteringforoutlier detectionandfuzzymodelingofdistancestonormality.Afinalscoreisalsocalculatedovertime, con-sideringthemembershipdegree toresultingfuzzysetsand alocal outlierfactor.Proposed solutionis deployedinaCBM+platformforonlinemonitoringoftheassets.Inordertoshowthevalidityofthe ap-proach,experimentshavebeenconductedonrealoperationalfaultsinanauxiliarymarinedieselengine. Experimentalresultsshowafullycomprehensiveyetaccurateprognosticsapproach,improvingdetection capabilitiesandknowledge management.Theperformanceachievedisquitehigh(precision,sensitivity andspecificityabove93%andκ=0.93),evenmoresogiventhataverysmallpercentageofrealfaults arepresentindata.

© 2017 Elsevier B.V. All rights reserved.

1. Introduction

Unexpected incidental failures imply an important impact in terms of risks, costs, resources and service loss that should be minimized [1]. The growing complexity of industrial equipment, systems andinstallations results inan ever-increasing amountof healthmonitoringinformation,whicheventuallyexceedthe capac-ityofmostfault detectionsystemsandmakes thedesign of suc-cessfulmaintenancemethodologiesmorechallenging.Moreover,it is importanttoprovide abetter understanding ofmonitored sys-tems andto efficiently characterize the normalbehaviors froma huge amountof historicaldata. Thelack ofknowledge aboutthe behavior of complex assets makes the problem of maintenance verydifficult.

Naval sector, for instance, is traditionally focused on preven-tive strategies,usually dividedby3–4maintenance difficultylevel groups: vessel crew, vessel base, shipyard, and manufacturer [2].

Correspondingauthor.

E-mailaddresses:[email protected],[email protected]

(A.Diez-Olivan),[email protected](J.A. Pagan),[email protected](R. Sanz), [email protected](B. Sierra).

Crew and base level maintenance tasks are planned and carried out during vessel operating time; however, shipyard and manu-facturer tasks are done in programmeddock periods. The whole vessellifecycleisdividedbylongoperatingperiods separatedby shortmaintenance periods,someofthemdry-dock.When an im-portant unexpected breakdown occurs during operation (at both shipyardormanufacturerlevel), vesselactivitystopsandplanned missionshavetobecancelled.Additionally,importantrepaircosts must be envisaged. In such cases it is often necessary to open dismantling routes aiming to remove defective parts. Sometimes a cesareanin thevessel hullor evena dry-dock have tobe per-formed to extract the involved equipment. Therefore, total costs may include: defective parts, reparation manpower, dismantling route procedures and a cesarean or a dry-dock. Indirect tasks are usually more expensive than normal component reparation processes. Moreover, when a failure arises while a mission is in process,objectives could notbe fulfilledorthe missionmightbe cancelledandvessel is returned tobase. Butthe worst-case sce-narioisthatinwhichvesselorcrewsafetyarethreaten.

This work is an effort to implement a novel machine learn-ingbased approachthat aims tominimisethe negativeeffects of unexpected breakdowns, providing a reliable fault detection and

(2)

predictionstrategy. Thisapproach hasbeenapplied overreal op-erationaldataacquiredfromanauxiliarydieselengineduringreal vessel operation, since it is one of the most critical vessel com-ponents:it supplies propulsion andenergy to the vessel andits behavioriscomplex,asitisareciprocatingenginematchedwitha turbocharger.Astudyin-depthwascarriedout intothepossibility ofimprovementthroughthe useofdata-drivenmachine learning techniquesto statisticallymodel the normal behavior of the en-gine,ina fullyautomated unsupervisedfashion.Todo so, behav-iorcharacterizationandfuzzymodelingareapplied tomonitoring sensordata. Moreover,knowledge models generated are compre-hensive,yetaccuratemethodstoanticipatepotentialcriticalfaults. Resultingmodelsandall availableinformationareintegratedina specific CBM+ (Condition Based Monitoring) system, which com-binesCBM,RCM(ReliabilityCenteredMaintenance)andAI (Artifi-cialIntelligence)capabilities.Althoughthestudyisfocusedonthe exploitationofoperationalparameters,itmustbementionedthat theproposedapproachcanbealsoappliedtoothertypesof oper-ationalparameters that canbe usedasfailure indicators,such as vibrations, fluids analysis, thermography information, in-cylinder pressureorultrasonicinformation.

Therestofthearticleisorganizedasfollows.Section2presents a review of related andprevious works underthe use of condi-tionbasedmaintenancemethods andstrategiesforimproving as-setsreliability,andthepositionofthepresentworkinthecontext ofprevious ones. Sections3 and 4 explain the machine learning basedapproaches used to characterize asset behavior anddetect anomalies,basedonconstrainedclusteringandfuzzymodeling, re-spectively.InSection 5thetestscenario ispresentedandthe ex-perimentalresultsobtainedare discussed.Finally,theconclusions achievedinthisstudyaregiveninthelastsection.

2. Relatedwork

Conditionbasedmaintenanceaimstoanticipateamaintenance operationbasedon evidencesofdegradation anddeviationsfrom normalassetbehavior. When observingtheconditionofa partic-ularsystemasetofmonitoringdevicesandsensorsmustbe con-sidered.Intelligentmonitoringof equipmentby meansofsensors isessentialinordertoacquire relevantdata,containingthe char-acterizationof operationalfaults in physicalsignals:acoustic and ultrasonicsensors, accelerometers,currentmeasurementsor ther-mocouplesareusuallyemployed[3,4].Inadditiontothisdata, en-vironmentalconditions andcontextual information,such as tem-perature,pressureorhumidityalsoprovideveryusefuladditional informationto enrichthe modeling process [5]. From such infor-mation, specific Key Performance Indicators (KPIs) are calculated andanalysedtodiscovertrendsandknowledgeofinterestthatcan leadtoapotentialcriticalfault.

Atraditionalpreventivestrategymayobtainhighreliability lev-elsifitiswell designed[6].However, itsometimes implies over-maintainingtheassets.Thisisduetothefactthatequipment man-ufacturersarealwaysconservativeintheirmaintenancepoliciesso thatreliabilityisachieved,butassuminghighmaintenancecosts.It iswellknownthatfailureprobabilityofmanycomponentsishigh atthebeginningandendofitsoperationallife,followingthe bath-tubfailure pattern [7]. Therefore, unnecessary maintenance tasks increase failure ratewhen a defectiveitem is installed ordueto ahumanmistake.Moreover,preventivestrategiesdonottake into accountoperationalcontextsuchasloadprofiles,numberofstarts or environmental parameters, which strongly affect components lifetime. Finally, preventive maintenance is erroneouslybased on the idea that the probability of occurrence of operational faults increasesexponentially ata certain time. Inpreventive strategies componentsare replacedorrepaired before that momentoccurs. Thisassumptionisnottrueinmanycases,sincethereareseveral

failure patterns in which failure probability does never increase [8].In such cases failure probability is constant in time. Thus, a componentcouldfail atanytime.Especiallyrelevantexamplesof this phenomenon are failure patterns of electrical andelectronic components,inwhichreparationandsubstitutiontasksatplanned periods oftime doesnot implyan improvementinterms of reli-ability.Forall thesereasonsthere stillexists importantreliability increaseandcostreductionmargins.

Inorder to improvereliabilityand toreduce costs an optimal maintenance strategy should provide a set of predictive, preven-tive andcorrectiveprocedures asaresultofa technicaland eco-nomicanalysisofeveryfailuremode,takingintoconsiderationthe relatedconsequences [9]. Reliability CenteredMaintenance(RCM) strategies include a Failure Mode, Effects and CriticalityAnalysis (FMECA). Once failure modes are identified and criticality classi-fied,maintenancetasksto beundertakenare establishedtoavoid faults consequences[10].When predictive maintenanceis techni-callypossibleandiseconomicallyworthit,comparedtopreventive andcorrectiveones, itisapplied.Maximumreliabilityisobtained whenarobustandtrustworthyfailureindicatorparameteris mon-itored.When it is not technicallypossible orit isnot affordable, preventivemaintenance strategiesare adopted.Corrective mainte-nance is only envisaged in casepredictive andpreventive main-tenance strategies are not feasible.If that isthe case, the conse-quencesof afault are criticalandonly correctivemaintenance is feasible.Inthosesituations,theuseofsafetydevicestoapply ap-propriate troubleshootingtasks orredesigningaffected asset(e.g. installingastandbycomponent)isrequired.

Whentheaimistoachievemaximumreliabilityanappropriate CBM+systemwithmonitoringcapabilitiesmustbeadopted, gath-eringandcombiningall kindof usefulsources ofinformation si-multaneouslyandprovidingthe prognosticsneededto assurethe correct operation of the assets(e.g. the critical components of a paediatricemergencydepartment)[11].Thus,resultingCBM+ sys-temmustincludedataacquisitionandprocessing,diagnosticsand prognostics and decision making functionalities [12]. Generated data-driven models for diagnostics and prognostics must be de-ployedin amonitoringplatform withonlinedata acquisitionand inspectioncapabilities[13].SeveralcommercialCBM+systemsare alreadyavailable, mostof themusing a widevariety of potential failureindicators,separately[14,15].

Data-drivenprognosticsmodelsarethecoreofthewhole pro-cess since they apply the behavioral and statistical methods for fault prediction and classification [16]. Artificial neural networks [17]andsupportvectormachines[18,19]areusuallyappliedto an-alyze dataandinfer such models.The use ofprojection methods (e.g. linear, nonlinear and orthogonal projections to latent struc-tures,kernelmethodsorPCA)fordimensionalityreductionand re-gressioncanhighlysupportthepredictionprocess[20–22]. Never-theless,dependingontheapplicationandwheneveritispossible, itmaybe beneficialtoinstead incorporatespecific knowledge di-rectlyintowhicheveralgorithmisapplied.

(3)

3. Behaviorcharacterization

Data-drivenbehaviorcharacterizationconsistsongrouping sim-ilardataintodatasets,whichphysicallyrepresentthesame opera-tionalcondition.Withinformedgroups,thereexistsdatapointsfar fromthepattern,whichcorrespondstothemeanpoint.Such pat-terncouldbeverysignificantinordertoclassifyortoidentify be-haviorslinkedtothedata,orinordertodetectortoinferpossible faults oranomalous operational conditions.Biggroups, orgroups that are closetogether, usually implynormal behaviors. Whereas smallgroupsoreventsthat arefarfromthepattern(ofthesame group orregardingabig group),implyanomalies oroutliers(e.g. noiseandtransientdata).Itmustbenotedthatthelearnerisonly providedwithunlabelledexamples.Whenlabelsareavailable, su-pervised learning and reinforcement learning can be applied to trainaclassifier.However,itisoftenverydifficulttoobtainalarge datasetcontainingrealfaults.

3.1. ConstrainedK-meansclustering

IntheproposedconstrainedK-meansapproach,kvalueis auto-matically provided based on cluster distribution andcluster data variance as established by He et al. and Wu et al. [25,26]. It is computedasapreviousstepoftheclusteringprocess.Variance ex-plained ofresulting classification model,orclusterscompactness, Cmp, is calculated until CmpkCmpk−1≤0.5. As Euclidean dis-tanceisapplied[27],itbecomescoherenttoaveragecluster scat-teringindex[28].Thememberofeach cluster shouldbe asclose toeachotheraspossible.Theclusterscompactnessiscomputedas itcanbeseeninEq.(1).

Cmp= 1

k k

i=1

||

σ

(

Ci

)

||

||

σ

(

X

)

||

σ

(

Ci

)

=

σ

1 C1 . . .

σ

m Ck

σ

(

X

)

=

σ

1 X . . .

σ

m X

(1)

wherekisthenumberofclusters;

σ

(Ci) isthevarianceofcluster

Ciwith

σ

Cpi=

1 |Ci|

n j=1

(

x

p jC

p

i

)

2and

σ

(X)isthedatavariancewith

σ

p

X = 1n n

i,j=1

(

xipx p j

)

2,i= j.

Whenknowledgeregardingthesystembehaviortobemodeled isavailableinadditiontothedatainstancesthemselves,the algo-rithmcanbemodifiedtomakeuseofthisknowledge[29].A con-strainedthatconsistsonspecifyinganassetmaininputfeature,Xf,

isthusconsideredsothatinstancesaregroupedonthebasisofXf

non-parametricdistribution.Giventhenumberofclusters,k,their width interms ofXf valuesare estimated asw=

max(Xf)−min(Xf)

k .

Then, for each cluster Ci, i=

(

1,...,k

)

, uk=min

(

Xf

)

+

(

i+1

)

w

andlk=uk1,withu0=min

(

Xf

)

,aresetasupperandlowerlimits

onXfvalues,respectively.Then, alocaldistance-basedoutlier

de-tection can be accurately performed considering theasset status, determinedbyitsmaininputfeaturevalues.

Givenasetofmfeatures,X=

{

X1,...,Xm

}

whereeachfeatureXi

cantakeavaluefromitsownsetofpossiblevalues

χ

i,andn

fea-ture vectorsor instances, xi=

(

x1,...,xm

)

χ

=

(

χ

1,...,

χ

m

)

, with

i=1,...,n, L2 normalization is calculated for each instance xi in

the data set inorder to minimisethe impact of having different rangeofvaluesofrawdataintheresultingclassificationmodel.It iscomputedastherootofthesumofitssquaredelements, asit canbeseeninEq.(2).

L2

(

xi

)

=

m

l=1

|

x2

il

|

(2)

OncethenormalizationofdatasamplesisperformedusingL2, andinordertocalculatethedistancebetweeninstances,xiandxj,

Euclideanmetriciscomputed(seeEq.3).

D

(

xi,xj

)

=

||

xixj

||

=

m

l=1

xilxjl

2 (3)

wherexi= xi

L2(xi) andxj= xj

L2(xj) are the L2-normalizedinstances

xiandxj,respectively.

Theconvergencecriteriaisestablishedasamaximumnumber of iterationsand a stability threshold, which checks the clusters compactnessvariationofresultingclassificationmodelfromith it-erationtoiterationi+1(seeEq.1).

An iterative outlier detectionloop is then performed, so that fromgroups ofinstances formed during theconstrained-learning stage, outliers are detected and isolated and behaviors of inter-estgivena certain assetmain input featurevalues are character-ized. Theyare represented by instances that belong to a certain group butdiffers notoriouslyfromthepattern. For eachiteration intheoutlierdetectionprocessandforeachcluster,Ci,ananomaly

thresholdiscalculatedbasedonthedistancesofinstancesgrouped inCitothecentroid,asitcanbeseeninEq.(4).

Th

(

Ci

)

=

1

|

Ci

|

xjCi

D

(

xj,ci

)

+3

xjCi

D

(

xj,ci

)

2

|

Ci

|

−1

(4)

where ci=|C1i| xjCixj is the centroid of cluster Ci and xj, j=

{

1,..,

|

Ci

|}

,isthejthinstancegroupedinclusterCi.

Events whose distance to the centroid are over the anomaly thresholdaresetasoutliers.Consequently,foreachiteration dur-ing the outlier detection process and for every cluster, centroids are recalculated filtering out detected outliers. The process stops when no more outliers are detected. Outliers detected within a cluster and small clusters could imply abnormal asset behaviors andoperationalfaults.

TheoverallalgorithmcanbeseeninAlgorithm1.

Algorithm1 ConstrainedK-meansclusteringforoutlierdetection. Input:asetofmfeatures,X=

{

X1,...,Xm

}

1: ComputekusingEquation1

2: SelectanassetmaininputfeatureXfX=

{

X1,...,Xm

}

3: Computew= max(Xf)−min(Xf) k

4: forallCi,i=

(

1,...,k

)

do

5: Find uk=min

(

Xf

)

+

(

i+1

)

w and lk=uk−1, with u0= min

(

Xf

)

6: forallxj

{

lk,uk

}

do 7: xj

{

Ci

}

8: endfor 9: endfor 10: outliers=

{}

11: whileouliersfounddo 12: forallCi,i=

(

1,...,k

)

do

13: ComputeTh

(

Ci

)

usingEquation4

14: ifD

(

xj,ci

)

>Th

(

Ci

)

then

15: xjoutliers

16: RemovexjfromCi 17: endif

18: endfor 19: endwhile

(4)

4. Anomalydetection

Inthissectiontheanomalydetectionprocessispresented.Itis performedonthebasisofbehaviorscharacterizedfromdataby ap-plyingconstrainedK-means clusteringandonoutliers found.The mostimportantbehaviortobeconsideredisnormality.

4.1.Fuzzypartition

Fuzzyrules generationprocessandinference enginearebased on work proposed by Cingolani et al. [32]. Fuzzy controllers are currentlyconsideredtobeoneofthemostimportantapplications of the fuzzy set theory proposed by Zadeh [33]. This theory is basedonthenotionofthefuzzysetasageneralizationofthe or-dinaryset characterized by a membership function

μ

that takes valuesfromtheinterval[0,1]representingdegreesofmembership intheset.Fuzzycontrollerstypicallydefine anon-linearmapping fromthesystem’sstatespacetothecontrolspace.Thus,itis pos-sibletoconsider the output ofa fuzzycontroller asa non-linear controlsurfacereflectingtheprocessoftheoperator’sprior knowl-edge.Afuzzycontroller isakindoffuzzyrule-basedsystemthat iscomposedbythefollowingparts:

a knowledgebasethat comprisestheinformation usedbythe expertoperatorintheformoflinguisticcontrolrules,

a fuzzification interface, which transforms the crisp values of theinputvariablesintofuzzysetsthatwillbeusedinthefuzzy inferenceprocess,

an inference system, which uses the fuzzy values from the fuzzificationinterfaceandtheinformationfromtheknowledge basetoperformthereasoningprocess,and

thedefuzzificationinterface,whichtakesthefuzzyactionfrom theinferenceprocessandtranslatesitintocrispvaluesforthe controlvariables.

The knowledge base encodesthe expertknowledge by means ofasetoffuzzycontrolrules.

In orderto considereach group ofsimilar events,normaland outliers,in a relevantway, a fuzzypartition intwo fuzzy sets is definedovertheuniverseUioftheEuclideandistancestocentroid inclusterCi.LetdibetheEuclideandistanceofeventxitocentroid

ofcluster Ci,computedasitcan beseeninEq.(3).The

member-shipfunction ofthesefuzzysets, respectivelydenoted as

μ

n and

μ

oaredefinedinEq.(5)andEq.(6).

μ

n

(

di

)

=

Th(C

i)di

Th(Ci)−minn ifdi∈[minn,maxn]

0 otherwise (5)

μ

o

(

di

)

=

d

iTh(Ci)

maxoTh(Ci) ifdi∈[mino,maxo]

0 otherwise (6)

whereminn andmaxn andmino andmaxoare the minimumand

maximum values in fuzzy sets normal and outlier, respectively,

μ

n(di):di →[0, 1] quantifiesthe degreeof membership ofdi to

normaland

μ

o(di):di→ [0, 1]quantifiesthe degreeof

member-shipofditooutlier.ObtainedfuzzypartitionisdescribedinFig.1. Notethat consideringamembershipdegreeallows toprovide ex-pertswithmoreinterpretableinformationabouttherealstatusof theasset.

4.2.Eventscore

In order to distinguish betweenoutliers and real faults, a lo-caloutlierfactoriscomputedforeacheventdistance,asitcanbe seeninEq.(7).ItisbasedontheworkproposedbyBreunigetal. [34] and it measures the degree of isolation of a point with re-spect to its neighbors. Thus, the local density is also considered

Fig.1. Proposedfuzzypartition.

whendeterminingifanoutlierisanactualanomaly.

LOF

(

di

)

=

djN(di) LRD

(

dj

)

LRD

(

di

)

|

N

(

di

)

|

(7)

where LRD(di) is the local reachability distances of di,computed

fortheclosest subsetofdistances N(di)ofsize max

|C

i|

10,1

,asit isshowninEq.(8).LRD(dj)iscalculatedlikewise.

LRD

(

di

)

=

djN(di)

reachDist

(

di,dj

)

|

N

(

di

)

|

−1 (8)

where reachDist

(

di,dj

)

=max

{

Kdist

(

di

)

,dj

}

and Kdist

(

di

)

is the K−distanceneighborhood of di,withK=

|{

outliers

}|

foreach

cluster, being K=max

|Ci|

100,1

in the case that no outliers are

foundinclusterCi.

The score ofeventxi is then definedasa combination ofthe

membershipfunctiontofuzzysetsnormalandoutlierandthelocal densityofdistancestothenormalbehavior.

score

(

xi

)

=

μ

n

(

di

)

LOF

(

di

)

max

{

LOF

}

ifdi∈[minn,maxn]

μ

o

(

di

)

LOF

(

di

)

max

{

LOF

}

ifdi∈[mino,maxo]

(9)

where LOF=

(

LOF1,...,LOFn

)

, calculated for the whole set of

dis-tancesd=

(

d1,...,dn

)

.

Aneventwillbeconsidered asanomaly ifits scorefallsbelow −0.5. Fig. 2 shows an example of the evolution of the resulting eventscore calculatedover time. Theproposed event scoreis an optimalmethodforreducing false positivesinanomaly detection process.Italsoallowsuserstoaccessresultsquicklyandefficiently.

5. Testscenario

Themain goalofthepresentworkis toimproveanomaly de-tectionandpredictioncapabilitiesofexistingconditionmonitoring strategies.Inthissense,proposed approachshouldbeableto im-prove currentmathematicalmodels andpreventive strategiesput inpractice.Oncetheknowledgemodelsarelearnedandvalidated, theyaredeployedintheCBM+systemtoautomaticallyanticipate real-timefaultsinanonlinefashion.

5.1. Experimentalsetup

(5)

Fig.2. Eventscoreexample.

Fig.3. Barchartofresultingclustersdistribution.

A set of operational features were monitored, collected and analysed. From these parameters the engine behavior and its healthstatuscanbeestablished.TheyareshowninTable2.Then, the data processing techniques proposed are employed to learn statistical models fromhistorical data, which will allow identify-ing andmodelingnormalbehaviors, isolatingoutliersand detect-ingoperationalfaults.

Aneventiscollectedeveryminuteanditiscomposed by val-ues of all monitored parameters at a specific time instant. The

Table1

Auxiliarymarinedieselenginemainspecifications.

Parameter Value

Numberofcylinders 12V Ratedmaximumpower 1200kW

Ratedoperatingspeed 1800rpm(constant)

Bore 165mm

Stroke 185mm

Compressionratio 15.5

CBM+ system automatically transfers asset sensors’ readings in buffermodetoan on-boarddatabase.Thesedataaredailysentin streamingmodetoagroundcontroldatabaseforfurtheranalysis. Prognosticsmodelslearntfromhistoricaldataarethen appliedin real-timefortheintelligentmonitoringoftheasset,checkingnew eventson-board.

(6)

Table2

Monitoredauxiliarydieselengineparameters.

Typeofvariable Parameters

Contextualvariables Environmentalpressureandtemperature Relativehumidity

Circulationseawatertemperatureandpressure Engineinputvariables Inletandreturnfuelflow

Fuelpressure Enginespeed

Coolingwaterpressureandtemperature Oilpressureandtemperature Coolingwatertemperature

IntakemanifoldchargeairpressureandtemperatureA Startingairpressure

Engineoutputvariables Exhausttemperatureofcylinders1Ato6Aand1Bto6B InletexhausttemperatureofturbochargersAandB OutletexhausttemperatureofturbochargersAandB AandBturbotemperaturedrop

Generatorinputvariables Alternatorcoolingairtemperature Generatoroutputvariables Alternatorfrequency

Alternatorreactiveandactivepower Alternatorvoltageandintensity

AlternatorwindingtemperatureinphasesR,SandT Alternatorbearingtemperatures

Table3

Percentageofvarianceexplainedforeachnumber ofclusters.

Numberofclusters %ofexplainedvariance

2 76.03

3 88.23

4 93.46

5 95.32

6 95.91

7 96.52

8 96.96

sothattimeregionswheretheengineisnotworkingarenot con-sidered,sincetheyhavenosignificanceregardingoperational fail-uremodes.

Domain experts’contributiontakesplacewhenidentifyingthe asset input feature in the constrained K-means clustering step. Then,theproposedapproachisautomaticallyappliedand anoma-liesfound attheendoftheprocessarecheckedandvalidatedby thesameexperts.Todoso,meetingsandinterviewswithexperts mustbeheld.

The software platform used is Java Platform Enterprise Edi-tion fromOracle. It is based on the Java programminglanguage. Experimentswere conducted inthe Java-based Eclipse RCP (Rich Client Platform, Kepler version) environment on Ubuntu-Linux 14.04 (64bits), on a CORE i5 desktop PC with 4 GB of RAM memory.

Inthefollowingsubsectiontheanalysisprocessfollowedbythe proposed methodologyis shown overa real dataset.Moreover, a cylindersleaking problemcharacterised by low exhaustgas tem-peratureincylinders, andan alternatorproblemcharacterisedby highintensityandreactivepoweraredetectedanddiscussed.

5.2.Experimentalresults

The proposed methodology has been tested on two months’ timeoperationaldata,fromJanuarytoFebruary2015,ofan auxil-iarydieselengine.Atotalof17,377eventsareanalyzed.The num-ber of clusters, k, is set to 8 given the percentage of explained variance,asitcanbeseeninTable3.Althoughother complemen-tarytests were performedwithdifferentk values (e.g.k=6 and

k=10), resultsobtainedwere lessaccurate intermsofthe num-beroffalsenegativesandfalsepositives.

As a result of the constrained K-means clustering-based out-lier detectionprocess, a totalof78 eventsare isolated from nor-mal patterns. The resulting cluster distribution can be seen in Table4andinFig.3.

More indetail,events groupedin each cluster canbe seen in Fig.4.The dashedlinerepresentsthecluster centroidand opera-tionalparametersvaluesarerepresentedbydots,beingeachevent a set of dots (one value per operational parameter) at a specific time instant. Operational parameters values are normalized be-tween0and1.ItcanbeseenasasimplifiedKohonenmapor net-workwithasmallnumberofnodes,fromlowertohigherengine load,andnoneighborhoodfunctionwhenupdatingtheBMU(best matchingunit)ateachiteration[35].

As it can be expected, clusters containing stable engine load conditionsgroupedthemajorityofevents.Thatisthecase,for in-stance,ofCluster3,4and5.Outliersdetectedinsuchclustersare more likelyto imply realsystem faults. However, anddepending onnatureandtypeofsystemfault,aproblemcanalsooccurwhen theengineisnotinstableoperatingconditions.Thatbehaviorcan be observed inrelation to Cluster0, whenthe engine isstarting up. InFig.5 atypical normalengine behaviorunder stable oper-ation is shown. It corresponds to normal eventsgrouped within Cluster3.

Among outliers found, some events correspond to abnormally lowexhausttemperaturesincylinders,probablyduetoascavenge fire and/or a defective fuel valve, both ofwhich are caused by a fuelsystemfault.Thisfaultwaspresentinatotalof13events dis-tributedindifferentclusters.InFig.6anexampleofsuchbehavior inthreeevents ofoneofthe clustersformedcan beappreciated. An alternator system symptom was also detected in 2 events in two differentclusters. Itischaracterized by extremely high alter-nator intensityandreactive powerata normalengine load, asit can be seen in Fig.7. It isusually producedduring docking ma-noeuvresandcouldleadtocriticalsystemfailureorcompromised security.

(7)
(8)

Table4

Clustersdistribution.

Cluster 0 1 2 3 4 5 6 7

Totalnumberofevents 330 13 751 2098 10,550 3265 286 6

OutliersFound 9 0 13 7 33 15 1 0

Fig.5. Normalenginebehavior.

(9)

Fig.7. AlternatorSystemfaultdetectedatanormalengineload.

Table5

Resultsobtainedpercluster.

Cluster 0 1 2 3 4 5 6 7

Realnormal 326 13 750 2094 10,547 3263 285 6 Predictednormal 328 13 750 2094 10,547 3263 285 6 Realfault

Fuelsystem 4 0 1 3 3 2 0 0

Alternatorsystem 0 0 0 1 0 0 1 0

Predictedfault 2 0 1 4 3 2 1 0

Table6

Globalconfusionmatrix.

78outliersfound Predictednormal Predictedfault

RealNormal TN=17,362 FP=0 RealFault FN=2 TP=13

To quantify the number of anomalous events in each cluster, an eventisconsideredasarealfaultifitsscoreisbelow−0.5.Given that thetest casescenariosshowingthe fuelsystemand alterna-tor faults were deliberately chosen to be difficult to detect, it is stillencouragingthattheclassificationoffaultyeventsrisesabove the falsepositive rate,accurately distinguishing realfaultsamong outliersfound.

Then theapproach foranomaly detectionwastestedon a 10-foldcross-validationbasisforeachcluster,bysegmentingthetotal setofclustereventsinto10equalparts.Thus,theconfusionmatrix presentedinTable6isobtained, containingtheaverageresultsof the10folds.Notethatthe constrainedK-means clusteringstepis performedon thewholedata setonlyonce, inordertoestablish the differentengine load operational rangesthat will be used to isolatetheoutliersandpredicttherealfaults.

In order to evaluate the results,the precision, sensitivity and specificity ofthe detectionprocess are calculated.Theyare three widely used quality measures in this kind of processes. As it is

Table7

Globalprecision,sensitivity,specificityandκcoefficient.

Realnormal Realfault Globalresults

Precision 99.99% 100% 99.98% Sensitivity 100% 86.67% 93.34%

Specificity 100%

κ 0.93

shown in Table 7, precision, sensitivity and specificity are glob-ally above 93%, so the approach accurately limits false anoma-liesand undetectedfaults. The inter-rater agreement statistic co-efficient (Cohen’s kappa,

κ

) is also computedaiming to evaluate theagreement betweennormalandfault events[36]. The result-ing

κ

coefficientis0.93,thereforeahighstrengthofagreementis achieved.

Bytakingintoaccountestimationsmadebythisapproachand nextassetmaintenance periods,usageplanning,spareparts avail-ability and manpower resources, optimal maintenance strategies canbesuggested.

6. Conclusions

(10)

resultsshowthepotentialoftheproposedapproachtosuccessfully address fault modes characterization problem, providing a finer andeasiertointerprettechniqueforexpertstoanticipatepotential failuresonthebasis ofthehealth statusoftheassetfrom moni-toringsensordata.The complexityofsuchassets(e.g.marine en-ginesand industrial machinery) makes their behavior difficultto understandandinterpret.Undersuch circumstances itis particu-larlychallengingtoanticipateabnormalbehaviorsthatoften fore-seesignificantandcostlyfaults.

Itwasfoundthatmachinelearningmethodscanhighlysupport thetraditionaldiagnosticsandprognosticscondition-based main-tenancestrategies.The combinationofconstrainedK-means clus-teringforoutlierdetectionandbehaviorcharacterization,thefuzzy modeling ofdistances to patterns found and thefinal LOF-based eventscoreprovideafullycomprehensiveyetaccurateprognostics approach.Discussedtestscenario shownan accurate detectionof criticalfaultsinanauxiliarydieselengine,characterizedby abnor-malexhaust temperaturesin cylindersregarding fuel systemand by extremely high alternatorintensity and reactive power in al-ternator system. The performance achieved is quite high (global precision,sensitivityandspecificityabove93%and

κ

=0.93),even more so given that a very small percentage of real faults are presentindata.

The deployment of discussed data-driven models in a CBM+ system results in important benefits in terms of fault detection andprediction. The monitoring andanalysis of additional opera-tional parameters (e.g. in-cylinder pressure, torque, turbocharger speed,cylinder heads vibration, main bearings vibrationand flu-idsanalysis),andenvironmentalconditions(e.g.weather variabil-ityfromArcticSeato RedSeawaters), willallow increasing pre-dictionaccuracyandsystemcapabilitiesasmorefailuremodesare considered.Thiswill allowmaximizing reliability andminimising risksandunnecessarycostsderivedfromunexpectedbreakdowns. Moreover,predictionsanddiagnosisobtainedbyadoptingthis ap-proachcanserveasinput foranoptimaldecisionmakingsupport andplanningofmaintenanceoperations,whichwillleadto impor-tantbenefitsinthereliability,safetyandperformanceoftheassets.

References

[1] D.E.Barber,Shipboardconditionbasedmaintenanceandintegratedpower sys-teminitiativesPh.D.thesis,MassachusettsInstituteofTechnology,2011. [2] W.C.Greene,Evaluationofnon-intrusivemonitoringforconditionbased

main-tenanceapplicationsonUSNavypropulsionplantsPh.D.thesis,2005. [3] C.JamesLi,S.Li,Acousticemissionanalysisforbearingconditionmonitoring,

Wear185(1)(1995)67–74,doi:10.1016/0043-1648(95)06591-1.

[4] W.T.Peter,Y.Peng,R.Yam,Waveletanalysisandenvelopedetectionforrolling element bearing fault diagnosis their effectivenessand flexibilities, J. Vibr. Acoust.123(3)(2001)303–310,doi:10.1115/1.1379745.

[5] M.Braglia,G. Carmignani,M. Frosolini,F.Zammori,Data classificationand mtbfpredictionwithamultivariateanalysisapproach,Reliab.Eng.Syst.Saf. 97(1)(2012)27–35,doi:10.1016/j.ress.2011.09.010.

[6] M. Rausand, J. Vatn, Reliability centredmaintenance, in: Complex System MaintenanceHandbook,Springer,Berlin,Heidelberg,2008,pp.79–108,doi:10. 1007/978-1-84800-011-7_4.

[7] C.Milkie,A.N. Perakis,Statisticalmethodsfor planningdieselengine over-haulsintheuscoastguard,NavalEng.J.116(2)(2004)31–42,doi:10.1111/ j.1559-3584.2004.tb00266.x.

[8] M.John,Reliabilitycenteredmaintenance,1997,

[9] G.W.Vogl,B.A. Weiss,M. Helu,A review ofdiagnosticand prognostic ca-pabilitiesandbestpracticesfor manufacturing,J.Intell.Manuf.(2016)1–17, doi:10.1007/s10845-016-1228-8.

[10] J.M.Brown,J.A.CoffeyIII,D.Harvey,J.M.Thayer,Characterizationand prog-nosisofmultirotorfailures,in:StructuralHealthMonitoringandDamage De-tection,Volume7,Springer,Berlin,Heidelberg,2015,pp.157–173,doi:10.1007/ 978-3-319-15230-1_15.

[11]F.Kadri,F.Harrou,S. Chaabane,Y.Sun,C.Tahon,Seasonal arma-basedspc chartsforanomalydetection:applicationtoemergencydepartmentsystems, Neurocomputing173(2016)2102–2114,doi:10.1016/j.neucom.2015.10.009. [12] A.Coraddu,L.Oneto,A.Ghio,S.Savio,M.Figari,D.Anguita,Machine

learn-ingforwearforecastingofnavalassetsforcondition-basedmaintenance ap-plications, in:2015 International Conference on ElectricalSystems for Air-craft,Railway,ShipPropulsionandRoadVehicles(ESARS),IEEE,2015,pp.1–5, doi:10.1109/ESARS.2015.7101499.

[13]B.S.J.Costa,P.P.Angelov,L.A.Guedes,Fullyunsupervised faultdetection and identificationbasedonrecursivedensityestimationand self-evolving cloud-basedclassifier, Neurocomputing150(2015) 289–303,doi:10.1016/j.neucom. 2014.05.086.

[14]M.Baqqar,MachinePerformanceandConditionMonitoringUsingMotor Op-eratingParametersThroughArtificialIntelligenceTechniquesPh.D.thesis, Uni-versityofHuddersfield,2015.

[15]L.Sebastiani,A.Pescetto,L.Ambrosio,etal.,Theconditionmonitoringsystem foroptimalmaintenance-possibleapplicationonoffshorevessels,in:Offshore MediterraneanConferenceandExhibition,OffshoreMediterraneanConference, 2013.

[16]H.Yang,J.Mathew,L.Ma,Intelligentdiagnosisofrotatingmachineryfaults-a review,in:3rdAsia-PacificConferenceonSystemsIntegrityandMaintenance, ACSIM2002,Cairns,Australia,2002,pp.385–392.

[17]W.Li,Z.Zhu,F.Jiang,G.Zhou,G.Chen,Faultdiagnosisofrotatingmachinery withanovelstatisticalfeatureextractionandevaluationmethod,Mech.Syst. SignalProcess.50(2015)414–426,doi:10.1016/j.ymssp.2014.05.034. [18]Z.Zhou,J.Zhao,F.Cao,Anovelapproachforfaultdiagnosisofinductionmotor

withinvariantcharactervectors,Inf.Sci.281(2014)496–506,doi:10.1016/j.ins. 2014.05.046.

[19]R. Jegadeeshwaran, V. Sugumaran, Fault diagnosis of automobile hydraulic brakesystemusingstatistical featuresandsupport vector machines,Mech. Syst.SignalProcess.52(2015)436–446,doi:10.1016/j.ymssp.2014.08.007. [20]S.Yin,G.Wang,H.Gao, Data-drivenprocessmonitoringbasedonmodified

orthogonalprojectionstolatentstructures,IEEETrans.ControlSyst.Technol. 24(4)(2015)1480–1487,doi:10.1109/TCST.2015.2481318.

[21]E.Kokiopoulou,J.Chen,Y.Saad,Traceoptimizationandeigenproblemsin di-mensionreductionmethods,Numer.LinearAlgebraAppl.18(3)(2011)565– 602,doi:10.1002/nla.743.

[22]F.Zhou,J.H.Park,Y.Liu,Differentialfeaturebasedhierarchicalpcafault de-tectionmethodfordynamicfault,Neurocomputing202(2016)27–35,doi:10. 1016/j.neucom.2016.03.007.

[23]R. Akerkar, P.Sajja, Knowledge-Based Systems, Jones&Bartlett Publishers, 2010.

[24]W.J. Verhagen, R.Curran, Knowledge-based engineeringreview: conceptual foundations and researchissues, in: NewWorld Situation: New Directions in Concurrent Engineering,Springer, Berlin, Heidelberg,2010, pp. 267–276, doi:10.1007/978-0-85729-024-3_26.

[25]J.He,A.-H.Tan,C.-L.Tan,S.-Y.Sung,Onquantitativeevaluationofclustering systems,in:ClusteringandInformationRetrieval,Springer,Berlin,Heidelberg, 2004,pp.105–133,doi:10.1007/978-1-4613-0227-8_4.

[26]W.Wu,H.Xiong,S.Shekhar,ClusteringandInformationRetrieval,11,Springer, 2004,doi:10.1007/978-1-4613-0227-8.

[27]C.C.Aggarwal,Proximity-basedoutlierdetection,in:OutlierAnalysis,Springer, Berlin,Heidelberg,2013,pp.101–133,doi:10.1007/978-1-4614-6396-2_4. [28]C.Legány,S. Juhász, A.Babos, Clustervaliditymeasurementtechniques,in:

Proceedingsofthe 5thWSEASInternational ConferenceonArtificial Intelli-gence,KnowledgeEngineeringandDataBases,in:AIKED’06,WorldScientific andEngineeringAcademyandSociety(WSEAS),StevensPoint,Wisconsin,USA, 2006,pp.388–393.

[29]S. Basu, I.Davidson,K. Wagstaff,ConstrainedClustering:Advances in Algo-rithms,Theory,andApplications,CRCPress,2008.

[30]K.Wagstaff,C.Cardie,S.Rogers,S.Schrödl,etal.,Constrainedk-means clus-teringwithbackgroundknowledge,in:ICML,1,2001,pp.577–584.

[31] P.Bradley,K.Bennett,A.Demiriz,Constrainedk-meansclustering,Redmond, 2000,pp.1–8.

[32]P.Cingolani,J.Alcalá-Fdez,jfuzzylogic:ajavalibrarytodesignfuzzylogic con-trollersaccordingtothestandardforfuzzycontrolprogramming,Int.J. Com-put.Intell.Syst.6(sup1)(2013)61–75,doi:10.1080/18756891.2013.818190. [33]L.A. Zadeh, Fuzzy sets, Inf. Control 8 (3) (1965) 338–353, doi:10.1016/

S0019-9958(65)90241-X.

[34]M.M.Breunig,H.-P.Kriegel,R.T.Ng,J.Sander,Lof:identifyingdensity-based localoutliers,in:ACMSigmodRecord,29,ACM,2000,pp.93–104,doi:10.1145/ 335191.335388.

[35]A.Carrascal, A.Díez, A. Azpeitia,Unsupervised methodsfor anomalies de-tectionthrough intelligentmonitoringsystems, in:InternationalConference on HybridArtificialIntelligence Systems,Springer, Berlin,Heidelberg,2009, pp.137–144,doi:10.1007/978-3-642-02319-4_17.

[36]J.L.Fleiss, B.Levin,M.C.Paik,StatisticalMethodsforRatesandProportions, JohnWiley&Sons,2013,doi:10.1002/0471445428.

(11)

targetingimportantsectorssuchasmaritime,renewableenergy,railway,agro-food, civilstructuresandmachine-tool.

JoseAntonioPagánRubioobtainedanIndustrial Engi-neeringdegreeatUniversidad PolitécnicadeCartagena (ETSII-UPCT)in 2001.Thatyear heentered NAVANTIA, wherehehasworkedasreliabilityengineerinIntegrated Logistic Support(ILS) for defense programsuntil2007. HeisaReliability-CenterMaintenance(RCM) facilitator. Afterthat, hehasbeen working asanexperienced di-agnosisengineerinNAVANTIAEngineFactoryand heis alsoresponsibleforNAVANTIASmartMaintenance R&D project,startedin2012.Heisformedinpredictive main-tenancetechniquesasvibration,thermographyor ultra-sounds. Nowadays heis a PhD studentat Universidad PolitécnicadeCartagena(ETSII-UPCT)intheDepartment ofFluidsMechanicsandCombustionEngines,beinghisthesisrelatedtodiesel en-ginefailurediagnosisandprediction.

Ricardo Sanz is a Full Professorin Automatic Control and SystemsEngineeringattheUniversidadPolitecnica de Madrid. He got a degree in electrical engineering (1987)and aPh.D.inroboticsandartificialintelligence (1990).Since 1991,hehas been amember ofthe De-partmentofAutomaticControlattheUniversidad Politec-nica deMadrid, Spain. His main research interests fo-cusaround controlarchitectures forintelligent systems, being involved in research lines on autonomous con-trol,softwaretechnologiesfor complex,distributed con-trollers,real-timeartificialintelligence,cognitivesystems, bio-inspiredcontrollersandphilosophical aspectsof in-telligence.HehasbeenassociatededitoroftheIEEE Con-trolSystemsMagazineandchairmanoftheOMGControlSystemsWorkingGroup andtheIFACTechnicalCommitteeonComputersandControl.Heisalsoassociated editoroftheInternationalJournalofMachineConsciousnessandtheBiologically InspiredCognitiveSystemsJournal.Since2004heisthecoordinatoroftheUPM AutonomousSystemsLaboratory.

Referencias

Documento similar