• No se han encontrado resultados

Post-Nonlinear Mixtures and Beyond

N/A
N/A
Protected

Academic year: 2023

Share "Post-Nonlinear Mixtures and Beyond"

Copied!
8
0
0

Texto completo

(1)

Post-Nonlinear Mixtures and Beyond

Jordi Sol&Cas&'~* and Christian Jutten'

Signal Processing Group, University of Vic, Sagrada "ilia 7, 08500, Vic, Spain jordi. soleQuvic. e8

http: //we . u v i c . ev/eps/recerca/ca/proce.sament/inici, html

'

LIS-(CNTIts U M R n"5083), INPG, 46 AV. F.Viallet-38031 Grenobls Cedex, France Christian.Jutten~inpg.fr

http: //m. lis. inpg.f r/pagea.perso/ jutten/ jutten. htm

Abstrnct. Although sources in general nonlinear mixturm arc not separable iising only sta- tistical independence, a special and realistic case of nonlinear mixtnres, the post nonlinear (PNL) mixture is separable choosing a suited separating system. Then, a natural approach is based on the estimation of tho separating Bystem parameters by minimizing an indcpendence criterion, like estimated mwce mutual information. This class of methods requires higher (than 2) order statistics, and cannot separate Gaarsian sources. However, use of [weak) prior, like source temporal correlation or nonstationarity, leads to other source separation J g w rithms, which are able to separate Gaussian sourra, and c a n even, for a few of them, works with second-order statistics. Recently, modeling time correlated s011rces by Markov models, we propose vcry efficient algorithms hmed on minimization of the conditional mutual informa- tion. Currently, using t h e prior of temporally correlated sources, we investigate the fesihility of inverting PNL mixtures with non-bijectiw non-liacarities, like quadratic functions. In this paper, we review the main I C A and BSS results for riunlinear mixtures, present P N L models and algorithms, and finish with advanced resutts using temporally correlated s n u ~ s m

1 The nonlinear ICA and BSS problems

1.1 Model and problem

Consider N samples of t h e m-diinension ohserved random vector x, modeled by

x = .F(s)

+

n (1)

where

F

is a n unknown mixing mapping assumed invertible, s is an unknown n-dimensional source vector containing the source signals 8 1 , s ~ ~ .

. .

~ s,,, which arc arsumed tu be statistically independent, and n is a n additive noise, independent of the sources.

Such a model is usual in multidimenuionat signsl processing, where cach sensor receives an unknown superimposition of unknown soiirce signals a t time instants t = 1,. . . ,

N.

Then, the goal is to recover the rt unknown actual source signals q ( t ) which have given rise to the obsrrved mixtures. This i s referrod to as the blind soiirce Separation (BSS) problem, blind since n o OF very little prior information about the sourccs is required. Since the only assirmpt.ion is the independence of s o u r m , the basic idea in blind source separation consists in estimating a mapping

G,

only from the observed data x, such that y = B(x) are statistically inticpendent. Thc method, based on s t a h t i c a l independence, c o n s t i t u h a gericric approach called independcnt component analysis (ICA).

In the following, we assume that there are m many mixturps as sources (m = n) and that noise is zero. i n that case (a8 mil as if m

>

n), it is clear that estimating the inverw mapping

G

directly provides the sourcm On the cuntrary, if m C: n, identification of 4 and source estimation are unrelated tasks, and extra priors are required for separating the sources.

(2)

1.2 N o n l i n c a r m i x t u r e s

The general nonlinear ICA problem then consists in estimating a mapping

4

:

yields components

--t (H)" that

Y = G(x) (2)

which are statistically independent, only using the observations x.

For general nonlinear mappingu, E o F can lead to indepentlent s i g n a l s y which are still mixtures of the independent wurces x. In other words, ICA and BSS are not. equivdent: one can easily design a nonlinear mapping which niixes the sources and provides statistically independent variables yi

vi

= hi(s) # h ( s o ( L ) ) (3)

where a(i) is a permutation over { 1 , 2 , .

.

, ,ti}

Moreover, if the separation would Iic achieved, each estimated output ya would only depend on a unique source ye = /A,(S,(P). Then, strong distortions coutd stilt occur, due to the mapping h,(.).

One reason for this is that if U and er are two independent random variables, any of their functions

I(.)

and g(v) (where f and g are invertible functions) are independent too.

In the nonlinear BSS probtem, one then expect to find signals 1, = hi(so<,l), i = 1 , . . . n. Of course, it would be nice to restrict h, to identity functioIi: then, sources would be recovered up to weaker indeterminacies, e.g. scale and permutation indeterminacies as in linear mixtures. But is it possible? In other words, can priors on the sources and/or the mixing mapping be sufficient for this?

Generally, rising IC.4 for solving the nonliriear BSS problem requires additional prior informations or suitable regularizing constraints.

1.3 Outline

This paper is organized as follows. After having presented the nonlinear ICA and BSS problems in strtion 1, we consider two ways for regularizing the problem of source weparation in nonlinear mixtures. The first way, based on structural constraints, is explained in Section 2. The second way, based on prior information on the sources, is presented in Section 3. Finally, new ideas for inverting nowbijective fiirictions are investigated in Seclion 4.

2

Structural constraints

2.1 L i n e a r mod&

In the case of regular linear inodcls, the mapping 7 is linear and can be represented by x = As where A is a square invertible matrix. I n this c a e it suffices to constrain t h e separating model

G

to lie in the subspace of invertible square tnatrices, and one has to estimate a matrix B such that y = B x = Hs has independent components. First theoretical results on sepwability of linear mixtures have been proposed by Comon in 1994 [SI.

2.2 PNL mixtures

PNL m i x i n g and s e p a r a t i n g models In the post-nonlinear (PNL) model, the nonlinear obser- vations have the following specific forin (Figure 1):

One can see that the P N L model consists of a linear mixture followed by a component-wise nonlin- earity .fi acting on emh output independently from the others. T h e nonlinear functions (distortions)

f i are assumed tu be invertible.

(3)

-

Mixing System

-

Separating System -+

Fig. 1. The mixing-separating system for PNL mixturps

Besidm its theomtical interest, this modcl betonging to the L-ZMNL3 fainily suits perfectly for a lot of real-world applications. For instancc, such mod& appear in sensors array processing [14], satellite and microwave communications [17], and in many biological systems [lo].

The simplest choice for t.he separating system F is the mirror structure of the mixing system

F

(see Figure I). In [22], it has been shown t.hat these mixtures are separable for distributions having at nioet one Gaussian source, with the same indet.erminacies as linear mixtures.

PNL algorithms In [22], the mntual information

between the components yl,.

. .

,yn of the output vector is used as the independence criterion in bot.h linear and non-linear stages. For the linear part, minimization of the miitual information leads to the same estimation equations as for linear mixtures [8,5]

where components $, of t h e vector are score functions of the coniponrnts y, of the vector y.

Here p,(u) is the pdf of 11, and pi(.) its derivative.

For the nonlinear stage, one can derive from the estimating equations the gradient rule [22]

Here zc is the kth component of the observation vector, b,k is the eletncnt iC of tlir scparaling matrix B, and g; is the derivativp of the k-tli nonlinear function gr. T h e exact roniputation algorithm depends on the specific parametric form of the nonlinear mapping g,+{Ok,zk). In [22], a inultilayer perceptron network is u s 4 for modeling the functions g C ( B b , q ) , k = 1,. . . , n.

Contrary to BSS of linear mixtures, separation performance for nonlinear mixtures is strongly related to the estimation accuracy of the score functions (7) [22]. The score functions (7) must be estimated adaptively from the output vector y. Several alternative ways to do this are considered in [22]. The first approach is to estimate the pdf, and then compute using differentiation the score function. Pdf estimation based on the Cram-Charlier expansion [6,8] fails except for mild post- nonlinear distortions. For hard nonlinearities, a pdf estimation based on kernel methods is preferablc.

The second method estimates the score funct,ions directly, and provides very good results for hard L stands for Linear and ZMNL stands for Zerc-Memory NonLinearity: it is a separable model with a linear stage fallowed by a nonlinear (static) distortion.

(4)

Fig. 2. A Wiener system consists of a filter follmed by a distortion (left). For inverting the Wiener system, une uses a HaIiinierstein system (right), which consists of a distortion fallowed by a linear filter.

nonlinearities, toe. A well performing batch type method for estimating the score functions has been introduced i n a later paper [all. Several other authors have studied methods for blind separation of post-nonlinear mixtures starting from different viewpoints : other patameterizations of non-linear mappings [13,1,15,14, tcinporal decorrelatian of S O I I I C ~ [24], geometrical approaches [la, 31, ,md wnie tricks to improve tho dgoritms 1201.

2.3 CPNL m i x t u r e s

Separability of PNL mixtures can be generalized to convolutive PNL (CPNL) mixtures, in which the instantancuiirs mixture ( ~ n a t r i x A) is replaced by linear filters (matrix of filters A(,?)), where each sourre is independent and identically distributed (iid) [21. In fact, denoting A(z) =

C k

A ~ z - ~ , and definiiigss

(...,

s T ( k - l ) , s T ( k ) , s T ( k + l ) , . . . )Tandck ( . . . , e T ( k - l),eT(k),eT(k+l),...)T, we have:

e = f (As) 19)

where f acts co~npunentwiue, and:

T h e iid nature of t h e source sattipleu, i.e. the temporal independence of s , ( k ) , i = 1,. . . , n , insures the spatial independence of s. Then, the CPNL m i x t u r e can be viewed ;t9 a particular PNL mixtures. For FIR niixirig matrix A ( r ) , (9) correspoitds to a finite dimension PNL mixture and t h e separability holds. For more general filter (IIK) matrix, (9) is an infinite dimension PNL mixture, and the separability can be conjectured.

Algorithms for separating sources in CPNL mixtures are based on randoin procesvev (instead of random r-driables)indel)~nndence, which leads to very t.ricky criteria. Practically, one can use simplified criterion like J = I ( y l ( n ) g j ( n - k)), wliich demands high computation cost, but can still be simplified.

2.4 Wiener systems

With a siritahle parameterization, it can be easily shown that the problem of blind inversion of Wiener systcms (Fig. 2) is equivalent t o the source separation problem in PNL mixtures [23]. Its output writL% as

where s ( t ) is the independent and identically distributed (iid) input, e ( t ) is the observation, hjk) denotes the entries of the iinknown filter H , assumed invertible, and f is the unknown nonlinear mapping, assumed invertible a n d nietnotyless.

(5)

3

In this section we show that prior information on the sources can simplify the indeterminacics.

Specifically, we exploit the temporal correlation of the sources. Each source si(t), i = 1 , . . . ,n is assumed to be temporally correlated, and is modeled by a q-order Markov model, i.e. :

Time correlated

sources in nonlinear mixtures

P r i ( ~ , ( t ) l ~ i ( t - I ) , . . . , ~ i ( l ) ) =~s.(a,(t)lst(t- I),... , s i ( t - q ) ) (12) where pa, denotes the pdf of the random variable si.

Since output independence leads to source scparation, a possible approach for separating source is to consider a criterion measuring the independence of the out.put y . Following I16,7l, one can use the conditional mutual information of y, denoted by I :

whichisalwaysnonnegative,andzeroifandonly if thevariableswi(t) = g i ( % ) l g i { f - l ) , . . . , y i ( t - q ) are statistically independent for i = 1,. 1 I s l , i.e. the signals y i ( t ) , i = 1 , . . . , n are independent Markovian process.

Considwing the separation structure (Fig. l ) , where y(t) = Bz(t) and z , ( t ) = g i ( 8 ; , ~ , ( t ) ) , ~ the outputs can be estimated by minimizing:

(14) In practice, under the ergodicity conditions, t.he mathematical expectation (14) can be estimated by a time averaging, denoted .?(B,@), which rrquires the estimation of the coitditional densities of the estimated murces. Asympt.otically, cxtmding the results for linear niixturcs rif Markovian sourcps [7], the equivalence of the mutual information minimization method with the Maximum Likelihood method still holds for PNL mixtures of Markovian sources. As for linear mixtures, this method based on Markov model is very efficient for post-nonlinear mixture. For an computation cost equal to Y , where g is the order of the Markov niodel, the performance (in term of SNR) is increased and it becomes possible t.o separate Gaussiari wurces [ll].

4

Compensation

of

non-invertible nonlinearitiss

For PNL mixing system with non-bijective functions, the previous algorithms cannot inverse the mixtures, because of the indeterminacy of the inverse functions. For exaimpie, when the unknown non-linear functions are f i ( . ) = (.)', the inversc is as

+m.

In this case, assuming the inverse is known, we have to solve the sign indeterminacy. We investigated if this is possible using the time corrclation of the signals.

Assuming the non-linear inverse function i s known and equal t o

fl,

it remains to estimate its sign e and the linear part of the separating structure. Thun, a t each t.ime 31, thc output sample z,(n

+

1) is predicted by linear predict.ion (LP) with the N prtwinns samples:

N - 1

%(la

+

1)

= 1

cjzi(n - k),

k=O

and we select the sign t which minimizes thc square error ( i i ( s

+

1) -

t d m ) ' ,

where t = +1 or e = -1. In fact, if lei1 is large, the prediction is easy. On the contrary, the sign determination

'

g i ( & , x * ( t ) ) is a parametric model of si(.), where e, can represent a set of parameters

(6)

Fig. 3. Minimum (drshcd circles), maximum (dashed diamonds) and average (plain) performance over 10 experiments, versus a (left) and versus the number of samples, for a = 0.5 (right).

is difficult when le,l is close t o zero. For avoiding noisy estimation of t h e sign, it is necessary to chose the LYC order N greater than 2 or 3. Practically, since sign changes occur when le,l is close to zero, for decreasing the computation cost, we can estimate the sign only for (et1

<

8.

The linear part (the inatrix 8) is estimated according to estiniation equation (6).

4.1 E x p e r i m e n t a l results

In the first experiment, we corisidcr two raidom colored sourcm mixed with a linear matrix

A =

(2)

with a E [0.2,0.8]. The non-linear mappings are f,(.) = (.)'~ The sources are generated by filtering a Gaussian noise (500 samples) with the AK filter l/AR(z), where A R ( z ) = [l - 4 . 6 ~ - ' + 8 . 5 ~ - ~ -

8 . 0 ~ - ~ + 3 . 8 2 - ~ -0.752-"]. E'igrrrc 3 plots the minimutn, the maximum and the average performance over 10 experiments, measured as the residual error in d B (E, =

&[&?I),

versus a. We see

that the perfort~iarice does not depends on a , i . e . on the mixture hardness.

It1 the second experiment, we studied how the performance varies with the sample number.

In figurc 3, using t h e same mixing system ay in the previous experiment, with a = 0.5, we plot the maximnm, minimum and average values of SNR, versus the sample number of samples. Each experiment is stilt repeated 10 times. Due t o the strong correlation between successive samples, it is necessary to use a large n u n i l w of sampltu: for achieving a good estimation of t h e linear part as well as a good decision on the sign of the non-linear part^

Figure 4 is depicted one typical example of recovered source (plain line) and t h e true source (dashed line). We can observe that sometimes the truc sign is lost (due to a wrong estimation) but it mainly occurs for lei1 close to zero, and the true sign is quickly recovered.

Although this work is promising, the next step, which consists in blindly estimating the (un- known) non-linear mapping *gi and B, is much more tricky. However, this simple example shows how weak priors can be used for solving i l l - p o d problems.

5 Concluding remarks

In this paper, we h a w conside~ed ICA and BSS problems for nonlinear mixture models. It appears clearly BSS and ICA arc difficult arid ill-posed problems, and regularization is necessary for actually achieving ICA solutions which coincide to BSS.

(7)

7 " " " " ' I

-R A ,,

& & .m

A

i o 4 m d,

I

Fig.4. Recovered Paurce (plain) and the true source (dmhed) for one experiruent

In this purpose, two main ways can be used. First., solving the nonlinear BSS problem appro- priately using only the independence assumption is lrossible only if mixtures as well as separation structure are structurally constrained: for example post-nonlinear mixtures. Second, prior informa- tion on sources, for example temporally correlated sourcez, can simplify the algorithms or reduce the indeterminacies in thc solutions.

A lot of work remains to be dotie in studying the nonlinear ICA and USS protrlenis. Especially, the solution of more general n o n - h e a r problems, adding a few priors: temporal correlation, more obsexvations than sources [O, 121. Finally, up to now, the research has addressed mainly theoretic&

problems. The results will become more widely interesting only if they can be validated on realistic problems using real-world data [4].

6 Acknowledgments

This work has been partly supported by the European Commission project BLISS (IST-1999-14190), and by the University of Vic under the grant R0912.

References

1. S. Achard, D. T. Pham, and C. Jutten. Blind sourcc separation io post nonlinear mixturcs. In Pmc.

Int. Conf. on Independenf Component AnaIyJis and Signal Separation ( E A 2OOl), pages 295-300, San Diego, CA, USA, 2001.

2. M. Elabaio-Zadeh, C . Jutten, and K. Nayebi. Separating convolutive post noli-linear mixtures. In Pmc.

Int- Conj. o n Independent Component Analysia and Signal Separation (ICA ZOOl), pages 138-143, San Diego (California, USA), 2001.

3. M. Babaio-Zadeh, C. Jutten, and K. Nayebi. A geometric approach far separating past nonlinear mixtures. I n Pmc. of the XI European Signal Processing Conf (EUSIPCO ZOOZ), vulume I[, pages 11-14, Toulouse, Rance, 2002.

4. G. Bedoya, C. Jutten, S . BPrmejo, and J. Cabest,any. Improving semiconductor-b-ed chemical sensor arrays using advanced algorithms for blind source separation. In Proceeding8 of Sfcon 04, January 27-29, New Orleans (USA), 2004.

5. A. Cichocki aod S . 4 Amari. Adaptrue Blind Signal and Image Pmcessing - Lenrnrng Algorithms and Applications. J. Wiley, 2002.

6. P. Comon. lndependent component analysis, a new concept? Signal Processing, 36(3):287-314, 1994 7. S. Hmeini, C . Jutten, and D.-T. Pham. Markovian source separation IEEE Tkans. on Signal Processing,

51:3009-3019. 2003.

(8)

8. A. Hyv&inen, .I.Karhuncn, and E. Oja. Independent Component Analysis. J. Wiley, 2001.

9. A. Ilin, S. Achard, and C. Jutten. Bayesian versus constrained structure approaches for source separation in post-notilinear mixtures. In Proceedings IJCNN 2004 (to appear), 2004.

10. M. Korenberg and I. Hunter. The identification of uonlinear biological systems: Lh’L cascade models.

B i d . Cybemetics, 43(12):125-134, December 1995.

11. A. Larue, C. Jutten, and S. Hosseiui. Markovian source separation in non-linear mixtures. Granada, Spain, September 2004. Submitted to ICA 2004.

12. J. Lee, C. Jutten, and hi. Verleysen. Non-linear ICA using isometric dimensionality reduction. Granada, Spain, September 2004. Submitted to ICA 2004.

13. T.-W. Lre, B. Koehler, and R. Orglmeister. Blind murce separation of nonlinear mixing models. In Neuml Network8 for Signal Pmcesrmg V U , pages 406415. I E E E Press, 1997.

14. A. Parascliiv-Ioncscu, C. Jutten, arid (2. Bouvier. Sonrce scparation based processing Cor smart sensor arrays. IEEE Sensors Joumal, 2(6):663-673, December 2002.

15. H. Peng, Z. Chi, and W. Siu. A semi-parainctric hybrid neural model for nonlinear blind signal separa- tiun. Inl. J. of Neuml Systetrrs, 10(2):79-94, 2000.

16. D. T. Pliam. Fast algorithm for -timating mutual information, entropies and score functions. In Pmc. h i . Cunf. o n Independent Componenb Analyses and Signal Separafron (ICA 2003), pages 17-22, NaraJapan, 2003.

17. 5. Prakriya and D. Hathakus. Blind identification of LTI-ZMNL-LTI nonlinear channel models. IEEE Duns. an Signal Pmrmsing, 13(12):3007-3013, December 1995.

18. C. Puntonet, hZ. A l w e z , A. Prieto, aud B. Prieto. Separation of sources in a class of post-nonlinear mixtures. lo Pmc. 6th European S y m p . on Artificial Neurol Networks (ESANN’SB), pages 321-326, B r u g q Belgium, April 1998.

19. hi. Solazzi, R. Parisi, and A. Uucini. Blind source separation in nonlinear mixtures by adaptive spline rieural networks. I n Proc Int. Conf. on Independenl Component A n d y & and S+gnol Sepnmfion ( E A POOJ), pagn 254-259, San Diego, C A , U S A , 2001.

20. 1. Sol&Casals, M. Babaie-Zadeh, C. Jutten, and D.-T. Pham. Improving algorithm speed in PNL mix- turc scpnration and Wiener system inversion. In Proc. Int. Covt on Independent Component Analysia and Stqnd Sepamtiun (!CA 2803), Nara (Japan), April 2003.

21. A . Taleb arid C. Jutten. Batch algorithm for sourcc separation in pmt-nonlinear mixtures. In P m . Int.

Cunf on Independent Component Anulysaa and Signol Sepamhon (ICA l999)* page 155-160, Aussnis, Fratice, 1999.

22. A. Taleb and C. Jutten. Source separation in post-nonlinear mixtures. IEEE %ns. on Signal Processing, 47(10):2807-282[1, 1999.

23. A. Talcb, J. So16-~haIs, arid C. Jutten. Quz&i-nonpar;unetric blind inversion of Wiener systems. IEEE f i n s . o n Signal Pmcessmg, 49(5)-917-924, 2001.

24. A. Ziehe, M. Kawanabe, S. Harmeling, aud K.-R. Muller. Separation of post-nonlinear mixtures using ACE and tcmyoral decorrelation. 1x1 Pmc. Int. Cunf. on Independent Cumponent A n o l y s i ~ and Signal Sepamtion (ICA 20011, pages 433-438, San Diego, USA, 2001.

Referencias

Documento similar