• No se han encontrado resultados

Spanish sign language interpreter for mexican linguistics

N/A
N/A
Protected

Academic year: 2017

Share "Spanish sign language interpreter for mexican linguistics"

Copied!
6
0
0

Texto completo

(1)

Spanish Sign Language Interpreter for Mexian

Linguistis.

F.L. Juan-Barragán 1

,J. A. Pérez-Grana 1

, F. Cervantes 1

,M. Shwarzblat Y Katz 1

, M.G.

Olide-Márquez 1

,and A.P. Pérez-Sanhez 1

1

Instituto Tenológio Superior de Chapala, Chapala,Jaliso,ZP:45900, Méxio.,

uisjuanitshapala.om

Abstrat

Wepresentheretherstvisualinterfaefora

Mex-ian Spanish Sign Language translator on its rst

development stage: sign-writing reognition. The

software was developed for the unique

harater-istis of Mexian linguistis and was designed in

order to use sentenes or a sequene of signs in

sign-writingsystemwhih aredeodedbythe

pro-gram and onverted into a series of images with

movementthatorrespondtotheMexiansign

lan-guage system. Using a lexial, syntati and

se-mantialgorithmsplusfreesoftwaresuhasAPIs's

fromJava,videoonvertersoftware,database

man-agerlikeMySQL,PostgresandSQlite,waspossible

to read and interpret the rih and omplex

Mexi-an language. Ourappliation forvisual interfae

showed to be apable of reading and reonstrut

eahsenteneusedfortheinterpreterandtranslate

it into a high denition video. The average time

of video display vs number of sentenes to

inter-pret, probedto bein linearrelation withan

aver-agetimeoftwoseondspersentene. Thesoftware

has overometheproblem of homonym words

fre-quently used in Spanish language and verb tense

relation foreah sentene, speial symbolssuh as

#,%,$,et. are stillnotreognized intothe

soft-ware. Keywords:MexianSignLanguage(MLS);

Spoken languagetranslation; Sign animation;

Sin-tatiAlgorithm

Introdution

TheSignLanguageisasystememployedtostablish

ommuniationbetweenpersonswithdis-apability

bothauditiveandphoneti. Apersonwithsuh

dis-apabilityfaes several obstaleswhile integrating

into soiety. Inorder to overomesuhdiulties

(betweenaregularperson andadeafordeaf-mute

person) SignLanguageInterpreterssoftwareshave

been developed to attend this imperative need of

ommuniation. Inthe last twodeadesthere has

been more and more advanes in visual interfaes

forSign LanguageInterpreters(Pardoaet al2009,

Halawani2008,Dyngetal. 2008,Pradaetal. 2008,

Barraet al. 2007, Masakata2006, Meurant 2004,

Nyst2004,Endbarg-Pederson2003,Stouke1960).

The development of a Sign Language Interfae

(SLI) strongly depends on the ountry where the

SignLanguageisusedsinenotonlytheSLIis

vari-ablewithin dierentountries but also eah

oun-tryhasitsownharateristisfortheiroial

lan-guages.

Itis well knownthat MexiousesSpanishasits

oial language, however, the mexian linguistis

(ML's)dierabruptlybetweentheSpanishusedin

other ountries suh as Spain for example or the

kind Spanish linguistis used in south Amerian

ountries suh as Colombia. This is why on the

presentdatetherearestillmanyissuesand

misun-derstandingstowardspersonswhohaveaknoledge

os Spanish dierent than the mexian linguistis

anduseittriyngtoommuniateinMexio. Hene,

translating Mexian text into Mexian Sign

Lan-guage(MSL)requires aunique andspeial

knowl-edgewithinthis harateristilanguage.

SLI'sforSpanishlanguagehavebeenalready

pro-duefor Rodriguez(1991),Prada et al.(2008)and

Pardoa et al (2009), all of these works were

pro-duedonlyforEuropeanspanishlinguistisanduse

3Davatartehnologyto translatetext intoSL. To

thedatethearenoworksrelatedtoaMexianSign

LanguageInterfae(MSLI).

Produing a low ost software for Mexian

lan-guage has been a priority in Mexio ever sine

2006, Mobil hardware devies have been reated

asSLInterpreter(Leyboónet al. 2006),

neverthe-less,the software applied tothis newmobildevie

was ineient to deoding ML's, the system does

not aounts for dierent hand positions, plae of

thehand gesture, hand diretion and most

impor-tant: fae expression, whih is one of the

prevail-ing fators for a deaf-mute ommuniation, sine

emotionalexpressionplaysadeisivefatoronadd

meaning to eah prhase for deaf-mutes (private

(2)

DIF,Jaliso).

This devie (Leyboón et al. 2006) was too

robust and expensive for mass prodution. The

onsequenes of not having suh tools for deaf

and deaf-mute persons has reated an enormous

inapaity of ommuniation among the soiety

added with disrimination fators towards the

personwithsuhdis-apability.

Another fator to be taken into aount at the

time of reating a new type of SLI, points at the

uniqueulturalharateristispresentineah

oun-tryandthen,withdierentpriorityofbasineeds.

ThetoolsdevelopedforourMSLIoriginateonthe

basisoftwomainprioritiesformexiansoiety: (a)

the need to ommuniate from one person to

an-other in order to obtainand provide one or more

serviesand (b)theneedofthepersonwith a

dis-apability to have a meaningful response to this

ommuniation. Onthe latterissue, mexian

per-sons with disapability nd it moremeaningful to

interatwithavideoimagethatanshowupa

sen-sibleemotion thana3Davatar.

While 3D avatarsan indeed be onstruted to

representan emotion whiledisplaying, ithas been

prooved for loal experts on SL (Desarrollo

Inte-gral para la Familia, Jaliso, private

ommunia-tion)that Mexiansdonotrelatewellwithvirtual

imageswhiletryingtoommuniateafeelingor

ne-essity. At the same time we do not disriminate

the advantage of having 3D avatars for the ase

of simplewritten instrutionsorwebpage

transla-tionin ahuman-mahineor inamoregeneral

per-spetive: human-objetinterpretation,here,Prada

et al. (2008) oers the best deal for the

intera-tionwhen ommuniationbetweenpeople itisnot

aprimemanneroranbeavoided.

As an overall piture, the development of the

MSLI does not involvesany new relevant work in

theáreaofsignalproessing,thealgorithmsusedin

thisworkremainthesameknownatdate,however,

thisisaworkthatfousesonthedevelopmentarea

of engenieering, meaning a newpratial and

ino-vative tool devolopedto meet urrentsoialneeds

inthemexiansoiety. Theoutlineofthispaperis

presentedasfollows: setion2desribesthe

mate-rial and methodology employedinorder to explain

thedierenesofanSLI formexianlanguage.

Methodology

In order to have an appropriate translation from

speehtransriptionsintoSLitisneessarytohave

a parallel orpus institution to t the translation

models, test them, evaluate them and have them

orretedineahphaseoftheproess. Inourase,

the developmentof our MSLIwas performed with

the aid of a well founded federal institution:

De-Table 1: Employed software and hardware for

MSLI.

Employedsoftware EmployedHardware

APIs's(Java) videoamera

videoonvertersoftware

databasemanagerMySQL regularkeyword

Postgres regularsreen

SQlite

sarrollo Integral para la Familia de Jaliso (DIF,

Jaliso), whih is a solid foundation institution in

Mexio dediated to aid persons with this type

of disapability (among other funtions in its

pri-mary ativities). DIF, Jaliso, provided us with

theuniqueopportunitytoworkwithseveralexperts

(makingatotalgroupof8instrutorsforMSL)in

ordertotestandapplytheMSLInwiththeir

respe-tivestudents. Thisadvantagegaveusthe

opportu-nitytoattendtothebasiandrealneedsforadeaf

ordeaf-mutepersoninoursoiety.

At the time of working with this orpus it was

notied that a 3D avatar would not be helpful at

adressing person-person translations, where, a 3D

avatar is more suitable to adress a human-objet

translations in a good feasible way. A mayor

em-phasiswaspointedfromDIFJalisoovertoa

deaf-mutepersonhavingarealmeaningful

ommunia-tion,hene,theneedofaonsistenttranslating

sys-teminvolvinghumanfaialexpression. Inthis

man-ner,itwashoosentheuseof interativevideosas

theapropiatewayofprovidingsuh answer.

Thedevelopmentoftheplatformwasdividedinto

four main stages: harater reognition modulus,

the syntatimodulus, the semanti modulus and

thesyntaxmodulus,thesestagesexplainasuient

oordinatedmethodandagoodeienydegreeof

quality.

The rst stage (Figures 1,2) is the harater

reognitionproess performed with a lexial

algo-rithm. Thedesign ofour lexial algorithmonsist

onadoublebuerompilersystem. Forthedouble

buersystemwehavetakentheoupleusedbyAho

etal.1990.

Thelexialanalysisonsistsontheidentiation

ofthe wordorsenteneand theelimination ofnot

usablewordsfortheMSL. Withinthis proessthe

texttobetranslatedisrstapturedonsreen

us-ingabasikeyboard. Afterwards,thesystem

indi-atesto the userif any invalid harater hasbeen

typed,theinvalidharatersinthisstageare#,%,

,$,,

.

,et. Oneallharatersareread,the

pro-essoftranslationisallowedtoontinueonlyifall

haratersarereognizedasvalidharaters. Ifthe

programndsoneinvalidharaterthetranslation

proess stops and a warning message is displayed

onsreenaskingforthetextto beretyped.

(3)

anal-ysis. The algorithm applied was the normal form

of Chomsky'salgorithm (Chomsky1965)this

pro-ess onsists on the identiation of the sentene

struture: subjet, prediate, nouns, onjuntions,

verbs, transitive verbs, intransitiveverbs,

omple-ments, adverbials, and the use of the tenes on

eah verb in the sentene (present, past, and

fu-turetenes). Duringthisproesswordslikeartiles

oronjuntionsinthesenteneareeliminated(sine

signlanguagedoesnotusesanyofthose). Onethe

struture ofthesenteneis analyzedtheproessis

allowedto ontinue(Figures3,4).

Onthethirdstagewendthesemantisanalysis,

thisproessusesthealgorithmtoarrangethemain

sense ofthesenteneandeliminateanyambiguous

senses. Themainobjetiveofthisproessisto

iden-tify eitherhomonymandhomographwords(words

that sound or arewriten in asimilar way suh as

asaoraza,bothexistentonthemexianlanguage

andwithdierentmeaningsdependingontheform

they are aplied in eah sentene). This stage was

ompleted at a70%rate due to theomplexity of

the algorithm, in order to omplete this stage an

emergentbannerappearsat thetime theuseruses

anhomographorhomonimeword,thebannershows

upamenu withoptionsfordierentmeaningsand

givestheusertheoptiontohoosethemore

onve-nientone.

The last stage of the SLI is the generation of

the sequene of images seleted from a respetive

database of the sign language. One the

seman-tis is produed, the syntax algorithm (tal de tal

año) simplyusesthebeforestrutureto seletand

displaytheappropriatevideoprovidingthenthe

ex-petedtranslation.

Results

Ourmain resultsaredesribed aspresentedin the

methodology. In the rst stage we obtained the

produts of sentene typing as well as the voie

reognition proess, it was mesured that this

pro-esstookanaveragetimeof2.5seondsper

reog-nizedsentene. Theaveragetime ofpattern

reog-nitionisfasterwhenthesenteneisonlywritten

in-stead ofspoken,thisaveatwavesonthefat that

voieaptureproessesarestillnotfullyundestood

andkeepedyet underdevelopment.

For the seond stage of the MSLI it was

possi-ble to identify subjet, prediate and verb within

thesentene'sgrammaranddistinguishbetweenthe

present, past or future tenes for the verb (sine

mostofSLdoesnotusesverbtenes)atthispointit

wasalsopossibletoeliminateotherwordsnotused

byMSL suh asonjuntionsorintransitiveverbs.

AsshowninFigure4thereognitionproessofthe

SLIproduesasreenforthevideosequeneanda

spaeat thebottomtoshowthewrittenor spoken

sentene. This typed sentene either, followsor it

ismadetoadapttothegrammatialorderof

Mex-ianlinguistis(noun, verb, prediate,et) in order

toontinuethetranslation. Theproessaldostops

whenaunknownsymbolistypedoraunreognized

wordisspoken(suhasaaainsteadofasa).

Afterwards, the MSLI produes the

orrespon-dentvideointhedatabaseanddierentiatessimilar

senteneswithsimilarwordsbutdierent

gramati-alpuntuation(suhasthespanishmexianaent,

wihthenaddsatotaldierentmeaningtothe

sen-tene).

The semantimodulus wasompleted at a70%

due to the omplexity of the algorithm dealing

with homograph words. In order to solve

om-plete suh inonveniene a reursive help banner

was applied, every time the user types an

homo-graph word a graphi menu appears with several

optionstohoosefrom. Onetheonvenientoption

ishosenbytheusertheinterpretationproess

on-tinues and the videois displayed. Forthe ase of

ompoundsentenestheSLIproduesonlyasingle

videotoshowwiththemainideasoftheompound

sentene.

Within our results it was also reated a small

buer during the proess of lexial analysis. This

buerisathreadmanagerreatedtostoragewords

whih mightbeaptured with voiereognitionin

the implementation of future hardware devie for

sound reognition, notyet employed at this stage

of theSLI but left aside in order to fous only on

thisrststage ofSLI,written sentenereognition

transformedintovideoimage. Finallyitwasnoted

that the quality of the nal videos shows a small

ash in between videos this aveat is due to the

multimedia javaversionemployed forthe

develop-mentofMSLI.

Disussion

TheobtainedresultswiththeSLIweretakentoDIF

Jalisoanditwasprovedhowthegeneralneeds

de-manded at time were mostly overed. Regarding

the employed algorithms used in this SLI: lexial,

syntati and semanti, it is plausible to modify

andgeneratetheurrentsyntatiandsemanti

al-gorithmsemployed hereas mentionedby Montero

(2004) and Earley (2001). As for the lexial

al-gorithm it is now urrently used the standard

al-gorithm of Tokens sine it only onsists on

pat-ternreognitionandtherearenotnewandspeis

needsthatreaquirestoemployorperformanykind

ofmodiationtoit.

An important fat to be addressedis ouruse of

videoimagesinsteadofa3Davatarsuhasthease

of Prada et al. 2008 orHalawani 2008. This

dif-ferene reatesbothadvantagesand disadvantages

(4)

Figure 1: From left to right: (a) SLI Maín

win-dow. (b) Interpreter sreen, () Introdued

sen-tene for SLI. The window of the SLI ontains a

text eldwerethesenteneisintrodued,this

sen-tenemust bewritten insimpletenses,thesystem

aeptslowerandupperasetext.

fat ofhavinga 3D avatarresultsveryonvenient

at the time of simple interation for instrutions

orguidelines using toolssuh aswebpages orany

otherhuman-mahineinterfae,neverthelessthe3D

avatarfaes limitations at the moment of

human-humaninterationswheretheourreneofoffaial

gesturestakesamayorroletobetakenintoaount

inorderaddmeaningtotheonversationfora

per-son with this type of dis-apability. Adding more

detailed expression to this aveat has a high ost

Figure2: Firsttypeofsimplesentene,greeting.

Figure 3: Constrution of a sentene using simple

tenses.

in development but if suh improvement ould be

ahieved thistype oftoolon theSLI ouldhavea

mayor impat on many areas of interation for a

deaf-muteperson.

Thefatofadeaf-muteinteratingwithanother

person reates a basi need to reeive some type

of faial gesturein eah sentene in order to have

a omplete meaning of the onversation. In this

ase, the turn point omes into rating aomplex

onversation,hene,havingaompletesequeneof

imagestofollowupsuhtypeofinteration,theuse

of videoimagesfaes suh diultyand it isbeen

leftasfuture work.

Duringtheproessoftranslatingasenteneinto

imagetheaveragetimeresultantineahproess(2

seonds) learlyindiates that eah algorithmand

theemployedproess produesareliable and

on-sistentresponse that onnetswithsuient

feasi-bility the ontinuity of the translation proess as

suh. Therelationofthe mainvariableswith time

(5)

orrespon-Figure4:Fromuptodown: (a)Changeofmeaning

inthewordaordingtoaentuationsign. (b)

ur-rentbannerforunknownsign,()Senteneseletion

in thease ofhomonym and homograph word. In

(b)theSLIsystemisprogramedtodetethomonym

wordsonthedatabasebyshowinganurrent

ban-ner. Thedesiredmeaningforthesentene orword

isgivento behosenafterwards.

Figure 5: Changes in the meaning of a sentene

dependingongrammatialsignapplied inthe

sen-tene.

dene. Thislinearorrespondeneisthe

mathemat-ialproof of how eah oneofthis variables hasan

independentroleintheproessbesidesthatthe

or-thogonality between them it is also a way to

de-sribetheevolutionofthesystem.

As for the ash seen between images for

trans-latedsenteneitwasnotedthatsuheetwasdue

tooperativesystemanditsdierentversions. This

aw is strongly seen for Windows Operative

Sys-tem (OS), speially, Windows Vistaand Windows

7. The eet greatly diminishes at Windows XP,

while for Ma OS there is no appearane of suh

ash. Westrongly believe that suh disontinuity

isdue totheJavaMedia FrameworkAPIfor

Win-dows whih most probably needs a new

ompati-bility upgradefor themultimedia versionfor

Win-dows,whihisnottheaseoftheMaOS.

Figure6: Formupto down: (a)Errorwindow. (b)

Corretion Sreen, () Sentene. The SLI system

does not adtmits signs suh as$,%,&, dot, olon.

If the user types an invalid harater the systems

displaysamessageon (a). Thesystem anis able

to interpret nomorethan 12digits and the

stru-tureofthisshouldbeinaontinuousform(without

separationsorblankspaes).

TheaseofaSLI asanewtoolforMexian

lin-guistis has the advantage of being the rst tool

in this ountryapable ofreprodueand translate

simplehuman-humanonversationswhereas ithas

been mention the appearane of an photo-eletri

sensor for hand movement asaSLI tool

(Leybon-Ibarra et al. 2006) this partiular software uses

aphoto-eletrieletrode systemwithin an

adapt-ableglovewith3Davatarsreenimagestoprodue

thetranslation,nevertheless,themainlimitationof

suh systemonsists intheinapabilityof freedom

(6)

orien-tation,rossedorbendedngerpositionsandfaial

expression.

Conlusions

ThenewSLIforMexianlanguagewaspresentedin

this artile,our main resultsshowedto be reliable

and satisfatory within the seleted proof sample

whihwereseletedwithintheurrentneedsofthe

atual MSL for Mexians and speially aimed at

the expressed needs of DIF Jaliso. The program

showedtobeapableofoveromingmanyMSL

dif-ultiessuhashomographandhomonymwordsto

onstrutasentene.

Ourvisualinterfaeshowedtobeapableof

read-ing and reonstrut eah sentene used for the

in-terpreterandtranslateitintoahighqualityvideo.

Theaveragetimeofvideodisplayvsnumberof

sen-tenes to interpret, took anaverage time of 2

se-onds per sentene and probed to behold within a

linearrelation, whihshowshow bothvariable are

independentbetweenthem (hene abletodesribe

dierentharateristisoftheevolutionofthe

sys-tem).

Nevertheless, some aveats are still to be

on-sidered and investigated suhas theuse ofspeial

symbols suh as #, %, $, et. whih are still not

reognized into the software. The use of omplex

ompoundsentenesarestillyetnotreognizedby

theSLI andhasto betakeninto aountfor more

elaborateddialogues. Morework forfaialgesture

reognitionhastobedonesinemostofSLpersons

uses themto hange themeaning ofeahsentene

and it isalso awayto reognize themeaning ofa

phaseusedtoommuniatewiththem. Inthemean

time,theuseofSLIforanaverageonversationand

averageneedsofadeaf-mutepersonwitharegular

non-sapientMSLpersonisnowoveredandapable

to performitsmaintask.

Referenes

[1℄ Aranda.,B.E.,2008.Lavulneraióndelos

dere-hos humanos de las personas Sordas en

Méx-io.ComisionNaionaldelosDerehosHumanos,

CNDH.

[2℄ R. Barra, R. Córdoba, L.F. Haroa, F. F

er-nándeza, J. Ferreirosa, J.M. Luasa, J.

Maías-Guarasab, J.M. Monteroa and J.M. Pardoa,

Speeh to sign language translation system for

Spanish,ApliedSoftComputing,2008.

[3℄ Comparán,J.J.,1999.LenguaEspañolaI.

Méx-io: AMATE. Disapaidades, E. C. La

sor-dera y la pérdida de la apaidad auditiva.,

http://www.sitiodesordos.om.ar/sordera.htm.

[4℄ DingLilia,Modellingandreognitionofthe

lin-guistiomponentsin AmerianSign Language,

ApliedSoft Computing,2008,421,105.

[5℄ Donés, R., & Ortíz, C., 2005, XXXV

SIM-POSIO INTERNACIONAL DE LA SEL:

http://www3.unileon.es/dp/dfh/SEL/atas.htm.

[6℄ J. Earley, An Eient Context-Free

Parsing Algorithm, PhD tesis,

Uni-versity of California, Berkeley,

Cali-fornia, 1970, pp. 94-102,

http://www-

2.s.mu.edu/afs/s.mu.edu/projet/mt-55/lti/Courses/711/Class-notes/p94-earley.pdf.

[7℄ El Universal. 2006, :

http://www.eluniversal.om.mx/artiulos

/30484.html.

[8℄ Estrada, B., 2008, Sordos:

www.sordos.org.mx/artiulo.do.

[9℄ Galiia, S. N., 2000, Instituto Politénio

Na-ional Centro de Investigaión en Computaión

LaboratoriodeLenguajeNatural,Análisis

sintá-tio: http://www.gelbukh.om/Tesis

/Soa/tesisnal.htm.

[10℄ Garía, J. R., & Giner, B., 2007, Pearson,

PrentieHall.

[11℄ Leybon I. J, Ramirez B. M.R., Piazo T. V.,

Photo-Eletri Sensor Applied to Hand Fingers

Movement, Computaión y Sistemas, 2006, 10,

556.

[12℄ Lodares,J. R., Apliaiones Lexemátiasala

EnseñanzaDelEspañol,2009,Clarín,Revistade

NuevaLiteratura,78.

[13℄ J. M. Montero M., Desarrollode un Entorno

paraelAnálisis Sintátiode unaLengua

Natu-ral,UniversidadPoliténiadeMadrid, España,

2004,

http://lorien.die.upm.es/juanho-/pfs/JMMM/pfjmmm.pdf.

[14℄ Nuno, R. (1998). Correlatos neurosiológios

del lenguaje de senas en el nino sordo. Proyeto

deInvestigaión.

[15℄ J.M. Pardoa, J. Ferreirosa, V. Samaa, R.

Barra-Chiotea, J.M. Luasa, D. Sánhezb and

A.GaríabSpokenSpanishgenerationfromsign

language,2009,ApliedSoftComputing,123.

[16℄ Dr.Sami M.Halawani, Arabi Sign Language

TranslationSystemOnMobile Devies,

Interna-tionalJournalofComputerSieneandNetwork

Seurity,2008,8,1.

[17℄ SuphattharahaiC.,TowardstheDevelopment

of Speaker-Dependent and Speaker-Independent

Markov Model-Based Thai Speeh Synthesis,

2009,JournalofComputerSiene,5,905.

Referencias

Documento similar

[r]

The HLSML (High Level Signing Markup Language) notation we have cre- ated, was designed with the following four objectives: i) Generate an XML-based notation which could be used

Therefore, it sends the speech tran- scriptions from the speech recognition module to the machine translation module and the subtitle module, and the sequence of glosses from

In that work, the main sources of errors are reported in order of decreasing importance as follows: (a) the differences in the number of words and signs for parallel sentences, due

The signing process begins when a sign message, de- fined using HLSML, reaches the synthesizer. This message is parsed in the HLSML parsing module to obtain the se- quence of

languages of study: Spanish, Catalan and English as a foreign language. Learners’

A survey and critique of American Sign Language natural language generation and machine translation systems.. Technical report, University of

The database also allows storing different realizations of a sign, all of them related to each other by means of a common entry in the database. The database can store infinitive