
Academic year: 2023


(1)

Audio Signal Preprocessing

● Silence removal : ( Threshold based )

● Normalization : ( / Maximum – / Average )

● Framing :

- Frame Length ( 25 ms – 30 ms – 40 ms – 50 ms )

- % Overlap ( 40 % – 50 % – 60 % )

● Windowing : ( Hamming window – Hanning window )

● Signals alignment : ( Zero padding – Constant # frames )

● Filtration : ( LPF – HPF – BPF – BRF ) ( With the same length )

[Diagram labels: Time, Freq.]

(2)

Audio Signal Preprocessing

● Silence removal :
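A minimal sketch of threshold-based silence removal. The slides do not say which quantity is thresholded, so this example assumes the mean energy of short non-overlapping frames; the function name `remove_silence` and the values of `frame_len` and `threshold` are hypothetical.

```python
def remove_silence(samples, frame_len, threshold):
    """Drop every non-overlapping frame whose mean energy is below threshold."""
    kept = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        # Mean energy of the frame (assumed silence criterion).
        energy = sum(s * s for s in frame) / len(frame)
        if energy >= threshold:
            kept.extend(frame)
    return kept


signal = [0.0, 0.01, 0.5, -0.7, 0.02, 0.0, 0.9, 0.8]
voiced = remove_silence(signal, frame_len=2, threshold=0.01)
print(voiced)  # [0.5, -0.7, 0.9, 0.8]
```

The threshold usually has to be tuned to the recording level, which is one reason normalization and silence removal are often considered together.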

X

X

(3)

Audio Signal Preprocessing

● Normalization : ( / Maximum )
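A minimal sketch of normalization by the maximum, assuming "/ Maximum" means dividing every sample by the largest absolute amplitude so the result lies in [-1, 1]. The function name `normalize_by_max` is hypothetical.

```python
def normalize_by_max(samples):
    """Scale samples so that the largest absolute value becomes 1."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        # An all-zero signal cannot be scaled; return it unchanged.
        return list(samples)
    return [s / peak for s in samples]


signal = [0.5, -2.0, 1.0]
normalized = normalize_by_max(signal)
print(normalized)  # [0.25, -1.0, 0.5]
```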

(4)

Audio Signal Preprocessing

● Framing :

- Frame Length ( 30 ms )

- % Overlap ( 60 % )
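A minimal sketch of framing with a 30 ms frame length and 60 % overlap. The 16 kHz sample rate and the function name `frame_signal` are assumptions, and trailing samples that do not fill a whole frame are simply dropped here.

```python
def frame_signal(samples, sample_rate, frame_ms=30, overlap_percent=60):
    """Split samples into overlapping frames of frame_ms milliseconds."""
    frame_len = sample_rate * frame_ms // 1000
    # The hop is the part of a frame that does not overlap the next one.
    hop = max(1, frame_len * (100 - overlap_percent) // 100)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames


signal = [0.0] * 16000  # one second at an assumed 16 kHz sample rate
frames = frame_signal(signal, sample_rate=16000)
print(len(frames), len(frames[0]))  # 81 480
```

With these settings each frame holds 480 samples and consecutive frames start 192 samples apart.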

(5)

Audio Signal Preprocessing

● Windowing : ( Hamming window )
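A minimal sketch of applying a Hamming window to one frame, using the standard definition w[n] = 0.54 - 0.46 cos(2πn / (N - 1)). The function names `hamming` and `apply_window` are hypothetical.

```python
import math


def hamming(n_samples):
    """Return the n_samples-point symmetric Hamming window."""
    if n_samples == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]


def apply_window(frame):
    """Multiply each sample of the frame by the matching window weight."""
    return [s * w for s, w in zip(frame, hamming(len(frame)))]


windowed = apply_window([1.0] * 5)
```

The window tapers both ends of the frame towards 0.08 and leaves the center untouched, which reduces the discontinuities introduced by framing.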

(6)

Audio Signal Preprocessing

● Signals alignment : ( Zero padding )

1st method

[Figure: example sample sequences of different lengths, shown before and after zero padding]

0.119 -0.968 -0.076 -0.597 -0.236 2.296 -1.404 0.119 0.954 0.240 0.679 -0.968 -0.076 -0.597 -0.236 2.296 -1.404

0.679 -0.968 -0.076 -0.597 -0.236 2.296 -1.404

0.119 0.954 -0.236

2.296 -1.404

Padding zeros: 0 0 0 0 0
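A minimal sketch of one way to read the 1st method, assuming the zeros are appended at the end of each shorter signal until it matches the longest one. The function name `pad_to_longest` is hypothetical; the sample values are taken from the slide.

```python
def pad_to_longest(signals):
    """Append zeros to every signal so that all share the longest length."""
    target = max(len(s) for s in signals)
    return [list(s) + [0.0] * (target - len(s)) for s in signals]


signals = [
    [0.679, -0.968, -0.076, -0.597, -0.236, 2.296, -1.404],
    [0.119, 0.954, -0.236],
    [2.296, -1.404],
]
padded = pad_to_longest(signals)
print(padded[2])  # [2.296, -1.404, 0.0, 0.0, 0.0, 0.0, 0.0]
```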

(7)

Audio Signal Preprocessing

● Signals alignment : ( Zero padding )

2nd method

[Figure: example sample sequences of different lengths, shown before and after zero padding; after padding, the zeros appear between the samples]

Padded samples as extracted (row grouping not recoverable): 0.679 -0.968 0 -0.076 -0.597 0 -0.236 2.296 0 -1.404 0.119 0 0.954 -0.236 0 2.296 -1.404
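A minimal sketch of one possible reading of the 2nd method, in which the zeros are spread between the samples instead of being appended at the end. The exact placement used on the slide is not recoverable, so the even spacing here is an assumption, as is the function name `pad_interleaved`.

```python
def pad_interleaved(samples, target_len):
    """Spread the samples over target_len positions and fill the rest with zeros."""
    if target_len < len(samples):
        raise ValueError("target_len must be at least len(samples)")
    padded = [0.0] * target_len
    for i, s in enumerate(samples):
        # Evenly spaced destination index (assumed placement rule).
        padded[i * target_len // len(samples)] = s
    return padded


padded = pad_interleaved([0.119, 0.954, -0.236], target_len=7)
print(padded)  # [0.119, 0.0, 0.954, 0.0, -0.236, 0.0, 0.0]
```

Unlike appending zeros, this changes the spacing between the original samples, so it also changes the apparent time scale of the signal.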

(8)

Audio Signal Preprocessing

● Signals alignment : ( Constant # frames )

[Figure: example sample sequences of different lengths divided into a constant number of frames; labels: 30 ms, 50 ms]
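A minimal sketch of alignment by a constant number of frames: every signal is cut into the same number of frames, so the frame length grows with the signal duration. The 16 kHz sample rate, the choice of 10 non-overlapping frames, and the function name `split_into_fixed_frame_count` are assumptions.

```python
def split_into_fixed_frame_count(samples, n_frames):
    """Cut samples into n_frames equal, non-overlapping frames."""
    frame_len = len(samples) // n_frames
    if frame_len == 0:
        raise ValueError("signal is shorter than the requested number of frames")
    # Samples left over after the last full frame are dropped.
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


sample_rate = 16000                     # assumed
short_signal = [0.0] * 4800             # 0.3 s
long_signal = [0.0] * 8000              # 0.5 s
short_frames = split_into_fixed_frame_count(short_signal, 10)
long_frames = split_into_fixed_frame_count(long_signal, 10)
print(len(short_frames[0]) * 1000 // sample_rate)  # 30 (ms per frame)
print(len(long_frames[0]) * 1000 // sample_rate)   # 50 (ms per frame)
```

Both signals yield 10 frames, so per-frame features produce vectors of equal size even though the signals differ in duration.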

(9)

Audio Signal Preprocessing

● Signals alignment : ( Problem of constant frame length )

[Figure: example sample sequences of different lengths framed with a constant frame length; labels: 30 ms, 30 ms]
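A minimal sketch of the problem as it is usually understood: with a constant frame length, signals of different durations produce different numbers of frames, so their per-frame feature vectors do not line up. The 16 kHz sample rate, the non-overlapping frames, and the function name `count_frames` are assumptions.

```python
def count_frames(n_samples, frame_len, hop):
    """Number of full frames of frame_len samples taken every hop samples."""
    if n_samples < frame_len:
        return 0
    return 1 + (n_samples - frame_len) // hop


frame_len = 480  # 30 ms at an assumed 16 kHz sample rate
short_count = count_frames(4800, frame_len, hop=frame_len)  # 0.3 s signal
long_count = count_frames(8000, frame_len, hop=frame_len)   # 0.5 s signal
print(short_count, long_count)  # 10 16
```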

(10)

Audio Signal Preprocessing

● Filtration : ( LPF – HPF – BPF – BRF )
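A minimal sketch of the four filter types (low-pass, high-pass, band-pass, band-reject). The slides do not name a filter design, so this example assumes an ideal "brick-wall" filter that zeroes unwanted DFT bins; practical systems more often use FIR or IIR designs. The function names and the `kind`, `low_hz` and `high_hz` parameters are hypothetical, and the naive DFT is only suitable for short signals.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]


def fft_filter(samples, sample_rate, kind, low_hz=None, high_hz=None):
    """Zero the DFT bins outside the pass region of the chosen filter kind."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # Fold the upper half so mirrored bins share the same frequency.
        freq = min(k, n - k) * sample_rate / n
        if kind == "lpf":
            keep = freq <= high_hz
        elif kind == "hpf":
            keep = freq >= low_hz
        elif kind == "bpf":
            keep = low_hz <= freq <= high_hz
        elif kind == "brf":
            keep = not (low_hz <= freq <= high_hz)
        else:
            raise ValueError("kind must be lpf, hpf, bpf or brf")
        if not keep:
            spectrum[k] = 0
    return idft(spectrum)


n = 64
sample_rate = 64
signal = [math.sin(2 * math.pi * 2 * t / n) + math.sin(2 * math.pi * 10 * t / n)
          for t in range(n)]
low_passed = fft_filter(signal, sample_rate, "lpf", high_hz=5)    # keeps the 2 Hz tone
band_passed = fft_filter(signal, sample_rate, "bpf", low_hz=8, high_hz=12)  # keeps 10 Hz
```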

(11)

Audio Signal Features extraction

(12)

Time Domain Features

1- Time domain features :

Energy

Zero Crossing Rate

Average Magnitude

(13)

Frequency Domain Features

2- Frequency domain features :

Short term energy, (Est).

Short term ZCR, (Zst).

Mean frequency, (fmean).

Standard Deviation, (fsd).

First Quartile, (Q25).

Third Quartile, (Q75).

Interquartile range, (QIQR).

Skewness, (fskewness).

Frequency centroid, (fcentroid).

Mean fundamental frequency, (Fmean).

Mel Frequency Cepstral Coefficients, (MFCC11).

Ex: 1109 954 0.240 0.679 -0.968 0.076 0.597 0.236 2.296 1.404 0.316
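A minimal sketch of several of the spectral statistics listed above (mean frequency, standard deviation, quartiles, interquartile range, skewness, centroid). The slides do not give formulas, so this example assumes one common convention: the normalized magnitude spectrum is treated as a distribution over frequency, in which case the centroid coincides with the mean frequency. Fundamental frequency and MFCC are not covered. All function names are hypothetical, and the naive DFT is only suitable for short frames.

```python
import cmath
import math


def magnitude_spectrum(frame, sample_rate):
    """Return (frequencies in Hz, magnitudes) for bins 0..N/2 of a naive DFT."""
    n = len(frame)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * sample_rate / n)
        mags.append(abs(coeff))
    return freqs, mags


def spectral_stats(freqs, mags):
    """Statistics of the spectrum viewed as a distribution over frequency."""
    total = sum(mags)
    if total == 0:
        raise ValueError("silent frame: spectrum is all zeros")
    probs = [m / total for m in mags]
    mean = sum(f * p for f, p in zip(freqs, probs))
    sd = math.sqrt(sum(p * (f - mean) ** 2 for f, p in zip(freqs, probs)))

    def quantile(q):
        # First frequency at which the cumulative spectrum reaches q.
        cumulative = 0.0
        for f, p in zip(freqs, probs):
            cumulative += p
            if cumulative >= q:
                return f
        return freqs[-1]

    q25, q75 = quantile(0.25), quantile(0.75)
    third = sum(p * (f - mean) ** 3 for f, p in zip(freqs, probs))
    skewness = third / sd ** 3 if sd > 0 else 0.0
    return {"mean": mean, "sd": sd, "q25": q25, "q75": q75,
            "iqr": q75 - q25, "skewness": skewness, "centroid": mean}


n = 64
sample_rate = 64
frame = [math.sin(2 * math.pi * 4 * t / n) + math.sin(2 * math.pi * 12 * t / n)
         for t in range(n)]
stats = spectral_stats(*magnitude_spectrum(frame, sample_rate))
```

For this two-tone frame (4 Hz and 12 Hz at equal amplitude) the mean frequency is about 8 Hz and the quartiles fall on the two tones.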

(14)

Time vs Frequency Domain

[Figure: panels labeled Time and Freq.]

(15)

Speech Energy

$$E_n = \sum_{m} X^{2}(m)$$
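A minimal sketch of the energy computation, assuming the sum of squared samples is taken over each frame. The function name `short_time_energy` and the example frames are hypothetical.

```python
def short_time_energy(frames):
    """Return the sum of squared samples of every frame."""
    return [sum(s * s for s in frame) for frame in frames]


frames = [[0.5, -0.5], [1.0, 2.0]]
energies = short_time_energy(frames)
print(energies)  # [0.5, 5.0]
```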



(16)

Speech Energy

(17)

Signal ZCR
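A minimal sketch of the zero crossing rate of one frame, assuming the rate is the fraction of adjacent sample pairs whose signs differ and that a sample equal to zero counts as non-negative. Other normalizations exist; the function name `zero_crossing_rate` is hypothetical.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs with different signs."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


print(zero_crossing_rate([1.0, -1.0, 1.0, -1.0]))  # 1.0
print(zero_crossing_rate([1.0, 2.0, 3.0]))         # 0.0
```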

(18)

Signal ZCR

(19)

STE vs ZCR

[Figure: panels labeled Signal, Energy and ZCR]

(20)

Average Magnitude
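A minimal sketch of the average magnitude of one frame, assuming it is the mean of the absolute sample values. Some texts use the unnormalized sum instead; the function name `average_magnitude` is hypothetical.

```python
def average_magnitude(frame):
    """Mean absolute amplitude of the frame."""
    if not frame:
        return 0.0
    return sum(abs(s) for s in frame) / len(frame)


magnitude = average_magnitude([0.5, -1.5, 1.0])
print(magnitude)  # 1.0
```

Because it avoids squaring, the average magnitude is less sensitive to large samples than the energy.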
