
Revista ELECTRO

Vol. 45, 2023

Article

TITLE

Método Basado en Espectograma de Mel y Expresiones Regulares para la Identificación en Tiempo Real de Vehículos de Emergencia

AUTHORS

Alberto Pacheco González, Raymundo Torres, Raúl Chacón Blanco, Isidro Robledo Vega


ABSTRACT

In emergency situations, the movement of ambulances through city streets can be problematic due to vehicular traffic. This paper presents a method for detecting emergency vehicle sirens in real time. To derive the audio fingerprint of a Hi-Lo siren, several digital signal processing (DSP) and signal symbolization techniques were applied and contrasted against a deep neural network audio classifier, with both approaches starting from the same set of 280 environmental sound recordings and 38 Hi-Lo siren recordings. The precision of each method was evaluated using a confusion matrix and various derived metrics. The DSP algorithm developed showed a greater ability to discriminate between signal and noise than the CNN model.
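Both detectors were scored from a confusion matrix. As a minimal sketch of the metrics involved, using hypothetical counts rather than the paper's measured results:

```python
# Hypothetical counts for illustration only (not the paper's results):
# tp = Hi-Lo sirens correctly detected, fp = ambient sounds flagged as sirens,
# fn = missed sirens, tn = ambient sounds correctly rejected.
tp, fp, fn, tn = 35, 4, 3, 276

precision = tp / (tp + fp)                    # fraction of alarms that were real sirens
recall    = tp / (tp + fn)                    # fraction of real sirens that were caught
accuracy  = (tp + tn) / (tp + fp + fn + tn)   # overall fraction classified correctly
f1        = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.3f} recall={recall:.3f} "
      f"accuracy={accuracy:.3f} f1={f1:.3f}")
```

A detector with high precision but low recall raises few false alarms yet misses sirens; the trade-off between the two is what the confusion matrix makes visible.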

Keywords: emergency vehicle detection, audio fingerprint, time series symbolization, acoustic event detection, Mel spectrogram.
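The combination named in the title, symbolizing a signal and matching the symbol string with a regular expression, can be sketched as follows. The band edges, tone frequencies, pattern, and the `symbolize`/`HILO` names are illustrative assumptions, not the authors' implementation, and the per-frame dominant-frequency contour is synthetic rather than taken from a Mel spectrogram:

```python
import re

def symbolize(freqs, lo_band=(600, 750), hi_band=(750, 950)):
    """Map a per-frame dominant-frequency contour (Hz) to a symbol string:
    'L' = low tone band, 'H' = high tone band, '.' = anything else.
    Band edges are assumed values for this sketch."""
    out = []
    for f in freqs:
        if lo_band[0] <= f < lo_band[1]:
            out.append("L")
        elif hi_band[0] <= f < hi_band[1]:
            out.append("H")
        else:
            out.append(".")
    return "".join(out)

# A Hi-Lo siren alternates between two steady tones; require at least
# two full Hi-Lo cycles, each tone holding for a few frames.
HILO = re.compile(r"(H{3,}L{3,}){2,}")

siren   = [830] * 5 + [650] * 5 + [840] * 5 + [660] * 5  # synthetic contour (Hz)
traffic = [120, 95, 300, 1800, 240, 80, 60, 500]         # synthetic noise contour

print(symbolize(siren))                       # HHHHHLLLLLHHHHHLLLLL
print(bool(HILO.search(symbolize(siren))))    # True
print(bool(HILO.search(symbolize(traffic))))  # False
```

The appeal of this kind of symbolization (in the spirit of SAX [17][31]) is that the temporal structure of the siren becomes a string-matching problem, cheap enough for real-time use.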

REFERENCES

[1] U. Mittal, P. Chawla, "Acoustic Based Emergency Vehicle Detection Using Ensemble of Deep Learning Models", Procedia Computer Science, 2023.
[2] "ISO 7731: Ergonomics - Danger signals for public and work areas - Auditory danger signals", International Organization for Standardization, 2013.
[3] C. Guy, R. Sherrat, D. M. Townsend, "Cancellation of siren noise from two-way voice communication inside emergency vehicles", Computing & Control Engineering Journal, vol. 13, pp. 5-10, 2002.
[4] F. Meucci, L. Pierucci, E. Del Re, L. Lastrucci, P. Desii, "A real-time siren detector to improve safety of guide in traffic environment", 16th European Signal Processing Conference, pp. 25-29, 2008.
[5] J. Schroder, S. Goetze, V. Grutzmacher, J. Anemüller, "Automatic acoustic siren detection in traffic noise by part-based models", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 493-497, 2013.
[6] V. Tran, Y. Yan, W. Tsai, "Detection of ambulance and fire truck siren sounds using neural networks", Research World International Conference, Hanoi, Vietnam, July 2018.
[7] Y. Ebizuka, S. Kato, M. Itami, "Detecting approach of emergency vehicles using siren sound processing", IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, pp. 4431-4436, 2019, doi: 10.1109/ITSC.2019.8917028.
[8] J. Haitsma, T. Kalker, "A Highly Robust Audio Fingerprinting System", International Society for Music Information Retrieval Conference, 2002.
[9] J. Proakis, D. Manolakis, "Digital Signal Processing: Principles, Algorithms and Applications", 3rd ed., Prentice Hall International Inc., Upper Saddle River, NJ, USA, ISBN 0-13-394338-9, 1996.
[10] Md. Rahman, "Digital Signal Processing", Proceedings of the Short Course on Microcontroller/Microprocessor Based Industrial Production, vol. 1, DSP-1 - DSP, 2004.
[11] Md. Shahrin, M. Huzaifah, "Comparison of Time-Frequency Representations for Environmental Sound Classification using Convolutional Neural Networks", 2017.
[12] V. Boddapati, A. Petef, J. Rasmusson, L. Lundberg, "Classifying environmental sounds using image recognition networks", Procedia Computer Science, vol. 112, pp. 2048-2056, 2017, ISSN 1877-0509.
[13] V. K. Kukkala, J. Tunnell, S. Pasricha, T. Bradley, "Advanced Driver-Assistance Systems: A Path Toward Autonomous Vehicles", IEEE Consumer Electronics Magazine, vol. 7, no. 5, pp. 18-25, Sept. 2018, doi: 10.1109/MCE.2018.2828440.
[14] R. Vio, W. Wamsteker, "Limits of the cross-correlation function in the analysis of short time series", Publications of the Astronomical Society of the Pacific, vol. 113, no. 779, p. 86, 2001.
[15] MathWorks (2023), Correlation (Documentation) [online], available: https://www.mathworks.com/help/dsp/ref/correlation.html
[16] Y. Li and A. Ray, "Unsupervised Symbolization of Signal Time Series for Extraction of the Embedded Information", Entropy, vol. 19, no. 4, p. 148, Mar. 2017, doi: 10.3390/e19040148.
[17] J. Lin, et al., "Experiencing SAX: a novel symbolic representation of time series", Data Mining and Knowledge Discovery, vol. 15, pp. 107-144, 2007, doi: 10.1007/s10618-007-0064-z.
[18] K. J. Piczak, "ESC: Dataset for environmental sound classification", in Proc. 23rd ACM Int. Conf. Multimedia, pp. 1015-1018, 2015.
[19] J. Salamon, C. Jacoby, and J. P. Bello, "A dataset and taxonomy for urban sound research", in Proc. 22nd ACM Int. Conf. Multimedia, pp. 1041-1044, 2014.
[20] K. Palanisamy, D. Singhania, A. Yao, "Rethinking CNN Models for Audio Classification", arXiv:2007.11154, 2020.
[21] L. Nanni, G. Maguolo, S. Brahnam, and M. Paci, "An Ensemble of Convolutional Neural Networks for Audio Classification", Applied Sciences, vol. 11, no. 13, p. 5796, 2021.
[22] P. Zinemanas, "An Interpretable Deep Learning Model for Automatic Sound Classification", Electronics, vol. 10, no. 7, p. 850, 2021.
[23] E. Tsalera, A. Papadakis, M. Samarakou, "Comparison of Pre-Trained CNNs for Audio Classification Using Transfer Learning", Journal of Sensor and Actuator Networks, vol. 10, no. 4, p. 72, 2021.
[24] S. Shin, "Self-Supervised Transfer Learning from Natural Images for Sound Classification", Applied Sciences, vol. 11, no. 7, p. 3043, 2021.
[25] B. McFee, C. Raffel, D. Liang, D. Ellis, M. McVicar, E. Battenberg, O. Nieto, "librosa: Audio and music signal analysis in Python", in Proc. 14th Python in Science Conf., pp. 18-25, 2015.
[26] S. Adapa, "Urban Sound Tagging using Convolutional Neural Networks", Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), pp. 5-9, New York University, NY, USA, Oct. 2019.
[27] P. Kabal (2022), Audio File Format Specifications [online], available: https://www.mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html
[28] Apple Inc. (2011), Apple Core Audio Format Specification 1.0 [online], available: https://developer.apple.com/library/archive/documentation/MusicAudio/Reference/CAFSpec/CAF_intro/CAF_intro.html
[29] Moonpoint (2016), AFconvert [online], available: https://support.moonpoint.com/os/os-x/audio/afconvert.php
[30] Apple Inc. (n.d.), Apple Documentation [online], available: https://apple.github.io/turicreate/docs/userguide/
[31] J. Lin, E. Keogh, S. Lonardi, B. Chiu, "A symbolic representation of time series, with implications for streaming algorithms", in Proc. 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD '03), New York, NY, USA, pp. 2-11, 2003, doi: 10.1145/882082.882086.
[32] R. Torres, "Detección y clasificación de objetos usando sensores acústicos y aprendizaje automático para sistemas de frontera inteligentes", Master's thesis, DPI, TCNM campus Chihuahua, Chihuahua, Chih., Mexico (in progress).

CITE AS:

Alberto Pacheco González, Raymundo Torres, Raúl Chacón Blanco, Isidro Robledo Vega, "Método Basado en Espectograma de Mel y Expresiones Regulares para la Identificación en Tiempo Real de Vehículos de Emergencia", Revista ELECTRO, Vol. 45, 2023, pp. 184-189.
