Search for: [Keywords = "audio"] - PAS Journals

Number of results: 35

Ambisonics’ setup quality assessment through measurements and computations based on ITD and ILD functions using subbands and a Gammatone Filter Bank

Marcin Dąbrowski Jan Skorupa Wojciech Raszewski Maciej Głowiak

International Journal of Electronics and Telecommunications | 2024 | vol. 70 | No 2 | 307-313 | DOI: 10.24425/ijet.2024.149546

Keywords immersive audio sound localization ambisonics

Abstract

Poznan Supercomputing and Networking Center (PSNC) developed an ambisonic installation and workflow as part of audio-visual 8K VR 360° immersive media experiments. This work aimed to investigate the performance quality of the PSNC setup through both subjective tests and simulations that provide objective parameters of interaural characteristics in a real-life scenario in the PSNC studio. For the objective part, an angle-estimation algorithm was proposed and computations were performed.

Authors and Affiliations

Marcin Dąbrowski [1], Jan Skorupa [1], Wojciech Raszewski [1], Maciej Głowiak [1]

[1] Institute of Bioorganic Chemistry of the Polish Academy of Sciences, Poznan Supercomputing and Networking Center, Poland

Analysis of a novel FPGA-based system for filtering audio signals using a finite impulse response filters

Adrian Lipowski Paweł Majewski Sławomir Pluta

International Journal of Electronics and Telecommunications | 2022 | vol. 68 | No 1 | 19-26 | DOI: 10.24425/ijet.2022.139843

Keywords audio filtering FIR filter FPGA signal processing

Abstract

This article presents an analysis of an innovative system for filtering signals in the audible range (16 Hz - 20 kHz) on programmable logic devices using finite impulse response (FIR) filters. The system combines a software and a hardware platform; in the software layer, several languages, including VHDL, JavaScript, Matlab and HTML, were used to create a fully functional application. To determine the coefficients of the polynomial filters, the Matlab Filter Design & Analysis Tool was used. The developed graphical layer provides a user-friendly interface that makes it easy to transfer the required coefficients from the computer to the executive system. The practical implementation on the FPGA platform, specifically the Altera DE2-115 development kit with a Cyclone IV FPGA, was compared with a simulated realization of FIR filters in Matlab. The research confirms that the FPGA-based system filters effectively in real time, with filter orders of up to 128, for both audio channels simultaneously.
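As an illustration of the filtering operation this abstract refers to, the following is a minimal Python sketch of a direct-form FIR filter. The function name `fir_filter` and the 4-tap moving-average coefficients are assumptions chosen for the example; they are not the coefficients, the Matlab design, or the VHDL implementation from the paper.

```python
def fir_filter(coeffs, samples):
    """Direct-form FIR filter: y[n] = sum_k coeffs[k] * x[n - k].

    Samples before the start of the input are treated as zero.
    """
    output = []
    for n in range(len(samples)):
        acc = 0.0
        for k, b in enumerate(coeffs):
            if n - k >= 0:
                acc += b * samples[n - k]
        output.append(acc)
    return output


# Hypothetical 4-tap moving-average coefficients (an order-3 filter).
coeffs = [0.25, 0.25, 0.25, 0.25]

# The response to a unit impulse reproduces the coefficients themselves.
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
impulse_response = fir_filter(coeffs, impulse)
```

An FIR filter of order N has N + 1 coefficients, so an order-128 filter such as the one mentioned above would use 129 taps per channel in this formulation.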

Authors and Affiliations

Adrian Lipowski [1], Paweł Majewski [1], Sławomir Pluta [1]

[1] Opole University of Technology, Opole, Poland

Subjective Assessment of the Speech Signal Quality Broadcasted by Local Digital Radio in Selected Locations in Wroclaw under Studio and Home Conditions

Stefan Brachmański Maurycy Kin Patrycja Zemankiewicz

International Journal of Electronics and Telecommunications | 2022 | vol. 68 | No 4 | 687-693 | DOI: 10.24425/ijet.2022.141290

Keywords Digital Audio Broadcasting speech quality quality assessment

Abstract

In October 2018, local digital radio was launched to cover the agglomeration of Wroclaw. The implementation of this undertaking required many tests, including qualitative ones, that refer to both music and speech. This paper presents the results of subjective tests based on the evaluation of speech quality of signals recorded at various points in Wroclaw. Measurements were carried out in accordance with the recommendations of the International Telecommunication Union as well as in ordinary acoustic conditions in listeners’ flats. The rating was made for male and female voices. The most important conclusion is that, for the assessment of speech signal quality, the test conditions do not influence the obtained results. The experiment also confirmed that the reception location of the DAB+ signal in the Single-Frequency Network does not affect the perceived voice quality.

Authors and Affiliations

Stefan Brachmański [1], Maurycy Kin [1], Patrycja Zemankiewicz [1]

[1] Wroclaw University of Science and Technology, Poland

Sound localisation definition in parametrically and non-parametrically decoded first-order ambisonic systems

Jacek Majer

International Journal of Electronics and Telecommunications | 2024 | vol. 70 | No 2 | 355-360 | DOI: 10.24425/ijet.2024.149552

Keywords ambisonics spatial audio parametric decoding psychoacoustics

Abstract

This study assessed sound localisation definition in ambisonic systems using two non-parametric and three parametric decoders in a two-dimensional format. The sound samples were played back through eight loudspeakers arranged in a circle. The participants compared pairs of sound samples to determine which sample offered a more precise perception of the sound source’s location. The data analysis, using a Bradley-Terry probability model, revealed that parametric decoders were preferred with a 60–83% probability. Among the parametric decoders, the COMPASS method, which utilizes the Multiple Signal Classification algorithm for sound source direction estimation, received the highest scores for sound localisation judgements.
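To make the paired-comparison analysis mentioned above more concrete, here is a minimal Python sketch of the Bradley-Terry model, in which the probability that item i is preferred over item j is s_i / (s_i + s_j). The function names and the win counts are hypothetical and chosen only for illustration; they are not data or code from the study, and the fitting routine is the textbook minorization-maximization update rather than whatever procedure the author used.

```python
def bt_preference_probability(strength_i, strength_j):
    """Bradley-Terry probability that item i is preferred over item j."""
    return strength_i / (strength_i + strength_j)


def fit_bradley_terry(wins, iterations=200):
    """Estimate strengths from a matrix wins[i][j] = times i beat j.

    Uses the classic minorization-maximization update and normalizes
    the strengths to sum to 1 after every pass.
    """
    n = len(wins)
    strengths = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum(
                (wins[i][j] + wins[j][i]) / (strengths[i] + strengths[j])
                for j in range(n) if j != i
            )
            updated.append(total_wins / denom)
        norm = sum(updated)
        strengths = [s / norm for s in updated]
    return strengths


# Hypothetical tallies for two decoders: decoder 0 preferred in 30 of 40 trials.
wins = [[0, 30], [10, 0]]
strengths = fit_bradley_terry(wins)
p_0_over_1 = bt_preference_probability(strengths[0], strengths[1])
```

With only two items the fitted preference probability equals the observed win fraction; with more items the model reconciles all pairwise tallies into a single strength per item.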

Authors and Affiliations

Jacek Majer [1]

[1] Chopin University of Music, Department of Sound Engineering, Chair of Musical Acoustics and Multimedia, Warszawa, Poland

Retrospecting Polish Audio Engineering Society Membership on 20th Anniversary of the Polish Section of the Audio Engineering Society

Bożena Kostek Marianna Sankiewicz

Archives of Acoustics | 2011 | vol. 36 | No 2 | 187-197 | DOI: 10.2478/v10168-011-0016-x

Keywords audio and sound engineering musical acoustics

Abstract

This article presents some key events concerning the founding of the Polish Section of the Audio Engineering Society. In addition, the history of the International Symposia on Sound Engineering and Mastering is outlined, and the papers contained in this issue are briefly reviewed.

Authors and Affiliations

Bożena Kostek

Marianna Sankiewicz

A Study of Music Features Derived from Audio Recordings Examples – a Quantitative Analysis

Aleksandra Dorochowicz Bożena Kostek

Archives of Acoustics | 2018 | vol. 43 | No 3 | 505-516 | DOI: 10.24425/123922

Keywords music genres audio parametrization music features

Abstract

The paper presents a comparative study of music features derived from audio recordings, i.e. the same music pieces but representing different music genres, excerpts performed by different musicians, and songs performed by a musician whose style evolved over time. Firstly, the origin and the background of the division of music genres are briefly presented. Then, several objective parameters of an audio signal are recalled that have an easy interpretation in the context of perceptual relevance. Within the study, parameter values were extracted from music excerpts, gathered and compared to determine to what extent they are similar within the songs of the same performer or samples representing the same piece.

Authors and Affiliations

Aleksandra Dorochowicz

Bożena Kostek

16th International Symposium on Sound Engineering and Tonmeistering, Warszawa, Poland, October 8 – 10, 2015

Archives of Acoustics | 2016 | vol. 41 | No 1 | 169-175 | DOI: 10.1515/aoa-2016-0017

Keywords sound engineering tonmeistering Audio Engineering Society

Abstract

The 16th International Symposium on Sound Engineering and Tonmeistering (ISSET), organized by the Institute of Radioelectronics and Multimedia Technology (Warsaw University of Technology), the Department of Sound Engineering (Fryderyk Chopin University of Music) and the Polish Radio, under the auspices of the Polish Section of the Audio Engineering Society, was held in Warsaw on October 8-10, 2015. The main topics of the Symposium covered almost all domains of audio engineering, i.e. musical acoustics, noise control, signal processing, room acoustics, radio and television, multimedia, sound engineering and tonmeistering, perception and quality assessment, and many others. Particular attention was paid to the problems of loudness of audio programs in radio and TV broadcasting. Over 60 people from different branches of audio technology participated in the Symposium and shared their knowledge and experiences during the paper sessions, technical tours, workshops and special presentations. A selection of abstracts of the papers presented at ISSET’2015 is included below.

Implementation of a Novel Audio Network Protocol

Jaeho Lee Hyoungjoon Jeon Pyungho Choi Soonchul Kwon Seunghyun Lee

Archives of Acoustics | 2018 | vol. 43 | No 4 | 637–645 | DOI: 10.24425/aoa.2018.125157

Keywords AoIP DANTE SR System audio mixer

Abstract

Recently, the rapid advancement of the IT industry has resulted in significant changes in audio-system configurations; in particular, audio over internet protocol (AoIP) network-based audio-transmission technology has received favourable evaluations in this field. Applying AoIP in a certain section of the multiple-cable zone is advantageous because the installation cost is lower than that for the existing systems, and the original sound is transmitted without any distortion. The existing AoIP-based technology, however, cannot control the audio-signal characteristics of every device and can only transmit multiple audio signals through a network. In this paper, the proposed Audio Network & Control Hierarchy Over peer-to-peer (Anchor) system enables all audio equipment to send and receive signals via a data network, and the receiving device can mix the signals of different IPs. Accordingly, it was possible to improve the system-application flexibility by simplifying the audio-system configuration. The research results confirmed that audio signals from different IPs were received, mixed, and output without errors. It is expected that Anchor will become a standard for audio-network protocols.

Authors and Affiliations

Jaeho Lee

Hyoungjoon Jeon

Pyungho Choi

Soonchul Kwon

Seunghyun Lee

Pursuing Listeners’ Perceptual Response in Audio-Visual Interactions – Headphones vs Loudspeakers: A Case Study

Bartłomiej Mróz Bożena Kostek

Archives of Acoustics | 2022 | vol. 47 | No 1 | 71-79 | DOI: 10.24425/aoa.2022.140733

Keywords human perception audio-visual interaction 3D perception binaural spatial audio

Abstract

This study investigates listeners’ perceptual responses in audio-visual interactions concerning binaural spatial audio. Audio stimuli are coupled with or without visual cues to the listeners. The subjective test participants are tasked to indicate the direction of the incoming sound while listening to the audio stimulus via loudspeakers or headphones with the head-related transfer function (HRTF) plugin. First, the methodology assumptions and the experimental setup are described to the participants. Then, the results are presented and analysed using statistical methods. The results indicate that the headphone trials showed much higher perceptual ambiguity for the listeners than when the sound is delivered via loudspeakers. The influence of the visual modality dominates the audio-visual evaluation when loudspeaker playback is employed. Moreover, when the visual stimulus is present, the headphone playback pattern of behavior is not always in response to the loudspeaker playback.

Authors and Affiliations

Bartłomiej Mróz [1, 2], Bożena Kostek [2]

[1] Multimedia Systems Department, Gdansk, Poland
[2] Audio Acoustics Laboratory, Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Gdansk, Poland

On the Consumption of Multimedia Content Using Mobile Devices: a Year to Year User Case Study

Przemysław Falkowski-Gilski

Archives of Acoustics | 2020 | vol. 45 | No 2 | 321-328 | DOI: 10.24425/aoa.2020.133152

Keywords audio coding broadcasting mobile devices multimedia signal processing streaming services

Abstract

In the early days, consumption of multimedia content related to audio signals was only possible in a stationary manner. The music player was located at home, with a necessary physical drive. An alternative way for an individual was to attend a live performance at a concert hall or host a private concert at home. To sum up, audio-visual effects were reserved for a narrow group of recipients. Today, thanks to portable players, vision and sound are at last available for everyone. Finally, thanks to multimedia streaming platforms, every music piece or video, e.g. from one’s favourite artist or band, can be viewed anytime and everywhere. The background or status of an individual is no longer an issue. Each person who is connected to the global network can have access to the same resources. This paper is focused on the consumption of multimedia content using mobile devices. It describes a year-to-year user case study carried out between 2015 and 2019 and the development of current trends related to the expectations of modern users. The goal of this study is to aid policymakers, as well as providers, when it comes to designing and evaluating systems and services.

Authors and Affiliations

Przemysław Falkowski-Gilski

Performance Analysis of VoIP Data over IP Networks

Dariusz Strzęciwilk

International Journal of Electronics and Telecommunications | 2021 | vol. 67 | No 4 | 743-750 | DOI: 10.24425/ijet.2021.139801

Keywords MOS VoIP RTCP QoS audio codecs transmission quality

Abstract

The paper presents the results of research and analysis of voice data transmission quality in IP packet networks. It analyses mechanisms allowing for the assessment of packet telephony data transmission quality. Possible transmission quality levels and adequate quality metrics, applicable in the recommendations of standardisation organisations, as well as suggested limit values conditioning acceptable voice data transmission quality were indicated and discussed. A packet network model was designed and tested, taking into account VoIP architecture supporting various audio codecs used for voice compression. Transmission mechanisms based on audio codecs G.711, G.723, G.726, G.728 and G.729 were investigated. It was shown that for delay-sensitive traffic which fluctuates beyond its nominal rate, selected codecs have an advantage over others and allow for better transmission quality of VoIP traffic with guaranteed bandwidth and delay.

Authors and Affiliations

Dariusz Strzęciwilk [1]

[1] Institute of Information Technology, University of Life Sciences, Warsaw, Poland

Audio Compression using a Modified Vector Quantization algorithm for Mastering Applications

Shajin Prince Bini D A Alfred Kirubaraj J Samson Immanuel Surya M

International Journal of Electronics and Telecommunications | 2023 | vol. 69 | No 2 | 287-292 | DOI: 10.24425/ijet.2023.144363

Keywords vector quantization scalable perceptual coder audio mastering bit stream

Abstract

Audio data compression is used to reduce the transmission bandwidth and storage requirements of audio data. It is the second stage in the audio mastering process, with audio equalization being the first stage. Compression algorithms such as BSAC, MP3 and AAC are used as reference standards in this paper. The challenge in audio compression is compressing the signal at low bit rates. Previous algorithms that work well at low bit rates are not competitive at higher bit rates, and vice versa. This paper proposes a modified vector quantization algorithm that produces a scalable bit stream with a number of fine layers of audio fidelity. The modified algorithm is used to build a scalable perceptual audio coder. Its quantization and encoding stages are responsible for the psychoacoustic and arithmetical terminations, which are actually detached, as practically all the data detached during the prediction phases at the encoder side is supplemented towards the audio signal at the decoder stage. It is therefore the quantization phase that is modified to produce a scalable bit stream. The modified algorithm works well at both lower and higher bit rates. Subjective evaluations were carried out by audio professionals using the MUSHRA test, and the mean normalized scores at various bit rates were recorded and compared with those of the previous algorithms.

Authors and Affiliations

Shajin Prince [1], Bini D [1], A Alfred Kirubaraj [1], J Samson Immanuel [1], Surya M [1]

[1] Karunya Institute of Technology and Sciences, Coimbatore, India

EMD-based time-frequency analysis methods of audio signals

Marcin Lewandowski Qizhang Deng

International Journal of Electronics and Telecommunications | 2024 | vol. 70 | No 2 | 323-329 | DOI: 10.24425/ijet.2024.149548

Keywords empirical mode decomposition nonstationary audio data time-frequency analysis

Abstract

Using appropriate signal processing tools to analyze time series data accurately is essential for correctly interpreting the underlying processes. Commonly employed methods include kernel-based transforms that utilize base functions and modifications to depict time series data. This paper addresses the analysis of audio data using two such transforms, the Fourier transform and the wavelet transform, both based on assumptions regarding the signal's linearity and stationarity. However, in audio engineering, these assumptions often do not hold, as the statistical characteristics of most audio signals vary over time, making them unsuitable for treatment as outputs of a Linear Time-Invariant (LTI) system. Consequently, more recent methods have shifted towards breaking down signals into various modes in an adaptive, data-specific manner, potentially offering benefits over traditional kernel-based methods. Techniques like empirical mode decomposition and Holo-Hilbert Spectral Analysis are examples of this. The effectiveness of both kernel-based and adaptive decomposition methods was tested through simulations using speech signals, demonstrating that the adaptive methods are effective for analyzing audio data that is both nonstationary and the output of a nonlinear system.

Authors and Affiliations

Marcin Lewandowski [1], Qizhang Deng [2]

[1] Warsaw University of Technology
[2] University of New South Wales Sydney

Timbra: An online tool for feature extraction, comparative analysis and visualization of timbre

Filip Szymański Ewa Łukasik Magdalena Chudy

International Journal of Electronics and Telecommunications | 2024 | vol. 70 | No 2 | 349-354 | DOI: 10.24425/ijet.2024.149551

Keywords timbre audio descriptors feature extraction comparative analysis visualization

Abstract

Dariah.lab is a research infrastructure created for digital humanities, consisting of state-of-the-art hardware and dedicated software tools. One of the tools developed for digital musicology is Timbra, a web-based application for conducting research on sound timbre. The aim was to create an easy-to-use online tool for non-programmers. The tool can be used to calculate, visualise, and compare different timbre characteristics of uploaded audio files and to export the extracted parameters in CSV format for further processing, e.g. by classification tools. The application offers extraction and visualisation of scalar features such as zero crossing rate, fundamental frequency, spectral centroid, spectral roll-off, spectral flatness, and band energy ratio, as well as feature vectors (e.g. chromagram, spectral contrast, spectrogram, and MFCCs). An interested user can compare selected sound characteristics using various types of plots and run dissimilarity analysis of timbre parameters by means of 2D or 3D multidimensional scaling (MDS). The paper showcases potential applications of the tool based on presented case studies. In terms of implementation, the calculations are performed on the backend Django server using Librosa and standard Python libraries. The Dash library is used for the frontend. By offering an easy-to-use tool accessible anytime and anywhere through the Internet, we want to facilitate timbre analysis for a broader group of researchers, e.g. sound engineers, luthiers, phoneticians, or musicologists.
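Two of the scalar features listed above, zero crossing rate and spectral centroid, can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it does not use Librosa, it processes a single frame with a naive DFT and no windowing, and the function names and the 1 kHz test frame are hypothetical. Its values will therefore not necessarily match what Timbra reports.

```python
import cmath
import math


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def spectral_centroid(samples, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of one frame, via a naive DFT."""
    n = len(samples)
    weighted_sum = 0.0
    magnitude_sum = 0.0
    for k in range(n // 2 + 1):
        bin_value = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        magnitude = abs(bin_value)
        weighted_sum += (k * sample_rate / n) * magnitude
        magnitude_sum += magnitude
    return weighted_sum / magnitude_sum if magnitude_sum > 0 else 0.0


# Hypothetical test frame: a 1 kHz sine sampled at 8 kHz, 64 samples long.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(64)]
centroid_hz = spectral_centroid(frame, sample_rate)
zcr = zero_crossing_rate(frame)
```

For this pure tone, which fits a whole number of cycles into the frame, the centroid lands at approximately 1000 Hz; real recordings would be split into overlapping windowed frames before such features are computed.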

Authors and Affiliations

Filip Szymański [1], Ewa Łukasik [1], Magdalena Chudy [2]

[1] Poznan University of Technology, Poznan
[2] Institute of Art, Polish Academy of Sciences, Warsaw, Poland

Real-Time Acoustic Phenomena Modelling for Computer Games Audio Engine

Bartłomiej Miga Bartosz Ziółko

Archives of Acoustics | 2015 | vol. 40 | No 2 | 205-211 | DOI: 10.1515/aoa-2015-0023

Keywords sound reflection transmission attenuation real-time audio processing

Abstract

This article presents an efficient method of modelling acoustic phenomena for real-time applications such as computer games. Simplified models of reflections, transmission, and medium attenuation are described along with assessments conducted by a professional sound designer. The article introduces representation of sound phenomena using digital filters for further digital audio processing.

Authors and Affiliations

Bartłomiej Miga

Bartosz Ziółko

A Lifting Wavelet Domain Audio Watermarking Algorithm Based on the Statistical Characteristics of Sub-Band Coefficients

Zhi Tao He-ming Zhao Jun Wu Ji-hua Gu Yi-shen Xu Di Wu

Archives of Acoustics | 2010 | vol. 35 | No 4 | 481-491 | DOI: 10.2478/v10168-010-0037-x

Keywords audio watermarking lifting wavelet transform statistical characteristics sub-band coefficients

Abstract

In this paper, a new lifting wavelet domain audio watermarking algorithm based on the statistical characteristics of sub-band coefficients is proposed. First, the original audio signal is segmented and each segment is divided into two sections. Then, the Barker code is used for synchronization, the LWT (lifting wavelet transform) is performed on each section, and a synchronization code and a watermark are embedded into the first section and the second section, respectively, by modifying the statistical average value of the sub-band coefficients. The embedding strength is determined adaptively according to the auditory masking property. Experiments show that the embedded watermark is more robust against common signal processing attacks than existing algorithms based on LWT and, in particular, can resist random cropping.
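The lifting structure behind the transform mentioned above can be sketched with the simplest case, a one-level Haar wavelet built from split, predict, and update steps. This is an assumption-laden illustration: the abstract does not say which wavelet or decomposition depth the authors use, and `shift_mean` is a hypothetical stand-in for "modifying the statistical average value of the sub-band coefficients" with a fixed offset instead of the adaptive, masking-based strength.

```python
def haar_lifting_forward(signal):
    """One level of the Haar wavelet via lifting (split, predict, update)."""
    even = signal[0::2]
    odd = signal[1::2]
    # Predict: detail coefficients are what the even samples fail to predict.
    detail = [o - e for o, e in zip(odd, even)]
    # Update: approximation coefficients keep the average of each pair.
    approx = [e + d / 2 for e, d in zip(even, detail)]
    return approx, detail


def haar_lifting_inverse(approx, detail):
    """Undo the lifting steps in reverse order and interleave the samples."""
    even = [a - d / 2 for a, d in zip(approx, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    signal = []
    for e, o in zip(even, odd):
        signal.extend([e, o])
    return signal


def shift_mean(coefficients, delta):
    """Shift the average of a coefficient block by delta (hypothetical step)."""
    return [c + delta for c in coefficients]


# Hypothetical 8-sample segment (this sketch requires an even length).
segment = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, detail = haar_lifting_forward(segment)
marked = haar_lifting_inverse(shift_mean(approx, 0.5), detail)
```

Because every lifting step is undone exactly by its mirror step, the inverse transform reconstructs the segment perfectly when the coefficients are left untouched, which is what makes the lifting domain convenient for embedding small, controlled changes.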

Authors and Affiliations

Zhi Tao

He-ming Zhao

Jun Wu

Ji-hua Gu

Yi-shen Xu

Di Wu

Comparative Study of Visual Feature for Bimodal Hindi Speech Recognition

Prashant Upadhyaya Omar Farooq M.R. Abidi Priyanka Varshney

Archives of Acoustics | 2015 | vol. 40 | No 4 | 609-619 | DOI: 10.1515/aoa-2015-0061

Keywords Aligarh Muslim University audio visual corpus AVASR bimodal DCT DWT

Abstract

In building speech recognition applications, robustness to different noisy background conditions is an important challenge. In this paper, a bimodal approach is proposed to improve the robustness of a Hindi speech recognition system. The importance of different types of visual features is also studied for an audio-visual automatic speech recognition (AVASR) system under diverse noisy audio conditions. Four sets of visual features are reported, based on the Two-Dimensional Discrete Cosine Transform (2D-DCT), Principal Component Analysis (PCA), the Two-Dimensional Discrete Wavelet Transform followed by DCT (2D-DWT-DCT), and the Two-Dimensional Discrete Wavelet Transform followed by PCA (2D-DWT-PCA). The audio features are extracted using Mel Frequency Cepstral Coefficients (MFCC) followed by static and dynamic features. Overall, 48 features, i.e. 39 audio features and 9 visual features, are used to measure the performance of the AVASR system. The performance of the AVASR system is also evaluated on noisy speech generated using the NOISEX database for different signal-to-noise ratios (SNR: 30 dB to −10 dB) using the Aligarh Muslim University Audio Visual (AMUAV) Hindi corpus. The AMUAV corpus is a high-quality audio-visual database of continuous Hindi speech, consisting of Hindi sentences spoken by different subjects.

Authors and Affiliations

Prashant Upadhyaya

Omar Farooq

M.R. Abidi

Priyanka Varshney

Professors Marianna Sankiewicz-Budzynski and Gustaw K.E. Budzynski Founders of the Polish Audio Engineering

Andrzej Czyżewski Bożena Kostek

Archives of Acoustics | 2018 | vol. 43 | No 3 | 353-355 | DOI: 10.24425/123907

Keywords Polish Radio sound engineering Audio Engineering Society radiocommunications

Abstract

Biography and scientific achievements of Professors Marianna Sankiewicz-Budzyński and Gustaw K.E. Budzyński - Founders of the Polish Audio Engineering.

Authors and Affiliations

Andrzej Czyżewski

Bożena Kostek

Performance Evaluation of Audio Coding by Amalgam AAC and FLAC Audio codec using MDCT and INTMDCT Algorithm

M. Davidson Kamala Dhas R. Priyadharsini

International Journal of Electronics and Telecommunications | 2019 | vol. 65 | No 3 | 533-539 | DOI: 10.24425/ijet.2019.129810

Keywords Audio Codec Advanced Audio Coding (AAC) Free Lossless Audio Codec (FLAC) Modified Discrete Cosine Transform (MDCT) Integer Modified Discrete Cosine Transform (IntMDCT) Mean Opinion square (MOS) Perceptual Evaluation Audio Quality (PEAQ)

Abstract

The MDCT and IntMDCT algorithms are widely used in audio coding. The Integer MDCT is derived from the Modified Discrete Cosine Transform by means of a lifting scheme or rounding operations. It retains the properties of the MDCT and offers excellent invertibility and a good spectral mean. In this paper we discuss audio codecs such as AAC and FLAC using the MDCT and Integer MDCT algorithms, in order to find which algorithm yields the better compression ratio (CR). The aim of this work is to hybridize lossy and lossless audio codecs to achieve a reduced bit rate with finer sound quality. The audio quality is determined by subjective and objective testing, in terms of MOS (Mean Opinion Score) and ABX, and by hearing-aid testing methodologies such as PEAQ (Perceptual Evaluation of Audio Quality) and ODG (Objective Difference Grade). The performance measures, namely the compression ratio (CR) and sound pressure level (SPL), are estimated.
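
The lifting-with-rounding idea that the abstract credits for the IntMDCT's invertibility can be shown on its basic building block, a plane rotation. The sketch below is a generic illustration under stated assumptions, not the codec from the paper: a rotation by `theta` is factored into three lifting steps, each rounded to an integer, so the integer-to-integer mapping is exactly reversible. The function names are hypothetical, and `theta` must not be a multiple of π because the factorization divides by sin(theta).

```python
import math


def lifting_rotate(x, y, theta):
    """Integer approximation of a rotation by theta using three rounded lifting steps."""
    a = (math.cos(theta) - 1.0) / math.sin(theta)  # equals -tan(theta / 2)
    s = math.sin(theta)
    x = x + round(a * y)
    y = y + round(s * x)
    x = x + round(a * y)
    return x, y


def lifting_rotate_inverse(x, y, theta):
    """Undo lifting_rotate exactly by reversing the steps and flipping the signs."""
    a = (math.cos(theta) - 1.0) / math.sin(theta)
    s = math.sin(theta)
    x = x - round(a * y)
    y = y - round(s * x)
    x = x - round(a * y)
    return x, y


theta = math.pi / 4
original = (1000, -250)
rotated = lifting_rotate(original[0], original[1], theta)
restored = lifting_rotate_inverse(rotated[0], rotated[1], theta)

# Floating-point rotation of the same point, for comparison with the integer result.
exact = (original[0] * math.cos(theta) - original[1] * math.sin(theta),
         original[0] * math.sin(theta) + original[1] * math.cos(theta))
```

Each step changes one variable by a rounded function of the other, unchanged variable, so subtracting the same rounded value restores the input bit for bit. This is what allows a lossless path to be built from a transform that is otherwise real-valued.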

Authors and Affiliations

M. Davidson Kamala Dhas

R. Priyadharsini

Dynamically Programmable Analog Arrays in Acoustic Frequency Range Signal Processing

Piotr Falkowski Andrzej Malcher

Metrology and Measurement Systems | 2011 | No 1 | 77-90 | DOI: 10.2478/v10178-011-0008-1

Keywords FPAA audio processing switched-capacitor analog circuits design

Abstract

Field programmable analog arrays (FPAA), thanks to their flexibility and reconfigurability, give designers quite new possibilities in analog circuit design. The number of both academic projects on FPAA and applications of commercially available programmable devices is still growing. This paper explores the properties and parameters of the two most popular FPAA circuits: the AnadigmVortex AN221E04 and the AnadigmApex AN231E04 from the Anadigm company. The research conducted by the authors led to the discovery of some undocumented features of these devices. Several applications for audio processing were built and tested. The results show that these circuits can be used in medium-demanding audio applications. Thanks to dynamic reconfigurability, they also make it possible to build a universal analog audio signal processor. These circuits can also act as a versatile platform for rapid prototyping and educational purposes.

Authors and Affiliations

Piotr Falkowski

Andrzej Malcher

Assessment of Audio-Visual Environmental Stimuli. Complementarity of Comfort and Discomfort Scales

Jan Felcyn Anna Preis Marcin Praszkowski Małgorzata Wrzosek

Archives of Acoustics | 2021 | vol. 46 | No 2 | 279-288 | DOI: 10.24425/aoa.2021.136582

Keywords audio-visual interaction environment assessment discomfort comfort environmental perception environmental quality

Abstract

The aim of the study was to examine how the wording of a question about audio, visual and audiovisual stimuli can affect the assessment of the environment. The participants of the psychophysical experiments were asked to rate, on a numerical scale, audio and visual information both separately and together, combined into mixes. A set of questions was used for all the investigated audio, visual, and audio-visual stimuli. The participants were asked about the comfort or the discomfort caused by the perceived stimuli presented at three different sound levels.
The results show that there are no statistically significant differences between the assessment of comfort and discomfort associated with visual samples. In fact, the comfort and discomfort ratings are equivalent to the extent that a discomfort rating can be represented as the opposite of the comfort rating, i.e. the discomfort rating is equal to 10 minus the comfort rating.
In general, the results obtained for audio and audio-visual samples were the same, with only a few exceptions that were dependent on sound level. No statistically significant differences were found for the loudest stimuli, but there were some exceptions for the softer cases. Based on the results, we show that the two scales are fully interchangeable only for visual stimuli. When presenting audio and audio-visual samples, only one scale should be applied, either discomfort or comfort, depending on the context and the character of the stimuli.

Authors and Affiliations

Jan Felcyn

1

Anna Preis

1

Marcin Praszkowski

1

Małgorzata Wrzosek

2

Department of Acoustics, Faculty of Physics, Adam Mickiewicz University, Poznan, Poland
Institute of Philosophy, Szczecin University, Szczecin, Poland

Influence of Loudspeaker Configurations and Orientations on Sound Localization

Shu-Nung Yao

Archives of Acoustics | 2022 | vol. 47 | No 1 | 57-70 | DOI: 10.24425/aoa.2022.140732

Keywords audio quality ambisonics immersive sound loudspeaker array spatial effect virtual reality

Abstract

As the virtual reality (VR) market is growing at a fast pace, numerous users and producers are emerging with the hope to navigate VR towards mainstream adoption. Although most solutions focus on providing high-resolution and high-quality videos, the acoustics in VR is as important as visual cues for maintaining consistency with the natural world. We therefore investigate one of the most important audio solutions for VR applications: ambisonics. Several VR producers such as Google, HTC, and Facebook support the ambisonic audio format. Binaural ambisonics builds a virtual loudspeaker array over a VR headset, providing immersive sound. The configuration of the virtual loudspeaker influences the listening perception, as has been widely discussed in the literature. However, few studies have investigated the influence of the orientation of the virtual loudspeaker array. That is, the same loudspeaker arrays with different orientations can produce different spatial effects. This paper introduces a VR audio technique with optimal design and proposes a dual-mode audio solution. Both an objective measurement and a subjective listening test show that the proposed solution effectively enhances spatial audio quality.
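
The observation that the same array produces different results at different orientations can be made concrete with a textbook first-order, horizontal-only decoder. This sketch is not the dual-mode solution proposed in the paper. It assumes a regular ring of virtual loudspeakers, a basic sampling decoder, and an unnormalized W channel; all function names are hypothetical. It shows how rotating a four-speaker ring by 45° changes the per-speaker gains for one fixed source direction.

```python
import math


def ring_azimuths(num_speakers, orientation_deg=0.0):
    """Azimuths (degrees) of a regular ring, rotated by orientation_deg."""
    step = 360.0 / num_speakers
    return [(orientation_deg + i * step) % 360.0 for i in range(num_speakers)]


def encode_first_order(azimuth_deg):
    """Horizontal first-order components (W, X, Y) of a unit-amplitude source."""
    az = math.radians(azimuth_deg)
    return 1.0, math.cos(az), math.sin(az)


def decode_gains(w, x, y, speaker_azimuths_deg):
    """Basic sampling decoder: one gain per loudspeaker on a regular ring."""
    n = len(speaker_azimuths_deg)
    gains = []
    for a_deg in speaker_azimuths_deg:
        a = math.radians(a_deg)
        gains.append((w + 2.0 * (x * math.cos(a) + y * math.sin(a))) / n)
    return gains


# One source at 45 degrees, decoded to the same square array in two orientations.
w, x, y = encode_first_order(45.0)
gains_axis_aligned = decode_gains(w, x, y, ring_azimuths(4, orientation_deg=0.0))
gains_rotated = decode_gains(w, x, y, ring_azimuths(4, orientation_deg=45.0))
```

With the rotated ring, one loudspeaker sits exactly at the source direction and carries most of the signal, whereas the axis-aligned ring splits it between two neighbours. In a binaural renderer each virtual loudspeaker would then be filtered with the head-related transfer function for its direction, which is where such gain differences become audible.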

Authors and Affiliations

Shu-Nung Yao

1

Department of Electrical Engineering, National Taipei University, No. 151, University Rd., Sanxia Dist., New Taipei City 237303, Taiwan

Single-ended quality measurement of a music content via convolutional recurrent neural networks

Kamila Organiściak Józef Borkowski

Metrology and Measurement Systems | 2020 | vol. 27 | No 4 | 721-733 | DOI: 10.24425/mms.2020.134849

Keywords audio data analysis artefacts detection convolutional neural networks recurrent neural networks classification model

Abstract

The paper examines the use of a Convolutional Bidirectional Recurrent Neural Network (CBRNN) for the problem of quality measurement of music content. The key contribution of this approach, compared to the existing research, is that the examined model is evaluated in terms of detecting acoustic anomalies without the requirement to provide a reference (clean) signal. Since real music content may include some modes of instrumental sounds, speech and singing voice, or different audio effects, it is more complex to analyze than clean speech or artificial signals, especially without a comparison to known reference content. The presented results might be treated as a proof of concept, since only some specific types of artefacts are covered in this paper (examples of quantization defect, missing sound, distortion of gain characteristics, extra noise sound). However, the described model can be easily expanded to detect other impairments or used as a pre-trained model for other transfer learning processes. To examine the model efficiency, several experiments have been performed and reported in the paper. The raw audio samples were transformed into Mel-scaled spectrograms and passed as input to the model, first independently, then along with additional features (Zero Crossing Rate, Spectral Contrast). According to the obtained results, there is a significant increase in overall accuracy (by 10.1%) if Spectral Contrast information is provided together with Mel-scaled spectrograms. The paper also examines the influence of recursive layers on the effectiveness of the artefact classification task.
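
Of the additional features named in the abstract, the Zero Crossing Rate is the simplest to state in code. The sketch below is a generic frame-wise definition using only the standard library; the frame length, hop size, sampling rate, and test tones are assumptions for illustration and are not taken from the paper.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs in the frame whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames of frame_len samples."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def tone(freq_hz, sample_rate, num_samples, phase=0.1):
    """Sine tone with a small phase offset so no sample lands exactly on zero."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate + phase)
            for n in range(num_samples)]


SAMPLE_RATE = 8000  # assumed sampling rate for this illustration
low = tone(100, SAMPLE_RATE, SAMPLE_RATE)
high = tone(1000, SAMPLE_RATE, SAMPLE_RATE)

zcr_low = zero_crossing_rate(low)
zcr_high = zero_crossing_rate(high)

# Frame-wise ZCR, the form in which such a feature would accompany a spectrogram.
frame_zcrs = [zero_crossing_rate(f) for f in frame_signal(high, 1024, 512)]
```

For a pure tone the rate is roughly twice the frequency divided by the sampling rate, so the 1000 Hz tone yields a value about ten times that of the 100 Hz tone. Noise-like artefacts tend to push the rate up, which is one reason it is a common companion feature.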

Authors and Affiliations

Kamila Organiściak

Józef Borkowski

Non-intrusive method for audio quality assessment of lossy-compressed music recordings using convolutional neural networks

Aleksandra Kasperuk Sławomir Krzysztof Zieliński

International Journal of Electronics and Telecommunications | 2024 | vol. 70 | No 2 | 331-339 | DOI: 10.24425/ijet.2024.149549

Keywords objective audio quality assessment non-intrusive audio quality evaluation convolutional neural networks

Abstract

Most of the existing algorithms for objective audio quality assessment are intrusive, as they require access both to an unimpaired reference recording and to the evaluated signal. This feature excludes them from many practical applications. In this paper, we introduce a non-intrusive audio quality assessment method. The proposed method is intended to account for audio artefacts arising from the lossy compression of music signals. During its development, 250 high-quality uncompressed music recordings were collated. They were subsequently processed using a selection of five popular audio codecs, resulting in a repository of 13,000 audio excerpts representing various levels of audio quality. The proposed non-intrusive method was trained with the data obtained employing a well-established intrusive model (ViSQOL v3). Next, the performance of the trained model was evaluated utilizing the quality scores obtained in subjective listening tests undertaken remotely over the Internet. The listening tests were carried out in compliance with the MUSHRA recommendation (ITU-R BS.1534-3). In this study, the following three convolutional neural networks were compared: (1) a model employing 1D convolutional filters, (2) an Inception-based model, and (3) a VGG-based model. The last-mentioned model outperformed the model employing 1D convolutional filters in terms of predicting the scores from the listening tests, reaching a correlation value of 0.893. The performance of the Inception-based model was similar to that of the VGG-based model. Moreover, the VGG-based model outperformed the method employing a stacked gated-recurrent-unit-based deep learning framework, recently introduced by Mumtaz et al. (2022).
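
The abstract reports agreement between predicted and listening-test scores as a correlation value but does not say which coefficient was used. Assuming the Pearson coefficient, a common choice for this kind of evaluation, the computation can be sketched as below. The score lists are invented placeholders for illustration only and are not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder scores on a 0-100 scale (not results from the paper).
listening_scores = [22.0, 41.0, 55.0, 68.0, 90.0]
predicted_scores = [30.0, 38.0, 60.0, 64.0, 85.0]
r = pearson(listening_scores, predicted_scores)
```

A value near 1 means the model ranks and spaces the excerpts much as listeners do. It says nothing about a constant offset between the two scales, so an error measure is often reported alongside it.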

Authors and Affiliations

Aleksandra Kasperuk

1

Sławomir Krzysztof Zieliński

1

Faculty of Computer Science, Białystok University of Technology, Poland
