I am pleased to submit the report titled “Evaluation of Emergency Audio Calls.” It gives an overview of the project I worked on with Dr. Howard Hamilton during my third work term, the Spring/Summer 2021 co-op term. This project allowed me to apply my programming skills in a practical setting and gave me valuable experience in debugging and problem-solving.
The project's objective is to analyze audio files containing emergency phone conversations. The work involves transcribing hundreds of audio files, each of which can be up to 30 minutes long. We then use natural language processing (NLP) tools for syntactic and semantic analysis to train a model that recognizes whether the operator followed the required protocol. We used established protocols as the reference when examining the generated transcripts and analyzing the emergency calls.
The document enclosed is an independent report created solely by me. The report is based on the research and development I was involved in under Dr. Hamilton's guidance.
Sincerely,
Chaitanya Patel
Research Assistant
Table of contents
Executive Summary
Report
Introduction
Background
Methodology
Conclusion
Executive Summary
Manually analyzing every emergency audio file to verify whether the operator followed the required protocol is a time-consuming and labor-intensive task. There is therefore a need for a software system that can evaluate the audio calls and determine whether the protocol was followed.
Our method uses DeepSpeech, an open-source speech-to-text engine created by the non-profit organization Mozilla. DeepSpeech is based on Baidu's Deep Speech research and can be used to build a customized speech recognition system. To start, we used the default DeepSpeech speech recognition system to transcribe the audio files, automatically converting the WAV files into preliminary transcriptions. The default system was designed to perform best on audio recorded in low-noise environments, but it works moderately well with recordings of telephone conversations.
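The transcription step can be sketched as follows. This is a minimal illustration rather than the project's actual code: the helper names `read_pcm16_mono` and `transcribe` are hypothetical, and the sketch assumes the `deepspeech` and `numpy` Python packages are installed and that a pretrained model file is available locally. It also assumes the recordings are 16-bit mono WAV files at the model's sample rate; telephone recordings may need to be resampled first.

```python
import wave


def read_pcm16_mono(path):
    """Return (sample_rate, raw_bytes) for a 16-bit mono PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError(f"{path}: expected 16-bit mono PCM audio")
        return wav.getframerate(), wav.readframes(wav.getnframes())


def transcribe(wav_path, model_path):
    """Produce a preliminary transcription of one WAV file (hypothetical helper)."""
    # Imported here so the WAV helper above works without these packages.
    import numpy as np
    from deepspeech import Model

    model = Model(model_path)  # path to a pretrained DeepSpeech model file
    rate, raw = read_pcm16_mono(wav_path)
    if rate != model.sampleRate():
        raise ValueError(
            f"{wav_path}: {rate} Hz audio, model expects {model.sampleRate()} Hz"
        )
    # DeepSpeech expects the audio as a 1-D array of 16-bit integer samples.
    audio = np.frombuffer(raw, dtype=np.int16)
    return model.stt(audio)
```

Calling `transcribe("call_001.wav", "model.pbmm")` (both file names are placeholders) would return the preliminary transcription as a plain string, which can then be saved for the topic-modeling step.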
Next, the Idmon-1 software extracts topic-related words, or keywords, from the preliminary transcriptions by topic modeling. Three types of topic modeling are used: word lists, latent Dirichlet allocation (LDA), and hierarchical LDA (HLDA). Users can compare and rank the preliminary transcriptions, and Idmon-1 provides a list of the audio files in order of relevance.
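To make the word-list approach and the relevance ranking concrete, here is a small sketch. It does not reproduce Idmon-1, whose internals are not described in this report; the functions `keyword_score` and `rank_transcripts` are hypothetical, and the scoring rule (the fraction of tokens that appear in the word list) is an assumption chosen for simplicity.

```python
import re
from collections import Counter


def keyword_score(transcript, word_set):
    """Fraction of tokens in the transcript that appear in the word list."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    hits = sum(counts[word] for word in word_set)
    return hits / len(tokens)


def rank_transcripts(transcripts, word_list):
    """Return (file_name, score) pairs sorted from most to least relevant."""
    word_set = {word.lower() for word in word_list}
    scored = [
        (name, keyword_score(text, word_set))
        for name, text in transcripts.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Example with made-up transcripts and a made-up topic word list.
transcripts = {
    "call_a.wav": "there is a fire in the building",
    "call_b.wav": "hello how are you",
}
ranking = rank_transcripts(transcripts, ["fire", "building"])
```

A word list is the simplest of the three options because the topic words are supplied by hand. LDA and HLDA instead infer the topic words from the transcriptions themselves, so they would replace the fixed `word_list` argument in a sketch like this one.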
In conclusion, this report outlines a methodology for evaluating emergency phone conversations that can be further improved using machine learning. Our method provides a time-efficient and accurate way to assess emergency calls and check that the protocol is being followed.