
Introduction to Web Audio API

The first post in our Web Audio series, this blog introduces you to the Web Audio API, including its use cases, complexities, and the concepts behind it.

By Madhu Balakrishna · Mar. 31, 23


A critical part of WebRTC is the transmission of audio. The Web Audio API is all about processing and synthesizing audio in web applications. It allows developers to build complex audio processing and synthesis pipelines using a set of high-level JavaScript objects and functions. The API can be used to create a wide range of audio applications, such as music and sound effects in games, interactive audio in virtual reality, and more.

Let's take a look at the various concepts behind the Web Audio API.

Capture and Playback Audio

The Web Audio API provides several ways to capture and play back audio in web applications.

Here's an example of how to capture audio using the MediaStream API and play it back using the Web Audio API:

First, we need to request permission to access the user's microphone by calling navigator.mediaDevices.getUserMedia().

TypeScript
 
navigator.mediaDevices.getUserMedia({audio: true})
  .then(stream => {
    // The stream variable contains the audio track from the microphone
  })
  .catch(err => {
    console.log('Error getting microphone', err);
  });


Next, we create an instance of the Web Audio API's AudioContext object. We can then create a MediaStreamAudioSourceNode by passing the MediaStream object to the audioCtx.createMediaStreamSource() method.

TypeScript
 
const audioCtx = new AudioContext();
const source = audioCtx.createMediaStreamSource(stream);


Once we have the source, we can then connect the source node to the audio context's destination node to play the audio.

TypeScript
 
source.connect(audioCtx.destination);


Note that AudioContext has no start() method; the context processes audio as soon as it is running. However, browsers often create the context in the suspended state until the user interacts with the page. Calling the resume() method ensures the context is running, at which point the captured microphone audio is played back through the speakers.

TypeScript
 
audioCtx.resume();


Autoplay

Browsers handle web audio autoplay in different ways, but in general, they have implemented policies to prevent unwanted audio from playing automatically. This is to protect users from being surprised by unwanted audio and to prevent abuse of the autoplay feature.

  • Chrome, Edge, Firefox, and Safari have implemented a "muted autoplay" policy, which allows autoplay of audio only if the audio is muted, or if the user has previously interacted with the website.
  • Safari goes further by requiring a user interaction (e.g., a click or tap) before allowing audio to play.
  • Firefox has the option to set audio autoplay with sound disabled by default, and the user needs to interact with the website to allow audio playback.

Developers can call the play() method on a media element to initiate audio playback. play() returns a promise that is rejected if the browser's autoplay policy blocks playback, which typically happens when the user has not yet interacted with the page.
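
Here's a minimal sketch of handling a blocked play() call (the audio file name is a hypothetical placeholder):

TypeScript
 
const audio = new Audio('clip.mp3'); // hypothetical file, for illustration

audio.play().catch((err) => {
  // Autoplay was blocked; surface a play button and retry on a user gesture
  console.log('Playback blocked:', err);
});
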

Also, the Web Audio API provides the AudioContext.resume() method, which can be used to resume audio playback after the browser has suspended it. This is useful when the context was created, and therefore suspended, before any user interaction; once the user interacts with the page, you can resume it.
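
A common pattern (a minimal sketch; the button id is a hypothetical placeholder) is to resume the context from a user-gesture handler:

TypeScript
 
const audioCtx = new AudioContext(); // may start in the 'suspended' state

// Hypothetical unmute button; any user gesture (click, tap, key press) works
document.getElementById('unmute')?.addEventListener('click', async () => {
  if (audioCtx.state === 'suspended') {
    await audioCtx.resume(); // allowed because we are inside a user gesture
  }
});
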

Overall, to ensure that web audio autoplay works as expected, it's important to understand the different browsers' policies and provide a clear user interface that allows users to control audio playback.

WebRTC Call Quirks

Other than the autoplay restrictions described above, there are a few quirks specific to Web Audio when using it in WebRTC calls.

  • Safari will not let you create new <audio> tags when the tab is in the background, so when a new participant joins your meeting, you cannot create a new audio tag for them.
  • WebRTC echo cancellation does not work with the AudioContext API on Chromium.
  • You can create one <audio> tag and add all AudioTracks to a common stream, but every time you add a new track, Safari requires you to call play() again, while Chromium requires you to set srcObject again (see the sketch after this list).
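
Here is a minimal sketch of that single-<audio>-tag pattern (the isSafari check is a deliberately crude assumption, for illustration only):

TypeScript
 
const audioEl = document.createElement('audio');
const mixedStream = new MediaStream();
audioEl.autoplay = true;
audioEl.srcObject = mixedStream;

// Crude Safari detection, for illustration only
const isSafari = /^((?!chrome|android).)*safari/i.test(navigator.userAgent);

function addRemoteTrack(track: MediaStreamTrack) {
  mixedStream.addTrack(track);
  if (isSafari) {
    audioEl.play();                  // Safari: call play() again
  } else {
    audioEl.srcObject = mixedStream; // Chromium: set srcObject again
  }
}
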

[Image: Autoplay policy in Chrome]

Codecs

The Web Audio API is designed to work with a variety of audio codecs. Some of the most common codecs that are supported by web browsers include:

  • PCM: Pulse-code modulation (PCM) is a digital representation of an analog audio signal. It is uncompressed and lossless, so no audio quality is lost, at the cost of large file sizes. PCM is the most basic and widely supported audio format on the web.
  • MP3: MPEG-1 Audio Layer 3 (MP3) is a widely used lossy audio codec known for its high compression ratio and good audio quality. It is supported by virtually all modern web browsers.
  • AAC: Advanced Audio Coding (AAC) is a lossy audio codec that is known for its high audio quality and low bitrate. It is supported by most web browsers, but not all.
  • Opus: Opus is a lossy codec that is designed for low-latency, high-quality, and low-bitrate audio; it's designed to work well on the internet, and it is supported by all modern browsers.
  • WAV: Waveform Audio File Format (WAV) is a lossless audio codec that is widely supported by web browsers. It is commonly used for storing high-quality audio files, but it has a larger file size than other codecs.
  • Ogg: Ogg is an open-source container format for digital multimedia. It is supported by most web browsers and is often used as a container for the Vorbis and Opus codecs.
  • Vorbis: Vorbis is an open-source and patent-free lossy audio codec that is known for its high audio quality and low bitrate. It is supported by most web browsers, but not all.

Using codecs that are widely supported by web browsers will ensure that the audio content can be played across different devices and platforms.
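
To pick a codec at runtime, you can probe what the current browser can play; here is a small sketch using HTMLMediaElement.canPlayType():

TypeScript
 
const probe = document.createElement('audio');

// canPlayType() returns '', 'maybe', or 'probably'
const support = {
  mp3: probe.canPlayType('audio/mpeg'),
  aac: probe.canPlayType('audio/mp4; codecs="mp4a.40.2"'),
  opus: probe.canPlayType('audio/ogg; codecs="opus"'),
  vorbis: probe.canPlayType('audio/ogg; codecs="vorbis"'),
  wav: probe.canPlayType('audio/wav'),
};

console.log(support); // prefer a format reported as 'probably'
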

Permissions

To handle various web audio permission issues, you can use the Permissions API and the MediaDevices.getUserMedia() method to request access to the microphone or camera.

Here's an example of how to request microphone permission and handle the various permission states:

TypeScript
 
navigator.permissions.query({name: 'microphone'}).then(function(permissionStatus) {
    handleState(permissionStatus.state);      // check the current state
    permissionStatus.onchange = function() {  // react to later changes
        handleState(permissionStatus.state);
    };
});

function handleState(state) {
    if (state === 'granted') {
        // Access to microphone granted
        // create an audio context and access microphone
    } else if (state === 'denied') {
        // Access to microphone denied
        // handle denied permission
    }
    // 'prompt' means the user has not decided yet
}


For the MediaDevices.getUserMedia() method, you can use the catch method to handle errors and implement fallbacks:

TypeScript
 
navigator.mediaDevices.getUserMedia({ audio: true })
    .then(function(stream) {
        // Access to microphone granted
        // create an audio context and access microphone
    })
    .catch(function(error) {
        console.log('Error occurred:', error);
        // handle denied permission or other errors
    });


You can also check for browser support for navigator.permissions.query() and navigator.mediaDevices.getUserMedia() before calling them.
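
A simple feature-detection sketch:

TypeScript
 
if (navigator.permissions && typeof navigator.permissions.query === 'function') {
  // Permissions API is available
}

if (navigator.mediaDevices && typeof navigator.mediaDevices.getUserMedia === 'function') {
  // getUserMedia is available
} else {
  // Fall back or inform the user that audio capture is unsupported
}
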

In addition to handling permission issues, it's important to provide clear instructions to users on how to grant permission and to make sure that the website's functionality doesn't break if permission is denied or if the Web Audio API is not supported by the browser.

Audio Processing

Audio processing is the manipulation of audio signals using signal-processing techniques. It is used in a wide range of applications, such as music production, audio effects, noise reduction, speech processing, and more.

There are two broad types of processing we can apply to audio: frequency-domain (e.g., filtering and equalization) and time-domain (e.g., gain changes and delays).

We can add processing nodes to the audio processing graph, such as a gain node to control the volume or a filter node to change the frequency response of the audio.

TypeScript
 
const gainNode = audioCtx.createGain();   // node that scales the signal's amplitude
gainNode.gain.value = 0.5;                // e.g., halve the volume
source.connect(gainNode);                 // source -> gain -> destination
gainNode.connect(audioCtx.destination);
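
Similarly, a filter node shapes the frequency response; here is a minimal sketch using a low-pass BiquadFilterNode:

TypeScript
 
const filterNode = audioCtx.createBiquadFilter();
filterNode.type = 'lowpass';        // attenuate frequencies above the cutoff
filterNode.frequency.value = 1000;  // cutoff at 1 kHz
source.connect(filterNode);
filterNode.connect(audioCtx.destination);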


We will cover more specific audio processing use cases in the future.

Examples

Here are a few examples of the Web Audio API use cases:

Voice Chat and Conferencing

Web Audio API allows you to capture audio from a user's microphone and process it in real time. This can be used to build voice chat and conferencing applications like Dyte that run directly in the browser.

Voice Recognition

Web Audio API can be used to process audio input from a user's microphone and analyze it to recognize speech. This can be used to create voice-controlled interfaces for web applications.

Visualizations

Web Audio API can be used to generate data from the audio input. This data can be used to create various visualizations. For example, a music player application could use the Web Audio API to generate a visualization of the frequency spectrum of the currently playing song.
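
For instance, an AnalyserNode exposes the spectrum of whatever audio flows through it; a minimal sketch:

TypeScript
 
const analyser = audioCtx.createAnalyser();
analyser.fftSize = 256;                 // controls frequency resolution
source.connect(analyser);

const data = new Uint8Array(analyser.frequencyBinCount);

function draw() {
  analyser.getByteFrequencyData(data);  // fill `data` with the current spectrum
  // Render `data`, e.g., draw bars on a <canvas>
  requestAnimationFrame(draw);
}
draw();
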

Music and Sound Effects in Games

Web Audio API can be used to create interactive audio experiences in browser-based games. Developers can use the API to play background music and sound effects and even generate audio on the fly based on game events.

Music and Audio Editing

Web Audio API provides a powerful set of tools for manipulating audio, including filtering, mixing, and processing. This allows developers to create web-based audio editing tools that can be used to record, edit, and export audio.

Conclusion

In this blog post, we covered the basics of Web Audio transmission and the concepts around it in the context of WebRTC. There is more to cover on this topic, and we will publish follow-up posts in the coming weeks. Stay tuned.


Published at DZone with permission of Madhu Balakrishna. See the original article here.

Opinions expressed by DZone contributors are their own.
