Embedding sound effects in web apps

I've had two occasions in my life to embed sound effects into web pages, one for fun and one for work. This is an attempt at documenting what I did and how I did it. I make no guarantees this will work consistently since the web landscape changes quite frequently and quite drastically.

Kaiyo is a marketplace for quality used furniture. We help buyers find designer pieces at affordable prices and help sellers get rid of their furniture. We have a small technical team doing everything from programming QR code printers and e-commerce sites to writing logistics software to manage our teams on the ground and on the road.

In order to track our inventory, the team on site scans a whole lot of things in their day to day. We noticed that our crew would get so absorbed in their task that they would sometimes ignore the result of these scans and either miss items or bypass resulting errors. We decided to add audio feedback to our web app in an attempt to maintain the pace while giving a quick notification that doesn't require reading the screen.

When we decided to add sound effects to our app, I went back to the work I had done making sound effects for Lost Notes. The site initially had nothing to do with sound effects but I really wanted to mimic the feel of a typewriter using sound clips. I ended up spending most of the coding time playing with the audio elements.

Avoid lag by pre-loading the sound effects

Pre-loading is not a good strategy if your app requires a large number of sound effects or if they are large in size (due to duration or encoding). The idea of this technique is to pre-load sound effects the same way images are often pre-loaded, so that they are ready as soon as you scroll or interact with the page.

There are two parts to pre-loading sound effects:

  1. encode the audio clip to be loaded with your page (ideally)
  2. actually load that clip into an audio element

Since interaction is typically not a problem immediately after loading the page, you may be able to get away with loading the sound effects from a URL instead of embedding base64-encoded files; it all depends on your constraints. Embedding can still be practical in some situations, for example to ship a single HTML file with no dependencies.
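As a rough sketch of the URL-based variant, the helper below creates one Audio element per clip at page load so the files are requested before the first interaction. The names preloadSounds and createAudio and the file paths are assumptions made for illustration; createAudio only exists so the function can be exercised outside a browser.

```javascript
// Hypothetical helper: creates one audio element per named clip up front.
// `createAudio` defaults to the browser's Audio constructor.
function preloadSounds(urls, createAudio = (src) => new Audio(src)) {
  const sounds = {};
  for (const [name, url] of Object.entries(urls)) {
    const audio = createAudio(url);
    // Hint that the browser may fetch the whole clip ahead of time.
    audio.preload = "auto";
    sounds[name] = audio;
  }
  return sounds;
}

// In a browser (paths are placeholders):
//   const sounds = preloadSounds({ kick: "kick.mp3" });
//   sounds.kick.play();
```

The preload attribute is only a hint, so browsers (mobile ones in particular) may still defer the download.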

Supported formats, supported browsers

Most modern browsers are capable of playing sound; however, they do not all support the same file formats. The uncompressed WAV format is the reference in terms of compatibility because of the simplicity of its implementation, but since the format is uncompressed, files are rather large. The next best thing in terms of compatibility happens to be the MP3 format, which also compresses quite well. Your mileage may vary, so make sure you test on your target platforms.
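One way to check support at runtime is the canPlayType() method of media elements, which returns an empty string when the browser knows it cannot play a given MIME type. The sketch below picks the first playable candidate; the function name pickSupportedSource and the candidate list are assumptions for illustration.

```javascript
// Hypothetical helper: returns the URL of the first candidate the probe can play.
// `probe` is any object exposing canPlayType(), e.g. `new Audio()` in a browser.
function pickSupportedSource(probe, candidates) {
  for (const candidate of candidates) {
    // canPlayType() returns "", "maybe" or "probably".
    if (probe.canPlayType(candidate.type) !== "") {
      return candidate.url;
    }
  }
  return null; // nothing playable was found
}

// In a browser (file names are placeholders):
//   const url = pickSupportedSource(new Audio(), [
//     { type: "audio/mpeg", url: "kick.mp3" },
//     { type: "audio/wav", url: "kick.wav" },
//   ]);
```

A "maybe" answer is not a guarantee, so this complements testing on the target platform rather than replacing it.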

Step-by-step instructions

  1. Convert the file to your target format; the following examples assume MP3. If you need to do some conversion, I recommend FFmpeg, which you can install on a Mac with Homebrew via brew install ffmpeg.
    $ ffmpeg -i kick.wav kick.mp3
  2. Base64 encode that file using your trusty Unix tools:
    $ base64 -i kick.mp3 -o kick.mp3.b64
  3. Put that base64-encoded content in a string passed to a new Audio element, prefixed with data:audio/mpeg;base64, (with the comma):
    var encodedSound = "data:audio/mpeg;base64,//uQxAAAAAAAAAAAAAAAAAAAA[snip]";
    var audio = new Audio(encodedSound);
  4. Play the sound when appropriate:
    audio.play();
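If the encoding is part of a build step, steps 2 and 3 could also be done in a few lines of Node.js instead of by hand. This is only a sketch: the function name toDataUri is made up for illustration, and it assumes a Node.js environment where Buffer is available.

```javascript
// Hypothetical build-time helper (Node.js): turns raw audio bytes into a
// data URI that can be passed to `new Audio(...)` in the browser.
function toDataUri(bytes, mimeType = "audio/mpeg") {
  const base64 = Buffer.from(bytes).toString("base64");
  // The comma separates the header from the base64 payload.
  return `data:${mimeType};base64,${base64}`;
}

// In a build script, with the file from step 1:
//   const fs = require("fs");
//   const encodedSound = toDataUri(fs.readFileSync("kick.mp3"));
```

The resulting string can then be written into the page in place of the hand-pasted value from step 3.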

You can download kick.wav, kick.mp3 and kick.mp3.b64 if you want to play with the snippets on this page with the same file I used.

Playing multiple sounds concurrently

Since a single player cannot play more than one sound at a time, if you intend to play multiple sounds in parallel, you will have to create one Audio element for each sound bite. Check the source code of this page to see the typewriter implementation and how I had to set up multiple concurrent Audio elements for the same sound and rotate through them as the user types on their keyboard.

Try clicking a few times quickly: you will see that the sound plays fully and consecutively instead of playing multiple times at once. Compare it with the TEXTAREA above, where you can type quickly and hear multiple keystrokes at the same time (set to 4 concurrent elements for this demo).
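The rotation idea can be sketched roughly as follows. This is not the code used on this page, only a minimal illustration; createSoundPool and createAudio are hypothetical names, and the default of 4 players mirrors the demo setting mentioned above.

```javascript
// Hypothetical helper: a small pool of audio elements for the same clip,
// used round-robin so that rapid triggers can overlap.
function createSoundPool(src, size = 4, createAudio = (s) => new Audio(s)) {
  const players = Array.from({ length: size }, () => createAudio(src));
  let next = 0;
  return {
    play() {
      const player = players[next];
      next = (next + 1) % players.length; // move on to the next player
      player.currentTime = 0; // restart if this player is still mid-clip
      return player.play();
    },
  };
}

// In a browser (the clip path and textarea variable are placeholders):
//   const keystroke = createSoundPool("kick.mp3");
//   textarea.addEventListener("keydown", () => keystroke.play());
```

With more simultaneous triggers than players, the oldest sound is cut off and restarted, so the pool size is a trade-off between memory and how many overlaps you want to hear.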

iOS, Safari and the user interaction requirement

On iOS and particularly on Safari, you will not be able to play these sounds without a prior user interaction (e.g. a click). If the sounds should play at the conclusion of an Ajax request or after a delay of some sort, you will not be able to call .play(). A way to work around that issue is to call .play() on a previous interaction and pause immediately, which has the effect of telling the OS/browser that the sound is indeed allowed to play. Any call to play after the first one will be allowed.
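A minimal sketch of that workaround is shown below, assuming the audio elements already exist when the first interaction happens. The function name unlockOnFirstInteraction is hypothetical, and whether the trick works depends on the browser and its version, so treat it as a starting point to test on real devices.

```javascript
// Hypothetical helper: on the first interaction with `target`, briefly start
// and pause every audio element so that later play() calls are allowed.
function unlockOnFirstInteraction(target, audios, eventName = "click") {
  function unlock() {
    for (const audio of audios) {
      const result = audio.play();
      audio.pause();
      // Recent browsers return a promise that rejects when playback is
      // interrupted by pause(); swallow that expected rejection.
      if (result && typeof result.catch === "function") {
        result.catch(() => {});
      }
    }
  }
  // `once: true` removes the listener after the first event.
  target.addEventListener(eventName, unlock, { once: true });
}

// In a browser (the variable name is a placeholder):
//   unlockOnFirstInteraction(document, [audio]);
```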

Last updated: 2019-08-18 18:56:15 EDT