Wallpaper Engine

Audio Processing Tips for Web Wallpapers
By Squee
A guide based on my experience with web wallpapers and audio processing for them. It includes some tips on dealing with the difference in frame rate between the audio data and the rendered frames, as well as some tips on processing your audio data.
   
Introduction
Read this first!

This guide is for people who want to make a web wallpaper that reacts to audio and who already know the basics. It is mostly based on techniques I have used so far in my own wallpapers, and they might not be things you immediately think about. You don't have to use everything in here, but each technique might help in certain situations.

I have really skimmed through it all and tried to keep it short. If you want more details or explanation on a certain section, please ask and I will try to expand on that subject. I don't feel like writing screens full of text if nobody is going to read it or be interested :D

Feel free to comment on my horrible explanations or spelling mistakes too.

Hope it makes sense; any feedback is welcome! Writing is not my best skill, unless it's writing sloppy code, so just post your question in the comment section if I was not clear enough and I will try to clarify.

Chapters
  • When testing in a browser - generating fake audio data
  • Normalizing the data - adjusting to different volume situations
  • Calibrating the data - fixing the response for different frequencies
  • 30 FPS - basically the audio data frame rate
  • Multiple audio frames within one render frame - touching on the fact that multiple web wallpapers running at the same time can cause FPS loss, which has to be considered
  • Frequency scale does not match octaves scale - frequency test results
  • Motion Blur: A way to avoid flickering - an idea to deal with big value differences between audio frames
  • Smoothing of the current audio frame - a type of "blur" effect on the data
  • Sqrt & Power! - adjusting values to emphasize peaks
  • Advanced audio processing settings ( only works locally ) - some secrets for testing locally
Example Wallpaper / Related Code
I've set up a wallpaper demonstrating several of the things mentioned in this guide. If you can see it visually, it might make a bit more sense :)

The current version demonstrates:
- Normalizing ( you can adjust the volume in the settings to test this )
- Smoothing
- Motion Blur ( it's not really motion blur, but it's a good indication of what it does )
- Sqrt/Pow effects
- Calibration based on pink noise

Wallpaper can be found here: http://steamproxy.net/sharedfiles/filedetails/?id=843457589
What you should already know
Skip this part if you want, but I will touch on some basics first.

WE gives us the FFT data in an array of 128 floating point values. Values 0-63 are the left audio channel and 64-127 are the right channel, so each channel has 64 values. The frequencies are sorted from low to high. The frequency scale goes from about 50hz to 22khz in ( I believe, but will verify later ) a nearly linear fashion. Emphasis on "nearly"; I can't pinpoint what the exact scale is, to be honest :)

The basics of hooking into the audio data handler:
if( window.wallpaperRegisterAudioListener ) {
    window.wallpaperRegisterAudioListener( function( data ) {
        /* data is an array with 128 floats */
    });
}

For more details:
http://steamproxy.net/sharedfiles/filedetails/?id=786006047
When testing in a browser
For testing purposes it can be useful to test in Chrome, but there you won't have any audio data. Below are 2 options for testing in the browser: one uses a local mp3 file, the other is an example of how to generate random noise as data.


Playing music for testing
This is a really nice solution written by Steam user ClassicOldSong. The code can be found at: https://github.com/ClassicOldSong/WPE-Audio-Simulator

The code includes a small audio player in the top left of the page as well as a polyfill for the audio events. You can just include the script in your wallpaper, so it's a no-hassle "install". Download, include, done.


Just random test data
This bit of code fakes the function supplied by WE and simulates some random data. It does not really represent audio, only noise, but it will help show your data while working in a web browser.

You can also replace the Math.random() call with some sine wave generator.

/* must be before you register the audio listener */
if( !window.wallpaperRegisterAudioListener ) {
    var wallpaperAudioInterval = null;
    window.wallpaperRegisterAudioListener = function( callback ) {
        if( wallpaperAudioInterval ) {
            // clear the older interval
            clearInterval( wallpaperAudioInterval );
            wallpaperAudioInterval = null;
        }
        // set new interval
        var data = [];
        wallpaperAudioInterval = setInterval( function() {
            for( var i = 0; i < 64; i++ ) {
                var v = Math.random() * 1.5; // real data can be above 1 as well
                data[i] = v;      // left channel
                data[i + 64] = v; // right channel
            }
            callback( data );
        }, 33 ); // wallpaper engine gives audio data back at about 30fps, so 33ms it is
    };
}
Normalizing the data
One of my best tips is to normalize your data. This does not just mean amplifying the output, but adjusting the data to low and high volume situations to create a more constant output ( in our case, graphic effects ). This really helps with low volume situations, which might otherwise barely register.

The basic idea behind normalizing is to take the past peak/max values and use them to adjust the values to fit within the 0-1 range. We don't have to consider the low end of our range, as that is always 0.

The following code demonstrates a very basic normalization method. I have not tested this specific code, however.

var peakValue = 1;
window.wallpaperRegisterAudioListener( function( audioData ) {
    var max = 0, i;
    // find the max value for the current frame
    for( i = 0; i < 128; i++ ) {
        if( audioData[i] > max ) max = audioData[i];
    }
    // adjust the ratios to how fast or slow you want normalization to react to volume changes
    peakValue = peakValue * 0.99 + max * 0.01;
    // normalize the values
    for( i = 0; i < 128; i++ ) {
        audioData[i] /= peakValue;
    }
});


Now, if you want to go more advanced, you can consider the following setup:
1. Remember the last 200 peak values, which is about 6 seconds of history.
2. Sort them.
3. Only use the average of values 20-180, ignoring the bottom 20 and top 20.

This will trim off any extremes that might occur in the data and find a more accurate average. Feel free to raise the amount you trim off, but leave enough values to average ;) This type of trimming is based on the bell-shaped curve. https://en.wikipedia.org/wiki/Standard_deviation#/media/File:Standard_deviation_diagram.svg
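Steps 1-3 could be sketched like this ( my own untested reconstruction, not the code from my wallpapers; the names are illustrative ):

```javascript
// Trimmed-average peak tracking: keep the last 200 per-frame max values,
// sort a copy, drop the bottom 20 and top 20, and average the rest.
var peakHistory = [];

function trimmedPeak( frameMax ) {
    peakHistory.push( frameMax );
    if( peakHistory.length > 200 ) peakHistory.shift(); // ~6 seconds at 30fps

    // sort a copy so the history itself keeps its time order
    var sorted = peakHistory.slice().sort( function( a, b ) { return a - b; } );

    // only trim once we have enough history to spare 40 values
    var trim = sorted.length > 40 ? 20 : 0;
    var sum = 0, count = 0;
    for( var i = trim; i < sorted.length - trim; i++ ) {
        sum += sorted[i];
        count++;
    }
    return count > 0 ? sum / count : 1;
}
```

You would then divide each audio value by trimmedPeak( max ) instead of the simple peakValue from the basic example.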

Calibrating the data ( with pink noise )
Ok, I might make some mistakes here as I am not really a sound engineer ;) But I believe I am fairly accurate; correct me if I am wrong.

I have done some testing with several different types of noise, as I kept seeing a certain pattern in the long-term average of the data. The long-term average on almost any type of music showed that the data we get from Wallpaper Engine has a tendency to peak at low and high notes, even when the music used a wide spectrum of different tones/notes.

After some testing with white/pink/brown noise, which should carry all frequencies at the same level ( or balanced in some way ), I decided to record the average values using pink noise and create a function to compensate for this curve.

The idea was to get an array of values I can use to compensate for the curve I saw in the data, so that all values respond equally. What I did was play pink noise ( which, from what I understand, is what is used to calibrate EQs ) and measure the average of the values over 20 seconds. When dividing the audio data by this array of averages, thus compensating for the curve, the values seem a lot more balanced, and with many songs the response seems a lot more accurate.

Below are the results of some tests ( 20 second averages, left and right channels next to each other in the same graph ). The left hand side is the unprocessed data ( only scaled to fit between the two bars ); the right hand side has been corrected with the code below. At the top you can see the curve I was talking about: the left graph under pink noise has a low point in the data, which also occurs with music, and you can see a bit of that curve in all the unprocessed results. Look at the results of the different songs and you will notice the same type of result, while on the right hand side the "corrected" values are more evenly balanced and the highest, almost inaudible frequencies have been cut off ( or were already cut off in the recording, like the Metallica one ).

A fun little detail is that the Ophidian song, which many people would consider noise rather than music, is the one most closely related to the results of noise. So the data does seem adjusted to perception ;)

// I have left the results as I measured them, but if you use them you might want to pay attention to the first value.
// It represents the correction for 26hz. For some reason it feels like that value should be lower, as that index usually has very low values.
// Just something to be aware of.
var pinkNoise = [ /* the 64 measured averages ( values not reproduced here ) */ ];

function correctWithPinkNoiseResults( data ) {
    for( var i = 0; i < 64; i++ ) {
        data[i] /= pinkNoise[i];      // left channel
        data[i + 64] /= pinkNoise[i]; // right channel
    }
    return data;
}
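For completeness, here is a sketch of how such a calibration array could be recorded. This is my own untested reconstruction, not my original measurement code; you would run it while playing pink noise for ~20 seconds ( about 600 audio frames at 30fps ) and then save the result.

```javascript
// Accumulate per-index averages over many audio frames, combining the
// left (0-63) and right (64-127) channels into one 64-value array.
var calibSums = new Array( 64 ).fill( 0 );
var calibFrames = 0;

function accumulateCalibration( audioData ) {
    for( var i = 0; i < 64; i++ ) {
        calibSums[i] += ( audioData[i] + audioData[i + 64] ) / 2;
    }
    calibFrames++;
}

// After ~20 seconds of pink noise, this gives the array to divide by.
function getCalibration() {
    return calibSums.map( function( s ) { return s / calibFrames; } );
}
```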
30 FPS
You will receive audio data at ~30fps, but your wallpaper might run at a different FPS. For that reason, be sure to:
  1. remember old data, or your next frame might not have any audio data to show.
  2. in some cases you might want to average your data if you received 2 frames of audio data in 1 render frame.
  3. be careful with designs that make stuttering noticeable. My "the pulse" wallpaper is a good example where microstuttering can be rampant and overly noticeable to users. It is not always avoidable, however, so don't hesitate to publish something, as there are probably still people out there who will like it.

    If you want to see the instability in action, try my "the pulse" wallpaper and turn on the FPS & V-Sync settings at the bottom of the settings ( the bottom two ). The code itself only costs 1-2fps of CPU time, but you will still see the frame drops happening. It is unavoidable when working with a web wallpaper.

    There is also a good chance it will only run at 15fps when 2 web wallpapers run at the same time, a problem that seems to lie within CEF itself.
Multiple audio frames within one render frame
Rendering happens at a different FPS than the one at which you receive audio data, and CEF does not guarantee 60 or even 30fps; rendering at 15fps is almost guaranteed when running 2 web wallpapers next to each other. So if you receive multiple audio data frames within a single render frame, be sure to decide what you really want: the average of those frames, the highest value, maybe even the lowest, or perhaps just the latest data, ignoring everything else. Just be sure you consider the different options.
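One way to sketch this ( untested, names are my own; this version averages, but max/min/latest are equally valid choices as noted above ):

```javascript
// Collect audio frames between renders, then combine them on render.
var pendingFrames = [];

function onAudio( audioData ) {
    // copy the frame, as the source array may be reused by the caller
    pendingFrames.push( audioData.slice() );
}

// Call this once per render frame.
function combinePendingFrames() {
    if( pendingFrames.length === 0 ) return null; // keep showing the old data
    var combined = new Array( 128 ).fill( 0 );
    for( var f = 0; f < pendingFrames.length; f++ ) {
        for( var i = 0; i < 128; i++ ) combined[i] += pendingFrames[f][i];
    }
    for( var j = 0; j < 128; j++ ) combined[j] /= pendingFrames.length;
    pendingFrames = [];
    return combined;
}
```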
Frequency scale does not match octaves scale
This is somewhat less relevant after recent changes, but I am leaving it in for consideration purposes. I will post more results later. It is about the ( currently small ) difference between the frequency curve of the audio data as we get it and the octave scale. Recent changes have made these very similar to each other.

The scale below is based on using a tone generator to measure and link each array index to the closest F# note, so each range runs approximately from C to C, corresponding to an octave:

audioData[0-1] = ~F#1 ( 42hz )
audioData[2-4] = ~F#2 ( 95hz )
audioData[5-9] = ~F#3 ( 190hz )
audioData[10-20] = ~F#4 ( 380hz )
audioData[21-31] = ~F#5 ( 760hz )
audioData[32-37] = ~F#6 ( 1520hz )
audioData[38-45] = ~F#7 ( 3040hz )
audioData[46-54] = ~F#8 ( 6080hz )
audioData[55-63] = ~F#9 ( 12160hz )

You can see that some octaves only have 3-4 values, while others have around 10.

Also consider that you might not want your grouping to be based on octaves. If people want, I will supply a frequency table for all 64 values.
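If you do want octave grouping, it could be sketched like this ( an untested illustration of mine; the index ranges are taken directly from the table above, per channel ):

```javascript
// Group the 64 left-channel values into the nine octave buckets
// from the F#1..F#9 table, averaging the values in each bucket.
var octaveRanges = [
    [0, 1], [2, 4], [5, 9], [10, 20], [21, 31],
    [32, 37], [38, 45], [46, 54], [55, 63]
];

function groupByOctave( channelData ) { // channelData: 64 values
    return octaveRanges.map( function( range ) {
        var sum = 0;
        for( var i = range[0]; i <= range[1]; i++ ) sum += channelData[i];
        return sum / ( range[1] - range[0] + 1 ); // average per octave
    } );
}
```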

For reference, I used these sites to test the frequency range: http://www.szynalski.com/tone-generator/ & http://onlinetonegenerator.com/ .

Current test results
The axes of the graph are the array indexes for the audio data on the x axis, while the y axis ranges from 0hz to 22khz. The colors have the following meaning:
- cyan: the approximate octave scale, showing 9 octaves, from 32hz ( C1 ) to ~16.7khz ( C10 )
- red: the current response to frequencies ( each data value marked at the frequency it responded best to )

And here is some code with all the frequencies per index:
function AudioFrequencies() {
    this.freqIndex = [
        26, 48, 73, 93, 115, 138, 162, 185,
        207, 231, 254, 276, 298, 323, 346, 370,
        392, 414, 436, 459, 483, 507, 529, 552,
        575, 598, 621, 644, 669, 714, 828, 920,
        1057, 1173, 1334, 1472, 1655, 1840, 2046, 2253,
        2483, 2735, 3012, 3287, 3609, 3930, 4275, 4665,
        5056, 5493, 5929, 6412, 6917, 7446, 7998, 8618,
        9261, 9928, 10617, 11352, 11996, 12937, 13718, 14408
    ];
    this.idxToFreq = function( idx ) {
        return this.freqIndex[ idx ];
    };
    this.freqToIdx = function( freq ) {
        for( var i = 0; i < 64; i++ ) {
            var upper = this.freqIndex[ i ];
            if( freq < upper ) {
                if( i == 0 ) return 0;
                var lower = this.freqIndex[ i - 1 ];
                // return whichever index is closest to the requested frequency
                if( upper - freq < freq - lower ) return i;
                return i - 1;
            }
        }
        return 63;
    };
}
Motion Blur: A way to avoid flickering
Sometimes your audio data values might go from 1 to 0 and back to 1 again between frames. That can cause flickering, which might be annoying. To solve it, you can mix the current audio data with the previous audio data. In simple math: val = prevVal * (1/3) + currentVal * (2/3);
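Applied to the whole data array, that mixing could look like this ( a small untested sketch; the 1/3-2/3 split matches the formula above and can be tuned ):

```javascript
// Mix each value with its previous-frame value to damp flickering.
var prevData = new Array( 128 ).fill( 0 );

function motionBlur( audioData ) {
    for( var i = 0; i < 128; i++ ) {
        audioData[i] = prevData[i] * ( 1 / 3 ) + audioData[i] * ( 2 / 3 );
        prevData[i] = audioData[i]; // remember the mixed value for next frame
    }
    return audioData;
}
```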
Smoothing of the current audio frame
If, like in my "VU meter" wallpaper, you show the channels next to each other, it can be nice to smooth out the values. Sometimes you will see very high and low values next to each other, and it can look nicer if you smooth that out a bit to avoid such peaks ( try the smoothing option in my VU meter to see what I mean ).

Smoothing values can be done using a 1D convolution kernel and is similar to what you could use to blur images. You can also use this method to do the opposite ( called sharpening in graphics processing ) to emphasize the differences.

To do this, create a new array to save the results in. Then for each index in the array: newValue[i] = ( audioData[i-1] * 1 + audioData[i] * 2 + audioData[i+1] * 1 ) / 4. The dividing value 4 is based on the multipliers used ( 1 + 2 + 1 ). Adjusting those values will smooth more or less. Using the multipliers -1, 4, -1 does the opposite of smoothing and will emphasize differences ( aka sharpening ).
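As a concrete sketch of the 1-2-1 kernel ( untested; at the array edges I simply reuse the edge value, which is one of several reasonable choices ):

```javascript
// Smooth one 64-value channel with a [1, 2, 1] kernel, divisor 4.
function smoothChannel( channel ) {
    var out = new Array( channel.length );
    for( var i = 0; i < channel.length; i++ ) {
        // clamp the neighbours at the edges of the array
        var prev = channel[ i > 0 ? i - 1 : i ];
        var next = channel[ i < channel.length - 1 ? i + 1 : i ];
        out[i] = ( prev * 1 + channel[i] * 2 + next * 1 ) / 4;
    }
    return out;
}
```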

An example of smoothing: notice how the red results are slightly smoother overall than the gray original data, removing the rough peaks and evening things out a bit while still showing generally where the peaks in the data were.

Sqrt & Power!
This one is a simple trick you might not think of. This section focuses on increasing the difference between low and high values, assuming you have normalized the values first.

It's a very simple trick: by raising a value to a power, you compress the low values and stretch out the higher ones. This can help emphasize peaks. In the multiplication table below you can see how the original range from 0.5 to 1 becomes a range from 0.25 to 1, while the lower values get squashed.

  • 0 * 0 = 0
  • 0.1 * 0.1 = 0.01
  • 0.2 * 0.2 = 0.04
  • 0.3 * 0.3 = 0.09
  • 0.4 * 0.4 = 0.16
  • 0.5 * 0.5 = 0.25
  • 0.6 * 0.6 = 0.36
  • 0.7 * 0.7 = 0.49 ( almost 0.5 )
  • 0.8 * 0.8 = 0.64
  • 0.9 * 0.9 = 0.81
  • 1 * 1 = 1

On the other hand, using the square root of the value would emphasize the lower values and squish the higher values together. Like the table above, but with the scale reversed.

You might recognize this from "easing" functions for animations, which adjust the actual progress of the animation ( between 0-100% ) based on the time it has been playing ( 0-100% of the duration ) to create non-linear movement. The rescaling I am talking about here is identical to the calculations used in the cubic/quad ease-in and ease-out functions, as shown on this page: http://easings.net/ . I am just using them for a different purpose than animation.

Some basic example code:

// preferably used after normalizing the audio data so the values are within the 0-1 range
// that way the values will remain between 0 and 1 even after these calculations
val = Math.pow( val, 3 );   // power of 3: emphasizes peaks / squashes low values
val = Math.pow( val, 1 );   // unchanged
val = Math.pow( val, 1/3 ); // 3rd root: emphasizes low values / squashes high values

You could compensate for not normalizing by taking the highest value in the audio data for that frame and using it to temporarily scale the values to a 0-1 range, like below.

val = maxValue * Math.pow( val / maxValue, 3 );   // power of 3: emphasizes peaks / squashes low values
val = maxValue * Math.pow( val / maxValue, 1 );   // unchanged
val = maxValue * Math.pow( val / maxValue, 1/3 ); // 3rd root: emphasizes low values / squashes high values

You can see this effect in action when playing with the "sensitivity" option in my "Vinyl Radar" wallpaper.
Advanced audio processing settings ( only works locally )
This is not very useful for your wallpapers, as these settings will only affect your local copy, but you might still find them interesting to play around with. I've included them in the guide to have them documented somewhere ( otherwise they will just disappear into the comment section ).

I've included Biohazard's descriptions in case you are interested.

In the config.json in the Wallpaper Engine directory, you can optionally add the following settings to the user block. These will change the audio data values you get back from Wallpaper Engine.
"user": { .... "audioprocessing" : { "frequencydomainsizescale" : 10, "hammingparam" : 0.5, "spreadpower" : 0.25, "timedomainsizescale" : 30, "usekissfft" : false, "zeropaddingpercent" : 0.33 }, ... }

Originally posted by Biohazard:
Explanation:
timedomainsizescale -> How many samples to take. Higher values means it'll take longer but you get more precision. This value is also scaled proportionally to the device sampling rate: MAX(1.0f, pwfx->nSamplesPerSec / 44100) capped at 44.1khz.

frequencydomainsizescale -> How big of a range from the FFT output will be mapped to the 64 buckets (e.g. highest visible frequency).

spreadpower -> Controls how the FFT output is mapped to the buckets. Adjusting (decreasing) this should make it possible to get closer to the ideal octave range. There is also some logic involved to avoid skipping buckets (e.g. having permanent zeros), which explains why the function is piecewise linear and exponential.

hammingparam -> Parameter to adjust Hamming window which controls amplitude per-frequency bucket.

Originally posted by Biohazard:
So I noticed that the new settings 'completely' tanked performance (I had 6% CPU usage where I had < 1% before).

I implemented a second FFT library (FFTS) which is now the default, that brought all the perf back that was lost. For now it's still possible to switch back to the old one with (inside the audioprocessing object):

"usekissfft" : true

Originally posted by Biohazard:
Another problem popped up, with these new settings it's not possible to achieve 30 FPS anymore (It was producing 44100 / (64 * 30) = 23 FPS). So I implemented zero padding for the input signal and it did work out the way I was expecting it to, but it also causes some visible 'side lobes' when testing with sines < 30Hz or so.

I set the zero padding to 33% by default and added another option for it: 'zeropaddingpercent' (so it's 0.33) that makes 34 FPS.
40 Comments
Squee  [author] 10 Feb, 2017 @ 6:32am 
You should be able to do something like that, yes ..
Epic 10 Feb, 2017 @ 6:01am 
Is it possible to make a visualizer like this: https://github.com/MarcoPixel/monstercat-visualizer
Pak Visen 5 Feb, 2017 @ 6:13am 
i will trying that
Biohazard  [developer] 1 Feb, 2017 @ 7:54am 
I don't know the intricate workings of course so I wouldn't know what to look out for, but you don't have to rush yourself I think :eaglegag:

The analysis you covered for the beta in the guide so far should still be accurate, the only change I made after that was the zero padding, but it shouldn't affect those results.
Squee  [author] 1 Feb, 2017 @ 7:33am 
It's live? Now I am even further behind on updates! :spazhorror: It shouldn't be broken, but I do have some code here and there that is tied in to the old frequency curve that I wanted to correct. But will work on updating the guide tonight so whatever I wrote is still accurate.
Biohazard  [developer] 1 Feb, 2017 @ 7:21am 
It's actually live already, but did these changes break anything severely? I tested most of the ones you two made and didn't see anything broken. Though yeah, if you would for example expect index 0 to contain all the bass, I see how that would not work anymore...
Squee  [author] 1 Feb, 2017 @ 4:52am 
Same, will have to update my wallpapers too . But it's still in beta so no rush yet :)
mkvdb 1 Feb, 2017 @ 1:36am 
Awesome guide Squee, looking forward to seeing more audio resposive papers and it seems I will have to tweak my old ones for the new patch anyway :)
Biohazard  [developer] 31 Jan, 2017 @ 3:43am 
Another problem popped up, with these new settings it's not possible to achieve 30 FPS anymore (It was producing 44100 / (64 * 30) = 23 FPS). So I implemented zero padding for the input signal and it did work out the way I was expecting it to, but it also causes some visible 'side lobes' when testing with sines < 30Hz or so.

I set the zero padding to 33% by default and added another option for it: 'zeropaddingpercent' (so it's 0.33) that makes 34 FPS.
Squee  [author] 30 Jan, 2017 @ 10:52am 
Okidoki .. I am curious to see the impact those settings might have. Haven't seen the cpu usage impact yet, but I am still running on an older beta, havent updated in days.