Thursday, March 21, 2019

Basic DSP with Octave.

Matlab or Octave? If you have budget, I recommend Matlab. I used it professionally in my company. They have support. You can export your Matlab code as C code with an extension (although it was problematic when I used it). I can fix problems of open source software by myself because I've been coding for almost 29 years. You can use Octave for a lot of things. It is free. As in free beer and free speech. I usually install and use Octave because it comes with my package manager's supported programs list in my beloved Ubuntu. I used to have Matlab especially under Windows. It works under Linux too. But today I have found an old note in my backups and I'll install Octave to have fun with DSP after a break (fun? yes, let's bring fun to the programming again).

Today it is all about Octave. I also used FreeMat as it was supporting Wav file read functions. Let's work with Octave for now.

It is easy to install it under Ubuntu. All I do is to type this:
sudo apt install octave
It is installed in half a minute. When you run it, it welcomes you like this:




It looks like my Ubuntu repos are a little bit old. It doesn't matter. What we are going to do here is very basic stuff.

Octave has a nice plotting system. I would write pages of C++ code to do these things. This one-liner would have been pages in a formal language. Simply copy and paste this code into your Octave (or Matlab, or FreeMat by the way) and you'll have a nice plot:

x=1:10; y=x.^2; plot(x,y)

Here you can get the exponential graph:



Here I create an array, named a, with some arbitrary numbers.

a=[1,2,3,4,5,4,3,2,1,0,-1,-2,-3,-4,-5,-4,-3,-2,-1,0,1,2,3,4,5,4,3,2,1,0,-1,-2]
a =

 Columns 1 through 21:

   1   2   3   4   5   4   3   2   1   0  -1  -2  -3  -4  -5  -4  -3  -2  -1   0   1

 Columns 22 through 32:

   2   3   4   5   4   3   2   1   0  -1  -2

We can think of this signal as data captured with an ADC (which is an Analog to Digital Convertor). You might ask "what is a signal and why do we capture it". You can record analog signals to a tape recorder and then play them back. You can actually make a copy of the electrical signals "analogous" to the original electrical values. An analog recording medium is like recording the signal and it is very different than a digital record. What is a digital record then? If I play a piece of music and record it to an analog tape, it is something that you can play back through the speakers but you cannot run mathematical formulas on it. If you need to raise the volume you can change the gain of the amplifier circuit or by applying a higher voltage to the input of the amplifier circuit. If you need to raise the volume on a digital medium, however, it is easier. Just multiply it with a real number greater than 1. For example, to double the output just multiply the values in the array with 2. It is easy and you can do math on it. If I need to decompose the signal into frequency components I would use fft function (which is a DFT but optimized to work faster when the number of samples are in power of 2. By the way these two are my favourite FFT education videos. I usually suggest juniors to watch them first, then ask me which part they still need to understand.
Here is how we calculate the fft on an array in a math suit (whether it is Octave, Matlab or FreeMat).

>> b=fft(a)
b =

 Columns 1 through 3:

   22.00000 +  0.00000i   34.83869 + 14.85299i  -33.56345 - 32.48106i

 Columns 4 through 6:

   -5.85866 - 11.24051i   -1.41421 -  8.24264i   -2.70218 +  2.17818i

 Columns 7 through 9:

   -0.09438 -  2.51875i    0.48831 -  2.32545i    4.00000 -  2.00000i


You can think of a digital signal as levels captured by the sensors. For example, if I record the level (or position) of  your eardrum with a super speed camera and measure the movement in scale, say 65536 units of depth, 44100 times a second, I would encode a CD directly with that data. A CD is in fact a bunch of data files filled with this numbers. Your eardrum moves back and forth due to the sound signal's power, which is composed of different frequencies of the instruments and human voices.
FFT function decomposes the frequency components of the signal and IFFT function creates the original signal back from the frequency bins.
>> c=ifft(b)
c =

 Columns 1 through 8:

   1.00000   2.00000   3.00000   4.00000   5.00000   4.00000   3.00000   2.00000

 Columns 9 through 16:

   1.00000   0.00000  -1.00000  -2.00000  -3.00000  -4.00000  -5.00000  -4.00000

 Columns 17 through 24:

  -3.00000  -2.00000  -1.00000  -0.00000   1.00000   2.00000   3.00000   4.00000

 Columns 25 through 32:

   5.00000   4.00000   3.00000   2.00000   1.00000   0.00000  -1.00000  -2.00000

>>

Here you can see that we got our original array back. Don't stuck with the trailing zeroes. The numbers are the same with the original values. I will come to this values again. The symmetry and other beautiful properties of FFT/IFFT somehow put me under a spell which I still don't want to be freed of.
I remember the first time I've seen this phenomenon: I was in awe! I produced a 50 Hz signal and stored it into a real valued array. Then I produced a 100 Hz signal and added it to the array. I plotted the signal onto a canvas using Turbo Pascal 5.5 (yes, there was a graphical programming library which I can't remember the name of it right now called BGI Graphics Library). I first plotted 50 Hz signal. Then plotted 100 Hz. I put them together and plotted it. There was nothing special. I was the best stutent in my high school when I studied electronics. I learned to use every equipment in the electronics lab and was familiar with the signals on the screen of an oscilloscope.
It all started with removing one of the sine waves with just writing a zero into the corresponding frequency bin. And Voila! My life changed forever.
Let me show you the magic. Here is how we produce the 50 Hz signal:
--> t = (0:0.001:1)';
--> y = sin(2*pi*50*t)
y =
         0
    0,3090
    0,5878
    0,80
...
and plot of it:



No big deal. Let's create a new array filled with a 100 Hz sine wave. Just change the 50 with 100 in the function parameters.
--> z = sin(2*pi*100*t)
z =
         0
    0,5878
    0,951

and plot of it:



Note that the frequency of it is two times the first one.
Ok, let's put them together:
--> w=y+z;
--> 


There is nothing  magical yet, right? Just get the fft of the signal and plot it:
--> f=fft(w);
--> plot(t(1:1000),f(1:1000))



You can see that the FFT is symmetrical and our 50 Hz and 100 Hz peaks are easy to spot.

The amazing thing about FFT is it is completely reversible. Let's reverse it, it is called an inverse FFT and you can use ifft command to achieve this.

g=ifft(f)
plot(t(1:100),g(1:100))




Lets clear the 50 Hz signal (from both ends):
--> f(40:60)=0
--> f(940:960)=0

and plot it:



 Here we modified  our frequency components. Yes, in frequency domain. And now get the signal back. Did you notice that this is our 100 Hz signal? Yes, it is.


 Here is the magic started for me. The first time I saw this I knew that my life is going to change. It has. Forever.

I have created sound recognition algorithms for advertisement monitoring, music playlist logging, Shazam like mobile application for music identification etc. I have created High Efficiency Audio Recognition System (HEARS). I worked with defense companies, media monitoring agencies, publisher associations, broadcasters. I went to Berlin to present my Sound And Image Recognition Engine for Advertisement and News (SIRENA). It is far more useful. I used it to detect and register visual advertisements on TV. Yes, frequency analysis can be used to find the images in other images. I think I have found the next subject to blog about.

By the way, in case you need to replicate what I have done here, here is the source code for this basic (and fun) application

t = (0:0.001:1)';
y = sin(2*pi*50*t)
z = sin(2*pi*100*t)
w=y+z;
f=fft(w);
g=ifft(f)

figure(1)
plot(t(1:100),y(1:100))
title("50 Hz Sine Wave")
xlabel("time")
ylabel("amplitude")

figure(2)
plot(t(1:100),z(1:100))
title("100 Hz Sine Wave")
xlabel("time")
ylabel("amplitude")

figure(3)
plot(t(1:100),w(1:100))
title("Mixed Sine Wave")
xlabel("time")
ylabel("amplitude")

figure(4)
plot(t(1:1000),f(1:1000))
title("FFT of the mixed signal")

figure(5)
plot(t(1:100),g(1:100))
title("Signal after inverse fft")
xlabel("time")
ylabel("amplitude")

f(40:60)=0
f(940:960)=0

figure(6)
plot(t(1:1000),f(1:1000))
title("Pruned FFT")

g=ifft(f)
figure(7)
plot(t(1:100),g(1:100))
title("Signal after inverse fft")
xlabel("time")
ylabel("amplitude")
 The only thing I did not tell you about here is t. It is the time component to plot the signals.

Nowadays I am working in a job that lets me make good money but I dislike to write finance software. I am an expert in C#, I use TFS, Entity Framework etc technologies along with databases I have designed and implemented. I create Web sites using MVC and JavaScript. I learned MVC in a month to write a Web site. The point is that I can make money, be successful in other areas of computer science but really don't want to work with these. I really miss working with my DSP R&D projects. I used a lot of tools to create proof of concept systems. I remember that I have created an algorithm to find songs just like Shazam and first time I have shown proof that it works was with a SQL code I have written and executed against a SQLite database. Yes, SQL code. I used a SELECT command to find the sample in a music database and reporting its position.

No comments:

Post a Comment

A Survey of Body Area Networks