Thursday, March 21, 2019

Basic DSP with Octave.

Matlab or Octave? If you have the budget, I recommend Matlab. I used it professionally at my company. It comes with support. You can export your Matlab code as C code with an extension (although it was problematic when I used it). With open source software I can fix problems myself because I've been coding for almost 29 years. You can use Octave for a lot of things. It is free, as in free beer and as in free speech. I usually install and use Octave because it is in my package manager's list of supported programs on my beloved Ubuntu. I used to have Matlab, mostly under Windows. It works under Linux too. But today I found an old note in my backups, so I'll install Octave and have some fun with DSP after a break (fun? yes, let's bring the fun back to programming).

Today it is all about Octave. I have also used FreeMat because it supported reading WAV files. Let's work with Octave for now.

It is easy to install under Ubuntu. All I do is type this:
sudo apt install octave
It installs in half a minute. When you run it, it welcomes you like this:




It looks like my Ubuntu repos are a little bit old. It doesn't matter. What we are going to do here is very basic stuff.

Octave has a nice plotting system. I would have to write pages of C++ code to do the same thing; this one-liner would be pages in a general-purpose language. Simply copy and paste this code into your Octave (or Matlab, or FreeMat by the way) and you'll have a nice plot:

x=1:10; y=x.^2; plot(x,y)

Here you get the graph of y = x squared (a parabola):



Here I create an array, named a, with some arbitrary numbers.

a=[1,2,3,4,5,4,3,2,1,0,-1,-2,-3,-4,-5,-4,-3,-2,-1,0,1,2,3,4,5,4,3,2,1,0,-1,-2]
a =

 Columns 1 through 21:

   1   2   3   4   5   4   3   2   1   0  -1  -2  -3  -4  -5  -4  -3  -2  -1   0   1

 Columns 22 through 32:

   2   3   4   5   4   3   2   1   0  -1  -2

We can think of this signal as data captured with an ADC (Analog to Digital Converter). You might ask "what is a signal and why do we capture it?" You can record analog signals on a tape recorder and then play them back. The tape holds a copy of the electrical signal that is "analogous" to the original electrical values, and that is very different from a digital record. What is a digital record then? If I play a piece of music and record it to an analog tape, it is something that you can play back through the speakers but you cannot run mathematical formulas on it. If you need to raise the volume you can change the gain of the amplifier circuit or apply a higher voltage to its input. If you need to raise the volume on a digital medium, however, it is easier: just multiply the samples by a real number greater than 1. For example, to double the output just multiply the values in the array by 2. It is easy and you can do math on it. If I need to decompose the signal into frequency components I use the fft function (which computes a DFT, optimized to run faster when the number of samples is a power of 2). By the way, these two are my favourite FFT education videos. I usually suggest that juniors watch them first, then ask me about the parts they still don't understand.
Here is how we calculate the FFT of an array in a math suite (whether it is Octave, Matlab or FreeMat).

>> b=fft(a)
b =

 Columns 1 through 3:

   22.00000 +  0.00000i   34.83869 + 14.85299i  -33.56345 - 32.48106i

 Columns 4 through 6:

   -5.85866 - 11.24051i   -1.41421 -  8.24264i   -2.70218 +  2.17818i

 Columns 7 through 9:

   -0.09438 -  2.51875i    0.48831 -  2.32545i    4.00000 -  2.00000i


You can think of a digital signal as levels captured by a sensor. For example, if I recorded the level (or position) of your eardrum with a super high-speed camera and measured the movement on a scale of, say, 65536 units of depth, 44100 times a second, I could encode a CD directly with that data. A CD is in fact a bunch of data files filled with these numbers. Your eardrum moves back and forth driven by the sound signal, which is composed of the different frequencies of the instruments and human voices.
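To make the "it is just numbers" idea concrete, here is a minimal sketch in plain C++ (standard library only). The function names quantize16 and applyGain are mine, made up for illustration; the sketch assumes samples normalized to the range -1.0 to 1.0 and signed 16-bit storage, which gives the 65536 levels mentioned above.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Quantize samples in [-1.0, 1.0] to signed 16-bit levels (65536 steps).
// Hypothetical helper for illustration only.
std::vector<std::int16_t> quantize16(const std::vector<double>& samples)
{
    std::vector<std::int16_t> out;
    out.reserve(samples.size());
    for (double s : samples) {
        // Clamp first so that out-of-range input cannot overflow.
        double clamped = std::max(-1.0, std::min(1.0, s));
        out.push_back(static_cast<std::int16_t>(std::lround(clamped * 32767.0)));
    }
    return out;
}

// Digital "volume knob": multiply every sample by a gain factor and
// clip the result at the 16-bit limits instead of letting it wrap around.
std::vector<std::int16_t> applyGain(const std::vector<std::int16_t>& samples, double gain)
{
    std::vector<std::int16_t> out;
    out.reserve(samples.size());
    for (std::int16_t s : samples) {
        double scaled = std::max(-32768.0, std::min(32767.0, s * gain));
        out.push_back(static_cast<std::int16_t>(std::lround(scaled)));
    }
    return out;
}
```

Calling applyGain(samples, 2.0) doubles the volume. The clipping is the price of a fixed number of levels: a sample that is already loud cannot get any louder.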
FFT function decomposes the frequency components of the signal and IFFT function creates the original signal back from the frequency bins.
>> c=ifft(b)
c =

 Columns 1 through 8:

   1.00000   2.00000   3.00000   4.00000   5.00000   4.00000   3.00000   2.00000

 Columns 9 through 16:

   1.00000   0.00000  -1.00000  -2.00000  -3.00000  -4.00000  -5.00000  -4.00000

 Columns 17 through 24:

  -3.00000  -2.00000  -1.00000  -0.00000   1.00000   2.00000   3.00000   4.00000

 Columns 25 through 32:

   5.00000   4.00000   3.00000   2.00000   1.00000   0.00000  -1.00000  -2.00000

>>

Here you can see that we got our original array back. Don't get stuck on the trailing zeros. The numbers are the same as the original values. I will come back to these values later. The symmetry and other beautiful properties of FFT/IFFT somehow put me under a spell which I still don't want to be freed of.
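If you want to see what happens under the hood, here is a minimal sketch of the same round trip in plain C++. It is a naive O(N^2) DFT, not the optimized FFT that Octave uses, and the names dft, forwardDft and inverseDft are my own. It assumes the usual convention: a negative exponent for the forward transform and a division by N in the inverse.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Naive O(N^2) discrete Fourier transform.
// sign = -1 is the forward transform, sign = +1 the unscaled inverse.
std::vector<Complex> dft(const std::vector<Complex>& in, int sign)
{
    const std::size_t n = in.size();
    const double pi = std::acos(-1.0);
    std::vector<Complex> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        Complex sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            // Reduce k*t modulo n to keep the angle small and accurate.
            double angle = sign * 2.0 * pi * static_cast<double>((k * t) % n)
                           / static_cast<double>(n);
            sum += in[t] * Complex(std::cos(angle), std::sin(angle));
        }
        out[k] = sum;
    }
    return out;
}

// Real samples in, complex frequency bins out.
std::vector<Complex> forwardDft(const std::vector<double>& samples)
{
    std::vector<Complex> in(samples.begin(), samples.end());
    return dft(in, -1);
}

// Complex frequency bins in, real samples out (imaginary parts dropped).
std::vector<double> inverseDft(const std::vector<Complex>& bins)
{
    std::vector<Complex> y = dft(bins, +1);
    std::vector<double> out(y.size());
    for (std::size_t i = 0; i < y.size(); ++i)
        out[i] = y[i].real() / static_cast<double>(y.size());
    return out;
}
```

Fed with the 32-element array a from above, forwardDft gives 22 in the first bin (the plain sum of the samples) and inverseDft brings the original values back, up to rounding error. For real input, bin k and bin N-k are complex conjugates of each other, which is the symmetry I mentioned.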
I remember the first time I saw this phenomenon: I was in awe! I produced a 50 Hz signal and stored it in a real-valued array. Then I produced a 100 Hz signal and added it to the array. I plotted the signal onto a canvas using Turbo Pascal 5.5 (yes, there was a graphics library for it; the name escaped me for a moment, it was the BGI Graphics Library). I first plotted the 50 Hz signal. Then I plotted the 100 Hz one. I put them together and plotted the sum. There was nothing special. I was the best student in my high school when I studied electronics. I learned to use every piece of equipment in the electronics lab and was familiar with the signals on the screen of an oscilloscope.
It all started with removing one of the sine waves with just writing a zero into the corresponding frequency bin. And Voila! My life changed forever.
Let me show you the magic. Here is how we produce the 50 Hz signal:
--> t = (0:0.001:1)';
--> y = sin(2*pi*50*t)
y =
         0
    0,3090
    0,5878
    0,80
...
and plot of it:



No big deal. Let's create a new array filled with a 100 Hz sine wave. Just replace the 50 with 100 in the function parameters.
--> z = sin(2*pi*100*t)
z =
         0
    0,5878
    0,951

and plot of it:



Note that the frequency of it is two times the first one.
Ok, let's put them together:
--> w=y+z;
--> 


There is nothing  magical yet, right? Just get the fft of the signal and plot it:
--> f=fft(w);
--> plot(t(1:1000),f(1:1000))



You can see that the FFT is symmetrical and our 50 Hz and 100 Hz peaks are easy to spot.

The amazing thing about FFT is it is completely reversible. Let's reverse it, it is called an inverse FFT and you can use ifft command to achieve this.

g=ifft(f)
plot(t(1:100),g(1:100))




Let's clear the 50 Hz signal (from both ends):
--> f(40:60)=0
--> f(940:960)=0

and plot it:



Here we modified our frequency components. Yes, in the frequency domain. Now we get the signal back with an inverse FFT. Did you notice that this is our 100 Hz signal? Yes, it is.
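The same trick can be sketched in plain C++. This is not the Octave session above but a self-contained illustration under my own assumptions: a naive DFT instead of a real FFT, a hypothetical function called removeTone, and exactly 1000 samples at 1000 samples per second so that every whole frequency lands exactly on one bin. The tone is expected to be below the sample rate.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Naive O(N^2) DFT; sign = -1 forward, sign = +1 unscaled inverse.
std::vector<Complex> dft(const std::vector<Complex>& in, int sign)
{
    const std::size_t n = in.size();
    const double pi = std::acos(-1.0);
    std::vector<Complex> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        Complex sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            double angle = sign * 2.0 * pi * static_cast<double>((k * t) % n)
                           / static_cast<double>(n);
            sum += in[t] * Complex(std::cos(angle), std::sin(angle));
        }
        out[k] = sum;
    }
    return out;
}

// Remove one tone by writing zero into its bin and into the mirrored bin,
// then transform back to the time domain.
std::vector<double> removeTone(const std::vector<double>& signal,
                               double sampleRateHz, double toneHz)
{
    const std::size_t n = signal.size();
    if (n == 0)
        return signal;
    std::vector<Complex> bins = dft(std::vector<Complex>(signal.begin(), signal.end()), -1);

    // 0-based bin that holds toneHz; bin n - k holds its mirror image.
    const std::size_t k = static_cast<std::size_t>(
        std::llround(toneHz * static_cast<double>(n) / sampleRateHz));
    if (k >= n)
        return signal;  // tone is outside the spectrum, nothing to remove
    bins[k] = Complex(0.0, 0.0);
    bins[(n - k) % n] = Complex(0.0, 0.0);

    std::vector<Complex> back = dft(bins, +1);
    std::vector<double> out(n);
    for (std::size_t i = 0; i < n; ++i)
        out[i] = back[i].real() / static_cast<double>(n);
    return out;
}
```

With the 50 Hz plus 100 Hz mix as input, removeTone(w, 1000.0, 50.0) hands back the pure 100 Hz sine. In the Octave session the array has 1001 samples, so the 50 Hz energy leaks a little into the neighbouring bins, which is why I cleared a whole range of bins there instead of a single one.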


This is where the magic started for me. The first time I saw this I knew that my life was going to change. It has. Forever.

I have created sound recognition algorithms for advertisement monitoring, music playlist logging, a Shazam-like mobile application for music identification etc. I have created the High Efficiency Audio Recognition System (HEARS). I worked with defense companies, media monitoring agencies, publisher associations, broadcasters. I went to Berlin to present my Sound And Image Recognition Engine for Advertisement and News (SIRENA). It is far more useful. I used it to detect and register visual advertisements on TV. Yes, frequency analysis can be used to find images inside other images. I think I have found the next subject to blog about.

By the way, in case you need to replicate what I have done here, here is the source code for this basic (and fun) application:

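t = (0:0.001:1)';        % time axis: 0 to 1 s in 1 ms steps
y = sin(2*pi*50*t);      % 50 Hz sine wave
z = sin(2*pi*100*t);     % 100 Hz sine wave
w=y+z;                   % mixed signal
f=fft(w);                % to the frequency domain
g=ifft(f);               % and back to the time domain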

figure(1)
plot(t(1:100),y(1:100))
title("50 Hz Sine Wave")
xlabel("time")
ylabel("amplitude")

figure(2)
plot(t(1:100),z(1:100))
title("100 Hz Sine Wave")
xlabel("time")
ylabel("amplitude")

figure(3)
plot(t(1:100),w(1:100))
title("Mixed Sine Wave")
xlabel("time")
ylabel("amplitude")

figure(4)
plot(t(1:1000),f(1:1000))
title("FFT of the mixed signal")

figure(5)
plot(t(1:100),g(1:100))
title("Signal after inverse fft")
xlabel("time")
ylabel("amplitude")

f(40:60)=0;      % clear the bins around 50 Hz
f(940:960)=0;    % and their mirror image at the other end

figure(6)
plot(t(1:1000),f(1:1000))
title("Pruned FFT")

g=ifft(f);
figure(7)
plot(t(1:100),g(1:100))
title("Signal after inverse fft")
xlabel("time")
ylabel("amplitude")
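The only thing I did not tell you about here is t. It is the time axis used to plot the signals: 1001 points from 0 to 1 second in steps of 0.001 s, which means the signals are sampled 1000 times a second.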
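The time axis also decides which FFT bin a frequency falls into, and that is where the 40:60 and 940:960 ranges come from. Here is a small sketch of the arithmetic in plain C++; binForFrequency and mirrorBin are names I made up, and the bins are counted from 0 here while Octave counts from 1.

```cpp
#include <cmath>
#include <cstddef>

// 0-based FFT bin closest to frequencyHz, for n samples taken at sampleRateHz.
std::size_t binForFrequency(double frequencyHz, double sampleRateHz, std::size_t n)
{
    return static_cast<std::size_t>(
        std::llround(frequencyHz * static_cast<double>(n) / sampleRateHz));
}

// For a real signal every bin has a mirror image at the other end of the spectrum.
std::size_t mirrorBin(std::size_t bin, std::size_t n)
{
    return (n - bin) % n;
}
```

For 1001 samples at 1000 Hz, the 50 Hz tone lands in bin 50 and its mirror in bin 951. In Octave's 1-based indexing these are elements 51 and 952, both inside the ranges I cleared. The 100 Hz tone sits in elements 101 and 902, safely outside.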

Nowadays I work in a job that lets me make good money, but I dislike writing finance software. I am an expert in C#, and I use TFS, Entity Framework and similar technologies along with databases I have designed and implemented. I create Web sites using MVC and JavaScript. I learned MVC in a month to write a Web site. The point is that I can make money and be successful in other areas of computer science, but I really don't want to work in them. I really miss working on my DSP R&D projects. I used a lot of tools to create proof of concept systems. I remember that I created an algorithm to find songs just like Shazam, and the first proof that it worked was SQL code that I wrote and executed against a SQLite database. Yes, SQL code. I used a SELECT command to find the sample in a music database and report its position.

Friday, March 8, 2019

Poor man's copy protection

As the title implies, this is a basic copy protection scheme. I start with a basic example and keep it as simple as possible. I intend to keep extending it over time.

Once a life ago (*grin*) I used to buy software dongles and kept my licenses under control. This is both good and bad at the same time. It is good because it needs a minimum effort to handle the scheme. Whether it is time-based demo or machine locked proprietary system, a dongle simply works. It keeps most of the promises it makes: your program only runs on a machine where the dongle is inserted.

One of the downsides of using a dongle is that once a hacker breaks its protection scheme, cracking becomes trivial even for script kiddies. There are emulators and crackers. Just download the exploit, crack the software. Please note that I used to be a paid cracker while I was at the university and I covered my expenses this way. Sometimes people use dongles in a present/not-present fashion and they think they are safe. I really love this. I would make one thousand euros in a weekend's work. It is still a good amount of money for a student. Here is how:

The assembly language of 8086-family CPUs has the CMP mnemonic, which compares a register with another register (or a memory value) and stores the result in the flags register.

The generic form of the instruction is:
cmp     dest, src     ; perform dest - src and set flags

The most important flag for hacking a present/not-present scheme is the ZERO flag. You can act on it with the JZ (Jump if Zero) or JNZ (Jump if Not Zero) instructions. For example, to check whether a certain number is in AX (or EAX; both are the same accumulator register, only the operand size differs) you would use this code:

cmp AX, 1234h ; or cmp EAX, 12345678h
jz theNumberIsOk
jmp theNumberIsWrong

The opcode for the short form of JZ is 74h and the opcode for the short form of JNZ is 75h. So all you need to do is change one bit. Yes, with a hex editor, find the right JZ (74) and change it to JNZ (75) and you are done. It also works with password checks. There is just one side effect: you cannot use the right password anymore. Every other word will work but the original password won't.
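The one-bit patch can be sketched in a few lines of plain C++. The function invertShortJump is a hypothetical helper, and the caller is assumed to know the offset of the jump already (finding the right 74 is the actual work). Only the short, one-byte forms of the jumps are handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kOpcodeJz  = 0x74;  // short JZ (jump if zero)
constexpr std::uint8_t kOpcodeJnz = 0x75;  // short JNZ (jump if not zero)

// Turn a short JZ into a JNZ (or the other way around) at the given offset.
// Returns false and leaves the code untouched if the byte is neither opcode.
bool invertShortJump(std::vector<std::uint8_t>& code, std::size_t offset)
{
    if (offset >= code.size())
        return false;
    if (code[offset] != kOpcodeJz && code[offset] != kOpcodeJnz)
        return false;
    code[offset] ^= 0x01;  // 74h and 75h differ only in the lowest bit
    return true;
}
```

For example, the 16-bit sequence 3D 34 12 74 02 (cmp ax, 1234h followed by a short jz, assembled by hand for this illustration) becomes 3D 34 12 75 02 after invertShortJump(code, 3). Calling it again restores the original byte.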

(Rumour has it that there is a secret jump code in the 80x86 family. They say it is Jump if Programmer Is Not Looking, and it is what you experience when your debug code runs and your production code doesn't.)

The same goes for that dongle protection(!): after the patch, the software won't work if the dongle is on the machine. By the way, a present/not-present check is not the recommended use of copy protection dongles. They come with PKI-based protection keys, binary encrypters, hash checksums, anti-debuggers, anti-tracers etc. Yet I've cracked lots of software thanks to the false sense of security developers feel when they use a dongle. "Hey, just check if it is present, we have lots of things to do!" says the manager. A dongle is not a magic bullet. It has to be used right. Remember: there is no 100% guarantee. Every software protection can be cracked IN TIME. A dongle gives you 6 months, for example. You should publish a new version within those 6 months, a superior one, one with amazing abilities that your users will buy or upgrade to. You can eliminate the leeches this way. These people are always greedy. They want it free and they want the NEW VERSION!

Our simple copy protection can save your software from unauthorized use of your little brother or sister. Nothing more.

Hence the title! We are going to develop a very basic copy protection scheme that a seasoned hacker can break in minutes. I said it. It is the poor man's copy protection.

We are so poor that we cannot afford separate hardware for the present/not-present scheme. We will use the Ethernet MAC address. Yes, I know, an Ethernet MAC address can be changed. (In fact I used to change my Ethernet MAC address to a random value every time I booted my Ubuntu. I searched for the bash script but couldn't find it, I must have deleted it.) Here is the header file for the MAC address helper.

#ifndef CMACADDRESSHELPER_H
#define CMACADDRESSHELPER_H
#include <QtCore>
#include <QtNetwork/QNetworkInterface>

const int MACLoc = 3557; // a random number

class CMACAddressHelper
{
public:
    CMACAddressHelper();
    static QString GetFirstMACAddress();
    static QStringList GetAllMACAddresses();
    static QByteArray CreateControlBlock(QString MACAddress);
    static bool CheckMACAddress(QByteArray &ControlBlock, QString MACAddress);
};

#endif // CMACADDRESSHELPER_H
It has some static methods and an empty constructor that I forgot to remove. I developed a few different copy protection schemes in a few languages, e.g. Delphi, C, C++ and of course my beloved Assembly. One of my favourite copy protections was written in Assembly and it modified its own code while running. Try to step over a call and boom, you failed. I really like this kind of puzzle.

Ok, to make a long story short, here is how we get the Ethernet MAC address with Qt (the code is taken from somewhere else, and honestly I don't remember where)

QString CMACAddressHelper::GetFirstMACAddress()
{
    foreach(QNetworkInterface netInterface, QNetworkInterface::allInterfaces())
    {
        // Return only the first non-loopback MAC Address
        if (!(netInterface.flags() & QNetworkInterface::IsLoopBack))
            return netInterface.hardwareAddress();
    }
    return QString();
}

I love Qt for this simplicity. Qt has an answer for all kinds of everyday problems. QString is better than std::string, for example, especially with numbered arguments and such, which std::string lacks. Last time I checked there were more than a few thousand classes in the framework, designed and implemented by smart people.

Now we need to save this MAC address somewhere. A file, the registry, a database. Any persistent storage is enough. Next time our program runs, we will check if we are running on the same hardware.

Let's write this to a file for now. Since the scheme is very easy to crack (just replace the file contents with the new MAC address), we will add a little bit of privacy. I decided to fill the file with random values. Then I place the MAC address into the file at a position I know, and XOR the file with a known value. Actually I do all these operations in memory and then save the file at once. Here is the memory part. I take 8192 bytes of memory (2^13), fill them with random bytes, place the MAC address at an offset called MACLoc, and XOR the whole buffer with 0xCC (11001100b).

QByteArray CMACAddressHelper::CreateControlBlock(QString MACAddress)
{
    QByteArray result(8192, static_cast<char>(0xFF));
    QByteArray macBytes = MACAddress.toLocal8Bit();
    for (int i = 0; i < result.size(); ++i)
    {
        int rand = qrand();
        result[i] = static_cast<char>(rand % 256);
    }

    for (int i = 0; i < macBytes.size(); ++i)
    {
        result[MACLoc + i] = macBytes[i];
    }

    for (int i = 0; i < result.size(); ++i)
    {
        result[i] = result[i] ^ static_cast<char>(0xCC);
    }

    return result;
}
I would write this memory buffer to a file. It is not a big buffer, so I can write it at once.

void MainWindow::on_recordButton_clicked()
{
    auto macAddress = CMACAddressHelper::GetFirstMACAddress();
    if (macAddress.size() == 0)
    {
        // Report the problem in the UI instead of throwing out of a Qt slot
        ui->label_Status->setText("No ethernet card found!");
        return;
    }

    QFile dossier("config.dat");
    if (!dossier.open(QIODevice::ReadWrite))
    {
        ui->label_Status->setText("Cannot open config.dat!");
        return;
    }
    auto block = CMACAddressHelper::CreateControlBlock(macAddress);
    dossier.write(block);
    dossier.close();
    ui->label_Status->setText(macAddress+" saved...");
}
This program would run from a USB stick and then the file config.dat would be copied to the hard disk. You cannot leave this program with the client. Now here is the code we will use to check whether we are running on a licensed machine:

void MainWindow::on_pushButton_2_clicked()
{
    QFile dossier("config.dat");
    dossier.open(QIODevice::ReadOnly);
    QByteArray block = dossier.read(8192);
    dossier.close();

    if (block.size() != 8192)
    {
        ui->label_Status->setText("License not found!");
        return;
    }

    auto macAddress = CMACAddressHelper::GetFirstMACAddress();
    if (macAddress.size() != 17)
    {
        ui->label_Status->setText("License not found!");
        return;
    }

    if (!CMACAddressHelper::CheckMACAddress(block, macAddress))
    {
        ui->label_Status->setText("License not found!");
        return;
    }
    ui->label_Status->setText("Thank you for using licensed software");
}



There are a few problems with this code. First of all, obfuscation is not protection. We basically filled a buffer with random data, put our MAC address to a place only we know (ha ha ha) and XORed the buffer to make the MAC address disappear (to the naked eye).

This is called obfuscation. This is why I hate writing proprietary systems in .NET. When I say it is prone to decompilation and reversing, .NET advocates always say the same thing: "but there are good obfuscators". I was cracking binary code without any structural information (for loops, whiles, switch/case idioms, nothing, just disassembled binary code) and yet I was making money reversing algorithms. One of the oldest things I remember having a hard time with was Eizi Nakamura's Best Kit (or something like that, it was 26 years ago), from which I learned a few good tricks.
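Coming back to our XORed buffer, here is a minimal sketch of how little it hides. It is plain C++, the function names recoverMac and looksLikeMac are made up for this post, and it assumes the attacker only knows two things: the file was XORed with a single byte, and somewhere inside there is a MAC address written as six hex pairs separated by colons. It simply tries all 256 keys at every position.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

const std::size_t kMacLength = 17;  // "XX:XX:XX:XX:XX:XX"

// Does data[pos..pos+16], XORed with key, look like a MAC address?
bool looksLikeMac(const std::vector<std::uint8_t>& data, std::size_t pos, std::uint8_t key)
{
    for (std::size_t i = 0; i < kMacLength; ++i) {
        unsigned char c = static_cast<unsigned char>(data[pos + i] ^ key);
        if (i % 3 == 2) {
            if (c != ':')
                return false;  // every third character must be a colon
        } else if (!std::isxdigit(c)) {
            return false;      // the others must be hex digits
        }
    }
    return true;
}

// Try every single-byte key at every offset. Returns the first string that
// looks like a MAC address, or an empty string if nothing matches.
std::string recoverMac(const std::vector<std::uint8_t>& block)
{
    if (block.size() < kMacLength)
        return std::string();
    for (int key = 0; key < 256; ++key) {
        for (std::size_t pos = 0; pos + kMacLength <= block.size(); ++pos) {
            if (looksLikeMac(block, pos, static_cast<std::uint8_t>(key))) {
                std::string mac;
                for (std::size_t i = 0; i < kMacLength; ++i)
                    mac += static_cast<char>(block[pos + i] ^ key);
                return mac;
            }
        }
    }
    return std::string();
}
```

That is about two million cheap comparisons for an 8192 byte file, with no need to know MACLoc or the 0xCC key in advance.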

What we did here is we obfuscated a basic MAC address data and saved it into a garbage-like-looking file. An experienced programmer would use AES or XTEA algorithms to protect it. A die hard coder would use public/private key encryption to protect the secret. But we have an Achilles' Heel here:

if (!CMACAddressHelper::CheckMACAddress(block, macAddress))


Remember the JZ/JNZ conversion? We can find the comparison, toggle the bit, and voila! Our program won't run on the original machine anymore but it will work everywhere else. The code behaves as if there were no (!) at the beginning, and therefore does not negate the result of CheckMACAddress.

Oh, by the way, here is the code for CheckMACAddress:

bool CMACAddressHelper::CheckMACAddress(QByteArray &ControlBlock, QString MACAddress)
{
    if (ControlBlock.size() != 8192)
        return false;

    if (MACAddress.size() != 17)
        return false;

    QByteArray macBytes = MACAddress.toLocal8Bit();

    // Undo the XOR only for the bytes we compare, so the caller's
    // block is left untouched and can safely be checked again.
    for (int i = 0; i < macBytes.size(); ++i)
    {
        char stored = static_cast<char>(ControlBlock.at(MACLoc + i) ^ static_cast<char>(0xCC));
        if (stored != macBytes.at(i))
            return false;
    }

    return true;
}
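If you don't have Qt at hand, the whole create-and-check round trip can be sketched in plain C++ as well. This is not the code of the application, just a standard library rewrite of the same idea. The names createControlBlock and checkMacAddress are hypothetical, std::mt19937 stands in for qrand(), and the constants are copied from the listings above.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>
#include <vector>

const std::size_t kBlockSize = 8192;
const std::size_t kMacLoc = 3557;     // same "secret" offset as MACLoc
const std::size_t kMacLength = 17;    // "XX:XX:XX:XX:XX:XX"
const std::uint8_t kXorKey = 0xCC;

// Random bytes, the MAC address hidden at kMacLoc, everything XORed with kXorKey.
// Returns an empty vector if the MAC string does not have the expected length.
std::vector<std::uint8_t> createControlBlock(const std::string& mac)
{
    if (mac.size() != kMacLength)
        return std::vector<std::uint8_t>();

    std::vector<std::uint8_t> block(kBlockSize);
    std::mt19937 rng(std::random_device{}());
    std::uniform_int_distribution<int> byteDist(0, 255);
    for (std::size_t i = 0; i < block.size(); ++i)
        block[i] = static_cast<std::uint8_t>(byteDist(rng));

    for (std::size_t i = 0; i < mac.size(); ++i)
        block[kMacLoc + i] = static_cast<std::uint8_t>(mac[i]);

    for (std::size_t i = 0; i < block.size(); ++i)
        block[i] ^= kXorKey;
    return block;
}

// XOR is its own inverse, so undoing it on the fly reveals the stored MAC.
bool checkMacAddress(const std::vector<std::uint8_t>& block, const std::string& mac)
{
    if (block.size() != kBlockSize || mac.size() != kMacLength)
        return false;
    for (std::size_t i = 0; i < mac.size(); ++i) {
        std::uint8_t stored = static_cast<std::uint8_t>(block[kMacLoc + i] ^ kXorKey);
        if (stored != static_cast<std::uint8_t>(mac[i]))
            return false;
    }
    return true;
}
```

A block created for one MAC address passes the check for that address and fails for any other. Writing the block to config.dat and reading it back is left out to keep the sketch short.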

I think it is enough for today. I will improve this basic copy protection in time. At least I have this intention. This program has a purpose. A simple one. Teaching the first bit of copy protection. It has many flaws. You can crack it in minutes. A copy protection that can be cracked by altering one bit is not a copy protection at all. The only thing that saves us from ordinary people is that they are too lazy to learn basic machine code :P

In my opinion, a good copy protection scheme should encrypt the data, and it should use a private key scheme so that the data is protected from alteration. If you run the code on different hardware, you will not get the correct keys and your data will look like garbage. Yet I have seen many tough guys beaten by the kids on the block. Never trust a single aspect of security. Always keep an eye on your enterprise. Security is protecting yourself from the leeches.
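Here is a rough sketch of the "wrong hardware gives you garbage" idea in plain C++. It uses XTEA, one of the ciphers I mentioned above, on a single 64-bit block. The key derivation (an FNV-1a style hash of the MAC string) is a toy I made up for this illustration, and the names deriveKeyFromMac, xteaEncrypt and xteaDecrypt are hypothetical. A symmetric cipher like this is not the private key scheme I am talking about, and it is not a vetted design either; treat it as a demonstration only.

```cpp
#include <array>
#include <cstdint>
#include <string>

using Key = std::array<std::uint32_t, 4>;    // 128-bit XTEA key
using Block = std::array<std::uint32_t, 2>;  // 64-bit XTEA block

// Toy key derivation: four FNV-1a style hashes of the MAC string.
// An assumption for illustration, not a real key derivation function.
Key deriveKeyFromMac(const std::string& mac)
{
    Key key{};
    for (std::uint32_t word = 0; word < 4; ++word) {
        std::uint32_t hash = 2166136261u ^ word;
        for (unsigned char c : mac) {
            hash ^= c;
            hash *= 16777619u;
        }
        key[word] = hash;
    }
    return key;
}

const std::uint32_t kDelta = 0x9E3779B9u;
const unsigned kRounds = 32;

// XTEA encryption of one 64-bit block.
Block xteaEncrypt(Block v, const Key& key)
{
    std::uint32_t v0 = v[0], v1 = v[1], sum = 0;
    for (unsigned i = 0; i < kRounds; ++i) {
        v0 += (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + key[sum & 3]);
        sum += kDelta;
        v1 += (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + key[(sum >> 11) & 3]);
    }
    return Block{{v0, v1}};
}

// XTEA decryption: the same steps in reverse order.
Block xteaDecrypt(Block v, const Key& key)
{
    std::uint32_t v0 = v[0], v1 = v[1], sum = kDelta * kRounds;
    for (unsigned i = 0; i < kRounds; ++i) {
        v1 -= (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + key[(sum >> 11) & 3]);
        sum -= kDelta;
        v0 -= (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + key[sum & 3]);
    }
    return Block{{v0, v1}};
}
```

Data encrypted with the key derived on the licensed machine decrypts correctly only when the same MAC address is present. On any other machine the key differs and the output looks like garbage. Of course the MAC address can still be spoofed, so this only raises the bar a little.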

Now I'm off to listen to my dear Dolores, do I have to let it linger?

Tuesday, March 5, 2019

My blog, now in English, with a rhyme and a reason.

I found this interesting aphorism on the Internet and roughly translated it into English. "A man is born three times: First from his mother, then from his choices at 18. Finally from his mistakes at 40".

I finally changed my life after working in my own company for 20 years. Now I'm looking for a job abroad. My best friend and business partner suddenly changed into something that I cannot recognize. He became a zombie of a political figure, swindled me, lied to everybody including his wife. He started supporting a politician known as a thief and he accused me of being a traitor for supporting somebody else. This is called New Turkey. People were naive here. Everybody would help each other. We were living in harmony even if we supported different football clubs or political parties. A lot has changed under the nepotism of a political figure. Now Turkish people think in terms of "us" and "others". Like my former friend. I cannot believe that he is that stupid. I was angry with his political stupidity. I was forcing myself to ignore the political bullsh*t and continue to work. I had my comfort zone. I didn't want to leave it. I was in a Techno Park facility and what I was doing there would be considered science. I left the company after learning that he had been selling fake invoices and other fake documents to make more money. In our New Turkey, the god is money (and I just realized that makes me a double atheist). He bought a new house next to the Bosphorus bridge in Istanbul. A new BMW. I was under constant humiliation although I was the brain of the company. The sole reason he did that was that I refused to vote for his demigod. He might have believed that he was creating the value of the company by himself. After the government fined the company for the fake documents, I decided to leave. They charged the company 1.5 million. You would say 1.5 million in what? US dollars or Euros? Well, I think 1.5 million is *very* much in almost every currency (maybe not in Bitcoins, eh? It is called infinity). To make a long story short, I was being swindled. This happens to science-oriented people a lot. We are not merchants. We do not think in terms of making money.

I used to blog here in Turkish. They say a man is considerably more reasonable in a foreign language. I decided to blog in English for a while. Two reasons for that: First, I am losing it. A foreign language is like a flower: it should be watered from time to time or it fades, yes, it fades like a flower. Second, I need more practice because I decided to leave this open sanitarium as soon as possible. I used to speak decent Russian; I finished A1, A2, B1 and B2 on busuu.com. Now all I can say is basic sentences that could be useful in daily life, a few bad words and "I forgot a lot" with a decent accent. I was in Berlin to attend the FIBEP congress, where I was one of the speakers at the WMIC in October 2017. I decided to learn some basic German. I studied German on public transport for 3 months in Istanbul and I managed to find my way in Berlin, IN GERMAN! Thanks to Duolingo for making it possible.

I always improved myself and learned new things. Even after leaving the university, where the shadows of my professors were longer than their own bodies, I concluded that I needed to learn everything by myself. I was writing viruses in Assembly language at that time. Not only viruses but rabbits, chameleons and experimental anti-virus programs. I published a few programs in PC World magazine. There was no internet. Only the press and, to some extent, BBS systems. It would take half an hour to get a scanned front page of a Playboy magazine over a 9600 modem (c'mon, I was 17 at the time). I used to write machine code into my notebook in the dormitory to type it in the next day at the computer lab. I memorized opcodes and the data lengths for those codes, and that helped me write basic programs without a computer.

What did I learn at the university then? COBOL. Yes, the accounting language of computer languages. Somehow they decided to change the course of the computer programming class and they put Pascal and C into the arsenal of lessons they taught. But there was a problem: all the professors and lecturers at that time were experts in COBOL and they did not understand memory locations, byte allocations, the word sizes of a CPU etc. I shined. I was the one student in the whole university who knew more than his professors. Lecturers used to ask me which command to use to truncate a file in Pascal. I was getting full scores. 100 out of 100.

A professor of mine, Cavit Tezcan, was teaching the class the word sizes of different CPUs. He said that an 8086 (not an 80x86) is a 32 bit processor. I said no, it is not, it is 16 bits and that is why I use its DS, CS, ES etc segment registers. His mouth turned into an, erm, I'd say something different but let's say a vacuum fish, you get the idea. Until then I used to get 100 points from this professor too. After I had corrected him a few times, I started getting 30s and 40s in the written exams. I objected and told him that it was not fair, that I deserved more. He said "you should have thought about that when you were mocking me in front of the class". What I understood there is that this school was not gonna end in a nice way. I established my own company before I left the school, since most of the businessmen in Silicon Valley are college dropouts.

I passed some of his classes later because new teachers were assigned to the ones he used to teach. I even repeated the BASIC class! He insisted on failing me on the final project, which I finished 5 times. Then I left the university because I noticed that a diploma is not what does the real job. It is me. I was the one doing the real job and I needed to educate myself. That was a wise decision. I learned C++, Delphi and other languages later, all by myself. I taught myself computer languages, frameworks, databases, algorithms etc. I like the example of Leonardo Da Vinci. He was born out of wedlock and was not accepted into school for this (as if it was his decision). He thought "all they do is learn Latin and Greek and read science papers, I can do that by myself". And he did. We know the rest.

I learned digital signal processing, C++, C#, version control systems. I adopted coding styles and taught them to young people so that they could work together in peace and harmony within a team. I have created one of the best sound and image recognition systems for advertisement monitoring. I left my own company by just grabbing my jacket and slamming the door. Now I start all over again. I am born again at 40, this time from my mistakes.

I used to be a citizen of Old Turkey. I was an idealist and naive. I thought a professor has no right to teach the class wrong things. I still think this way. Our families always told us to be nice people, never lie, never let anybody do harm to the innocent etc. New Turkey is more religious and more indecent at the same time. No country for decent people here.








