
Issue in the data normalization to unit energy #24

Open
rutrilla opened this issue Aug 2, 2019 · 1 comment

rutrilla commented Aug 2, 2019

Hi there!

I've realized that in the dataset generation, the energy of the 128-sample data vectors is being normalized to unity as follows (lines 65 and 66):

[image: energyNormalization — screenshot of lines 65–66 of the dataset generation code]

However, to the best of my knowledge, the energy Es of a discrete-time signal x(n) is defined mathematically as:

$$E_s = \sum_{n=0}^{N-1} \left| x(n) \right|^2$$

Once you have calculated Es, sampled_vector must be divided by the square root of the energy, not by the energy itself. In code, it should be something like this:

```python
energy = np.sum(np.abs(sampled_vector) ** 2)
sampled_vector = sampled_vector / math.sqrt(energy)
```
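A quick way to sanity-check the fix (a minimal, self-contained sketch; the random vector below is just a stand-in for one of the dataset's 128-sample frames):

```python
import math

import numpy as np

rng = np.random.default_rng(0)
# Stand-in for one 128-sample complex IQ frame.
sampled_vector = rng.normal(size=128) + 1j * rng.normal(size=128)

energy = np.sum(np.abs(sampled_vector) ** 2)

wrong = sampled_vector / energy             # original: divide by the energy itself
right = sampled_vector / math.sqrt(energy)  # proposed: divide by its square root

print(np.sum(np.abs(wrong) ** 2))  # 1/energy, nowhere near unity
print(np.sum(np.abs(right) ** 2))  # ~1.0, i.e. unit energy
```

Only the second version actually yields unit energy; the first leaves the vector with energy 1/Es.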

I've plotted both versions and these are the results.

Before:

[image: beforeCorrection — signal plots with the original normalization (division by the energy)]

After:

[image: afterCorrection — signal plots after dividing by the square root of the energy]

Therefore, the signals are being unnecessarily compressed (dividing by the energy instead of its square root shrinks each vector by an extra factor of √Es), which can make it harder for some models to extract meaningful information, or even prevent it altogether.

Do my findings make sense to you, or is there anything I may have misunderstood? Please check it and let us know your conclusions when you get a chance.

I look forward to hearing from you.

Regards,

Ramiro Utrilla


rutrilla commented Aug 6, 2019

Actually, in addition to the previous energy normalization, what is really working for me is to scale the IQ samples between -1 and 1. This is how that part of my code looks now:

```python
# Normalize to unit energy.
energy = np.sum(np.abs(sampled_vector) ** 2)
sampled_vector = sampled_vector / math.sqrt(energy)
# Scale the I and Q components into [-1, 1].
max_val = max(max(np.abs(sampled_vector.real)), max(np.abs(sampled_vector.imag)))
sampled_vector = sampled_vector / max_val
```
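For what it's worth, both steps could be wrapped into a single helper. This is just a sketch (the name `normalize_iq` is mine, not from the dataset code):

```python
import numpy as np

def normalize_iq(sampled_vector):
    """Normalize a complex IQ vector to unit energy, then scale I/Q into [-1, 1]."""
    # Step 1: unit-energy normalization (divide by the square root of the energy).
    energy = np.sum(np.abs(sampled_vector) ** 2)
    sampled_vector = sampled_vector / np.sqrt(energy)
    # Step 2: rescale so the largest I or Q component has magnitude 1.
    max_val = max(np.abs(sampled_vector.real).max(), np.abs(sampled_vector.imag).max())
    return sampled_vector / max_val
```

Note that step 2 rescales the vector again, so the final energy is no longer exactly unity; what the combined procedure actually enforces is the [-1, 1] range of the I/Q components.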

And this is how the signals look after both normalization steps:

[image: afterSecondCorrection — signal plots after both normalization steps]

As far as I know, this kind of normalization is fairly common, since some models are more sensitive to the scale of the input data than others. Was there any reason not to do this in the dataset originally? Am I missing something?

It'd be great if someone could give further details on the best practices for normalizing this kind of data.

Regards,
