Another post-publication review of a paper that is weak on the ‘RF’ in RF machine learning.

Let’s take a look at a recently published paper (The Literature [R148]) on machine-learning-based modulation-recognition to get a data point on how some electrical engineers–these are more on the side of computer science I believe–use mathematics when they turn to radio-frequency problems. You can guess it isn’t pretty, and that I’m not here to exalt their acumen.

The Machine Learners think that their “feature engineering” (rooting around in voluminous data) is the same as “features” in mathematically derived signal-processing algorithms. I take a lighthearted look.

One of the things the machine learners never tire of saying is that their neural-network approach to classification is superior to previous methods because, in part, those older methods use hand-crafted features. They put it in different ways, but somewhere in the introductory section of a machine-learning modulation-recognition paper (ML/MR), you’ll likely see the claim. You can look through the ML/MR papers I’ve cited in The Literature ([R133]-[R146]) if you are curious, but I’ll extract a couple here just to illustrate the idea.

What happens when a cyclostationary time-series is treated as if it were stationary?

In this post let’s consider the difference between modeling a communication signal as stationary or as cyclostationary.

There are two contexts for this kind of issue. The first is when someone recognizes that a particular signal model is cyclostationary, and then takes some action to render it stationary (sometimes called ‘stationarizing the signal’). They then proceed with their analysis or algorithm development using the stationary signal model. The second context is when someone applies stationary-signal processing to a cyclostationary signal model, either without knowing that the signal is cyclostationary, or perhaps knowing but not caring.

At the center of this topic is the difference between the mathematical object known as a random process (or stochastic process) and the mathematical object that is a single infinite-time function (or signal or time-series).

A related paper is The Literature [R68], which discusses the pitfalls of applying tools meant for stationary signals to the samples of cyclostationary signals.

What are the unique parts of the multidimensional cyclic moments and cyclic cumulants?

In this post, we continue our study of the symmetries of CSP parameters. The second-order parameters–spectral correlation and cyclic correlation–are covered in detail in the companion post, including the symmetries for ‘auto’ and ‘cross’ versions of those parameters.

Here we tackle the generalizations of cyclic correlation: cyclic temporal moments and cumulants. We’ll deal with the generalization of the spectral correlation function, the cyclic polyspectra, in a subsequent post. It is reasonable to me to focus first on the higher-order temporal parameters, because I consider the temporal parameters to be much more useful in practice than the spectral parameters.

This topic is somewhat harder and more abstract than the second-order topic, but perhaps there are bigger payoffs in algorithm development for exploiting symmetries in higher-order parameters than in second-order parameters because the parameters are multidimensional. So it could be worthwhile to sally forth.

What modest academic success I’ve had in the area of cyclostationary signal theory and cyclostationary signal processing is largely due to the patient mentorship of my doctoral adviser, William (Bill) Gardner, and the fact that I was able to build on an excellent foundation put in place by Gardner, his advisor Lewis Franks, and key Gardner students such as William (Bill) Brown.

Using CSP to find the exact values of symbol rate, carrier frequency offset, symbol-clock phase, and carrier phase for PSK/QAM signals.

In this post I discuss the use of cyclostationary signal processing applied to communication-signal synchronization problems. First, just what are synchronization problems? Synchronize and synchronization have multiple meanings, but the meaning of synchronize that is relevant here is something like:

syn·chro·nize: To cause to occur or operate with exact coincidence in time or rate

If we have an analog amplitude-modulated (AM) signal (such as voice AM used in the AM broadcast bands) at a receiver we want to remove the effects of the carrier sine wave, resulting in an output that is only the original voice or music message. If we have a digital signal such as binary phase-shift keying (BPSK), we want to remove the effects of the carrier but also sample the message signal at the correct instants to optimally recover the transmitted bit sequence.

A PSK/QAM/SQPSK data set with randomized symbol rate, inband SNR, carrier-frequency offset, and pulse roll-off.

Update September 2020. I made a mistake when I created the signal-parameter “truth” files signal_record.txt and signal_record_first_20000.txt. Like the DeepSig RML data sets that I analyzed on the CSP Blog here and here, the SNR parameter in the truth files did not match the actual SNR of the signals in the data files. I’ve updated the truth files and the links below. You can still use the original files for all other signal parameters, but the SNR parameter was in error.

Update July 2020. I originally posted signals in the posted data set. I’ve now added another for a total of signals. The original signals are contained in Batches 1-5, the additional signals in Batches 6-28. I’ve placed these additional Batches at the end of the post to preserve the original post’s content.

We learned it using abstractions involving various infinite quantities. Can a machine learn it without that advantage?

This post is just a blog post. Just some guy on the internet thinking out loud. If you have relevant thoughts or arguments you’d like to advance, please leave them in the Comments section at the end of the post.

How did this come about? Is it even interesting to ask the question? Well, it is to me. I ask it because of the current hot topic in signal processing: machine learning. And in particular, machine learning applied to modulation recognition (see here and here). The machine learners want to capitalize on the success of machine learning applied to image recognition by directly applying the same sorts of image-recognition techniques to the problem of automatic type-recognition for human-made electromagnetic waves.

The machine-learning modulation-recognition community consistently claims vastly superior performance to anything that has come before. Let’s test that.

UPDATE

I’ve decided to post the data set I discuss here to the CSP Blog for all interested parties to use. See the new post on the Data Set. If you do use it, please let me and the CSP Blog readers know how you fared with your experiments in the Comments section of either post. Thanks!

Reconsidering my first attempt at teaching a machine the Fourier transform with the help of a CSP Blog reader. Also, the Fourier transform is viewed by Machine Learners as an input data representation, and that representation matters.

I first considered whether a machine (neural network) could learn the (64-point, complex-valued) Fourier transform in this post. I used MATLAB’s Neural Network Toolbox and I failed to get good learning results because I did not properly set the machine’s hyperparameters. A kind reader named Vito Dantona provided a comment to that original post that contained good hyperparameter selections, and I’m going to report the new results here in this post.

Since the Fourier transform is linear, the machine should be set up to do linear processing. It can’t just figure that out for itself. Once I used Vito’s suggested hyperparameters to force the machine to be linear, the results became much better:

Unlike conventional spectrum analysis for stationary signals, CSP has three kinds of resolutions that must be considered in all CSP applications, not just two.

In this post, we look at the ability of various CSP estimators to distinguish cycle frequencies, temporal changes in cyclostationarity, and spectral features. These abilities are quantified by the resolution properties of CSP estimators.

Then the temporal resolution of the estimate is approximately , the cycle-frequency resolution is about , and the spectral resolution depends strongly on the particular estimator and its parameters. The resolution product was discussed in this post. The fundamental result for the resolution product is that it must be very much larger than unity in order to obtain an SCF estimate with low variance.

How do we efficiently estimate higher-order cyclic cumulants? The basic answer is first estimate cyclic moments, then combine using the moments-to-cumulants formula.

In this post we discuss ways of estimating -th order cyclic temporal moment and cumulant functions. Recall that for , cyclic moments and cyclic cumulants are usually identical. They differ when the signal contains one or more finite-strength additive sine-wave components. In the common case when such components are absent (as in our recurring numerical example involving rectangular-pulse BPSK), they are equal and they are also equal to the conventional cyclic autocorrelation function provided the delay vector is chosen appropriately.

The more interesting case is when the order is greater than . Most communication signal models possess odd-order moments and cumulants that are identically zero, so the first non-trivial order greater than is . Our estimation task is to estimate -th order temporal moment and cumulant functions for using a sampled-data record of length .

Gaussian and binary signals are in some sense at opposite ends of the pure-impure sine-wave spectrum.

Remember when we derived the cumulant as the solution to the pure th-order sine-wave problem? It sounded good at the time, I hope. But here I describe a curious special case where the interpretation of the cumulant as the pure component of a nonlinearly generated sine wave seems to break down.

Since I wrote the paper review in this post, I’ve analyzed three of O’Shea’s data sets (O’Shea is with the company DeepSig, so I’ve been referring to the data sets as DeepSig’s in other posts): All BPSK Signals, More on DeepSig’s Data Sets, and DeepSig’s 2018 Data Set. The data set relating to this paper is analyzed in All BPSK Signals. Preview: It is heavily flawed.

Modulation recognition is the process of assigning one or more modulation-class labels to a provided time-series data sequence.

In this post, we start a discussion of what I consider the ultimate application of the theory of cyclostationary signals: Automatic Modulation Recognition. My relevant papers are My Papers [16,17,25,26,28,30,32,33,38,43,44]. See also my machine-learning modulation-recognition critiques by clicking on Machine Learning in the CSP Blog Categories on the right side of any post or page.

Higher-order statistics in the frequency domain for cyclostationary signals. As complicated as it gets at the CSP Blog.

In this post we take a first look at the spectral parameters of higher-order cyclostationarity (HOCS). In previous posts, I have introduced the topic of HOCS and have looked at the temporal parameters, such as cyclic cumulants and cyclic moments. Those temporal parameters have proven useful in modulation classification and parameter estimation settings, and will likely be an important part of my ultimate radio-frequency scene analyzer.

The spectral parameters of HOCS have not proven to be as useful as the temporal parameters unless you include the trivial case where the moment/cumulant order is equal to two. In that case, the spectral parameters reduce to the spectral correlation function, which is extremely useful in CSP (see the TDOA and signal-detection posts for examples).

I recently came across a published paper with the title Cyclostationary Correntropy: Definition and Application, by Aluisio Fontes et al. It is published in a journal called Expert Systems with Applications (Elsevier). Actually, it wasn’t the first time I’d seen this work by these authors. I had reviewed a similar paper in 2015 for a different journal.

I was surprised to see the paper published because I had a lot of criticisms of the original paper, and the other reviewers agreed since the paper was rejected. So I did my job, as did the other reviewers, and we tried to keep a flawed paper from entering the literature, where it would stay forever causing problems for readers.

The editor(s) of the journal Expert Systems with Applications did not ask me to review the paper, so I couldn’t give them the benefit of the work I already put into the manuscript, and apparently the editor(s) did not themselves see sufficient flaws in the paper to merit rejection.

It stings, of course, when you submit a paper that you think is good, and it is rejected. But it also stings when a paper you’ve carefully reviewed, and rejected, is published anyway.

Fortunately I have the CSP Blog, so I’m going on another rant. After all, I already did this the conventional rant-free way.

PSK and QAM signals form the building blocks for a large number of practical real-world signals. Understanding their probability structure is crucial to understanding those more complicated signals.

Let’s look into the statistical properties of a class of textbook signals that encompasses digital quadrature amplitude modulation (QAM), phase-shift keying (PSK), and pulse-amplitude modulation (PAM). I’ll call the class simply digital QAM (DQAM), and all of its members have an analytical-signal mathematical representation of the form

In this model, is the symbol index, is the symbol rate, is the carrier frequency (sometimes called the carrier frequency offset), is the symbol-clock phase, and is the carrier phase. The finite-energy function is the pulse function (sometimes called the pulse-shaping function). Finally, the random variable is called the symbol, and has a discrete distribution that is called the constellation.

Model (1) is a textbook signal when the sequence of symbols is independent and identically distributed (IID). This condition rules out real-world communication aids such as periodically transmitted bursts of known symbols, adaptive modulation (where the constellation may change in response to the vagaries of the propagation channel), some forms of coding, etc. Also, when the pulse function is a rectangle (with width ), the signal is even less realistic, and therefore more textbooky.

We will look at the moments and cumulants of this general model in this post. Although the model is textbook, we could use it as a building block to form more realistic, less textbooky, signal models. Then we could find the cyclostationarity of those models by applying signal-processing transformation rules that define how the cumulants of the output of a signal processor relate to those for the input.

How does the cyclostationarity of a signal change when it is subjected to common signal-processing operations like addition, multiplication, and convolution?

It is often useful to know how a signal processing operation affects the probabilistic parameters of a random signal. For example, if I know the power spectral density (PSD) of some signal , and I filter it using a linear time-invariant transformation with impulse response function , producing the output , then what is the PSD of ? This input-output relationship is well known and quite useful. The relationship is

Because the mathematical models of real-world communication signals can be constructed by subjecting idealized textbook signals to various signal-processing operations, such as filtering, it is of interest to us here at the CSP Blog to know how the spectral correlation function of the output of a signal processor is related to the spectral correlation function for the input. Similarly, we’d like to know such input-output relationships for the cyclic cumulants and the cyclic polyspectra.

Another benefit of knowing these CSP input-output relationships is that they tend to build insight into the meaning of the probabilistic parameters. For example, in the PSD input-output relationship (1), we already know that the transfer function at scales the input frequency component at by the complex number . So it makes sense that the PSD at is scaled by the squared magnitude of . If the filter transfer function is zero at , then the density of averaged power at should vanish too.

So, let’s look at this kind of relationship for CSP parameters. All of these results can be found, usually with more mathematical detail, in My Papers [6, 13].

SRRC PSK and QAM signals form important components of more complicated real-world communication signals. Let’s look at their second-order cyclostationarity here.

Let’s look at a somewhat more realistic textbook signal: The PSK/QAM signal with independent and identically distributed symbols (IID) and a square-root raised-cosine (SRRC) pulse function. The SRRC pulse is used in many practical systems and in many theoretical and simulation studies. In this post, we’ll look at how the free parameter of the pulse function, called the roll-off parameter or excess bandwidth parameter, affects the power spectrum and the spectral correlation function.