SPECTRAL ANALYSIS OF UNEVENLY SPACED DATA
This tutorial covers the spectral analysis capabilities of AutoSignal when the data are unevenly spaced. The main focus will be upon Fourier procedures that use data that have not been uniformly sampled. A secondary focus will be upon interpolation procedures that can generate uniform data without altering spectral properties.
Importing An Unevenly Sampled Data Set
Select the Import option in the File menu or main toolbar. Change the format to Excel (xls) and select sample.xls from the Signals subdirectory. Check the Import Preview option. Click on (5)Uneven!A: Uneven Time. This column in the Excel worksheet is used as the X or time variable. Click on (5)Uneven!C: S1,SN20dB. This column is used as the Y or signal variable.
Click OK to accept the data and OK once again to accept the default titles.
This data set contains three spectral peaks, one at a frequency of 2 (amplitude 100, phase 3π/2), another at a frequency of 5 (amplitude 100, phase π), and a third peak at the frequency of 8 (amplitude 100, phase π/2). 10% random Gaussian noise was added. The average Nyquist frequency is 4.75. There are thus two peaks beyond the average Nyquist limit.
One important difference between unevenly and uniformly sampled data is that information beyond the average Nyquist limit is not automatically aliased to lower frequencies. It is thus possible to extract information beyond this average Nyquist frequency, since some of the data are spaced more closely and support a much higher "local" Nyquist frequency.
Similarly, there will be widely spaced points whose local Nyquist frequency is smaller than the overall average. The information within the average Nyquist range is thus incomplete.
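The distinction between the average and the local Nyquist limits can be sketched in a few lines of Python. This is only an illustration: the helper name nyquist_limits and the randomly generated sample times are assumptions, and they are not the values stored in sample.xls.

```python
import numpy as np

def nyquist_limits(t):
    """Return (average, highest local, lowest local) Nyquist frequency
    for a set of sample times t (hypothetical helper for illustration)."""
    dt = np.diff(np.sort(np.asarray(t, dtype=float)))
    average = 1.0 / (2.0 * dt.mean())   # based on the mean spacing
    local = 1.0 / (2.0 * dt)            # one value per adjacent pair of points
    return average, local.max(), local.min()

# Assumed example: 96 random sample times between 0 and 10.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 10.0, 96))
avg, hi, lo = nyquist_limits(t)
print(f"average {avg:.2f}, highest local {hi:.2f}, lowest local {lo:.2f}")
```

The closely spaced pairs give local limits well above the average, while the widest gaps give local limits well below it.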
Detrending The Data
Note that the data appear to evidence an upward trend. Further, the algorithm that generates a Fourier spectrum using unevenly spaced data does not generate a zero frequency channel. We will thus remove the upward trend and subtract the mean prior to processing the data.
Select the Detrend option from the Time menu or toolbar. The following items should be checked: Subtract Fit, Subtract Mean, Linear (model), and Least Squares (minimization).
Although AutoSignal offers a variety of background models, you should use the higher order and non-linear models very cautiously.
Click OK to close the Detrend procedure and answer Yes to update the data table.
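For readers who want to see what this step does numerically, the following is a minimal sketch of a linear least-squares detrend followed by mean subtraction. It assumes NumPy; the function name detrend_linear and the synthetic data are illustrative and do not reproduce AutoSignal's internal implementation.

```python
import numpy as np

def detrend_linear(t, y):
    """Subtract a least-squares straight line, then subtract the mean."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(t, y, 1)      # linear model, least squares
    residual = y - (slope * t + intercept)      # "Subtract Fit"
    return residual - residual.mean()           # "Subtract Mean"

# Assumed example: an oscillation riding on an upward trend.
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0.0, 10.0, 95))
y = 100.0 * np.sin(2.0 * np.pi * 2.0 * t) + 5.0 * t + 20.0
detrended = detrend_linear(t, y)
print(f"mean after detrending: {detrended.mean():.2e}")
```

A straight-line fit with an intercept already leaves residuals with a mean of essentially zero, so the final subtraction mainly mirrors the two checked options.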
The Lomb-Scargle periodogram is an algorithm that specifically generates a Fourier spectrum for the instance where data are not uniformly spaced.
Select the Fourier Spectrum of Unevenly Sampled Data option in the Spectral menu or toolbar. Be sure the algorithm is set to Fast and the window is set to None. Set the spectrum size (n) to 8192 and the frequency upper limit (end) to 9.0. Be sure the plot is set to Lomb Spec and set the peak count (sig) to 3.
Note that the spectrum readily recovers all three components, including those that lie beyond the average Nyquist limit. Unlike a traditional FFT, there is no zero frequency channel. Otherwise, this type of spectrum differs little from a conventional Fourier spectrum that uses uniformly sampled data. All of the expected power and amplitude plot formats are available.
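The behavior described above can be reproduced outside AutoSignal with the classical Lomb-Scargle formula. The sketch below is a direct NumPy implementation, not the Fast algorithm selected in the dialog. The function names, the synthetic sample times (380 random points over 0 to 40, giving an average Nyquist frequency near 4.75), and the variance normalization are all assumptions made for illustration.

```python
import numpy as np

def lomb_scargle(t, y, freqs):
    """Classical Lomb-Scargle power, normalized by the sample variance,
    at ordinary (not angular) frequencies freqs > 0."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    w = 2.0 * np.pi * np.asarray(freqs, dtype=float)[:, None]   # shape (nf, 1)
    # Time offset tau that makes the sine and cosine terms orthogonal.
    tau = np.arctan2(np.sin(2.0 * w * t).sum(axis=1),
                     np.cos(2.0 * w * t).sum(axis=1)) / (2.0 * w[:, 0])
    arg = w * (t[None, :] - tau[:, None])
    c, s = np.cos(arg), np.sin(arg)
    power = 0.5 * ((c @ y) ** 2 / (c ** 2).sum(axis=1)
                   + (s @ y) ** 2 / (s ** 2).sum(axis=1))
    return power / y.var(ddof=1)

def largest_peaks(freqs, power, count):
    """Frequencies of the `count` largest local maxima, sorted ascending."""
    idx = np.flatnonzero((power[1:-1] > power[:-2]) & (power[1:-1] > power[2:])) + 1
    best = idx[np.argsort(power[idx])[-count:]]
    return np.sort(freqs[best])

# Assumed synthetic data: three sinusoids plus Gaussian noise at uneven times.
rng = np.random.default_rng(3)
t = np.sort(rng.uniform(0.0, 40.0, 380))
y = sum(100.0 * np.sin(2.0 * np.pi * f * t + p)
        for f, p in ((2.0, 1.5 * np.pi), (5.0, np.pi), (8.0, 0.5 * np.pi)))
y = y + rng.normal(0.0, 10.0, t.size)

freqs = np.arange(1, 1801) * 0.005      # 0.005 to 9.0; no zero frequency channel
power = lomb_scargle(t, y, freqs)
peaks = largest_peaks(freqs, power, 3)
print("largest peaks near:", peaks)
```

Because the frequency grid starts above zero, the sketch has no zero frequency channel, which matches the behavior noted above.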
AutoSignal extends the algorithm to include windowing. All windows that can be created using unevenly sampled time values are included. The Chebyshev and Slepian (DPSS) windows are not available, although a special Chebyshev approximation window is available for creating the sharpest possible spectral peak for a given sidelobe level.
The high dynamic range processing that is available to Fourier analysis using the better data tapering windows is thus available in this procedure.
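One way to picture a window "created using unevenly sampled time values" is to evaluate a closed-form window at each actual sample time rather than at an integer index. The sketch below does this for a Hann window. It is an assumption about the general approach, not a description of how AutoSignal builds its windows, and hann_at is a hypothetical helper name.

```python
import numpy as np

def hann_at(t):
    """Evaluate a Hann window at arbitrary sample times t.
    Times are mapped linearly onto 0..1 across the record."""
    t = np.asarray(t, dtype=float)
    u = (t - t.min()) / (t.max() - t.min())
    return 0.5 - 0.5 * np.cos(2.0 * np.pi * u)

# Assumed example: taper mean-subtracted data before computing a spectrum.
rng = np.random.default_rng(4)
t = np.sort(rng.uniform(0.0, 10.0, 95))
y = 100.0 * np.sin(2.0 * np.pi * 2.0 * t)
tapered = hann_at(t) * (y - y.mean())
print(f"window range: {hann_at(t).min():.3f} to {hann_at(t).max():.3f}")
```

Windows defined only through a discrete recursion or an eigenvalue problem on a uniform grid have no such closed form in time, which is consistent with some windows being unavailable for uneven data.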
The Lomb-Scargle periodogram normally includes a traditional confidence limit based upon an exponential distribution. This is not used in AutoSignal. Instead, full critical limits are available. As with the evenly spaced Fourier procedures, separate critical limit models are used for each of the data windows and these are based on extensive Monte-Carlo trials using the exact algorithm in AutoSignal.
Since the distribution of abscissae can impact significance, these critical limits should be considered approximate.
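For comparison, the traditional exponential-distribution limit mentioned above can be written in two short functions. This sketch assumes the power has been normalized by the data variance and that m frequencies are statistically independent; choosing m is left to the user. The function names are illustrative, and this is not the Monte-Carlo model AutoSignal uses.

```python
import math

def false_alarm_probability(z, m):
    """Probability that white noise produces at least one normalized
    power above z among m independent frequencies."""
    return 1.0 - (1.0 - math.exp(-z)) ** m

def power_threshold(p, m):
    """Normalized power whose false alarm probability is p for m
    independent frequencies (inverse of the function above)."""
    return -math.log(1.0 - (1.0 - p) ** (1.0 / m))

# Assumed example: 95% and 99.9% limits for 100 independent frequencies.
for level in (0.95, 0.999):
    z = power_threshold(1.0 - level, 100)
    print(f"{level:.1%} limit: normalized power {z:.2f}")
```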
Click on the Show Significance Levels button to enable the critical limits.
The first two peaks are shown to be significant beyond a 99.9% critical limit. The peak at frequency 8 is significant at a 95% critical limit. This means that of twenty white noise data sets having an equivalent variance, one would be expected to evidence a peak of this magnitude strictly due to random chance.
Click OK to exit the Lomb procedure.
Interpolation by Harmonic Retrieval
We will now address the two alternatives available for interpolating a uniform data set from unevenly spaced data. The first involves fitting a parametric model to the time-domain data in order to extract the harmonic components. This approach is only useful when the data consist of one or more sinusoids or damped sinusoids.
Select the Parametric Interpolation and Prediction option in the Process menu or toolbar. Change the algorithm to Lomb 2x. Be sure the model is Undamped, that the Signal Subspace is set to 6 (this resolves three harmonic components) and the NL Optimization is enabled. The values in the Data Processed fields start with the full data range. Since we are not interested in prediction, change the n in the Output to 1024 and change the x end value to 10.
Click on the Set Confidence/Prediction Intervals button. Be sure Prediction Intervals is checked and that a 95% Confidence is selected. Click OK to close the Intervals dialog.
The upper graph confirms the amplitude and frequency of the three peaks isolated by the Lomb procedure.
The lower graph plots the three sinusoidal components together with the model (the sum of the three components) and the prediction intervals. Note that the prediction intervals look respectable.
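The idea behind harmonic interpolation can be sketched with ordinary linear least squares. This is a simplified stand-in, not the Lomb 2x algorithm or the non-linear optimization used in the procedure: the frequencies are assumed to be known in advance (2, 5, and 8), and only the cosine and sine coefficients are fitted. The function names and synthetic data are illustrative.

```python
import numpy as np

def design_matrix(t, freqs):
    """Columns cos(2*pi*f*t) and sin(2*pi*f*t) for each frequency f."""
    t = np.asarray(t, dtype=float)
    cols = []
    for f in freqs:
        cols.append(np.cos(2.0 * np.pi * f * t))
        cols.append(np.sin(2.0 * np.pi * f * t))
    return np.column_stack(cols)

def fit_sinusoids(t, y, freqs):
    """Least-squares coefficients (two per frequency) for undamped sinusoids."""
    coef, *_ = np.linalg.lstsq(design_matrix(t, freqs), y, rcond=None)
    return coef

def signal(x):
    """Assumed noise-free test signal: three sinusoids of amplitude 100."""
    return sum(100.0 * np.sin(2.0 * np.pi * f * x + p)
               for f, p in ((2.0, 1.5 * np.pi), (5.0, np.pi), (8.0, 0.5 * np.pi)))

rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0.0, 10.0, 95))            # uneven sample times
y = signal(t) + rng.normal(0.0, 10.0, t.size)      # 10% Gaussian noise

freqs = (2.0, 5.0, 8.0)                            # assumed known
coef = fit_sinusoids(t, y, freqs)
amplitudes = np.hypot(coef[0::2], coef[1::2])

t_uniform = np.linspace(0.0, 10.0, 1024)           # uniform output grid
y_uniform = design_matrix(t_uniform, freqs) @ coef
print("fitted amplitudes:", np.round(amplitudes, 1))
```

Three undamped sinusoids require six fitted coefficients here. Because only the harmonic model is evaluated on the uniform grid, the noise in the original samples is not carried over.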
To be certain we have a valid interpolation, we will inspect the non-linear fit statistics and check the residuals to ensure that they are normally distributed.
Validating The Parametric Model
Click the Numeric Summary button and inspect the fit statistics.
The summary lists the r² Coef Det, DF Adj r², and Fit Std Err statistics.
Although the parameters are not recovered perfectly, the fit is an accurate one and the r² goodness of fit value is high.
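The three goodness-of-fit statistics can be computed from the data and the fitted values as sketched below. The formulas follow common textbook conventions, with the degrees of freedom taken as the number of points minus the number of fitted parameters. AutoSignal's exact definitions are not given in this tutorial, so treat the conventions and the function name fit_statistics as assumptions.

```python
import numpy as np

def fit_statistics(y, fitted, n_params):
    """Return (r2, dof-adjusted r2, fit standard error) under common conventions."""
    y = np.asarray(y, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    n = y.size
    sse = np.sum((y - fitted) ** 2)          # sum of squared residuals
    ssm = np.sum((y - y.mean()) ** 2)        # sum of squares about the mean
    dof = n - n_params                       # assumed degrees of freedom
    r2 = 1.0 - sse / ssm
    adj_r2 = 1.0 - (sse / dof) / (ssm / (n - 1))
    std_err = np.sqrt(sse / dof)
    return r2, adj_r2, std_err

# Small assumed example with two fitted parameters.
r2, adj_r2, std_err = fit_statistics([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.8, 5.0], 2)
print(f"r2 = {r2:.4f}, adjusted r2 = {adj_r2:.4f}, fit std err = {std_err:.4f}")
```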
Close the Numeric Summary window.
Click the View Residuals button. Be sure the Stabilized Normal Probability Plot option (the second from right in the toolbar) is selected.
All of the residuals are shown to be within a 90% critical limit, an excellent indication that they are normally distributed. When residuals lack this Gaussian distribution, the model is often insufficient or incorrect. There may be a missing component, or the fit may have failed to achieve the true least-squares minimum.
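The coordinates of a stabilized normal probability plot can be sketched as follows, using the arcsine transformation that gives the plotted points roughly constant variance. The sketch assumes the residuals are standardized with their sample mean and standard deviation. It does not compute the critical limit band drawn by AutoSignal, and the function name is hypothetical.

```python
import math
import numpy as np

def stabilized_probability_points(residuals):
    """Return (expected, observed) coordinates of a stabilized normal
    probability plot; normal residuals fall close to the diagonal."""
    r = np.sort(np.asarray(residuals, dtype=float))
    n = r.size
    z = (r - r.mean()) / r.std(ddof=1)
    # Standard normal CDF via the error function.
    cdf = np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z])
    plotting_pos = (np.arange(1, n + 1) - 0.5) / n
    expected = (2.0 / np.pi) * np.arcsin(np.sqrt(plotting_pos))
    observed = (2.0 / np.pi) * np.arcsin(np.sqrt(cdf))
    return expected, observed

# Assumed example: Gaussian residuals should hug the diagonal.
rng = np.random.default_rng(5)
expected, observed = stabilized_probability_points(rng.normal(0.0, 10.0, 500))
print(f"largest departure from the diagonal: {np.max(np.abs(expected - observed)):.3f}")
```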
Close the Residuals window.
Click OK to exit the Parametric Interpolation procedure. Answer Yes to update the data table. Answer No when asked to save the current data table.
The 1024-point interpolated uniform data certainly bear little resemblance to the unevenly sampled data that were imported. Although there is no practical benefit to a Fourier analysis at this point, we will make one to confirm the presence of the three desired spectral components.
Confirming The Parametric Interpolation
Select the Fourier Spectrum with Data Window option in the Spectral menu or toolbar. Set the window to cs4 BHarris min, Nmin to 1024, set the plot to dB Norm, and set the signal count (sig) to 3.
The three components are present as expected. Note that the parametric interpolation procedure also functions as a noise filter, preserving only the harmonics.
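A comparable check on uniform data can be sketched with a windowed FFT. The code below assumes that the cs4 BHarris min setting corresponds to the standard 4-term minimum-sidelobe Blackman-Harris window and that dB Norm means decibels relative to the largest peak. Both are assumptions, as are the function names and the noise-free synthetic data.

```python
import numpy as np

def blackman_harris4(n):
    """4-term minimum-sidelobe Blackman-Harris window (periodic form)."""
    a0, a1, a2, a3 = 0.35875, 0.48829, 0.14128, 0.01168
    k = 2.0 * np.pi * np.arange(n) / n
    return a0 - a1 * np.cos(k) + a2 * np.cos(2.0 * k) - a3 * np.cos(3.0 * k)

def db_spectrum(y, dt):
    """Windowed amplitude spectrum in dB relative to the largest peak."""
    y = np.asarray(y, dtype=float)
    n = y.size
    amp = np.abs(np.fft.rfft((y - y.mean()) * blackman_harris4(n)))
    freqs = np.fft.rfftfreq(n, d=dt)
    db = 20.0 * np.log10(np.maximum(amp, 1e-300) / amp.max())
    return freqs, db

# Assumed uniform data: 1024 points spanning 0 to 10.
n, dt = 1024, 10.0 / 1024
t = np.arange(n) * dt
y = sum(100.0 * np.sin(2.0 * np.pi * f * t + p)
        for f, p in ((2.0, 1.5 * np.pi), (5.0, np.pi), (8.0, 0.5 * np.pi)))

freqs, db = db_spectrum(y, dt)
idx = np.flatnonzero((db[1:-1] > db[:-2]) & (db[1:-1] > db[2:])) + 1
peaks = np.sort(freqs[idx[np.argsort(db[idx])[-3:]]])
print("largest peaks at:", peaks)
```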
Close the Fourier procedure.
Time-domain alternatives are available for interpolating a uniform data sequence from unevenly sampled values.
The Spline Estimation option offers a variety of interpolating and smoothing splines.
The Non-Parametric Estimation option offers locally-weighted regression.
Both procedures are of value only when high frequency components are absent. Time-domain interpolation can also introduce spurious spectral components. If only low frequency information is present, however, these forms of interpolation can be very straightforward and effective. In general, the harmonics should be in the lower quarter of the Nyquist range.
To illustrate the danger of time-domain interpolation when high frequency components are present, we will reload the original data set and create uniformly spaced data using a cubic spline.
Select sample.xls from the most recently used files list at the bottom of the File menu. Click on (8)Uneven!A: Uneven Time and (8)Uneven!C: S1,SN20dB. Click OK to accept the data and OK once again to accept the default titles.
Select the Spline Estimation option from the Time menu or toolbar. Be sure the Cubic spline is selected and that the Function is being output. Set the output n to 1024.
It should be readily apparent that time-domain interpolation requires a sufficient number of points to define each oscillation in the spectrum. That is clearly absent here.
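The same effect can be reproduced with a small natural cubic spline written in NumPy. The sketch compares the interpolation error for a low frequency sinusoid and a high frequency sinusoid sampled at the same uneven times. The natural end conditions, the function name, and the synthetic sample times are assumptions; the spline options in AutoSignal may differ.

```python
import numpy as np

def natural_cubic_spline(x, y, xq):
    """Interpolating cubic spline with zero second derivative at both ends,
    evaluated at query points xq (x must be strictly increasing)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xq = np.asarray(xq, dtype=float)
    n = x.size
    h = np.diff(x)
    slope = np.diff(y) / h
    A = np.zeros((n, n))
    rhs = np.zeros(n)
    A[0, 0] = A[-1, -1] = 1.0                       # natural end conditions
    for i in range(1, n - 1):
        A[i, i - 1:i + 2] = h[i - 1], 2.0 * (h[i - 1] + h[i]), h[i]
        rhs[i] = 6.0 * (slope[i] - slope[i - 1])
    m = np.linalg.solve(A, rhs)                     # second derivatives at the knots
    i = np.clip(np.searchsorted(x, xq, side="right") - 1, 0, n - 2)
    left, right = xq - x[i], x[i + 1] - xq
    return ((m[i] * right ** 3 + m[i + 1] * left ** 3) / (6.0 * h[i])
            + (y[i] / h[i] - m[i] * h[i] / 6.0) * right
            + (y[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * left)

# Assumed uneven sample times and a uniform 1024-point output grid.
rng = np.random.default_rng(6)
x = np.sort(rng.uniform(0.0, 10.0, 95))
grid = np.linspace(x[0], x[-1], 1024)

def rms_error(freq):
    """RMS difference between the spline and the true sinusoid on the grid."""
    truth = lambda u: 100.0 * np.sin(2.0 * np.pi * freq * u)
    return np.sqrt(np.mean((natural_cubic_spline(x, truth(x), grid) - truth(grid)) ** 2))

err_low, err_high = rms_error(0.5), rms_error(8.0)
print(f"rms error at frequency 0.5: {err_low:.2f}, at frequency 8: {err_high:.2f}")
```

With only about one sample per cycle at a frequency of 8, the spline has nothing to anchor the oscillations, so the error is far larger than for the slowly varying sinusoid.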
Click OK to exit the Spline Estimation procedure. Answer Yes to update the data table.
Again select the Fourier Spectrum with Data Window option in the Spectral menu or toolbar. Zoom in on the range between 0 and 10 to include the frequency band where the peaks are known to be present.
Although the lowest frequency peak at 2.0 is preserved, the remainder of the spectrum is nonsense. Again, time-domain interpolation routines should only be used when the spectral content is in the lower quarter and ideally the lower eighth of the Nyquist range.
Close the Fourier procedure.