
Kurtosis as the fitness function for a Genetic Algorithm to determine the repeat period of a Pulsar

Dave Robinson
Joined: 2010-04-29
Posts: 196

 

An interesting alternative to the Frequency Domain method I previously published for automatically obtaining the repetition rate of a Pulsar appears to be possible using a Genetic Algorithm. This has the advantage that, although the computation is relatively long-winded, it can easily be broken down into parallel computations, each operating on its own part of the repeat period 'spectrum'.

 

You start off with a population of random guesses at the repeat period. For each guess, you resample the data so that there are, say, 4096 samples across the guessed period, then reshape (fold) the data vector into a matrix one guess period wide, with as many rows as the data contains whole periods. You then simply take the mean of each column (i.e. synchronous integration). The resulting waveform is the average profile of the data at the guessed period.
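The folding step above could be sketched in Python with NumPy roughly as follows. This is only a minimal illustration: the function name `fold_profile`, its parameters, and the use of linear interpolation for the resampling are all assumptions, as the resampling method is not specified here.

```python
import numpy as np

def fold_profile(data, period_samples, bins=4096):
    """Average profile of `data` folded at a trial period (given in input samples)."""
    data = np.asarray(data, dtype=float)
    n_periods = int(len(data) // period_samples)
    if n_periods < 1:
        raise ValueError("data is shorter than one trial period")
    # Resample so that each trial period spans exactly `bins` samples.
    # Linear interpolation is an assumption made for this sketch.
    t_new = np.arange(n_periods * bins) * (period_samples / bins)
    resampled = np.interp(t_new, np.arange(len(data)), data)
    # Fold: one row per period, one column per phase bin,
    # then average each column (synchronous integration).
    return resampled.reshape(n_periods, bins).mean(axis=0)
```

With an impulse train repeating every 50 samples, folding at 50 stacks every pulse into the same phase bin, while folding at 47 smears the pulses across the whole profile.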

 

The closer the guess is to the real repeat period, the less like Gaussian noise the profile appears. As Rob and Anders have pointed out, Kurtosis is a very good measure of how non-Gaussian a signal is, so it makes an excellent fitness function to drive the Genetic Algorithm. The algorithm preferentially selects the guessed periods that have a higher Kurtosis, and so continues to drive the estimates closer to the real repeat period.
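As a rough sketch of how Kurtosis could drive the search, the Python below pairs a Kurtosis measure with a very small real-valued Genetic Algorithm. Several things here are assumptions rather than the method actually used: "normalized" Kurtosis is taken to mean excess kurtosis (fourth standardized moment minus 3), and the selection scheme (keep the top quarter, blend crossover, shrinking Gaussian mutation) and the names `excess_kurtosis` and `ga_search` are illustrative choices only.

```python
import numpy as np

def excess_kurtosis(x):
    """Fourth standardized moment minus 3 (about 0 for Gaussian noise)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    var = np.mean(d ** 2)
    if var == 0.0:
        return 0.0  # flat profile: treat as no evidence of a pulse
    return float(np.mean(d ** 4) / var ** 2 - 3.0)

def ga_search(fitness, lo, hi, pop_size=40, generations=30, seed=0):
    """Maximise fitness(period) over [lo, hi] with a tiny real-valued GA."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(lo, hi, pop_size)      # random initial guesses
    sigma = 0.1 * (hi - lo)                  # mutation width (assumed)
    n_elite = max(2, pop_size // 4)
    for _ in range(generations):
        scores = np.array([fitness(p) for p in pop])
        # Selection: keep the fittest quarter unchanged (elitism).
        elite = pop[np.argsort(scores)[::-1][:n_elite]]
        n_child = pop_size - n_elite
        # Crossover: random blend of two elite parents, plus Gaussian mutation.
        a = rng.choice(elite, n_child)
        b = rng.choice(elite, n_child)
        w = rng.uniform(0.0, 1.0, n_child)
        children = w * a + (1.0 - w) * b + rng.normal(0.0, sigma, n_child)
        pop = np.concatenate([elite, np.clip(children, lo, hi)])
        sigma *= 0.9                         # narrow the search over time
    scores = np.array([fitness(p) for p in pop])
    best = int(np.argmax(scores))
    return float(pop[best]), float(scores[best])
```

In use, `fitness` would be something like `lambda p: excess_kurtosis(profile_at(p))`, where the hypothetical `profile_at(p)` stands for the fold-and-average step described earlier. Since the Kurtosis peak is likely to be narrow, the population size and mutation width would need tuning on real data.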

 

The following graph shows what I have called the Kurtosis Spectrum, which is simply a plot of normalized Kurtosis against proposed Pulsar Period for the B0329+54 Pulsar. Periods are measured in Setisecs, i.e. 2^23 samples of the original data, which had been low-pass filtered at 128 cycles/Setisec and decimated to 4096 samples/Setisec. I am cheating here, as I already know the true pulse rate from my previous work, so it is relatively easy to ensure that the range I have selected extends either side of the true pulse rate.

 

 

 

The x axis is the number of pulses per Setisec, and the y axis is the Normalized Kurtosis value.
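For anyone wanting to reproduce a plot of this kind, a brute-force scan over a grid of trial periods might look like the Python sketch below. It is an illustration under assumptions, not the code that produced the graph: the names are hypothetical, linear interpolation is assumed for the resampling, "normalized" Kurtosis is assumed to mean excess kurtosis, and trial periods are given in input samples rather than pulses per Setisec.

```python
import numpy as np

def fold_profile(data, period_samples, bins):
    """Resample to `bins` samples per trial period, fold, and average."""
    n_periods = int(len(data) // period_samples)
    t_new = np.arange(n_periods * bins) * (period_samples / bins)
    resampled = np.interp(t_new, np.arange(len(data)), data)
    return resampled.reshape(n_periods, bins).mean(axis=0)

def excess_kurtosis(x):
    """Fourth standardized moment minus 3 (about 0 for Gaussian noise)."""
    d = x - x.mean()
    var = np.mean(d ** 2)
    return 0.0 if var == 0.0 else float(np.mean(d ** 4) / var ** 2 - 3.0)

def kurtosis_spectrum(data, trial_periods, bins=64):
    """Kurtosis of the folded profile at each trial period."""
    data = np.asarray(data, dtype=float)
    return np.array([excess_kurtosis(fold_profile(data, p, bins))
                     for p in trial_periods])
```

Plotting the returned array against `trial_periods` gives the spectrum, and the trial period at its maximum is the one whose folded profile looks least like Gaussian noise.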

 

The profile corresponding to the peak Kurtosis is shown below:

 

 

 

This is virtually identical to the profile I published earlier.

 

Hope that somebody finds this concept of interest.

 

Regards

 

Dave Robinson 

gerryharp
Joined: 2010-05-15
Posts: 365
pretty interesting

Hi Dave

I'm interested in this nice result. What is the ratio between the delay peak and the "noise" in the delay region below the peak? A log scale would answer this question. This gives a measure of the "SNR" for this algorithm.

I don't understand why the high-delay points are so noisy. Is there some re-normalization going on that enhances the noise? 

I guess you did this over only a small range because it took time? It would be interesting to look on much shorter (10x) or much longer (10x) timescales for unexpected variations...

Thanks

Gerry

Dave Robinson
Joined: 2010-04-29
Posts: 196

I have to admit that I haven't done any further work on this concept, as my poor little laptop isn't the ideal machine to tackle a genetic algorithm search. The primary motive for publishing this letter was thinking that it may well be an ideal algorithm for implementing over a large network of machines (SETI@home style) whereby each node of the network would be allocated a region of the kurtosis spectrum to analyse.
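The allocation of regions to nodes could be as simple as the Python sketch below, which cuts a period range into equal, contiguous slices. The function name and the equal-width split are assumptions for illustration; a real scheme might prefer overlapping slices so that a peak on a boundary is not missed.

```python
def split_period_range(lo, hi, n_nodes):
    """Split [lo, hi] into n_nodes contiguous (start, stop) slices, one per node."""
    if n_nodes < 1:
        raise ValueError("need at least one node")
    # Compute the interior edges from the full span to limit rounding drift,
    # and pin the final edge to `hi` exactly.
    edges = [lo + (hi - lo) * i / n_nodes for i in range(n_nodes)] + [hi]
    return list(zip(edges[:-1], edges[1:]))
```

Each node would then run its own search, or a plain scan, over the slice it was given and report back its best period and Kurtosis value.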

I have often wondered about the algorithm noise further upstream from the main spike, regarding whether it was indicating real data repeating at a different rate from the main pulsar period, or a measurement of the noise floor of the algorithm. However, when there was zero feedback after I initially published the note, I assumed that once again I was shouting in an anechoic chamber, and that the idea was of no interest to anyone but me, and hence dropped this line of research.

I am currently looking at applying Wavelet analysis to the waterfall plots. The argument is that although the signal spikes are almost indistinguishable from noise spikes along the frequency axis - i.e. their energy occurs at the fast, single-'pixel' scale - along the time (vertical) axis their energy will be distributed over much slower scales. For example, a single vertical line will provide a hit at the slowest scale, as it contributes to every 'pixel' down the column. Off-axis, sloped signals will put energy into the intermediate scales, depending on their slope. I hope to have some results in the very near future, and will publish them as soon as possible.

Regards

Dave Robinson