I did a detailed signal analysis of the Kepler-Exo4 data file and posted it to my blog. This file has a lot of interesting signal features in it: a stationary tone, several drifting random walks, hydrogen, and, most amazing of all, an FSK-modulated signal.
Q: What was the celestial source of this data file?
The source for this observation was Kepler 4b (the star associated with the newly discovered exoplanet, not the Kepler satellite observatory), see:
That *is* an interesting signal! It is the type of signal that various SETI projects have traditionally searched for: one with little to no modulation, so that the energy is not spread too widely, which, as you know, maximizes SNR in frequency space. The repeating "10" (or "01") bit pattern has a Huffman encoding "feel" to it. I agree it would be nice to have a longer observation. As more data is posted, a re-observation of Kepler 4b will be made available. I don't know when that will be.
Thank you for confirmation that the signal source was the Kepler 4b star. I'm surprised by the presence of all the strong tones. The one in the left side filter roll-off is very strong. I thought the water hole was supposed to be a quiet microwave window?
The demodulated "10" string is definitely the most prevalent and it makes the bit stream have more of a carrier training look to it. I noticed the complete lack of any "11" strings which has me thinking something like NRZI coding.
I recently added some spectrogram images with baudline's periodicity bar overlays that can be used to manually demodulate the signal if anybody is interested. I also translated the bit stream into hexadecimal for a different point of view and to make it easier for others to have a crack at decoding the stream.
For fun I plugged the bit stream into a programmer's calculator application that has bit rotate, 1's complement, ... and an ASCII display. It is entertaining to press the buttons and try all the permutations but so far nothing. I really need better bit exploration tools.
Interesting. I support your idea that it might be something like NRZI coding. NRZI coding will allow "11" (and "00") from my understanding, but perhaps instead it is a related type of coding that helps the receiving station maintain bit synchronization at the cost of additional transitions.
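To make the NRZI point concrete, here is a minimal sketch in Python. It assumes the common "transition on 1" convention (a 1 toggles the line level, a 0 holds it); other conventions exist, and nothing here is claimed to be what the Kepler 4b signal actually uses. The function names are made up for illustration.

```python
def nrzi_encode(data_bits: str, initial_level: int = 0) -> str:
    """NRZI, transition-on-1 convention: a 1 toggles the line level, a 0 holds it."""
    level = initial_level
    out = []
    for bit in data_bits:
        if bit == "1":
            level ^= 1  # toggle between 0 and 1
        out.append(str(level))
    return "".join(out)


def nrzi_decode(line_levels: str, initial_level: int = 0) -> str:
    """Inverse of nrzi_encode: a level change decodes as 1, no change as 0."""
    prev = initial_level
    out = []
    for ch in line_levels:
        level = int(ch)
        out.append("1" if level != prev else "0")
        prev = level
    return "".join(out)


print(nrzi_encode("1111"))  # all-ones data gives a strictly alternating line
print(nrzi_encode("1001"))  # zeros in the data hold the level, so repeats appear
```

With this convention, a line stream that never repeats a level can only come from all-ones data, so plain NRZI by itself does not explain a stream that has "00" but never "11".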
Unfortunately for SETI and radio astronomy in general, the span from hydrogen (1420 MHz) to hydroxyl (~ 1670 MHz), the "waterhole," is littered with radio signals. Only small slivers of spectrum are protected for radio astronomy and even those are violated. See this image of the relevant section clipped from a U.S. Frequency Allocation chart.
I re-demodulated the FSK signal and fixed a couple of bit errors. All this decoding of bits from the spectrogram and building that reduction grammar got me thinking. If I were an ET, maybe I wouldn't do FSK with mark and space bits but would instead do a form of differential count coding, where each mark/space toggle is a count increment and counts are separated by non-activity. It would be a waste of bandwidth, but maybe it would travel long distances through hydrogen clouds and stuff better.
So here are a couple of possible sequences, with the multiple lines being different possible bit error permutations of count indexes 3 and 4. The dash represents a longer pause. The 10+ is because of the end of file, and it means 10 or more.
2 1 5 1 - 5 15 10+
2 1 5 2 - 5 15 10+
2 1 4 1 - 5 15 10+
2 1 4 2 - 5 15 10+
So do these sequences mean anything to anybody?
I just found this AT&T On-Line Encyclopedia of Integer Sequences and I just had to share it. You type in a numeric sequence and it searches its database.
Too cool. No hits for my sequence though.
Just some thinking. The data is from the array pointed at Kepler4b, discovered January 4th 2010. When the data was recorded, was the Kepler satellite in the proximity?
The Kepler satellite is using X band and Ka band to communicate. (see wiki: http://en.wikipedia.org/wiki/Kepler_Mission ). The data sequence Kepler-Exo4 is sampled with a base frequency of 1420 MHz, which is at the end of the IEEE L band. I read in another search that "The L band refers to the frequency range of 950 MHz to 1450 MHz. It is the result of the downconversion of the received downlink satellite signals (C, Ku or Ka) by the LNB (Low-noise block converter)." (see wiki: http://en.wikipedia.org/wiki/L_band )
Could we be observing a sat nav signal? Is there any way to check whether, at the precise time the signal was recorded, there was no satellite in the proximity? Could even be a GLONASS nav sat at this frequency... just a thought. I will continue thinking and searching.
Still reading a lot, I'm a noob in the domain :P ... we all have to start somewhere! Just found out that Kepler4b is at 1787.5 light years (?) ... if they are listening to earth now, they would be getting what we were transmitting in 222 AD = nothing. Hummm...
Hello again, where can we find the exact date (2010-01-22?) and time (start/end) the data was recorded for Kepler4b data files? I looked at the text file (description) but there is no reference. Thanks.
start capture: UTC 2010-01-23 00:53:56
J2000 RA/Dec. 19.04102, 50.13574
HCRO ATA location WGS84 long., lat., height (m):
-121.47180, 40.81736, 1018.69
Just to mention that the Kepler 04b data was taken, literally, on the first day we began (attempting) to observe with the system. Since that time, Billy Barott has updated the firmware in the beamformer to fix bugs that appeared (or sometimes not) in the data.
I'm not saying that this data is necessarily bad. Just that we should be skeptical.
This time of year, that star is up after 4pm and until 8 am. This is almost perfectly incommensurate with our observation schedule. This set of data has been very interesting, so we will try to get another look at this star when we can.
I was wondering: is the Doppler shift due to the star's radial velocity equally large at all frequencies?
I am only guessing that a shift that is small at short wavelengths (around 5500 Å) would be much bigger at long wavelengths.
Is that correct?
If yes, I would really like to compare sigblips's analysis of another dataset of Kepler 4b with the older one!!
OK, according to the Doppler calculations, keeping the relative speed constant and moving to a longer wavelength gives a larger Δλ. In frequency terms it goes the other way: Δf = f·v/c, so the shift in Hz is smaller at lower frequencies. Even so, if the star Kepler-4 "transmits" any signal within our bandwidth, we should observe a measurable shift in frequency. With its planet's orbital period of about 3.2 days, it would be possible to record data from a different part of the orbit. Let's see...
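For anyone who wants to check the arithmetic, here is a small sketch of the non-relativistic Doppler relations Δλ = λ·v/c and Δf = f·v/c, evaluated at 5500 Å and at 1420 MHz. The 30 km/s radial velocity is purely an illustrative assumption, not a measured value for Kepler-4.

```python
C = 299_792_458.0  # speed of light, m/s


def doppler_shifts(wavelength_m: float, radial_velocity_ms: float):
    """Return (delta_lambda in m, delta_f in Hz) for v << c."""
    freq_hz = C / wavelength_m
    beta = radial_velocity_ms / C  # same fractional shift at every wavelength
    return wavelength_m * beta, freq_hz * beta


V = 30_000.0  # m/s, illustrative assumption only

optical = doppler_shifts(5500e-10, V)    # 5500 angstroms
radio = doppler_shifts(C / 1420.0e6, V)  # 1420 MHz, about 21 cm

print(f"optical: dlambda = {optical[0]:.3e} m, df = {optical[1]:.3e} Hz")
print(f"radio:   dlambda = {radio[0]:.3e} m, df = {radio[1]:.3e} Hz")
```

The fractional shift v/c is identical in both cases. The wavelength shift is larger at 21 cm, while the frequency shift in Hz is larger in the optical.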
The SETI Institute did a re-observation of Kepler-4, they uploaded the data, and my analysis of it is here:
and the setiQuest discussion thread is here:
I have Windows operating system and would like to investigate the archives "dat" with the program Matlab. I want to ask what is the process in order to interpret these data with Matlab.
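I can't speak for Matlab specifics, but as a starting point here is a minimal Python sketch of reading such a file. It assumes the .dat files are raw, headerless, interleaved signed 8-bit samples in I, Q, I, Q order. The "signed char I/Q" part is mentioned elsewhere in this thread; the interleave order and the function name are my assumptions. In Matlab, `fread` with an `'int8'` precision should be the equivalent starting point.

```python
from array import array


def read_iq_int8(path, max_samples=None):
    """Read interleaved signed 8-bit I/Q pairs into a list of complex samples.

    Assumes a raw file with no header and I before Q (an assumption).
    """
    with open(path, "rb") as fh:
        data = fh.read() if max_samples is None else fh.read(2 * max_samples)
    data = data[: len(data) - (len(data) % 2)]  # drop a dangling half-pair
    raw = array("b")  # 'b' = signed char
    raw.frombytes(data)
    return [complex(raw[i], raw[i + 1]) for i in range(0, len(raw), 2)]
```

Usage would be something like `samples = read_iq_int8("some-file.dat", max_samples=100000)`, after which the complex samples can be fed to an FFT of your choice.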
I have no idea what kind of process created this signal. But I keep wondering about the mark and space transitions. The signal seems to chirp between the two states. That should not happen in true FSK modulation, where you would expect a fast fade-out and fade-in of the tones, not chirping.
That chirp, added to the slow baud rate and the drifting, makes me think of some mechanical process, like the turning on and off of a motor, turbine, fan, thermostat, or something like that.
What do you think?
The transitions between mark and space are actually quite sharp. They just look like they have some slope because of the point of view. We're really pushing the limits of what can be resolved here. It's a time-frequency duality thing.
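To put rough numbers on the time-frequency trade-off, here is a back-of-the-envelope sketch using the parameters quoted elsewhere in this thread (0.5061 baud, 1.2 Hz between mark and space). The rule of thumb that resolving two tones Δf apart needs an analysis window of at least about 1/Δf seconds is a simplification; real window functions need somewhat longer.

```python
BAUD = 0.5061   # symbols per second, from the analysis in this thread
SHIFT_HZ = 1.2  # mark/space separation in Hz, from the analysis in this thread

bit_period_s = 1.0 / BAUD      # duration of one bit
min_window_s = 1.0 / SHIFT_HZ  # rule-of-thumb window needed to split the tones
windows_per_bit = bit_period_s / min_window_s

print(f"bit period      : {bit_period_s:.3f} s")
print(f"minimum window  : {min_window_s:.3f} s")
print(f"windows per bit : {windows_per_bit:.2f}")
```

Only a couple of independent spectral slices fit inside each bit, so a sharp transition will inevitably be smeared over a good fraction of a bit period in the spectrogram.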
It's interesting that you mentioned a mechanical process that has some rotational inertia causing an exponential chirp shape between transitions. I thought that too at first. You are right that something mechanical with mass would generate exponential like chirps. For educational purposes, here are two great examples of just that:
Error-correction coding is another consideration.
Yeah, could be, but it is kind of difficult to know much with only 94 bits. All those alternating "01" strings have me thinking that this is a modulator training sequence or that it is something other than FSK.
Just found out what the message transmitted by the Arecibo antenna in 1974 looked like... interesting to see the usage of binary. Thought I would share it with you and here seemed a good place:
and I also enjoyed this link:
How awesome. Sure you didn't just put it in there to get us excited about the project? :) If so, it worked.
I don't know by which criteria Avinash selected the first two observed sources; however, I suspect, ironically, that it was the other source - the AMC-7 communications satellite - that was "put in there" because it contains many strong example signals.
I think the signal in the Kepler 4 data is interesting, but at the risk of diminishing your enthusiasm, I must reveal that we occasionally see signals like this.
To date, no signal considered a candidate has passed subsequent tests: such as persisting, disappearing and reappearing in the data as the telescope is pointed away and back towards the source multiple times, having a believable Doppler signature, and appearing in the data of another observatory pointed towards the source.
The most common explanation for a signal that fails these tests is that the signal was detected from a nearby source in a sidelobe of the telescope beam pattern or the radiating object, such as a satellite, coincidentally passed through the main lobe of the telescope beam pattern.
Another suspicious thing about this data file is all the tones scattered about the 8.7 MHz of spectrum and how strong they are. All those tones would waste a lot of energy to broadcast as opposed to just one. Also, the tones are all doing very different things: one is stationary, three are drifting at +0.089 Hz/sec, and the FSK-like tone is drifting at +0.0132 Hz/sec. Another odd thing: except for the stationary tone, why are the signals doing a random walk as they slowly drift? Is that normal?
What has me puzzled is the FSK-like signal. The baud rate is extremely low (0.5061 baud) and the mark / space frequencies are very close together (1.2 Hz). Nobody transmits this slowly, even the extremely low submarine communications stuff is two orders of magnitude faster. The signal is also drifting. What could cause something like that?
I should mention that I re-demodulated the FSK-like signal by hand and I had a couple of bit errors so I fixed those. Just in case anybody looked at the bits and wants to try again, the bit stream is slightly different now.
I recently discovered a technique to improve frequency resolution in a time-frequency domain (spectrogram). It's called spectrum reassignment, the math is here http://en.wikipedia.org/wiki/Reassignment_method but I suppose you could find a more educational source.
Maybe you are interested in implementing it in baudline. The limitations are that it cannot separate two signals if they are too close and that it needs a high carrier/noise ratio in the FFT bins close to the signal.
Maybe you want to take a look at the "Spectrum Laboratory" software, for Windows, by DL4YHF (ham callsign). It's free and you can download it from http://www.qsl.net/dl4yhf/spectra1.html
I'm currently trying to find out whether Spectrum Laboratory can read an I/Q signed char file. Its main purpose is to use a sound card, but it can read ADC files, so maybe the author can implement I/Q file reading.
Spectrum Lab supports decimation, high-resolution FFT, I/Q, and so on. It's worth a look.
I've never heard of the Reassignment Method before, so thank you for the link. I'll read up on it. It's funny that you ask whether I'm "interested in implementing it in baudline." Why? Check out my Kepler Exo4 blog post again. I just added three new sections:
I've been working on the new transform mentioned in the Enhance Resolution section for over the past year and it will be in the next version of baudline 1.08 that should be released real soon.
I find this all very interesting! I'm curious about these subsequent tests you mentioned: have any been conducted on this signal? Have other observatories looked in this area and checked for a signal like this? And the Doppler signature (this is something I know absolutely nothing about, by the way), what we've seen so far: is it discouraging, encouraging, or inconclusive?
ET or not, this is fascinating stuff, even to an outsider like myself. I can't wait to read more about this, and I hope that more data comes out soon!
Hi, I'm relatively new to looking at unidentified signals, decoding, and the like. But I took a look at the bits you extracted and I found some interesting patterns:
At first, I didn't see any patterns in 8-bit segments (like what you posted on your site, in hex). So when I started to break it up differently, patterns emerged. In Analysis #1 it's separated into groups of 5 bits (who knows, maybe ET doesn't use powers of 2 for byte representation?). One of the most important things I noticed was that segments of 20 bits (possible commands or headers of some sort) are followed by differing amounts of data.
Before I got ahead of myself, I decided to see if any other patterns emerge while splitting the data up in 10's (10's seem more "natural" than 5's, so maybe I had it wrong). For Analysis #2 I split it up into groups of 10 bits. No outstanding patterns seemed to show up. So I thought about the possibility of error (since you had corrected the code yourself). It seems that two of the 0's in the latter half of the code are out of place. Is it possible that the signal drifted or distorted in some way so that one 0 was "stretched" into two? I removed these possibly-out-of-place 0s and a small pattern emerges. But it seems that for such a long signal, relatively little information shows up here. I find this to be the least likely usage, but decided to post it anyway.
For Analysis #3 I wanted to see if groups of 8 bits had more pronounced patterns. I found nothing. However, when I returned to the idea of error that I had used in Analysis #2, I removed those out-of-place 0s and another small pattern emerges. Still relatively little information, but possibly a code nonetheless. Additionally, I swapped a 1 on the third line of Analysis #3. I mostly did this to explore the possibility of error breaking up a genuine pattern or code. I understand that some people may think so much bit manipulation is wishful thinking, but I think we should be open to the possibility of such errors. I'm merely exploring possible patterns.
I believe that Analysis #1 contains the most useful information. But whatever that information is, it is a mystery to me. I'm no cryptanalyst, but I like solving problems, so if you think my work is complete BS just say so. Hopefully my code poking will help someone else decode the entire thing :) Also, if SETI could release the data set that completes this signal, that would be really helpful. I'm interested to see what comes later in the code.
Breaking the bit stream into 5 bit words seems like a good approach. You might want to also try 6 and 7 bit words since they are used by some text communication protocols.
Yes, there definitely could be some demodulated bit errors. The signal was drifting and undergoing a random walk so the bit decision point is moving, vague, and at times ambiguous. Not an easy demodulation. It was more difficult than counting hanging chads in Florida! (:
It would be great if the SETI Institute released a longer version of the data file but that might be all they collected.
Along with different bit word lengths; inverting the bits, reversing the bit stream, and different starting bit offsets are also some useful ideas. Add in a few possible bit errors and you get a huge number of permutations to try. It sounds like a good job for a computer! (:
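Since it is a good job for a computer, here is a rough sketch of how that enumeration could look in Python. The bit string below is a made-up placeholder, NOT the demodulated stream from the blog post, and the function name and default word lengths are just illustrative choices.

```python
from itertools import product


def variants(bits: str, word_lengths=(5, 6, 7, 8)):
    """Yield ((invert, reverse, word_len, offset), words) for each permutation."""
    flip = str.maketrans("01", "10")
    for invert, reverse, word_len in product((False, True), (False, True), word_lengths):
        stream = bits
        if invert:
            stream = stream.translate(flip)  # swap 0s and 1s
        if reverse:
            stream = stream[::-1]            # read the stream backwards
        for offset in range(word_len):       # every possible starting bit
            body = stream[offset:]
            # split into whole words; a short leftover tail is dropped
            words = [body[i:i + word_len]
                     for i in range(0, len(body) - word_len + 1, word_len)]
            yield (invert, reverse, word_len, offset), words


PLACEHOLDER = "1010010100101010010100"  # hypothetical bits, for illustration only

for params, words in variants(PLACEHOLDER, word_lengths=(5,)):
    print(params, " ".join(words))
```

Allowing for single bit errors would multiply this again by the stream length, which is where the permutation count really takes off.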
sigblips, you have the signal pegged at 1420.586133 MHz but I don't see anything there: http://22.214.171.124/analysis/2010-01-22-kepler-exo4/waterfalls.php?cha...
I do however see something at 1420.032 MHz: http://126.96.36.199/analysis/2010-01-22-kepler-exo4/waterfalls.php?cha...
The file name is 2010-01-22-kepler-exo4-1420mhz-1-of-*.dat so I wrongly assumed that 1420 MHz is the base frequency. When you find out what the correct base frequency is, just add it to the +Hz / -Hz offsets I mentioned.
It would be nice if the .dat files had header files with this information.
I recently added a Listen and Autocorrelation section to my exo4 blog post. Some people might like to hear what the down mixed FSK signal sounds like. I'd also be interested in what people think about the Autocorrelation images. Definitely patterns and structure but of what? I have no idea.
I am curious as to why you conclude that the 'walk' of the signal contains information?
Also, I am wondering what the straight line of quite regularly spaced dots could signify.
The better question is do the patterns and structure in the Autocorrelation images have any significance? Because if they do then the 94 FSK bits are not enough information to create that. Or is the Autocorrelation image just random noise?
I guess another good question is if the demodulated bits contain information or are they just random bits?
Which straight line of regularly spaced dots? Horizontal or vertical? In the Autocorrelation section?
I see. Does autocorrelation of noisy parts of the spectrum produce very different images? Without the patterns?
As for the straight line of dots, I'm referring to the series of dots starting just above the 750 marker on the horizontal axis on the higher beta value images. Most of the rest of the image looks fairly random.
Autocorrelation is a measure of self similarity. Noise by its nature is random and has no self similarity. The Autocorrelation transform of white noise is a flat line except for a sharp spike at time zero.
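For anyone who wants to see this without baudline, here is a small self-contained sketch in plain Python. It computes a normalized autocorrelation of synthetic Gaussian white noise. The sample count, seed, and lag range are arbitrary choices, and this is a textbook estimator, not baudline's transform.

```python
import random


def autocorr(x, max_lag):
    """Normalized autocorrelation of x for lags 0..max_lag (direct sum)."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    r0 = sum(v * v for v in centered)  # zero-lag energy, used for normalization
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / r0
        for lag in range(max_lag + 1)
    ]


random.seed(1)  # fixed seed so repeated runs give the same numbers
noise = [random.gauss(0.0, 1.0) for _ in range(4000)]
r = autocorr(noise, 20)

print(f"lag 0: {r[0]:.3f}")
print(f"largest |r| for lags 1..20: {max(abs(v) for v in r[1:]):.3f}")
```

The zero-lag value is 1 by construction, and the other lags stay close to zero, on the order of 1/sqrt(N) for N samples.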
As a sanity test I just looked at a bunch of Autocorrelation responses with baudline from different random parts of the decimated spectrum. As expected they all looked flat except for the spike.
Those 5 dots are from a region slightly above where I started demodulating (see the first 2 images in the Demodulation section of my blog post). The mark and space frequencies were much closer together there, so I didn't think the FSK had started yet. The baud rate is the same though. I'm going to have to revisit the demodulation; there might be something else.
I created a short movie called: Kepler exo4 FSK Kaiser beta variations on Autocorrelation
Make sure you watch it in 720p HD.
It is a repeating loop of baudline Autocorrelation screenshots as I adjust the Kaiser window beta value from 0 to 40 and back to 0. Patterns and structure pop out of the noise and become redefined as the beta changes. Baudline is a very interactive program and this animation is actually how it looks when you adjust the beta parameter. The graphics are this fast and responsive.
The sound in the movie is the drifting random walking FSK signal that has been remixed a bit, speed and shifting, for the audio band. The audio DSP work was done in baudline real-time.
Nothing new to report on that FSK signal but the pulsar idea is intriguing. Since more people might see it here in this forum I thought I'd repost this from the Conclusion section of my blog post:
Some important questions about the FSK modulated signal:
* Is the signal's proximity of -500 kHz to hydrogen significant or is it an aliasing artifact?
* Are the other tones related in any way? (harmonically or temporally)
* Why is the signal drifting at a +0.0132 Hz/second rate? What should it be drifting at?
* Why is it undergoing a random walk?
* Why are the mark and space frequencies so close together? (1.2 Hz)
* Why is the 0.5061 baud rate so low?
* Do these modulation parameters match any known modem or system?
* Do the demodulated bits match any known line coding, preamble, or training sequence?
* Why is there not a single run of ones (11) in the bit stream?
* Are there any "interesting" sequences or patterns in the demodulated bits?
* Is there any significance to the Autocorrelation images?
* Will this signal ever be seen or collected again?
Does anyone have any answers or ideas?
I generated a sky view from the perspective of the ATA observatory at the 2010-01-23T00:53:56Z start time of the Kepler Exoplanet 4 observation that contained the interesting signal. The synthesized beam occupied the area and location of the purple circle next to the "KepExo4" tag within the dashed light-blue larger circle. Kepler Exoplanet 4 is far from the equatorial plane, where many satellites are found, and far from the ecliptic plane where most space probes are found.
It would be interesting to compute dfreq/dt due to acceleration of the ATA with respect to the source location due to Earth rotation to see if it matches the measurement made in your analysis. Other local contributions to Doppler shift, such as Earth orbit and Sun motion, would effectively have constant velocity over the period of this observation. But, the transmitter potentially could have a contributing acceleration.
A new observation of Kepler Exoplanet 4 was made last Friday. No analysis has been done. The data will be posted soon after it is copied to the cloud and formatted.
That is a very nice map. It would be great if every data set had an accompanying map like that. It would also be great if every data set had an expected df/dt due to Earth rotation and motion around the sun.
What is bothering me are the drifting-random-walks. Why so much wiggle? Could vibrations or variations in the interstellar medium cause something like this?
The Hz-level "wiggle" on the interesting Kepler Exoplanet 4 data signal, which is larger than one would expect given the modulation, may be further evidence, in conjunction with the mismatch between predicted and observed Doppler shift-rate, that this signal was from a nearby transmitter received in the sidelobes. Unlike the main lobe, which has a flat phase response, signals received in the sidelobes from a moving object can be subject to rapid and abrupt phase changes in the response pattern which can look like a frequency change in spectral analysis.
It could be. The mismatch between predicted and observed Doppler drift rate can also be interpreted that the source was rotating.
Here is an idea on a possible interstellar cause of the "wiggle." Imagine a source 100 light years away transmitting a very narrowband "pure tone" signal at Earth. Chances are that the difference of the velocities between the source and the target will be large. This means that each sample will sweep through a slightly different path of the interstellar medium. Assuming the interstellar medium has a non-uniform density, I mean it is lumpy, each sample will undergo a different dispersion. Now integrate these different effects of dispersion over the entire path and you'll get different time of arrivals or phase shifts for each sample. This could cause a random walk sort of wiggle.
Note that the wiggle's long-term effect on the Doppler drift is zero but the short-term would be, well, random and wiggly. Also note that I used the term "sample" because conceptually I thought it was easier thinking of time being discrete instead of continuous.
If the source was rotating it would have some initial Doppler drift applied to it and since different frequencies travel at different speeds through a medium, like a prism, more wiggle would be expected. Now if the transmitting source Doppler corrected for its own rotation there wouldn't be the prism dispersion but the compressions / expansions (phase shifts) would still cause some wiggle, just less of it.
Does any of this seem at all plausible? Another explanation is that the wiggle is just the bleeding and modulation of this:
That text above is actually a slyly disguised double integral that describes ISM signal propagation. I know I'm not the only person who has taken an Astrophysics class here. Does anyone have any thoughts on my narrowband ISM wiggle theory? Maybe I should draw a picture?
I found a picture and sort of answer of what I was describing here:
Thin screen scintillation model
Any news when the new data will be made available? Can't wait to see if the signals are still there.
Any news on this item by any chance?
The SETI League spreadsheet referenced by Anders for computing Doppler shift due to Earth rotation looks good to me. It matches closely, in this case, with one of my internal tools.
Using the spreadsheet, the predicted Doppler shift rate due to acceleration (0.0028 Hz/s) differs from the measured rate (0.0132 Hz/s).
For those interested in understanding the spreadsheet, the coefficient value 1.546111 found in the row 8 formulas, is the Doppler v/c term multiplied by 1e6 Hz/MHz. v is Earth rotation speed at the equator calculated:
(2 * pi * radius-of-earth-at-equator)/seconds-per-day
rackrman@dev:~$ octave -q
ans = 1.5461
The cosine and sine terms project the velocity vector onto the line-of-sight from the observatory to the source (in this case, Kepler Exoplanet 4).
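Here is the same coefficient calculation as a short Python sketch. The constants are my guesses: I don't know which radius, day length, or value of c the spreadsheet actually uses. The WGS84 equatorial radius, an 86400 s day, and c rounded to 3e8 m/s happen to land close to the quoted value.

```python
import math

# Assumed constants; the spreadsheet's exact values are not known to me.
R_EQUATOR_M = 6_378_137.0   # WGS84 equatorial radius, m
SECONDS_PER_DAY = 86_400.0  # solar day; a sidereal day would be slightly shorter
C_ROUNDED = 3.0e8           # speed of light rounded, m/s

v_equator = 2 * math.pi * R_EQUATOR_M / SECONDS_PER_DAY  # m/s
coeff_hz_per_mhz = v_equator / C_ROUNDED * 1e6           # Doppler v/c per MHz

print(f"equatorial rotation speed: {v_equator:.2f} m/s")
print(f"coefficient: {coeff_hz_per_mhz:.4f} Hz per MHz")
```

Using the exact speed of light or a sidereal day changes the result in the third or fourth significant figure, which hardly matters for this comparison.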
I'm curious, what is the "Hour angle"? Don't you need the time, date, longitude of the observatory, and RA of the target? I understand that with this spreadsheet you can calculate some kind of maximum Doppler.
So Doppler due to Earth's orbit around the Sun is not taken into account. The page says it can be about 1/5 of the total Doppler, so it can be an important contribution if Earth's rotation Doppler is small.
We are not concerned here with absolute Doppler shift due to the sum of all velocity vectors. We are interested in attempting to explain the observed change in frequency with respect to time of the Exoplanet 4 signal which if contributed by Doppler would be due to an acceleration (e.g., Earth rotation). Over an observation of only a few minutes, the change in velocity (acceleration) due to Earth orbit around the Sun and motion of the Sun in the galaxy, will be negligible.
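As a rough illustration of the rotation term, here is a sketch of the drift rate for a spherical Earth: df/dt = -(f/c)·ω²·R·cos(lat)·cos(dec)·cos(HA). This is a simplified textbook model (no site elevation, no flattening, no orbital terms), not the spreadsheet's formula or any internal tool. The hour angle of the actual observation is not given in this thread, so HA = 0 is used only to get an upper bound.

```python
import math

OMEGA = 7.2921159e-5  # Earth's sidereal rotation rate, rad/s
R_EARTH_M = 6_378_137.0  # equatorial radius, m (spherical-Earth simplification)
C = 299_792_458.0        # speed of light, m/s


def rotation_drift_hz_per_s(freq_hz, lat_deg, dec_deg, hour_angle_deg):
    """Doppler drift rate from Earth rotation alone, simplified model."""
    lat, dec, ha = (math.radians(a) for a in (lat_deg, dec_deg, hour_angle_deg))
    # centripetal acceleration projected onto the line of sight
    accel_los = OMEGA**2 * R_EARTH_M * math.cos(lat) * math.cos(dec) * math.cos(ha)
    return -freq_hz / C * accel_los


# ATA latitude and source declination as posted in this thread; HA = 0 is an assumption.
bound = rotation_drift_hz_per_s(1420.0e6, 40.81736, 50.13574, 0.0)
print(f"drift at transit (largest magnitude): {bound:.4f} Hz/s")
```

At transit this comes out at about 0.08 Hz/s in magnitude, and it shrinks toward zero as the hour angle approaches ±90 degrees, so the predicted value depends strongly on when the source was observed.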
Just to put a number on it, at 1.4 GHz our on-line system looks for signal drift rates up to 1 Hz / second (either positive or negative). This is considered to be a wide enough range to cover the combined rotation doppler due to earth and most exoplanets. ET transmitters that doppler much faster than this are expected to be rare.
But remember, the doppler drift rate is proportional to frequency, so higher drift rates will appear at higher frequencies.