Quickest way to go through a data set?

Joined: 2010-08-27
Posts: 65

How long does it take a professional to go through one data set and find all the signals, and how is this done? What information do they generate at the end of this process? If someone did go through all 181 data sets, what would this accomplish? After going through some of the data sets I feel like I am looking for Waldo without knowing what he looks like.

Here is how I have been going through the data sets. Convert the raw data to 8738 images, each 1000 pixels wide. Then go through the pictures as fast as possible. The fastest way I have come up with is to use IrfanView, a free image viewer/editor program. You can get it at http://www.irfanview.com/ .

To set up IrfanView to go through the pictures fast, click on the slideshow button and set the slide advancement "Automatic after" box to 0.200 seconds, or whatever works for you. You might want to click on the full screen options and set it to show images in original size, in case an image from the data set is too long and IrfanView tries to resize it. Then open the first picture from the data set and hit Shift-A to start the slideshow. If you see a signal, hit Shift-A to stop, back up to the image with the signal, and hit F8 followed by a number, which copies the current picture to whatever directory you designate. After that you will have a folder filled with all the signals you saw.

If you kept the file names sequential when you generated them, you can figure out the frequency of the signals. For example, if you have a picture file named 4405.bmp, you can figure out the frequency of the left side of the picture, and since each pixel is 1 Hz you can figure out the exact frequency of the signal. If the center frequency was 1420000000 Hz and the data set spans 8738132 Hz, then: (4405 * 1000) + (1420000000 - (8738132 / 2)) = 1420035934. Then you just add however many pixels from the left edge the signal sits in the picture, but since the signals move around so much it doesn't seem really important to get it exact.
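The frequency arithmetic above can be sketched in a few lines of Python. This is only an illustration of the formula in the post: the function names are made up, and it assumes the images are numbered from the low-frequency edge, are 1000 pixels wide, and cover 1 Hz per pixel, as described.

```python
from pathlib import Path


def index_from_filename(filename):
    """Return the sequence number from a name like '4405.bmp' (hypothetical helper)."""
    return int(Path(filename).stem)


def image_pixel_to_hz(image_index, pixel_x=0,
                      center_hz=1_420_000_000,
                      span_hz=8_738_132,
                      pixels_per_image=1000,
                      hz_per_pixel=1):
    """Frequency of column pixel_x in image number image_index.

    Assumes image 0 starts at the low edge of the band, i.e. at
    center_hz - span_hz / 2, and that every image is pixels_per_image wide.
    """
    band_start_hz = center_hz - span_hz // 2
    offset_hz = (image_index * pixels_per_image + pixel_x) * hz_per_pixel
    return band_start_hz + offset_hz


if __name__ == "__main__":
    idx = index_from_filename("4405.bmp")
    print(image_pixel_to_hz(idx))        # left edge of the picture
    print(image_pixel_to_hz(idx, 250))   # a signal 250 pixels in from the left
```

The left-edge value for 4405.bmp matches the hand calculation, 1420035934 Hz. Passing a pixel offset adds the "however many pixels" part.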

Here is a link to some pictures from a data set, with pictures of signals and pictures of just noise, so you can try out IrfanView without having to convert a whole data set: http://sharesend.com/bncsa If you go through 5 pictures per second you can do a whole data set in about 40 minutes while watching TV. The signals, even faint ones, seem to just pop out, even in your peripheral vision.

After going through 4 data sets:

  • 2010-1-05-HD060848-1420_1-8bit
  • 2010-1-05-moon-1420_1-8bit
  • 2010-11-05-BD114586-1420_1-8bit
  • 2010-11-05-HD172175-1420_1-8bit

I tried to tally all the signals, and here is the picture I came up with using a rough script and Gnumeric: http://imageshack.us/photo/my-images/846/fcount.png/ It just shows how many data sets have a signal in a given frequency range. If a range gets a 1, it means only one data set had a signal in that range. If it gets a 4, all 4 had at least one signal in that range.

So if a signal shows up in more than one data set, it is probably RFI? There are a few that only show up in one out of four data sets. Is that the main goal? Find signals that only show up in one data set, then go back and look to see if they are still there?

Maybe in addition to the raw data you could have a zip file filled with the pictures of the data, so people could go through the waterfall plots really fast however they choose, then submit the file names that they saw signals in without having to convert the data themselves.
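A tally like the one described above could be sketched as follows. The original script is not shown in the thread, so this is a guess at the idea rather than a reconstruction: the function names, the 100 kHz bin width, the data set labels, and the sample frequencies are all made up for illustration.

```python
from collections import defaultdict


def tally_by_range(signals_by_dataset, bin_hz=100_000):
    """Count, for each frequency range, how many data sets have a signal in it.

    signals_by_dataset maps a data set name to a list of signal frequencies
    in Hz. Returns {range_start_hz: number_of_data_sets}, sorted by frequency.
    """
    datasets_per_bin = defaultdict(set)
    for name, freqs in signals_by_dataset.items():
        for f in freqs:
            datasets_per_bin[int(f // bin_hz)].add(name)
    return {b * bin_hz: len(datasets_per_bin[b])
            for b in sorted(datasets_per_bin)}


def seen_in_one_only(counts):
    """Ranges where exactly one data set had a signal."""
    return [start for start, n in counts.items() if n == 1]


if __name__ == "__main__":
    # Invented frequencies, only to show the shape of the input.
    signals = {
        "A": [1_420_035_934, 1_420_036_100],
        "B": [1_420_035_000],
        "C": [1_421_500_000],
        "D": [],
    }
    counts = tally_by_range(signals)
    print(counts)
    print(seen_in_one_only(counts))
```

Several signals from the same data set in one range count once, which matches the "1 means only one data set" reading of the plot.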

sigblips
Joined: 2010-04-20
Posts: 733
The problem of SETI is not that of finding signals, that's easy. The problem of SETI is determining which signals are not RFI. There is an incredible amount of RFI in the setiData files. To determine if a signal is RFI or not requires more information than is available to you. The SETI Institute uses a real-time system called SonATA to do this automatically.

The SETI Explorer app allows a person to quickly zip through a bunch of spectrogram images, just as you described.

jrseti
Joined: 2010-07-22
Posts: 252
Your Waldo analogy.

After going through some of the data sets I feel like I am looking for waldo without knowing what he looks like.

I like that. Following that analogy, I would be able to give you a hint: he looks like he does not belong with the others.

As far as our signals go - we are looking for signals that look like they do not belong. That could look like almost anything.


We do have the waterfall pictures for most of the data already created and online for use. So you really do not have to convert the raw data unless you want to create different kinds of visual representations or write your own algorithm that attempts to detect signals. The zip file you mentioned may not be best; you can just download whichever ones you want from our servers. Each image has its own URL.

The goal is to try to find signals that are possibly NOT RFI. If someone finds one that we have not detected before, we will either try to determine what it is from its characteristics, or go back and look at that target again to see if the signal is still there.

From the SETI side, our goals are:

  • Have people discover signals that our software missed.
  • If our software did miss signals, improve the software.
  • If a signal is found in the data, we will go back and observe the target again to see if the signal is still there.


sigblips
Joined: 2010-04-20
Posts: 733
That seems like a good plan. For the goal "have people discover signals that our software missed" the following needs to happen:

1) All of the setiData signal files need to be fed into SonATA.

2) The SonATA signal report files need to be posted on-line.

3) A tutorial needs to be written to help people interpret the SonATA report results.

4) People need to use a tool like baudline to discover signals that SonATA missed.

5) For each setiData set, an informative report needs to be written that discusses the missing signals, like the ones at http://baudline.blogspot.com/search/label/SETI . A simple screenshot is a start, but by itself it really isn't that useful. To be of value, some level of analysis is required.

6) A lively conversation between SETI Institute staff and the setiQuest community about each missing signal needs to take place. Is it really a signal? Why was it missed? ...

7) If it is determined that these signals should not have been missed by SonATA, then a fix is needed (parameters tweaked, new algorithms designed, code changes, ...). Then run the setiData files through the modified SonATA again, compare, and go back to step #1.
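The "compare" part of this loop could look something like the sketch below. The format of the SonATA report files is not described in this thread, so the function simply takes two plain lists of frequencies in Hz. The function name and the 50 Hz tolerance are assumptions, the latter chosen only because signals drift and an exact match is unlikely.

```python
import bisect


def missed_by_software(human_hz, reported_hz, tolerance_hz=50):
    """Return the human-spotted frequencies with no reported signal nearby.

    A human-spotted signal counts as already known if any reported frequency
    lies within tolerance_hz of it (inclusive).
    """
    reported = sorted(reported_hz)
    missed = []
    for f in human_hz:
        # First reported frequency that is >= f - tolerance_hz.
        i = bisect.bisect_left(reported, f - tolerance_hz)
        if i == len(reported) or reported[i] > f + tolerance_hz:
            missed.append(f)
    return missed


if __name__ == "__main__":
    # Invented numbers, only to show the call.
    print(missed_by_software([100, 500, 1000], [120, 990]))
```

Whatever comes back from a comparison like this is only a list of candidates. Per steps 5 and 6, each one would still need analysis and discussion before anyone calls it a missed signal.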