First look at setiQuest Explorer results

Hi all,

During the dot Astronomy meeting in Oxford last week ( www.dotastronomy.com ), Jill Tarter encouraged us to have a look at the results produced by the new setiQuest Explorer app ( http://setiquest.org/wiki/index.php/SetiQuest_Explorer ).

Below is a brief summary of the analysis we carried out, which was based on data kindly uploaded by Francis Potter at: https://github.com/hathersagegroup/seti/tree/master/exported_data/20110404

1. Merging the data
The data uploaded by Francis consists of separate CSV (text) files containing:
a) targets.csv: places in the sky that have been observed for the app;
b) observations.csv: times when a given target was observed;
c) assignments.csv: times when a portion of an observation was shown to a user;
d) pattern_marks.csv: places where users have marked a signal.

The data is stored in separate tables to avoid duplication (data from a single observation are assigned to multiple users, but the details of the observation are only stored once). The tables refer to each other using unique keys: target_id, observation_id, assignment_id, user_id...

For the purpose of inspecting the results, we merged the tables using the "crossmatching" function in the excellent TopCat data analysis tool ( http://www.star.bris.ac.uk/~mbt/topcat/ ).

This resulted in a single merged table containing one row per "marked signal", with columns that contain all details on the assignment/observation/target (table uploaded here: http://www.arm.ac.uk/~gba/dotastro/seti/merged.fits ).
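For anyone who prefers a script over TopCat, the same join can be sketched in plain Python. This is only an illustration, not the procedure described above: it assumes that pattern_marks.csv carries an assignment_id, assignments.csv an observation_id, and observations.csv a target_id (the key layout is a guess), and the helper names read_csv, index_by and merge_marks are made up for the example.

```python
import csv


def read_csv(path):
    """Read a CSV file into a list of dicts (one dict per row)."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def index_by(rows, key):
    """Build a lookup table: key value -> row."""
    return {row[key]: row for row in rows}


def merge_marks(targets, observations, assignments, marks):
    """Return one merged row per marked signal.

    Assumed key layout (not confirmed by the export):
    marks -> assignment_id, assignments -> observation_id,
    observations -> target_id.
    """
    targets_by_id = index_by(targets, "target_id")
    observations_by_id = index_by(observations, "observation_id")
    assignments_by_id = index_by(assignments, "assignment_id")

    merged = []
    for mark in marks:
        assignment = assignments_by_id.get(mark["assignment_id"])
        if assignment is None:
            continue  # mark without a matching assignment: skip it
        observation = observations_by_id.get(assignment["observation_id"], {})
        target = targets_by_id.get(observation.get("target_id"), {})
        # On a column-name clash the later dict wins, so the mark's own
        # columns are applied last and are never overwritten.
        merged.append({**target, **observation, **assignment, **mark})
    return merged
```

It would be called as merge_marks(read_csv("targets.csv"), read_csv("observations.csv"), read_csv("assignments.csv"), read_csv("pattern_marks.csv")), giving one dict per marked signal.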

2. Visualizing the signal marks
Having obtained the merged table, we inspected the results using scatter plots and histograms in TopCat. For example, a visualization of the "signal marks" as a function of user_id and target_id looked like this:


The plot shows that most of the "marked signals" are clustered near 1500 MHz, 4000 MHz and 8500 MHz. This is simply because the beta version of the setiQuest Explorer app uses a limited number of observations, which were obtained in only a few ~6 MHz-wide frequency bands. The clustering will naturally disappear when observations in more bands of the spectrum are added in the future.

When zooming in on a particular band (e.g. 1415-1425 MHz), a more uniform frequency distribution becomes apparent:


Now, the interesting question is: are any of the "marked signals" found by multiple users at the same frequency? Or across multiple targets at the same frequency? (The latter would be an indication of local interference.)

3. Binning the "signal marks" by frequency
To answer the question, we checked how many of the marked signals occur within the same ~0.5 kHz region that is shown to the user in a single screen. We did this using a "quick & dirty" Python script uploaded here:


In brief, the Python script checks how many "signal marks" are contained within a moving 1 kHz window in the frequency spectrum. When marks are found in a given window, the script writes the corresponding user_id values and the "signal categories" to a new CSV file.
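The windowing idea can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the script referred to above: the field names frequency_mhz, user_id and category and the function name marks_sharing_window are hypothetical, and this version anchors a window at each mark rather than stepping a window along a fixed grid.

```python
def marks_sharing_window(marks, window_mhz=0.001):
    """Group marks whose frequencies fall within one window.

    `marks` is a list of dicts with the (assumed) keys
    "frequency_mhz", "user_id" and "category".
    A window of 0.001 MHz corresponds to 1 kHz.
    Returns a list of groups; each group holds marks from at least two
    different users lying within `window_mhz` of the group's first mark.
    """
    ordered = sorted(marks, key=lambda m: m["frequency_mhz"])
    groups = []
    start = 0
    while start < len(ordered):
        low = ordered[start]["frequency_mhz"]
        end = start
        # Extend the window while marks stay within window_mhz of `low`.
        while (end < len(ordered)
               and ordered[end]["frequency_mhz"] - low <= window_mhz):
            end += 1
        group = ordered[start:end]
        if len({m["user_id"] for m in group}) > 1:
            groups.append(group)
            start = end  # skip past the marks already reported
        else:
            start += 1  # try a window anchored at the next mark
    return groups
```

Each returned group can then be written out with csv.DictWriter, one row per mark, to get a file of user_id and category values per window. Grouping on target_id instead of user_id would give the multi-target (interference) check.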

We spent some time interpreting the output of the script. However, we found only a few "signal marks" in the same frequency window. This may be explained by the fact that, at this early stage, few pieces of data have yet been looked at by more than one user. (Unfortunately, the exported data does not yet contain the necessary info to compute how many users have looked at the same frequency... Francis, do you think you can add the "inspected frequency window" to assignments.csv? That would be very interesting!)

I look forward to a more extensive analysis when more data is available. Keep marking signals, people! :-)

Geert Barentsen
(student at Armagh Observatory - UK)