Discussion of roadmap in wiki

jill
Joined: 2010-01-26
Posts: 27

Here are my questions and comments about the roadmap published in the wiki. Anders suggested this forum would be the right place for discussion. I've embedded them in [brackets] within the wiki text below, in case the color coding doesn't work for you.

jill

The roadmap described here is a proposal laid out by the community. It is not endorsed by the SETI Institute or the setiQuest project.

We envision a roadmap for the project which progresses through a number of well-defined stages to ultimately conclude in citizen access to near-real-time data from the ATA. [This is a limited roadmap. There are really two directions we'd like to go. First, enable citizen scientists' direct participation by using their cognitive skills to find patterns in portions of the data that SonATA cannot now process – here it is OK to detect and report the narrowband type of signals that SonATA already finds, because we are missing them otherwise. There will be MANY such narrowband signals; that is why SonATA cannot process them all in real time. The challenge will be to set up a system to validate reports and compare them with known interference, in order to sift out those that may be extraterrestrial and worth following up on. There may be other types of signals as well that human pattern recognition can detect before algorithms can be developed. That is the second direction: developing new signal detection algorithms for complex signals with a large number of degrees of freedom.]
 

  1. The definition and documentation of a clean API for access to the beamformer and SonATA data streams from special client modules running locally at the ATA. An API for letting client modules suggest new observation targets and schedules may also be defined. The APIs are implemented in OpenSonATA to allow client modules to be tested against OpenSonATA, outside of the 'production environment' at the ATA.
  2. Development of new client modules by the community. The primary purpose of client modules is to drastically reduce the stream of data emanating from the ATA beamformer. [Let me ask about your statement of the primary purpose. I would have thought that the motivation for developing new client modules was to process the data in new ways, in order to enable detection of signals that would otherwise be missed. For the citizen scientist, that involves supplying data from frequency bands that the current SonATA can't finish processing in near-real-time (~1% of the spectrum we observe). At the moment, that's too much raw data to ship out, but it is straightforward to think of an application that would send out only the number of badbands (subchannels with too many candidate signals) that the network will support – see the sketch after this list. These badbands would be otherwise ignored, so any that can be studied by citizen scientists are a net improvement in the search. The other new way to process the data is by developing new algorithms for different types of signals. The eventual goal is new clients running in parallel with the current SonATA. Although the development work could take place off site, on stored data, ultimately the processing would be done on site in real or near-real time.] When data have been reduced to a sufficient degree, they may either be sent to the cloud for further processing or to other subsystems that will decide whether the current observation pipeline should be broken to reobserve the target in question. [If I am correct about the primary purpose, then I wonder about the need to reduce the datastream sufficiently to get it off site. That throttling process will inevitably make some assumptions about the nature of the new type of signals to be explored. In the simplest case, reducing the bandwidth to achieve a viable datarate precludes looking for classes of signals that have larger bandwidths, as many signals with a large number of degrees of freedom or complexity will have. We have been putting large, raw, unprocessed datasets up in the cloud – data we have not run through our detection systems. At the moment even this direct-to-disk-and-then-to-cloud data capture is limited to 8 MHz of the 104 MHz available – we are hoping that new hardware donations will broaden what we can do in the near term. Anything going out on fiber would be far, far less bandwidth. Building the API throttle that will choke down the datarate is in fact very complex and inevitably introduces significant biases. Since all the development and debugging and testing and validation of new detection algorithms will be non-real-time, I wonder why your first approach is to decimate the data severely enough to get it off site in real time? Wouldn't you rather have the closest thing to the full-bandwidth raw data that we can provide, since it opens up so many more signal processing opportunities?]
  3. Once the community has prepared a set of stable and worthwhile modules, they are to be installed locally at the ATA. Some integration on the part of SI staff may be necessary - depending on the APIs and processes that have been established - to ensure that the modules perform nominally, without interfering with other operations at the telescope. [I think you can count on a significant effort for testing and validation in collaboration with the SI staff. Observing time with the ATA is a valued resource, and we always test a lot offline before we risk running new applications that could end up wasting observing time.] We envision a model where the hardware to host the new client modules at the ATA is supplied by the teams behind the modules themselves, e.g. via corporate sponsorships. [This is a most generous model – do you think it is likely to be possible? Will the folks interested in being on software development teams have access to the corporate folks?] In the earliest phases of the project, SI may choose to lend the teams some hardware, to get the process started.
  4. When the new client modules are running in the production environment at the ATA, and the most interesting data is continually and automatically uploaded to the cloud, the community is to develop applications that further process the data (this can also be done in parallel with the previous steps). [See my previous comments – with the exception of the badband selector, the construction of this sort of client is actually doing the signal detection to select what you are defining as 'interesting'.] Some of these applications should feature citizen science opportunities for the non-technical public. [And this involves precisely the same building-a-throttle difficulty that is discussed above. As a first idea, it should be possible to send out some or all of the permanently RF-masked channels – see above.] The applications may feed digested results from the cloud back to the client modules at the ATA, as guidance for reobservations. [To be useful in the very dynamic RFI environment we live in, this sort of feedback needs to happen with the same sort of near-real-time cadence that we work with now, ~a few minutes.]
  5. Citizen scientists and computers in the cloud scan the radio waves for evidence of technosignatures. If a statistically significant signal is detected, experts are notified for further analysis and, hopefully, confirmation. [I think we would invoke the same sort of automated follow-up observing process we now go through before any human resources are invoked.]
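
To make the badband idea concrete, here is a minimal sketch in Python (illustrative only - not SonATA code; every name and number is made up) of a selector that ships out only as many badbands as the outbound link can carry:

    # Illustrative sketch - not SonATA code. Assumes each subchannel has a known
    # width and a count of unresolved candidate signals from the detector.
    def select_badbands(subchannels, bandwidth_budget_hz):
        """Pick the worst (most candidate-laden) subchannels that fit the pipe.

        subchannels: list of (center_freq_hz, width_hz, n_candidates) tuples
        bandwidth_budget_hz: total bandwidth the outbound link can sustain
        """
        # Worst offenders first: these are the bands SonATA had to skip.
        ranked = sorted(subchannels, key=lambda s: s[2], reverse=True)
        selected, used = [], 0.0
        for center, width, n_cand in ranked:
            if used + width <= bandwidth_budget_hz:
                selected.append((center, width, n_cand))
                used += width
        return selected

    # Example: 600 kHz of outbound capacity, three overloaded subchannels.
    bands = [(1420.1e6, 500e3, 812), (1421.3e6, 500e3, 47), (1419.2e6, 100e3, 1024)]
    print(select_badbands(bands, 600e3))   # ships the two worst that fit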

Questions

  • Which primitives in the ATA environment can be opened up as part of an API for client modules?
  • What are our internet bandwidth requirements? Do they exceed current bandwidth at the observatory?
  • What are the requirements for client modules? What will it take to install them at the ATA?
  • What is needed for citizen science applications to be developed?
  • How are signal detections to be handled? Should we establish a notification interface for manual operators at the ATA?
Anders Feder
Joined: 2010-04-22
Posts: 618

Okay, many tough questions. I will try to address them in separate posts to enable discussion of them individually. (These will appear above this one, because the forum sorts by most recent first.)

jill
Joined: 2010-01-26
Posts: 27

No - the questions aren't tough; what's tough is trying to distill 30 years of observing experience with radio astronomy facilities and RFI into a forum discussion. We now have effective CW detectors and narrowband pulse detectors, and they've been integrated into a near-real-time system with knowledge of what's been seen from all observations over the last week, good models of how a strong source in the antenna sidelobes can mimic a weak source in the main lobe, plus the numerical ability to predict how observed drift rates should change during an observation and subsequent follow-ups if the source is really moving on the sky at the sidereal rate. Further, we are beginning to experiment with generating offset nulls for each of our beams to further discriminate against RFI. The first lesson we learned in 1992, when we launched the NASA HRMS, was that one telescope is not enough. For Project Phoenix we made use of pseudo-interferometers, and now with the ATA, we've built the real thing.

If we want to focus on narrowband detectors, then the discussion should revolve around efficiency - for a particular implementation, what's the limiting SNR, for how many CPUs, at what PFA? If there are algorithms that can offer improved performance against our DADD or triplet pulse detectors, then there can be a discussion of the tradeoffs of implementing those rather than implementing other features we don't yet have.
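
For concreteness, the computation whose efficiency is at issue is a drift-rate search over a time/frequency power grid. Here is a minimal brute-force sketch in Python - DADD computes the same detection statistic far more cheaply by reusing partial sums, so this is illustrative only, not our code:

    import numpy as np

    def dedrift_search(power, drift_bins):
        """power: (n_time, n_freq) spectrogram; drift_bins: integer drifts in
        frequency bins per time step. Returns the best (drift, start_bin, sum)."""
        n_time, n_freq = power.shape
        best = (0, 0, -np.inf)
        for d in drift_bins:
            for f0 in range(n_freq):
                if not (0 <= f0 + d * (n_time - 1) < n_freq):
                    continue
                total = sum(power[t, f0 + d * t] for t in range(n_time))
                if total > best[2]:
                    best = (d, f0, total)
        return best

    rng = np.random.default_rng(0)
    spec = rng.chisquare(2, size=(64, 256))          # noise-only spectrogram
    spec[np.arange(64), 50 + np.arange(64)] += 8.0   # inject a drifting tone
    print(dedrift_search(spec, range(-2, 3)))        # recovers drift=1, f0=50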

What I hoped when we began to develop the setiQuest community is that people would be eager to develop new ways of analysing the data, to provide sensitivity to different classes of signals altogether. BEFORE such a new detection activity ever gets close to being a near-real-time signal detection client, we need to look at different types of algorithms --- is it better to run many algorithms in parallel, each of which picks up one type of complex signal, or, as you suggested a while ago, might the better approach be to look for one meta-algorithm that does an OK job on many different classes of complex, noise-like signals? Indeed, those discussions and arguments can and do take place in a classroom, without any data at all. We hoped that by making large datasets available in the cloud, the discussion could be informed by the real-world nature of the ATA observing site - the noise we make ourselves, and the satellites above it. We're also trying to attack the problem from the other end by asking whether there's a preferred set of signals that propagate relatively undisturbed through the ISM (a circularly polarized sinusoid is one such). Maybe the community has no interest in that at all. So be it. We'll creep towards this goal at the slow pace that our small team can manage. Maybe the community wants to do something altogether different with the data. So be it. But our telescope resources are limited and will be available for supporting those interactions that are mutually beneficial. So my roadmap would be to first seek out good algorithms, improve their efficiency, and integrate them into a near-real-time observing pipeline with whatever the appropriate RFI discrimination tests are for that type of signal.

It's the citizen scientist branch of the setiQuest roadmap that gets into real-time access to data off-site. A throttle is needed since available datarates are too high. This path has lots of additional complexities and many layers of APIs to build. Why not start with the signal detection client we already have, and the frequency/time visualization that is appropriate for narrowband signals, and use the citizen scientists to explore those frequency regions in which we are currently blind? This allows the whole scaffolding to be built for serving up visualizations, having multiple tiers of vetting, and finally comparing against a Rogue's Gallery of visualized known interference, with results traveling back to the telescope while citizen scientists are rewarded and engage in the follow-up process. Other visualization patterns for different types of complex signals can be added as new algorithms are developed, and of course the unexpected anomaly might present itself. ~1% of the frequency span of the terrestrial microwave window has so many narrowband signal detections that SonATA is unable to complete its processing in the time allowed. This is still probably too large to get off site, or have looked at, until we build up a large corps of citizen scientists. And yes, selection of badbands is a frequency bias, but our focus with SonATA on the entire terrestrial microwave window (about 9 GHz of spectrum) implies all frequencies are equally promising, and these are places we are currently skipping. This has in its favor the ease of implementation. When the scaffolding is ready, this selection criterion can be applied without much effort or enhanced computational load on the existing near-real-time system. If anomalies are seen and described, this provides a theoretical basis for studying what throttling criteria and pre-processing stages might optimize selection for them.
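
For reference, here is a minimal sketch of the kind of frequency/time waterfall such a client would serve up (illustrative Python with made-up parameters, not our actual visualization code):

    import numpy as np

    def waterfall(iq, fft_len=1024):
        """iq: complex baseband samples. Returns (n_rows, fft_len) power in dB."""
        n_rows = len(iq) // fft_len
        frames = iq[:n_rows * fft_len].reshape(n_rows, fft_len)
        frames = frames * np.hanning(fft_len)         # taper against leakage
        spectra = np.fft.fftshift(np.fft.fft(frames, axis=1), axes=1)
        return 10.0 * np.log10(np.abs(spectra) ** 2 + 1e-12)

    # A narrowband test tone in noise appears as a bright vertical line.
    fs, n = 1_000_000, 1_000_000                      # 1 MHz band, 1 s of data
    t = np.arange(n) / fs
    rng = np.random.default_rng(1)
    noise = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    iq = 0.2 * np.exp(2j * np.pi * 100e3 * t) + noise
    img = waterfall(iq)
    print(img.shape, int(img.mean(axis=0).argmax()))  # (976, 1024), bin ~614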

I doubt that I've persuaded you, but in reality our team is so small that we cannot spend all our time feeding a voracious community - we have a search to keep running and improve upon one way or another. We'll move forward best when there are mutually beneficial goals. At first, I'll claim the privilege of age and experience, though never any intellectual or innovative advantage. We are reaching out to the world because not all smart people interested in SETI work for the SETI Institute or SETI@home, but we have thought about this a lot.

Anders Feder
Joined: 2010-04-22
Posts: 618

No, I understand your resources are limited, and I can't claim to speak for the community. So if this plan is impractical, maybe we should just wait and see if someone else can come up with a better idea.

Anders Feder
Joined: 2010-04-22
Posts: 618

Say we used this approach of letting citizens look at the badband data - how would we define the application? What would the scientific return be? Since the band is filled with RFI and we can't null it out because the data is not real-time, of what use would the citizen input be? What would citizens look for and how would the results not be corrupted by all the RFI?

maxs-pooper-scooper
Joined: 2010-07-28
Posts: 58

I doubt that I've persuaded you, but in reality our team is so small that we cannot spend all our time feeding a voracious community - we have a search to keep running and improve upon one way or another. We'll move forward best when there are mutually beneficial goals. At first, I'll claim the privilege of age and experience, though never any intellectual or innovative advantage. We are reaching out to the world because not all smart people interested in SETI work for the SETI Institute or SETI@home, but we have thought about this a lot.

My main disappointment is the lack of algorithm disclosure. I think the roadmap said that the code will not be opened for another year or so. That's fine; I can live with that. But in the meantime I had hoped to look at the algorithms used. I also read on this forum that the book SETI 2020 would outline the algorithms. After reading the book, again, I was disappointed. I appreciated the history and background, but the discussion of the signal processing used was very light.

I do not have any interest in trying to speed up the GUI or other wrapper functions. However, I would like to help with the signal processing and algorithm development. To be honest, the lack of this type of information has reduced my interest at the moment.

Anders Feder
Joined: 2010-04-22
Posts: 618

If the algorithms were released, what would it take to make you interested enough to help improve them?

maxs-pooper-scooper
Joined: 2010-07-28
Posts: 58

If the algorithms were released, that in itself would be interesting. Block diagrams or equations or code - a few of any of those would do much more than thousands of words of description. The problem with word descriptions is that you have to assume everybody uses the same nomenclature. That is never the case. Math is math. Nice and clear.

As I mentioned in the introduction thread, I have signal processing & communications knowledge. I am also doing a machine learning / pattern recognition topic for my Ph.D. dissertation. Hopefully this will not come across as too arrogant, but I would like to think I am one of those smart people Jill referred to who is not working on SETI at the moment. I'd like to help. Like I said, working to speed up a GUI is not interesting to me. That's not where the real science is. The science is in looking for ways to improve the current mathematical processes or develop new methods.

Anders Feder
Joined: 2010-04-22
Posts: 618

Would you be willing to submit improvements to the algorithms without getting anything else in return? This would essentially be pro-bono consultancy work.

maxs-pooper-scooper
Joined: 2010-07-28
Posts: 58

Would you be willing to submit improvements to the algorithms without getting anything else in return? This would essentially be pro-bono consultancy work.

Isn't that what all this work is (for those not employed by SETI)?

Anders Feder
Joined: 2010-04-22
Posts: 618

Well, right now it is, because we aren't doing any really big intellectual exercises yet. But in open source, some kind of reciprocity is usually expected. If I contributed a new algorithm, I would expect to be able to see what kind of results that algorithm is generating. I get raw science, they get a better search - seems like a fair deal.

If you are willing to contribute either way, don't let me discourage you, of course. But I would assume that most developers in the long-term will expect this kind of reciprocity.

Anders Feder
Joined: 2010-04-22
Posts: 618

This is a limited roadmap. There are really two directions we'd like to go. First, enable citizen scientists' direct participation by using their cognitive skills to find patterns in portions of the data that SonATA cannot now process – here it is OK to detect and report the narrowband type of signals that SonATA already finds, because we are missing them otherwise. There will be MANY such narrowband signals; that is why SonATA cannot process them all in real time. The challenge will be to set up a system to validate reports and compare them with known interference, in order to sift out those that may be extraterrestrial and worth following up on. There may be other types of signals as well that human pattern recognition can detect before algorithms can be developed. That is the second direction: developing new signal detection algorithms for complex signals with a large number of degrees of freedom.

I would call it "general" or "very high-level" rather than "limited". What I meant by "citizen access to near-real-time data" includes, but is not limited to, activities like the two you describe above - in near-real-time. Do you disagree with this roadmap item/end goal? And if so, which part do you disagree with? Is it too broad? Or do you not agree that access should be near-real-time?

As you say, the challenge is to filter the RFI, and as far as I can comprehend, there is no really good way of doing this other than in near-real-time. Do you see any other ways?

Anders Feder
Joined: 2010-04-22
Posts: 618

Let me ask about your statement of the primary purpose.  I would have thought that the motivation for developing new client modules was to process the data in new ways in order to enable detection of signals that would otherwise be missed.

That is the motivation. Does the stated primary purpose contradict this motivation? Provided that we want to process the data in real-time, do we not agree that it is necessary to throw away a lot of noise in order to register/detect signals?

The reason I wrote 'data reduction' and not 'signal detection' is that I see client modules as possibly being only the first stage in a longer process, parts of which could run off-site: the client module could reduce the data, send the reduced data product to a server in the cloud, and only then, when processed on the server, would actual 'signals' be registered. In this case, the client module itself does not detect signals; it only acts as the first step in a signal detection process.

My point is that client modules will not always do signal detection themselves, but in all cases they will do data reduction.
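
As a sketch of the division of labor I have in mind - entirely hypothetical, since no such API exists yet:

    import numpy as np

    # Hypothetical two-stage pipeline: the client module at the ATA only
    # *reduces*; anything called a 'signal' is registered in the cloud.
    def client_reduce(raw_block, keep_fraction=0.01):
        """Stage 1, on-site: boil a raw sample block down to its strongest bins."""
        spectrum = np.abs(np.fft.fft(raw_block)) ** 2 / len(raw_block)
        k = max(1, int(keep_fraction * len(spectrum)))
        top = np.argsort(spectrum)[-k:]               # keep only the top 1% of bins
        return [(int(b), float(spectrum[b])) for b in top]

    def cloud_detect(reduced, threshold):
        """Stage 2, off-site: only here does anything get registered as a signal."""
        return [(b, p) for b, p in reduced if p > threshold]

    rng = np.random.default_rng(0)
    block = rng.standard_normal(4096)                 # stand-in for beamformer data
    reduced = client_reduce(block)                    # ~40 numbers leave the site
    print(cloud_detect(reduced, threshold=8.0))       # noise only: at most a stray bin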

Anders Feder
Joined: 2010-04-22
Posts: 618

If I am correct about the primary purpose, then I wonder about the need to reduce the datastream sufficiently to get it off site. That throttling process will inevitably make some assumptions about the nature of the new type of signals to be explored. In the simplest case, reducing the bandwidth to achieve a viable datarate precludes looking for classes of signals that have larger bandwidths, as many signals with a large number of degrees of freedom or complexity will have. We have been putting large, raw, unprocessed datasets up in the cloud – data we have not run through our detection systems. At the moment even this direct-to-disk-and-then-to-cloud data capture is limited to 8 MHz of the 104 MHz available – we are hoping that new hardware donations will broaden what we can do in the near term. Anything going out on fiber would be far, far less bandwidth. Building the API throttle that will choke down the datarate is in fact very complex and inevitably introduces significant biases. Since all the development and debugging and testing and validation of new detection algorithms will be non-real-time, I wonder why your first approach is to decimate the data severely enough to get it off site in real time? Wouldn't you rather have the closest thing to the full-bandwidth raw data that we can provide, since it opens up so many more signal processing opportunities?

I fully agree, the throttling does introduce biases. But my key point behind proposing this approach is that so does the current observing program (setiData) - only the biases are in time and space rather than bandwidth. Do we agree? If so, then it becomes a matter of selecting which bias to apply.

My point then is that if we (the community) had modules/clients running in real-time at the telescope, we would be able to pick the bias on the fly (we would send a command to the client at the telescope from the cloud), whereas now the bias is fixed.

Anders Feder
Joined: 2010-04-22
Posts: 618

This is a most generous model – do you think it is likely to be possible?  Will the folks interested in being on software development teams have access to the corporate folks?

To be honest, I don't know, but I do think there is a pretty good chance (maybe sigblips can comment, since it is really his idea). These corporations routinely engage the open source community anyway, because of their commercial interests in, for instance, Linux, and if someone in our community could come up with some very novel and compelling algorithms for SETI, I think there is a decent chance that one of these corporations would like to see their name associated with the experiment running in practice on a real radio telescope.

If not, we would have to make do with whatever machinery we can get; we would use some kind of system for managing the limited hardware resources, as proposed elsewhere.

sigblips
Joined: 2010-04-20
Posts: 732

There is a lot of money in the HPC market, and the manufacturers are always competing over who has the best performance density. A project with name recognition and a clever algorithm, coupled with the right hardware, seems like something a marketing department would love. The key here is that it's not just some software running on commodity hardware - that's boring. It needs to be special. Something like "The amazingly complex new XYZ algorithm runs best on Intel Sandy Bridge AVX extensions, IBM Power7 processors, Nvidia GPUs, ..."

It is unfortunate that this wasn't discussed before I went to the Intel Developer Forum 2 weeks ago. The place was swarming with marketing people from most of the big manufacturers - except AMD, of course. I didn't bother talking about setiQuest with anybody because I didn't think the SETI Institute would be interested in this sort of arrangement. "Open data access" would change the current level of control a bit, in a read-only manner. I wouldn't be surprised if the SETI Institute didn't go for this, since it is their project and their telescope.

Anders Feder
Joined: 2010-04-22
Posts: 618

I sense a lot of skepticism in Jill's responses too, and I think sigblips is exactly right that it comes down to worries about ownership, so let me just state this very frankly: none of what has been proposed expects or requires that the SETI Institute relinquish any measure of ownership over anything (physical or otherwise). It would be a system running just as if it were your (SI's) own - you can deny it access, alter it, or throw it out as you see fit if it is no longer in line with your mission.

All we - or at least I - am saying is that if we do it like this, or in some similar way, it would be a whole lot easier for the whole world to help you.

sigblips
Joined: 2010-04-20
Posts: 732

My point about open data access stems from frustration related to my desire to help.  I try to help, I encounter a barrier, I find a way around the barrier, try to help some more, encounter another barrier, ...

Here is a recent example. A modulated signal was discovered in data from the January 2010 observation of the Kepler-4 system. That was April. The Kepler-4 system was then re-observed in May for confirmation of this modulated signal near Hydrogen. The observing team missed it. SonATA missed it. Post-processing at the SETI Institute missed it. Then last week (September) the data set was uploaded and baudline found a modulated signal close to the same frequency near Hydrogen. If this data had been available 4 months ago I would have found the signal, and then a timely follow-up could have been scheduled. Now the signal may be gone, never to be seen again. Or maybe it was just RFI?

IMO this Kepler-4b signal is far more significant than the 1977 Wow! signal because it has a more credible target (an exoplanet), it was drifting in frequency, the content was modulated, and it had a second confirmation at a later date. These are all attributes that the Wow! signal lacked. It is likely that the Wow! signal will forever remain a mystery because so little data was collected. The Kepler-4b signal, on the other hand, consists of 8.7 MHz wide data sets collected on two different dates, which means it can be studied. All that it will take to disqualify the Kepler-4b candidate signal is for some clever individual to decode the demodulated bits, or for a plausible artifact-generation mechanism to be found. Wow, how times have changed!

So my point is that I can only help as much as the SETI Institute allows me to, hence my desire for "open data access."

robackrman
Joined: 2010-04-15
Posts: 235

The signals were not "missed," but rather rejected as representing a repeat observation.

The following image shows baudline and setiQuest images side-by-side of the "interesting signal" from analysis of the 2010-01-22 Kepler Exoplanet 4 data:

Now we show baudline and setiQuest images side-by-side from analysis of the 2010-05-14 Kepler Exoplanet 4 data of the signal proposed as potential confirmation (reobservation) of the 2010-01-22 "interesting signal" above:

These signals are not similar. The first (2010-01-22) signal exhibits a precise carrier with obvious minimal modulation and a relative Doppler shift within the limit of what would be expected for two (xmit and recv) rotating Earth-like planets. The second (2010-05-14) signal exhibits an imprecise and wandering carrier. We see many signals like the second, which are typical of detections of oscillators radiating from inexpensive electronics equipment. For example, see the gallery below of just a few of the many signals detected in setiQuest analysis of the 2010-05-14 Kepler Exoplanet 4 data. The top-left and top-middle images were referred to as "hydrogen sidebands" in analysis elsewhere in the forum, but they have nothing to do with hydrogen other than being nearby in frequency. The others, which were apparently missed by baudline analysis, are signals that were also rejected as confirmation of the original signal.



sigblips
Joined: 2010-04-20
Posts: 732

Wasn't the purpose of the Kepler-4 re-observation to see if the "interesting" drifting, modulated signal located approximately -480 kHz left of Hydrogen was still present?

The signal I found at +2044295 Hz and +44234 Hz in the Kepler04(-3) and (-4) data sets matches all three of those criteria (drifting, modulated, near-Hydrogen location). Yes, it is a different modulation scheme at a different baud rate, but what were you expecting to find 4 months later? The exact same thing? It seems plausible that ET would build a beacon that cycles through multiple modulation schemes, just in case the first couple couldn't be detected or demodulated. Cycling through multiple modulation schemes will also make a signal stand out as being more artificial. All good things if you are building a beacon you want civilizations to be able to find and understand.

I also didn't miss those 4 weaker drifting signals. Maybe you didn't see the TBD section in my blog post. It means I'm not done yet. After I discovered that drifting modulated signal, all my attention has been focused on it.

Basically, how I analyze the setiQuest data files is that I first create a top-view spectral map with baudline, then I work my way down, focusing on the strongest signals first, and I'm blogging the whole way. I treat my blog entry as a lab book; it is just too much work to write the blog after the fact. What you see in my blog is basically a top-down record of the progress of my analysis. Here were my steps and train of thought:

  1. Create a starting map of the Kepler04-3 data set.
  2. -2422400 Hz unusual oscillating Histogram.
  3. +2044295 Hz drifting signal -480 kHz left of Hydrogen.
  4. Something looks odd. Holy smokes! This signal is modulated.
  5. Measure baud rate = 0.2221
  6. Demodulate bit stream.
  7. Attempt basic reduction and decoding of bits.
  8. +3008373 Hz "sideband" check. This signal isn't a mirror of +2044295 Hz.
  9. Jump to Kepler04-4 data set.
  10. +44234 Hz confirmation of drifting modulated signal that was in the prior data set.
  11. Put champagne in fridge.
  12. The importance of this discovery can't wait until my analysis is finished.
  13. Write up a quick conclusion, post, and wait for someone else to confirm what I found.

Since that last step, all my free time has been spent analyzing the finer details of this drifting modulated signal. I'm not ready to announce anything yet, since I'm still not 100% sure, but it seems that the vertical frequency drops are not exactly vertical, and it also seems like there is some sort of embedded phase modulation at work. I'm working on building a couple of new tools / techniques to help verify this. Like I said in my blog post, "I've never seen a modulated signal quite like this." Maybe it is just RFI, but it sure is fascinating.
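
For anyone who wants to check a baud-rate measurement like the one in step 5, here is a rough sketch of one way to do it in Python (I did mine interactively in baudline, not with this code; the example numbers are made up to mimic that signal):

    import numpy as np

    def estimate_baud(iq, fs):
        """Estimate symbol rate from the spectrum of the envelope's transitions."""
        env = np.abs(iq)
        edges = np.abs(np.diff(env))          # impulses at symbol transitions
        spec = np.abs(np.fft.rfft(edges - edges.mean()))
        freqs = np.fft.rfftfreq(len(edges), d=1.0 / fs)
        # the symbol clock is the lowest strong line; its harmonics sit above it
        return freqs[np.argmax(spec > 0.5 * spec.max())]

    # Example: a 0.2221 baud on/off keyed tone, sampled at 10 Hz for 15 minutes.
    fs, baud, n = 10.0, 0.2221, 9000
    t = np.arange(n) / fs
    bits = np.random.default_rng(2).integers(0, 2, int(n * baud / fs) + 1)
    iq = bits[(t * baud).astype(int)] * np.exp(2j * np.pi * 1.0 * t)
    print(round(estimate_baud(iq, fs), 4))    # ~0.2221, to within one FFT bin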

sigblips
Joined: 2010-04-20
Posts: 732

The first (2010-01-22) signal exhibits a precise carrier with obvious minimal modulation and a relative Doppler shift within the limit of what would be expected for two (xmit and recv) rotating Earth-like planets.

While true, this statement creates an anticipated bias toward the Doppler drift between two Earth-like planets. We know nothing about the size, orbit, and rotation parameters of this signal's source. Its -0.169 Hz/sec drift rate seems plausible. In fact, SonATA has a ±1 Hz/sec drift range.

The second (2010-05-14) signal exhibits an imprecise and wandering carrier. We see many signals like the second, which are typical of detections of oscillators radiating from inexpensive electronics equipment.

Is it an imprecise and wandering carrier? Good question. At http://baudline.blogspot.com/search/label/SETI I've seen many drifting-random-walks. What makes this signal special is that it is modulated and it is located very close in frequency to the left of Hydrogen.

Since the signal candidate was rejected, I don't believe that the OFF test described at http://setiquest.org/wiki/system-operation was performed. This brings up an important question: is the candidate verification procedure correct for long durations between observations? Should the modulated signal in the 2010-05-14 data have constituted a verification of the 2010-01-22 FSK signal? Strictly interpreting the candidate verification procedure, the answer is clearly no. What should the long-term candidate verification procedure be?

robackrman
Joined: 2010-04-15
Posts: 235


I am not persuaded that the 2010-05-14 signal represents confirmation of the 2010-01-22 signal.  It would be interesting to receive feedback from others at the SETI Institute and in the setiQuest community.

> Since the signal candidate was rejected, I don't believe that the OFF test described at http://setiquest.org/wiki/system-operation was performed.

Did you check to see if a similar "modulated" signal appears in any other observations covering the same frequency range but tracking a different target?


Anders Feder
Joined: 2010-04-22
Posts: 618

No, I must also confess that seeing a confirmation in this signal makes me think of Percival Lowell's canals on Mars. I'd need something more rigorously mathematical to consider it a confirmation. Or data from an OFF beam, of course - that would have helped.

sigblips
Joined: 2010-04-20
Posts: 732

Signal verification and confirmation are two different things but I admit that I mix them up and frequently use them interchangeably.

The question I was posing was: should the new zigzag modulation seen in the 2010-05-14 data have triggered the verification follow-up procedure? The purpose of verification is to perform the OFF test. After several ON / OFF test cycles, and possibly human intervention, a request for confirmation should then be sent to an independent telescope (see the sketch after my questions below).

The important questions are:

  • How does the lapse of 4 months affect the verification procedure?
  • Is the zigzag modulated signal interesting enough to trigger the verification procedure's OFF test?
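
For clarity, here is that cadence as a sketch in Python - not the actual SonATA implementation, and the observe functions are hypothetical stand-ins for real telescope control:

    # Sketch of the ON/OFF verification cadence - not the actual SonATA code.
    def observe_on(candidate):        # stub: re-point at the target
        return True
    def observe_off(candidate):       # stub: point away from the target
        return False

    def verify_candidate(candidate, n_cycles=3):
        for _ in range(n_cycles):
            if not observe_on(candidate):   # gone on re-observation: transient or RFI
                return "not confirmed"
            if observe_off(candidate):      # also present off-target: terrestrial
                return "RFI"
        # only after the automated cycles pass would humans be brought in
        return "request confirmation from an independent telescope"

    print(verify_candidate("zigzag signal near Hydrogen"))
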
robackrman
Joined: 2010-04-15
Posts: 235

Other observations were made on 2010-05-14. For example, there was an observation of source PSR 0809+74, which is ~13 hours in right ascension and 7 degrees in declination "OFF" of the target Kepler Exoplanet 04. Unfortunately, the candidate signal is apparently found in the "OFF" observation data (see image below):

sigblips
Joined: 2010-04-20
Posts: 732

Very clever sleuthing. Good job. It looks like the drifting-random-walk continued and drifted on over. This PSR 0809+74 data set is almost as good as an OFF test done in real time.

I was going to take a look to see if the same modulation is present, but the data set is not in the http://setiquest.org/wiki/index.php/SetiQuest_Data directory. There is a PSR 0809+74 data set, but for the 2010-06-25 date, which is more than a month later.

robackrman
Joined: 2010-04-15
Posts: 235

That's frustrating.  I will ask Gerry/Avinash to post that data.

sigblips
Joined: 2010-04-20
Posts: 732

The 2010-05-14 PSR 0809+74 data set hasn't been posted yet. Could you please ask them again?

gerryharp
Joined: 2010-05-15
Posts: 365
oops?

There may be a miscommunication about the two very different systems we have set up to perform SETI observing.

The main system is SonATA (aka Prelude, the previous generation). This system is highly automated and performs "off" measurements and more "on" measurements as needed to verify whether a signal is RFI or not.

Unfortunately, SonATA cannot produce output data files like the ones y'all are looking at. Those data are taken with a new, bubble-gum-and-baling-wire system that records a relatively small bandwidth to disk. In the past few weeks, we have been obtaining "off" measurements for most "on" measurements, but this is not automated. Usually, it is yours truly who types the commands to run the observations. I alert everyone to the fact that, for a given day (especially), any observation on one target can use any observation on another target (at the same frequency) as an "off." And vice versa.

Hopefully this will allow people like sigblips to immediately check one or more "off" observations to see if the same signals appear.

Thanks!

Gerry

sigblips
Joined: 2010-04-20
Posts: 732
signal confirmation?

Here is an image of the 2010-05-14 zigzag modulated signal that has been drift-corrected and zoomed in to a 0.024 Hz/bin resolution. There were several processing steps involved, so the sample rate is approximate (error < ±1 sample/sec).

This is an extreme amount of zoom. Note the 6 short sequences of FSK modulation. The baud rate and mark/space frequencies are different, but are these FSK sequences matching modulation markers to the 2010-01-22 signal? Also of interest are the 5 big groupings that have a 51-second periodicity. This looks like a modulation within a modulation - very strange. I'm still verifying my phase analysis conclusions, so this might get even more interesting. [headphones on] (:
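
For those who want to reproduce this kind of view: the general technique is to multiply by a conjugate chirp to stop the drift, then take one long FFT. A sketch in Python (not baudline's actual processing chain; numbers chosen to match the figures above):

    import numpy as np

    def dedrift(iq, fs, f0_hz, drift_hz_per_s):
        """Mix a drifting tone at f0 down to a stationary 0 Hz baseband."""
        t = np.arange(len(iq)) / fs
        phase = f0_hz * t + 0.5 * drift_hz_per_s * t ** 2
        return iq * np.exp(-2j * np.pi * phase)

    # 0.024 Hz/bin needs a long frame: bin width = fs / n_fft, so ~42 s of data.
    fs, n = 1000.0, 42_000                            # 42 s at 1 kHz
    t = np.arange(n) / fs
    tone = np.exp(2j * np.pi * (100.0 * t - 0.5 * 0.169 * t ** 2))  # -0.169 Hz/s
    spec = np.abs(np.fft.fft(dedrift(tone, fs, 100.0, -0.169)))
    print(round(fs / n, 3), "Hz/bin; peak at bin", int(np.argmax(spec)))  # bin 0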

sigblips
Joined: 2010-04-20
Posts: 732

I've finished my phase analysis and my hunch was correct. This modulated signal got more interesting. Three times more interesting in fact. (: You can read the details on my blog in the +2044295 Hz section:

http://baudline.blogspot.com/2010/09/setiquest-kepler-4b-redux.html

This signal turns out to be a modulation in a modulation in a modulation. The clocking is perfect, and I've never seen anything like it before. Now the question is: how and why would anyone create something like this? The FSK-pulsing section has about half the baud rate and double the spectral efficiency compared to what I measured in my original Kepler-4 analysis. The spectral efficiency is the baud rate divided by the mark/space bandwidth, so the half/double comment I made isn't as obvious as it seems: halving the baud rate while doubling the efficiency means the mark/space bandwidth shrank by a factor of four. My point is that it scaled in a squared sort of way. Also of interest are the new autocorrelation plots I added. They look very similar to the autocorrelation images I posted in my original Kepler-4 analysis of the January 2010 data. So it's not the same signal, but is it verification? It is without a doubt the most interesting signal I've seen in all of the setiQuest datasets.

Note that I still have to rewrite the conclusion and add some sections for the other weaker signals in this data. I might even make a YouTube movie with audio! Maybe I'll get to those items this week.

Anders Feder
Joined: 2010-04-22
Posts: 618

To be useful in the very dynamic RFI environment we live in, this sort of feedback needs to happen with the same sort of near-real-time cadence that we work with now, ~ a few minutes.

I agree, that was what I had in mind - I was thinking of applications like the mobile phone app featured in the animated video on the front page of setiquest.org.

Anders Feder
Joined: 2010-04-22
Posts: 618

I think we would invoke the same sort of automated follow-up observing process we now go through before any human resources are invoked.

Fully agree/understand.

Anders Feder
Joined: 2010-04-22
Posts: 618

I hope we can take this issue up at the next meeting and perhaps reach some kind of agreement on whether to continue work on this roadmap or if it should be dropped. Maybe Jill or someone else at SI can find time to respond to my replies below before then.

Anders Feder
Joined: 2010-04-22
Posts: 618

I am retiring this roadmap. I leave it in unannotated form here for archival purposes:

We envision a roadmap for the project which progresses through a number of well-defined stages to ultimately conclude in citizen access to near-real-time data from the ATA.
 

  1. The definition and documentation of a clean API for access to the beamformer and SonATA data streams from special client modules running locally at the ATA. An API for letting client modules suggest new observation targets and schedules may also be defined. The APIs are implemented in OpenSonATA to allow client modules to be tested against OpenSonATA, outside of the 'production environment' at the ATA.
  2. Development of new client modules by the community. The primary purpose of client modules is to drastically reduce the stream of data emanating from the ATA beamformer. When data have been reduced to a sufficient degree, they may either be sent to the cloud for further processing or to other subsystems that will decide whether the current observation pipeline should be broken to reobserve the target in question.
  3. Once the community has prepared a set of stable and worthwhile modules, they are to be installed locally at the ATA. Some integration on the part of SI staff may be necessary - depending on the APIs and processes that have been established - to ensure that the modules perform nominally, without interfering with other operations at the telescope. We envision a model where the hardware to host the new client modules at the ATA is supplied by the teams behind the modules themselves, e.g. via corporate sponsorships. In the earliest phases of the project, SI may choose to lend the teams some hardware, to get the process started.
  4. When the new client modules are running in the production environment at the ATA, and the most interesting data is continually and automatically uploaded to the cloud, the community is to develop applications that further process the data (this can also be done in parallel with the previous steps). Some of these applications should feature citizen science opportunities for the non-technical public. The applications may feed digested results from the cloud back to the client modules at the ATA, as guidance for reobservations.
  5. Citizen scientists and computers in the cloud scan the radio waves for evidence of technosignatures. If a statistically significant signal is detected, experts are notified for further analysis and, hopefully, confirmation.

Questions

  • Which primitives in the ATA environment can be opened up as part of an API for client modules?
  • What are our internet bandwidth requirements? Do they exceed current bandwidth at the observatory?
  • What are the requirements for client modules? What will it take to install them at the ATA?
  • What is needed for citizen science applications to be developed?
  • How are signal detections to be handled? Should we establish a notification interface for manual operators at the ATA?
sigblips
Joined: 2010-04-20
Posts: 732

I wonder why your first approach is to decimate the data severely enough to get it off site in real time? Wouldn't you rather have the closest thing to the full-bandwidth raw data that we can provide, since it opens up so many more signal processing opportunities?

Yes, that would be great, but the problem is caused by the discontinuities between the full 105 MHz wide bandwidth "chunks." These time discontinuities will cause all sorts of DSP artifacts and will also make weak-signal coherent detection of drifting tones and pulses impossible. So while decimating limits the spectral range, it does keep the data continuous, which is beneficial for a remote weak-signal monitoring program.
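
In miniature, the trade looks like this (a sketch using scipy's polyphase decimator as a stand-in for whatever the real capture pipeline does):

    import numpy as np
    from scipy.signal import decimate

    # Decimation narrows the band but keeps the time series continuous,
    # which coherent detection of weak drifting tones and pulses requires.
    fs_full = 104e6                                # full band (illustrative rate)
    rng = np.random.default_rng(3)
    wide = rng.standard_normal(1_040_000)          # 10 ms of wideband samples
    narrow = decimate(wide, 13, ftype="fir")       # 104 MHz -> 8 MHz, no time gaps
    print(len(wide), "->", len(narrow))            # 1040000 -> 80000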

That said, I would love to have some data sets that are full bandwidth but shorter in duration. Also data sets of multiple polarizations and beams would be most welcome as they would allow for the testing of more sophisticated detection ideas.

Another benefit of distributing data sets with 2 beams (designated as ON and OFF beams) is that it would allow the antenna OFF candidate confirmation test to be done long after the data was collected. For example, this would have resolved the recent Kepler-4 conflict discussed in a thread below. Was the interesting zigzag modulated signal that I claim to be a 2nd verification of the original Kepler-4 FSK signal also in the OFF beam? If 2 beams had been collected and the modulated signal was in both beams, then its source would be undeniably terrestrial.
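
The after-the-fact OFF test that two beams would allow is simple. A sketch, with a hypothetical data layout (one power spectrum per beam on the same frequency grid):

    import numpy as np

    def off_test(on_spectrum, off_spectrum, bin_index, margin_db=6.0):
        """Call it terrestrial if the candidate is nearly as strong off-target."""
        ratio_db = 10 * np.log10(on_spectrum[bin_index] / off_spectrum[bin_index])
        return "in both beams: terrestrial" if ratio_db < margin_db else "passes the OFF test"

    on_beam = np.array([1.0, 1.2, 55.0, 0.9])      # candidate signal in bin 2...
    off_beam = np.array([1.1, 1.0, 48.0, 1.0])     # ...and in the OFF beam too
    print(off_test(on_beam, off_beam, 2))          # -> in both beams: terrestrial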

Anders Feder
Joined: 2010-04-22
Posts: 618

A few points in preparation for the meeting tomorrow:

  1. What is our goal? Elsewhere, Michael Moradzadeh suggests that "The goal is to involve more humans in the search." Is this really the goal? This seems to suggest that you/we are not really interested in new algorithms, since they tend to replace the person wearing the headphones rather than involve more of them. Would a better goal not be something like "enabling people outside the institute to help improve the search"?
  2. If the latter is in fact the goal, and the philosophy of the SETI Institute is that candidate signals should be verified immediately, can it then not be inferred that the holy grail of setiQuest is giving said outside people some kind of opportunity to improve the real-time search at the ATA?
  3. If so, how do we get there? What are the barriers to giving outside people opportunities to contribute to the real-time search? If it is to some degree a matter of trust, is there anything the community can do to earn the trust of SI in this matter?