Open Sourcing of Exploratory Techniques for the SETI Search
From setiquest wiki
The team at the SETI Institute has recently been using the Allen Telescope Array (ATA) to capture raw data from our beamformers and store it on the Amazon cloud as an archive of observations. This archive (called setiQuest Data) is used by professional and amateur SETI enthusiasts who perform their own data reductions, all supporting the search for ET. The archive is especially useful for trying newly developed analysis techniques, which may turn up SETI signals that are overlooked by present dedicated systems that perform only a single type of data analysis (including ATA real-time instrumentation).
Of course, scientists and engineers at the SETI Institute are developing their own suite of algorithms to be applied to this archive of data. Each time a new algorithm is developed, there is a chance that it will turn up a signal that has never been seen before. In fact, our experience shows that new algorithms always do. Once signals are discovered, detective work is required to determine whether they are man-made or coming from outer space. This field of research is very exciting, with the chance that some new algorithm will discover a SETI signal "right under our noses." The goal of this project is to lower the barrier to entry for discovering new signals for a wide audience, by providing an open-source toolkit containing an ever-growing number of algorithm building blocks using a consistent interface and open source tools.
Several scientists are currently developing new algorithms which, characteristically, are all written in different computer languages with different I/O schemes and so forth. Many of these new algorithms are slow to run because they are written in interpreted languages like MATLAB. The initial goal of this project is to take existing ideas and source code for data analysis and port them to a unified code standard (perhaps in C or Java) with a unified I/O interface (perhaps Linux pipes or files).
This unified and flexible code base will be made open to the entire web community, permitting users to perform their own analyses using stock code, or modify code templates to generate new algorithms. It is often true that major code blocks (e.g. Fourier Transform) are re-used in a variety of algorithms. For this reason, we propose to make such code blocks into stand-alone programs. Using such a system, a novice user can generate a signal processing pipeline by chaining together multiple programs.
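As a rough illustration of what one such stand-alone block might look like, the sketch below converts a stream of complex field samples into power values. Everything specific here is an assumption made for the example: the name square_stream, and the interchange format (interleaved native-endian 32-bit floats, real then imaginary, on input; one 32-bit float per sample on output). It is not the layout of the setiQuest Data archive, and the real I/O scheme is still to be agreed with the mentors.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical pipeline block: amplitude -> power.
 *
 * Assumed format (illustrative only, not the setiQuest archive layout):
 *   input  = interleaved native-endian 32-bit floats, (re, im) per sample
 *   output = one native-endian 32-bit float per sample, re^2 + im^2
 *
 * Returns the number of samples converted, or -1 on a write error.
 */
long square_stream(FILE *in, FILE *out)
{
    float iq[2];
    long count = 0;

    assert(in != NULL && out != NULL);

    /* fread returns 1 only when a complete (re, im) pair was read,
     * so a trailing half-sample at end of input is ignored. */
    while (fread(iq, sizeof iq, 1, in) == 1) {
        float power = iq[0] * iq[0] + iq[1] * iq[1];
        if (fwrite(&power, sizeof power, 1, out) != 1)
            return -1;  /* e.g. downstream pipe closed or disk full */
        count++;
    }
    return count;
}

/*
 * To turn this into a stand-alone program that can sit in a pipe, add:
 *
 *   int main(void) { return square_stream(stdin, stdout) < 0; }
 */
```

Because the block only reads one stream and writes another, the same function works on files or on Linux pipes, and a chain of such programs can be assembled from the shell without any of them knowing about the others.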
The intern stands to gain multiple benefits from participation. Firstly, s/he will be working with top scientists and software engineers in the field, and will be exposed to standard and novel techniques in scientific software / digital signal processing. An intern with an interest in scientific programming and/or SETI science would have a unique opportunity to participate in the real search for extraterrestrial life. Of course, testing of the new system would use real data from the setiQuest Data archive, and the intern could actually be present at the discovery of a signal. The intern would be encouraged to contribute new ideas for algorithms, and would also have the opportunity to interact with a wide audience as their code is promoted on the setiQuest website and in other domains.
Before the beginning of the summer of code, the intern will interact with the mentors to examine example code and co-develop a consistent I/O scheme which will be the basis for all code blocks. Algorithm examples are provided and useful open-source libraries are suggested (such as FFTW for Fourier Transforms, http://fftw.org/, or GNU Radio for some signal processing examples, http://gnuradio.org/redmine/projects/show/gnuradio). In some cases, Java and C library interfaces already exist, which can be adopted on a rapid time scale.
After choosing an initial platform and set of tools, the intern will develop the code blocks needed to perform at least 4 SETI discovery algorithms. The first two are applied to raw electric field amplitude measurements: Frequency Power Spectrum plus thresholding (the gold standard for SETI analysis) and Autocorrelation Power Spectrum plus thresholding (a technique only recently attempted on a moderate scale; contact mailto:email@example.com for a preprint of a paper on this technique). The other two are the equivalent power/autocorrelation spectroscopy using the same data after squaring (converting to units of power). This list may well expand or change depending on the research directions interesting to the intern.
At the end of the project, we expect there will be a code base for distribution. This code will incorporate standard tools for configuration and compilation. Users simply download and compile the code base, download some setiQuest Data, and get to work discovering ET!
Value to the Open Source Community
The signal processing code developed here will be useful to all computer-savvy SETI enthusiasts who want to get their hands dirty by examining real signals from outer space. This fills an important need, as a variety of observatories are now starting to make raw data available online, but there are no freeware signal processing packages that address the special needs of SETI analysis.
The signal processing facilities provided by this project will be applicable to a wide range of one-dimensional signal processing problems in, e.g., geology, amateur sonar, radar, and high speed diagnostics of digital equipment. The simple-to-understand platform can be helpful in tutorial settings for training in digital signal processing. But mostly, these tools will open the SETI search to a much wider range of participants than ever before.
The successful candidate will have the following qualifications:
- An interest in digital signal processing or scientific computing, and SETI
- Working knowledge of Linux OS and experience downloading/compiling code on Linux
- Firm base in C (or alternatively, strong Java skills including JNI)
- Willingness to implement command-line compilation tools ("configure" and "make" or similar)
- A willingness to create tutorial documentation for the code
Gerald Harp (astrophysicist, http://www.gerryharp.com, firstname.lastname@example.org) and Robert Ackermann (senior software engineer) will co-mentor this student. This mentor pair has collaborated on a variety of successful projects at SETI and the ATA, combining decades of experience in high quality software development, robotic automation, and theoretical physics. Harp has a track record of mentoring students at all college and postgraduate levels at university as well as ~10 interns during his tenure at the SETI Institute. Robert Ackermann's leadership in scientific software architecture and many accomplishments in software development give the team a firm grounding in best practice and experience with open source projects and development.
For more information, updates and a discussion of this project, follow the discussion thread posted at http://setiquest.org/forum/topic/google-summer-code-gsoc