I started doing my data processing using Matlab, which I found really good; however, when I retired I lost access to it. I have tried using Octave, but found its performance and stone-age interface so frustrating that I temporarily stopped working on it.
However, someone has recently pointed me to the language 'Python' and its extension 'Scipy', which seem to offer virtually all of the facilities available in Matlab, and which seem to be available in 64-bit builds, offering the ability to handle much larger arrays.
My question is, has anyone used this in SETI applications, and how does it compare against Octave wrt processing speed etc?
The reason I am asking is that I don't want to spend a great deal of time learning yet another computer language, only to find that it is no better than Octave; my time would be better spent creating a C++ library.
At SETI we do not do much Python. We do work with Ruby and found it to be quite powerful. But our use of Ruby has been limited to scripting to run other programs. If you use Ruby or Python you'll be stuck learning a language, of course.
We do have a user "ksnodgrass" who is in the process of creating a tutorial on how to get the setidata into Octave and perform operations on it. See http://setiquest.org/wiki/index.php/SetiData_tutorial_octave
He is in the middle of creating the tutorial, so you'll have to wait, or contact him. I'll ask him to check into this forum topic and join in this discussion.
Thanks for your advice. I have downloaded Python 2.7.2, and started the long climb up the learning curve (seems to get steeper every time I go up it ;-). I haven't loaded the other components yet. However, to convince myself it was a journey worth taking, I tried a simple experiment. I built a loop looking something like
k = 0
while k <= 100000000
k = k+1
both in Python and Octave. The Python version completed in 14 seconds, whereas the Octave version took 4 minutes 45 seconds to do the same thing. Methinks that the climb is going to be worthwhile.
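For anyone wanting to repeat the timing experiment, a minimal Python 3 sketch follows. The function name `count_up` and the reduced iteration count are illustrative choices, not from the original test, which used Python 2.7 and 100,000,000 iterations; timings will differ from machine to machine.

```python
import time


def count_up(limit):
    """Increment a counter one step at a time, as in the loop above."""
    k = 0
    while k <= limit:
        k = k + 1
    return k


if __name__ == "__main__":
    # Assumption: a smaller limit than the post's 100,000,000, to keep the run short.
    limit = 1_000_000
    start = time.perf_counter()
    result = count_up(limit)
    elapsed = time.perf_counter() - start
    print("final k = %d, elapsed = %.3f s" % (result, elapsed))
```

Raising `limit` to 100000000 reproduces the size of the original test.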
Thanks once again, I appreciate it.
BTW, Octave is "slow" for some things, but generally is not as slow as your test would indicate. I don't know why your simple test was this slow; it makes no sense to me. However, for complicated transformations like singular value decomposition or Fourier transforms, Octave is not slow at all.
For the FFT, Octave is backed by the floating point FFTW library, and our tests show the computation time is comparable to C programs using the same library. Similarly, Octave is backed by BLAS and LAPACK, so matrix operations are fast.
The interpreter may be slow, but it is unusual that this is where you'd spend most of your time in a computation.
I'm not particularly defending Octave, just want to have a realistic discussion.
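The same point, that the heavy lifting happens inside a compiled library call and not in the interpreter, can be sketched in Python as well. The example below is only an illustration on synthetic data: it uses NumPy's built-in FFT routines (which are not FFTW), and the function name `strongest_tone`, the sample rate and the tone frequency are all made-up values.

```python
import numpy as np


def strongest_tone(samples, sample_rate):
    """Return the frequency (Hz) of the largest peak in the power spectrum."""
    spectrum = np.fft.rfft(samples)          # one library call does the work
    power = np.abs(spectrum) ** 2
    power[0] = 0.0                           # ignore the DC bin
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    return freqs[np.argmax(power)]


# Synthetic test signal: a weak 1 kHz tone buried in Gaussian noise.
rng = np.random.default_rng(0)
sample_rate = 8000.0                         # hypothetical sample rate in Hz
n = 8192
t = np.arange(n) / sample_rate
tone_hz = 1000.0
signal = 0.5 * np.sin(2 * np.pi * tone_hz * t) + rng.normal(0.0, 1.0, n)

peak = strongest_tone(signal, sample_rate)
print("strongest tone near %.1f Hz" % peak)
```

There is no explicit loop over samples anywhere in the script, so the interpreter overhead is negligible compared with the transform itself.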
In addition to Gerry's comment:
The majority of my Octave DSP scripts do not contain iterative loops. For efficient processing, it is best to use vector operations. For example, let's suppose Dave's code, which iterates k 100,000,000 times, were to compute the square of k and store those values sequentially in an array "A" (I made something up because Dave's example loop does no work inside it). That would be done efficiently in Octave like this:
To prove it worked (within the limits of double precision in this case), we'll look at the first ten and last ten values:
ans = 1 4 9 16 25 36 49 64 81 100
ans = Columns 1 through 4:
9.999998200000080e+15 9.999998400000064e+15 9.999998600000048e+15 9.999998800000036e+15
Columns 5 through 8:
9.999999000000024e+15 9.999999200000016e+15 9.999999400000008e+15 9.999999600000004e+15
Columns 9 and 10:
Step 2 (above), which does the work, only took a few seconds on my modest development computer.
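For comparison, the same vectorized idea can be sketched in Python with NumPy. This is an illustrative equivalent, not a translation of the Octave listing; the function name `squares` is hypothetical, and the demo uses a smaller count because 100,000,000 doubles need about 800 MB per array.

```python
import numpy as np


def squares(n):
    """Return the squares of 1..n as a double-precision array, with no Python loop."""
    k = np.arange(1, n + 1, dtype=np.float64)
    return k ** 2


# Assumption: a reduced size for the demo; pass 100_000_000 to match the post.
A = squares(1_000_000)
print(A[:10])    # first ten values
print(A[-10:])   # last ten values
```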
I agree that the test is kinda rigged for octave to fail.
Octave will take a speed hit on any iterative loop -- octave/matlab is optimized for matrix operations.
What you want to use is python with numpy. Numpy is optimized for vector operations. Most of the optimization is C/C++ under the hood, but it also takes advantage of BLAS/LAPACK.
In my experience python with numpy code typically runs faster than equivalent octave code, in some cases a lot faster. Like any language, there are various tricks to optimize the code. There are of course disadvantages to using numpy, but the fact that you can use python as an OOP framework with numpy to handle the numerical crunching is very, very desirable to me.
I would like to thank you all for replying to my enquiry. I am afraid that I find learning a new computer language a real uphill struggle these days. I have used Matlab for many years now, and knew that Octave had the vectorizing capability of Matlab; but on several of my programs I found that Octave was substantially slower than Matlab (by almost an order of magnitude).
I had been advised to look at Python, and its libraries like Numpy and Scipy; but what worried me a bit was that, having put in the effort to learn Python and tried it out with a vengeance, I would find that it offered no advantage over Octave. After Max's original recommendation I thought the best bet was to try two identically structured programs and benchmark them against each other. I knew that I was putting Octave at a disadvantage by using a simple loop structure; however, without learning a great deal of Python, it was the simplest program I could write. I was staggered at the difference in performance, which persuaded me it was probably worthwhile to climb the learning curve once more.
What I find really disheartening is that in the past I have had to learn many computer languages, and always claimed that once you knew the basics of programming, you could easily switch to another language, and back then it was true. Python isn't even a difficult language; however, at 66 years old and with a brain tumour, I seem to have to eat my words.
Thanks once again for your help and advice - it is much appreciated.
and in particular machine learning (which I think will eventually need to make its way into SETI).
Of course Matlab will continue to exist and be used due to inertia.
Octave will continue to exist, because there will always be people who want a free Matlab but are hesitant to move over to another language.
All three packages for python+numpy/scipy+matplotlib (PNSM) are free and actively maintained.
Anything you can do in Matlab/Octave you can do in PNSM just as easily -- converse not true, i.e. object-oriented scripting.
Python allows multiple programming styles: functional, sequential, procedural, OO.
Numpy/scipy use compiled code under the hood (mostly C, with some C++ and Fortran), so they run pretty fast ... like Matlab mex files, you can create optimized C/C++ code and swig it into python or numpy.
I've been using Matlab/C/C++ for the last 14 years and Python for the last 10 years ... and numpy/scipy for the last 2 years.
I now do almost all of my non-embedded programming in python+numpy/scipy. This runs fast enough for my DSP/ML needs 99% of the time. For the other 1% I write optimized C++.
I would definitely consider the PNSM combination for any new code/processes created.
If there is anyone left who is looking for a SETI data processing environment on Windows, I can strongly agree with Max-PooperScoopers recommendation for Python. As you know, having lost access to Matlab, and finding the performance of Octave just too slow, and with a stone-age interface, I was looking for something better. I have been experimenting with Python for quite a while and have found that the following combination provides a very professional development environment which is ideally suited for experimentation on the SETI data files.
1) 64 bit Python - can handle the very large arrays that you need for SETI work
2) 64 bit Numpy & Scipy to provide access to all of the signal processing functionality you require (and a lot more)
Numpy also provides arrays to handle complex numbers, and most importantly its memmap function provides
transparent array handling straight from disc, allowing, for example, FFT of huge arrays that would give the dreaded
"out of memory" error on most other maths packages.
3) Matplotlib gives a fairly good graphics capability, although I have found that its connection is flaky when
continuously updating a graph throughout a calculation. Any other window activity causes the link between the graph
and the calculation to get lost, although the calculation continues. This might be caused by my inexperience, as I
am still climbing up the learning curve.
4) PyQt4 gives the ability to generate a relatively nice-looking GUI, although I have yet to find a good way of embedding
plots into the GUI in the way that Matlab's GUIDE provides a graphical axes object, so currently my plots always hang in separate windows.
5) A great discovery was that I had downloaded the toolbox that is designed to interface the 'dot net' compatible version
of Python (Iron Python) to the Microsoft Visual Studio development environment, ready to evaluate that, and to my
surprise it works just as well with normal Python. This provides a really comfortable and professional interface,
giving you access to project structure and full debugging facilities.
Of course the great plus point for someone like me, living off a pension, is that it is all free.
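The disc-backed array handling mentioned in item 2 can be sketched as follows. This is a minimal illustration under stated assumptions: the file layout (raw `complex64` samples), the function name `mean_power_spectrum` and the synthetic test file are all hypothetical and do not describe the actual SETI data format. `numpy.memmap` maps the file into an array, and the code then reads one block at a time so that only `fft_len` samples are in memory at once.

```python
import os
import tempfile

import numpy as np


def mean_power_spectrum(path, fft_len, dtype=np.complex64):
    """Average the power spectra of consecutive blocks read from a raw sample file."""
    data = np.memmap(path, dtype=dtype, mode="r")   # nothing is loaded yet
    n_blocks = len(data) // fft_len
    if n_blocks == 0:
        raise ValueError("file is shorter than one FFT block")
    total = np.zeros(fft_len)
    for i in range(n_blocks):
        # Only this slice is pulled from disc into memory.
        block = np.asarray(data[i * fft_len:(i + 1) * fft_len])
        total += np.abs(np.fft.fft(block)) ** 2
    return total / n_blocks


# Demo on a small synthetic file: a tone in FFT bin 100 plus complex noise.
fft_len = 1024
n = fft_len * 64
rng = np.random.default_rng(1)
t = np.arange(n)
samples = (np.exp(2j * np.pi * 100 * t / fft_len)
           + rng.normal(size=n) + 1j * rng.normal(size=n)).astype(np.complex64)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "synthetic.dat")   # hypothetical file name
    samples.tofile(path)
    spectrum = mean_power_spectrum(path, fft_len)

peak_bin = int(np.argmax(spectrum))
print("strongest bin:", peak_bin)
```

Processing in blocks keeps memory use flat however large the file is; a single FFT over the whole mapped array would still need working memory for the full result.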
I am learning the python+numpy/scipy+matplotlib environment and must admit that I like it. Here is an example of pulsar B0329+54 detection functionality ported from one of my past Octave scripts:
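For readers unfamiliar with pulsar detection, one common technique is epoch folding: the power time series is averaged into phase bins at a trial period, so that a weak periodic pulse adds up while the noise averages down. The sketch below is a generic illustration on synthetic data and is not the script referred to above; the function name `fold`, the sample rate, the pulse shape and the bin count are all assumptions, and the period is only a rough value for B0329+54 (about 0.7145 s).

```python
import numpy as np


def fold(power, sample_rate, period, n_bins=64):
    """Average a power time series into n_bins phase bins at a trial period."""
    t = np.arange(len(power)) / sample_rate
    phase = (t / period) % 1.0
    bins = np.minimum((phase * n_bins).astype(int), n_bins - 1)
    sums = np.bincount(bins, weights=power, minlength=n_bins)
    counts = np.bincount(bins, minlength=n_bins)
    return sums / np.maximum(counts, 1)


# Synthetic data: unit-variance noise plus a narrow pulse centred at phase 0.5.
rng = np.random.default_rng(2)
sample_rate = 1000.0          # hypothetical detector sample rate in Hz
period = 0.7145               # approximate period in seconds
t = np.arange(int(200.0 * sample_rate)) / sample_rate
phase = (t / period) % 1.0
pulse = np.where(np.abs(phase - 0.5) < 0.02, 0.5, 0.0)
power = rng.normal(1.0, 1.0, t.size) + pulse

profile = fold(power, sample_rate, period)
print("pulse peaks in phase bin", int(np.argmax(profile)), "of", len(profile))
```

With real data the period is usually not known exactly, so the fold is repeated over a range of trial periods and the sharpest profile is kept.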
Are you using the Microsoft Visual Studio Shell for doing your program development? I can really recommend it.
It provides a project-based development environment into which you can integrate/reuse Python scripts/programs simply by including the file in the project; this puts all of your code into the editor on separate tabs.
It also provides really powerful debugging capabilities, removing the need for the many debugging print statements that you need with more basic editors. Visual Studio provides single stepping through your code, with seamless transition into functions/classes in the other files in your project if you want; otherwise you can elect to jump straight over the function call if you are certain that the function is bug free. It provides a powerful watch facility where you can investigate how the value of a particular variable (or variables) changes as the program progresses, and also gives you the very nice feature of tool tips when you hover over a 'call' to one of your own functions, showing you exactly what parameters the function is expecting, removing the need to go hunting through your own source code looking for where you originally defined it.
It also provides colour coding of key words, and automatic indentation tracking.
All in all it provides a very professional user interface, which leaves IDLE looking like a real stone-age front end. If you haven't tried it, it is well worth downloading a copy - it's 'free'. If you are using it, sorry to waste your time.
Well, it's 2013 and I somehow stumbled onto this 2010 thread and was so delighted I was compelled to join the forum! I guess it's quite appropriate to see such terrestrial intelligence involved in SETI.
Like Dave Robinson I am a new retiree in extreme Matlab withdrawal, with nearly identical experience and concerns on how to keep on living. Octave's plug-and-play Matlab math performance was delightful and fast enough (e.g. sparse matrix eigenvalues) and looked like the panacea, but then the sloooow plotting interface and lack of a GUI (no Tetris!!) seemed like insurmountable obstacles to me, so I too got caught by the above commentary on PNSM. (In fact, for a separate reason I am about to install Python anyway, because it's apparently the only application that works with a certain development kit.) I have also loaded MVS 2010 Express to try to repair the C++ application. While I was able to script my way through 30+ years of corporate Matlab fun and games (aka work), I am sensing that C coding capability will be needed to patch together the open source quilt needed to cover my pensioner math "work".
So Dave, three years have gone by. I do hope this finds you well. Did you climb the PNSM/MVS learning curve ? Do you have any updates to the prescriptions in the above thread? I would be happy to climb a modest learning slope too (at age 66) and give back some modest incremental code improvements but not a whole graphics engine.
By the way you should consider writing the book "My Life After Matlab"...there must be thousands of us in the same sorry state with exactly the same concerns you voiced at the outset to this thread.
It's great to have a new 'voice' on the Forum - indeed it is nice to have any 'voice' on the Forum. Yes, I have been ploughing on with learning Python, although I must admit my interest in processing SETI signals has sadly diminished with the lack of interest of other members and the Institute itself in the Quest. The thought of discovering a signal from E.T. and then finding no one on Earth was really interested was just too much.
I am currently using Python as a software front end to talk to a USB to I2C converter to control a series of instrumentation modules I am designing, and am getting quite enthusiastic about it, except that after 2 years of working flawlessly Pyserial suddenly stopped talking to my converter, giving me an 'Access Denied' exception - I think Microsoft has done something dodgy with one of its updates that has kiboshed Pyserial. So I have now had to change the interface to a very 'bodgy' system consisting of a C# server connected to the converter via a serial link (which wasn't killed in the update), which talks to the Python via a socket. The scheme seems to do the business, and has the advantage that I can talk to my hardware across the internet with just trivial modifications. This is one of the downsides of using a development environment which is itself evolving: things totally out of your control can break, leaving you right back at the start after several years of effort.
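The Python side of such a socket bridge might look roughly like the sketch below. Everything specific here is an assumption made for illustration: the newline-terminated text protocol, the command string, the function name `send_command`, and the small stand-in server (written in Python only so the example can run on its own; the server described above is a C# program).

```python
import socket
import threading


def send_command(host, port, command, timeout=2.0):
    """Send one newline-terminated command and return the one-line reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(command.encode("ascii") + b"\n")
        with conn.makefile("rb") as reader:
            reply = reader.readline()
    return reply.decode("ascii").strip()


def _stand_in_server(listener):
    """Hypothetical stand-in for the hardware server: acknowledge one command."""
    conn, _ = listener.accept()
    with conn:
        with conn.makefile("rb") as reader:
            line = reader.readline()
        conn.sendall(b"ACK " + line)


# Run the stand-in server on a free local port and send it one command.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]
thread = threading.Thread(target=_stand_in_server, args=(listener,), daemon=True)
thread.start()

reply = send_command("127.0.0.1", port, "READ TEMP")
thread.join()
listener.close()
print(reply)
```

Because the client only needs a host name and port, pointing it at a remote machine instead of `127.0.0.1` is the kind of trivial modification that makes access across the internet possible.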
I am also playing with iPython (not to be confused with ironPython, which is another animal entirely); this attempts to integrate Python, numpy, scipy and matplotlib into an integrated package and provides an interactive environment not too dissimilar to the look and feel of Matlab. There are several interesting books on using iPython for data analysis. If you are interested then I will keep you apprised of my progress on this front.
Your prototyping system wouldn't be a Raspberry Pi, would it? Got one myself, but have yet to get to grips with it.
Once again thanks for your response to the Python thread - it's good to see that it isn't quite dead.
...wow...I finally got a contact..via SETI...
Kidding aside, thanks for the response. I was wondering about SETI after investigating the site the night I joined. Seemed a bit like a ghost town. What is going on with the Array? This was considered as one of the ground stations for our LunarXprize design. I have spent about half of my research career in signal detection and identification, developing, of course, custom Matlab tools beginning in the mid-80's. Though initially drawn by your tool discussion on this thread, I might have been able to contribute something to SETI...the ultimate 'open-loop' signal ID challenge...we are essentially "clueless" for any kind of matched receiver design.
Seems like we are "veritable jumeaux" re post-retirement trajectory. After Octave and Maxima, I too am now booting up on Python, and my son (a Disney IT admin) recommended the iPython interface. As I was able to conduct nearly 100% of my research career with Matlab script, I am also having to slowly nibble into C/C++ etc. Like you I am also getting into real-time applications using the cornucopia of miniature instruments and internet connectivity to things that we have these days. Haven't yet connected directly by USB/i2C; rather, I am doing so via USB wired or USB/Bluetooth to a remote microcontroller. Have looked at RPI for remote camera and Zigbee for whispernet but have not yet made an investment. Also intrigued to try an ultra cheap SDR chip...(real-time SETI: millions of SDRs hooked to millions of SETI apertures??)
The Python client I am using to demo some devices does import Pyserial and it works fine on this XP Acer InspireOne Notebook via its built-in Bluetooth...at least in receive mode. There is an embedded TI MP430 for the i2C devices and the SDK/firmware comes only with the C source code and device libraries...so I am having to "invert" C programming to figure out the command protocol. I think I have figured out the command header and protocol, and it works under Pyserial when the Python client is running. However...I want to open a serial port in Octave, send some configuration commands and plot the received data in real time. PC graphics in any language seems a bit of a chore. My son hacked the Python client, which uses Pygame for animation, after dinner last week and got a crude pixel-level plot going -- and then left. I judged it would be faster to use existing graphics in Octave or iPython. I have finally installed iPython etc. and I can see others' success with realtime matplotlib on the internet, but haven't yet "unwrapped" these tools, so decided to start with Octave. I just need serviceable engineering plots for development and don't need a complicated GUI. I can receive the initial (open loop) device data using sp=fopen('COMx') and fread(sp,N), but get no joy with fprintf(sp,"cmd"). Same thing with the Realterm freeware. Something about my setup isn't allowing me to send commands, while others in the development community have been successful. One difference is that I am automatically spawning two COM ports (send and receive) with my built-in BT dongle, while the other report of success used an external Toshiba BT dongle and has only one COM port...and so the band plays on...I do enjoy learning...but if I wasn't retired this one exercise would have probably cost someone the price of a new car by now...mostly for constant hunting, downloading and installing open source software in the language du jour.
So internet-connected instruments and devices are definitely of current interest, as is software-defined radio. I have a deep engineering science research background in MEMS and in signal ID...so a reasonable chance of intersecting with the SETI challenge. But if you would like to continue on about more "terrestrial pursuits" and the optimum trajectory in open source software, feel free to contact me by email at firstname.lastname@example.org. I would be happy to hear more on your interests as well.
P.S. Subsequently woke up to the fact that fprintf or fputs(sp,"cmd") were not compatible with the MP430 parsing code thruput. Say "cmd"="abc9". Octave or Realterm bursts out each character (byte) at full baud rate (115200) but the receiving code loop apparently couldn't keep up. At any rate when I used an Octave sequence: sp=fopen('COM6','r+');fputs(sp,"a");pause(1);fputs(sp,"b");pause(1);fputs(sp,"c");fputs(sp,"9");pause(1);data=fgetl(sp)...fclose(sp);clear sp.
I was finally able to get a serial command into the device...of course how Python/Pyserial managed to match the MP430 thruput remains a deep mystery.
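The same pacing workaround, sending a command one byte at a time with a pause after each byte, can be sketched in Python. The helper `send_slowly` and its default delay are hypothetical; it accepts any object with `write` and `flush` methods, so the demo below uses an in-memory buffer in place of real hardware. It does not explain why the existing Pyserial client works without pacing.

```python
import io
import time


def send_slowly(port, command, delay=0.05):
    """Write command (bytes) one byte at a time, pausing after each byte."""
    for byte in command:
        port.write(bytes([byte]))
        port.flush()          # push the byte out before waiting
        time.sleep(delay)     # give a slow receiver time to parse it


# Demo with an in-memory stand-in for the serial port.
fake_port = io.BytesIO()
send_slowly(fake_port, b"abc9", delay=0.01)
print(fake_port.getvalue())

# With pyserial installed, a real port object could be passed instead, e.g.
#   import serial
#   with serial.Serial("COM6", 115200, timeout=1) as sp:
#       send_slowly(sp, b"abc9")
#       data = sp.readline()
# The port name and baud rate are taken from the post; the delay is a guess.
```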
Apologies for our lack of attention to this forum. Severe budget setbacks make it impossible for us to spend any time on this forum.
The ATA is alive and well. We're observing every night for SETI signals as well as doing radio astronomy projects.
We recently got a 3.6 M$ grant from a philanthropist to upgrade all the ATA receivers to have much greater sensitivity and the capacity to cover frequencies from 0.9 - 18 GHz.
Upgrading the receivers is the ONLY project currently funded for our team. This is why we cannot afford time to keep up the forum.
Wish us luck on getting new funding. We just submitted a grant proposal to NSF.
I assume then this is why SETILive has not been updating live data on the site.
Just a quick comment, slightly off topic - i.e. very little to do with Python. As Gerry said the ATA is actively collecting data, and to my knowledge, there is a 'dumbed down' citizen science sister program to SETIQuest called SETILive, which allows you to use the mark one eyeball to identify potential straight line signals on the time frequency (Waterfall) plots. I am afraid I rapidly got tired of classifying these lines, as it provided very little in the way of mental stimulation; my interest is in designing what I hope are novel algorithms to automatically extract signals buried in noise.
I don't know whether you have discovered it during your perusal of the site, but buried away here: -
is an index into a database of what might be terabytes of data taken from the ATA on lots of interesting astronomical objects and spacecraft, which is available for you to download. Alas, it is no longer being updated. However, I have found that the pulsar data is particularly interesting, as here you know there is a real signal buried in there, so you are not looking for a signal that might not exist. The signal from psrb0329+54 is fairly easy to find; however, signals from the others, such as the Crab Pulsar, are proving to be quite a challenge.