EVERY “CRUNCHER” WORKING ON PROJECTS RUNNING ON BOINC SOFTWARE OWES A DEBT TO THE SETI@HOME PROJECT. NO SETI@HOME, NO BOINC. SO I THINK THAT IT IS FAIR TO SAY THAT ANYONE CRUNCHING ON ANY PROJECT, EVEN THOSE PURIST WCG CRUNCHERS, SHOULD BE REPAYING THAT DEBT BY CRUNCHING FOR SETI@HOME.
SETI@HOME IS SOMETIMES CONFUSED AND CONFLATED WITH THE SETI INSTITUTE. THEY ARE SEPARATE ORGANIZATIONS.
Everything in this post was taken directly from the SETI@home web site, with the exception of a wee portion from Wikipedia. There is a lot more information to be found if you access the SETI@home web pages.
“SETI@home (“SETI at home”) is an Internet-based public volunteer computing project employing the BOINC software platform, hosted by the Space Sciences Laboratory, at the University of California, Berkeley, in the United States. SETI is an acronym for the Search for Extra-Terrestrial Intelligence. Its purpose is to analyze radio signals, searching for signs of extraterrestrial intelligence, and is one of many activities undertaken as part of SETI.
SETI@home was released to the public on May 17, 1999, making it the second large-scale use of distributed computing over the Internet for research purposes, as Distributed.net was launched in 1997. Along with MilkyWay@home and Einstein@home, it is the third major computing project of this type that has the investigation of phenomena in interstellar space as its primary purpose.
How SETI@home works
The Problem — Mountains of Data
Most of the SETI programs in existence today, including those at UC Berkeley, build large computers that analyze the data from the telescope in real time. None of these computers look very deeply at the data for weak signals, nor do they look for a large class of signal types. The reason is that they are limited by the amount of computer power available for data analysis. To tease out the weakest signals, a great amount of computer power is necessary. It would take a monstrous supercomputer to get the job done. SETI programs could never afford to build or buy that computing power. There is a trade-off that they can make. Rather than a huge computer to do the job, they could use a smaller computer but just take longer to do it. But then there would be lots of data piling up. What if they used LOTS of small computers, all working simultaneously on different parts of the analysis? Where can the SETI team possibly find the thousands of computers they’d need to analyze the data continuously streaming from Arecibo?
The UC Berkeley SETI team has discovered that there are already thousands of computers that might be available for use. Most of these computers sit around most of the time with toasters flying across their screens accomplishing absolutely nothing and wasting electricity to boot. This is where SETI@home (and you!) come into the picture. The SETI@home project hopes to convince you to allow us to borrow your computer when you aren’t using it and to help us “…search out new life and new civilizations.” We’ll do this with a screen saver that can go get a chunk of data from us over the internet, analyze that data, and then report the results back to us. When you need your computer back, our screen saver instantly gets out of the way and only continues its analysis when you are finished with your work.
Screenshot of SETI@home Enhanced BOINC Screensaver (v6.03)
It’s an interesting and difficult task. There’s so much data to analyze that it seems impossible! Fortunately, the data analysis task can be easily broken up into little pieces that can all be worked on separately and in parallel. None of the pieces depends on the other pieces. Also, there is only a finite amount of sky that can be seen from Arecibo. In the next two years the entire sky as seen from the telescope will be scanned three times. We feel that this will be enough for this project. By the time we’ve looked at the sky three times, there will be new telescopes, new experiments, and new approaches to SETI. We hope that you will be able to participate in them too!
Breaking Up the Data
Data will be recorded on high-density tapes at the Arecibo telescope in Puerto Rico, filling about one 35 Gbyte DLT tape per day. Because Arecibo does not have a high bandwidth Internet connection, the data tape must go by snail-mail to Berkeley. The data is then divided into 0.25 Mbyte chunks (which we call “work-units”). These are sent from the SETI@home server over the Internet to people around the world to analyze.
Extra Credit Section: How the data is broken up
SETI@home looks at 2.5 MHz of data, centered at 1420 MHz. This is still too broad a spectrum to send to you for analysis, so we break this spectrum space up into 256 pieces, each 10 kHz wide (more like 9766 Hz, but we’ll simplify the numbers to make calculations easier to see). This is done with a software program called the “splitter”. These 10 kHz pieces are now more manageable in size. To record signals up to 10 kHz you have to record the bits at 20,000 bits per second (20 kbps). (This is called the Nyquist rate.) We send you about 107 seconds of this 10 kHz (20 kbps) data. Rounding that to 100 seconds, 100 seconds times 20,000 bits per second equals 2,000,000 bits, or about 0.25 megabyte given that there are 8 bits per byte. Again, we call this 0.25 megabyte chunk a “work-unit.” We also send you lots of additional info about the work-unit, so the total comes out to about 340 kbytes of data.
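The arithmetic in this section is easy to check for yourself. The short Python sketch below is only an illustration, not anything taken from the SETI@home code. The function and variable names are made up for this post, and it assumes the bit rate is simply twice the bandwidth, as the text describes. The tape estimate ignores any overlap or header overhead.

```python
# Illustrative arithmetic only; names here are hypothetical, not from SETI@home.

TOTAL_BANDWIDTH_HZ = 2.5e6   # 2.5 MHz band centered at 1420 MHz
NUM_SUBBANDS = 256           # pieces produced by the "splitter"

# Exact width of one piece: 9765.625 Hz, which the text rounds to 10 kHz.
subband_hz = TOTAL_BANDWIDTH_HZ / NUM_SUBBANDS


def workunit_bytes(bandwidth_hz, seconds):
    """Bytes needed to record `seconds` of data at twice the bandwidth in bits/s."""
    bits_per_second = 2 * bandwidth_hz
    return bits_per_second * seconds / 8


# The simplified numbers used in the text: 10 kHz and 100 seconds.
simplified = workunit_bytes(10_000, 100)

# The less rounded numbers: 9765.625 Hz and 107 seconds.
less_rounded = workunit_bytes(subband_hz, 107)

# Rough count of 0.25 Mbyte work-units in one 35 Gbyte tape.
workunits_per_tape = 35e9 / simplified

print(f"one piece is {subband_hz:.3f} Hz wide")
print(f"simplified work-unit: {simplified:.0f} bytes")
print(f"less rounded work-unit: {less_rounded:.0f} bytes")
print(f"about {workunits_per_tape:.0f} work-units per tape")
```

Either way the answer lands close to a quarter of a megabyte, which is why the rounded numbers are good enough for the explanation.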
What is Astropulse?
Snapshot of BOINC SETI@home Astropulse Screensaver.
Astropulse is a new type of SETI. It expands on the original SETI@home, but does not replace it. The original SETI@home is narrowband, meaning that it is listening for a particular radio frequency. That’s like listening to an orchestra playing, and trying to hear when anyone plays the note “A sharp”. Astropulse listens for short-time pulses. In the orchestra analogy, it’s like listening for a quick drum beat, or a series of drumbeats. Since no one knows what extraterrestrial communications will “sound like,” it seems like a good idea to search for several types of signals. In scientific terms, Astropulse is a sky survey that searches for microsecond transient radio pulses. These pulses could come from ET, or from some other source. I’ll define each of those terms:
• Sky survey: The telescope we use (Arecibo Observatory) scans across the sky, searching for signals everywhere. This differs from a directed SETI search, in which the telescope examines a few stars carefully.
• Microsecond: A millionth of a second. Astropulse is better than previous searches at detecting signals that last for a very short length of time. The shorter the signal, the better Astropulse is at detecting it, to a lower limit of 0.4 microseconds. Astropulse can detect signals shorter than 0.4 microseconds; it just stops getting better and better in comparison to other searches.
• Transient: A signal is transient if it is short, like a drumbeat. A transient signal can be repeating (it beats over and over again) or single pulse (it beats only once).
• Radio: The signals are made of the same type of electromagnetic radiation that an AM or FM radio detects. (Actually of substantially higher frequency than that, but still considered “radio.”) Electromagnetic radiation includes radio waves, microwaves, infrared light, visible light, ultraviolet light, x-rays, and gamma rays.
Sources of pulses
Where would a microsecond transient radio pulse come from? There are several possibilities, including:
• ET: Previous searches have looked for extraterrestrial communications in the form of narrow-band signals, analogous to our own radio stations. Since we know nothing about how ET might communicate, this might be a bit closed-minded.
• Pulsars and RRATs: Pulsars are rotating neutron stars that can produce signals as short as 100 microseconds, although typically much longer. 0.4 microseconds seems like a stretch. Astropulse is capable of detecting pulsars, but is unlikely to find any new ones. RRATs are a recently discovered pulsar variant. Perhaps Astropulse will discover a new type of rotating neutron star with a very short duty cycle.
• Exploding primordial black holes: Martin Rees has theorized that a black hole, exploding via Hawking radiation, might produce a signal that’s detectable in the radio.
• Extragalactic pulses: Some scientists recently saw a single transient radio pulse from far outside the Milky Way galaxy. No one knows what caused it, but perhaps there are more of them for Astropulse to find.
• New phenomena: Perhaps the most likely result is that we will discover some unknown astrophysical phenomenon. Any time an astronomer looks at the sky in a new way, he or she may see a new phenomenon, whether it be a type of star, explosion, galaxy, or something else.
As a microsecond transient radio pulse comes to us from a distant source in space, it passes through the interstellar medium (ISM). The ISM is a gas of hydrogen atoms that pervades the whole galaxy. There is one big difference between the ISM and ordinary hydrogen gas. Some of the hydrogen atoms in the ISM are ionized, meaning they have no electron attached to them. For each ionized hydrogen atom in the ISM, a free electron is floating off somewhere nearby. A substance composed of free floating, ionized particles is called a plasma.
The microsecond radio pulse is composed of many different frequencies. As the pulse passes through the ISM plasma, the high frequency radiation goes slightly faster than the lower frequency radiation. When the pulse reaches Earth, we look at the parts of the signal ranging from 1418.75 MHz to 1421.25 MHz. This is a range of 2.5 MHz. The highest frequency radiation arrives about 0.4 milliseconds to 4 milliseconds earlier than the lowest frequency radiation, depending on the distance from which the signal originates. This effect is called dispersion.
In order to see the signal’s true shape, we have to undo this dispersion. That is, we must dedisperse the signal. Dedispersion is the primary purpose of the Astropulse algorithm.
Not only does dedispersion allow us to see the true shape of the signal, it also reduces the amount of noise that interferes with the signal’s visibility. Noise consists of fluctuations that produce a false signal. There could be electrical noise in the telescope, for instance, creating the illusion of a signal where there is none. Because dispersion spreads a signal out to be up to 10,000 times as long, this can cause 10,000 times as much noise to appear with the signal. (There’s a square root factor due to the math, so there’s really only 100 times as much noise power, but that’s still a lot.)
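To make the idea of undoing dispersion concrete, here is a toy Python sketch. It is not the Astropulse algorithm, which the text above does not spell out. It uses the simplest possible approach: split the signal into a few frequency channels, shift each channel back by its own delay, and add the channels together. The channel count, delays, and function names are all invented for the illustration.

```python
# Toy dedispersion by shifting and summing channels.
# All values and names are hypothetical; this is not Astropulse's method.

def disperse(pulse_sample, delays, n_samples):
    """Make one time series per channel, with the pulse arriving later
    in channels that have a larger delay (the lower frequencies)."""
    channels = []
    for delay in delays:
        series = [0.0] * n_samples
        series[pulse_sample + delay] = 1.0
        channels.append(series)
    return channels


def dedisperse(channels, delays):
    """Shift each channel earlier by its delay, then sum across channels."""
    n_samples = len(channels[0])
    summed = [0.0] * n_samples
    for series, delay in zip(channels, delays):
        for t in range(n_samples - delay):
            summed[t] += series[t + delay]
    return summed


# Delays in samples, highest frequency channel first.
delays = [0, 2, 4, 6]
channels = disperse(pulse_sample=5, delays=delays, n_samples=20)

# Adding the channels without correction leaves the pulse smeared out.
smeared = [sum(column) for column in zip(*channels)]

# Correcting for the delays stacks the pulse back into a single sample.
restored = dedisperse(channels, delays)

print("smeared peak:", max(smeared))
print("restored peak:", max(restored), "at sample", restored.index(max(restored)))
```

The smeared sum never rises above the height of a single channel, while the corrected sum piles all four channels into one sample. That is the gain in visibility the paragraph above describes.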
The amount of dispersion depends on the amount of ISM plasma between the Earth and the source of the pulse. The dispersion measure (DM) tells us how much plasma there is. DM is measured in “parsecs per centimeter cubed”, which is written pc cm⁻³. To get the DM, multiply the distance to the source of the signal (in parsecs) by the electron density in electrons per cubic centimeter. A parsec is about 3 light years. So if a source is 2 parsecs away, and the space between the Earth and that source is filled with plasma, with 3 free electrons per cubic centimeter, then that’s 6 pc cm⁻³. The actual density of free electrons in the ISM is about 0.03 per cubic centimeter.
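The DM recipe above, and the arrival-time spread across the 2.5 MHz band, can be sketched in a few lines of Python. The delay formula and its constant of roughly 4.149 milliseconds (with frequencies in GHz) are the standard cold-plasma dispersion relation used in pulsar astronomy. They are an assumption added here and do not appear in the quoted text. The function names are invented for this example.

```python
# Sketch of the DM arithmetic. The dispersion constant is the standard
# textbook value, assumed here; it is not given in the quoted text.

K_DM_MS = 4.149  # milliseconds, for DM in pc cm^-3 and frequency in GHz


def dispersion_measure(distance_pc, electrons_per_cm3):
    """DM = distance in parsecs times free electron density per cubic cm."""
    return distance_pc * electrons_per_cm3


def delay_ms(dm, f_low_mhz, f_high_mhz):
    """How much later the low-frequency edge arrives than the high edge."""
    f_low_ghz = f_low_mhz / 1000.0
    f_high_ghz = f_high_mhz / 1000.0
    return K_DM_MS * dm * (f_low_ghz ** -2 - f_high_ghz ** -2)


# The worked example from the text: 2 parsecs at 3 electrons per cubic cm.
example_dm = dispersion_measure(2, 3)

# Spread across the observed band at a DM of 55 pc cm^-3.
spread_ms = delay_ms(55, 1418.75, 1421.25)

# Distance that gives DM = 55 at the typical density of 0.03 per cubic cm.
distance_pc = 55 / 0.03

print("example DM:", example_dm, "pc cm^-3")
print(f"spread at DM 55: {spread_ms:.2f} ms")
print(f"distance for DM 55: about {distance_pc:.0f} parsecs")
```

Under these assumptions a DM of 55 pc cm⁻³ gives a spread of roughly 0.4 milliseconds, in line with the low end of the range quoted earlier.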
Single Pulse Loops
Astropulse has to analyze the whole workunit at nearly 15,000 different DMs (14,208, to be precise). At each DM, the whole dedispersion algorithm has to be run again for the entire workunit. The lowest DM is 55 pc cm⁻³, and the highest is 800 pc cm⁻³. Astropulse examines DMs at regular intervals between those two. Without going into detail about how to examine a piece of a workunit at a given DM, here is the organization with which Astropulse handles the data: it divides the DMs to be covered into large DM chunks of 128 DMs each, and then small DM chunks of 16 DMs each. It divides the data into chunks of 4096 bytes, and processes them one at a time. Once it has dedispersed the data, Astropulse co-adds the dedispersed data at 10 different levels, meaning that it looks for signals of size 0.4 microseconds, then twice that, 4 times, 8 times, and so on. (0.4 microseconds, 0.8, 1.6, 3.2, 6.4, …) On the lowest level of organization, Astropulse looks at individual bins of data. A bin corresponds to 2 bits of the original data, but after dedispersion, it requires a floating point number to represent it. Here’s the breakdown of Astropulse’s loops:
1 workunit => 111 large DM chunks
1 large DM chunk => 8 small DM chunks
1 small DM chunk => 2048 data chunks
1 data chunk => 16 DMs
1 DM => 10 fold levels
1 fold level => 16384 bins (or less)
1 bin = smallest unit
So each workunit is composed of 111 large DM chunks, each of which is 0.901% of the whole. Each large DM chunk is composed of 8 small DM chunks, each of which is 0.113% of the whole. And so on.
The number of large DM chunks will probably change before the final version of Astropulse is released.
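The numbers in the loop breakdown hang together, and a quick Python check shows how. This is only bookkeeping on the figures quoted above, with variable names invented for the purpose. The workunit length at the end assumes that the 2048 data chunks cover the whole workunit and that each bin is one 0.4 microsecond sample, which the text implies but does not state outright.

```python
# Bookkeeping on the quoted loop sizes; names are hypothetical.

LARGE_DM_CHUNKS = 111
DMS_PER_LARGE_CHUNK = 128
DMS_PER_SMALL_CHUNK = 16
DATA_CHUNKS = 2048
BYTES_PER_DATA_CHUNK = 4096
BITS_PER_BIN = 2
SAMPLE_US = 0.4          # microseconds per bin (assumed to be one sample)
COADD_LEVELS = 10

total_dms = LARGE_DM_CHUNKS * DMS_PER_LARGE_CHUNK
small_per_large = DMS_PER_LARGE_CHUNK // DMS_PER_SMALL_CHUNK
bins_per_data_chunk = BYTES_PER_DATA_CHUNK * 8 // BITS_PER_BIN

# Share of the whole job done by one large or one small DM chunk.
large_chunk_percent = 100 / LARGE_DM_CHUNKS
small_chunk_percent = 100 / (LARGE_DM_CHUNKS * small_per_large)

# Pulse widths searched at each co-add level: 0.4, 0.8, 1.6, ... microseconds.
coadd_widths_us = [SAMPLE_US * 2 ** level for level in range(COADD_LEVELS)]

# Length of one workunit if the data chunks span all of it.
workunit_seconds = DATA_CHUNKS * bins_per_data_chunk * SAMPLE_US * 1e-6

print("total DMs:", total_dms)
print("small chunks per large chunk:", small_per_large)
print("bins per data chunk:", bins_per_data_chunk)
print(f"large chunk: {large_chunk_percent:.3f}% of the whole")
print(f"small chunk: {small_chunk_percent:.3f}% of the whole")
print(f"widest co-add level: {coadd_widths_us[-1]:.1f} microseconds")
print(f"workunit length: {workunit_seconds:.1f} seconds")
```

The totals reproduce the 14,208 DMs, the 16384 bins, and the two percentages given above.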
Fast Folding Algorithm
At the end of each small and large DM chunk, Astropulse performs the Fast Folding Algorithm (FFA). This algorithm checks for repeating pulses over a certain range of periods. (The period is the length of time after which the pulse repeats.) When the FFA is performed after each large DM chunk, it searches over an entire 13 second workunit, and looks for repeating signals with a period of 256 times the sample interval (256 * 0.4 microseconds) or more. When the FFA is performed after each small DM chunk, it searches over a small fraction of the workunit, and looks for repeating signals with a period of 16 times the sample interval or more.”
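For a feel of what folding does, here is a small Python sketch. It is plain brute-force folding, not the Fast Folding Algorithm itself, which gets the same answer more cheaply by reusing partial sums between trial periods. The test signal, the range of trial periods, and the function names are all made up for this illustration.

```python
# Brute-force folding, shown only to illustrate the idea behind the FFA.
# Names and test data are hypothetical.

SAMPLE_US = 0.4  # microseconds per sample, from the text


def fold(series, period):
    """Add the series on top of itself every `period` samples."""
    profile = [0.0] * period
    for index, value in enumerate(series):
        profile[index % period] += value
    return profile


def best_period(series, trial_periods):
    """Pick the trial period whose folded profile has the tallest bin."""
    return max(trial_periods, key=lambda period: max(fold(series, period)))


# A made-up signal: one pulse every 7 samples, 70 samples long.
signal = [1.0 if index % 7 == 3 else 0.0 for index in range(70)]

found = best_period(signal, range(5, 10))
peak = max(fold(signal, found))

# Shortest periods searched, per the text: 256 and 16 samples.
min_period_large_us = 256 * SAMPLE_US
min_period_small_us = 16 * SAMPLE_US

print("best trial period:", found, "samples, peak height", peak)
print(f"shortest period after a large DM chunk: {min_period_large_us:.1f} microseconds")
print(f"shortest period after a small DM chunk: {min_period_small_us:.1f} microseconds")
```

At the right period every pulse lands in the same bin and the peak grows with each repeat. At the wrong periods the pulses scatter across bins and stay small.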
David is a computer scientist, with research interests in volunteer computing, distributed systems, and real-time systems. He also runs the BOINC project.
David is a rock climber, mountain climber, classical pianist, and father of Noah (born Oct 2005).
Dan specializes in signal processing for radio astronomy. He has been doing SETI since 1979, and he runs the SERENDIP, Optical SETI, and CASPER projects.
Dan dabbles in jazz piano, and is the father of a 4-year old son, William.
BOINC is a leader in the field(s) of Distributed Computing, Grid Computing and Citizen Cyberscience. BOINC is more properly the Berkeley Open Infrastructure for Network Computing, developed at UC Berkeley.
Visit the BOINC web page, click on Choose projects and check out some of the very worthwhile studies you will find. Then click on Download and run BOINC software/All Versions. Download and install the current software for your 32-bit or 64-bit system, for Windows, Mac or Linux. When you install BOINC, it will install its screen savers on your system as a default. You can choose to run the various project screen savers or you can turn them off. Once BOINC is installed, in BOINC Manager/Tools, click on “Add project or account manager” to attach to projects. Many BOINC projects are listed there, but not all, and maybe not the one(s) in which you are interested. You can get the proper URL for attaching to a project at that project’s web page. BOINC will never interfere with any other work on your computer.
MAJOR PROJECTS RUNNING ON BOINC SOFTWARE
SETI@home The search for extraterrestrial intelligence. “SETI (Search for Extraterrestrial Intelligence) is a scientific area whose goal is to detect intelligent life outside Earth. One approach, known as radio SETI, uses radio telescopes to listen for narrow-bandwidth radio signals from space. Such signals are not known to occur naturally, so a detection would provide evidence of extraterrestrial technology.
Radio telescope signals consist primarily of noise (from celestial sources and the receiver’s electronics) and man-made signals such as TV stations, radar, and satellites. Modern radio SETI projects analyze the data digitally. More computing power enables searches to cover greater frequency ranges with more sensitivity. Radio SETI, therefore, has an insatiable appetite for computing power.
Previous radio SETI projects have used special-purpose supercomputers, located at the telescope, to do the bulk of the data analysis. In 1995, David Gedye proposed doing radio SETI using a virtual supercomputer composed of large numbers of Internet-connected computers, and he organized the SETI@home project to explore this idea. SETI@home was originally launched in May 1999.”
SETI@home is the birthplace of BOINC software. Originally, it only ran in a screensaver when the computer on which it was installed was doing no other work. With the power and memory available today, BOINC can run 24/7 without in any way interfering with other ongoing work.
The famous SETI@home screen saver, a beauteous thing to behold.
Einstein@Home The search for pulsars. “Einstein@Home uses your computer’s idle time to search for weak astrophysical signals from spinning neutron stars (also called pulsars) using data from the LIGO gravitational-wave detectors, the Arecibo radio telescope, and the Fermi gamma-ray satellite. Einstein@Home volunteers have already discovered more than a dozen new neutron stars, and we hope to find many more in the future. Our long-term goal is to make the first direct detections of gravitational-wave emission from spinning neutron stars. Gravitational waves were predicted by Albert Einstein almost a century ago, but have never been directly detected. Such observations would open up a new window on the universe, and usher in a new era in astronomy.”
MilkyWay@Home “Milkyway@Home uses the BOINC platform to harness volunteered computing resources, creating a highly accurate three dimensional model of the Milky Way galaxy using data gathered by the Sloan Digital Sky Survey. This project enables research in both astroinformatics and computer science.”
World Community Grid (WCG) World Community Grid is a special case at BOINC. WCG is part of the social initiative of IBM Corporation and the Smarter Planet. WCG has under its umbrella currently eleven disparate projects at globally wide ranging institutions and universities. Most projects relate to biological and medical subject matter. There are also projects for Clean Water and Clean Renewable Energy. WCG projects are treated respectively and respectably on their own at this blog. Watch for news.
Rosetta@home “Rosetta@home needs your help to determine the 3-dimensional shapes of proteins in research that may ultimately lead to finding cures for some major human diseases. By running the Rosetta program on your computer while you don’t need it you will help us speed up and extend our research in ways we couldn’t possibly attempt without your help. You will also be helping our efforts at designing new proteins to fight diseases such as HIV, Malaria, Cancer, and Alzheimer’s….”
GPUGrid.net “GPUGRID.net is a distributed computing infrastructure devoted to biomedical research. Thanks to the contribution of volunteers, GPUGRID scientists can perform molecular simulations to understand the function of proteins in health and disease.” GPUGrid is a special case in that all processor work done by the volunteers is GPU processing. There is no CPU processing, which is the more common kind. Other projects (Einstein, SETI, Milky Way) also feature GPU processing, but they offer CPU processing for those not able to do work on GPUs.
These projects are just the oldest and most prominent projects. There are many others from which you can choose.
There are currently some 300,000 users with about 480,000 computers working on BOINC projects. That is in a world of over one billion computers. We sure could use your help.