NASA has started a rather ambitious project: to provide open-source everything. The main site is located at http://open.nasa.gov. From here, there is access to data, code and applications, among other things. This is a great launching point for anyone interested in space science and NASA work. In this article, I look at what kind of code is being made available that you might want to explore.
The available software covers several genres. Some are low-level, systems-layer software. You can go ahead and do some really long-distance transfers with the Interplanetary Overlay Network (ION). This is an implementation of the Delay-Tolerant Networking architecture (DTN) as described in RFC 4838. This software is physically hosted at SourceForge, and you can use this code to communicate with your next interplanetary probe.
A bit more down to earth is a middleware package that actually is hosted by the Apache Foundation. You can download and use the Object-Oriented Data Technology (OODT) middleware. OODT is component-based, so you can pick and choose which parts you want to use. There are components to handle transparent access to distributed resources, data discovery and query optimization, and distributed processing. There are also components to handle work-flow and resource management. Groups that are using it include the Children's Hospital of Los Angeles and NASA's Planetary Data System. If you're managing data systems, this might be worth taking a look at.
Getting back to actual science processing, you might want to download the Data Productivity Toolkit (DPT). This package is a collection of command-line tools, written in Python, that lets you work on text data files. These utilities follow the UNIX design method of having small utilities that do one task well, and then chaining them together to do more complicated processing. There are tools for massaging and manipulating your data, tools for doing statistics on that data and even tools for visualizing the data and the results. Many of the tools even provide an API to basic Python and numpy/scipy/matplotlib routines.
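To give a feel for the "small tools chained together" idea, here is a minimal sketch in plain Python. It does not use DPT itself; the function names (drop_comments, select_column, to_floats, summarize, run_pipeline) and the sample data are all made up for illustration. Each stage does one small job and hands its output to the next, much like commands joined by a shell pipe.

```python
import statistics
from typing import Dict, Iterable, Iterator

# Hypothetical sample: a comment header followed by "time temperature" rows.
SAMPLE = """# time temp
0 10.0
1 12.0
2 14.0
"""


def drop_comments(lines: Iterable[str]) -> Iterator[str]:
    """Pass through every line that does not start with '#'."""
    for line in lines:
        if not line.lstrip().startswith("#"):
            yield line


def select_column(lines: Iterable[str], index: int) -> Iterator[str]:
    """Yield one whitespace-separated field from each non-empty line."""
    for line in lines:
        fields = line.split()
        if fields:
            yield fields[index]


def to_floats(values: Iterable[str]) -> Iterator[float]:
    """Convert each text field to a float."""
    for value in values:
        yield float(value)


def summarize(values: Iterable[float]) -> Dict[str, float]:
    """Collect the stream and compute a few basic statistics."""
    data = list(values)
    return {"count": len(data), "mean": statistics.mean(data), "max": max(data)}


def run_pipeline(text: str) -> Dict[str, float]:
    """Chain the stages: strip comments, pick column 1, convert, summarize."""
    lines = text.splitlines()
    return summarize(to_floats(select_column(drop_comments(lines), 1)))


if __name__ == "__main__":
    print(run_pipeline(SAMPLE))
```

Because every stage is a generator, the stages can be rearranged or swapped out without touching the others, which is the main attraction of this design.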
And, while I'm talking about Python and science, you also can look at SunPy. SunPy aims to provide a library of routines that are useful in studying solar physics. With it, you can query the Virtual Solar Observatory (VSO) and grab data that you can process. Many routines are available that allow you to plot this data using various color maps and processing filters. There is a Sun object that contains physical constants useful in solar physics, along with the sun's position and numerous other solar attributes.
A lot of the computational work done at NASA involves clusters of machines and massively parallel code. This means the NASA folks have needed to put together lots of tools to manage these machines. They also have been nice enough to release a lot of this code for public consumption. The first of these is mutil (Multi-Threaded Multi-Node Utilities). In the standard GNU file tools, cp and md5sum operate as a single-threaded process on a single machine. The mutil tools provide drop-in replacements called mcp and msum. These utilities use multithreading to make sure each node is kept as busy as possible. Read and write parallelism allows for individual operations of a single copy to be interleaved through asynchronous I/O. Split file processing allows for different threads to operate on different portions of a file in parallel.
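The split file processing idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not how msum is actually implemented: the function names (hash_chunk, split_checksum) are hypothetical, and the "hash of chunk hashes" it produces is not the same value a plain checksum tool would report for the file.

```python
import hashlib
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def hash_chunk(path: str, offset: int, length: int) -> bytes:
    """Hash one byte range of the file, using its own file handle."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        return hashlib.sha256(handle.read(length)).digest()


def split_checksum(path: str, chunk_size: int = 1 << 20, workers: int = 4) -> str:
    """Hash fixed-size chunks in parallel, then hash the ordered chunk digests."""
    size = os.path.getsize(path)
    offsets = range(0, size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so the result is deterministic.
        digests = list(
            pool.map(lambda offset: hash_chunk(path, offset, chunk_size), offsets)
        )
    return hashlib.sha256(b"".join(digests)).hexdigest()


if __name__ == "__main__":
    # Demo on a throwaway file of random bytes.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(3 * 1024 * 1024 + 17))
    try:
        print(split_checksum(tmp.name))
    finally:
        os.remove(tmp.name)
```

Each worker opens the file separately and seeks to its own offset, so no thread waits on another's file position. The number of workers does not change the result, only how the work is spread out.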
NASA also provides a utility to give SSH access to your cluster. There is a middleware utility called mesh (Middleware Using Existing SSH Hosts) that provides single sign-on capability. Mesh sits on top of SSH, and instead of using the local authorized_keys file, loads a file from a dedicated server at runtime. Mesh also has its own shell (called mash) that restricts what applications are available to the user. Using this system, you can dynamically add and remove the SSH hosts that are available to be used. Also, because the authentication is handled by a library that is preloaded when SSH first starts up, the restrictions are sure to be enforced on the user.
Now that you have a connection mechanism, you may need to handle load balancing across all of these machines. Again, NASA comes to your aid. It has a software package called ballast (Balancing Load Across Systems) that might help. This package handles load balancing for SSH connections specifically. Each available host runs a ballast client, and there are one or more ballast servers. The servers maintain system load information gathered from the clients and use it to make decisions about where to send SSH connection requests. Because all of this is handled over SSH, the policy deciding which host to connect to also can take into account the user name. This way, you can have policies that are specific to each user. This lets you better tune the best options for each user, rather than trying to find a common policy that everyone is forced to use.
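The server-side decision described here, picking a host from reported load while letting the policy vary per user, might look something like the following Python sketch. None of this is ballast's real code or configuration format; HostStatus, by_load, by_memory and choose_host are hypothetical names chosen for the example, and the metrics are assumed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class HostStatus:
    """Load information a client might report (fields are assumptions)."""
    name: str
    load: float          # for example, a 1-minute load average
    free_mem_gb: float   # free memory in gigabytes


# A policy maps a host's status to a score; the lowest score wins.
Policy = Callable[[HostStatus], float]


def by_load(host: HostStatus) -> float:
    """Prefer the host with the lowest load."""
    return host.load


def by_memory(host: HostStatus) -> float:
    """Prefer the host with the most free memory."""
    return -host.free_mem_gb


def choose_host(
    user: str,
    hosts: Iterable[HostStatus],
    policies: Dict[str, Policy],
    default: Policy = by_load,
) -> str:
    """Return the best host name for this user under that user's policy."""
    candidates = list(hosts)
    if not candidates:
        raise ValueError("no hosts available")
    policy = policies.get(user, default)
    return min(candidates, key=policy).name


if __name__ == "__main__":
    hosts = [
        HostStatus("node-a", load=0.5, free_mem_gb=4.0),
        HostStatus("node-b", load=2.0, free_mem_gb=64.0),
    ]
    per_user = {"bob": by_memory}  # bob runs memory-hungry jobs
    print(choose_host("alice", hosts, per_user))
    print(choose_host("bob", hosts, per_user))
```

Keeping each policy as a plain scoring function means a new per-user rule is just one more entry in the dictionary, and users without an entry fall back to the default.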
Going back to doing science, another important task is visualization, and NASA has released several tools to help. The first one I look at here is World Wind. This is an Earth visualization system. You can use it to get a 3-D look at Earth and to see data projected onto the globe. It is a Java application, so it works on any desktop that has a Java virtual machine, as well as in most browsers. It is a full development kit, and it has several example applications that you can use as jumping-off points for your own code.
Taking visualization further from the surface of the Earth, there is ViSBARD (Visual System for Browsing, Analysis and Retrieval of Data). This application allows you to pull data from multiple satellites and display them concurrently. It also allows for 3-D viewing of all of this data. This type of vector field information is very difficult to analyze in 2-D plots, hence the need for this kind of tool. The latest version also allows you to visualize MHD (Magneto-Hydro-Dynamic) models. This way, you can compare results from model calculations to actual satellite measurements.
More extensive image processing can be done with the Vision Workbench. This is an application and a full library of imaging and computer vision algorithms. It isn't meant to be a complete, cutting-edge library though. Rather, it provides solid implementations of standard algorithms you can use as starting points in developing your own algorithms.
When you're ready to go and launch your own satellite, you can download and use the Core Flight Executive (cFE). This software is used as the basis for flight data systems and instrumentation. It is written in C and based on OSAL (Operating System Abstraction Layer). It has an executive, along with time and event services. You can track your satellite with the ODTBX (Orbit Determination Toolbox). The ODTBX package handles orbit determination analysis and early mission analysis. It's available as both MATLAB code and Java.
The last piece of code I cover here is S4PM (Simple, Scalable, Script-based Science Processor for Measurements). This actually is used at the Goddard Earth Sciences Data and Information Services Center to do data processing. It is built up out of a processing engine, a toolkit and a graphical monitor. S4PM allows a single person to manage hundreds of jobs simultaneously. It also is designed to make setting up new processing strings relatively easy.
The open-source project at NASA doesn't cover only code. NASA has been releasing data as well. The Kepler Project is looking for exo-planets. As I mentioned previously, you can download data from the Solar Dynamics Observatory. You can work on climate data by checking out information from the Tropical Rainfall Measuring Mission. You can look up tons of data from the various moon missions, from Apollo on up. There also is data from the various planetary missions. Climate data and measurements of Earth are available too.
I've touched on only a few of the items NASA has provided for the public. Hopefully, you have seen enough to go and check out the rest in more detail. There is a lot of science that regular citizens can do, and NASA is doing its part to try to put the tools into your hands.






