From iSGTW: “EarthServer: Big Earth data at your fingertips becomes a reality”


international science grid this week

August 12, 2015
No Writer Credit

An image of Europe created using Envisat’s Medium Resolution Imaging Spectrometer [MERIS]. Image courtesy ESA.

MERIS is an instrument on the ESA Envisat spacecraft.

The Earth sciences, like geology, oceanography, and astronomy, generate vast quantities of data. Yet without the right tools, scientists either drown in this sea of big Earth data or the data sits in an archive, barely used.

The vision of the EarthServer project is to offer researchers ‘big Earth data at your fingertips’, so that they can access and manipulate enormous data sets with just a few mouse clicks.

“The project was the result of a ‘push’ and a ‘pull’,” says project coordinator Peter Baumann, professor of computer science at Jacobs University in Bremen, Germany. “On the demand side there was a need for new concepts to handle the wave of data crashing down on us. On the supply side we had a data cube technology that is well-suited to this domain.” A data cube is a three- (or higher) dimensional array of values, commonly used to describe a time series of image data.

Data cubes help researchers access and visualize data

EarthServer built advanced data cubes and custom web portals to make it possible for researchers to extract and visualize earth sciences data as 3D cubes, 2D maps, or 1D diagrams. The British Geological Survey, for example, used EarthServer technology to drill down through different layers of the Earth in 3D.
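
The three views can be illustrated with a minimal sketch in plain Python, assuming a toy cube stored as nested lists and indexed as `cube[t][y][x]`. The names `subcube`, `map_at`, and `series_at` are hypothetical and are not part of EarthServer or rasdaman; a real service would evaluate such requests on the server.

```python
# Hypothetical toy cube: 3 time steps of a 2x3 image, indexed cube[t][y][x].
cube = [
    [[1, 2, 3], [4, 5, 6]],        # t = 0
    [[7, 8, 9], [10, 11, 12]],     # t = 1
    [[13, 14, 15], [16, 17, 18]],  # t = 2
]

def subcube(cube, t_range, y_range, x_range):
    """3D view: trim the cube to half-open index ranges on each axis."""
    return [
        [row[x_range[0]:x_range[1]] for row in image[y_range[0]:y_range[1]]]
        for image in cube[t_range[0]:t_range[1]]
    ]

def map_at(cube, t):
    """2D view: a copy of the image at one time step."""
    return [row[:] for row in cube[t]]

def series_at(cube, y, x):
    """1D view: the values of one pixel across all time steps."""
    return [image[y][x] for image in cube]

print(subcube(cube, (1, 3), (0, 1), (1, 3)))  # [[[8, 9]], [[14, 15]]]
print(map_at(cube, 1))                        # [[7, 8, 9], [10, 11, 12]]
print(series_at(cube, 1, 2))                  # [6, 12, 18]
```

All three functions read from the same array; only the axes that are fixed or trimmed differ, which is why one cube can feed a 3D view, a map, and a diagram.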

“For the user, data cubes hide the unnecessary complexity of the data,” says Baumann. “As a user, I don’t want to see a million files: I want to see a few data cubes.”

Data in the Earth sciences often takes the form of sensor recordings, images, simulation outputs, and statistical measurements — each often with an associated time dimension. The data items typically form regular or irregular grid values with space/time coordinates. EarthServer makes these arrays available as data cubes.
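
For a regular grid, the link between space/time coordinates and array positions is simple arithmetic. The sketch below assumes each axis is described by an origin, a step, and a size; the class `RegularAxis` and the sample latitude, longitude, and time values are illustrative assumptions, not EarthServer's data model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RegularAxis:
    """A regularly spaced axis: the coordinate of cell i is origin + i * step."""
    name: str
    origin: float
    step: float
    size: int

    def index_of(self, coordinate: float) -> int:
        """Return the index of the grid cell nearest to a coordinate."""
        i = round((coordinate - self.origin) / self.step)
        if not 0 <= i < self.size:
            raise ValueError(f"{coordinate} is outside axis {self.name!r}")
        return i

# Hypothetical axes: daily time steps and a half-degree spatial grid.
time_axis = RegularAxis("day", origin=0.0, step=1.0, size=3)
lat_axis = RegularAxis("lat", origin=50.0, step=0.5, size=10)
lon_axis = RegularAxis("lon", origin=5.0, step=0.5, size=10)

# Translate a (day, lat, lon) coordinate into cube indices (t, y, x).
position = (time_axis.index_of(2.0), lat_axis.index_of(52.5), lon_axis.index_of(8.0))
print(position)  # (2, 5, 6)
```

An irregular grid cannot use this formula; it would need an explicit list of coordinates per axis and a search over that list.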

Aside from ease-of-use, the data cubes also make it possible to integrate data from different disciplines, and scientists can combine measurement data with data generated from simulations.

Building on existing technologies

To handle big Earth data efficiently, EarthServer needed to extend existing technologies and standards. The SQL database query language, for example, is oriented mainly towards alphanumeric data rather than large multi-dimensional arrays.

To enable data cubes, the project was built upon rasdaman, a new type of database management system specialized in multi-dimensional gridded data, called rasters or arrays. Rasdaman enables the flexible, fast extraction of data from big Earth data arrays of any size.

“Essentially, we have married the SQL database language with image processing,” says Baumann. “This is now becoming part of the ISO SQL standard.”

In addition, the project has strongly influenced the Big Earth Data standards of the Open Geospatial Consortium and INSPIRE, the European Spatial Data Infrastructure.

EarthServer’s researchers also developed a ‘semantic parallelization’ technology that sub-divides a single database query into multiple sub-queries. These are sent to other database servers for processing.

This method enables EarthServer to distribute a single incoming query over more than 1,000 cloud nodes and to answer queries on hundreds of terabytes of data in less than a second.
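
The article does not describe how the splitting works internally, so the following is only a generic split-and-merge sketch of the idea. It assumes the query is "mean of all cells in a cube", that the cube is split along the time axis, and that a local thread pool stands in for remote database servers. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(tile):
    """Sub-query: return (sum, count) for one tile, a list of 2D images."""
    values = [v for image in tile for row in image for v in row]
    return sum(values), len(values)

def split_along_time(cube, n_parts):
    """Split cube[t][y][x] into at most n_parts contiguous tiles along t."""
    if not cube:
        raise ValueError("empty cube")
    size = -(-len(cube) // n_parts)  # ceiling division
    return [cube[i:i + size] for i in range(0, len(cube), size)]

def distributed_mean(cube, n_workers=4):
    """Answer 'mean of all cells' by merging per-tile partial results."""
    tiles = split_along_time(cube, n_workers)
    # Each worker stands in for a separate server evaluating one sub-query.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, tiles))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count

# Toy cube: 6 time steps of a 2x2 image holding the values 0..23.
cube = [[[t * 4 + y * 2 + x for x in range(2)] for y in range(2)] for t in range(6)]
print(distributed_mean(cube))  # 11.5
```

The merge step works because a mean can be rebuilt from per-tile sums and counts. Queries whose results cannot be combined this way would need a different decomposition.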

Bigger and better: EarthServer-2

EarthServer-1, which ran from September 2011 for 36 months and received €4 million (about $4.4 million) in EU funding, involved a range of multinational partners. Building on the success of this first phase, EarthServer successfully applied for funding from the European Commission to support its next phase, EarthServer-2.

EarthServer-2 kicked off in May 2015 and will focus on the ‘data cube’ paradigm, as well as on handling even higher data volumes. “The plan is to focus on the fusion of data from different domains and to be able to resolve a query on a petabyte within a second,” says Baumann. “That would mean that a user could view the data on screen and manipulate it interactively.” The project is now working on the next frontier, open-source 4D visualization.



iSGTW is an international weekly online publication that covers distributed computing and the research it enables.

“We report on all aspects of distributed computing technology, such as grids and clouds. We also regularly feature articles on distributed computing-enabled research in a large variety of disciplines, including physics, biology, sociology, earth sciences, archaeology, medicine, disaster management, crime, and art. (Note that we do not cover stories that are purely about commercial technology.)

In its current incarnation, iSGTW is also an online destination where you can host a profile and blog, and find and disseminate announcements and information about events, deadlines, and jobs. In the near future it will also be a place where you can network with colleagues.

You can read iSGTW via our homepage, RSS, or email. For the complete iSGTW experience, sign up for an account or log in with OpenID and manage your email subscription from your account preferences. If you do not wish to access the website’s features, you can just subscribe to the weekly email.”