From iSGTW: “Laying the groundwork for data-driven science”


international science grid this week

October 22, 2014
Amber Harmon

The ability to collect and analyze massive amounts of data is rapidly transforming science, industry, and everyday life, but many of the benefits of big data have yet to surface. Interoperability, tools, and hardware are still evolving to meet the needs of diverse scientific communities.

Image courtesy istockphoto.com.

One of the US National Science Foundation’s (NSF’s) goals is to improve the nation’s capacity in data science by investing in the development of infrastructure, building multi-institutional partnerships to increase the number of data scientists, and making data more useful and easier to use.

As part of that effort, the NSF announced $31 million in new funding to support 17 innovative projects under the Data Infrastructure Building Blocks (DIBBs) program. The program is now in its second year, and the 2014 DIBBs awards support research in 22 states and touch on topics in computer science, information technology, and nearly every field of science supported by the NSF.

“Developed through extensive community input and vetting, NSF has an ambitious vision and strategy for advancing scientific discovery through data,” says Irene Qualters, division director for Advanced Cyberinfrastructure. “This vision requires a collaborative national data infrastructure that is aligned to research priorities and that is efficient, highly interoperable, and anticipates emerging data policies.”

Of the 17 awards, two support early implementations of research projects that are more mature; the others support pilot demonstrations. Each is a partnership between researchers in computer science and other science domains.

One of the two early implementation grants will support a research team led by Geoffrey Fox, a professor of computer science and informatics at Indiana University, US. The team plans to create middleware and analytics libraries that enable large-scale data science on high-performance computing systems, and to test the platform with several different applications, including geospatial information systems (GIS), biomedicine, epidemiology, and remote sensing.

“Our innovative architecture integrates key features of open source cloud computing software with supercomputing technology,” Fox said. “And our outreach involves ‘data analytics as a service’ with training and curricula set up in a Massive Open Online Course or MOOC.”

Among others, US institutions collaborating on the project include Arizona State University in Phoenix; Emory University in Atlanta, Georgia; and Rutgers University in New Brunswick, New Jersey.

Ken Koedinger, professor of human-computer interaction and psychology at Carnegie Mellon University in Pittsburgh, Pennsylvania, US, leads the other early implementation project. Koedinger’s team concentrates on developing infrastructure that will drive innovation in education.

The team will develop a distributed data infrastructure, LearnSphere, that will make more educational data accessible to course developers, while also motivating more researchers and companies to share their data with the greater learning sciences community.

“We’ve seen the power that data has to improve performance in many fields, from medicine to movie recommendations,” Koedinger says. “Educational data holds the same potential to guide the development of courses that enhance learning while also generating even more data to give us a deeper understanding of the learning process.”

The DIBBs program is part of a coordinated strategy within NSF to advance data-driven cyberinfrastructure. It complements other major efforts like the DataONE project, the Research Data Alliance, and Wrangler, a groundbreaking data analysis and management system for the national open science community.

See the full article here.

iSGTW is an international weekly online publication that covers distributed computing and the research it enables.

“We report on all aspects of distributed computing technology, such as grids and clouds. We also regularly feature articles on distributed computing-enabled research in a large variety of disciplines, including physics, biology, sociology, earth sciences, archaeology, medicine, disaster management, crime, and art. (Note that we do not cover stories that are purely about commercial technology.)

In its current incarnation, iSGTW is also an online destination where you can host a profile and blog, and find and disseminate announcements and information about events, deadlines, and jobs. In the near future it will also be a place where you can network with colleagues.

You can read iSGTW via our homepage, RSS, or email. For the complete iSGTW experience, sign up for an account or log in with OpenID and manage your email subscription from your account preferences. If you do not wish to access the website’s features, you can just subscribe to the weekly email.”
