From Science Node: “Hacking Zika in the Lone Star state”

Science Node

25 May, 2016 [just put in social media]
Jorge Salazar

More than 50 data scientists, engineers, and UT Austin students joined forces at the Austin Zika Hackathon to use big data to fight the spread of Zika.

Hackathon participants investigated ways to pool together different sets of data — outbreak reports, stagnant water sources, empty swimming pools and ponds that are potential mosquito breeding grounds, and even Facebook and Twitter feeds.

The Texas Advanced Computing Center (TACC) plans to store all the data on a new data-intensive supercomputer called Wrangler.

Wrangler is one of the newest Extreme Science and Engineering Discovery Environment (XSEDE) supercomputing resources, and is supported by the National Science Foundation (NSF).

“We’re trying to collect these disparate pieces of data, and there’s not a good way for people to ask questions about that data — that’s the big problem,” says Ari Kahn, human translational genomics coordinator at TACC.

Zika, a mosquito-borne disease that can cause fever and birth defects, threatens to spread to the United States. As of mid-May 2016, Mexico had reported 272 cases of Zika. The problem has grown so large that President Obama has requested $1.9 billion to halt the spread of Zika.

The US Centers for Disease Control and Prevention (CDC) is now ramping up collection of data that tracks the spread of Zika. But big gaps exist in linking different kinds of data, which makes it tough for experts to predict where the virus will go next and what to do to prevent it.

The Zika hackers formed groups and worked on demo projects based on sample data from the CDC and other sources. One project developed a working TensorFlow model that used machine learning to search aerial images for pools of stagnant water, potential breeding grounds for the mosquitoes that carry Zika.

Another team developed a mobile app with nodes that would allow researchers to report developing cases of mosquito-borne illness. A third demonstrated a way to map microcephaly occurrences in Brazil using an R mapping interface to Leaflet. Yet another made headway in preparing CDC data from Puerto Rico to layer with CIA World Factbook data for a richer understanding of how Zika has progressed there.

“The Zika Hackathon is about bringing awareness and building a platform that is repeatable, not just for the Zika virus data analysis,” says Zika Hackathon organizer Eddie Garcia, chief security architect at Cloudera. “Someone can take what we did here today and apply it to some other unknown outbreak. It’s really about getting people together, excited, bringing awareness, and building out a platform that is repeatable.”

The Zika Hackathon brought together an emerging kind of scientist, a data scientist. Data scientists specialize both in translating information from many different sources into data that can be used together and in using new technologies by which knowledge can be extracted from today’s massive data collections.

“There are three classes of work that get put under the umbrella of data science,” says data scientist Juliet Hougland of Cloudera. “1) Data scrubbing – getting data in the right format, in the right place — is a huge part of any job where you’re going to do something useful with that data. 2) Investigative analytics looks at historic data and doing interesting, useful analysis on it. 3) Operational analytics supports recommendation engines, fraud detection systems, and more.”

At the hackathon, software developer David Walling of TACC’s Data Intensive Computing group spoke of his current research extracting rich data from ‘grey literature,’ unofficial records that can be images inside PDF files, a bane of data scientists. His work uses natural language processing techniques to map occurrences in the grey literature of a given species such as fish at specific locations and dates. Progress on this problem would translate well to getting more information for researchers about Zika.
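To make the idea of mapping species occurrences to places and dates concrete, here is a minimal sketch in Python. It is not Walling's method: it swaps real natural language processing for a simple term lookup plus a date pattern, and the function name, term lists, and sample text are all hypothetical, made up for illustration.

```python
import re
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Matches dates written like "3 May 2016" (an assumed format, not a general date parser).
DATE_RE = re.compile(rf"\b(\d{{1,2}}) ({'|'.join(MONTHS)}) (\d{{4}})\b")


def extract_occurrences(text, species_terms, place_names):
    """Return (species, place, ISO date) tuples that co-occur in one sentence."""
    records = []
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        lowered = sentence.lower()
        species = [s for s in species_terms if s.lower() in lowered]
        places = [p for p in place_names if p.lower() in lowered]
        dates = [
            date(int(year), MONTHS.index(month) + 1, int(day)).isoformat()
            for day, month, year in DATE_RE.findall(sentence)
        ]
        # Emit one record per species/place/date combination in the sentence.
        for s in species:
            for p in places:
                for d in dates:
                    records.append((s, p, d))
    return records


# Hypothetical sample text, not taken from any real report.
sample = (
    "Largemouth bass were recorded at Lake Travis on 3 May 2016. "
    "No sampling took place in June. "
    "Channel catfish were seen near Lady Bird Lake on 12 April 2015."
)
records = extract_occurrences(
    sample,
    species_terms=["largemouth bass", "channel catfish"],
    place_names=["Lake Travis", "Lady Bird Lake"],
)
for record in records:
    print(record)
```

A lookup like this only works on text that has already been pulled out of the document. For scanned images inside PDF files, that extraction step is the hard part the paragraph above describes.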

“If you can see where all the water sources are and then overlay how the reports of outbreaks are happening, then you can create a model for how it’s spreading and how it will spread in the future based on where the water sources are. Then maybe you can come up with some plans to offset that so the spreading doesn’t happen as fast or doesn’t happen at all,” Kahn says.
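The overlay Kahn describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coordinates, source names, and the 1 km radius are invented, and counting reports near each water source is a simple stand-in for a real spread model.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def reports_near_water(reports, water_sources, radius_km):
    """Count, for each water source, the outbreak reports within radius_km."""
    counts = {name: 0 for name, _, _ in water_sources}
    for rlat, rlon in reports:
        for name, wlat, wlon in water_sources:
            if haversine_km(rlat, rlon, wlat, wlon) <= radius_km:
                counts[name] += 1
    return counts


# Made-up coordinates for illustration only; not real outbreak or water data.
water_sources = [("pond_a", 30.27, -97.74), ("pool_b", 30.40, -97.70)]
reports = [(30.271, -97.741), (30.268, -97.745), (30.399, -97.702), (31.0, -98.0)]

counts = reports_near_water(reports, water_sources, radius_km=1.0)
print(counts)
```

Water sources with many nearby reports would be natural candidates for closer inspection. A real model would also need time stamps, mosquito range, and population data, none of which are represented here.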

Cloudera Cares, the charitable arm of the data analytics company Cloudera, along with TACC and other local partners, is planning to hold quarterly hackathons as part of a larger planned project to use big data to battle Zika and other threats. The project aims to help prevent outbreaks and to make it easier for researchers to get answers.

To learn more about the Zika outbreak, check these resources from the CDC, the World Health Organization, and the European Centre for Disease Prevention and Control.

See the full article here.


There is a new project at World Community Grid [WCG] called OpenZika.

WCG runs on your home computer or tablet on software from Berkeley Open Infrastructure for Network Computing [BOINC]. Many other scientific projects run on BOINC software. Visit WCG or BOINC, download and install the software, then at WCG attach to the OpenZika project. You will be joining tens of thousands of other “crunchers” processing computational data and saving the scientists literally thousands of hours of work at no real cost to you.

Please help promote STEM in your local schools.

Science Node is an international weekly online publication that covers distributed computing and the research it enables.

“We report on all aspects of distributed computing technology, such as grids and clouds. We also regularly feature articles on distributed computing-enabled research in a large variety of disciplines, including physics, biology, sociology, earth sciences, archaeology, medicine, disaster management, crime, and art. (Note that we do not cover stories that are purely about commercial technology.)

In its current incarnation, Science Node is also an online destination where you can host a profile and blog, and find and disseminate announcements and information about events, deadlines, and jobs. In the near future it will also be a place where you can network with colleagues.

You can read Science Node via our homepage, RSS, or email. For the complete iSGTW experience, sign up for an account or log in with OpenID and manage your email subscription from your account preferences. If you do not wish to access the website’s features, you can just subscribe to the weekly email.”