From ALICE at CERN: “40 Years of Large Scale Data Analysis in HEP: Interview with René Brun”

16 October 2017
Virginia Greco

Over a 40-year career at CERN, René Brun developed a number of software packages that became widely used in high-energy physics. For these fundamental contributions he was recently awarded a special prize of the EPS High Energy Particle Physics Division. We talked with him about the key events of this (hi)story.

René Brun giving a seminar at CERN (on October 4, 2017) about “40 Years of Large Scale Data Analysis in HEP – the HBOOK, Paw and Root Story”. [Credit: Virginia Greco]

It is hard to imagine that one and the same person can be behind many of the most important and widely used software packages developed at CERN and in high-energy physics: HBOOK, PAW, ROOT and GEANT. This passionate and visionary person is René Brun, now an honorary member of CERN, who was recently awarded a special prize of the EPS High Energy Particle Physics Division “for his outstanding and original contributions to the software tools for data management, detector simulation, and analysis that have shaped particle and high energy physics experiments for many decades”. Over his 40-year career at CERN, he worked with many brilliant scientists, and we cannot forget that the realization of such endeavors is always the product of a collaborative effort. Nevertheless, René has had the undoubted merit of conceiving new ideas, proposing projects and working hard and enthusiastically to turn them into reality.

One of his creations, ROOT, is a data analysis tool widely used in high-energy and nuclear physics experiments, at CERN and in other laboratories. It has already spread beyond physics and is now being applied in other scientific fields and even in finance. GEANT is an extremely successful software package developed by René Brun for simulating physics experiments and particle interactions in detectors. Its latest version, GEANT4, is currently the first choice of particle physicists dealing with detector simulations.

But before ROOT and GEANT4, which are well known even among the youngest physicists, many other projects had been proposed and many software tools developed. It is a fascinating story, which René was invited to tell at a recent colloquium organized at CERN by the EP department.

As he recounts, it all started in 1973, when he was hired in the Data Handling (DD) division at CERN to work with Carlo Rubbia on the R602 experiment at the ISR. His duty was to help develop a special hardware processor for the online reconstruction of the collision patterns. But since this development was moving slowly and was not occupying much of his work time, René was asked to write some software for the event reconstruction in multiwire proportional chambers. “At that time, I hated software,” René confesses, smiling. “I had written software during my PhD thesis, while studying in Clermont-Ferrand and working at CERN during the weekends, and I hadn’t really enjoyed it. I had joined Rubbia’s group with the ‘promise’ that I would work on hardware, but very quickly I became a software guy again…”

In a short time, René implemented in software (programming in FORTRAN IV) what they could not achieve in hardware and, in addition, he developed a histogram package called HBOOK. This made it possible to perform a very basic analysis of the data: creating histograms, filling them and sending the output to a line printer. He also wrote a program called HPLOT, which specialized in drawing histograms generated by HBOOK.

At that time, there were no graphics devices, so the only way to visualize histograms was to print them on a line printer, and programs were written on punched cards.

René remembers with affection the time spent punching cards, not for the procedure itself, which was slow and quite tedious, but for the long chats he used to have in the room where the card punchers and printers of the DD department were located, as well as in the cafeteria nearby. In those long hours, he could discuss ideas and comment on new technologies with colleagues.

Huge progress was made possible by the introduction of the teletype, which replaced card punchers. Users could create programs in a disk file and communicate with a central machine, called FOCUS, while at the same time seeing on a roll of paper what they were doing, as on a normal typewriter. “The way it worked can make people smile today,” René recounts. “To log in to FOCUS, one had to type a command which caused a red light to flash in the computer centre. Seeing the light, the operator would mount into the memory of the machine the tape of the connected person, who could thus run a session on the disk. When the user logged out, the session was again dumped on tape. You can imagine the traffic! But this was still much faster than punching cards.”

Some time later, the teletype was in turn replaced by a Tektronix 4010 terminal, which brought a big revolution, since it made it possible to display results in graphical form. This new, very expensive device allowed René to speed up the development of his software: HBOOK first, then another package called ZBOOK and the first version of GEANT. Created in 1974 with his colleagues in the Electronic Experiments (EE) group, GEANT1 was a tool for performing simple detector simulations. Gradually, they added features to this software and were able to generate collision simulations: GEANT2 was born.

In 1975 René joined the NA4 experiment, a deep inelastic muon scattering experiment in the North Area, led by Carlo Rubbia. There he collaborated on the development of new graphics tools that allowed histograms to be printed on a device called a CalComp plotter. This machine, which worked with a 10-meter-long roll of paper, offered much better resolution than line printers, but was very expensive to run. In 1979 a microfilm system was introduced: histograms saved on film could be inspected before being sent to the plotter, so that only the interesting ones were printed. This reduced the expense of using the CalComp.

René was then supposed to follow Rubbia to the UA1 experiment, for which he had been doing many simulations. “Without knowing that I was simulating for UA1,” René points out. But instead, at the end of 1980, he joined the OPAL experiment, where he performed all the simulations and created GEANT3.

While working on the HBOOK system, in 1974 René had developed a memory management and I/O system called ZBOOK. This tool was an alternative to the HYDRA system, which was being developed in the bubble chamber group by the late Julius Zoll (also the author of a code management system called Patchy).

Thinking that it made no sense to have two competing systems, in 1981 the late Emilio Pagiola proposed the development of a new software package called GEM. While three people were working hard on the GEM project, René and Julius together started to run benchmarks to compare their systems, ZBOOK and HYDRA, with GEM. Through these tests, they came to the conclusion that the new system was far slower than theirs.

In 1983 Ian Butterworth, the then Director for Computing, decided that only the ZBOOK system would be supported at CERN, that GEM had to be stopped and that HYDRA would be frozen. “My group leader, Hans Grote, came to my office, shook my hand and told me: ‘Congratulations René, you won.’ But I immediately thought that this decision was not fair, because actually both systems had good features and Julius Zoll was a great software developer.”

As a consequence of this decision, René and Julius started a collaboration and joined forces to develop a package integrating the best features of both ZBOOK and HYDRA. The new project was called ZEBRA, a combination of the names of the two original systems. “When Julius and I announced that we were collaborating, Ian Butterworth immediately called both of us to his office and told us that, if in 6 months the ZEBRA system was not functioning, we would be fired from CERN. But indeed, less than two months later we were already able to show a running primary version of the ZEBRA system.”

At the same time, histogram and visualization tools were under development. René put together an interactive version of HBOOK and HPLOT, called HTV, which ran on Tektronix machines. But in 1982 the advent of personal workstations marked a revolution. The first personal workstation introduced in Europe, the Apollo, represented a leap in terms of features and performance: it was faster and had more memory and a better user interface than any previous device. “I was invited by the Apollo company to go to Boston and visit them,” René recounts. “When I first saw the Apollo workstation, I was shocked. I immediately realized that it could speed up our development by a factor of 10. I set to work and I think that in just three days I adapted some 20000 lines of code for it.”

René’s work in adapting HTV for the Apollo workstation attracted the interest of the late Rudy Böck, Luc Pape and Jean-Pierre Revol from the UA1 collaboration, who also suggested some improvements. As a result, in 1984 the three of them drew up a proposal for a new package, based on HBOOK and ZEBRA, which they called PAW, short for Physics Analysis Workstation.

The PAW team: (from the left) René Brun, Pietro Zanarini, Olivier Couet (standing) and Carlo Vandoni.

After an initial period of uncertainty, the PAW project developed quickly and many new features were introduced, thanks also to the increasing memory of the workstations. “At a certain point, the PAW software was growing so fast that we started to receive complaints from users who could not keep up with the development,” says René, smiling. “Maybe we were a bit naïve, but certainly full of enthusiasm.”

The programming language generally used for scientific computing was FORTRAN. In particular, at that time FORTRAN 77 (introduced in 1977) was widespread in the high-energy physics community and the main reason for its success was the fact that it was well structured and quite easy to learn. Besides, very efficient implementations of it were available on all the machines used at the time. As a consequence, when the new FORTRAN 90 appeared, it seemed obvious that it would replace FORTRAN 77 and that it would be as successful as the previous version. “I remember well the leader of the computing division, Paolo Zanella, saying: ‘I don’t know what the next programming language will do but I know its name: FORTRAN.’”

In 1990 and 1991 René, together with Mike Metcalf, who was a great expert on FORTRAN, worked hard to adapt the ZEBRA package to FORTRAN 90. But this effort did not lead to a satisfactory result, and discussions arose about whether to keep working with FORTRAN or move to another language. It was the period when object-oriented programming was taking its first steps and also when Tim Berners-Lee joined René’s group.

Berners-Lee was supposed to develop a documentation system, called XFIND, to replace the earlier FIND, which could run only on IBM machines, with one that would be usable on other devices. He believed, though, that the procedure he was supposed to implement was a bit clumsy and certainly not the best approach to the problem. So he proposed a different solution with a more decentralized and adaptable approach, which required first of all some standardization work. In this context, Berners-Lee developed the by-now-very-famous idea of World Wide Web servers and clients, implemented in an object-oriented language (Objective-C).

It was a very intense period, because the design and simulation phase of the experiments for the new accelerator, the LHC, had been launched. It was important to make a decision about the programming language and the software tools to use in these new projects.

At the Erice workshop, organized by INFN in November 1990, and then at the Computing in High Energy Physics (CHEP) conference in Annecy (France) in September 1992, the high-energy physics “software gurus” of the world gathered to discuss programming languages and possible directions for software in HEP. Among the many languages proposed were Eiffel, Prolog, Modula-2 and others.

In 1994 two Research and Development (RD) projects were launched: RD44, with the objective of implementing a new version of GEANT in C++ (which would become GEANT4), and RD45, aiming to investigate object-oriented database solutions for the LHC experiments.

According to René, his division was split into three camps: those who wanted to stay with FORTRAN 90, those who bet on C++ and those who were interested in using commercial products. “I presented a proposal to develop a package that would take PAW to the OO world. But the project, which I called ZOO, was rejected and I was even invited to take a sabbatical leave,” René admits.

This blow, though, later proved to be a stroke of luck for René. His division leader, David Williams, suggested that he join the NA49 experiment in the North Area, which needed somebody to help develop its software. At first, he refused. For years he had been leading both the GEANT and the PAW projects and performing simulations or developing software for different groups and applications, so going back to work on one specific experiment appeared to him a big limitation.

But he gave it a second thought and realized that it was an opportunity to take some time to develop new software, with total freedom. He went to visit the NA49 building on the Prévessin site and, seeing pine trees and squirrels from the windows, he felt that it was indeed the kind of quiet environment he needed for his new project. So he moved his workstation from his office to the Prévessin site (“I did it during a weekend, without even telling David Williams”) and, while working for NA49, he taught himself C++ by converting a large part of his HBOOK software into this new OO language.

At the beginning of 1995, René was joined in NA49 by Fons Rademakers, with whom he had already collaborated. The two of them worked very hard for several months and produced the first version of what became the famous ROOT system. The name comes simply from the combination of the first letter of the email addresses of the two founders (René and Rdm, for Rademakers), the double O of Object Oriented and the word Technology. But the meaning of the word ‘root’ also fitted well with its being a basic framework on which more software could be developed and with the use of tree structures in its architecture.

In November of the same year, René gave a seminar to present the ROOT system. “The Computing Division auditorium was unexpectedly crowded!” René recalls, “I think it was because people thought that Fons and I had disappeared from the software arena, while all of a sudden we were back again!” And actually the ROOT system generated considerable interest.

But while René and Fons were completely absorbed in the work on their new software package, the RD45 project, which had the mandate to decide what new software should be adopted by the LHC experiments, had proposed using the commercial product “Objectivity”, and a lot of work was under way to develop applications that met the needs of HEP. According to René, there was a clear intention to obstruct the development and diffusion of ROOT. In spring 1996 the CERN director for computing, Lorenzo Foa, declared that the ROOT project was considered a private initiative of NA49, not supported by the CERN management, and that the official line of development was the one around Objectivity.

“I think that the LHC Computing Board didn’t have the right insight into the architecture of these software tools to be able to judge which solution was the best. Thus, they had to trust what they were told,” René comments. “It is always a problem when there is such a divide between the experts – and users – working on something and the people who are to take important decisions.”

Nevertheless, René and Fons continued developing ROOT and implementing new features, taking advantage of the lessons learnt with the previous software packages (in particular the requests and criticisms of the users). In addition, they followed closely the development of the official line with Objectivity, in order to know what people using it were looking for and what the problems or difficulties were. “The more we looked into Objectivity, the more we realized it could not meet the needs of our community,” René adds, “we knew that the system would fail and that eventually people would realize it. This gave us even more energy and motivation to work hard and improve our product.”

They had continuous support from the NA49 and ALICE collaborations, as well as from many people in ATLAS and CMS, who saw great potential in their software package. At the time, René was collaborating with many people in both experiments, including Fabiola Gianotti and Daniel Froidevaux, in particular on detector simulations. Besides, many users trusted them thanks to the relationship built over many years through the user support for PAW and GEANT.

Things started to change when interest in ROOT arose outside CERN. In 1998, the two experiments at Fermilab, CDF and D0, decided to discuss the future of their software approach, in view of the upcoming Run II of the Tevatron. Hence, they opened two calls for proposals for software solutions, one for data storage and one for data analysis and visualization. René submitted ROOT to both calls. During the CHEP conference in Chicago the proposals were discussed, and on the last day it was publicly announced that CDF and D0 would adopt ROOT. “I was not expecting it,” says René. “I remember that when the announcement was made, everybody turned their face and looked at me.” Soon afterwards, the experiments at RHIC at the Brookhaven National Laboratory took the same decision. The BaBar experiment at SLAC, after years spent attempting to use Objectivity, had realized that it was not as good a system as expected, and so moved to ROOT as well.

Gradually, it became clear that the HEP community was ‘naturally’ moving towards ROOT, so the CERN management had to accept this situation and, eventually, support it. But this happened only in 2002. With more manpower allocated to the project, ROOT continued to develop fast and the number of users increased dramatically. It also started to spread to other branches of science and into the financial world. “In 2010, we had on average 12000 downloads per month of the software package and the ROOT website had more visitors than the CERN one.”

The logo of the ROOT software package.

René retired in 2012, but his two most important brainchildren, ROOT and GEANT, keep growing thanks to the work of many young scientists. “I think that it is essential to have a continuous stimulus that pushes you to improve your products and come up with new solutions. For this, the contribution of young people is very important,” comments René. But, as he admits, what really made him and his colleagues work hard for so many years is the fact that the software packages they were developing always had competitors and, in many cases, were challenged and even obstructed. “When you are opposed, but you know you are right, you are condemned to succeed.”

Close attention to the users’ needs has also been very important, because it helped to shape the software and build a relationship of trust with people. “I have always said that you have to put the user support at the highest priority,” René explains. “If you reply to a request in 10 minutes you get 10 points, in one hour you get 2 points, and in one day you go already to -10 points. Answering questions and comments is fundamental, because if the users are satisfied with the support you give them, they are willing to trust what you propose next.”

Now that he is retired, René still follows software development at CERN, but only as an external observer. This does not mean that he has set aside his scientific interests; on the contrary, he is now dedicating most of his energy to a more theoretical project, developing a physics model. In his spare time, he likes gardening. He loves flowers, but he cannot avoid looking at them with a scientific eye: “A colleague of mine, who is a mathematician, and I developed a mathematical model about the way flowers are structured and grow.”

Brilliant minds are always at work.
