This article conveys my view on CERN's ROOT as of summer 2013.

That ROOT kind of sucks is an opinion shared almost unanimously among those who work with it. Yet as the basic data analysis framework at CERN, it is perhaps the most widely used scientific software package in Europe. The reason is that no other comparable toolkit is so highly optimized and able to handle such large amounts of data. But the interface it provides to the user and programmer is a tale of bad choices. I have been working with it a lot over the past two years, and while I still consider myself a beginner, I have developed an opinion on why it is so terrible. Let me explain:

ROOT was designed as the all-covering standard framework at CERN and its affiliated institutes. On the one hand, it was supposed to work as a command line tool, allowing fast peeks into data files, easy creation of histograms and similar high-level operations. On the other hand, it was supposed to be a library that could be integrated into big software projects that need to handle enormous amounts of data in as little time as possible.

Due to the latter requirement, ROOT was written in C++. Because C++ is a compiled language with explicit heap access, it is practically unbeaten in performance, and since it is so popular, integration with other software is simple. Of course, there was still the requirement of a command line, which the developers decided to meet by creating CINT, a 400 000 line tool that can interpret C++ line by line.

But wrapping a C++ framework in an interpreter does not make it a good scripting tool. Command line code must be easy to write, short and concise.

C++ is a language made for compiled code, and it is designed to be easy to debug at the price of redundancy and lengthy code. This is achieved by object-oriented encapsulation and strong typing. Encapsulation means that any variable exists only inside a certain scope. Ideally, there are no global variables or universally accessible objects. And if a class method needs information that it cannot obtain from the class attributes, that information needs to be passed in through the function arguments. Encapsulation is essential because it limits the amount of context that the programmer needs to consider at a time. Strong typing requires the programmer to specify the type when a variable is defined. This helps catch errors at compile time and also allows the compiler to allocate the exact amount of memory needed.

All these features make C++ the fast compiled language it is, and they save a lot of time when finding bugs. However, they make C++ a terrible choice for the command line. The ROOT developers realized that. And then they did something they shouldn't have done: they wrote ROOT in a way that circumvents these C++ design principles. They introduced global scope variables and made the class methods depend on those variables. And they avoided type safety by excessively using strings and void pointers.
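To make the type safety point concrete, here is a minimal sketch in plain C++. It is a toy model, not ROOT code: `parseOptions`, `StringStyle`, `TypedStyle` and the option names `LOG` and `ERR` are all made up for illustration. It only shows the general pattern of how an option string defers errors to run time, while a typed option would be rejected by the compiler.

```cpp
#include <string>

// Toy model, NOT ROOT code. All names here are hypothetical.

// String style: the options are parsed at run time.
struct StringStyle {
    bool logScale = false;
    bool showErrors = false;
};

// Unknown or misspelled tokens are silently ignored,
// so a typo never produces a compile error.
StringStyle parseOptions(const std::string& opts) {
    StringStyle s;
    if (opts.find("LOG") != std::string::npos) s.logScale = true;
    if (opts.find("ERR") != std::string::npos) s.showErrors = true;
    return s;
}

// Typed style: the same options expressed as types.
// Writing Scale::Lgo instead of Scale::Log does not compile.
enum class Scale { Linear, Log };

struct TypedStyle {
    Scale scale = Scale::Linear;
    bool showErrors = false;
};
```

In this sketch, `parseOptions("LGO ERR")` compiles fine and quietly returns a style without log scale. The typo only shows up when someone looks at the plot. With `TypedStyle`, the equivalent mistake is caught before the program ever runs.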

For example, the Draw() method of a TTree will draw a histogram of the tree object onto the canvas that was instantiated last. If there is no canvas, it will create one. This makes drawing a histogram from the command line much simpler. But it leads to bitten fingernails and torn-out hair once the code exceeds a few dozen lines, because it becomes fucking impossible to tell which canvas is the currently active one. Another design mistake in ROOT is the excessive use of strings. For example, many methods expect an options string that follows a certain format. The string is then evaluated inside the method. This makes function calls more compact, but it is also the source of cryptic run-time errors.
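The canvas problem can be sketched without ROOT at all. The following is a toy model under my own assumptions, not ROOT's actual implementation: `Canvas`, `makeCanvas`, `drawImplicit` and `drawExplicit` are hypothetical names. It imitates the behaviour described above (draw onto the canvas created last, create one if none exists) and contrasts it with passing the target explicitly.

```cpp
#include <memory>
#include <string>
#include <vector>

// Toy model, NOT ROOT code. All names here are hypothetical.
struct Canvas {
    std::string name;
    std::vector<std::string> drawn;  // what has been drawn on this canvas
};

// Hidden global state: every canvas ever made, plus the "current" one.
std::vector<std::unique_ptr<Canvas>> g_canvases;
Canvas* g_current = nullptr;

// Creating a canvas silently makes it the current one.
Canvas* makeCanvas(const std::string& name) {
    g_canvases.push_back(std::make_unique<Canvas>());
    g_canvases.back()->name = name;
    g_current = g_canvases.back().get();
    return g_current;
}

// Implicit style: the target depends on whatever ran before this call.
void drawImplicit(const std::string& what) {
    if (!g_current) makeCanvas("auto");  // no canvas yet: create one
    g_current->drawn.push_back(what);
}

// Explicit style: the target is visible at the call site.
void drawExplicit(Canvas& target, const std::string& what) {
    target.drawn.push_back(what);
}
```

If some code creates canvas `a`, then canvas `b`, and then calls `drawImplicit("energy")`, the plot lands on `b`, and nothing at the call site says so. In a three-line session that is convenient. In a few hundred lines spread over several functions, you have to trace the whole program to find out where a plot went. `drawExplicit(*a, "energy")` is longer to type, but it can be read in isolation.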

The ROOT developers tried to create something that works great both as a C++ library and as a command line tool. The result is a failure in both domains.

PS: What can be done?

First of all, scrap the .x feature. If C++ code is long enough to be written down, it must be compiled. Secondly, deprecate the direct use of CINT or Cling. Instead, make rootpy the tool for swift access to data and light scripting. As a last step, change the ROOT framework interface, if you dare.
