CERN’s Large Hadron Collider (LHC) is at the forefront of physics research. The data output from its “Run 1” and “Run 2” phases has already been used to demonstrate the existence of a previously undetected subatomic particle and to extend our understanding of the universe and how it formed. Notably, in 2012 it confirmed the existence of the Higgs boson.
The scale of CERN is astounding, from the size of the Large Hadron Collider, a circular particle accelerator with a radius of 4.3 km, to the rate of particle collisions: up to 1 billion can take place every second inside the LHC experiments' detectors.
But it is the data that is most impressive, with the collisions generating 1 petabyte (PB) of data per second. Even after filtering to keep only the interesting events, the facility needs to store approximately 10 PB of new data for analysis each month.
This data is stored in the CERN Data Centre and, thanks to the Worldwide LHC Computing Grid (WLCG), is shared with a network of about 170 data centres for analysis. The current storage setup at CERN consists of HDD buffers with 3,200 JBODs carrying 100,000 hard disk drives (HDDs), providing a total of 350 PB.
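To put those storage figures in perspective, the short sketch below works out the average capacity per drive and the average number of drives per JBOD. It uses only the numbers quoted above. It assumes decimal units (1 PB = 1,000 TB) and treats the 350 PB total as spread evenly across all drives, which the article does not state, so the results are rough averages rather than a description of CERN's actual configuration.

```python
# Figures quoted in the article
TOTAL_CAPACITY_PB = 350
DRIVE_COUNT = 100_000
JBOD_COUNT = 3_200

# Assumption (not from the article): decimal units, 1 PB = 1,000 TB
TB_PER_PB = 1_000

# Average capacity per drive, assuming the total is spread evenly
avg_tb_per_drive = TOTAL_CAPACITY_PB * TB_PER_PB / DRIVE_COUNT

# Average number of drives housed in each JBOD enclosure
avg_drives_per_jbod = DRIVE_COUNT / JBOD_COUNT

print(f"Average capacity per drive: {avg_tb_per_drive:.1f} TB")   # 3.5 TB
print(f"Average drives per JBOD: {avg_drives_per_jbod:.2f}")      # 31.25
```

An average of about 3.5 TB per drive would be consistent with a fleet that mixes several drive generations of different capacities, though the article does not break the fleet down.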
LHC Runs are set to continue, and with each new “Run”, data storage requirements increase significantly. Following upgrades, CERN’s Run 3 is scheduled for 2021.
Toshiba Electronics Europe GmbH’s hard disk drives have been used by CERN to manage huge volumes of data since 2014, with three generations of Toshiba hard drive technology giving it the capacity increases it requires. But can this continue? As Eric Bonfillou, CERN’s Manager of the Facility Planning and Procurement Section at the IT Department, puts it: “The planned upgrades of the LHC machine will require scaling of compute and storage resources beyond what today’s technology can offer.”