
CMS DETECTOR: The LHC has several particle detectors on site, including the Compact Muon Solenoid (CMS) seen here, that capture data produced during the collisions.
Image: © CERN
A deluge of high-energy physics data is headed toward servers in Geneva, Switzerland, later this month. That's because the European Organization for Nuclear Research (CERN) now says it plans to restart its Large Hadron Collider (LHC) soon for a run that could last as long as two years at a collision energy of seven TeV (tera–electron volts, 3.5 TeV per beam). As CERN ramps up the world's most powerful particle accelerator to operate well beyond its previous best performance, the lab's computer systems must likewise be tuned so they can properly capture and analyze all of this new output.
Rather than adding a platoon of new computers, and possibly overextending the information technology infrastructure's power and cooling capacity, CERN is testing a virtualized server environment that it hopes to have in place by the end of the year. Server virtualization, which involves using software to segment a server machine's processing and storage capacity, has become a popular technique in recent years to make better use of underutilized machines and drive up efficiency in data centers.
CERN plans to divide its 4,000 servers (which run about 32,500 processors) into about 35,000 virtual servers by the end of the year and manage the subsequent workflow with the help of software from Platform Computing Corp., in Markham, Ontario. Over the next two or three years the lab could further slice up its servers into as many as 80,000 virtual servers.
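As a rough check on the scale these figures imply, the ratios can be worked out directly. This is only arithmetic on the numbers quoted above; the variable names are illustrative and nothing here reflects how CERN actually sizes its virtual servers.

```python
# Figures quoted in the article.
physical_servers = 4_000
processors = 32_500
virtual_servers_planned = 35_000
virtual_servers_possible = 80_000

# Average processors in each physical machine.
processors_per_server = processors / physical_servers

# Virtual servers carved out of each physical machine, now and later.
vms_per_server_planned = virtual_servers_planned / physical_servers
vms_per_server_possible = virtual_servers_possible / physical_servers

# Virtual servers per processor under the end-of-year plan.
vms_per_processor = virtual_servers_planned / processors

print(f"processors per physical server: {processors_per_server}")
print(f"virtual servers per physical server (planned): {vms_per_server_planned}")
print(f"virtual servers per physical server (possible): {vms_per_server_possible}")
print(f"virtual servers per processor (planned): {vms_per_processor:.2f}")
```

On these numbers the end-of-year plan works out to roughly one virtual server per processor, and the 80,000 figure to about 20 virtual servers per physical machine.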
Simply adding more server machines to CERN's data centers was not an option. "We are limited in the amount of power and cooling available," says Tony Cass, group leader of the Fabric Infrastructure and Operations group at CERN. "We want to wring every last drop out of the resources we have to do physics. Even 10 percent more capacity means that much more toward improving the physics."
In addition to the particle accelerator itself, LHC has several particle detectors—including ATLAS and CMS—on site that capture data produced during the collisions. "To produce physics results, scientists at CERN first have to turn the 1s and 0s produced by the detectors into meaningful pictures showing the tracks of the different quantum particles produced in the collisions," Cass says. "Then they need to analyze these images to understand what they mean."
Detector data capture and analysis requires tremendous computing power, but not always in equal amounts. Sometimes more crunching is needed to create the images; at other times it is needed for the analysis. To allocate and reallocate processor resources, Cass and his team have to determine how many servers would be required to perform a certain task, such as analysis. If they find that more computing resources are needed elsewhere, they have to stop the batch-processing work being done on the servers, reconfigure them, and then start them again. Virtual servers, however, can be allocated and reallocated dynamically using Platform Computing's software, without the need to interrupt processing work already in progress.
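The difference between a fixed split of servers and a pool that can be reallocated on the fly can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not a description of Platform Computing's software: the demand figures are hypothetical, work is measured in abstract "server-units" per period, unmet demand is simply dropped, and reallocation is treated as free.

```python
def static_throughput(demand, split):
    """Work served when each workload keeps a fixed number of servers."""
    return sum(
        min(need, have)
        for period in demand
        for need, have in zip(period, split)
    )


def dynamic_throughput(demand, total):
    """Work served when idle servers can be handed to the busier workload."""
    return sum(min(sum(period), total) for period in demand)


# Hypothetical demand per period: (image creation, analysis), in server-units.
demand = [(8, 2), (3, 7), (5, 5), (9, 1)]
total_servers = 10

static = static_throughput(demand, split=(5, 5))
dynamic = dynamic_throughput(demand, total_servers)

print(f"fixed 5/5 split serves {static} of {sum(map(sum, demand))} units")
print(f"shared pool serves {dynamic} of {sum(map(sum, demand))} units")
```

In this made-up example the same ten servers serve 31 units of work under a fixed split and all 40 when they can be shifted between workloads. The total capacity never changes; only the share of it that sits idle does.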
The LHC, which first went online in September 2008, is designed to accelerate bunches of protons to the highest energies ever generated by a machine, colliding them head-on 30 million times a second, with each collision generating thousands of particles at nearly light-speed. If successful, the LHC could help physicists answer questions about the subatomic composition of matter and energy in the universe.
Unfortunately, the LHC's first run lasted little more than a week before it had to be halted due to a problem with two superconducting magnets. The device was brought online again briefly in November to conduct a few experiments but has been down since December while CERN upgrades the equipment. More than 10,000 researchers in 85 countries plan to use the world's most powerful particle accelerator to test different predictions concerning high-energy physics.
When the LHC comes back online in a few weeks, it is expected to run continuously through mid- to late 2011, the longest phase of accelerator operation in CERN's history, which dates back to 1954. "The computers dedicated to LHC, which run around 120,000 computing jobs per day, will need to run at maximum efficiency to ensure the flood of data from the detectors can be turned quickly into physics results," Cass says. A "job" in this context is a request to process a certain amount of data: turning the information produced by the detector in a given time period into the images, say, or scanning over some large number of these images for analysis.
"The pressure on the computing teams will increase once real data is here and physicists are competing to produce papers for journals and conferences, especially if there is any hint of a discovery," Cass says. "It's a critical moment for the LHC detectors."










19 Comments
Incredible! The server virtualization technique described has been used with success to allow mainframe computers to emulate many servers. This is advantageous because mainframes have had superior data bandwidth, and a single mainframe is simpler to manage than a much more complex configuration of many servers.
Virtualized servers do not create additional processing capacity from the Aether. Partitioning an actual server into many servers does not increase the total effective computing capacity, unless it had been architecturally constrained by the operating system.
From this article I can only guess that CERN's IT staff has become prematurely distracted by quantum computing fantasies. Or, perhaps I just don't understand these things: I have read here that computers wouldn't exist if it weren't for NASA and that CERN invented the world wide web...
I also suspect that CERN will never discover the Higgs Boson, with or without virtual servers, unless they manufacture the supporting data themselves, but that's just my opinion.
jtdwyer...i agree...i think that they are going to be disappointed with the little if any gain they see(most likely a reduction).
I thought the main purpose for the on site computers was to crunch the raw data and then pass the results on to others for evaluation/modeling.
If the evaluation processes are not multithreaded then running one process per cpu with some left over for the os is the best bet.
If what they are trying to do is let others use the resources after the data has been collected then maybe some virtualization would be useful (give them each a sandbox).
good luck....
Read Nostradamus / He predicts something unusual happens at Lake Geneva. // I believe the Large Hadron Collider is located at Lake Geneva.
You are confusing capacity with efficiency. Instead of having servers dedicated to a single task, the virtualization allows them to re-allocate servers to other tasks. It's made pretty clear what their thinking is in the 6th paragraph.
@jtdwyer, "Virtualized servers do not create additional processing capacity from the Aether," I don't think anyone made that claim. Virtualizing allows the processing power of many machines to be treated as one machine then allocated as needed, instead of many separate machines sitting underutilized. Read the article.
"I also suspect that CERN will never discover the Higgs Boson, with or without virtual servers, unless they manufacture the supporting data themselves, but that's just my opinion." and an asinine one at that. First, what science do you have special access to that the actual particle physics community does not? Please share with us the science you did to determine that the LHC will be incapable of discovering the Higgs. Second, it is an unbelievably idiotic thing to say that they would have to, "manufacture the supporting data". I’ve read your crap here before. You’re one of those science hating trolls. Based on the ignorant things you’ve said in the past I am guessing your hatred of science is due to some sort of intellectual impairment that makes you feel inferior to anyone capable of having a thought. Coming here and making idiotic statements isn’t going to help with that.
sagian2005 – No, I’m not confused. You may not understand that the prioritized dispatching in a multitasking operating system performs the function of allocating processing resources among multiple competing tasks as they describe. If their operating system cannot adequately do so, the virtualization approach is a highly complex and inefficient method of circumventing an architectural shortcoming.
robert schmidt – Personal insults deserve no verbal response, and I am not required to justify my stated personal opinion to morons.
Son, I've been in the software engineering business for 25 years. I can write the core code for a multi-tasking operating system in my sleep. You are arguing apples and oranges.
Each of their virtualized servers performs a specific task. You seem to be missing that point. When one of the physical servers is not running at capacity, they can allocate a virtual server to perform a different specific task so that the physical server does run at capacity without the need for reconfiguration. Capacity is static. Efficiency is not.
Please re-read the article.
@jtdwyer, you don't want personal insults, don't act like an a$$. You don't want to justify your opinion, then why bore us all by posting it? No one would mind if you just said nothing.
sagian2005...I've been doing it 30+ years...only recently exposed to virtualization (3+ years..vmware)...in a near real time application. It failed miserably.
robert...virtualization does not allow you to harness multiple machines into one (not that i've seen)...all you can do is carve up a machine into one+ virtual machines..
CERN already runs a few hundred virtual servers on Intel-based x86 servers without issue. Believe me, they don't use VMWare. They are expanding on an idea that is already proven to maximize the available physical assets.
Non-science from jtdwyer and others like him are a waste of space.
CERN is utilizing an approach known as "cloud computing" in the industry. A mainframe is just an old business computer and none of them could keep up with even a single modern server running multiple processors.
The CERN facility is running over 30,000 such servers. To manage them efficiently, they are virtualized. As such, the programs that run on them are not aware of which physical server they are running on. That decision is made by Platform LSF, which manages the entire set of servers as if they all exist in a cloud.
The management software directs each resource request to a server that is either idle or nearest to becoming idle. This is merely the best way to distribute the computing load over a finite set of computing resources.
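The dispatch rule described in this comment (send each request to a server that is idle or closest to becoming idle) can be sketched in a few lines. This is a minimal illustration of that general idea only, not Platform LSF's actual scheduling logic; the function name `dispatch`, the job durations, and the assumption that all jobs are already queued at time zero are hypothetical.

```python
import heapq


def dispatch(job_durations, n_servers):
    """Assign each job to the server that is idle or will be idle soonest.

    Returns a list of (job_index, server_id, start_time) tuples.
    """
    # Min-heap of (time the server becomes free, server id); all idle at t=0.
    servers = [(0.0, sid) for sid in range(n_servers)]
    heapq.heapify(servers)

    schedule = []
    for job, duration in enumerate(job_durations):
        free_at, sid = heapq.heappop(servers)      # earliest-available server
        schedule.append((job, sid, free_at))       # job starts when it frees up
        heapq.heappush(servers, (free_at + duration, sid))
    return schedule


# Hypothetical job durations (arbitrary time units) on two servers.
schedule = dispatch([4, 2, 1, 3], n_servers=2)
for job, sid, start in schedule:
    print(f"job {job} -> server {sid} at t={start}")
```

A heap keeps the "next server to free up" lookup cheap even with tens of thousands of servers, which is why it is a common choice for this kind of greedy load balancing.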
sagian2005 – OK, I do have seniority over you, although I am now retired. You get to sleep? I primarily spent most of my career, among many other things, determining architectural and equipment requirements for one of the world’s largest and fastest growing computer centers, posting the world’s highest volume of database update transactions. This is somewhat analogous to the CERN workload described, although they are processing sequentially batched data.
I did not intend to insult you but simply could not determine your level of understanding from your brief comment. Your point is valid if one focuses on paragraph 6, but the rest of the article describes how CERN was implementing server virtualization in order to handle the tremendous increase in data capture and processing without installing an enormous amount of additional computing capacity.
Very briefly reviewing Platform Computing Corporation’s scant documentation that I found, I conclude that it is generally intended to benefit computer center level resource usage by simply dynamically allowing physical servers to be added and removed from pools of virtual servers assigned to specific workloads. This allows the IT group to vary the number of servers dedicated to processing image creation and image analysis throughout the day, presumably reducing the total number of servers required if they had been statically configured (assuming no overlap of peak processing requirements). Wow, they’re adding a folding rear seat to their Model T architecture. Never mind, I withdraw all of my comments about this glorified nonsense.
CERN has bigger problems afoot than simple virtualization!
The headline says: "CERN Gears Up Its Computers"
If CERN thinks that computers have gears--well, they need to go electronic. That Babbage equipment, for the most part, is obsolete now.
You can even buy a Zune that would be faster.
To jtdwyer, I agree wholeheartedly with robert schmidt.
You come across as a raving looney.
Why did you come here?
Irreversible processes are at the heart of the arrow of time. Events happen in some sequences, and not in others.
The orderly “Time” indicative as always, there was a beginning and logically there is an ending....
God is very real; in addition, we as human beings have been created in his image, with one exception, (not our physical bodies, but our minds only)
Remember Manly P. Hall: (philosopher) “If the infinite had not desired man to be wise, he would not have bestowed upon him the faculty of knowing”
Yes, with the encouragements of others, as always we have destroyed ourselves many times in the past, (so much so, that even the mighty genes had to restart at “A, B, C”)
Facts are that, some humans have the abilities to drag others to their destructions; it is not that they are evils; the facts are that they have a suicidal gene.
However, as we have seen lately and in many occasions, a genius protection appears and stops such events, (further on these subjects, is hard to categorize however finding a suicidal gene, before is too late may not be hard)
Tread carefully, for destruction takes only seconds; but regenerating a new and orderly system takes Billions of years.
elderlybloke - That's fine, but don't overly concern yourself about it - I'm not.
sagian2005 and jtdwyer and others, I've been at it software wise since 1964 (and I'm still in the market for contracts). As for CERN's software/hardware effort they obviously have one of the most complex realtime computing tasks ever conceived.
Batch processing is easy - if a task can't be done today, stick it in a queue and do it tomorrow. (I can already hear the howls of pain from mainframe jocks!)
The architects and implementers of this monumental task at CERN deserve medals and prizes as their efforts help us unravel the Universe's deepest secrets. All hail to them I say.
rotay - Actually, the real time data capture has been completed by the particle detectors. The workload discussed here is exactly two batch workloads. The first produces images from the detector data. The second analyzes the image data. The big problem here is managing the number of servers allocated to each workload as demand varies throughout the day. Sounds like big time multi-tasking resource management, except it's not. This is just big time PR spin.
Reply | Report Abuse | Link to thisHaving worked on system management and resource allocation, in addition to application performance optimization, etc., etc., for the world's first express package tracking system, storing tens of millions of tracking transmissions from around the world every few hours, this looks looks kinda like a 1964 insurance company billing app to me. I did look over some of my Dad's COBOL doc is 1964, but didn't get started until 1973.
Having been long involved with the world's biggest systems, I've often read PR tripe in the trade rags about the big time system I had all the inside info on. This was PR hype.