How many Ferraris did you see this morning on your way to work?
Unless you work for the Maranello-based company, zero is the most likely answer. Of course, some days you will stumble upon one: there are car buyers wealthy enough to indulge in indecently horse-powered models, even if those horses are not necessary for everyday journeys. The case of silicon horses powering a CPU is not that different: more power is expensive, and no one will pay the price if the extra power goes unused. This is what a series of interviews taught me about how FEA and CFD simulation software users handle ever-growing computing resources.
As announced in my previous post on FEA/CFD model sizes (see https://blog.ceetron.com/2015/05/28/technology-prediction-for-big-3d-typical-fea-and-cfd-model-sizes-to-be-used-in-the-cae-community-in-2020-and-2025/), my colleagues at Ceetron and I have been collecting information from key players in the industry in order to close the subject of model size evolution with some rock-solid facts and figures. I led a series of interviews for that purpose, as I am close to a few well-established actors in those fields.
My first visitor at Ceetron’s Sophia Antipolis office was Dr. Hugues Digonnet, Associate Professor at the HPC Institute of the Ecole Centrale de Nantes (France), a group headed by Professor Thierry Coupez. Most of the underlying numerical tools to which I refer in this post owe something to Thierry Coupez, but as far as model sizes go, and especially large model sizes, Hugues Digonnet is the right person from that group to talk to. I have known Hugues for almost 20 years now, and I feel quite confident he will not complain when I describe him as a maniac whose professional purpose is to solve ever-larger systems of equations: in fact, he lives by the motto “larger is better”.
In 2000, when he started solving large systems with Thierry Coupez at the Center for Material Forming (CEMEF) of the Ecole des Mines de Paris in Sophia Antipolis, the state of the art in material forming simulation was to handle entire transient non-linear processes on 2D meshes of ca. 10,000 nodes, i.e. 30,000 degrees of freedom (d.o.f.). Each of these simulations involves solving hundreds of linear systems, which was done at the time using direct resolution methods, notoriously greedy memory-wise. That year, new and lighter iterative methods under development were slowly making it possible to handle larger systems in 3D. According to Hugues, the first 3D linear systems solved this way could reach 200,000 nodes, i.e. 800,000 d.o.f., one order of magnitude larger than the state of the art.
From there, the evolution of linear system sizes proves Moore’s law right. Hugues reports reaching the million-node frontier in 2004, and the billion-node model was handled in 2013. One year later, that figure had grown by another order of magnitude, to reach Hugues’ personal record: solving a linear system on a mesh of 13.4 billion nodes. The computation took place on a cloud-based distributed platform of 100,000 cores, using up to 200 TB (200 terabytes) of total RAM. The resolution time was about 100 seconds. Hugues’ eyes light up when he talks about this computation. I have decided to call him Mr Giganode from now on, at least until he breaks the Teranode roof.
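As a rough check on the growth rate behind these milestones, the sketch below computes how long it took the node count to double between successive records. It assumes steady exponential growth between two milestones, which is a simplification; the function names are illustrative. Moore’s law is usually quoted as a doubling roughly every two years, and it describes transistor counts rather than model sizes, so the comparison is only indicative.

```python
import math

# Milestones quoted in the text: (year, number of mesh nodes).
MILESTONES = [
    (2000, 200_000),
    (2004, 1_000_000),
    (2013, 1_000_000_000),
    (2014, 13_400_000_000),
]


def doubling_time_years(start, end):
    """Years needed for the node count to double between two milestones,
    assuming steady exponential growth over that interval."""
    (year0, nodes0), (year1, nodes1) = start, end
    return (year1 - year0) * math.log(2) / math.log(nodes1 / nodes0)


def report(milestones):
    """Return (start_year, end_year, doubling_time) for each consecutive pair."""
    return [
        (start[0], end[0], doubling_time_years(start, end))
        for start, end in zip(milestones, milestones[1:])
    ]


if __name__ == "__main__":
    for year0, year1, years in report(MILESTONES):
        print(f"{year0}-{year1}: node count doubles every {years:.2f} years")
```

Between the million-node model of 2004 and the billion-node model of 2013, this gives a doubling time of a little under one year.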
The following plot sums up linear system sizes that Hugues has been able to deal with.
However, considering that several hundred linear systems need to be solved in order to simulate a complete transient material forming process and that your everyday computer does not have 100,000 cores, these figures need to be projected into a more industrial context. According to Hugues, a high-end full 3D transient simulation today would involve a 3-4 million node model on a 256-core machine.
While summing up his academic achievements, Hugues also brought some insight as to how the evolution of the size of solvable linear systems takes place. On the long road to billion-node models, some milestones were harder to achieve than others. For example, a natural limit appeared suddenly when he first had to tackle resolutions on more than 8000 cores. This type of hardware configuration made the algorithms strike the limits of 32-bit integers – not to mention severe file input/output issues that simply do not exist on less parallel machines.
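The interview does not say which quantity overflowed, so the following is only a sketch of how such a limit can show up. It assumes a global counter (here, the total number of d.o.f.) stored in a signed 32-bit integer, and 4 d.o.f. per node as in the 3D figures quoted earlier (800,000 d.o.f. for 200,000 nodes). The helper names are hypothetical.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold
INT32_MIN = -(2**31)


def fits_in_int32(value):
    """True if the value can be stored in a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


def wrap_to_int32(value):
    """Value a signed 32-bit counter would hold after silent wraparound.
    ctypes integer types truncate without any overflow check."""
    return ctypes.c_int32(value).value


def global_dof_count(nodes, dof_per_node=4):
    """Total d.o.f. of a mesh; 4 per node is an assumption taken from
    the 3D figures in the text."""
    return nodes * dof_per_node


if __name__ == "__main__":
    for nodes in (200_000, 1_000_000, 1_000_000_000, 13_400_000_000):
        dofs = global_dof_count(nodes)
        status = "fits" if fits_in_int32(dofs) else "overflows"
        print(f"{nodes:>14,} nodes -> {dofs:>14,} d.o.f.: {status} "
              f"(32-bit counter would read {wrap_to_int32(dofs):,})")
```

Under these assumptions a billion-node mesh already exceeds the 32-bit range, and the wrapped counter becomes negative, which is the kind of failure that typically forces a move to 64-bit indices.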
Another unexpected limit was unveiled when he first went for a billion-node model: the conjugate gradient method used began to show signs of bad convergence, a symptom that was finally linked to the insufficient precision of 64-bit doubles at such levels of accuracy.
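For readers unfamiliar with the method, here is a minimal conjugate gradient sketch in plain Python. It is not the solver Hugues uses: the test matrix (a 1D Laplacian stencil), the tolerance and all function names are assumptions chosen for illustration. It does show where floating-point precision enters: every iteration relies on dot products, i.e. sums over all unknowns, and the rounding error in such sums generally grows with the number of terms.

```python
import math


def laplacian_matvec(x):
    """Apply the 1D Laplacian stencil (2 on the diagonal, -1 beside it) to x."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        y[i] = 2.0 * x[i]
        if i > 0:
            y[i] -= x[i - 1]
        if i < n - 1:
            y[i] -= x[i + 1]
    return y


def dot(u, v):
    """Dot product; math.fsum limits rounding error in the accumulation."""
    return math.fsum(a * b for a, b in zip(u, v))


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A given as matvec.
    Returns the solution and the number of iterations performed."""
    x = [0.0] * len(b)
    b_norm = math.sqrt(dot(b, b))
    if b_norm == 0.0:
        return x, 0
    r = list(b)  # residual b - A x, with x = 0 initially
    p = list(r)  # first search direction
    rs_old = dot(r, r)
    for iteration in range(1, max_iter + 1):
        Ap = matvec(p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        if math.sqrt(rs_new) <= tol * b_norm:
            return x, iteration
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x, max_iter


if __name__ == "__main__":
    rhs = [1.0] * 50
    solution, iterations = conjugate_gradient(laplacian_matvec, rhs)
    print(f"converged in {iterations} iterations")
```

On a 50-unknown system none of this matters; the point made in the interview is that with billions of unknowns the accumulated rounding in these reductions can become visible as poor convergence.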
As a rule of thumb, the limiting factor for increasingly larger models is memory, except when the algorithm starts to fail for unexpected reasons such as those mentioned above.
From this interview, two key facts emerged. First, at research level the evolution of model sizes is exponential, and second, more powerful machines do not always automatically translate into larger model sizes: the process has its hiccups and depends on human brains to set them aside.
A few weeks after Hugues, I enjoyed a pleasant conversation with Dr. Richard Ducloux, who manages a Key Account Applications team at my former company Transvalor, a material forming simulation software vendor. Richard was my Technical Director for many years, and we have always shared common views on the evolution of the niche market Transvalor is in. This proved to be the case once again as we discussed how model sizes have evolved in industrial applications over the last 20 years, and how they may evolve in the coming years.
Richard joined Transvalor as a developer, working on the transition from 2D to 3D of the software package Forge. In the early 90s, linear systems were solved on single-processor machines using direct resolution methods. Within this context, model sizes were limited to a few thousand nodes. According to Richard, 3000-node models were considered large at the time, very much in line with the figures provided by Hugues Digonnet.
A major transition took place around 1997, with the shift to parallel machines and the implementation of iterative resolution methods such as those Hugues Digonnet worked on. The increase of model sizes was however limited at first to about 10000 nodes, and hardly applicable in everyday applications. The machines – SGI Origin, Dec Alpha 4100, IBM SP1 & SP2 – required to carry out the computation were indeed quite expensive. For comparison’s sake, you acquired one of these machines for an amount equivalent to 5 cars – Fiats that is, not Ferraris.
A few years later, however, the hardware not only continued to improve along Moore’s path, but also became more affordable. Accordingly, the models used by Transvalor’s clients started to grow, until they reached 20,000 to 30,000 nodes. This figure is still considered standard today for a typical 3D hot forging transient simulation. Richard explains this quite simply: these models prove to be accurate enough for the underlying physical modelling. End users choose to spend the added hardware power on reaching these satisfactory results more quickly. Furthermore, larger models come with a few caveats: they require more storage and more read/write operations. Of course, some users check that their choice of model size is adequate by performing “proof of concept” simulations on much larger models. Million-node simulations with Forge have been undertaken, but these sizes remain the exception.
Questioned about model sizes used in other material forming software packages developed at Transvalor, Richard told me that casting simulations required more accuracy due to the underlying physics. In this field, models of several hundred thousand nodes are considered to be standard. Again, once the necessary accuracy is achieved, end users choose to reduce computation times rather than to increase model sizes.
Finally, Richard and I talked about what paths material forming simulation may take in the coming years. There is an obvious trend to include more physics in the models, so as to answer questions about the microstructure of the material while it is formed and thus predict how the produced component will behave in use. This is where multi-scale models come into play: microscopic models feed macroscopic ones with a more accurate description of the underlying physics. Due to hardware limitations, these models are implemented in a loose, uncoupled way: the material forming simulation provides the thermo-mechanical history of the material, which is then used by micro-scale models to compute the microstructural evolution in selected areas of the component.
If hardware limits are lifted, the natural evolution of such approaches will be to couple both computations within the material forming simulation. At a given time step, the microstructure could be computed at every element of the mesh and its influence on the forming itself could be taken into account. In order to achieve this, each element of the mesh would be seen as a sub-mesh in itself, divided into sub-elements on which the microscopic model would be solved.
This is interesting for a few reasons. First, we see again that what drives the use of larger models is not only the ability to solve them (hardware power), but also – and mostly – the need for more physical accuracy. The second interesting point about this predicted evolution is that there will be a need for new, adapted tools to analyze these results. This is certainly a point to ponder for my colleagues and me at Ceetron: the future holds new data model challenges for 3D FEA visualization.
The third and final interview in this series took place two weeks ago. I was pleased to welcome an ex-colleague of Hugues Digonnet: Dr. Elie Hachem, head of the Computing and Fluids (CFL) research group at the Center for Material Forming (CEMEF) of the Ecole des Mines de Paris. Elie was accompanied by Dr. Youssef Mesri, the HPC expert of the CFL group.
The idea for this third interview was to gain some insight on the model size issue for a wider variety of applications and numerical techniques. Elie and Youssef are closer to CFD computations – both academically and industrially – and their knowledge of the field provided a stronger backing to what we already learnt from Hugues and Richard.
Elie’s award-winning Ph.D. dealt with the simulation of air flows inside industrial furnaces (“Stabilized finite element method for heat transfer and turbulent flows inside industrial furnaces”). In 2006, when he first started out, typical computations in academic settings involved about 10,000 nodes. In the final application of the numerical method he developed, the flow of air inside a furnace model is computed on a 57,000-node mesh. This seems extraordinarily small, but it is necessary to note that the meshing and re-meshing techniques involved are highly anisotropic, and specially designed to keep the number of nodes under control. This is a new fact: more accuracy can be gained through adaptive meshing methods.
Talking about a few projects that took place at CEMEF over the years, Elie gave me figures showing that model sizes are chosen to solve a given problem with a sufficient level of accuracy. The best example came from a project I actually worked on in 2008, which addressed the simulation of water-assisted injection molding. In this process, hollow plastic tubes are obtained by injecting water into molten polymer cylinders. Even though adaptive meshing schemes were used, the extreme conditions of this process (turbulent flow, complex polymer behavior and extreme thermal conditions) required an 8 million node mesh to achieve the necessary accuracy. The computation took a few weeks, if my memory is correct.
The final point Elie, Youssef and I discussed was an interactive web-based CFD simulation platform: Aeromines (http://www.aeromines.com/). The platform was set up a few years ago when both Elie and Hugues were part of a CEMEF group headed by Professor Thierry Coupez. Aeromines’ web interface is developed and maintained by Transvalor’s innovation department. The platform allows registered users to set up and perform CFD simulations in the cloud. This type of SaaS system is very interesting for us at Ceetron, as one of our innovative projects deals with 3D visualization on the web.
Looking at what is being computed on Aeromines, it appears that current everyday industrial model sizes are counted in millions of nodes – once again this figure is seen as a standard, as it was by Hugues and Richard.
Well, after those three very interesting interviews surrounded by nostalgia, I think I now have enough facts and figures to conclude on the model size subject. First, the size of a model is intimately linked to what the simulation is focusing on: the scales of the observed physics set a minimal size required to gain enough accuracy for predicted results to be truly predictive. The fuel that makes larger models emerge is thus the complexity of the simulated physical phenomena: higher complexity means smaller scales, which in turn call for larger models. When attempting to predict how model sizes will evolve in the coming years, it is necessary to evaluate what the markets will be interested in and whether they will be able to afford the hardware required by their interests. At this point in time, the keyword of the answer to that question is “multi”: multi-scale, multi-physics, multi-object… When such detail is required, we will certainly see model sizes grow rapidly until the physics is correctly predicted. On the other hand, for less advanced yet meaningful computations, larger models will remain as rare as Ferraris on the way to work.