Fredrik Viken. CTO Ceetron
In 2010, Ceetron began to re-shape our 3D Components software to be the visualization platform for a cloud-based world. The effort had two linked goals:
1) To enable the visualization of any CAE model on any device. This is an obvious necessity, given the emerging business model of distributed collaboration, and the rapid increase in the performance of tablet and phone GPUs/CPUs.
2) To create, in the cloud, a user experience indistinguishable from that of the most responsive desktop solutions.
We considered each goal crucial. All of the potential benefits of cloudification would go unrealized if visualizations were device-dependent, or if cloud-based apps made users less productive.
Achieving these goals required a painstaking choice between two fundamentally different architectures: Image/Video Streaming, and 3D Object Streaming. How we made the choice says a lot about how we can all leverage the massive changes happening in 3D visualization today.
We first reviewed the literature on cloud-based 3D visualization, which described two basic approaches:
1) Image/Video Streaming. In this approach, the rendering is done on the cloud server and transmitted to the client, which shows the image. This approach can be implemented on any client, with no special hardware.
2) 3D Object Streaming. In this approach, the rendering of the model is done on the client. The server interprets the analysis files and transmits to the client a collection of textured triangles: a ready-to-render 3D display model. The GPU on the client device is then used to render the scene. The recent support for WebGL on virtually any device (from smart TVs to desktop computers) enables advanced 3D rendering directly in a web browser without installing any software or plugins.
Before continuing, let me note that our final choice was informed by three basic assumptions about future CAE computing environments:
- The simulator part of non-trivial FEA- or CFD-based simulators will be run on HPC-type compute clusters, implemented as a private or public cloud;
- The client app will be running inside a standard web browser;
- Any viable cloud-based approach must also support native/desktop client apps.
There are of course pros and cons to each architecture. I will touch on the ones we considered, highlighting the decisions we made along the way.
Image streaming is appealing largely because it lowers the requirements for the client. The client must show only images of 3D models, not real 3D model representations. The client is also totally independent of the model size (number of nodes/elements); its load is determined solely by the resolution of the images it receives. This yields three significant benefits, namely a client that is:
- Relatively easy to develop, and hence low-cost;
- Adaptable to virtually any device;
- Able to handle any conceivable model size.
But there are downsides.
Ceetron’s early project work (done with an R&D partner) revealed two major challenges inherent to this approach:
- Latency / network stability;
- Server resource usage.
Latency is a major challenge. The user experiences it as a lag between interacting with the application and seeing the 3D model refreshed on the screen. The cause is the round trip: a change in, for example, the camera specification must be transmitted to the server, which must then render the image and send it back. This round trip takes time, even on very fast connections. We considered workarounds, of course: reducing image size/quality, or even using a coarse “proxy” model as a kind of digital stand-in during interaction and transmitting the full image once the interaction stopped. But in the end, we concluded that these did not solve the fundamental problem of perceived lag.
Server resource usage is a second challenge here because, in general, the server must render a new image whenever the 3D view needs an update. This could be when navigating the model (pan/rotate/zoom) or when running an animation. Considering a typical interactive CAE client, in this scenario we would need one server for each client to render 30-60 images per second. That means dedicated GPU resources for each concurrent client, available for the entire session. This is not just incredibly costly; it does not scale.
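To put rough numbers on these two challenges, here is a minimal back-of-envelope sketch. The function names, the count of 100 concurrent clients and the millisecond timings are illustrative assumptions, not measurements from our project work; only the 30-60 images per second figure comes from the discussion above.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame at the given frame rate."""
    return 1000.0 / fps


def server_renders_per_second(concurrent_clients: int, fps: int) -> int:
    """With image streaming, every displayed frame is rendered on the server."""
    return concurrent_clients * fps


def interaction_lag_ms(network_round_trip_ms: float, server_render_ms: float,
                       encode_decode_ms: float) -> float:
    """Lower bound on the delay between a camera change and the updated image."""
    return network_round_trip_ms + server_render_ms + encode_decode_ms


if __name__ == "__main__":
    clients = 100  # hypothetical number of concurrent users
    for fps in (30, 60):
        print(f"{fps} fps: {frame_budget_ms(fps):.1f} ms per frame, "
              f"{server_renders_per_second(clients, fps)} server renders/s")
    # Hypothetical timings: 40 ms round trip, 10 ms render, 8 ms encode + decode
    print(f"lag >= {interaction_lag_ms(40, 10, 8):.0f} ms")
```

The point of the sketch is that the server load grows linearly with both the number of clients and the frame rate, and that the network round trip is added to every single interaction, however fast the server renders.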
One could argue that our friend Moore’s Law would eventually rescue us, that latency and resource usage would decline rapidly. This is logically possible, but our experience flatly contradicts this view. We have seen that Moore’s Law is generally leveraged to increase model size and spatial and temporal resolution, rather than to mitigate the effects of latency and lack of scalability on the user experience. Put another way, the effects of Moore’s Law in this architecture would be to increase latency and usage, not reduce them. (A previous blog on model size trends in the CAE community, written by my colleague Andres Rodriguez-Villa, covers this in some detail. https://blog.ceetron.com/2015/07/13/future-model-sizes-in-the-fea-and-cfd-community-predictions-from-interviews-with-industry-insiders/.)
3D Object Streaming
At first glance, 3D object streaming seems to present insuperable disadvantages. The client must be more complex, which makes this approach more difficult to implement, and more resource-intensive. Even more worrisome, client-side performance would be inversely correlated with model size: this in an age of gigantic (and growing) models. So why would one choose this path?
First of all, with a local, compacted 3D representation of the model residing on the client, the issue of latency disappears. The user could render in the full resolution of the local display, and achieve a first-class visualization experience. Developers could also create apps that have the same look-and-feel as desktop apps, and the same premium performance.
Second, server resource-use would decline over time. The initial load (loading the analysis model, setting up the visualization) would be the same. As soon as all the 3D objects are streamed to the client, however, the server can focus on other tasks while the user is inspecting the model. Pan, zoom, rotate, play and pause could all be done without any interaction with the server. Should the user start querying the model (e.g., picking), a small workload would be incurred, but the server could then rapidly re-focus on other clients.
This was all encouraging, but we still worried about performance, especially on smaller devices like smartphones and tablets. This is where the positive impact of Moore’s Law became evident. The graphics performance of user devices (from phones to desktop computers) has increased enormously over the last few years. The graph below illustrates the performance history of the iPad, from the first iPad released in April 2010 to the iPad Pro released in November 2015 (source: Apple Special Event, Sept 9, 2015).
In fact, GPU performance has increased 360 (!) times over this time period, and the iPad Pro and the iPhone 6s today outperform many desktop computers when it comes to rendering speed. On a general note, almost all user devices, from phones to desktop computers, are capable of showing advanced graphics. So why not utilize that power?
A major challenge with this approach is dependency on model size. If we assume a model (including results) in the 10GB range, sending it to the client is not practicable. Also, if one sent every surface of the model to the client, one would only be able to visualize small models. It would therefore be important to reduce the size of the model, due to both transfer times and client resources. We have found that clients are generally capable of showing millions of triangles per second, but many devices (especially Apple’s) have limited memory (both main memory and GPU memory). So in this approach one would need to optimize the system for client-side memory usage and transfer volume.
We also observed that one solution to the model size problem for 3D object streaming could be to use Level of Detail (LOD) to send objects of appropriate resolution to the client. The appropriate triangle count could be determined by transfer speed and client capabilities. Levels of detail could be both generated interactively and cached by the cloud software. One could also combine LOD with progressive streaming, in which the object with the largest footprint on the screen would be sent first, followed by smaller-footprint objects, in resolutions progressing from coarse to fine. Finding good LOD representations of arbitrary CAE models that consider both the geometry and the result distribution is far from trivial. This is ongoing work and will be the topic of a separate blog article.
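The ordering idea can be sketched in a few lines. This is only one possible reading of the scheme described above, not the actual Ceetron implementation: it assumes each object already has a list of precomputed LODs, that screen footprint is available as a single number, and that a finer LOD replaces the coarser one on the client. The names DisplayObject, triangle_budget and streaming_order are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Tuple


@dataclass
class DisplayObject:
    name: str
    screen_footprint: float         # assumed metric: fraction of viewport covered
    lod_triangle_counts: List[int]  # available LODs, ordered coarse to fine


def triangle_budget(bandwidth_bytes_per_s: float, seconds: float,
                    bytes_per_triangle: int, client_max_triangles: int) -> int:
    """Triangles we can afford, limited by transfer volume and by the client."""
    by_transfer = int(bandwidth_bytes_per_s * seconds / bytes_per_triangle)
    return min(by_transfer, client_max_triangles)


def streaming_order(objects: List[DisplayObject],
                    budget: int) -> Iterator[Tuple[str, int]]:
    """Yield (object name, LOD index) in the order they would be sent.

    Each pass sends one LOD level, coarse first; within a pass the object
    with the largest screen footprint goes first. A refinement is skipped
    if it would push the client past its triangle budget.
    """
    ordered = sorted(objects, key=lambda o: o.screen_footprint, reverse=True)
    resident: Dict[str, int] = {}  # triangles currently held by the client
    max_levels = max((len(o.lod_triangle_counts) for o in ordered), default=0)
    for level in range(max_levels):
        for obj in ordered:
            if level >= len(obj.lod_triangle_counts):
                continue
            new_total = (sum(resident.values()) - resident.get(obj.name, 0)
                         + obj.lod_triangle_counts[level])
            if new_total <= budget:
                resident[obj.name] = obj.lod_triangle_counts[level]
                yield obj.name, level
```

With a large "hull" object and a small "bolt", both appear in coarse form first, and the hull is refined before the bolt. If the budget runs out, it is the small object that stays coarse.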
The final challenge with 3D Object Streaming is the complexity of the client; it must be capable of 3D rendering, not just presenting a stream of images. One solution is to divide the workload so the client can be as ‘stupid’ as possible. Broadly, the cloud would have all the CAE knowledge and functionality, while the client would be a very powerful ‘triangle pusher’ utilizing the very capable local GPU. Using the 10GB example above, the cloud would have access to the full (10+ GB) analysis data, and the hardware resources to read and process it quickly. It would not be possible to download or access the full FEA/CFD model on the client, but it would not be necessary. The cloud would extract the triangles needed to render a given scene and progressively stream them to the client. These triangles might be the surface of the model, a cutting plane, an isovolume, etc. The client would not know the difference; it would simply render the groups of triangles it receives. Returning to the earlier picking example, whenever the user queries an element or node result value, a picking request would be sent to the cloud and executed there, and the relevant information returned. Adding a cutting plane would be no problem either: the user would specify it in the web client, the specification would be sent to the cloud, where the cutting plane would be computed, and the new triangles would be streamed back to the client.
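The division of labor described here can be illustrated with a small sketch. All message and class names (PickRequest, CuttingPlaneRequest, TriangleBatch, CloudSession, ThinClient) are hypothetical and do not reflect the Ceetron 3D Components API. The cutting-plane computation is a placeholder that returns a single triangle and ignores the plane normal.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union

Vertex = Tuple[float, float, float]


@dataclass
class PickRequest:
    element_id: int


@dataclass
class CuttingPlaneRequest:
    point: Vertex
    normal: Vertex


@dataclass
class PickResponse:
    element_id: int
    result_value: float


@dataclass
class TriangleBatch:
    label: str              # used by the client only as a scene key
    vertices: List[Vertex]
    indices: List[int]      # three indices per triangle


class CloudSession:
    """Server side: owns the full analysis data and all CAE logic."""

    def __init__(self, element_results: Dict[int, float]):
        self._element_results = element_results

    def handle(self, request: Union[PickRequest, CuttingPlaneRequest]):
        if isinstance(request, PickRequest):
            value = self._element_results[request.element_id]
            return PickResponse(request.element_id, value)
        if isinstance(request, CuttingPlaneRequest):
            return self._compute_cutting_plane(request)
        raise TypeError(f"unsupported request: {request!r}")

    def _compute_cutting_plane(self, request: CuttingPlaneRequest) -> TriangleBatch:
        # Placeholder geometry: a real server would intersect the plane with
        # the full mesh. One triangle anchored at the plane point stands in.
        x, y, z = request.point
        vertices = [(x, y, z), (x + 1.0, y, z), (x, y + 1.0, z)]
        return TriangleBatch("cutting-plane", vertices, [0, 1, 2])


@dataclass
class ThinClient:
    """Client side: stores triangle batches and would hand them to the GPU."""
    scene: Dict[str, TriangleBatch] = field(default_factory=dict)

    def receive(self, batch: TriangleBatch) -> None:
        # No CAE knowledge needed: a surface, a cutting plane and an
        # isovolume all arrive as the same kind of batch.
        self.scene[batch.label] = batch

    def triangle_count(self) -> int:
        return sum(len(b.indices) // 3 for b in self.scene.values())
```

In this sketch the client never touches element results or mesh topology. It sends a small request, and what comes back is either a value to display or more triangles to draw.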
Of course, the client would have to be able to run in a Web browser environment on any device. This would preclude the re-use of any legacy rendering, UI and application code on the client. The code must be downloaded to the client when the app is opened the first time. The code-base would have to be very compact: a lean-and-mean client is the only one that would work.
In 2014, after several years of research and collaboration, Ceetron decided that the best approach for the coming cloud-based world was Progressive Streaming of 3D Objects. Further benchmarking has corroborated that decision, and I personally hope to be able to make some of these benchmarking results available to the public. (It should also be noted that what could be interpreted, a posteriori, as a linear and rational decision was in fact a recursive and iterative process involving design, implementation, evaluation / benchmarking / issue detection, (re)design, etc.)
One may also wonder why we chose the 3D Object Streaming approach despite the acknowledged client-side and server-side complexity it entails. We believe that we have been able to contain that complexity within the 3D visualization platform, beneath a set of well-designed abstractions, so that the developer of an FEA- or CFD-based simulator never sees it. In a cloud-based framework like Ceetron 3D Components, all of this complexity is hidden from the CAE software developer.
One may finally wonder about the implications of the significant variance in how CAE models are structured. Some FEA/CFD models might be just one giant part with several million elements, whereas more CAD-oriented designs might contain 100k+ small parts. To get progressive 3D object streaming to work optimally in all cases, one needs to establish total separation between the 3D display models that are streamed to the client and the CAE model itself. One needs to gather all parts with the same visual appearance, and then split each unique set of attributes (color, opacity, draw style, texturing attributes, etc.) into suitable pieces (usually <64k vertices, so one can keep unsigned short indices and get reasonable VBO (vertex buffer object) sizes). Our experience is that this decoupling of the CAE model itself and the ‘display model’ on the client is a prerequisite for good performance.
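The gather-then-split step can be sketched as follows. This is a simplified illustration under stated assumptions, not our production code: the Part and Chunk types are hypothetical, appearance is reduced to a single hashable key, and the split is a simple greedy pass over the triangles. The 65,536 limit is the number of distinct values an unsigned 16-bit index can address.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Hashable, List, Tuple

MAX_VERTICES = 65536  # indices 0..65535 fit in an unsigned 16-bit integer

Vertex = Tuple[float, float, float]
Triangle = Tuple[int, int, int]


@dataclass
class Part:
    appearance: Hashable       # e.g. (color, opacity, draw style)
    vertices: List[Vertex]
    triangles: List[Triangle]  # indices into this part's own vertices


@dataclass
class Chunk:
    appearance: Hashable
    vertices: List[Vertex] = field(default_factory=list)
    indices: List[int] = field(default_factory=list)  # three per triangle


def build_display_chunks(parts: List[Part],
                         max_vertices: int = MAX_VERTICES) -> List[Chunk]:
    """Merge parts that share an appearance, then split by vertex count."""
    by_appearance: Dict[Hashable, List[Part]] = defaultdict(list)
    for part in parts:
        by_appearance[part.appearance].append(part)

    chunks: List[Chunk] = []
    for appearance, group in by_appearance.items():
        chunk = Chunk(appearance)
        remap: Dict[Tuple[int, int], int] = {}  # (part no, vertex) -> chunk index
        for part_no, part in enumerate(group):
            for tri in part.triangles:
                new = {(part_no, v) for v in tri if (part_no, v) not in remap}
                if len(chunk.vertices) + len(new) > max_vertices:
                    # This triangle would not fit: close the chunk, start anew.
                    chunks.append(chunk)
                    chunk, remap = Chunk(appearance), {}
                for v in tri:
                    key = (part_no, v)
                    if key not in remap:
                        remap[key] = len(chunk.vertices)
                        chunk.vertices.append(part.vertices[v])
                    chunk.indices.append(remap[key])
        if chunk.indices:
            chunks.append(chunk)
    return chunks
```

Whether the input is one giant part or 100k+ small ones, the output has the same shape: a modest number of similarly sized chunks, one appearance each, which is what the client renders.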
To sum up, six years after our first R&D activities on cloud-based visualization for the CAE community (and with a number of top-tier CAE tool providers, including our customers DNV GL, Autodesk, and Transvalor), we have made good progress towards cloudifying CAE apps and enabling the visualization of any CAE model on any device. Choosing the Progressive Streaming of 3D objects paradigm has been instrumental for this progress. You can follow our further progress on https://cloud.ceetron.com/ .