### Refine

#### H-BRS Bibliography

- yes (5)

#### Departments, institutes and facilities

#### Document Type

- Conference Object (3)
- Article (1)
- Preprint (1)

#### Language

- English (5)

#### Keywords

- Codes (1)
- Data structures (1)
- Geometry (1)
- Hardware (1)
- Ray tracing (1)
- Three-dimensional displays (1)
- Topology (1)

We describe a systematic approach for rendering time-varying simulation data produced by exa-scale simulations, using GPU workstations. The data sets we focus on use adaptive mesh refinement (AMR) to overcome memory bandwidth limitations by representing interesting regions in space with high detail. Particularly, our focus is on data sets where the AMR hierarchy is fixed and does not change over time. Our study is motivated by the NASA Exajet, a large computational fluid dynamics simulation of a civilian cargo aircraft that consists of 423 simulation time steps, each storing 2.5 GB of data per scalar field, amounting to a total of 4 TB. We present strategies for rendering this time series data set with smooth animation and at interactive rates using current generation GPUs. We start with an unoptimized baseline and step by step extend that to support fast streaming updates. Our approach demonstrates how to push current visualization workstations and modern visualization APIs to their limits to achieve interactive visualization of exa-scale time series data sets.

Modern GPUs come with dedicated hardware to perform ray/triangle intersections and bounding volume hierarchy (BVH) traversal. While the primary use case for this hardware is photorealistic 3D computer graphics, with careful algorithm design scientists can also use this special-purpose hardware to accelerate general-purpose computations such as point containment queries. This article explains the principles behind these techniques and their application to vector field visualization of large simulation data using particle tracing.

Graph drawing with spring embedders employs a V x V computation phase over the graph's vertex set to compute repulsive forces. Here, the efficacy of forces diminishes with distance: a vertex can effectively only influence other vertices in a certain radius around its position. Therefore, the algorithm lends itself to an implementation using search data structures to reduce the runtime complexity. NVIDIA RT cores implement hierarchical tree traversal in hardware. We show how to map the problem of finding graph layouts with force-directed methods to a ray tracing problem that can subsequently be implemented with dedicated ray tracing hardware. With that, we observe speedups of 4x to 13x over a CUDA software implementation.