Volume rendering of large scientific data sets

Advances in high performance computing platforms have allowed for an ever-increasing number of numerical simulations that investigate a multitude of problems in physics. Direct numerical simulation (DNS) of turbulent flows requires resolutions that are fine enough to capture turbulent structures that dissipate the initial kinetic energy. Therefore, the study and physical interpretation of the effects of fine scale persistent structure depends on having techniques to appropriately analyze and visualize large data sets.

Visualization of the data sets can be used to provide insights into the dynamics of physical processes. In the case of turbulent flows, the visualization of vorticity can be insightful in determining the extent of mixing. Volume rendering of vorticity is useful for the following reasons:

  • Vortical structures of different sizes can be visualized in a way that allows context between scales to be preserved.
  • Color maps provide information about low and high values of scalars simultaneously.
  • The interplay of color and transparency provides a visual representation of the underlying dynamics in a beautiful and captivating manner.
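As a concrete illustration of the scalar being rendered: vorticity is the curl of the velocity field, ω = ∇ × u. A minimal NumPy sketch is shown below; the grid, spacing, and rotating velocity field are illustrative assumptions, not taken from the simulations discussed here.

```python
import numpy as np

# Illustrative grid (assumed): uniform spacing dx in all three directions.
n = 16
dx = 1.0 / (n - 1)
coords = np.linspace(0.0, 1.0, n)
x, y, z = np.meshgrid(coords, coords, coords, indexing="ij")

# Toy velocity field (assumed): solid-body rotation about the z axis.
u, v, w = -y, x, np.zeros_like(x)

# With indexing="ij", np.gradient's axis 0 -> x, axis 1 -> y, axis 2 -> z.
du_dx, du_dy, du_dz = np.gradient(u, dx)
dv_dx, dv_dy, dv_dz = np.gradient(v, dx)
dw_dx, dw_dy, dw_dz = np.gradient(w, dx)

# Vorticity omega = curl(u); its magnitude is the scalar to volume render.
omega_x = dw_dy - dv_dz
omega_y = du_dz - dw_dx
omega_z = dv_dx - du_dy
omega_mag = np.sqrt(omega_x**2 + omega_y**2 + omega_z**2)
```

For solid-body rotation the vorticity magnitude is constant (here 2), which gives a quick sanity check before applying the same computation to real checkpoint data.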

Managing data for visualization and scientific discovery

At the Computational Reacting Flows Laboratory at KAUST, a variety of software tools are utilized to study the underlying physics of combustion. The primary tool for investigating detailed chemistry with DNS is S3D, developed at Sandia National Laboratories in Livermore, CA, with a multi-institutional group of partners across the United States.

Due to the immense resolution requirements imposed by three-dimensional turbulence, it is critical to develop a workflow that takes advantage of libraries, software, and hardware optimized for parallelism. S3D uses the HDF5 libraries to save checkpointed raw data for restarting, with the data written to GPFS file systems for fast input/output (I/O). Separate sets of files with processed data are saved in another directory, taking advantage of the compute nodes at runtime. This allows for faster turnaround in analysis, since the systems used afterward for data processing may have lower bandwidth or fewer available compute nodes. Finally, visualization of the processed data is performed using Paraview, a parallel-capable visualization software package built on VTK and Python.
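To make the HDF5 heavy data readable by Paraview, a lightweight XDMF file describes the grid and points at the arrays stored inside the HDF5 checkpoint. A minimal sketch of such a file is below; the file names, dimensions, grid spacing, and array paths are hypothetical, not those of the actual S3D output.

```xml
<?xml version="1.0"?>
<Xdmf Version="3.0">
  <Domain>
    <Grid Name="flame" GridType="Uniform">
      <!-- Hypothetical 128^3 uniform grid -->
      <Topology TopologyType="3DCoRectMesh" Dimensions="128 128 128"/>
      <Geometry GeometryType="ORIGIN_DXDYDZ">
        <DataItem Dimensions="3" Format="XML">0.0 0.0 0.0</DataItem>
        <DataItem Dimensions="3" Format="XML">1e-5 1e-5 1e-5</DataItem>
      </Geometry>
      <!-- Heavy data lives in the HDF5 file; path after the colon is the dataset -->
      <Attribute Name="vorticity" AttributeType="Scalar" Center="Node">
        <DataItem Dimensions="128 128 128" Format="HDF">
          checkpoint.h5:/fields/vorticity
        </DataItem>
      </Attribute>
    </Grid>
  </Domain>
</Xdmf>
```

Because the XDMF file holds only this light metadata, the same template can be reused across checkpoints by swapping the HDF5 file name.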

Paraview (pvpython) tutorial

Recently, I had the honor of briefly presenting at the KAUST Visualization Laboratory's HPC Visualization with Paraview workshop, a two-day course that provided participants with an overview and case studies on visualizing scientific data sets from a variety of disciplines. I provided a reduced data set from one of my preliminary flame numerical simulations and placed all of this in a GitHub repository.

You can access the repository here, which includes a sample data set, an accompanying XDMF template file, and the presentation itself. The tutorial serves as an introduction to the Paraview Trace function, which records your interactive actions as a set of Python instructions that can be copied to a file. This allows you to replay those actions in a way that can be replicated over a large set of data files.
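The pattern for replaying a trace over many files can be sketched in plain Python. The file names below are hypothetical, and in an actual pvpython script the body of process_file would be the pasted output of Trace (reader creation, filters, SaveScreenshot, and so on) rather than the stand-in shown here.

```python
import os

def output_name(data_file, suffix=".png"):
    # Derive an image name from a data file,
    # e.g. "flame_0001.xmf" -> "flame_0001.png".
    base, _ = os.path.splitext(os.path.basename(data_file))
    return base + suffix

def process_file(data_file):
    # In pvpython, the traced instructions would go here, for example
    # (hypothetical calls, not from the actual trace):
    #   reader = XDMFReader(FileNames=[data_file])
    #   display = Show(reader)
    #   SaveScreenshot(output_name(data_file))
    return output_name(data_file)

# Hypothetical checkpoint list; in practice this could come from glob.glob().
checkpoints = ["flame_0001.xmf", "flame_0002.xmf", "flame_0003.xmf"]
renders = [process_file(f) for f in checkpoints]
```

The key point is that the traced instructions are parameterized by the file name, so the same sequence of operations runs unchanged over every checkpoint in the batch.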

Feel free to take a look at the presentation and practice with the included data set.

Acknowledgements

I would like to thank Dr. Madhu Srinivasan, who serves as Visualization and CG Research Scientist at KAUST's VisLab. Check out his blog for some details on KAUST's Paraview server implementation for visualization of data distributed among multiple cores.