Speaker
Description
Scientific data is commonly stored in formats such as CSV for small to medium datasets, HDF5 for complex hierarchical structures, and NetCDF for gridded and time-dependent variables. These formats provide robust mechanisms for structured storage, but preparing hierarchical or multidimensional data for machine learning (ML) models and GPU-accelerated computation often incurs substantial transformation overhead. Recent studies have demonstrated the promise of tailored data-transformation and encoding approaches in scientific domains. In particular, convolutional autoencoder-based compression of 3D particle-tracking data can significantly reduce both post-processing and storage requirements while preserving the essential spatial features needed by downstream workflows. This work discusses the potential of domain-specific encoding strategies for interfacing particle and field data across different simulation tools, with the goal of improving the scalability of scientific data and its integration into more sustainable workflows and future ML pipelines.
| In which format do you intend to submit your paper? | LaTeX |
|---|---|