7–12 May 2023
Venice, Italy
Europe/Zurich timezone

Upgrades to logging and ML analytics architecture at APS

THPL008
11 May 2023, 16:30
2h
Sala Laguna

Poster Presentation
MC6.T22: Reliability, Operability
Thursday Poster Session

Speakers

Nikita Kuklev (Argonne National Laboratory)
Ihar Lobach (Argonne National Laboratory)

Description

Several machine learning (ML) projects on anomaly detection and optimization were recently started at the Advanced Photon Source (APS). To improve training data quality and to accommodate the upcoming APS Upgrade changes, a large increase in the number and size of log files is expected. Recent studies found performance bottlenecks in the current log analysis architecture, especially for large ML analytics tasks. We explored several approaches to improve both data density and throughput. First, we replaced the lzma compression algorithm with modern alternatives such as zstd and lz4, scanning presets to find an optimal one that increased decompression throughput by 10x at the cost of a 20% increase in file size. Second, we tried several lossy compression schemes that take advantage of limited device resolution and ML quantization, yielding further size reductions with reasonable fidelity losses. Finally, we tested several analytics and time-series databases and found them faster for both linear and random-access reads while maintaining good compression ratios. They also enabled offloading analytics computations to server nodes, reducing network load. Our results indicate that, with some effort, it is possible to significantly increase the amount of logged data while improving ML analytics performance.
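
The idea behind lossy compression that exploits limited device resolution can be illustrated with a minimal sketch. This is not the scheme used in the work described above: the signal is synthetic, the 0.05 resolution step is a hypothetical value, and the standard-library lzma codec stands in for whichever compressor is actually chosen. The sketch rounds each sample to the nearest multiple of the resolution step, stores the resulting integer codes, and compares compressed sizes.

```python
import lzma
import math
import random
import struct


def make_signal(n, seed=0):
    """Synthetic slowly varying readback with small noise (not APS data)."""
    rng = random.Random(seed)
    return [100.0 + 5.0 * math.sin(i / 200.0) + rng.gauss(0.0, 0.01)
            for i in range(n)]


def quantize(values, step):
    """Round each value to the nearest multiple of `step`; return integer codes."""
    return [round(v / step) for v in values]


def dequantize(codes, step):
    """Map integer codes back to approximate physical values."""
    return [c * step for c in codes]


def compressed_size(payload):
    """Size in bytes of the lzma-compressed payload."""
    return len(lzma.compress(payload, preset=6))


values = make_signal(20000)
step = 0.05  # hypothetical device resolution, chosen for illustration only

# Lossless baseline: raw little-endian float64 samples.
raw_bytes = struct.pack(f"<{len(values)}d", *values)

# Lossy variant: integer codes stored as little-endian int32.
codes = quantize(values, step)
code_bytes = struct.pack(f"<{len(codes)}i", *codes)

raw_size = compressed_size(raw_bytes)
quant_size = compressed_size(code_bytes)
max_error = max(abs(v - r) for v, r in zip(values, dequantize(codes, step)))

print(f"lossless float64: {raw_size} bytes")
print(f"quantized int32:  {quant_size} bytes")
print(f"max abs error:    {max_error:.4f} (bound {step / 2})")
```

The reconstruction error of this rounding scheme is bounded by half the step, so choosing the step at or below the real device resolution would keep the loss within what the hardware can resolve anyway. Actual size savings depend heavily on the data and the codec.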

Funding Agency

The work is supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, under Contract No. DE-AC02-06CH11357.

Primary author

Nikita Kuklev (Argonne National Laboratory)

Co-authors

Ihar Lobach (Argonne National Laboratory)
Hairong Shang (Argonne National Laboratory)
Robert Soliday (Argonne National Laboratory)
