HighFive: An easy-to-use, header-only C++ library for HDF5

Adrien Devresse1, Omar Awile1, Jorge Blanco1, Tristan Carel1, Nicolas Cornu1, Tom de Geus2, Luc Grosheintz-Laval1, Pramod Kumbhar1, Fernando Pereira1, Sergio Rivas Gomez1, Matthias Wolf1, James King1

1 Blue Brain Project, École Polytechnique Fédérale de Lausanne, Switzerland
2 Physics of Complex Systems Laboratory, École Polytechnique Fédérale de Lausanne, Switzerland

Introduction

The use of portable scientific data formats is vital for managing complex workflows, reliable data storage, knowledge transfer, and long-term maintainability and reproducibility. Hierarchical Data Format (HDF) 5 is considered the de facto industry standard for this purpose. While the official HDF5 library is versatile and well supported, it only provides a low-level C/C++ interface. The lack of proper high-level C++ abstractions discourages the use of HDF5 in scientific applications. A number of C++ wrapper libraries are available, but many are domain-specific, incomplete or not actively maintained.

HighFive is an attempt to address these challenges.

Basic use of HighFive

HighFive is an easy-to-use, modern, header-only C++ library that removes most of the bookkeeping overhead required by HDF5. It uses RAII to manage object lifetimes and automatically handles reference counting on HDF5 objects. The library relies on C++ templates for automatic type mapping between C++ and HDF5 types. These features significantly increase programmer productivity and reduce coding bugs.

[Figure: code comparison, HighFive vs. HDF5]

Support for HDF5 advanced features

HighFive's simplified data management does not come at the cost of HDF5's flexibility: advanced features and tunable parameters are exposed through a simple interface. File version bounds can be read and written to define object compatibility, and the metadata block size can be set. Group properties for compression, chunking and link info estimates can also be set and read.

Complex data types

HighFive is built with scientific applications in mind. Besides scalars and simple STL vectors, it is possible to map C++ structs to HDF5 compound types and to read and write Boost, Boost uBLAS, Eigen and xtensor array types. The library can also handle combinations of array types (e.g. std::vector<Eigen::Matrix>). This is achieved through various templated converters. Additionally, HighFive supports enums and various string types.

HighFive for parallel applications

To support large-scale scientific applications, we have made an effort to natively support the HDF5 MPI backend in HighFive. Application code passes a special MPIOFileDriver when opening a file, which ensures that HDF5 is correctly initialized for parallel I/O. No other special API calls are required, since all necessary provisions are handled transparently.

H5Easy: one-liners

HighFive also offers the H5Easy API, in its own namespace. It allows common operations to be written as one-liners, with a syntax comparable to, for example, h5py for Python. It offers overloads for STL containers, Boost, Eigen, xtensor, and OpenCV.

[Figure: code comparison, H5Easy vs. h5py]

Obtaining and building HighFive

HighFive is developed as open source and can be cloned or forked from GitHub. It can be installed from the clone, via Spack, or via conda, and then used in CMake, for example with find_package(HighFive). Because it is a header-only library, HighFive can also be used directly as a subfolder in a C++ project, for example by adding it as a Git submodule. More details can be found in the README.md file.

© 2022 Blue Brain Project/EPFL
The development of this software was supported by funding to the Blue Brain Project, a research center of the École polytechnique fédérale de Lausanne (EPFL), from the Swiss government's ETH Board of the Swiss Federal Institutes of Technology.

Boost Software License 1.0