
Opening output.csv file in all ranks can lead to very high file open overhead #131

Open
jprorama opened this issue Sep 29, 2024 · 0 comments
Labels
bug Something isn't working


Bug Report

The output.csv file is currently opened during parameter parsing as a side effect of read_config(), which is called by all ranks in the main() function of the core benchmarks in h5bench_patterns/. This leads to a degenerate performance scenario when Lustre striping is set to -1 (stripe across all OSTs) and all ranks coordinate to open output.csv. The coordination overhead is observable at small rank counts (64) and degrades rapidly with scale, easily exceeding an hour at just 512 ranks. It delays the start of the actual HDF5 write test until the file open coordination is complete, which can lead to unexpected job time overruns when exploring benchmark scenarios.

Note that this does not appear to affect benchmark results; it only degrades total runtime, presumably through metadata operations related to the file open from all ranks. This has been tested with collective I/O, although Darshan accounts the output.csv activity under its STDIO module.
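
For reference, one way to confirm that attribution from a captured Darshan log (a sketch; $LOGFILE is a placeholder, since no log file is named in this report):

# Dump the log and filter for the CSV; the open counters appear under
# the STDIO module rather than POSIX or MPI-IO.
darshan-parser $LOGFILE | grep output.csv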

This configuration arises when setting the -1 Lustre stripe pattern for the HDF5 write tests by applying it to the storage directory specified in the .json config file. This approach ensures newly created HDF5 files have the desired stripe config. If the benchmark config doesn't take care to move the output.csv file to a location where this stripe pattern is not active, the benchmark runtime grows due to file open overhead.
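
For context, a sketch of how that layout arises and a possible interim workaround (directory names are placeholders; the benchmark itself is unchanged):

# Stripe the benchmark storage directory across all OSTs; every file
# created inside inherits the layout, including output.csv.
lfs setstripe -c -1 storage
lfs getstripe -d storage     # verify the directory default: stripe_count -1

# Workaround: keep the CSV in a separate directory with a single stripe
# so only the HDF5 data files use full-width striping.
mkdir -p results
lfs setstripe -c 1 results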

It seems reasonable to open the output.csv only in the rank that will write the file, currently rank 0. Because the file is opened during parameter parsing, no rank information is available at that point. This can be resolved by moving the csv_init() call out of the _set_params() function in h5bench_util.c and into the main() body of the benchmark runner, which knows both the required function params and the process rank. This also gives the benchmark more control over when the file is opened; it is only written by rank 0 at the conclusion of all benchmark operations.
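
A minimal sketch of what that could look like in a benchmark's main() (names such as bench_params, read_config(), and the csv_init() signature follow the description above and may not match the actual h5bench source):

#include <stdio.h>
#include <mpi.h>
#include "h5bench_util.h"

int main(int argc, char *argv[]) {
    int my_rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    bench_params params;
    /* read_config() no longer calls csv_init() via _set_params(),
     * so non-writing ranks never touch output.csv. */
    read_config(argv[1], &params);

    FILE *csv_fp = NULL;
    if (my_rank == 0)
        csv_fp = csv_init(params.csv_file, params.meta_file); /* hypothetical fields */

    /* ... run the write benchmark; rank 0 records the results ... */

    if (csv_fp != NULL)
        fclose(csv_fp); /* only rank 0 holds an open handle */

    MPI_Finalize();
    return 0;
}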

To Reproduce

How are you building/running h5bench?

Running

mpirun -n 512 -ppn 64 --depth 1  --label --line-buffer /eagle/dist_relational_alg/nuio/bin//h5bench_write storage/f41141cb-2100189.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/h5bench.cfg storage/test-2100189.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov.h5

I'm building h5bench on Polaris.

mkdir build
cd build
cmake .. -DCMAKE_INSTALL_PREFIX=${NUIO_INSTALL_DIR} \
         -DWITH_ASYNC_VOL=ON \
         -DCMAKE_C_FLAGS="-I$HDF5_DIR/include -L/$HDF5_DIR/lib"

make
make install

What is the input configuration file you use?

{
    "mpi": {
        "command": "mpirun",
        "ranks": "64",
        "configuration": "-n 512 -ppn 64 --depth 1  --label --line-buffer"
    },
    "vol": {},
    "file-system": {},
    "directory": "storage",
    "benchmarks": [
        {
            "benchmark": "write",
            "file": "test-2100189.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov.h5",
            "configuration": {
                "MEM_PATTERN": "CONTIG",
                "FILE_PATTERN": "CONTIG",
                "TIMESTEPS": "3",
                "DELAYED_CLOSE_TIMESTEPS": "2",
                "COLLECTIVE_DATA": "YES",
                "COLLECTIVE_METADATA": "YES",
                "EMULATED_COMPUTE_TIME_PER_TIMESTEP": "1 s",
                "NUM_DIMS": "1",
                "DIM_1": "375385",
                "STDEV_DIM_1": "0",
                "DIM_2": "1",
                "DIM_3": "1",
                "CSV_FILE": "output.csv",
                "MODE": "SYNC",
                "DATA_DIST_SCALE": "5"
            }
        }
    ]
}

Expected Behavior

Opening log files should have low overhead.

Software Environment

  • version of h5bench: 1.4
  • installed h5bench using: from source
  • operating system: SUSE (Cray OS)
  • machine: Polaris
  • version of HDF5: 1.14
  • version of VOL-ASYNC: 1.8.1
  • name and version of MPI: cray-mpich/8.1.28


@jprorama added the bug label on Sep 29, 2024
@jprorama changed the title from "Opening CSV file in all ranks leads …" to "Opening output.csv file in all ranks can lead to very high file open overhead" on Sep 29, 2024
jprorama added a commit to jprorama/h5bench that referenced this issue Oct 3, 2024
This avoids opening the file in all ranks, which causes file open contention between ranks. Contention increases with rank count and stripe count.

Proposed partial fix for hpc-io#131 for base write benchmarks.