MOAB: Mesh Oriented datABase  (version 5.4.1)
moab::ReadHDF5Dataset Class Reference

Utility used for reading portions of an HDF5 dataset.

#include <ReadHDF5Dataset.hpp>


Classes

class  Exception

Public Types

typedef int Comm

Public Member Functions

 ReadHDF5Dataset (const char *debug_desc, hid_t data_set_handle, bool parallel, const Comm *communicator=0, bool close_data_set_on_destruct=true)
 Setup to read entire table.
 ReadHDF5Dataset (const char *debug_desc, bool parallel, const Comm *communicator=0)
void init (hid_t data_set_handle, bool close_data_set_on_destruct=true)
bool will_close_data_set () const
void close_data_set_on_destruct (bool val)
 ~ReadHDF5Dataset ()
void set_file_ids (const Range &file_ids, EntityHandle start_id, hsize_t row_count, hid_t data_type)
 Change file ids to read from.
void set_all_file_ids (hsize_t row_count, hid_t data_type)
 Read all values in dataset (undo set_file_ids)
bool done () const
 Return false if more data to read, true otherwise.
void read (void *buffer, size_t &rows_read)
 Read rows of table.
Range::const_iterator next_file_id () const
 Return position in Range of file IDs at which next read will start.
void null_read ()
 Do null read operation.
unsigned columns () const
void set_column (unsigned c)
unsigned long get_read_count () const
const char * get_debug_desc () const

Static Public Member Functions

static void set_hyperslab_selection_limit (size_t val)
static void default_hyperslab_selection_limit ()
static void append_hyperslabs ()
static void or_hyperslabs ()

Private Member Functions

Range::const_iterator next_end (Range::const_iterator iter)

Private Attributes

Range internalRange
 used when reading entire dataset
bool closeDataSet
 close dataset in destructor
hsize_t dataSetOffset [64]
hsize_t dataSetCount [64]
hid_t dataSet
 Handle for HDF5 data set.
hid_t dataSpace
 Data space for data set.
hid_t dataType
 Data type client code wants for data.
hid_t fileType
 Data type as stored in data set.
hid_t ioProp
 Used to specify collective IO.
int dataSpaceRank
 Rank of data set.
hsize_t rowsInTable
 Total number of rows in dataset.
bool doConversion
 True if dataType != fileType.
bool nativeParallel
 If true then reading different data on different procs.
hsize_t readCount
 Number of actual reads to do.
hsize_t bufferSize
 size of buffer passed to read, in number of rows
const Comm * mpiComm
Range::const_iterator currOffset
Range::const_iterator rangeEnd
EntityHandle startID
std::string mpeDesc

Static Private Attributes

static bool haveMPEEvents = false
static std::pair< int, int > mpeReadEvent
static std::pair< int, int > mpeReduceEvent
static size_t hyperslabSelectionLimit = DEFAULT_HYPERSLAB_SELECTION_LIMIT
static H5S_seloper_t hyperslabSelectOp = H5S_SELECT_OR

Detailed Description

Utility used for reading portions of an HDF5 dataset.

Implement iterative read of table where:

  • subset of rows to be read can be specified using a Range of offsets
  • each read fills as much as possible of a passed buffer
  • each read call reads a subsequent set of rows of the data set in an iterator-like fashion.

NOTE: This class also implements an RAII pattern for the data set handle: It will close the data set in its destructor unless close_data_set_on_destruct is false (set through the constructor, init, or close_data_set_on_destruct).

NOTE: This class will always do collective IO for parallel reads.
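
The iterator-like calling sequence described above can be illustrated without HDF5 or MOAB. The following is only a minimal sketch under stated assumptions, not the class's implementation: ToyReader, read_all, and the in-memory table are hypothetical stand-ins that mimic the set_file_ids / done / read sequence on a std::vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ReadHDF5Dataset: serves selected rows of an
// in-memory table in buffer-sized chunks. No HDF5 calls are made.
class ToyReader
{
  public:
    explicit ToyReader( const std::vector< int >& table ) : table_( table ) {}

    // file_ids: sorted row IDs to read; start_id: ID of table row 0;
    // row_count: capacity of the caller's buffer, in rows.
    void set_file_ids( const std::vector< long >& file_ids, long start_id, std::size_t row_count )
    {
        assert( row_count > 0 );
        ids_        = file_ids;
        startId_    = start_id;
        bufferRows_ = row_count;
        next_       = 0;
    }

    // True once every requested row has been handed out.
    bool done() const { return next_ == ids_.size(); }

    // Fill as much of the buffer as possible; report the rows actually read.
    void read( int* buffer, std::size_t& rows_read )
    {
        rows_read = 0;
        while( next_ < ids_.size() && rows_read < bufferRows_ )
        {
            const std::size_t row = static_cast< std::size_t >( ids_[next_++] - startId_ );
            buffer[rows_read++]   = table_[row];
        }
    }

  private:
    std::vector< int > table_;
    std::vector< long > ids_;
    long startId_           = 0;
    std::size_t next_       = 0;
    std::size_t bufferRows_ = 0;
};

// Drive the reader the way client code would: select rows, then loop
// until done(), consuming one buffer of rows per read() call.
std::vector< int > read_all( ToyReader& reader, const std::vector< long >& file_ids, long start_id,
                             std::size_t buffer_rows )
{
    reader.set_file_ids( file_ids, start_id, buffer_rows );
    std::vector< int > out, buffer( buffer_rows );
    while( !reader.done() )
    {
        std::size_t n = 0;
        reader.read( buffer.data(), n );
        out.insert( out.end(), buffer.begin(), buffer.begin() + static_cast< std::ptrdiff_t >( n ) );
    }
    return out;
}
```

The sketch deliberately omits the readCount bookkeeping that the real done() also checks; that part only matters for collective parallel reads.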

Definition at line 38 of file ReadHDF5Dataset.hpp.


Member Typedef Documentation

typedef int moab::ReadHDF5Dataset::Comm

Definition at line 44 of file ReadHDF5Dataset.hpp.


Constructor & Destructor Documentation

moab::ReadHDF5Dataset::ReadHDF5Dataset ( const char *  debug_desc,
hid_t  data_set_handle,
bool  parallel,
const Comm *  communicator = 0,
bool  close_data_set_on_destruct = true 
)

Setup to read entire table.

Parameters:
    data_set_handle             The HDF5 DataSet to read.
    parallel                    True when doing a true partial-read parallel read (as opposed to read and delete, where collective IO is done for everything because all procs read the same data).
    communicator                If parallel is true, this must be a non-null pointer to the MPI communicator value. In builds with parallel HDF5 support, reads use collective IO (H5FD_MPIO_COLLECTIVE) whenever a communicator is supplied.
    close_data_set_on_destruct  Call H5Dclose on the passed data_set_handle in the destructor.

If parallel is true, then not only must communicator be non-null (the constructor throws otherwise), but this call must be made collectively!

Class instance will not be usable until one of either set_file_ids or set_all_file_ids is called.

Definition at line 85 of file ReadHDF5Dataset.cpp.

References moab::allocate_mpe_state(), haveMPEEvents, init(), ioProp, mpeReadEvent, mpeReduceEvent, mpiComm, and nativeParallel.

    : closeDataSet( close_data_set ), dataSet( data_set_handle ), dataSpace( -1 ), dataType( -1 ), fileType( -1 ),
      ioProp( H5P_DEFAULT ), dataSpaceRank( 0 ), rowsInTable( 0 ), doConversion( false ), nativeParallel( parallel ),
      readCount( 0 ), bufferSize( 0 ), mpiComm( communicator ), mpeDesc( debug_desc )
{
    if( !haveMPEEvents )
    {
        haveMPEEvents  = true;
        mpeReadEvent   = allocate_mpe_state( "ReadHDF5Dataset::read", "yellow" );
        mpeReduceEvent = allocate_mpe_state( "ReadHDF5Dataset::all_reduce", "yellow" );
    }

    init( data_set_handle, close_data_set );

#ifndef MOAB_HAVE_HDF5_PARALLEL
    if( nativeParallel ) throw Exception( __LINE__ );
#else
    if( nativeParallel && !mpiComm ) throw Exception( __LINE__ );

    if( mpiComm )
    {
        ioProp = H5Pcreate( H5P_DATASET_XFER );
        H5Pset_dxpl_mpio( ioProp, H5FD_MPIO_COLLECTIVE );
    }
#endif
}
moab::ReadHDF5Dataset::ReadHDF5Dataset ( const char *  debug_desc,
bool  parallel,
const Comm *  communicator = 0 
)

Definition at line 60 of file ReadHDF5Dataset.cpp.

References moab::allocate_mpe_state(), haveMPEEvents, ioProp, mpeReadEvent, mpeReduceEvent, mpiComm, and nativeParallel.

    : closeDataSet( false ), dataSet( -1 ), dataSpace( -1 ), dataType( -1 ), fileType( -1 ), ioProp( H5P_DEFAULT ),
      dataSpaceRank( 0 ), rowsInTable( 0 ), doConversion( false ), nativeParallel( parallel ), readCount( 0 ),
      bufferSize( 0 ), mpiComm( communicator ), mpeDesc( debug_desc )
{
    if( !haveMPEEvents )
    {
        haveMPEEvents  = true;
        mpeReadEvent   = allocate_mpe_state( "ReadHDF5Dataset::read", "yellow" );
        mpeReduceEvent = allocate_mpe_state( "ReadHDF5Dataset::all_reduce", "yellow" );
    }

#ifndef MOAB_HAVE_HDF5_PARALLEL
    if( nativeParallel ) throw Exception( __LINE__ );
#else
    if( nativeParallel && !mpiComm ) throw Exception( __LINE__ );

    if( mpiComm )
    {
        ioProp = H5Pcreate( H5P_DATASET_XFER );
        H5Pset_dxpl_mpio( ioProp, H5FD_MPIO_COLLECTIVE );
    }
#endif
}

moab::ReadHDF5Dataset::~ReadHDF5Dataset ( )

Definition at line 231 of file ReadHDF5Dataset.cpp.

References closeDataSet, dataSet, dataSpace, fileType, and ioProp.

{
    if( fileType >= 0 ) H5Tclose( fileType );
    if( dataSpace >= 0 ) H5Sclose( dataSpace );
    if( closeDataSet && dataSet >= 0 ) H5Dclose( dataSet );
    dataSpace = dataSet = -1;
    if( ioProp != H5P_DEFAULT ) H5Pclose( ioProp );
}
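
The conditional close performed by the destructor can be modeled in isolation. This is a hedged sketch only: HandleGuard, count_closes, and the callback-based closer are hypothetical names introduced for illustration, and an int stands in for hid_t so that no HDF5 library is needed.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical guard that mirrors the closeDataSet flag: it releases the
// handle in its destructor only when ownership was requested.
class HandleGuard
{
  public:
    HandleGuard( int handle, bool close_on_destruct, std::function< void( int ) > closer )
        : handle_( handle ), close_( close_on_destruct ), closer_( std::move( closer ) )
    {
    }

    // Copying would lead to a double close, so it is disabled.
    HandleGuard( const HandleGuard& )            = delete;
    HandleGuard& operator=( const HandleGuard& ) = delete;

    ~HandleGuard()
    {
        // Negative handles are treated as invalid, as in the destructor above.
        if( close_ && handle_ >= 0 ) closer_( handle_ );
    }

    bool will_close() const { return close_; }
    void close_on_destruct( bool val ) { close_ = val; }

  private:
    int handle_;
    bool close_;
    std::function< void( int ) > closer_;
};

// Returns how many times the closer ran over one guard lifetime.
int count_closes( int handle, bool close_on_destruct )
{
    int closes = 0;
    {
        HandleGuard guard( handle, close_on_destruct, [&closes]( int ) { ++closes; } );
    }
    return closes;
}
```

Passing false for the flag corresponds to the case where the caller keeps ownership of the data set handle and closes it elsewhere.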

Member Function Documentation

static void moab::ReadHDF5Dataset::append_hyperslabs ( ) [inline, static]

Use non-standard 'APPEND' operation for hyperslab selection

Definition at line 166 of file ReadHDF5Dataset.hpp.

References hyperslabSelectOp.

Referenced by moab::ReadHDF5::set_up_read().

    {
        hyperslabSelectOp = H5S_SELECT_APPEND;
    }

void moab::ReadHDF5Dataset::close_data_set_on_destruct ( bool  val) [inline]

Definition at line 85 of file ReadHDF5Dataset.hpp.

References closeDataSet.

    {
        closeDataSet = val;
    }
unsigned moab::ReadHDF5Dataset::columns ( ) const

Definition at line 137 of file ReadHDF5Dataset.cpp.

References dataSetCount, and dataSpaceRank.

Referenced by read().

{
    if( dataSpaceRank == 1 )
        return 1;
    else if( dataSpaceRank == 2 )
        return dataSetCount[1];

    throw Exception( __LINE__ );
}
bool moab::ReadHDF5Dataset::done ( ) const [inline]

Return false if more data to read, true otherwise.

Test if the iterative read has reached the end.

Definition at line 116 of file ReadHDF5Dataset.hpp.

References currOffset, rangeEnd, and readCount.

Referenced by moab::ReadHDF5VarLen::read_data(), moab::ReadHDF5::read_elems(), moab::ReadHDF5::read_nodes(), moab::ReadHDF5VarLen::read_offsets(), moab::ReadHDF5::read_set_data(), moab::ReadHDF5::read_sparse_tag(), and moab::ReadHDF5::read_tag_values_partial().

    {
        return ( currOffset == rangeEnd ) && ( readCount == 0 );
    }
const char* moab::ReadHDF5Dataset::get_debug_desc ( ) const [inline]
void moab::ReadHDF5Dataset::init ( hid_t  data_set_handle,
bool  close_data_set_on_destruct = true 
)

Definition at line 116 of file ReadHDF5Dataset.cpp.

References closeDataSet, currOffset, dataSet, dataSetCount, dataSetOffset, dataSpace, dataSpaceRank, moab::Range::end(), fileType, internalRange, rangeEnd, and rowsInTable.

Referenced by moab::ReadHDF5::read_set_ids_recursive(), and ReadHDF5Dataset().

{
    closeDataSet = close_data_set;
    dataSet      = data_set_handle;

    fileType = H5Dget_type( data_set_handle );
    if( fileType < 0 ) throw Exception( __LINE__ );

    dataSpace = H5Dget_space( dataSet );
    if( dataSpace < 0 ) throw Exception( __LINE__ );

    dataSpaceRank = H5Sget_simple_extent_dims( dataSpace, dataSetCount, dataSetOffset );
    if( dataSpaceRank < 0 ) throw Exception( __LINE__ );
    rowsInTable = dataSetCount[0];

    for( int i = 0; i < dataSpaceRank; ++i )
        dataSetOffset[i] = 0;

    currOffset = rangeEnd = internalRange.end();
}

Range::const_iterator moab::ReadHDF5Dataset::next_end ( Range::const_iterator  iter) [private]

Definition at line 154 of file ReadHDF5Dataset.cpp.

References bufferSize, moab::Range::const_iterator::end_of_block(), hyperslabSelectionLimit, and rangeEnd.

Referenced by read(), and set_file_ids().

{
    size_t slabs_remaining = hyperslabSelectionLimit;
    size_t avail           = bufferSize;
    while( iter != rangeEnd && slabs_remaining )
    {
        size_t count = *( iter.end_of_block() ) - *iter + 1;
        if( count >= avail )
        {
            iter += avail;
            break;
        }

        avail -= count;
        iter += count;
        --slabs_remaining;
    }
    return iter;
}
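
The chunking rule in next_end() can be tried out on a plain sorted vector of IDs. This is a sketch under assumptions rather than MOAB code: the flat std::vector, the index-based positions, and both function names below are hypothetical replacements for Range, Range::const_iterator, and end_of_block().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index of the last element of the contiguous run that contains ids[pos].
std::size_t end_of_block_index( const std::vector< long >& ids, std::size_t pos )
{
    while( pos + 1 < ids.size() && ids[pos + 1] == ids[pos] + 1 )
        ++pos;
    return pos;
}

// Flat-vector analogue of next_end(): starting at pos, return the position
// one past the last ID that a single read should cover, given a buffer of
// buffer_rows rows and at most slab_limit contiguous blocks per read.
std::size_t next_end_index( const std::vector< long >& ids, std::size_t pos, std::size_t buffer_rows,
                            std::size_t slab_limit )
{
    std::size_t avail = buffer_rows;
    while( pos != ids.size() && slab_limit )
    {
        const std::size_t last  = end_of_block_index( ids, pos );
        const std::size_t count = static_cast< std::size_t >( ids[last] - ids[pos] ) + 1;
        if( count >= avail )
        {
            pos += avail;  // the buffer fills up within (or exactly at the end of) this block
            break;
        }
        avail -= count;
        pos += count;
        --slab_limit;
    }
    return pos;
}
```

A read therefore ends when the buffer is full, when the block limit is reached, or when the IDs run out, whichever comes first.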

Range::const_iterator moab::ReadHDF5Dataset::next_file_id ( ) const [inline]

Return position in Range of file IDs at which next read will start.

Definition at line 132 of file ReadHDF5Dataset.hpp.

References currOffset.

    {
        return currOffset;
    }

void moab::ReadHDF5Dataset::null_read ( )

Do null read operation.

Do a read call requesting no data. This functionality is provided so as to allow collective IO when not all processes need to make the same number of read calls. To prevent deadlock in this case, processes that have finished their necessary read calls can call this function so that all processes are calling the read method collectively.
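
The bookkeeping behind this can be shown with a small serial model. The sketch below is illustrative only: null_reads_needed is a hypothetical helper, and std::max_element stands in for the MPI_MAX reduction that set_file_ids performs across processes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Given the number of real reads each process needs, return how many null
// reads each one must add so that every process makes the same number of
// collective calls.
std::vector< std::size_t > null_reads_needed( const std::vector< std::size_t >& real_reads )
{
    std::vector< std::size_t > padding;
    if( real_reads.empty() ) return padding;

    // Serial stand-in for reducing the per-process counts with MPI_MAX.
    const std::size_t max_reads = *std::max_element( real_reads.begin(), real_reads.end() );

    for( std::size_t n : real_reads )
        padding.push_back( max_reads - n );
    return padding;
}
```

With per-process counts of 3, 1, and 0, the second and third processes would need 2 and 3 null reads respectively to stay in step with the first.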

Definition at line 297 of file ReadHDF5Dataset.cpp.

References dataSet, dataSpace, fileType, and ioProp.

Referenced by read(), and moab::ReadHDF5::read_set_data().

{
    herr_t err;
    err = H5Sselect_none( dataSpace );
    if( err < 0 ) throw Exception( __LINE__ );

    //#if HDF5_16API
    hsize_t one  = 1;
    hid_t mem_id = H5Screate_simple( 1, &one, NULL );
    if( mem_id < 0 ) throw Exception( __LINE__ );
    err = H5Sselect_none( mem_id );
    if( err < 0 )
    {
        H5Sclose( mem_id );
        throw Exception( __LINE__ );
    }
    //#else
    //  hid_t mem_id = H5Screate(H5S_NULL);
    //  if (mem_id < 0)
    //    throw Exception(__LINE__);
    //#endif

    err = H5Dread( dataSet, fileType, mem_id, dataSpace, ioProp, 0 );
    H5Sclose( mem_id );
    if( err < 0 ) throw Exception( __LINE__ );
}
static void moab::ReadHDF5Dataset::or_hyperslabs ( ) [inline, static]

Revert to default select behavior for standard HDF5 library

Definition at line 171 of file ReadHDF5Dataset.hpp.

References hyperslabSelectOp.

    {
        hyperslabSelectOp = H5S_SELECT_OR;
    }
void moab::ReadHDF5Dataset::read ( void *  buffer,
size_t &  rows_read 
)

Read rows of table.

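Read up to one buffer of rows from the data set, where the buffer size in rows is the row_count passed to set_file_ids or set_all_file_ids.

Parameters:
    buffer     Memory in which to store values read from data set.
    rows_read  The actual number of rows read from the table. Will never exceed that buffer size.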

Definition at line 240 of file ReadHDF5Dataset.cpp.

References columns(), currOffset, dataSet, dataSetCount, dataSetOffset, dataSpace, dataSpaceRank, dataType, doConversion, moab::Range::const_iterator::end_of_block(), fileType, hyperslabSelectOp, ioProp, MPE_Log_event, mpeDesc, mpeReadEvent, next_end(), null_read(), rangeEnd, readCount, and startID.

Referenced by moab::ReadHDF5VarLen::read_data(), moab::ReadHDF5::read_elems(), moab::ReadHDF5::read_nodes(), moab::ReadHDF5VarLen::read_offsets(), moab::ReadHDF5::read_set_data(), moab::ReadHDF5::read_sparse_tag(), and moab::ReadHDF5::read_tag_values_partial().

{
    herr_t err;
    rows_read = 0;

    MPE_Log_event( mpeReadEvent.first, (int)readCount, mpeDesc.c_str() );
    if( currOffset != rangeEnd )
    {

        // Build H5S hyperslab selection describing the portions of the
        // data set to read
        H5S_seloper_t sop       = H5S_SELECT_SET;
        Range::iterator new_end = next_end( currOffset );
        while( currOffset != new_end )
        {
            size_t count = *( currOffset.end_of_block() ) - *currOffset + 1;
            if( new_end != rangeEnd && *currOffset + count > *new_end )
            {
                count = *new_end - *currOffset;
            }
            rows_read += count;

            dataSetOffset[0] = *currOffset - startID;
            dataSetCount[0]  = count;
            err              = H5Sselect_hyperslab( dataSpace, sop, dataSetOffset, NULL, dataSetCount, 0 );
            if( err < 0 ) throw Exception( __LINE__ );
            sop = hyperslabSelectOp;  // subsequent calls to select_hyperslab append

            currOffset += count;
        }

        // Create a data space describing the memory in which to read the data
        dataSetCount[0] = rows_read;
        hid_t mem_id    = H5Screate_simple( dataSpaceRank, dataSetCount, NULL );
        if( mem_id < 0 ) throw Exception( __LINE__ );

        // Do the actual read
        err = H5Dread( dataSet, fileType, mem_id, dataSpace, ioProp, buffer );
        H5Sclose( mem_id );
        if( err < 0 ) throw Exception( __LINE__ );

        if( readCount ) --readCount;

        if( doConversion )
        {
            err = H5Tconvert( fileType, dataType, rows_read * columns(), buffer, 0, H5P_DEFAULT );
            if( err < 0 ) throw Exception( __LINE__ );
        }
    }
    else if( readCount )
    {
        null_read();
        --readCount;
    }
    MPE_Log_event( mpeReadEvent.second, (int)readCount, mpeDesc.c_str() );
}
void moab::ReadHDF5Dataset::set_all_file_ids ( hsize_t  row_count,
hid_t  data_type 
)

Read all values in dataset (undo set_file_ids)

Parameters:
    row_count  Read buffer size in number of table rows.
    data_type  The data type of the buffer into which table values are to be read.

Definition at line 224 of file ReadHDF5Dataset.cpp.

References moab::Range::clear(), moab::Range::insert(), internalRange, rowsInTable, and set_file_ids().

Referenced by moab::ReadHDF5::read_sparse_tag_indices(), and moab::ReadHDF5::read_tag_values_partial().

void moab::ReadHDF5Dataset::set_column ( unsigned  c)

Definition at line 147 of file ReadHDF5Dataset.cpp.

References dataSetCount, dataSetOffset, and dataSpaceRank.

Referenced by moab::ReadHDF5::read_nodes().

{
    if( dataSpaceRank != 2 || column >= dataSetCount[1] ) throw Exception( __LINE__ );
    dataSetCount[1]  = 1;
    dataSetOffset[1] = column;
}
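
Restricting the second dimension to a count of 1 at a given offset selects a single column of a two-dimensional table. The effect can be sketched on a row-major array; read_column and its arguments are hypothetical and do not use the HDF5 selection API.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Extract one column from a row-major table with 'cols' values per row.
// This mimics a selection with count[1] = 1 and offset[1] = column.
std::vector< double > read_column( const std::vector< double >& table, std::size_t cols, std::size_t column )
{
    // Reject selections outside the table, as set_column does by throwing.
    if( cols == 0 || column >= cols || table.size() % cols != 0 )
        throw std::out_of_range( "invalid column selection" );

    std::vector< double > out;
    for( std::size_t i = column; i < table.size(); i += cols )
        out.push_back( table[i] );
    return out;
}
```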
void moab::ReadHDF5Dataset::set_file_ids ( const Range &  file_ids,
EntityHandle  start_id,
hsize_t  row_count,
hid_t  data_type 
)

Change file ids to read from.

Parameters:
    file_ids   List of rows to read from dataset.
    start_id   Rows of dataset are enumerated beginning with this value. Thus the offset of the first row to be read from the dataset will be *file_ids.begin() - start_id .
    row_count  Read buffer size in number of table rows.
    data_type  The data type of the buffer into which table values are to be read.
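
The mapping from file IDs to dataset rows can be sketched as follows. This is an illustration under assumptions, not the MOAB code path: Slab and to_slabs are hypothetical names, and a sorted std::vector replaces Range. Each contiguous run of IDs becomes one (offset, count) pair, which is the shape of selection that read() passes to H5Sselect_hyperslab.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One contiguous run of dataset rows: first row offset and number of rows.
struct Slab
{
    std::size_t offset;
    std::size_t count;
};

// Convert sorted file IDs into row slabs, where the row offset of an ID
// is id - start_id.
std::vector< Slab > to_slabs( const std::vector< long >& ids, long start_id )
{
    std::vector< Slab > slabs;
    for( long id : ids )
    {
        assert( id >= start_id );
        const std::size_t row = static_cast< std::size_t >( id - start_id );
        if( !slabs.empty() && slabs.back().offset + slabs.back().count == row )
            ++slabs.back().count;  // extends the current contiguous run
        else
            slabs.push_back( { row, 1 } );  // starts a new run
    }
    return slabs;
}
```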

Definition at line 174 of file ReadHDF5Dataset.cpp.

References moab::Range::begin(), bufferSize, currOffset, dataType, doConversion, moab::Range::end(), fileType, MPE_Log_event, mpeDesc, mpeReduceEvent, mpiComm, nativeParallel, next_end(), rangeEnd, readCount, and startID.

Referenced by moab::ReadHDF5VarLen::read_data(), moab::ReadHDF5::read_elems(), moab::ReadHDF5::read_nodes(), moab::ReadHDF5VarLen::read_offsets(), moab::ReadHDF5::read_set_data(), moab::ReadHDF5::read_sparse_tag(), moab::ReadHDF5::read_tag_values_partial(), and set_all_file_ids().

{
    startID    = start_id;
    currOffset = file_ids.begin();
    rangeEnd   = file_ids.end();
    readCount  = 0;
    bufferSize = row_count;

    // if a) user specified buffer size and b) we're doing a true
    // parallel partial read and c) we're doing collective I/O, then
    // we need to know the maximum number of reads that will be done.
#ifdef MOAB_HAVE_HDF5_PARALLEL
    if( nativeParallel )
    {
        Range::const_iterator iter = currOffset;
        while( iter != rangeEnd )
        {
            ++readCount;
            iter = next_end( iter );
        }

        MPE_Log_event( mpeReduceEvent.first, (int)readCount, mpeDesc.c_str() );
        unsigned long recv = readCount, send = readCount;
        MPI_Allreduce( &send, &recv, 1, MPI_UNSIGNED_LONG, MPI_MAX, *mpiComm );
        readCount = recv;
        MPE_Log_event( mpeReduceEvent.second, (int)readCount, mpeDesc.c_str() );
    }
#endif

    dataType     = data_type;
    htri_t equal = H5Tequal( fileType, dataType );
    if( equal < 0 ) throw Exception( __LINE__ );
    doConversion = !equal;

    // We always read in the format of the file to avoid stupid HDF5
    // library behavior when reading in parallel.  We call H5Tconvert
    // ourselves to do the data conversion.  If the type we're reading
    // from the file is larger than the type we want in memory, then
    // we need to reduce num_rows so that we can read the larger type
    // from the file into the passed buffer meant to accommodate num_rows
    // of values of the smaller in-memory type.
    if( doConversion )
    {
        size_t mem_size, file_size;
        mem_size  = H5Tget_size( dataType );
        file_size = H5Tget_size( fileType );
        if( file_size > mem_size ) bufferSize = bufferSize * mem_size / file_size;
    }
}
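
The buffer adjustment at the end of set_file_ids is plain integer arithmetic and can be checked on its own. The helper below is a hypothetical sketch; usable_rows is not part of the class, and the sizes would come from H5Tget_size in the real code.

```cpp
#include <cassert>
#include <cstddef>

// Number of rows that fit in a buffer sized for buffer_rows values of the
// in-memory type when each value is first read in the (possibly larger)
// file type. Integer division rounds down, as in set_file_ids.
std::size_t usable_rows( std::size_t buffer_rows, std::size_t mem_size, std::size_t file_size )
{
    assert( mem_size > 0 && file_size > 0 );
    if( file_size > mem_size ) return buffer_rows * mem_size / file_size;
    return buffer_rows;
}
```

For example, a buffer sized for 100 four-byte values holds only 50 rows when the file stores eight-byte values, because conversion happens in place after the read.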
static void moab::ReadHDF5Dataset::set_hyperslab_selection_limit ( size_t  val) [inline, static]

Definition at line 159 of file ReadHDF5Dataset.hpp.

References hyperslabSelectionLimit.

Referenced by moab::ReadHDF5::set_up_read().

bool moab::ReadHDF5Dataset::will_close_data_set ( ) const [inline]

Definition at line 81 of file ReadHDF5Dataset.hpp.

References closeDataSet.

    {
        return closeDataSet;
    }

Member Data Documentation

hsize_t moab::ReadHDF5Dataset::bufferSize [private]

size of buffer passed to read, in number of rows

Definition at line 194 of file ReadHDF5Dataset.hpp.

Referenced by next_end(), and set_file_ids().

bool moab::ReadHDF5Dataset::closeDataSet [private]

close dataset in destructor

Definition at line 181 of file ReadHDF5Dataset.hpp.

Referenced by close_data_set_on_destruct(), init(), will_close_data_set(), and ~ReadHDF5Dataset().

hid_t moab::ReadHDF5Dataset::dataSet [private]

Handle for HDF5 data set.

Definition at line 183 of file ReadHDF5Dataset.hpp.

Referenced by init(), null_read(), read(), and ~ReadHDF5Dataset().

hsize_t moab::ReadHDF5Dataset::dataSetCount[64] [private]

Definition at line 182 of file ReadHDF5Dataset.hpp.

Referenced by columns(), init(), read(), and set_column().

hsize_t moab::ReadHDF5Dataset::dataSetOffset[64] [private]

Definition at line 182 of file ReadHDF5Dataset.hpp.

Referenced by init(), read(), and set_column().

hid_t moab::ReadHDF5Dataset::dataSpace [private]

Data space for data set.

Definition at line 184 of file ReadHDF5Dataset.hpp.

Referenced by init(), null_read(), read(), and ~ReadHDF5Dataset().

int moab::ReadHDF5Dataset::dataSpaceRank [private]

Rank of data set.

Definition at line 188 of file ReadHDF5Dataset.hpp.

Referenced by columns(), init(), read(), and set_column().

hid_t moab::ReadHDF5Dataset::dataType [private]

Data type client code wants for data.

Definition at line 185 of file ReadHDF5Dataset.hpp.

Referenced by read(), and set_file_ids().

bool moab::ReadHDF5Dataset::doConversion [private]

True if dataType != fileType.

Definition at line 190 of file ReadHDF5Dataset.hpp.

Referenced by read(), and set_file_ids().

hid_t moab::ReadHDF5Dataset::fileType [private]

Data type as stored in data set.

Definition at line 186 of file ReadHDF5Dataset.hpp.

Referenced by init(), null_read(), read(), set_file_ids(), and ~ReadHDF5Dataset().

bool moab::ReadHDF5Dataset::haveMPEEvents = false [static, private]

Definition at line 200 of file ReadHDF5Dataset.hpp.

Referenced by ReadHDF5Dataset().

H5S_seloper_t moab::ReadHDF5Dataset::hyperslabSelectOp = H5S_SELECT_OR [static, private]

Definition at line 206 of file ReadHDF5Dataset.hpp.

Referenced by append_hyperslabs(), or_hyperslabs(), and read().

Range moab::ReadHDF5Dataset::internalRange [private]

used when reading entire dataset

Definition at line 179 of file ReadHDF5Dataset.hpp.

Referenced by init(), and set_all_file_ids().

hid_t moab::ReadHDF5Dataset::ioProp [private]

Used to specify collective IO.

Definition at line 187 of file ReadHDF5Dataset.hpp.

Referenced by null_read(), read(), ReadHDF5Dataset(), and ~ReadHDF5Dataset().

std::string moab::ReadHDF5Dataset::mpeDesc [private]

Definition at line 203 of file ReadHDF5Dataset.hpp.

Referenced by get_debug_desc(), read(), and set_file_ids().

std::pair< int, int > moab::ReadHDF5Dataset::mpeReadEvent [static, private]

Definition at line 201 of file ReadHDF5Dataset.hpp.

Referenced by read(), and ReadHDF5Dataset().

std::pair< int, int > moab::ReadHDF5Dataset::mpeReduceEvent [static, private]

Definition at line 202 of file ReadHDF5Dataset.hpp.

Referenced by ReadHDF5Dataset(), and set_file_ids().

const Comm* moab::ReadHDF5Dataset::mpiComm [private]

Definition at line 195 of file ReadHDF5Dataset.hpp.

Referenced by ReadHDF5Dataset(), and set_file_ids().

bool moab::ReadHDF5Dataset::nativeParallel [private]

If true then reading different data on different procs.

Definition at line 191 of file ReadHDF5Dataset.hpp.

Referenced by ReadHDF5Dataset(), and set_file_ids().

hsize_t moab::ReadHDF5Dataset::readCount [private]

Number of actual reads to do.

Definition at line 193 of file ReadHDF5Dataset.hpp.

Referenced by done(), get_read_count(), read(), and set_file_ids().

hsize_t moab::ReadHDF5Dataset::rowsInTable [private]

Total number of rows in dataset.

Definition at line 189 of file ReadHDF5Dataset.hpp.

Referenced by init(), and set_all_file_ids().

EntityHandle moab::ReadHDF5Dataset::startID [private]

Definition at line 198 of file ReadHDF5Dataset.hpp.

Referenced by read(), and set_file_ids().



The documentation for this class was generated from the following files:

  • ReadHDF5Dataset.hpp
  • ReadHDF5Dataset.cpp