Mesh Oriented datABase  (version 5.4.1)
Array-based unstructured mesh datastructure
moab::FileTokenizer Class Reference

Parse a file as space-separated tokens.

#include <FileTokenizer.hpp>

Public Member Functions

 FileTokenizer (std::FILE *file_ptr, ReadUtilIface *read_util_ptr)
 constructor
 ~FileTokenizer ()
 destructor : closes file.
const char * get_string ()
 get next token
bool get_newline (bool report_error=true)
 check for newline
bool get_doubles (size_t count, double *array)
 Parse a sequence of double values.
bool get_floats (size_t count, float *array)
 Parse a sequence of float values.
bool get_integers (size_t count, int *array)
 Parse a sequence of integer values.
bool get_long_ints (size_t count, long *array)
 Parse a sequence of integer values.
bool get_short_ints (size_t count, short *array)
 Parse a sequence of integer values.
bool get_bytes (size_t count, unsigned char *array)
 Parse a sequence of integer values.
bool get_binary (size_t bytes, void *mem)
 Read binary data (interleaved with ASCII)
bool get_booleans (size_t count, bool *array)
 Parse a sequence of bit or boolean values.
bool eof () const
int line_number () const
void unget_token ()
bool match_token (const char *string, bool print_error=true)
int match_token (const char *const *string_list, bool print_error=true)

Private Member Functions

bool get_double_internal (double &result)
bool get_long_int_internal (long &result)
bool get_boolean_internal (bool &result)
bool get_float_internal (float &result)
bool get_integer_internal (int &result)
bool get_short_int_internal (short &result)
bool get_byte_internal (unsigned char &result)

Private Attributes

std::FILE * filePtr
char buffer [512]
char * nextToken
char * bufferEnd
int lineNumber
char lastChar

Detailed Description

Parse a file as space-separated tokens.

Author:
Jason Kraftcheck
Date:
30 Sept 2004

Read a file, separating it into space-separated tokens. This class is provided in place of the standard C or C++ file parsing routines because it counts lines, which is useful for error reporting. It also provides utility methods for parsing VTK files, which is the intended use of this implementation.

Reads the file with raw fread calls and does its own buffering. A token may not exceed the size of the internal buffer (char buffer[512]).
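The line-counting idea can be sketched independently of MOAB. The class below, `LineCountingTokenizer`, is a hypothetical stand-in written for this illustration. It is not the MOAB implementation: it reads one character at a time with `std::fgetc` instead of buffering, and it returns tokens as `std::string`.

```cpp
#include <cassert>
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical stand-in for illustration only; not part of MOAB.
class LineCountingTokenizer
{
  public:
    explicit LineCountingTokenizer( std::FILE* fp ) : file( fp ), line( 1 ), pendingNewline( false )
    {
        assert( fp != nullptr );
    }

    // Read the next token into 'out'. Returns false at end of file.
    bool next( std::string& out )
    {
        out.clear();
        // A newline that ended the previous token is counted only now,
        // so line_number() keeps reporting the line of the last token.
        if( pendingNewline )
        {
            ++line;
            pendingNewline = false;
        }
        int c;
        // Skip leading whitespace, counting newlines as we go.
        while( ( c = std::fgetc( file ) ) != EOF && std::isspace( c ) )
            if( c == '\n' ) ++line;
        if( c == EOF ) return false;
        // Collect characters up to the next whitespace or end of file.
        do
        {
            out.push_back( static_cast< char >( c ) );
        } while( ( c = std::fgetc( file ) ) != EOF && !std::isspace( c ) );
        pendingNewline = ( c == '\n' );
        return true;
    }

    int line_number() const { return line; }

  private:
    std::FILE* file;
    int line;
    bool pendingNewline;
};
```

Deferring the increment for a token's trailing newline is what lets an error message raised right after a failed parse name the line the bad token was on.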

Definition at line 44 of file FileTokenizer.hpp.


Constructor & Destructor Documentation

moab::FileTokenizer::FileTokenizer ( std::FILE *  file_ptr,
ReadUtilIface *  read_util_ptr 
)

constructor

Parameters:
file_ptr       The file to read from.
read_util_ptr  Pointer to ReadUtilIface to use for reporting errors.

Definition at line 30 of file FileTokenizer.cpp.

    : filePtr( file_ptr ), nextToken( buffer ), bufferEnd( buffer ), lineNumber( 1 ), lastChar( '\0' )
{
}

moab::FileTokenizer::~FileTokenizer ( )

destructor : closes file.

The destructor closes the passed file handle. This is done as a convenience feature. If the caller creates an instance of this object on the stack, the file will automatically be closed when the caller returns.

Definition at line 35 of file FileTokenizer.cpp.

References filePtr.

{
    fclose( filePtr );
}
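The ownership rule described above means the caller hands over the handle and must not close it again. A minimal sketch of the same pattern, using a hypothetical `OwningReader` class and `peek_first_byte` function that are not part of MOAB:

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for illustration only; not part of MOAB.
// Like the class documented here, it takes ownership of the handle
// and closes it in the destructor.
class OwningReader
{
  public:
    explicit OwningReader( std::FILE* fp ) : file( fp ) { assert( fp != nullptr ); }
    ~OwningReader() { std::fclose( file ); }

    // Non-copyable: two owners would close the same handle twice.
    OwningReader( const OwningReader& ) = delete;
    OwningReader& operator=( const OwningReader& ) = delete;

    int first_byte() { return std::fgetc( file ); }

  private:
    std::FILE* file;
};

// Returns the first byte of the file, or EOF. The handle is closed when
// 'reader' goes out of scope, so the caller must not fclose() it again.
int peek_first_byte( std::FILE* fp )
{
    OwningReader reader( fp );
    return reader.first_byte();
}
```

Deleting the copy operations is a choice made for this sketch; the listing above does not show whether FileTokenizer does the same.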

Member Function Documentation

bool moab::FileTokenizer::eof ( ) const

Check for end-of-file condition.

Definition at line 40 of file FileTokenizer.cpp.

References bufferEnd, filePtr, and nextToken.

Referenced by get_newline(), and moab::ReadVtk::load_file().

{
    return nextToken == bufferEnd && feof( filePtr );
}
bool moab::FileTokenizer::get_binary ( size_t  bytes,
void *  mem 
)

Read binary data (interleaved with ASCII)

Read a block of binary data.

Parameters:
bytes  Number of bytes to read.
mem    Memory address at which to store data.

Definition at line 420 of file FileTokenizer.cpp.

References bufferEnd, filePtr, nextToken, and size.

{
    // If data in buffer
    if( nextToken != bufferEnd )
    {
        // If requested size is less than buffer contents,
        // just pass back part of the buffer
        if( bufferEnd - nextToken <= (int)size )
        {
            memcpy( mem, nextToken, size );
            nextToken += size;
            return true;
        }

        // Copy buffer contents into memory and clear buffer
        memcpy( mem, nextToken, bufferEnd - nextToken );
        size -= bufferEnd - nextToken;
        mem       = reinterpret_cast< char* >( mem ) + ( bufferEnd - nextToken );
        nextToken = bufferEnd;
    }

    // Read any additional data from file
    return size == fread( mem, 1, size, filePtr );
}
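The order of operations (serve the request from already-buffered bytes first, then read the remainder straight from the file) can be sketched as a free function. `read_bytes` and its parameters are hypothetical names for this illustration, not MOAB API. The sketch follows the comments in the listing: the fast path is taken when the requested size is no larger than the buffered data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical free function for illustration only; not part of MOAB.
// 'next' and 'end' delimit bytes already read from 'fp' but not yet consumed.
bool read_bytes( const char*& next, const char* end, std::FILE* fp, std::size_t size, void* mem )
{
    assert( next <= end );
    const std::size_t buffered = static_cast< std::size_t >( end - next );

    // The request fits in the buffered data: copy it and advance the cursor.
    if( size <= buffered )
    {
        std::memcpy( mem, next, size );
        next += size;
        return true;
    }

    // Otherwise drain the buffer first ...
    std::memcpy( mem, next, buffered );
    next = end;

    // ... then read the rest directly from the file.
    char* rest               = static_cast< char* >( mem ) + buffered;
    const std::size_t wanted = size - buffered;
    return wanted == std::fread( rest, 1, wanted, fp );
}
```

Comparing sizes as `std::size_t` avoids the signed/unsigned cast, and a short read from the file is reported as failure.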
bool moab::FileTokenizer::get_boolean_internal ( bool &  result) [private]

Internal implementation of get_booleans

Definition at line 214 of file FileTokenizer.cpp.

References get_string(), line_number(), and MB_SET_ERR_RET_VAL.

Referenced by get_booleans().

{
    // Get a token
    const char* token = get_string();
    if( !token ) return false;

    if( token[1] || ( token[0] != '0' && token[0] != '1' ) )
        MB_SET_ERR_RET_VAL( "Syntax error at line " << line_number() << ": expected 0 or 1, got \"" << token << "\"",
                            false );

    result = token[0] == '1';

    return true;
}
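The acceptance rule in the listing is strict: only the one-character tokens "0" and "1" are valid. A standalone sketch of the same rule, with the hypothetical helper name `parse_bit` (not MOAB API) and without the error-message macro:

```cpp
#include <cassert>

// Hypothetical helper for illustration only; not part of MOAB.
// Accepts exactly the one-character tokens "0" and "1".
bool parse_bit( const char* token, bool& result )
{
    assert( token != nullptr );
    // Reject empty tokens, anything other than '0'/'1', and longer tokens.
    // The first character is tested first so that token[1] is never read
    // for an empty string.
    if( ( token[0] != '0' && token[0] != '1' ) || token[1] != '\0' ) return false;
    result = ( token[0] == '1' );
    return true;
}
```

Spellings such as "true" or "01" are rejected under this rule.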
bool moab::FileTokenizer::get_booleans ( size_t  count,
bool *  array 
)

Parse a sequence of bit or boolean values.

Read the specified number of space-delimited values.

Parameters:
count  The number of values to read.
array  The memory at which to store the values.
Returns:
true if successful, false otherwise.

Definition at line 295 of file FileTokenizer.cpp.

References get_boolean_internal().

Referenced by moab::ReadVtk::vtk_read_tag_data().

{
    for( size_t i = 0; i < count; ++i )
    {
        if( !get_boolean_internal( *array ) ) return false;
        ++array;
    }

    return true;
}
bool moab::FileTokenizer::get_byte_internal ( unsigned char &  result) [private]

Internal implementation of get_bytes

Definition at line 181 of file FileTokenizer.cpp.

References get_long_int_internal(), line_number(), and MB_SET_ERR_RET_VAL.

Referenced by get_bytes().

{
    long i;
    if( !get_long_int_internal( i ) ) return false;

    result = (unsigned char)i;
    if( i != (long)result ) MB_SET_ERR_RET_VAL( "Numeric overflow at line " << line_number(), false );

    return true;
}
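The overflow test used here, and in the short and int variants, is a round trip: convert to the narrower type, convert back to long, and compare with the original. A generic sketch with the hypothetical name `narrow_checked` (not MOAB API):

```cpp
// Hypothetical helper for illustration only; not part of MOAB.
// Converts 'value' to the narrower integer type T and reports whether
// the value survived the conversion unchanged.
template < typename T >
bool narrow_checked( long value, T& result )
{
    result = static_cast< T >( value );
    return static_cast< long >( result ) == value;
}
```

For unsigned targets the conversion is defined as modular arithmetic. For signed targets an out-of-range conversion is implementation-defined before C++20, although common platforms wrap, which is what the round-trip comparison relies on.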
bool moab::FileTokenizer::get_bytes ( size_t  count,
unsigned char *  array 
)

Parse a sequence of integer values.

Read the specified number of space-delimited ints.

Parameters:
count  The number of values to read.
array  The memory at which to store the values.
Returns:
true if successful, false otherwise.

Definition at line 251 of file FileTokenizer.cpp.

References get_byte_internal().

{
    for( size_t i = 0; i < count; ++i )
    {
        if( !get_byte_internal( *array ) ) return false;
        ++array;
    }

    return true;
}
bool moab::FileTokenizer::get_double_internal ( double &  result) [private]

Internal implementation of get_doubles

Definition at line 126 of file FileTokenizer.cpp.

References get_string(), line_number(), and MB_SET_ERR_RET_VAL.

Referenced by get_doubles(), and get_float_internal().

{
    // Get a token
    const char *token_end, *token = get_string();
    if( !token ) return false;

    // Check for hex value -- on some platforms (e.g. Linux), strtod
    // will accept hex values, on others (e.g. Sun) it will not.  Force
    // failure on hex numbers for consistency.
    if( token[0] && token[1] && token[0] == '0' && toupper( token[1] ) == 'X' )
        MB_SET_ERR_RET_VAL( "Syntax error at line " << line_number() << ": expected number, got \"" << token << "\"",
                            false );

    // Parse token as double
    result = strtod( token, (char**)&token_end );

    // If the one past the last char read by strtod is
    // not the NULL character terminating the string,
    // then parse failed.
    if( *token_end )
        MB_SET_ERR_RET_VAL( "Syntax error at line " << line_number() << ": expected number, got \"" << token << "\"",
                            false );

    return true;
}
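The two checks in the listing (reject a leading hex prefix, then require that strtod consumed the whole token) can be sketched as a standalone function. `parse_double_token` is a hypothetical name for this illustration, and the explicit empty-token check is an addition of the sketch.

```cpp
#include <cctype>
#include <cstdlib>

// Hypothetical helper for illustration only; not part of MOAB.
bool parse_double_token( const char* token, double& result )
{
    // Reject empty tokens and a leading "0x"/"0X" so that behaviour does not
    // depend on whether the platform's strtod accepts hexadecimal input.
    if( token[0] == '\0' ) return false;
    if( token[0] == '0' && std::toupper( static_cast< unsigned char >( token[1] ) ) == 'X' ) return false;

    char* end = nullptr;
    result    = std::strtod( token, &end );

    // strtod stops at the first character it cannot use. The token is a
    // valid number only if that character is the terminating NUL.
    return *end == '\0';
}
```

Checking the end pointer is what catches tokens such as "3.0abc", for which strtod alone would return 3.0 without signalling an error.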
bool moab::FileTokenizer::get_doubles ( size_t  count,
double *  array 
)

Parse a sequence of double values.

Read the specified number of space-delimited doubles.

Parameters:
count  The number of values to read.
array  The memory at which to store the values.
Returns:
true if successful, false otherwise.

Definition at line 240 of file FileTokenizer.cpp.

References get_double_internal().

Referenced by moab::ReadGmsh::load_file(), moab::ReadVtk::read_vertices(), moab::ReadVtk::vtk_read_field(), moab::ReadVtk::vtk_read_rectilinear_grid(), moab::ReadVtk::vtk_read_structured_points(), and moab::ReadVtk::vtk_read_tag_data().

{
    for( size_t i = 0; i < count; ++i )
    {
        if( !get_double_internal( *array ) ) return false;
        ++array;
    }

    return true;
}
bool moab::FileTokenizer::get_float_internal ( float &  result) [private]

Internal implementation of get_floats

Definition at line 152 of file FileTokenizer.cpp.

References get_double_internal().

Referenced by get_floats().

{
    double d;
    if( !get_double_internal( d ) ) return false;

    result = (float)d;

    return true;
}
bool moab::FileTokenizer::get_floats ( size_t  count,
float *  array 
)

Parse a sequence of float values.

Read the specified number of space-delimited floats.

Parameters:
count  The number of values to read.
array  The memory at which to store the values.
Returns:
true if successful, false otherwise.

Definition at line 229 of file FileTokenizer.cpp.

References get_float_internal().

Referenced by moab::ReadSTL::ascii_read_triangles().

{
    for( size_t i = 0; i < count; ++i )
    {
        if( !get_float_internal( *array ) ) return false;
        ++array;
    }

    return true;
}
bool moab::FileTokenizer::get_integer_internal ( int &  result) [private]

Internal implementation of get_integers

Definition at line 203 of file FileTokenizer.cpp.

References get_long_int_internal(), line_number(), and MB_SET_ERR_RET_VAL.

Referenced by get_integers().

{
    long i;
    if( !get_long_int_internal( i ) ) return false;

    result = (int)i;
    if( i != (long)result ) MB_SET_ERR_RET_VAL( "Numeric overflow at line " << line_number(), false );

    return true;
}
bool moab::FileTokenizer::get_integers ( size_t  count,
int *  array 
)

Parse a sequence of integer values.

Read the specified number of space-delimited ints.

Parameters:
count  The number of values to read.
array  The memory at which to store the values.
Returns:
true if successful, false otherwise.

Definition at line 273 of file FileTokenizer.cpp.

References get_integer_internal().

Referenced by moab::ReadGmsh::load_file(), moab::ReadVtk::vtk_read_tag_data(), and moab::ReadVtk::vtk_read_texture_attrib().

{
    for( size_t i = 0; i < count; ++i )
    {
        if( !get_integer_internal( *array ) ) return false;
        ++array;
    }

    return true;
}
bool moab::FileTokenizer::get_long_int_internal ( long &  result) [private]

Internal implementation of get_long_ints

Definition at line 162 of file FileTokenizer.cpp.

References get_string(), line_number(), and MB_SET_ERR_RET_VAL.

Referenced by get_byte_internal(), get_integer_internal(), get_long_ints(), and get_short_int_internal().

{
    // Get a token
    const char *token_end, *token = get_string();
    if( !token ) return false;

    // Parse token as long
    result = strtol( token, (char**)&token_end, 0 );

    // If the one past the last char read by strtol is
    // not the NULL character terminating the string,
    // then parse failed.
    if( *token_end )
        MB_SET_ERR_RET_VAL( "Syntax error at line " << line_number() << ": expected number, got \"" << token << "\"",
                            false );

    return true;
}
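Because the listing passes 0 as the base, strtol chooses the base from the token's prefix. The standalone sketch below shows the consequence; `parse_long_token` is a hypothetical name for this illustration, and the empty-token check is an addition of the sketch.

```cpp
#include <cstdlib>

// Hypothetical helper for illustration only; not part of MOAB.
bool parse_long_token( const char* token, long& result )
{
    if( token[0] == '\0' ) return false;

    char* end = nullptr;
    // Base 0 lets strtol pick the base from the prefix:
    // "0x"/"0X" means hexadecimal, a leading "0" means octal, else decimal.
    result = std::strtol( token, &end, 0 );

    // Valid only if the whole token was consumed.
    return *end == '\0';
}
```

With this rule "0x1F" parses as 31 and "010" as 8, while "08" fails because 8 is not an octal digit. This differs from the double parser, which rejects hexadecimal tokens explicitly.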
bool moab::FileTokenizer::get_long_ints ( size_t  count,
long *  array 
)

Parse a sequence of integer values.

Read the specified number of space-delimited ints.

Parameters:
count  The number of values to read.
array  The memory at which to store the values.
Returns:
true if successful, false otherwise.

Definition at line 284 of file FileTokenizer.cpp.

References get_long_int_internal().

Referenced by moab::ReadVtk::load_file(), moab::ReadGmsh::load_file(), moab::ReadVtk::vtk_read_color_attrib(), moab::ReadVtk::vtk_read_field(), moab::ReadVtk::vtk_read_field_attrib(), moab::ReadVtk::vtk_read_polydata(), moab::ReadVtk::vtk_read_polygons(), moab::ReadVtk::vtk_read_rectilinear_grid(), moab::ReadVtk::vtk_read_structured_grid(), moab::ReadVtk::vtk_read_structured_points(), and moab::ReadVtk::vtk_read_unstructured_grid().

{
    for( size_t i = 0; i < count; ++i )
    {
        if( !get_long_int_internal( *array ) ) return false;
        ++array;
    }

    return true;
}
bool moab::FileTokenizer::get_newline ( bool  report_error = true)

check for newline

Consume whitespace up to and including the next newline. If a non-space character is found before a newline, the function will stop, set the error message, and return false.

Returns:
True if a newline was found before any non-space character. False otherwise.

Definition at line 372 of file FileTokenizer.cpp.

References buffer, bufferEnd, eof(), filePtr, lastChar, line_number(), lineNumber, MB_SET_ERR_RET_VAL, and nextToken.

Referenced by moab::ReadGmsh::load_file(), moab::ReadVtk::vtk_read_polydata(), moab::ReadVtk::vtk_read_polygons(), moab::ReadVtk::vtk_read_rectilinear_grid(), moab::ReadVtk::vtk_read_structured_grid(), moab::ReadVtk::vtk_read_structured_points(), and moab::ReadVtk::vtk_read_unstructured_grid().

{
    if( lastChar == '\n' )
    {
        lastChar = ' ';
        ++lineNumber;
        return true;
    }

    // Loop until either we a) find a newline, b) find a non-whitespace
    // character or c) reach the end of the file.
    for( ;; )
    {
        // If the buffer is empty, read more.
        if( nextToken == bufferEnd )
        {
            size_t count = fread( buffer, 1, sizeof( buffer ), filePtr );
            if( 0 == count )
            {
                if( eof() )
                    MB_SET_ERR_RET_VAL( "File truncated at line " << line_number(), false );
                else
                    MB_SET_ERR_RET_VAL( "I/O Error", false );
            }

            nextToken = buffer;
            bufferEnd = buffer + count;
        }

        // If the current character is not a space, then we've failed.
        if( !isspace( *nextToken ) )
            if( report_error ) MB_SET_ERR_RET_VAL( "Expected newline at line " << line_number(), false );

        // If the current space character is a newline,
        // increment the line number count.
        if( *nextToken == '\n' )
        {
            ++lineNumber;
            ++nextToken;
            lastChar = ' ';
            return true;
        }
        ++nextToken;
    }

    return false;
}
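The documented contract (consume whitespace up to and including the next newline, fail on anything else) can be sketched over a NUL-terminated string. `consume_newline` is a hypothetical name for this illustration. It is a simplification that leaves out the buffer refills and the lastChar bookkeeping shown in the listing.

```cpp
#include <cctype>

// Hypothetical helper for illustration only; not part of MOAB.
// Advances 'cursor' past blanks up to and including the next '\n' and
// increments 'line'. Returns false, leaving 'cursor' on the offending
// character, if a non-space character comes first. Also returns false
// if the text ends before a newline is found.
bool consume_newline( const char*& cursor, int& line )
{
    for( ; *cursor != '\0'; ++cursor )
    {
        if( *cursor == '\n' )
        {
            ++cursor;
            ++line;
            return true;
        }
        if( !std::isspace( static_cast< unsigned char >( *cursor ) ) ) return false;
    }
    return false;  // ran out of input before finding a newline
}
```

The newline test comes before the isspace test because '\n' is itself whitespace and would otherwise be skipped like a blank.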
bool moab::FileTokenizer::get_short_int_internal ( short &  result) [private]

Internal implementation of get_short_ints

Definition at line 192 of file FileTokenizer.cpp.

References get_long_int_internal(), line_number(), and MB_SET_ERR_RET_VAL.

Referenced by get_short_ints().

{
    long i;
    if( !get_long_int_internal( i ) ) return false;

    result = (short)i;
    if( i != (long)result ) MB_SET_ERR_RET_VAL( "Numeric overflow at line " << line_number(), false );

    return true;
}
bool moab::FileTokenizer::get_short_ints ( size_t  count,
short *  array 
)

Parse a sequence of integer values.

Read the specified number of space-delimited ints.

Parameters:
count  The number of values to read.
array  The memory at which to store the values.
Returns:
true if successful, false otherwise.

Definition at line 262 of file FileTokenizer.cpp.

References get_short_int_internal().

{
    for( size_t i = 0; i < count; ++i )
    {
        if( !get_short_int_internal( *array ) ) return false;
        ++array;
    }

    return true;
}

const char * moab::FileTokenizer::get_string ( )

get next token

Get the next whitespace-delimited token from the file. NOTE: The returned string is only valid until the next call to any of the functions in this class that read from the file.

Returns:
A pointer to the buffer space containing the string, or NULL if an error occurred.

Definition at line 45 of file FileTokenizer.cpp.

References buffer, bufferEnd, filePtr, lastChar, lineNumber, MB_SET_ERR_RET_VAL, and nextToken.

Referenced by get_boolean_internal(), get_double_internal(), get_long_int_internal(), moab::ReadGmsh::load_file(), match_token(), moab::ReadVtk::vtk_read_attrib_data(), moab::ReadVtk::vtk_read_field(), moab::ReadVtk::vtk_read_field_attrib(), and moab::ReadVtk::vtk_read_scalar_attrib().

{
    // If the whitespace character marking the end of the
    // last token was a newline, increment the line count.
    if( lastChar == '\n' ) ++lineNumber;

    // Loop until either found the start of a token to return or have
    // reached the end of the file.
    for( ;; )
    {
        // If the buffer is empty, read more.
        if( nextToken == bufferEnd )
        {
            size_t count = fread( buffer, 1, sizeof( buffer ) - 1, filePtr );
            if( 0 == count )
            {
                if( feof( filePtr ) )
                    return NULL;
                else
                    MB_SET_ERR_RET_VAL( "I/O Error", NULL );
            }

            nextToken = buffer;
            bufferEnd = buffer + count;
        }

        // If the current character is not a space, we've found a token.
        if( !isspace( *nextToken ) ) break;

        // If the current space character is a newline,
        // increment the line number count.
        if( *nextToken == '\n' ) ++lineNumber;
        ++nextToken;
    }

    // Store the start of the token in "result" and
    // advance "nextToken" to one past the end of the
    // token.
    char* result = nextToken;
    while( nextToken != bufferEnd && !isspace( static_cast< unsigned char >( *nextToken ) ) )
        ++nextToken;

    // If we have reached the end of the buffer without finding
    // a whitespace character terminating the token, we need to
    // read more from the file.  Only try once.  If the token is
    // too large to fit in the buffer, give up.
    if( nextToken == bufferEnd )
    {
        // Shift the (possibly) partial token to the start of the buffer.
        size_t remaining = bufferEnd - result;
        memmove( buffer, result, remaining );
        result    = buffer;
        nextToken = result + remaining;

        // Fill the remainder of the buffer after the token.
        size_t count = fread( nextToken, 1, sizeof( buffer ) - remaining - 1, filePtr );
        if( 0 == count && !feof( filePtr ) ) MB_SET_ERR_RET_VAL( "I/O Error", NULL );
        bufferEnd = nextToken + count;

        // Continue to advance nextToken until we find the space
        // terminating the token.
        while( nextToken != bufferEnd && !isspace( *nextToken ) )
            ++nextToken;

        if( nextToken == bufferEnd )
        {  // EOF
            *bufferEnd = '\0';
            ++bufferEnd;
        }
    }

    // Save terminating whitespace character (or NULL char if EOF).
    lastChar = *nextToken;
    // Put null in buffer to mark end of current token.
    *nextToken = '\0';
    // Advance nextToken to the next character to search next time.
    ++nextToken;

    return result;
}
bool moab::FileTokenizer::match_token ( const char *  string,
bool  print_error = true 
)
int moab::FileTokenizer::match_token ( const char *const *  string_list,
bool  print_error = true 
)

Match the current token to one of an array of strings. Sets the error message if the current token doesn't match any of the input strings.

Parameters:
string_list  A NULL-terminated array of strings.
Returns:
One greater than the index of the matched string, or zero if no match.

Definition at line 338 of file FileTokenizer.cpp.

References get_string(), line_number(), and MB_SET_ERR_CONT.

{
    // Get a token
    const char* token = get_string();
    if( !token ) return 0;

    // Check if it matches any input string
    const char* const* ptr;
    for( ptr = list; *ptr; ++ptr )
    {
        if( 0 == strcmp( token, *ptr ) ) return ptr - list + 1;
    }

    if( !print_error ) return 0;

    // No match, construct error message
    std::string message( "Parsing error at line " );
    char lineno[16];
    sprintf( lineno, "%d", line_number() );
    message += lineno;
    message += ": expected one of {";
    for( ptr = list; *ptr; ++ptr )
    {
        message += " ";
        message += *ptr;
    }
    message += " } got \"";
    message += token;
    message += "\"";
    MB_SET_ERR_CONT( message.c_str() );

    return 0;
}
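The return convention (one greater than the index of the match, zero for no match) can be sketched without the tokenizer or the error reporting. `match_in_list` is a hypothetical name for this illustration, and the keyword list in the usage note is example data only.

```cpp
#include <cstring>

// Hypothetical helper for illustration only; not part of MOAB.
// 'list' must end with a null pointer. Returns the 1-based position of
// the first entry equal to 'token', or 0 if there is none.
int match_in_list( const char* token, const char* const* list )
{
    for( const char* const* ptr = list; *ptr; ++ptr )
        if( 0 == std::strcmp( token, *ptr ) ) return static_cast< int >( ptr - list ) + 1;
    return 0;
}
```

Given `const char* const keywords[] = { "POINTS", "CELLS", "CELL_TYPES", nullptr };`, the call `match_in_list( "CELLS", keywords )` returns 2. Reserving zero for "no match" lets the result be tested as a boolean and used directly in a switch.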

void moab::FileTokenizer::unget_token ( )

Put current token back in buffer. Can only unget one token.

Definition at line 306 of file FileTokenizer.cpp.

References buffer, lastChar, and nextToken.

Referenced by moab::ReadVtk::load_file(), and moab::ReadVtk::vtk_read_scalar_attrib().

{
    if( nextToken - buffer < 2 ) return;

    --nextToken;
    *nextToken = lastChar;
    --nextToken;
    while( nextToken > buffer && *nextToken )
        --nextToken;

    if( !*nextToken ) ++nextToken;

    lastChar = '\0';
}
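The rewind works because the tokenizer terminates each token in place and remembers the delimiter it overwrote. A minimal sketch of that mechanism on a writable in-memory buffer, using a hypothetical `InPlaceTokenizer` struct that is not part of MOAB and does no file I/O:

```cpp
#include <cctype>
#include <cstring>

// Hypothetical stand-in for illustration only; not part of MOAB.
// Tokenizes a writable, NUL-terminated buffer in place.
struct InPlaceTokenizer
{
    char* start;  // first byte of the buffer
    char* next;   // where the next search begins
    char* limit;  // position of the buffer's terminating NUL
    char saved;   // delimiter overwritten by the last next_token() call

    explicit InPlaceTokenizer( char* text )
        : start( text ), next( text ), limit( text + std::strlen( text ) ), saved( '\0' )
    {
    }

    static bool is_space( char c ) { return std::isspace( static_cast< unsigned char >( c ) ) != 0; }

    // Returns the next token, or nullptr when the buffer is exhausted.
    char* next_token()
    {
        while( next < limit && is_space( *next ) )
            ++next;
        if( next >= limit ) return nullptr;
        char* token = next;
        while( next < limit && !is_space( *next ) )
            ++next;
        saved = *next;  // whitespace, or '\0' at the end of the buffer
        *next = '\0';   // terminate the token in place
        ++next;
        return token;
    }

    // Puts the last token back. Only one token can be restored.
    void unget()
    {
        if( next - start < 2 ) return;  // nothing has been returned yet
        --next;
        *next = saved;  // restore the overwritten delimiter
        --next;         // last character of the token
        while( next > start && *next != '\0' )
            --next;
        if( *next == '\0' ) ++next;  // stepped onto the previous token's terminator
        saved = '\0';
    }
};
```

Only one delimiter is remembered, so a second consecutive unget cannot restore the terminator of the token before it. In-place termination is also why a returned pointer stops being valid once later reads reuse the buffer.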

Member Data Documentation

char moab::FileTokenizer::buffer[512] [private]

Input buffer

Definition at line 223 of file FileTokenizer.hpp.

Referenced by get_newline(), get_string(), and unget_token().

char* moab::FileTokenizer::bufferEnd [private]

One past the last used byte of the buffer

Definition at line 228 of file FileTokenizer.hpp.

Referenced by eof(), get_binary(), get_newline(), and get_string().

std::FILE* moab::FileTokenizer::filePtr [private]

Pointer to standard C FILE struct

Definition at line 220 of file FileTokenizer.hpp.

Referenced by eof(), get_binary(), get_newline(), get_string(), and ~FileTokenizer().

char moab::FileTokenizer::lastChar [private]

The whitespace character marking the end of the last returned token. Saved here because if it is a newline, the line count will need to be incremented when the next token is returned.

Definition at line 238 of file FileTokenizer.hpp.

Referenced by get_newline(), get_string(), and unget_token().

int moab::FileTokenizer::lineNumber [private]

Line number of last returned token

Definition at line 231 of file FileTokenizer.hpp.

Referenced by get_newline(), get_string(), and line_number().

char* moab::FileTokenizer::nextToken [private]

One past the end of the last token returned

Definition at line 226 of file FileTokenizer.hpp.

Referenced by eof(), get_binary(), get_newline(), get_string(), and unget_token().
