![]() |
Mesh Oriented datABase
(version 5.4.1)
Array-based unstructured mesh datastructure
|
00001 /*! \page developerguide Developer's Guide
00002
00003 \subpage dg-contents
00004
00005 \subpage dg-figures
00006
00007 */
00008
00009 /*! \page dg-figures List of Figures
00010
00011 \ref figure1
00012
00013 \ref figure2
00014
00015 \ref figure3
00016 */
00017
00018
00019 /*! \page dg-contents Table of Contents
00020
00021 \ref sequence
00022
00023 \ref manager
00024
00025 \ref s-mesh
00026
00027 \ref sets
00028
00029 \ref impl-error-handling
00030
00031 \ref dgfiveone
00032
00033 \ref dgfivetwo
00034
00035 \ref dgfivethree
00036
00037 \ref dgfivefour
00038
00039 \ref dgfivefive
00040
00041 \ref dgfivesix
00042
00043 \section sequence 1.EntitySequence & SequenceData
00044
00045 \subsection figure1 Figure 1: EntitySequences For One SequenceData
00046 \image html figure1.jpg
00047
00048 \ref dg-figures "List of Figures"
00049
00050 The SequenceData class manages as set of arrays of per-entity values. Each
00051 SequenceData has a start and end handle denoting the block of entities for which
00052 the arrays contain data. The arrays managed by a SequenceData instance are
00053 divided into three groups:
00054
00055 - Type-specific data (connectivity, coordinates, etc.): zero or more arrays.
00056 - Adjacency data: zero or one array.
00057 - Dense tag data: zero or more arrays.
00058 .
00059
00060 The abstract EntitySequence class is a non-strict subset of a SequenceData.
00061 It contains a pointer to a SequenceData and the start and end handles to indicate
00062 the subset of the referenced SequenceData. The EntitySequence class is
00063 used to represent the regions of valid (or allocated) handles in a SequenceData.
00064 A SequenceData is expected to be referenced by one or more EntitySequence
00065 instances.
00066
00067 Initial EntitySequence and SequenceData pairs are typically created in one
00068 of two configurations. When reading from a file, a SequenceData will be created
00069 to represent all of a single type of entity contained in a file. As all entries in the SequenceData correspond to valid handles (entities read from the file) a single
00070 EntitySequence instance corresponding to the entire SequenceData is initially
00071 created. The second configuration arises when allocating a single entity. If no
00072 entities have been allocated yet, a new SequenceData must be created to store
00073 the entity data. It is created with a constant size (e.g. 4k entities). The new
00074 EntitySequence corresponds to only the first entity in the SequenceData: the
00075 one allocated entity. As subsequent entities are allocated, the EntitySequence
00076 is extended to cover more of the corresponding SequenceData.
00077
00078 Concrete subclasses of the EntitySequence class are responsible for representing
00079 specific types of entities using the array storage provided by the
00080 SequenceData class. They also handle allocating SequenceData instances with
00081 appropriate arrays for storing a particular type of entity. Each concrete subclass
00082 typically provides two constructors corresponding to the two initial allocation
00083 configurations described in the previous paragraph. EntitySequence implementations
00084 also provide a split method, which is a type of factory method. It
00085 modifies the called sequence and creates a new sequence such that the range of
00086 entities represented by the original sequence is split.
00087
00088 The VertexSequence class provides an EntitySequence for storing vertex
00089 data. It references a SequenceData containing three arrays of doubles
00090 for storing the blocked vertex coordinate data. The ElementSequence class
00091 extends the EntitySequence interface with element-specific functionality. The
00092 UnstructuredElemSeq class is the concrete implementation of ElementSequence
00093 used to represent unstructured elements, polygons, and polyhedra. MeshSetSequence
00094 is the EntitySequence used for storing entity sets.
00095
00096 Each EntitySequence implementation also provides an implementation of
00097 the values per entity method. This value is used to determine if an existing
00098 SequenceData that has available entities is suitable for storing a particular
00099 entity. For example, UnstructuredElemSeq returns the number of nodes per element
00100 from values per entity. When allocating a new element with a specific
00101 number of nodes, this value is used to determine if that element may be stored
00102 in a specific SequenceData. For vertices, this value is always zero. This could
00103 be changed to the number of coordinates per vertex, allowing representation of
00104 mixed-dimension data. However, API changes would be required to utilize such
00105 a feature. Sequences for which the corresponding data cannot be used to store
00106 new entities (e.g. structured mesh discussed in a later section) will return -1 or
00107 some other invalid value.
00108
00109 \ref dg-contents "Top"
00110
00111 \section manager 2.TypeSequenceManager & SequenceManager
00112
00113 The TypeSequenceManager class maintains an organized set of EntitySequence
00114 instances and corresponding SequenceData instances. It is used to manage
00115 all such instances for entities of a single EntityType. TypeSequenceManager
00116 enforces the following four rules on its contained data:
00117
00118 -# No two SequenceData instances may overlap.
00119 -# No two EntitySequence instances may overlap.
00120 -# Every EntitySequence must be a subset of a SequenceData.
00121 -# Any pair of EntitySequence instances referencing the same SequenceData must be separated by at least one unallocated handle.
00122 .
00123
00124 \subsection figure2 Figure 2: SequenceManager and Related Classes
00125 \image html figure2.jpg
00126
00127 \ref dg-figures "List of Figures"
00128
00129 The first three rules are required for the validity of the data model. The
00130 fourth rule avoids unnecessary inefficiency. It is implemented by merging such
00131 adjacent sequences. In some cases, other classes (e.g. SequenceManager) can
00132 modify an EntitySequence such that the fourth rule is violated. In such cases,
00133 the TypeSequenceManager::notify prepended or TypeSequenceManager::notify appended
00134 method must be called to maintain the integrity of the data1. The above rules
00135 (including the fourth) are assumed in many other methods of the TypeSequenceManager
00136 class, such that those methods will fail or behave unexpectedly if the managed
00137 data does not conform to the rules.
00138
00139 TypeSequenceManager contains three principal data structures. The first is
00140 a std::set of EntitySequence pointers sorted using a custom comparison
00141 operator that queries the start and end handles of the referenced sequences. The
00142 comparison operation is defined as: a->end_handle() < b->start_handle().
00143 This method of comparison has the advantage that a sequence corresponding to
00144 a specific handle can be located by searching the set for a “sequence” beginning
00145 and ending with the search value. The lower bound and find methods provided
00146 by the library are guaranteed to return the sequence, if it exists. Using
00147 such a comparison operator will result in undefined behavior if the set contains
00148 overlapping sequences. This is acceptable, as rule two above prohibits such
00149 a configuration. However, some care must be taken in writing and modifying
00150 methods in TypeSequenceManager so as to avoid having overlapping sequences
00151 as a transitory state of some operation.
00152
00153 The second important data member of TypeSequenceManager is a pointer
00154 to the last referenced EntitySequence. This “cached” value is used to speed up
00155 searches by entity handle. This pointer is never null unless the sequence is empty.
00156 This rule is maintained to avoid unnecessary branches in fast query paths. In
00157 cases where the last referenced sequence is deleted, TypeSequenceManager will
00158 typically assign an arbitrary sequence (e.g. the first one) to the last referenced
00159 pointer. (Note: this cached value might give problems for threading models; it
00160 should probably be different for each thread)
00161
00162 The third data member of TypeSequenceManager is a std::set of SequenceData
00163 instances that are not completely covered by a EntitySequence instance2.
00164 This list is searched when allocating new handles. TypeSequenceManager also
00165 embeds in each SequenceData instance a reference to the first corresponding
00166 EntitySequence so that it may be located quickly from only the SequenceData
00167 pointer.
00168
00169 The SequenceManager class contains an array of TypeSequenceManager in-
00170 stances, one for each EntityType. It also provides all type-specific operations
00171 such as allocating the correct EntitySequence subtype for a given EntityType.
00172
00173 1This source of potential error can be eliminated with changes to the entity set representation.
00174 This is discussed in a later section.
00175
00176 2Given rule four for the data managed by a TypeSequenceManager, any
00177 SequenceData for which all handles are allocated will be referenced by exactly one EntitySequence.
00178
00179 \ref dg-contents "Top"
00180
00181 \section s-mesh 3.Structured Mesh
00182
00183 Structured mesh storage is implemented using subclasses of SequenceData:
00184 ScdElementData and ScdVertexData. The StructuredElementSeq class is
00185 used to access the structured element connectivity. A standard VertexSequence
00186 instance is used to access the ScdVertexData because the vertex data storage
00187 is the same as for unstructured mesh.
00188
00189 \ref dg-contents "Top"
00190
00191 \section sets 4.Entity Sets
00192
00193 - MeshSetSequence
00194
00195 The MeshSetSequence class is the same as most other subclasses of EntitySequence
00196 in that it utilizes SequenceData to store its data. A single array in the SequenceData
00197 is used to store instances of the MeshSet class, one per allocated EntityHandle.
00198 SequenceData allocates all of its managed arrays using malloc and free as
00199 simple arrays of bytes. MeshSetSequence does in-place construction and
00200 destruction of MeshSet instances within that array. This is similar to what is
00201 done by std::vector and other container classes that may own more storage
00202 than is required at a given time for contained objects.
00203
00204 - MeshSet
00205
00206 \subsection figure3 Figure 3: SequenceManager and Related Classes
00207 \image html figure3.jpg
00208
00209 \ref dg-figures "List of Figures"
00210
00211 The MeshSet class is used to represent a single entity set instance in MOAB.
00212 The class is optimized to minimize storage (further possible improvements in
00213 storage size are discussed later.)
00214
00215 Figure 3 shows the memory layout of an instance of the MeshSet class.
00216 The flags member holds the set creation bit flags: MESHSET_TRACK_OWNER,
00217 MESHSET_SET, and MESHSET_ORDERED. The presence of the MESHSET_TRACK_OWNER
00218 indicates that reverse links from the contained entities back to the owning set
00219 should be maintained in the adjacency list of each entity. The MESHSET_SET
00220 and MESHSET_ORDERED bits are mutually exclusive, and as such most code only
00221 tests for the MESHSET_ORDERED, meaning that in practice the MESHSET_SET bit is
00222 ignored. MESHSET_ORDERED indicates that the set may contain duplicate handles
00223 and that the order that the handles are added to the set should be preserved.
00224 In practice, such sets are stored as a simple list of handles. MESHSET_SET (or in
00225 practice, the lack of MESHSET_ORDERED) indicates that the order of the handles
00226 need not be preserved and that the set may not contain duplicate handles. Such
00227 sets are stored in a sorted range-compacted format similar to that of the Range
00228 class.
00229
00230 The memory for storing contents, parents, and children are each handled in
00231 the same way. The data in the class is composed of a 2-bit ‘size’ field and two
00232 values, where the two values may either be two handles or two pointers. The size
00233 bit-fields are grouped together to reduce the required amount of memory. If the
00234 numerical value of the 2-bit size field is 0 then the corresponding list is empty.
00235 If the 2-bit size field is either 1 or 2, then the contents of the corresponding list
00236 are stored directly in the corresponding two data fields of the MeshSet object.
00237 If the 2-bit size field has a value of 3 (11 binary), then the corresponding two
00238 data fields store the begin and end pointers of an external array of handles.
00239 The number of handles in the external array can be obtained by taking the
00240 difference of the start and end pointers. Note that unlike std::vector, we
00241 do not store both an allocated and used size. We store only the ‘used’ size
00242 and call std::realloc whenever the used size is modified, thus we rely on the
00243 std::malloc implementation in the standard C library to track ‘allocated’ size
00244 for us. In practice this performs well but does not return memory to the ‘system’
00245 when lists shrink (unless they shrink to zero). This overall scheme could exhibit
00246 poor performance if the size of one of the data lists in the set frequently changes
00247 between less than two and more than two handles, as this will result in frequent
00248 releasing and re-allocating of the memory for the corresponding array.
00249
00250 If the MESHSET_ORDERED flag is not present, then the set contents list (parent
00251 and child lists are unaffected) is stored in a range-compacted format. In this
00252 format the number of handles stored in the array is always a multiple of two.
00253 Each consecutive pair of handles indicate the start and end, inclusive, of a range
00254 of handles contained in the set. All such handle range pairs are stored in sorted
00255 order and do not overlap. Nor is the end handle of one range ever one less than
00256 the start handle of the next. All such ‘adjacent’ range pairs are merged into a
00257 single pair. The code for insertion and removal of handles from range-formatted
00258 set content lists is fairly complex. The implementation will guarantee that a
00259 given call to insert entities into a range or remove entities from a range is never
00260 worse than O(ln n) + O(m + n), where ‘n’ is the number of handles to insert
00261 and ‘m’ is the number of handles already contained in the set. So it is generally
00262 much more efficient to build Ranges of handles to insert (and remove) and call
00263 MOAB to insert (or remove) the entire list at once rather than making many
00264 calls to insert (or remove) one or a few handles from the contents of a set.
00265 The set storage could probably be further minimized by allowing up to six
00266 handles in one of the lists to be elided. That is, as there are six potential ‘slots’
00267 in the MeshSet object then if two of the lists are empty it should be possible to
00268 store up to six values of the remaining list directly in the MeshSet object.
00269 However, the additional runtime cost of such complexity could easily outweigh
00270 any storage advantage. Further investigation into this has not been done because
00271 the primary motivation for the storage optimization was to support binary trees.
00272
00273 Another possible optimization of storage would be to remove the MeshSet
00274 object entirely and instead store the data in a ‘blocked’ format. The corresponding
00275 SequenceData would contain four arrays: flags, parents, children, and
00276 contents instead of a single array of MeshSet objects. If this were done then
00277 no storage need ever be allocated for parent or child links if none of the sets
00278 in a SequenceData has parent or child links. The effectiveness of the storage
00279 reduction would depend greatly on how sets get grouped into SequenceDatas.
00280 This alternate storage scheme might also allow for better cache utilization as it
00281 would group like data together. It is often the case that application code that
00282 is querying the contents of one set will query the contents of many but never
00283 query the parents or children of any set. Or that an application will query only
00284 parent or child links of a set without every querying other set properties. The
00285 downside of this solution is that it makes the implementation a little less mod-
00286 ular and maintainable because the existing logic contained in the MeshSet class
00287 would need to be spread throughout the MeshSetSequence class.
00288
00289 \ref dg-contents "Top"
00290
00291 \section impl-error-handling 5.Implementation of Error Handling
00292
00293 When a certain error occurs, a MOAB routine can return an enum type ErrorCode (defined in src/moab/Types.hpp)
00294 to its callers. Since MOAB 4.8, the existing error handling model has been completely redesigned to better set
00295 and check errors.
00296
00297 \subsection dgfiveone 5.1. Existing Error Handling Model
00298
00299 To keep track of detail information about errors, a class Error (defined in src/moab/Error.hpp) is used to
00300 store corresponding error messages. Some locally defined macros call Error::set_last_error() to report a new
00301 error message, before a non-success error code is returned. At any time, user may call Core::get_last_error()
00302 to retrieve the latest error message from the Error class instance held by MOAB.
00303
00304 Limitations:
00305 - The Error class can only store the last error message that is being set. When an error originates from a lower
00306 level in a call hierarchy, upper level callers might continue to report more error messages. As a result, previously
00307 reported error messages get overwritten and they will no longer be available to the user.
00308 - There is no useful stack trace for the user to find out where an error first occurs, making MOAB difficult to debug.
00309
00310 \subsection dgfivetwo 5.2. Enhanced Error Handling Model
00311
00312 The error handling model of PETSc (http://www.mcs.anl.gov/petsc/) has nice support for stack trace, and our design has
00313 borrowed many ideas from that. The new features of the enhanced model include:
00314 - Lightweight and easy to use with a macro-based interface
00315 - Generate a stack trace starting from the first non-success error
00316 - Support C++ style streams to build up error messages rather than C style sprintf:
00317 \code
00318 MB_SET_ERR(MB_FAILURE, "Failed to iterate over tag on " << NUM_VTX << " vertices");
00319 \endcode
00320 - Have preprocessor-like functionality such that we can do something like:
00321 \code
00322 result = MOABRoutine(...);MB_CHK_SET_ERR(result, "Error message to set if result is not MB_SUCCESS");
00323 \endcode
00324 - Define and handle globally fatal errors or per-processor specific errors.
00325
00326 The public include file for error handling is src/moab/ErrorHandler.hpp, the source code for the error
00327 handling is in src/ErrorHandler.cpp.
00328
00329 \subsection dgfivethree 5.3. Error Handler
00330
00331 The error handling function MBError() only calls one default error handler, MBTraceBackErrorHandler(), which tries to print
00332 out a stack trace. In the future, we need to provide a callback function to user routine before a complete abort. Something
00333 like a UserTeardown that is a function pointer with a context so that the user can destroy and free essential handles before
00334 an MPI abort.
00335
00336 The arguments to MBTraceBackErrorHandler() are the line number where the error occurred, the function where error was detected,
00337 the file in which the error was detected, the source directory, the error message, and the error type indicating whether the
00338 error message should be printed.
00339 \code
00340 ErrorCode MBTraceBackErrorHandler(int line, const char* func, const char* file, const char* dir, const char* err_msg, ErrorType err_type);
00341 \endcode
00342 This handler will print out a line of stack trace, indicating line number, function name, directory and file name. If MB_ERROR_TYPE_EXISTING
00343 is passed as the error type, the error message will not be printed.
00344
00345 \subsection dgfivefour 5.4. Simplified Interface
00346
00347 The simplified C/C++ macro-based interface consists of the following three basic macros:
00348 \code
00349 MB_SET_ERR(Error code, "Error message");
00350 MB_CHK_ERR(Error code);
00351 MB_CHK_SET_ERR(Error code, "Error message");
00352 \endcode
00353
00354 The macro MB_SET_ERR(err_code, err_msg) is given by
00355 \code
00356 std::ostringstream err_ostr;
00357 err_ostr << err_msg;
00358 return MBError(__LINE__, __func__, __FILENAME__, __SDIR__, err_code, err_ostr.str().c_str(), MB_ERROR_TYPE_NEW_LOCAL);
00359 \endcode
00360 It calls the error handler with the current function name and location: line number, file and directory, plus an error code,
00361 an error message and an error type. With an embedded std::ostringstream object, it supports C++ style streams to build up error
00362 messages. The error type is MB_ERROR_TYPE_NEW_LOCAL/MB_ERROR_TYPE_NEW_GLOBAL on detection of the initial error and MB_ERROR_TYPE_EXISTING
00363 for any additional calls. This is so that the detailed error information is only printed once instead of for all levels of returned errors.
00364
00365 The macro MB_CHK_ERR(err_code) is defined by
00366 \code
00367 if (MB_SUCCESS != err_code)
00368 return MBError(__LINE__, __func__, __FILENAME__, __SDIR__, err_code, "", MB_ERROR_TYPE_EXISTING);
00369 \endcode
00370 It checks the error code, if not MB_SUCCESS, calls the error handler with error type MB_ERROR_TYPE_EXISTING and return.
00371
00372 The MB_CHK_SET_ERR(err_code, err_msg) is defined by
00373 \code
00374 if (MB_SUCCESS != err_code)
00375 MB_SET_ERR(err_code, err_msg);
00376 \endcode
00377 It checks the error code, if not MB_SUCCESS, calls MB_SET_ERR() to set a new error.
00378
00379 To obtain correct line numbers in the stack trace, we recommend putting MB_CHK_ERR() and MB_CHK_SET_ERR() at the same line as a MOAB routine.
00380
00381 In addition to the basic macros mentioned above, there are some variations, such as (for more information, refer to src/moab/ErrorHandler.hpp):
00382 - MB_SET_GLB_ERR() to set a globally fatal error (for all processors)
00383 - MB_SET_ERR_RET() for functions that return void type
00384 - MB_SET_ERR_RET_VAL() for functions that return any data type
00385 - MB_SET_ERR_CONT() to continue execution instead of returning from current function
00386 These macros should be used appropriately in MOAB source code depending on the context.
00387
00388 \subsection dgfivefive 5.5. Embedded Parallel Functionality
00389
00390 We define a global MPI rank with which to prefix the output, as most systems have mechanisms for separating output by rank anyway.
00391 For the error handler, we can pass error type MB_ERROR_TYPE_NEW_GLOBAL for globally fatal errors and MB_ERROR_TYPE_NEW_LOCAL for
00392 per-processor relevant errors.
00393
00394 Note, if the error handler uses std::cout to print error messages and stack traces in each processor, it can result in a messy output.
00395 This is a known MPI issue with std::cout, and existing DebugOutput class has solved this issue with buffered lines. A new class
00396 ErrorOutput (implemented similar to DebugOutput) is used by the error handler to print each line prefixed with the MPI rank.
00397
00398 \subsection dgfivesix 5.6. Handle Non-error Conditions
00399
00400 We should notice that sometimes ErrorCode is used to return a non-error condition (some internal error code that can be handled, or even expected,
00401 e.g. MB_TAG_NOT_FOUND). Therefore, MB_SET_ERR() should be appropriately placed to report an error to the the caller. Before it is used, we need to
00402 carefully decide whether that error is intentional. For example, a lower level MOAB routine that could return MB_TAG_NOT_FOUND should probably not
00403 set an error on it, since the caller might expect to get that error code. In this case, the lower level routine just return MB_TAG_NOT_FOUND as a
00404 condition, and no error is being set. It is then up to the upper level callers to decide whether it should be a true error or not.
00405
00406 \ref dg-contents "Top"
00407 */
00408