MOAB: Mesh Oriented datABase
(version 5.4.1)
|
00001 /*! \page developerguide Developer's Guide 00002 00003 \subpage dg-contents 00004 00005 \subpage dg-figures 00006 00007 */ 00008 00009 /*! \page dg-figures List of Figures 00010 00011 \ref figure1 00012 00013 \ref figure2 00014 00015 \ref figure3 00016 */ 00017 00018 00019 /*! \page dg-contents Table of Contents 00020 00021 \ref sequence 00022 00023 \ref manager 00024 00025 \ref s-mesh 00026 00027 \ref sets 00028 00029 \ref impl-error-handling 00030 00031 \ref dgfiveone 00032 00033 \ref dgfivetwo 00034 00035 \ref dgfivethree 00036 00037 \ref dgfivefour 00038 00039 \ref dgfivefive 00040 00041 \ref dgfivesix 00042 00043 \section sequence 1.EntitySequence & SequenceData 00044 00045 \subsection figure1 Figure 1: EntitySequences For One SequenceData 00046 \image html figure1.jpg 00047 00048 \ref dg-figures "List of Figures" 00049 00050 The <I>SequenceData</I> class manages as set of arrays of per-entity values. Each 00051 <I>SequenceData</I> has a start and end handle denoting the block of entities for which 00052 the arrays contain data. The arrays managed by a <I>SequenceData</I> instance are 00053 divided into three groups: 00054 00055 - Type-specific data (connectivity, coordinates, etc.): zero or more arrays. 00056 - Adjacency data: zero or one array. 00057 - Dense tag data: zero or more arrays. 00058 . 00059 00060 The abstract <I>EntitySequence</I> class is a non-strict subset of a <I>SequenceData</I>. 00061 It contains a pointer to a <I>SequenceData</I> and the start and end handles to indicate 00062 the subset of the referenced <I>SequenceData</I>. The <I>EntitySequence</I> class is 00063 used to represent the regions of valid (or allocated) handles in a <I>SequenceData</I>. 00064 A <I>SequenceData</I> is expected to be referenced by one or more <I>EntitySequence</I> 00065 instances. 00066 00067 Initial <I>EntitySequence</I> and <I>SequenceData</I> pairs are typically created in one 00068 of two configurations. When reading from a file, a <I>SequenceData</I> will be created 00069 to represent all of a single type of entity contained in a file. As all entries in the <I>SequenceData</I> correspond to valid handles (entities read from the file) a single 00070 <I>EntitySequence</I> instance corresponding to the entire <I>SequenceData</I> is initially 00071 created. The second configuration arises when allocating a single entity. If no 00072 entities have been allocated yet, a new <I>SequenceData</I> must be created to store 00073 the entity data. It is created with a constant size (e.g. 4k entities). The new 00074 <I>EntitySequence</I> corresponds to only the first entity in the <I>SequenceData</I>: the 00075 one allocated entity. As subsequent entities are allocated, the <I>EntitySequence</I> 00076 is extended to cover more of the corresponding <I>SequenceData</I>. 00077 00078 Concrete subclasses of the <I>EntitySequence</I> class are responsible for representing 00079 specific types of entities using the array storage provided by the 00080 <I>SequenceData</I> class. They also handle allocating <I>SequenceData</I> instances with 00081 appropriate arrays for storing a particular type of entity. Each concrete subclass 00082 typically provides two constructors corresponding to the two initial allocation 00083 configurations described in the previous paragraph. <I>EntitySequence</I> implementations 00084 also provide a split method, which is a type of factory method. It 00085 modifies the called sequence and creates a new sequence such that the range of 00086 entities represented by the original sequence is split. 00087 00088 The <I>VertexSequence</I> class provides an <I>EntitySequence</I> for storing vertex 00089 data. It references a SequenceData containing three arrays of doubles 00090 for storing the blocked vertex coordinate data. The <I>ElementSequence</I> class 00091 extends the <I>EntitySequence</I> interface with element-specific functionality. The 00092 <I>UnstructuredElemSeq</I> class is the concrete implementation of <I>ElementSequence</I> 00093 used to represent unstructured elements, polygons, and polyhedra. <I>MeshSetSequence</I> 00094 is the <I>EntitySequence</I> used for storing entity sets. 00095 00096 Each <I>EntitySequence</I> implementation also provides an implementation of 00097 the values per entity method. This value is used to determine if an existing 00098 <I>SequenceData</I> that has available entities is suitable for storing a particular 00099 entity. For example, <I>UnstructuredElemSeq</I> returns the number of nodes per element 00100 from values per entity. When allocating a new element with a specific 00101 number of nodes, this value is used to determine if that element may be stored 00102 in a specific <I>SequenceData</I>. For vertices, this value is always zero. This could 00103 be changed to the number of coordinates per vertex, allowing representation of 00104 mixed-dimension data. However, API changes would be required to utilize such 00105 a feature. Sequences for which the corresponding data cannot be used to store 00106 new entities (e.g. structured mesh discussed in a later section) will return -1 or 00107 some other invalid value. 00108 00109 \ref dg-contents "Top" 00110 00111 \section manager 2.TypeSequenceManager & SequenceManager 00112 00113 The <I>TypeSequenceManager</I> class maintains an organized set of <I>EntitySequence</I> 00114 instances and corresponding <I>SequenceData</I> instances. It is used to manage 00115 all such instances for entities of a single <I>EntityType</I>. <I>TypeSequenceManager</I> 00116 enforces the following four rules on its contained data: 00117 00118 -# No two <I>SequenceData</I> instances may overlap. 00119 -# No two <I>EntitySequence</I> instances may overlap. 00120 -# Every <I>EntitySequence</I> must be a subset of a <I>SequenceData</I>. 00121 -# Any pair of <I>EntitySequence</I> instances referencing the same <I>SequenceData</I> must be separated by at least one unallocated handle. 00122 . 00123 00124 \subsection figure2 Figure 2: SequenceManager and Related Classes 00125 \image html figure2.jpg 00126 00127 \ref dg-figures "List of Figures" 00128 00129 The first three rules are required for the validity of the data model. The 00130 fourth rule avoids unnecessary inefficiency. It is implemented by merging such 00131 adjacent sequences. In some cases, other classes (e.g. <I>SequenceManager</I>) can 00132 modify an <I>EntitySequence</I> such that the fourth rule is violated. In such cases, 00133 the <I>TypeSequenceManager::notify</I> prepended or <I>TypeSequenceManager::notify</I> appended 00134 method must be called to maintain the integrity of the data<sup>1</sup>. The above rules 00135 (including the fourth) are assumed in many other methods of the <I>TypeSequenceManager</I> 00136 class, such that those methods will fail or behave unexpectedly if the managed 00137 data does not conform to the rules. 00138 00139 <I>TypeSequenceManager</I> contains three principal data structures. The first is 00140 a <I>std::set</I> of <I>EntitySequence</I> pointers sorted using a custom comparison 00141 operator that queries the start and end handles of the referenced sequences. The 00142 comparison operation is defined as: <I>a->end_handle() < b->start_handle()</I>. 00143 This method of comparison has the advantage that a sequence corresponding to 00144 a specific handle can be located by searching the set for a “sequence” beginning 00145 and ending with the search value. The lower bound and find methods provided 00146 by the library are guaranteed to return the sequence, if it exists. Using 00147 such a comparison operator will result in undefined behavior if the set contains 00148 overlapping sequences. This is acceptable, as rule two above prohibits such 00149 a configuration. However, some care must be taken in writing and modifying 00150 methods in <I>TypeSequenceManager</I> so as to avoid having overlapping sequences 00151 as a transitory state of some operation. 00152 00153 The second important data member of <I>TypeSequenceManager</I> is a pointer 00154 to the last referenced <I>EntitySequence</I>. This “cached” value is used to speed up 00155 searches by entity handle. This pointer is never null unless the sequence is empty. 00156 This rule is maintained to avoid unnecessary branches in fast query paths. In 00157 cases where the last referenced sequence is deleted, <I>TypeSequenceManager</I> will 00158 typically assign an arbitrary sequence (e.g. the first one) to the last referenced 00159 pointer. (Note: this cached value might give problems for threading models; it 00160 should probably be different for each thread) 00161 00162 The third data member of <I>TypeSequenceManager</I> is a <I>std::set</I> of <I>SequenceData</I> 00163 instances that are not completely covered by a <I>EntitySequence</I> instance<sup>2</sup>. 00164 This list is searched when allocating new handles. <I>TypeSequenceManager</I> also 00165 embeds in each <I>SequenceData</I> instance a reference to the first corresponding 00166 <I>EntitySequence</I> so that it may be located quickly from only the <I>SequenceData</I> 00167 pointer. 00168 00169 The <I>SequenceManager</I> class contains an array of <I>TypeSequenceManager</I> in- 00170 stances, one for each <I>EntityType</I>. It also provides all type-specific operations 00171 such as allocating the correct <I>EntitySequence</I> subtype for a given <I>EntityType</I>. 00172 00173 <sup>1</sup>This source of potential error can be eliminated with changes to the entity set representation. 00174 This is discussed in a later section. 00175 00176 <sup>2</sup>Given rule four for the data managed by a <I>TypeSequenceManager</I>, any 00177 <I>SequenceData</I> for which all handles are allocated will be referenced by exactly one <I>EntitySequence</I>. 00178 00179 \ref dg-contents "Top" 00180 00181 \section s-mesh 3.Structured Mesh 00182 00183 Structured mesh storage is implemented using subclasses of <I>SequenceData</I>: 00184 <I>ScdElementData</I> and <I>ScdVertexData</I>. The <I>StructuredElementSeq</I> class is 00185 used to access the structured element connectivity. A standard <I>VertexSequence</I> 00186 instance is used to access the ScdVertexData because the vertex data storage 00187 is the same as for unstructured mesh. 00188 00189 \ref dg-contents "Top" 00190 00191 \section sets 4.Entity Sets 00192 00193 - MeshSetSequence 00194 00195 The <I>MeshSetSequence</I> class is the same as most other subclasses of <I>EntitySequence</I> 00196 in that it utilizes SequenceData to store its data. A single array in the <I>SequenceData</I> 00197 is used to store instances of the MeshSet class, one per allocated <I>EntityHandle</I>. 00198 <I>SequenceData</I> allocates all of its managed arrays using malloc and free as 00199 simple arrays of bytes. <I>MeshSetSequence</I> does in-place construction and 00200 destruction of <I>MeshSet</I> instances within that array. This is similar to what is 00201 done by <I>std::vector</I> and other container classes that may own more storage 00202 than is required at a given time for contained objects. 00203 00204 - MeshSet 00205 00206 \subsection figure3 Figure 3: SequenceManager and Related Classes 00207 \image html figure3.jpg 00208 00209 \ref dg-figures "List of Figures" 00210 00211 The <I>MeshSet</I> class is used to represent a single entity set instance in MOAB. 00212 The class is optimized to minimize storage (further possible improvements in 00213 storage size are discussed later.) 00214 00215 Figure 3 shows the memory layout of an instance of the <I>MeshSet</I> class. 00216 The flags member holds the set creation bit flags: <I>MESHSET_TRACK_OWNER</I>, 00217 <I>MESHSET_SET</I>, and <I>MESHSET_ORDERED</I>. The presence of the <I>MESHSET_TRACK_OWNER</I> 00218 indicates that reverse links from the contained entities back to the owning set 00219 should be maintained in the adjacency list of each entity. The <I>MESHSET_SET</I> 00220 and <I>MESHSET_ORDERED</I> bits are mutually exclusive, and as such most code only 00221 tests for the <I>MESHSET_ORDERED</I>, meaning that in practice the <I>MESHSET_SET</I> bit is 00222 ignored. <I>MESHSET_ORDERED</I> indicates that the set may contain duplicate handles 00223 and that the order that the handles are added to the set should be preserved. 00224 In practice, such sets are stored as a simple list of handles. <I>MESHSET_SET</I> (or in 00225 practice, the lack of <I>MESHSET_ORDERED</I>) indicates that the order of the handles 00226 need not be preserved and that the set may not contain duplicate handles. Such 00227 sets are stored in a sorted range-compacted format similar to that of the Range 00228 class. 00229 00230 The memory for storing contents, parents, and children are each handled in 00231 the same way. The data in the class is composed of a 2-bit ‘size’ field and two 00232 values, where the two values may either be two handles or two pointers. The size 00233 bit-fields are grouped together to reduce the required amount of memory. If the 00234 numerical value of the 2-bit size field is 0 then the corresponding list is empty. 00235 If the 2-bit size field is either 1 or 2, then the contents of the corresponding list 00236 are stored directly in the corresponding two data fields of the MeshSet object. 00237 If the 2-bit size field has a value of 3 (11 binary), then the corresponding two 00238 data fields store the begin and end pointers of an external array of handles. 00239 The number of handles in the external array can be obtained by taking the 00240 difference of the start and end pointers. Note that unlike <I>std::vector</I>, we 00241 do not store both an allocated and used size. We store only the ‘used’ size 00242 and call std::realloc whenever the used size is modified, thus we rely on the 00243 std::malloc implementation in the standard C library to track ‘allocated’ size 00244 for us. In practice this performs well but does not return memory to the ‘system’ 00245 when lists shrink (unless they shrink to zero). This overall scheme could exhibit 00246 poor performance if the size of one of the data lists in the set frequently changes 00247 between less than two and more than two handles, as this will result in frequent 00248 releasing and re-allocating of the memory for the corresponding array. 00249 00250 If the <I>MESHSET_ORDERED</I> flag is not present, then the set contents list (parent 00251 and child lists are unaffected) is stored in a range-compacted format. In this 00252 format the number of handles stored in the array is always a multiple of two. 00253 Each consecutive pair of handles indicate the start and end, inclusive, of a range 00254 of handles contained in the set. All such handle range pairs are stored in sorted 00255 order and do not overlap. Nor is the end handle of one range ever one less than 00256 the start handle of the next. All such ‘adjacent’ range pairs are merged into a 00257 single pair. The code for insertion and removal of handles from range-formatted 00258 set content lists is fairly complex. The implementation will guarantee that a 00259 given call to insert entities into a range or remove entities from a range is never 00260 worse than O(ln n) + O(m + n), where ‘n’ is the number of handles to insert 00261 and ‘m’ is the number of handles already contained in the set. So it is generally 00262 much more efficient to build Ranges of handles to insert (and remove) and call 00263 MOAB to insert (or remove) the entire list at once rather than making many 00264 calls to insert (or remove) one or a few handles from the contents of a set. 00265 The set storage could probably be further minimized by allowing up to six 00266 handles in one of the lists to be elided. That is, as there are six potential ‘slots’ 00267 in the MeshSet object then if two of the lists are empty it should be possible to 00268 store up to six values of the remaining list directly in the MeshSet object. 00269 However, the additional runtime cost of such complexity could easily outweigh 00270 any storage advantage. Further investigation into this has not been done because 00271 the primary motivation for the storage optimization was to support binary trees. 00272 00273 Another possible optimization of storage would be to remove the <I>MeshSet</I> 00274 object entirely and instead store the data in a ‘blocked’ format. The corresponding 00275 <I>SequenceData</I> would contain four arrays: flags, parents, children, and 00276 contents instead of a single array of <I>MeshSet</I> objects. If this were done then 00277 no storage need ever be allocated for parent or child links if none of the sets 00278 in a <I>SequenceData</I> has parent or child links. The effectiveness of the storage 00279 reduction would depend greatly on how sets get grouped into <I>SequenceDatas</I>. 00280 This alternate storage scheme might also allow for better cache utilization as it 00281 would group like data together. It is often the case that application code that 00282 is querying the contents of one set will query the contents of many but never 00283 query the parents or children of any set. Or that an application will query only 00284 parent or child links of a set without every querying other set properties. The 00285 downside of this solution is that it makes the implementation a little less mod- 00286 ular and maintainable because the existing logic contained in the <I>MeshSet</I> class 00287 would need to be spread throughout the <I>MeshSetSequence</I> class. 00288 00289 \ref dg-contents "Top" 00290 00291 \section impl-error-handling 5.Implementation of Error Handling 00292 00293 When a certain error occurs, a MOAB routine can return an enum type ErrorCode (defined in src/moab/Types.hpp) 00294 to its callers. Since MOAB 4.8, the existing error handling model has been completely redesigned to better set 00295 and check errors. 00296 00297 \subsection dgfiveone 5.1. Existing Error Handling Model 00298 00299 To keep track of detail information about errors, a class Error (defined in src/moab/Error.hpp) is used to 00300 store corresponding error messages. Some locally defined macros call Error::set_last_error() to report a new 00301 error message, before a non-success error code is returned. At any time, user may call Core::get_last_error() 00302 to retrieve the latest error message from the Error class instance held by MOAB. 00303 00304 Limitations: 00305 - The Error class can only store the last error message that is being set. When an error originates from a lower 00306 level in a call hierarchy, upper level callers might continue to report more error messages. As a result, previously 00307 reported error messages get overwritten and they will no longer be available to the user. 00308 - There is no useful stack trace for the user to find out where an error first occurs, making MOAB difficult to debug. 00309 00310 \subsection dgfivetwo 5.2. Enhanced Error Handling Model 00311 00312 The error handling model of PETSc (http://www.mcs.anl.gov/petsc/) has nice support for stack trace, and our design has 00313 borrowed many ideas from that. The new features of the enhanced model include: 00314 - Lightweight and easy to use with a macro-based interface 00315 - Generate a stack trace starting from the first non-success error 00316 - Support C++ style streams to build up error messages rather than C style sprintf: 00317 \code 00318 MB_SET_ERR(MB_FAILURE, "Failed to iterate over tag on " << NUM_VTX << " vertices"); 00319 \endcode 00320 - Have preprocessor-like functionality such that we can do something like: 00321 \code 00322 result = MOABRoutine(...);MB_CHK_SET_ERR(result, "Error message to set if result is not MB_SUCCESS"); 00323 \endcode 00324 - Define and handle globally fatal errors or per-processor specific errors. 00325 00326 The public include file for error handling is src/moab/ErrorHandler.hpp, the source code for the error 00327 handling is in src/ErrorHandler.cpp. 00328 00329 \subsection dgfivethree 5.3. Error Handler 00330 00331 The error handling function MBError() only calls one default error handler, MBTraceBackErrorHandler(), which tries to print 00332 out a stack trace. In the future, we need to provide a callback function to user routine before a complete abort. Something 00333 like a UserTeardown that is a function pointer with a context so that the user can destroy and free essential handles before 00334 an MPI abort. 00335 00336 The arguments to MBTraceBackErrorHandler() are the line number where the error occurred, the function where error was detected, 00337 the file in which the error was detected, the source directory, the error message, and the error type indicating whether the 00338 error message should be printed. 00339 \code 00340 ErrorCode MBTraceBackErrorHandler(int line, const char* func, const char* file, const char* dir, const char* err_msg, ErrorType err_type); 00341 \endcode 00342 This handler will print out a line of stack trace, indicating line number, function name, directory and file name. If MB_ERROR_TYPE_EXISTING 00343 is passed as the error type, the error message will not be printed. 00344 00345 \subsection dgfivefour 5.4. Simplified Interface 00346 00347 The simplified C/C++ macro-based interface consists of the following three basic macros: 00348 \code 00349 MB_SET_ERR(Error code, "Error message"); 00350 MB_CHK_ERR(Error code); 00351 MB_CHK_SET_ERR(Error code, "Error message"); 00352 \endcode 00353 00354 The macro MB_SET_ERR(err_code, err_msg) is given by 00355 \code 00356 std::ostringstream err_ostr; 00357 err_ostr << err_msg; 00358 return MBError(__LINE__, __func__, __FILENAME__, __SDIR__, err_code, err_ostr.str().c_str(), MB_ERROR_TYPE_NEW_LOCAL); 00359 \endcode 00360 It calls the error handler with the current function name and location: line number, file and directory, plus an error code, 00361 an error message and an error type. With an embedded std::ostringstream object, it supports C++ style streams to build up error 00362 messages. The error type is MB_ERROR_TYPE_NEW_LOCAL/MB_ERROR_TYPE_NEW_GLOBAL on detection of the initial error and MB_ERROR_TYPE_EXISTING 00363 for any additional calls. This is so that the detailed error information is only printed once instead of for all levels of returned errors. 00364 00365 The macro MB_CHK_ERR(err_code) is defined by 00366 \code 00367 if (MB_SUCCESS != err_code) 00368 return MBError(__LINE__, __func__, __FILENAME__, __SDIR__, err_code, "", MB_ERROR_TYPE_EXISTING); 00369 \endcode 00370 It checks the error code, if not MB_SUCCESS, calls the error handler with error type MB_ERROR_TYPE_EXISTING and return. 00371 00372 The MB_CHK_SET_ERR(err_code, err_msg) is defined by 00373 \code 00374 if (MB_SUCCESS != err_code) 00375 MB_SET_ERR(err_code, err_msg); 00376 \endcode 00377 It checks the error code, if not MB_SUCCESS, calls MB_SET_ERR() to set a new error. 00378 00379 To obtain correct line numbers in the stack trace, we recommend putting MB_CHK_ERR() and MB_CHK_SET_ERR() at the same line as a MOAB routine. 00380 00381 In addition to the basic macros mentioned above, there are some variations, such as (for more information, refer to src/moab/ErrorHandler.hpp): 00382 - MB_SET_GLB_ERR() to set a globally fatal error (for all processors) 00383 - MB_SET_ERR_RET() for functions that return void type 00384 - MB_SET_ERR_RET_VAL() for functions that return any data type 00385 - MB_SET_ERR_CONT() to continue execution instead of returning from current function 00386 These macros should be used appropriately in MOAB source code depending on the context. 00387 00388 \subsection dgfivefive 5.5. Embedded Parallel Functionality 00389 00390 We define a global MPI rank with which to prefix the output, as most systems have mechanisms for separating output by rank anyway. 00391 For the error handler, we can pass error type MB_ERROR_TYPE_NEW_GLOBAL for globally fatal errors and MB_ERROR_TYPE_NEW_LOCAL for 00392 per-processor relevant errors. 00393 00394 Note, if the error handler uses std::cout to print error messages and stack traces in each processor, it can result in a messy output. 00395 This is a known MPI issue with std::cout, and existing DebugOutput class has solved this issue with buffered lines. A new class 00396 ErrorOutput (implemented similar to DebugOutput) is used by the error handler to print each line prefixed with the MPI rank. 00397 00398 \subsection dgfivesix 5.6. Handle Non-error Conditions 00399 00400 We should notice that sometimes ErrorCode is used to return a non-error condition (some internal error code that can be handled, or even expected, 00401 e.g. MB_TAG_NOT_FOUND). Therefore, MB_SET_ERR() should be appropriately placed to report an error to the the caller. Before it is used, we need to 00402 carefully decide whether that error is intentional. For example, a lower level MOAB routine that could return MB_TAG_NOT_FOUND should probably not 00403 set an error on it, since the caller might expect to get that error code. In this case, the lower level routine just return MB_TAG_NOT_FOUND as a 00404 condition, and no error is being set. It is then up to the upper level callers to decide whether it should be a true error or not. 00405 00406 \ref dg-contents "Top" 00407 */ 00408