Also fix http URLs
byrnHDF committed Jan 13, 2025
1 parent 0a2c9d2 commit 1f33f8d
Showing 10 changed files with 473 additions and 11 deletions.
2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -454,7 +454,7 @@ else ()
endif ()
set (HDF5_PACKAGE_STRING "${HDF5_PACKAGE_NAME} ${HDF5_PACKAGE_VERSION_STRING}")
set (HDF5_PACKAGE_TARNAME "${HDF5_PACKAGE}${HDF_PACKAGE_EXT}")
-set (HDF5_PACKAGE_URL "http://www.hdfgroup.org")
+set (HDF5_PACKAGE_URL "https://www.hdfgroup.org")
set (HDF5_PACKAGE_BUGREPORT "[email protected]")

#-----------------------------------------------------------------------------
143 changes: 143 additions & 0 deletions doxygen/dox/CopyingCommited.dox
@@ -0,0 +1,143 @@
/** \page copying_committed Copying Committed Datatypes with H5Ocopy
* HDF5 Release 1.8.9 May 2012
*
* \section sec_copying_committed_datatypes Copying Committed Datatypes with H5Ocopy
* Committed datatypes can be a powerful feature in HDF5. They can be used to share a single datatype
* description among multiple datasets, to save space or ensure that the datatypes are truly identical, and to
* assign a name to that datatype within the HDF5 group structure. The object copy API, #H5Ocopy, can be used
* to copy HDF5 objects from one file to another, including committed datatypes and objects that use them.
* However, problems can occur when a dataset using a committed datatype or an object with an attribute that
* uses a committed datatype is copied to another file with #H5Ocopy.
*
* When copying a dataset that uses a committed datatype or an object with an attribute that uses a committed
* datatype between files, the library by default does not look for a matching committed datatype in the
* destination file. The library creates a new committed datatype in the destination file without any links to it (an
* anonymous committed datatype) and then links the dataset to the anonymous committed datatype. This
* means that, when copying multiple datasets in separate calls to #H5Ocopy, a new committed datatype is
* created for each #H5Ocopy call. While it is possible to have all of the copied datasets share the same
* committed datatype by copying them in a single call to #H5Ocopy, this is not always attainable.
*
* For example, imagine that a user has an application that automatically creates many data files, each with many
* datasets that all use a single committed datatype. At the end of a project, the user wants to merge all of these
* files into a single file. The HDF5 Library can have all of the datasets in the combined file use the same
* committed datatype, but the default behavior of the library is to create an anonymous committed datatype
* for each dataset.
*
* To make sure that shared committed datatypes in the source are shared in the copy, use the
* #H5Pset_copy_object property list API routine to set the #H5O_COPY_MERGE_COMMITTED_DTYPE_FLAG
* flag. When this flag is set and #H5Ocopy encounters an object or attribute that uses a committed datatype,
* H5Ocopy will search for a matching committed datatype in the destination file. If a matching committed
* datatype is found, then it will be used by the copied dataset or attribute. The next few paragraphs describe in
* more detail the process that #H5Ocopy goes through.
*
* When the #H5O_COPY_MERGE_COMMITTED_DTYPE_FLAG flag is set, H5Ocopy will search the destination file
* for committed datatypes and build a temporary list in memory of all the committed datatypes it finds. Then,
* whenever H5Ocopy encounters a dataset that uses a committed datatype or an object with an attribute that
* uses a committed datatype in the source, it will check that list to see if it contains a datatype equal to the
* source datatype. If #H5Ocopy finds an equal datatype, it will modify the copied object or attribute to use the
* found committed datatype as its datatype. #H5Ocopy will then update the list if a new committed datatype is
* created in the destination file as a result of the copy. When later datasets and attributes using committed
* datatypes are encountered, the library will again check to see if the list contains a matching datatype.
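*
* The list-based matching described above can be pictured with a small, self-contained model. This is only
* a sketch of the bookkeeping, not library code: datatype descriptions are represented by strings, equality
* by strcmp, and the names dest_file_t and copy_dataset are hypothetical.
*
```c
#include <string.h>

#define MAX_TYPES 16

// Hypothetical stand-in for the destination file: the committed datatypes
// it currently holds, each reduced to a description string.
typedef struct {
    const char *types[MAX_TYPES];
    int         ntypes;
} dest_file_t;

// Model of copying one dataset whose datatype is described by `desc`.
// Returns the index of the committed datatype the copied dataset ends up
// using, or -1 if this model's fixed-size list is full.
static int copy_dataset(dest_file_t *dest, const char *desc, int merge)
{
    if (merge) {
        // Merge flag set: reuse an equal datatype if the list already has one.
        for (int i = 0; i < dest->ntypes; i++)
            if (strcmp(dest->types[i], desc) == 0)
                return i;
    }
    if (dest->ntypes == MAX_TYPES)
        return -1;
    // No match, or merging disabled: create a new committed datatype and
    // record it so that later copies can find it.
    dest->types[dest->ntypes] = desc;
    return dest->ntypes++;
}
```
*
* In this model, three separate copies of datasets that share one description leave a single committed
* datatype in the destination when merge is nonzero, and three when it is zero.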
*
* To determine if two committed datatypes are equal, the library will compare their descriptions in a manner
* similar to #H5Tequal. In addition, if either committed datatype has one or more attributes, then all attributes
* must be present in both committed datatypes, and the attributes must all be identical. Each attribute’s
* datatype description, dataspace, and raw data must be identical. However, if an attribute uses a committed
* datatype, then the attributes of the attribute’s committed datatype will not be compared.
*
* When #H5Ocopy encounters a committed datatype object in the source file, it will similarly search for a
* matching committed datatype in the destination file. If a match is found, the library will create a hard link in
* the destination file to the found datatype. If a match is not found, the library will copy the committed datatype
* normally and add it to the temporary list of committed datatypes in the destination file.
*
* By default, #H5Ocopy will search the entire destination file for a matching committed datatype. The search
* can be narrowed, which should make it faster: if there are particular locations in the destination file where
* a matching committed datatype might be found, those locations can be specified with
* #H5Padd_merge_committed_dtype_path.
*
* The example below shows how to enable the feature described above for use with #H5Ocopy.
*
* <em>Example 1. Setting the object copy property list</em>
* \code
* hid_t  ocpypl_id;
* herr_t status;
*
* ocpypl_id = H5Pcreate(H5P_OBJECT_COPY);
* status = H5Pset_copy_object(ocpypl_id, H5O_COPY_MERGE_COMMITTED_DTYPE_FLAG);
* status = H5Ocopy(file1_id, src_name, file2_id, dst_name, ocpypl_id, H5P_DEFAULT);
* \endcode
*
* \subsection subsec_copying_committed_datatypes_cb Callback Function
* If no matching datatype is found in the locations specified by calls to
* #H5Padd_merge_committed_dtype_path, then #H5Ocopy will by default search the entire destination file. In
* some cases, this may not be desirable. For instance, the user may expect the datatype to always have a match
* in the specified locations and may wish to return an error if a match is not found. Alternatively, the user
* may have a very large file for which the full search incurs a substantial performance penalty. In that case,
* the user may wish to log these events so that paths to other datatypes can be added with
* #H5Padd_merge_committed_dtype_path, or may wish to abort the search and copy the datatype
* normally.
*
* To support these use cases, the functions #H5Pset_mcdt_search_cb and #H5Pget_mcdt_search_cb are
* provided. They allow the user to define a callback function that is called every time the list
* of paths added by #H5Padd_merge_committed_dtype_path has been exhausted, before the
* full search of the file begins. The prototype for the callback function is defined by #H5O_mcdt_search_cb_t.
* The only argument to the callback function is a user-supplied data pointer, and the return value is an enum,
* defined by #H5O_mcdt_search_ret_t, which tells the library to continue with the full file search, abort
* the search and copy the datatype normally (create a new committed datatype in the destination file), or
* return an error.
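*
* The order of events described above can be summarized in a small stand-alone model. It is a sketch of the
* decision logic only and assumes at least one suggested path has been set. The enum values, the function
* resolve_datatype, and the callback count_and_stop are hypothetical names, not the library's identifiers.
*
```c
#include <stddef.h>

// Hypothetical mirror of the three outcomes a search callback can request.
typedef enum { SEARCH_ERROR = -1, SEARCH_CONT = 0, SEARCH_STOP = 1 } search_ret_t;

// Callback shape: a single user-supplied data pointer, enum return value.
typedef search_ret_t (*search_cb_t)(void *user_data);

typedef enum { USE_MATCH, COPY_NORMALLY, FAIL } outcome_t;

// Model of the decision taken for one committed datatype.
//   found_in_paths: a match exists in one of the suggested paths
//   found_in_file:  a match exists somewhere else in the destination file
//   cb:             optional callback, consulted only once the suggested
//                   paths are exhausted and before the full search
static outcome_t resolve_datatype(int found_in_paths, int found_in_file,
                                  search_cb_t cb, void *user_data)
{
    if (found_in_paths)
        return USE_MATCH;                  // callback is never invoked

    if (cb != NULL) {
        switch (cb(user_data)) {
        case SEARCH_ERROR: return FAIL;           // report an error
        case SEARCH_STOP:  return COPY_NORMALLY;  // skip the full search
        case SEARCH_CONT:  break;                 // go on to the full search
        }
    }
    // Full search of the destination file (the default behavior).
    return found_in_file ? USE_MATCH : COPY_NORMALLY;
}

// Example callback: count how often the suggested paths came up empty,
// then ask for the full search to be skipped.
static search_ret_t count_and_stop(void *user_data)
{
    int *misses = (int *)user_data;
    (*misses)++;
    return SEARCH_STOP;
}
```
*
* A counting callback like this one covers the logging use case: the application can inspect the counter after
* the copy and decide whether more paths should be suggested.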
*
* \subsection subsec_copying_committed_datatypes_sum Function Summary
* Functions used in committed datatype copying operations are listed below.
* <table>
* <caption>Committed Datatype Copying Related Functions</caption>
* <tr>
* <th>Function</th><th>Purpose</th>
* </tr>
* <tr>
* <td>#H5Ocopy</td><td>Allows an application to copy an object within an HDF5 file or to another HDF5 file.</td>
* </tr>
* <tr>
* <td>#H5Pset_copy_object</td><td>Allows an application to set properties to be used
* when an object is copied.</td>
* </tr>
* <tr>
* <td>#H5Padd_merge_committed_dtype_path</td><td>Allows an application to add a path to the list of
* paths that will be searched in the destination file for
* a matching committed datatype.</td>
* </tr>
* <tr>
* <td>#H5Pfree_merge_committed_dtype_paths</td><td>Allows an application to clear the list of paths stored
* in the object copy property list \b ocpypl_id.</td>
* </tr>
* <tr>
* <td>#H5Pset_mcdt_search_cb</td><td>Allows an application to set the callback function
* that #H5Ocopy will invoke before searching the entire destination file for a matching committed
* datatype.</td>
* </tr>
* <tr>
* <td>#H5Pget_mcdt_search_cb</td><td>Allows an application to retrieve the callback
* function from the specified object copy property list.</td>
* </tr>
* <tr>
* <td>#H5O_mcdt_search_cb_t</td><td>Definition of the callback function set by
* #H5Pset_mcdt_search_cb. Provides the mechanism by which a user application may set an
* action for #H5Ocopy to take after checking all suggested paths for a matching committed datatype
* but before starting the global search of the destination file.</td>
* </tr>
* </table>
*
* \subsection subsec_copying_committed_datatypes_res Resources
* See the following for more information.
*
* See the \ref sec_datatype chapter in the \ref UG.
*
* See these entries in the \ref RM
* \li #H5Ocopy
* \li #H5Pset_copy_object
* \li #H5Padd_merge_committed_dtype_path
* \li #H5Pfree_merge_committed_dtype_paths
* \li #H5Pset_mcdt_search_cb
* \li #H5Pget_mcdt_search_cb
*
*
*/

101 changes: 101 additions & 0 deletions doxygen/dox/FreeingMemoryAllocatedByTheHdf5Library.dox
@@ -0,0 +1,101 @@
/** \page freeing_memory Freeing Memory Allocated by the HDF5 Library
* Several functions in the HDF5 C API return buffers allocated by the HDF5 Library. When
* application code uses a different library for memory management than the HDF5 Library,
* a corrupt heap or a resource leak can occur when these allocated buffers are freed. This
* is most commonly a problem on Windows systems since Microsoft implements C library
* functions in Visual Studio-specific libraries which do not share heap state.
*
* Introduced with HDF5 Release 1.8.13 May 15, 2014
*
* This document describes the problem and the steps users can take to mitigate it. It also
* introduces the new #H5free_memory function.
*
* \section sec_freeing_memory_intro Introduction
* In the HDF5 Library, the allocation and freeing of memory are usually the responsibility
* of the same component: either the library or the user's code. When data that would normally be stored
* in dynamically allocated memory must be returned from the library, the user is usually asked to allocate
* a buffer which is passed to the function and then filled by the library. The complication is that the user
* must be able to determine the buffer's size. To do so, the user makes a preliminary
* call, passing a NULL pointer in for the buffer. The function then returns the number of
* bytes the user needs to allocate. See the example below.
*
* <em>Example 1. Determining the buffer size with a preliminary call</em>
* \code
* ssize_t size;
* size_t  bufsize;
* hid_t   object_id;
* char   *comment;
* …
* size = H5Oget_comment(object_id, NULL, 0);          // determine size (terminator not counted)
* bufsize = (size_t)size + 1;                         // leave room for the terminator
* comment = (char *)malloc(bufsize);
* size = H5Oget_comment(object_id, comment, bufsize); // fill buffer
* \endcode
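*
* The same size-then-fill idiom exists in the C standard library, which makes it easy to try without HDF5:
* snprintf called with a NULL buffer and a size of 0 only reports the length it would have written. The sketch
* below uses a hypothetical helper, format_label, to show the preliminary call followed by the filling call.
*
```c
#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper: builds "<name>_<index>" in a newly allocated buffer.
// The caller owns the result and releases it with free().
char *format_label(const char *name, int index)
{
    // Preliminary call: a NULL buffer with size 0 writes nothing and returns
    // the required length, which excludes the terminating NUL.
    int needed = snprintf(NULL, 0, "%s_%d", name, index);
    if (needed < 0)
        return NULL;

    size_t bufsize = (size_t)needed + 1;   // add room for the terminator
    char  *buf     = malloc(bufsize);
    if (buf == NULL)
        return NULL;

    // Second call: fill the caller-allocated buffer.
    snprintf(buf, bufsize, "%s_%d", name, index);
    return buf;
}
```
*
* Because the caller both allocates and frees the buffer, allocation and release always happen in the same
* run-time, which is the property this scheme is meant to guarantee.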
*
* There are, however, several API calls in which the buffer is allocated by the HDF5 Library and returned to
* the user who is responsible for freeing it. This can be a problem when memory in the application and
* HDF5 Library are managed via different libraries as it can result in resource leaks or a corrupted heap.
* This heap corruption can result in subtle bugs that can be very difficult to reproduce and diagnose. In
* most cases, having the library allocate memory and the application free it is not a problem since
* memory operations will resolve down to the operating system's memory manager; however, there are
* cases where this is not true. For example, a debug memory manager may be in use by the application
* code but not the library. A complication that is unique to Windows is that the C standard library
* functions are implemented in Visual Studio-specific C run-time (CRT) libraries. When different versions
* of Visual Studio are used to compile the library and application code, the allocate and free calls are
* made in different libraries, which do not share state, leading to the previously mentioned resource and
* corruption issues.
*
* \section sec_freeing_memory_crt The Windows C Run-time (CRT)
* Microsoft implements the standard C library functions in debug and release libraries that are specific to
* each version of Visual Studio<sup>1</sup>. Each library is a separate entity and maintains its own internal CRT object
* state, file handles, and heap information. Creating an object in one CRT and destroying it in another CRT
* may appear to work but can cause corruption of one CRT and resource leaks in the other.
*
* <table>
* <tr>
* <td>
* \image html FreeingMemory_fig1.png
* </td>
* </tr>
* </table>
*
* These problems are normally avoided on Windows by ensuring that all components that can return CRT
* resources are linked to the same CRT dynamic link library (DLL). Unfortunately, even debug and release
* CRTs are housed in separate DLLs, so this is not an easy solution to implement. Using static linkage does
* not avoid this problem since separate copies of the CRT are created in each statically linked component.
*
* \li <sup>1</sup> The names of these libraries are of the form MSVCR<#>.dll, where <#> is the Visual Studio version. For example,
* MSVCR110.dll corresponds to Visual Studio 11.0 (2012).
*
* \section sec_freeing_memory_api Affected API Calls
* The following API calls return library-allocated buffers and are therefore affected.
* \li #H5Eget_major
* \li #H5Eget_minor
* \li #H5Pget_class_name
* \li #H5Tget_member_name
* \li #H5Tget_tag
*
* \section sec_freeing_memory_mitigation Mitigation
* There are several potential solutions to the problem of freeing memory allocated by the HDF5 Library.
*
* \subsection subsec_freeing_memory_mitigation1 Use the Same Memory Manager/Correct C Run-time Everywhere
* Both application code and the HDF5 Library must use the same memory allocator. When using Visual
* Studio, both the Visual Studio version and release/debug state must be identical. As of HDF5 1.8.12, this
* is the only available solution.
*
* \subsection subsec_freeing_memory_mitigation2 Use the H5free_memory Function
* A new function called #H5free_memory has been created and is essentially a thin wrapper for the
* run-time's free() call. This function is used to free any memory allocated by the library. This solution
* has the advantages of being extremely easy to implement and intuitive to use. It can also be used as a
* solution with legacy API calls, so it would be necessary even if the HDF5 API were modified. This function
* will also be extremely useful when HDF5 is wrapped for use with managed languages such as Java, .NET, and
* Python so that the wrappers can properly clean up resources.
*
* See the #H5free_memory entry in the \ref RM for more information.
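*
* The idea behind such a wrapper can be sketched without HDF5. In the model below, the names lib_get_name,
* lib_free_memory, and name_length are hypothetical. The point is only that the function that releases the
* buffer is compiled into the same component that allocated it.
*
```c
#include <stdlib.h>
#include <string.h>

// --- "Library" side (hypothetical names) ---

// Returns a newly allocated copy of a name. The buffer comes from the
// allocator that the library was built against.
char *lib_get_name(void)
{
    const char *name = "member_0";
    char *copy = malloc(strlen(name) + 1);
    if (copy != NULL)
        strcpy(copy, name);
    return copy;
}

// Thin wrapper around free(), compiled into the library so that the release
// happens in the same run-time that performed the allocation.
int lib_free_memory(void *mem)
{
    free(mem);
    return 0;
}

// --- Application side ---

// Uses the returned buffer, then hands it back to the library instead of
// calling free() directly.
size_t name_length(void)
{
    char *name = lib_get_name();
    if (name == NULL)
        return 0;
    size_t len = strlen(name);
    lib_free_memory(name);
    return len;
}
```
*
* With HDF5 itself, the corresponding pairing would be a call such as #H5Tget_member_name followed by
* #H5free_memory on the returned pointer.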
*
* Note that the creation of this function does not imply that it will be acceptable for new API calls to be
* created that return library-allocated memory. The preferred mechanism will still be to use the
* "preliminary call" scheme described in \ref sec_freeing_memory_intro, where the user allocates the buffer.
*
*/
