:py:mod:`km3pipe.io.hdf5`
=========================

.. py:module:: km3pipe.io.hdf5

.. autoapi-nested-parse::

   Read and write KM3NeT-formatted HDF5 files.

   .. !! processed by numpydoc !!


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   km3pipe.io.hdf5.HDF5Header
   km3pipe.io.hdf5.HDF5IndexTable
   km3pipe.io.hdf5.HDF5Sink
   km3pipe.io.hdf5.HDF5Pump
   km3pipe.io.hdf5.HDF5MetaData


Functions
~~~~~~~~~

.. autoapisummary::

   km3pipe.io.hdf5.check_version
   km3pipe.io.hdf5.create_index_tuple
   km3pipe.io.hdf5.header2table


Attributes
~~~~~~~~~~

.. autoapisummary::

   km3pipe.io.hdf5.jit
   km3pipe.io.hdf5.log
   km3pipe.io.hdf5.FORMAT_VERSION
   km3pipe.io.hdf5.MINIMUM_FORMAT_VERSION


.. py:data:: jit

.. py:data:: log

.. py:data:: FORMAT_VERSION

.. py:data:: MINIMUM_FORMAT_VERSION

.. py:exception:: H5VersionError

   Common base class for all non-exit exceptions.

.. py:function:: check_version(h5file)

.. py:class:: HDF5Header(data)

   Wrapper class for the `/raw_header` table in KM3HDF5.

   :Parameters:

       **data** : dict(str, str/tuple/dict/OrderedDict)
           The actual header data, consisting of a key and an entry.
           If possible, the key will be set as a property and the
           values will be converted to namedtuples (fields sorted by
           name to ensure consistency when dictionaries are provided).

   .. py:method:: keys()

   .. py:method:: values()

   .. py:method:: items()

   .. py:method:: from_table(table)
      :classmethod:

   .. py:method:: from_km3io(header)
      :classmethod:

   .. py:method:: from_aanet(table)
      :classmethod:

   .. py:method:: from_hdf5(filename)
      :classmethod:

   .. py:method:: from_pytable(table)
      :classmethod:

.. py:class:: HDF5IndexTable(h5loc, start=0)

   .. py:property:: data

   .. py:method:: append(n_items)

   .. py:method:: fillup(length)

.. py:class:: HDF5Sink(name=None, **parameters)

   Write KM3NeT-formatted HDF5 files, event-by-event.

   The data can be a ``kp.Table``, a numpy structured array, a pandas
   DataFrame, or a simple scalar.
   The name of the corresponding H5 table is the decamelised
   blob-key, so values which are stored in the blob under `FooBar`
   will be written to `/foo_bar` in the HDF5 file.

   :Parameters:

       **filename: str, optional [default: 'dump.h5']**
           Where to store the events.

       **h5file: pytables.File instance, optional [default: None]**
           Opened file to write to. This is mutually exclusive with
           ``filename``.

       **keys: list of strings, optional**
           List of blob-keys to write; everything else is ignored.

       **complib** : str [default: zlib]
           Compression library that should be used.
           'zlib', 'lzf', 'blosc' and all other PyTables filters
           are available.

       **complevel** : int [default: 5]
           Compression level.

       **chunksize** : int [optional]
           Chunksize that should be used for saving along the first
           axis of the input array.

       **flush_frequency: int, optional [default: 500]**
           The number of iterations to cache tables and arrays
           before dumping to disk.

       **pytab_file_args: dict [optional]**
           Pass more arguments to the PyTables ``File`` constructor.

       **n_rows_expected: int, optional [default: 10000]**
           ..

       **append: bool, optional [default: False]**
           ..

       **reset_group_id: bool, optional [default: True]**
           Resets the group_id so that it is continuous in the
           output file. Use this with care!

   .. rubric:: Notes

   Provides the service ``write_table(tab, h5loc=None)``: writes the
   given ``kp.Table`` to its ``.h5loc``, or to ``h5loc`` if
   specified.

   .. py:method:: configure()

      Configure the module, e.g. set up instance variables.

   .. py:method:: write_table(table, h5loc=None)

      Write a single table to the HDF5 file, exposed as a service.

   .. py:method:: process(blob)

      Knead the blob and return it.

   .. py:method:: flush()

      Flush tables and arrays to disk.

   .. py:method:: finish()

      Clean everything up.

.. py:class:: HDF5Pump(name=None, **parameters)

   Read KM3NeT-formatted HDF5 files, event-by-event.

   :Parameters:

       **filename: str**
           From where to read events. Either this OR ``filenames``
           needs to be defined.

       **skip_version_check: bool [default: False]**
           Don't check the H5 version. Might lead to unintended
           consequences.

       **shuffle: bool, optional [default: False]**
           Shuffle the group_ids, so that the blobs are mixed up.

       **shuffle_function: function, optional [default: np.random.shuffle]**
           The function to be used to shuffle the group IDs.

       **reset_index: bool, optional [default: True]**
           When shuffle is set to true, reset the group ID and start
           counting the group_id from 0.

   .. rubric:: Notes

   Provides the service ``h5singleton(h5loc)``: returns the
   singleton table for a given HDF5 location.

   .. py:method:: configure()

      Configure the module, e.g. set up instance variables.

   .. py:method:: h5singleton(h5loc)

      Returns the singleton table for a given HDF5 location.

   .. py:method:: process(blob)

      Knead the blob and return it.

   .. py:method:: get_blob(index)

   .. py:method:: finish()

      Clean everything up.

.. py:function:: create_index_tuple(group_ids)

   A helper function to create index tuples for fast lookup in
   HDF5Pump.

.. py:class:: HDF5MetaData(name=None, **parameters)

   Metadata to attach to the HDF5 file.

   :Parameters:

       **data: dict**
           ..

   .. py:method:: configure()

      Configure the module, e.g. set up instance variables.

.. py:function:: header2table(data)

   Convert a header to an `HDF5Header` compliant `kp.Table`.
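The blob-key-to-table-name rule used by ``HDF5Sink`` (a blob entry stored
under ``FooBar`` ends up at ``/foo_bar``) can be sketched in plain Python.
This is an illustrative, hedged re-implementation for documentation
purposes only, not km3pipe's own decamelisation helper, which may handle
additional edge cases:

```python
import re


def decamelise(name):
    """Convert a CamelCase blob key to a snake_case H5 table name.

    Illustrative sketch: insert an underscore before every uppercase
    letter that follows a lowercase letter or digit, then lowercase
    the whole string.
    """
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


# A blob value stored under `FooBar` would be written to `/foo_bar`:
print("/" + decamelise("FooBar"))  # → /foo_bar
```

With this rule, the HDF5 location of a written table is predictable from
its blob key alone, which is what makes round-tripping through
``HDF5Pump`` straightforward.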