km3pipe.io.hdf5

Read and write KM3NeT-formatted HDF5 files.

Module Contents

Classes

HDF5Header

Wrapper class for the /raw_header table in KM3HDF5

HDF5IndexTable

HDF5Sink

Write KM3NeT-formatted HDF5 files, event-by-event.

HDF5Pump

Read KM3NeT-formatted HDF5 files, event-by-event.

HDF5MetaData

Metadata to attach to the HDF5 file.

Functions

check_version(h5file)

create_index_tuple(group_ids)

A helper function to create index tuples for fast lookup in HDF5Pump

header2table(data)

Convert a header to an HDF5Header compliant kp.Table

Attributes

jit

log

FORMAT_VERSION

MINIMUM_FORMAT_VERSION

km3pipe.io.hdf5.jit[source]
km3pipe.io.hdf5.log[source]
km3pipe.io.hdf5.FORMAT_VERSION[source]
km3pipe.io.hdf5.MINIMUM_FORMAT_VERSION[source]
exception km3pipe.io.hdf5.H5VersionError[source]

Common base class for all non-exit exceptions.

km3pipe.io.hdf5.check_version(h5file)[source]
class km3pipe.io.hdf5.HDF5Header(data)[source]

Wrapper class for the /raw_header table in KM3HDF5

Parameters:
data: dict(str, str/tuple/dict/OrderedDict)

The actual header data, consisting of a key and an entry. If possible, the key will be set as a property and the values will be converted to namedtuples (fields sorted by name to ensure consistency when dictionaries are provided).
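The conversion to namedtuples with sorted fields can be illustrated with a minimal sketch. This is not the km3pipe implementation: the helper `entry_to_namedtuple` and the sample `can` entry are hypothetical, and only the standard library is used.

```python
from collections import namedtuple


def entry_to_namedtuple(key, entry):
    """Turn a dict-like header entry into a namedtuple with sorted fields."""
    # Sorting the field names makes the result independent of the
    # insertion order of the dictionary that was passed in.
    fields = sorted(entry)
    return namedtuple(key, fields)(*(entry[f] for f in fields))


# Hypothetical header entry; the keys and values are made up for illustration.
can = entry_to_namedtuple("can", {"zmin": -100.0, "zmax": 200.0, "r": 150.0})
print(can._fields)
print(can.zmax)
```

Because the fields are sorted, two dictionaries with the same keys in a different order yield namedtuples with an identical layout.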

keys()[source]
values()[source]
items()[source]
classmethod from_table(table)[source]
classmethod from_km3io(header)[source]
classmethod from_aanet(table)[source]
classmethod from_hdf5(filename)[source]
classmethod from_pytable(table)[source]
class km3pipe.io.hdf5.HDF5IndexTable(h5loc, start=0)[source]
property data[source]
append(n_items)[source]
fillup(length)[source]
class km3pipe.io.hdf5.HDF5Sink(name=None, **parameters)[source]

Write KM3NeT-formatted HDF5 files, event-by-event.

The data can be a kp.Table, a numpy structured array, a pandas DataFrame, or a simple scalar.

The name of the corresponding H5 table is the decamelised blob-key, so values which are stored in the blob under FooBar will be written to /foo_bar in the HDF5 file.
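The mapping from blob key to table location can be sketched with two regular expressions. The functions `decamelise_sketch` and `h5loc_for` are hypothetical names used for illustration; the library's own helper may treat edge cases differently.

```python
import re


def decamelise_sketch(key):
    """Convert a CamelCase blob key to a snake_case name."""
    # Insert an underscore before each capitalised word: FooBar -> Foo_Bar
    partial = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", key)
    # Split a lowercase letter or digit from a following capital, then lowercase
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", partial).lower()


def h5loc_for(key):
    """Return the HDF5 location under which a blob key would be stored."""
    return "/" + decamelise_sketch(key)


print(h5loc_for("FooBar"))  # /foo_bar
```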

Parameters:
filename: str, optional [default: ‘dump.h5’]

Where to store the events.

h5file: pytables.File instance, optional [default: None]

Opened file to write to. This is mutually exclusive with filename.

keys: list of strings, optional

List of Blob-keys to write, everything else is ignored.

complib: str [default: zlib]

Compression library that should be used. ‘zlib’, ‘lzf’, ‘blosc’ and all other PyTables filters are available.

complevel: int [default: 5]

Compression level.

chunksize: int [optional]

Chunksize that should be used for saving along the first axis of the input array.

flush_frequency: int, optional [default: 500]

The number of iterations to cache tables and arrays before dumping to disk.

pytab_file_args: dict [optional]

Additional arguments passed to the PyTables File constructor.

n_rows_expected: int, optional [default: 10000]
append: bool, optional [default: False]
reset_group_id: bool, optional [default: True]

Resets the group_id so that it’s continuous in the output file. Use this with care!

Notes

Provides service write_table(tab, h5loc=None): tab:Table, h5loc:str

Writes the table to the location given by its “.h5loc” attribute, or to h5loc if specified.
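A typical use of the sink is as the last module of a pipeline that reads from an HDF5Pump. The sketch below assumes that km3pipe is installed and that `Pipeline`, `attach` and `drain` behave as in the km3pipe user guide. The function name `copy_events` is hypothetical, and the import is done lazily so that the function can be defined without km3pipe present.

```python
def copy_events(infile, outfile, keys=None):
    """Read blobs from infile and write the selected blob keys to outfile."""
    # Lazy imports: km3pipe is only needed when the function is called.
    from km3pipe import Pipeline
    from km3pipe.io.hdf5 import HDF5Pump, HDF5Sink

    # Parameter names are taken from the parameter lists documented here.
    sink_options = {"filename": outfile, "complib": "zlib", "complevel": 5}
    if keys is not None:
        sink_options["keys"] = keys  # everything else in the blob is ignored

    pipe = Pipeline()
    pipe.attach(HDF5Pump, filename=infile)
    pipe.attach(HDF5Sink, **sink_options)
    pipe.drain()
```

Calling `copy_events("input.h5", "output.h5", keys=["Hits"])` (file names are placeholders) would write only the `Hits` entry of each blob, which by the naming rule above ends up under `/hits`.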

configure()[source]

Configure module, like instance variables etc.

write_table(table, h5loc=None)[source]

Write a single table to the HDF5 file, exposed as a service

process(blob)[source]

Knead the blob and return it

flush()[source]

Flush tables and arrays to disk

finish()[source]

Clean everything up.
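The interplay of process, flush and finish with the flush_frequency parameter follows a common buffering pattern, sketched below with the standard library only. The class `BufferedWriter` is hypothetical and stands in for the sink; a plain list plays the role of the file on disk, and the real module may organise its cache differently.

```python
class BufferedWriter:
    """Cache rows in memory and write them out every `flush_frequency` calls."""

    def __init__(self, flush_frequency=500):
        self.flush_frequency = flush_frequency
        self.buffer = []   # rows waiting to be written
        self.written = []  # stands in for the file on disk
        self.n_flushes = 0
        self._iteration = 0

    def process(self, row):
        # Cache the row and flush once enough iterations have passed.
        self.buffer.append(row)
        self._iteration += 1
        if self._iteration % self.flush_frequency == 0:
            self.flush()
        return row

    def flush(self):
        # Nothing to do if the cache is empty.
        if not self.buffer:
            return
        self.written.extend(self.buffer)
        self.buffer.clear()
        self.n_flushes += 1

    def finish(self):
        # Write whatever is left in the cache before closing.
        self.flush()


writer = BufferedWriter(flush_frequency=3)
for i in range(7):
    writer.process(i)
writer.finish()
```

A larger flush_frequency means fewer writes at the cost of more memory held in the cache between flushes.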

class km3pipe.io.hdf5.HDF5Pump(name=None, **parameters)[source]

Read KM3NeT-formatted HDF5 files, event-by-event.

Parameters:
filename: str

From where to read events. Either this OR filenames needs to be defined.

skip_version_check: bool [default: False]

Don’t check the H5 version. Might lead to unintended consequences.

shuffle: bool, optional [default: False]

Shuffle the group_ids, so that the blobs are mixed up.

shuffle_function: function, optional [default: np.random.shuffle]

The function to be used to shuffle the group IDs.

reset_index: bool, optional [default: True]

When shuffle is set to True, reset the group IDs so that group_id counting starts at 0.

Notes

Provides service h5singleton(h5loc): h5loc:str -> kp.Table

Singleton tables for a given HDF5 location.
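How shuffle, shuffle_function and reset_index fit together can be sketched as follows. This is one reading of the parameter descriptions, not the pump's actual code: `shuffled_group_ids` is a hypothetical helper, and `random.shuffle` is used in place of `np.random.shuffle` to keep the example free of dependencies. Both shuffle a sequence in place.

```python
import random


def shuffled_group_ids(group_ids, shuffle_function=random.shuffle, reset_index=True):
    """Return (original_id, assigned_id) pairs in the order blobs would be read."""
    order = list(group_ids)
    shuffle_function(order)  # in-place shuffle, like np.random.shuffle
    if reset_index:
        # Renumber the blobs so that the assigned IDs count up from 0.
        return [(original, new) for new, original in enumerate(order)]
    # Keep the original group IDs, only the reading order changes.
    return [(original, original) for original in order]


random.seed(42)
pairs = shuffled_group_ids(range(5))
```

Passing a custom shuffle_function, for example one with a fixed seed, makes the reading order reproducible.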

configure()[source]

Configure module, like instance variables etc.

h5singleton(h5loc)[source]

Returns the singleton table for a given HDF5 location

process(blob)[source]

Knead the blob and return it

get_blob(index)[source]
finish()[source]

Clean everything up.

km3pipe.io.hdf5.create_index_tuple(group_ids)[source]

A helper function to create index tuples for fast lookup in HDF5Pump
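The idea behind such an index can be shown in plain Python. For each group ID, the sketch records the row at which the group starts and how many rows it spans, so that all rows of one group can be read with a single slice. The function `index_tuple_sketch` is hypothetical, assumes that rows of the same group are stored contiguously, and is not the library's implementation.

```python
def index_tuple_sketch(group_ids):
    """Return (start_indices, n_items), both indexed by group ID."""
    if not group_ids:
        return [], []
    n_groups = max(group_ids) + 1
    start = [0] * n_groups
    count = [0] * n_groups
    for position, group_id in enumerate(group_ids):
        if count[group_id] == 0:
            # First row of this group: remember where it begins.
            start[group_id] = position
        count[group_id] += 1
    return start, count


# Group 2 has no rows, so its count stays 0.
group_ids = [0, 0, 1, 1, 1, 3]
start, count = index_tuple_sketch(group_ids)

# Rows belonging to group 1:
rows_of_group_1 = group_ids[start[1]:start[1] + count[1]]
```

With the two lists in hand, looking up a group costs one slice instead of a scan over the whole group_id column.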

class km3pipe.io.hdf5.HDF5MetaData(name=None, **parameters)[source]

Metadata to attach to the HDF5 file.

Parameters:
data: dict
configure()[source]

Configure module, like instance variables etc.

km3pipe.io.hdf5.header2table(data)[source]

Convert a header to an HDF5Header compliant kp.Table