Reading lightcone particle outputs

Lightcone particle outputs from SWIFT can be post-processed to allow faster access to specified areas of the sky and redshift ranges. These post-processed outputs can be read with the class lightcone_io.ParticleLightcone.

The particles can be stored in HDF5 files on the local file system or in remote files accessed via a web service.

Opening local HDF5 files

If the outputs you wish to read are stored as HDF5 files on your local filesystem, you can open them as follows:

import lightcone_io as lc

# Specify the name of one of the lightcone particle files
filename = "./lightcones/lightcone0_particles/lightcone0_0000.hdf5"

# Open the lightcone particle output
lightcone = lc.ParticleLightcone(filename)

Opening via the hdfstream service

This module can also access files stored on a remote server using the hdfstream python module. To do this, first open the directory containing the files:

import hdfstream
root = hdfstream.open("cosma", "/")

This returns a hdfstream.RemoteDirectory object. Particle outputs on the server can be opened by passing the remote directory object to the lightcone_io.ParticleLightcone class. The filename is interpreted as a path relative to the remote directory on the server:

import lightcone_io as lc

# Location of one of the lightcone particle files relative to the remote directory
filename = "FLAMINGO/L1_m9/L1_m9/particle_lightcones/lightcone0_particles/lightcone0_0000.hdf5"

# Open the lightcone particle output
lightcone = lc.ParticleLightcone(filename, remote_dir=root)

Meta (or “virtual”) files

Lightcone particle outputs are typically split over many files with names along the lines of:

lightcone0_0000.0.hdf5
lightcone0_0000.1.hdf5
lightcone0_0000.2.hdf5
...

There may also be a meta-file without the .X.hdf5 suffix, such as:

lightcone0_0000.hdf5

This file contains HDF5 virtual datasets which reference all files in the set. When opening an output with lightcone_io.ParticleLightcone we can pass in the name of the virtual file OR any one of the data files. In the latter case the code finds the other data files by assuming a consistent naming convention.

This is similar to the meta-file used in Swift snapshots.

Lightcone particle metadata

The lightcone_io.ParticleLightcone instance acts like a dictionary where the particle types are the keys. E.g. to see which types are available:

print(list(lightcone))

You can use the properties attribute to see what quantities are available for each particle type:

print(lightcone["Gas"].properties)

Each entry in properties is a zero element unyt array with the dtype, units and shape of the quantity in the file and a attrs attribute which contains a copy of the HDF5 attributes of the dataset.

Reading particle data

Particles can be read in as follows:

# Quantities to read in
properties = ("Coordinates", "ParticleIDs")

# Position and angular radius (in radians) on the sky to read in
vector = (1., 0., 0.)
radius = np.radians(10.)

# Redshift range to read in
redshift_range = (0., 1.0)

# Read dark matter particles
data = lightcone["DM"].read(property_names, vector, radius, redshift_range)

The return value is a dictionary containing the quantities read in - in this case Coordinates and ParticleIDs. If redshift_range=None then all redshifts are read in. If vector=None and radius=None then the whole sky is read in.

Note that this may return particles outside the specified region because the indexed lightcone is stored in chunks and all chunks overlapping the region are returned. There is also a read_exact() method which filters out the extra particles. E.g.:

data = lightcone["DM"].read_exact(property_names, vector, radius, redshift_range)

This is likely to be slower because it’s necessary to read in the coordinates for spatial selection and the expansion factors for redshift selection even if these quantities are not being returned. Computing the angles and redshifts adds some CPU overhead too.

Iterating over chunks of particles

There is also a way to iterate through the selected particles without reading them all into memory:

for data in lightcone["DM"].iterate_chunks(property_names, vector, radius,
                                           redshift_range):
  pos = data["Coordinates"]
  ids = data["ParticleIDs"]
  # then do something with this subset of the selected particles...

Each iteration of this loop will receive a chunk of particles in the dict data.