Image Reduction Workflow
The package’s image reduction application contains the libraries and pipeline workflows needed to run aperture photometry on imaging datasets.
The application includes two core pipeline workflows, both in the
`infrastructure/` subdirectory:
`aperture_pipeline.py`: Performs aperture photometry for a single dataset, consisting of multiple images obtained with the same class of imaging instrument and the same filter.
`reduction_manager.py`: Orchestrates the reduction of multiple imaging datasets in parallel.
Reducing a Single Dataset
Configuring a Single Dataset for Reduction
All images for a single dataset’s reduction are collected into a
single directory, referred to as `red_dir`. In addition to the
FITS image files, the directory also needs to contain a copy of
`microlensing_photometry/configuration/example_reduction_configuration.yaml`,
named `reduction_config.yaml`. The file must have exactly this name.
This file needs to be edited to provide the parameters specific to
that dataset.
target:
  name: 'TEST'
  RA: '17:30:25.5' # Sexagesimal string
  Dec: '-25:30:30.5' # Sexagesimal string
photometry:
  aperture_arcsec: 2.0
tom:
  upload: True
  config_file: /path/to/config.yaml
  data_label: 'TEST'
The `aperture_arcsec` parameter determines the radius of the
aperture that will be used in the photometry.
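As a rough illustration of what the `aperture_arcsec` value means on the detector, the sketch below converts the radius from arcseconds to pixels. The function name and the pixel scale are assumptions made for this example; the pixel scale is not part of the reduction configuration and depends on the instrument, and the pipeline's own conversion may differ.

```python
def aperture_radius_pixels(aperture_arcsec: float, pixel_scale: float) -> float:
    """Convert an aperture radius from arcseconds to pixels.

    pixel_scale is the detector scale in arcsec/pixel. It is NOT part of the
    reduction configuration; it is an assumed, instrument-specific input
    used only for this illustration.
    """
    if aperture_arcsec <= 0:
        raise ValueError("aperture_arcsec must be positive")
    if pixel_scale <= 0:
        raise ValueError("pixel_scale must be positive")
    return aperture_arcsec / pixel_scale


# Hypothetical pixel scale of 0.5 arcsec/pixel, chosen only for the example.
radius_pix = aperture_radius_pixels(aperture_arcsec=2.0, pixel_scale=0.5)
print(f"Aperture radius: {radius_pix:.1f} pixels")  # Aperture radius: 4.0 pixels
```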
The parameters in the `tom` dictionary control whether the
timeseries photometry for the target object will be uploaded to
a TOM system once the pipeline has completed its reduction.
This function can be switched on or off via the `upload` parameter.
The `config_file` parameter gives the full path to the
user’s `tom_config.yaml` file. This should be a local copy of
`/microlensing_photometry/image_reduction/configuration/example_tom_config.yaml`,
updated to contain the URL of the TOM system and the user’s account details.
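Before starting a reduction it can be useful to sanity-check the configuration. The sketch below is not part of the package: it works on a plain dictionary that mirrors the example file (in practice the file could be read with a YAML parser such as PyYAML's `yaml.safe_load`), reports missing entries, and converts the sexagesimal coordinates to decimal degrees. The helper names, and the assumption that every key shown in the example is required, are illustrative only.

```python
# Sections and keys taken from the example configuration above. Treating all
# of them as mandatory is an assumption made for this sketch.
REQUIRED_KEYS = {
    "target": ("name", "RA", "Dec"),
    "photometry": ("aperture_arcsec",),
    "tom": ("upload", "config_file", "data_label"),
}


def missing_keys(config: dict) -> list:
    """Return the missing entries as 'section' or 'section.key' strings."""
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        block = config.get(section)
        if not isinstance(block, dict):
            missing.append(section)
            continue
        missing.extend(f"{section}.{key}" for key in keys if key not in block)
    return missing


def sexagesimal_to_degrees(value: str, hours: bool = False) -> float:
    """Convert 'HH:MM:SS.S' (hours=True) or 'DD:MM:SS.S' to decimal degrees."""
    text = value.strip()
    sign = -1.0 if text.startswith("-") else 1.0
    parts = [abs(float(part)) for part in text.split(":")]
    if len(parts) != 3:
        raise ValueError(f"expected three ':'-separated fields, got {value!r}")
    whole, minutes, seconds = parts
    degrees = whole + minutes / 60.0 + seconds / 3600.0
    if hours:
        degrees *= 15.0  # 1 hour of right ascension = 15 degrees
    return sign * degrees


# Dictionary equivalent of the example reduction_config.yaml.
config = {
    "target": {"name": "TEST", "RA": "17:30:25.5", "Dec": "-25:30:30.5"},
    "photometry": {"aperture_arcsec": 2.0},
    "tom": {"upload": True, "config_file": "/path/to/config.yaml", "data_label": "TEST"},
}

print("Missing entries:", missing_keys(config))
ra_deg = sexagesimal_to_degrees(config["target"]["RA"], hours=True)
dec_deg = sexagesimal_to_degrees(config["target"]["Dec"])
print(f"RA = {ra_deg:.5f} deg, Dec = {dec_deg:.5f} deg")
```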
Running the Reduction
With the configuration file in place, the dataset can be reduced by
passing the full path to the configuration file to the `aperture_pipeline.py`
workflow.
Note that the Prefect server must be running before the reduction workflow is started (see Prefect Workflows for details).
venv> cd microlensing_photometry/
venv> poetry run python image_reduction/infrastructure/aperture_pipeline.py <path_to_dataset_dir>
The progress of the pipeline as it runs can be monitored from the Prefect dashboard.
The pipeline also writes detailed logging output for each stage to
the `red_dir` in a file called `aperture_pipeline.log`.
Dataset Locks
To avoid accidentally starting multiple reductions of the same
dataset, the `aperture_pipeline.py` workflow automatically locks
an active `red_dir` until it has finished processing.
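The general idea of a directory lock can be sketched with a lock file that is created atomically and removed when processing ends. This is only an illustration of the technique: the lock file name, the exception class, and the function below are hypothetical, and the package's actual locking mechanism may work differently.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path

LOCK_NAME = "dataset.lock"  # hypothetical file name, not taken from the package


class DatasetLockedError(RuntimeError):
    """Raised when a reduction directory is already locked."""


@contextmanager
def dataset_lock(red_dir):
    """Hold an exclusive lock on red_dir for the duration of the with-block."""
    lock_path = Path(red_dir) / LOCK_NAME
    try:
        # O_CREAT | O_EXCL makes creation fail if the file already exists,
        # so two processes cannot both acquire the lock.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError as exc:
        raise DatasetLockedError(f"{red_dir} is already being reduced") from exc
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(str(os.getpid()))  # record the owner for debugging
        yield lock_path
    finally:
        # Release the lock whether the reduction succeeded or failed.
        lock_path.unlink(missing_ok=True)


# Demonstration in a temporary directory standing in for a real red_dir.
with tempfile.TemporaryDirectory() as red_dir:
    with dataset_lock(red_dir):
        try:
            with dataset_lock(red_dir):
                pass
        except DatasetLockedError as err:
            print("Second reduction refused:", err)
```

A lock file left behind by a crashed process would block later reductions in this simple scheme, which is why the owner's process ID is written into the file.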
Reducing Multiple Datasets
For larger reduction tasks, `reduction_manager.py` provides a
convenient way to parallelize the processing of multiple datasets.
This process assumes that each dataset has a custom reduction configuration
file already present in the `red_dir`, as this will be created
in normal operations by the `data_download.py` pipeline. If not,
each dataset should be configured as for a single dataset reduction.
Configuring Multiple Datasets for Reduction
The template reduction manager pipeline configuration should be copied to the user’s local configuration directory:
cp ./microlensing_photometry/image_reduction/configuration/example_reduction_manager_configuration.yaml <root_path>/<data_reduction_dir>/<config_dir>/reduction_manager_config.yaml
The parameters can then be configured as follows:
log_dir: '/path/to/logging/directory/'
data_reduction_dir: '/path/to/top-level/reduction/directory/'
software_dir: '/path/to/software/installation/directory/'
instrument_list: ['sinistro', 'qhy']
dataset_selection:
  group: 'file'
  file: '/path/to/datasets/file/reduce_datasets.txt'
  start_date: 'None'
  end_date: 'None'
  ndays: 0
max_parallel: 5
The directory path parameters should be the full path strings to
the data directories, as described in Data Directory Structure.
The software directory path should point to the top level of the
package’s own installation, i.e. `<path>/microlensing_photometry/`.
Entries in the `instrument_list` parameter will be used to
identify the subdirectories of each target’s directory that should
be searched for data to process. That is, if the list includes
`sinistro`, the pipeline will scan all target directories with
`target/sinistro/` subdirectories for datasets to process.
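The directory scan described above can be pictured with the following sketch. It assumes a layout of `<data_reduction_dir>/<target>/<instrument>/<dataset>/`, inferred from example paths such as `OGLE-2024-BLG-0034/sinistro/gp`; the function name is hypothetical and the package's own scanning code may apply further checks (for example, for locks).

```python
import tempfile
from pathlib import Path


def find_dataset_dirs(data_reduction_dir, instrument_list):
    """List dataset directories below <target>/<instrument>/ for each instrument."""
    root = Path(data_reduction_dir)
    datasets = []
    for target_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for instrument in instrument_list:
            instrument_dir = target_dir / instrument
            if not instrument_dir.is_dir():
                continue  # this target has no data from this instrument
            datasets.extend(sorted(p for p in instrument_dir.iterdir() if p.is_dir()))
    return datasets


# Build a small throwaway directory tree to demonstrate the scan.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ("OGLE-2024-BLG-0034/sinistro/gp",
                "OGLE-2024-BLG-0034/sinistro/ip",
                "OGLE-2024-BLG-0099/other_camera/ip"):
        (Path(tmp) / rel).mkdir(parents=True)
    found = find_dataset_dirs(tmp, ["sinistro", "qhy"])
    relative = [p.relative_to(tmp).as_posix() for p in found]
    print(relative)
```

In this example only the two `sinistro` datasets are returned, because the second target has no subdirectory named after an instrument in the list.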
The pipeline allows the user to restrict the number of parallel
reductions that can be triggered at any one time using the
`max_parallel` parameter.
The reduction manager can be configured to process different groups
of data using the `dataset_selection` dictionary.
Under every configuration, `reduction_manager.py` respects dataset
locks: locked datasets are not processed, so a reduction is never
duplicated.
The following options are supported.
`group: 'all'`
The pipeline will scan all available targets for unlocked datasets to process. The other `dataset_selection` configuration parameters will be ignored.
`group: 'date'`
The pipeline will scan all available targets for unlocked datasets containing data obtained between the configured `start_date` and `end_date`. The format of both date strings should be '%Y-%m-%d'.
`group: 'recent'`
The pipeline will scan all available targets for unlocked datasets with data obtained within `ndays` of the current date, where `ndays` can be an integer or float. This is used to set appropriate `start_date` and `end_date` parameters, overriding any given in the configuration.
`group: 'file'`
The pipeline will load a file containing the paths of specific datasets to be processed. Note that it will still exclude any locked datasets. The `file` parameter should give the full path to this file, which should be in ASCII text and contain a list of paths relative to the `data_reduction_dir`, e.g.
OGLE-2024-BLG-0034/sinistro/gp
OGLE-2024-BLG-0034/sinistro/ip
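Two of the selection modes above lend themselves to a short sketch: deriving a date window from `ndays`, and resolving the entries of a dataset list file against `data_reduction_dir`. The function names are hypothetical, and details such as skipping blank lines or ending the window at the current date are assumptions rather than documented behaviour.

```python
import tempfile
from datetime import datetime, timedelta
from pathlib import Path

DATE_FORMAT = "%Y-%m-%d"  # date format stated for start_date and end_date


def recent_window(ndays, now=None):
    """Return (start_date, end_date) strings covering the last ndays days."""
    now = now or datetime.now()
    start = now - timedelta(days=ndays)  # ndays may be an int or a float
    return start.strftime(DATE_FORMAT), now.strftime(DATE_FORMAT)


def read_dataset_list(list_file, data_reduction_dir):
    """Resolve each relative path in list_file against data_reduction_dir."""
    root = Path(data_reduction_dir)
    datasets = []
    for line in Path(list_file).read_text().splitlines():
        entry = line.strip()
        if entry:  # assumption: blank lines are ignored
            datasets.append(root / entry)
    return datasets


# Fixed reference time so that the example is reproducible.
window = recent_window(2.5, now=datetime(2024, 6, 10, 12, 0))
print(window)  # ('2024-06-08', '2024-06-10')

with tempfile.TemporaryDirectory() as tmp:
    list_file = Path(tmp) / "reduce_datasets.txt"
    list_file.write_text(
        "OGLE-2024-BLG-0034/sinistro/gp\nOGLE-2024-BLG-0034/sinistro/ip\n"
    )
    selected = read_dataset_list(list_file, "/path/to/top-level/reduction/directory")
    for dataset in selected:
        print(dataset.as_posix())
```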
Running Multiple Reductions
The workflow can then be used to trigger parallelized pipeline runs by
passing the full path to the configuration file to
`reduction_manager.py`.
As before, the Prefect server must be running before the reduction
workflow is started (see Prefect Workflows for details).
venv> cd microlensing_photometry/
venv> poetry run python image_reduction/infrastructure/reduction_manager.py <path_to_reduction_manager_config.yaml>
The progress of the pipeline as it runs can be monitored from the Prefect dashboard.
The workflow also writes detailed logging output for each stage to
the pipeline’s `log_dir` in a file called `arcon_pipeline.log`.