The dataset used in this example notebook is from the Nakadake Sanroku Kiln Site Center in Japan. The dataset is provided by Shinoto et al. under the CC-BY-4.0 license: DOI

Creating filter pipelines

[1]:
import afwizard

This Jupyter notebook explains the workflow of creating a ground point filtering pipeline from scratch. This is an advanced workflow for users who want to define their own filtering workflows. For basic use, try choosing a pre-configured, community-contributed pipeline as described in the notebook on selecting filter pipelines.

For all of the examples below, we need to load at least one dataset that we will use to interactively preview our filter settings. Note that for a good interactive experience with no downtimes, you should restrict your datasets to a reasonable size (see the Working with datasets notebook for how to do this). Loading multiple datasets can be beneficial to avoid overfitting the filter pipeline to a single dataset.

[2]:
dataset = afwizard.DataSet(
    filename="nkd_pcl_epsg6670.laz", spatial_reference="EPSG:6670"
)
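To reduce the risk of overfitting to a single dataset, you can load additional datasets in the same way and pass them to the tuning interface later on. A minimal sketch; the filename below is hypothetical and should be replaced with one of your own files:

dataset2 = afwizard.DataSet(
    filename="another_tile.laz", spatial_reference="EPSG:6670"  # hypothetical file
)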

Creating from scratch

The main pipeline configuration is done by calling the pipeline_tuning function with your dataset as the parameter. This will open the interactive user interface, which allows you to tune the filter pipeline itself in the left column and the visualization and rasterization options in the right column. Whenever you hit the Preview button, a new tab will be added to the center column. Switching between these tabs allows you to switch between different versions of your filter. The returned pipeline object is updated on the fly until you hit the Finalize button to freeze the currently displayed filter.

[3]:
pipeline = afwizard.pipeline_tuning(dataset)

If you want to inspect multiple datasets in parallel while tuning a pipeline, you can do so by passing a list of datasets to the pipeline_tuning function. Note that AFwizard does not currently parallelize the execution of filter pipelines, which may have a negative impact on wait times while tuning with multiple datasets. A new tab in the center column will be created for each dataset when clicking Preview:

[4]:
pipeline2 = afwizard.pipeline_tuning(datasets=[dataset, dataset])

Storing and reloading filter pipelines

Pipeline objects can be stored on disk with the save_filter function from AFwizard. The filename passed here can be either an absolute path or a relative one. Relative paths are interpreted w.r.t. the current working directory unless a current filter library has been declared with set_current_filter_library:

[5]:
afwizard.save_filter(pipeline, "myfilter.json")
WARNING: This filter has insufficient metadata. Please consider adding in af.pipeline_tuning!
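If you maintain a collection of filters, you can first declare a filter library so that relative filenames are resolved against it. A minimal sketch, assuming set_current_filter_library accepts a directory path; the create_dirs flag and the directory name are assumptions that may differ in your AFwizard version:

afwizard.set_current_filter_library("my_filter_library", create_dirs=True)  # assumed signature
afwizard.save_filter(pipeline, "myfilter.json")  # now resolved inside the library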

The appropriate counterpart is load_filter, which restores the pipeline object from a file. Relative paths are interpreted w.r.t. the filter libraries known to AFwizard:

[6]:
old_pipeline = afwizard.load_filter("myfilter.json")

A filter pipeline loaded from a file can be edited using the pipeline_tuning command by passing it to the function. As always, the pipeline object returned by pipeline_tuning will be a new object; no implicit changes are made to the loaded pipeline object:

[7]:
edited_pipeline = afwizard.pipeline_tuning(dataset, pipeline=old_pipeline)

Batch processing in filter creation

The pipeline_tuning user interface has some additional powerful features that allow you to quickly explore parameter ranges for a filter. You can use this feature by clicking the symbol next to a parameter. This will open a flyout where you can specify a range of parameter values to generate previews for. A range can either be a discrete comma-separated list, e.g. 1, 2, 3, a range of values like 4:6, or a mixture thereof. Ranges are only available for numeric inputs and accept an optional increment after a second colon, e.g. 1:5:2. In the absence of an explicit increment, integer ranges use an increment of 1, while float ranges are sampled with a total of 5 sample points. When clicking Preview, the batch processing information is resolved into individual previews and then discarded.
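As an illustration of these expansion rules, the following sketch shows which values a given range string yields. This is not AFwizard's internal parser, merely a minimal Python rendering of the semantics described above:

def expand_spec(spec):
    """Illustrative expansion of the batch range syntax (not AFwizard's parser)."""
    values = []
    for part in (p.strip() for p in spec.split(",")):
        if ":" not in part:
            values.append(float(part))
            continue
        pieces = part.split(":")
        start, stop = float(pieces[0]), float(pieces[1])
        if len(pieces) == 3:
            # explicit increment given after the second colon
            step = float(pieces[2])
            v = start
            while v <= stop:
                values.append(v)
                v += step
        elif all(p.lstrip("+-").isdigit() for p in pieces):
            # integer range without explicit increment: step of 1
            values.extend(float(i) for i in range(int(start), int(stop) + 1))
        else:
            # float range without explicit increment: 5 equidistant sample points
            values.extend(start + i * (stop - start) / 4 for i in range(5))
    return values

print(expand_spec("1, 2, 3"))   # [1.0, 2.0, 3.0]
print(expand_spec("4:6"))       # [4.0, 5.0, 6.0]
print(expand_spec("1:5:2"))     # [1.0, 3.0, 5.0]
print(expand_spec("0.5:1.5"))   # [0.5, 0.75, 1.0, 1.25, 1.5]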

Filter pipelines with end user configuration

The goal in creating filter pipelines in AFwizard is to provide pipelines that are, on the one hand, specialized to a given terrain type and, on the other hand, generalize well to other datasets of similar terrain. To achieve this, it is sometimes necessary to define some configuration values that are meant to be fine-tuned by the end user. We can do so by clicking the symbol next to a parameter. As in batch processing, a flyout opens where we can enter values, a display name for the parameter, and a description. Values can either be a comma-separated list of values or a single range of values written with a :. These parameters are displayed to the end user when selecting a fitting filter pipeline, as described in Selecting a filter pipeline for a dataset. This end user configuration interface can also be invoked manually using the filter pipeline's execute_interactive method:

[8]:
tuned = pipeline.execute_interactive(dataset)

Applying filter pipelines to data

Pipeline objects can also be used to manipulate datasets by applying the ground point filtering algorithms in a non-interactive fashion. This is one of the core tasks of the afwizard library, but it will rarely be done in this manual fashion, as we provide additional interfaces for (locally adaptive) application of filter pipelines:

[9]:
filtered = pipeline.execute(dataset)

The returned object is itself a dataset object that can again be treated as described in Working with datasets:

[10]:
filtered.show_interactive()
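If you want to keep the filtering result, you can write the filtered dataset back to disk. A minimal sketch, assuming the dataset's save method as introduced in Working with datasets; the overwrite flag is an assumption:

filtered.save("nkd_filtered.las", overwrite=True)  # assumed keyword; see Working with datasets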