The dataset used in this example notebook is from the Nakadake Sanroku Kiln Site Center in Japan. The data set is provided by Shinoto et al. under the CC-BY-4.0 license: DOI

Working with LIDAR datasets in `afwizard`

This notebook will explain how Lidar datasets are treated in AFwizard by showcasing the most common use cases. If you are not yet familiar with Jupyter, check the Introduction to Python+Jupyter notebook first.

The first thing to do in a Jupyter notebook that uses AFwizard is to import the afwizard library:

[1]:

import afwizard

Loading datasets

AFwizard handles Lidar data sets in LAS/LAZ format. To load a data set, we construct a DataSet object given its filename and assign it to a variable ds:

[2]:

ds = afwizard.DataSet(filename="nkd_pcl_epsg6670.laz")

In above example, we are using the Nakadake example data that is downloaded on-demand from the Heidelberg University data repository. You can also load your own data set by providing its filename. AFwizard currently only supports datasets in LAS and LAZ format. The dataset filename is assumed to either be an absolute path, be located in the current working directory or that you first specified its location using the set_data_directory function:

[3]:

afwizard.set_data_directory("some/directory", create_dir=True)

Here, the create_dir directory specifies whether AFwizard should create non-existing directories for you.

Spatial Reference Systems

By default, AFwizard will try to determine the dataset’s metadata to determine the correct spatial reference system. If it is not specified in the metadata or if you want to force interpretation as a certain spatial reference system, you can pass its Well-known Text (WKT) representation or EPSG code to the data set:

[4]:

ds = afwizard.DataSet(filename="nkd_pcl_epsg6670.laz", spatial_reference="EPSG:6670")

Note that specifying a specific spatial reference system does not reproject the dataset, but reinterprets the given data. If you want to reproject your data, have a look at afwizard.reproject_dataset below.

Visualizing datasets

With the dataset loaded as the object ds, we have several ways of visualizing the data set directly in Jupyter. By default, a hillshade model with a configurable spatial resolution in meters is used.

AFwizard supports three different visualization methods, namly Hillshade Model, Slope Map and Hillshade Model + Slope Map. These can best be explored using an interactive user interface:

[5]:

ds.show_interactive()

/home/docs/checkouts/readthedocs.org/user_builds/afwizard/conda/latest/lib/python3.11/site-packages/osgeo/gdal.py:287: FutureWarning: Neither gdal.UseExceptions() nor gdal.DontUseExceptions() has been explicitly called. In GDAL 4.0, exceptions will be enabled by default.
  warnings.warn(

[5]:

If you already know exactly what visualization type and paramters you want, you can pass them directly to show.

[6]:

ds.show(
    visualization_type="hillshade", resolution=1.0, classification="high_vegetation"
)

[6]:

The full list of options is available in the online documentation or can be accessed directly in Jupyter by using the ? operator:

[7]:

?ds.show

Restricting datasets

If your Lidar dataset is very large, handling the entire data set becomes unwieldy, especially if we want to interactively tune ground point filtering pipelines. It is therefore important to crop the dataset to a subset that we can easily work on. We do so by showing an interactive map, adding a polygon with the polygon selector tool and hitting the Finalize button:

[8]:

rds = ds.restrict()

In the above, the restricted dataset is assigned to a new object rds. This follows a design principle of AFwizard: All objects (datasets, filter pipelines etc.) are immutable - operations that work on datasets never implicitly modify an object. Instead the, provided input (ds in the above) is left untouched, and a modified copy is returned. This results in an increased memory consumption, but makes the interactive exploration of ground point filtering with AFwizard easier to handle.

It is also possible to load a segmentation as a geojson file and overlay it ontop of the satellite image.

[9]:

segmentation_overlay = afwizard.load_segmentation(
    "nkd_sgm_assigned_TF.geojson", spatial_reference="EPSG:6670"
)

[10]:

rds = ds.restrict(segmentation_overlay=segmentation_overlay)

Manipulating datasets

The above principle of immutability is also followed by all other functions that manipulate datasets. The most prominent such data manipulation is the application of ground point filter pipelines. It is of such importance, that it is covered in in detail in Selecting a filter pipeline and Creating filter pipelines. Other data manipulations are e.g. remove_classification which removes any existing classification data from a dataset:

[11]:

ds = afwizard.remove_classification(ds)

Here, we have chosen to assign the manipulated dataset to the same name as the original dataset. This is not violating the principle of immutability, because we explicitly chose to do so.

Another dataset manipulation operation that was already mentioned is the reprojection into a different spatial reference system:

[12]:

reprojected = afwizard.reproject_dataset(ds, "EPSG:4326")

If your dataset’s metadata does not specify a spatial reference system, you need specify it additionally using the in_srs= parameter to afwizard.reproject_dataset.

Saving datasets

Once we have achieved a result that is worth storing, we can save the dataset to a LAS/LAZ file by calling its save method:

[13]:

saved = ds.save("without_classification.las", overwrite=False)

In the above, the first argument is the filename to save to (relative paths are interpreted w.r.t. the current working directory). Optionally, LAZ compression can be activated by setting compress=True. If an existing file would be overwritten, explicit permission needs to do that needs to be granted by setting overwrite=True. The return object saved is again an adaptivefiltering dataset object that represents the LAS/LAZ file on the disk.

Working with LIDAR datasets in afwizard