The dataset used in this example notebook is from the Nakadake Sanroku Kiln Site Center in Japan. The data set is provided by Shinoto et al. under the CC-BY-4.0 license: DOI

Working with LIDAR datasets in `afwizard`

This notebook will explain how Lidar datasets are treated in AFwizard by showcasing the most common use cases. If you are not yet familiar with Jupyter, check the Introduction to Python+Jupyter notebook first.

The first thing to do in a Jupyter notebook that uses AFwizard is to import the afwizard library:

[1]:

import afwizard

Loading datasets

AFwizard handles Lidar data sets in LAS/LAZ format. To load a data set, we construct a DataSet object given its filename and assign it to a variable ds:

[2]:

ds = afwizard.DataSet(filename="nkd_pcl_epsg6670.laz")

In above example, we are using the Nakadake example data that is downloaded on-demand from the Heidelberg University data repository. You can also load your own data set by providing its filename. AFwizard currently only supports datasets in LAS and LAZ format. The dataset filename is assumed to either be an absolute path, be located in the current working directory or that you first specified its location using the set_data_directory function:

[3]:

afwizard.set_data_directory("some/directory", create_dir=True)

Here, the create_dir directory specifies whether AFwizard should create non-existing directories for you.

Spatial Reference Systems

By default, AFwizard will try to determine the dataset’s metadata to determine the correct spatial reference system. If it is not specified in the metadata or if you want to force interpretation as a certain spatial reference system, you can pass its Well-known Text (WKT) representation or EPSG code to the data set:

[4]:

ds = afwizard.DataSet(filename="nkd_pcl_epsg6670.laz", spatial_reference="EPSG:6670")

Note that specifying a specific spatial reference system does not reproject the dataset, but reinterprets the given data. If you want to reproject your data, have a look at afwizard.reproject_dataset below.

Visualizing datasets

With the dataset loaded as the object ds, we have several ways of visualizing the data set directly in Jupyter. By default, a hillshade model with a configurable spatial resolution in meters is used.

AFwizard supports three different visualization methods, namly Hillshade Model, Slope Map and Hillshade Model + Slope Map. These can best be explored using an interactive user interface:

[5]:

ds.show_interactive()

ERROR 1: PROJ: proj_create_from_database: Cannot find proj.db

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[5], line 1
----> 1 ds.show_interactive()

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/dataset.py:138, in DataSet.show_interactive(self)
    135 """Visualize the dataset with interactive visualization controls in Jupyter"""
    136 from afwizard.apps import show_interactive
--> 138 return show_interactive(self)

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/apps.py:966, in show_interactive(dataset, filtering_callback, update_classification)
    964 # If dataset is not rasterized already, do it now
    965 if not isinstance(dataset, DigitalSurfaceModel):
--> 966     dataset = dataset.rasterize()
    968 (
    969     controls,
    970     form,
   (...)
    973     load_raster_button,
    974 ) = setup_overlay_control(dataset)
    976 # Get a container widget for the visualization itself

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/pytools/__init__.py:777, in memoize_on_first_arg.<locals>.wrapper(obj, *args, **kwargs)
    774 except KeyError:
    775     attribute_error = False
--> 777 result = function(obj, *args, **kwargs)
    778 if attribute_error:
    779     object.__setattr__(obj, cache_dict_name, {key: result})

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/dataset.py:81, in DataSet.rasterize(self, resolution, classification)
     78 if resolution <= 0:
     79     raise Warning("Negative Resolutions are not possible for rasterization.")
---> 81 return DigitalSurfaceModel(
     82     dataset=self, resolution=resolution, classification=classification
     83 )

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/dataset.py:258, in DigitalSurfaceModel.__init__(self, dataset, **rasterization_options)
    255 from afwizard.pdal import PDALInMemoryDataSet, execute_pdal_pipeline
    257 # Store a reference to the generating dataset
--> 258 self.dataset = PDALInMemoryDataSet.convert(dataset)
    260 # Validate the provided options
    261 schema = load_schema("rasterize.json")

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/pdal.py:184, in PDALInMemoryDataSet.convert(cls, dataset)
    181     config["override_srs"] = spatial_reference
    182     config["nosrs"] = True
--> 184 pipeline = execute_pdal_pipeline(config=[config])
    186 if spatial_reference is None:
    187     spatial_reference = pipeline.metadata["metadata"]["readers.las"][
    188         "comp_spatialreference"
    189     ]

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/pdal.py:69, in execute_pdal_pipeline(dataset, config)
     66 pipeline = pdal.Pipeline(config_str, arrays=arrays)
     68 # Execute the filter and suppress spurious file output
---> 69 _ = pipeline.execute()
     71 # We are currently only handling situations with one output array
     72 assert len(pipeline.arrays) == 1

RuntimeError: Could not import coordinate system 'EPSG:6670': PROJ: proj_create_from_database: Cannot find proj.db.

If you already know exactly what visualization type and paramters you want, you can pass them directly to show.

[6]:

ds.show(
    visualization_type="hillshade", resolution=1.0, classification="high_vegetation"
)

ERROR 1: PROJ: proj_create_from_database: Cannot find proj.db

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[6], line 1
----> 1 ds.show(
      2     visualization_type="hillshade", resolution=1.0, classification="high_vegetation"
      3 )

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/dataset.py:130, in DataSet.show(self, visualization_type, **kwargs)
    127         rasterize_options[key] = kwargs.pop(key)
    129 # Defer visualization to the rastered dataset
--> 130 return self.rasterize(**rasterize_options).show(
    131     visualization_type=visualization_type, **kwargs
    132 )

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/pytools/__init__.py:777, in memoize_on_first_arg.<locals>.wrapper(obj, *args, **kwargs)
    774 except KeyError:
    775     attribute_error = False
--> 777 result = function(obj, *args, **kwargs)
    778 if attribute_error:
    779     object.__setattr__(obj, cache_dict_name, {key: result})

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/dataset.py:81, in DataSet.rasterize(self, resolution, classification)
     78 if resolution <= 0:
     79     raise Warning("Negative Resolutions are not possible for rasterization.")
---> 81 return DigitalSurfaceModel(
     82     dataset=self, resolution=resolution, classification=classification
     83 )

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/dataset.py:258, in DigitalSurfaceModel.__init__(self, dataset, **rasterization_options)
    255 from afwizard.pdal import PDALInMemoryDataSet, execute_pdal_pipeline
    257 # Store a reference to the generating dataset
--> 258 self.dataset = PDALInMemoryDataSet.convert(dataset)
    260 # Validate the provided options
    261 schema = load_schema("rasterize.json")

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/pdal.py:184, in PDALInMemoryDataSet.convert(cls, dataset)
    181     config["override_srs"] = spatial_reference
    182     config["nosrs"] = True
--> 184 pipeline = execute_pdal_pipeline(config=[config])
    186 if spatial_reference is None:
    187     spatial_reference = pipeline.metadata["metadata"]["readers.las"][
    188         "comp_spatialreference"
    189     ]

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/pdal.py:69, in execute_pdal_pipeline(dataset, config)
     66 pipeline = pdal.Pipeline(config_str, arrays=arrays)
     68 # Execute the filter and suppress spurious file output
---> 69 _ = pipeline.execute()
     71 # We are currently only handling situations with one output array
     72 assert len(pipeline.arrays) == 1

RuntimeError: Could not import coordinate system 'EPSG:6670': PROJ: proj_create_from_database: Cannot find proj.db.

The full list of options is available in the online documentation or can be accessed directly in Jupyter by using the ? operator:

[7]:

?ds.show

Restricting datasets

If your Lidar dataset is very large, handling the entire data set becomes unwieldy, especially if we want to interactively tune ground point filtering pipelines. It is therefore important to crop the dataset to a subset that we can easily work on. We do so by showing an interactive map, adding a polygon with the polygon selector tool and hitting the Finalize button:

[8]:

rds = ds.restrict()

ERROR 1: PROJ: proj_create_from_database: Cannot find proj.db

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[8], line 1
----> 1 rds = ds.restrict()

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/dataset.py:231, in DataSet.restrict(self, segmentation, segmentation_overlay)
    209 """Restrict the data set to a spatial subset
    210
    211 This is of vital importance when working with large Lidar datasets
   (...)
    226
    227 """
    229 from afwizard.apps import apply_restriction
--> 231 return apply_restriction(self, segmentation, segmentation_overlay)

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/apps.py:865, in apply_restriction(dataset, segmentation, segmentation_overlay)
    859 def apply_restriction(dataset, segmentation=None, segmentation_overlay=None):
    860     """The Jupyter UI to create a segmentation object from scratch.
    861
    862     The use of this UI will soon be described in detail.
    863     """
--> 865     dataset = as_pdal(dataset)
    867     def apply_restriction(seg):
    868         # convert the segmentation from EPSG:4326 to the spatial reference of the dataset
    869         seg = convert_segmentation(seg, dataset.spatial_reference)

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/apps.py:97, in as_pdal(dataset)
     95 if isinstance(dataset, DigitalSurfaceModel):
     96     return as_pdal(dataset.dataset)
---> 97 return PDALInMemoryDataSet.convert(dataset)

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/pdal.py:184, in PDALInMemoryDataSet.convert(cls, dataset)
    181     config["override_srs"] = spatial_reference
    182     config["nosrs"] = True
--> 184 pipeline = execute_pdal_pipeline(config=[config])
    186 if spatial_reference is None:
    187     spatial_reference = pipeline.metadata["metadata"]["readers.las"][
    188         "comp_spatialreference"
    189     ]

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/pdal.py:69, in execute_pdal_pipeline(dataset, config)
     66 pipeline = pdal.Pipeline(config_str, arrays=arrays)
     68 # Execute the filter and suppress spurious file output
---> 69 _ = pipeline.execute()
     71 # We are currently only handling situations with one output array
     72 assert len(pipeline.arrays) == 1

RuntimeError: Could not import coordinate system 'EPSG:6670': PROJ: proj_create_from_database: Cannot find proj.db.

In the above, the restricted dataset is assigned to a new object rds. This follows a design principle of AFwizard: All objects (datasets, filter pipelines etc.) are immutable - operations that work on datasets never implicitly modify an object. Instead the, provided input (ds in the above) is left untouched, and a modified copy is returned. This results in an increased memory consumption, but makes the interactive exploration of ground point filtering with AFwizard easier to handle.

It is also possible to load a segmentation as a geojson file and overlay it ontop of the satellite image.

[9]:

segmentation_overlay = afwizard.load_segmentation(
    "nkd_sgm_assigned_TF.geojson", spatial_reference="EPSG:6670"
)

[10]:

rds = ds.restrict(segmentation_overlay=segmentation_overlay)

ERROR 1: PROJ: proj_create_from_database: Cannot find proj.db

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[10], line 1
----> 1 rds = ds.restrict(segmentation_overlay=segmentation_overlay)

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/dataset.py:231, in DataSet.restrict(self, segmentation, segmentation_overlay)
    209 """Restrict the data set to a spatial subset
    210
    211 This is of vital importance when working with large Lidar datasets
   (...)
    226
    227 """
    229 from afwizard.apps import apply_restriction
--> 231 return apply_restriction(self, segmentation, segmentation_overlay)

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/apps.py:865, in apply_restriction(dataset, segmentation, segmentation_overlay)
    859 def apply_restriction(dataset, segmentation=None, segmentation_overlay=None):
    860     """The Jupyter UI to create a segmentation object from scratch.
    861
    862     The use of this UI will soon be described in detail.
    863     """
--> 865     dataset = as_pdal(dataset)
    867     def apply_restriction(seg):
    868         # convert the segmentation from EPSG:4326 to the spatial reference of the dataset
    869         seg = convert_segmentation(seg, dataset.spatial_reference)

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/apps.py:97, in as_pdal(dataset)
     95 if isinstance(dataset, DigitalSurfaceModel):
     96     return as_pdal(dataset.dataset)
---> 97 return PDALInMemoryDataSet.convert(dataset)

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/pdal.py:184, in PDALInMemoryDataSet.convert(cls, dataset)
    181     config["override_srs"] = spatial_reference
    182     config["nosrs"] = True
--> 184 pipeline = execute_pdal_pipeline(config=[config])
    186 if spatial_reference is None:
    187     spatial_reference = pipeline.metadata["metadata"]["readers.las"][
    188         "comp_spatialreference"
    189     ]

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/pdal.py:69, in execute_pdal_pipeline(dataset, config)
     66 pipeline = pdal.Pipeline(config_str, arrays=arrays)
     68 # Execute the filter and suppress spurious file output
---> 69 _ = pipeline.execute()
     71 # We are currently only handling situations with one output array
     72 assert len(pipeline.arrays) == 1

RuntimeError: Could not import coordinate system 'EPSG:6670': PROJ: proj_create_from_database: Cannot find proj.db.

Manipulating datasets

The above principle of immutability is also followed by all other functions that manipulate datasets. The most prominent such data manipulation is the application of ground point filter pipelines. It is of such importance, that it is covered in in detail in Selecting a filter pipeline and Creating filter pipelines. Other data manipulations are e.g. remove_classification which removes any existing classification data from a dataset:

[11]:

ds = afwizard.remove_classification(ds)

ERROR 1: PROJ: proj_create_from_database: Cannot find proj.db

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[11], line 1
----> 1 ds = afwizard.remove_classification(ds)

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/dataset.py:425, in remove_classification(dataset)
    410 """Remove the classification values from a Lidar dataset
    411
    412 Instead, all points will be classified as 1 (unclassified). This is useful
   (...)
    421 :rtype: afwizard.DataSet
    422 """
    423 from afwizard.pdal import PDALInMemoryDataSet, execute_pdal_pipeline
--> 425 dataset = PDALInMemoryDataSet.convert(dataset)
    426 pipeline = execute_pdal_pipeline(
    427     dataset=dataset,
    428     config={"type": "filters.assign", "value": ["Classification = 1"]},
    429 )
    430 return PDALInMemoryDataSet(
    431     pipeline=pipeline,
    432     spatial_reference=dataset.spatial_reference,
    433 )

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/pdal.py:184, in PDALInMemoryDataSet.convert(cls, dataset)
    181     config["override_srs"] = spatial_reference
    182     config["nosrs"] = True
--> 184 pipeline = execute_pdal_pipeline(config=[config])
    186 if spatial_reference is None:
    187     spatial_reference = pipeline.metadata["metadata"]["readers.las"][
    188         "comp_spatialreference"
    189     ]

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/pdal.py:69, in execute_pdal_pipeline(dataset, config)
     66 pipeline = pdal.Pipeline(config_str, arrays=arrays)
     68 # Execute the filter and suppress spurious file output
---> 69 _ = pipeline.execute()
     71 # We are currently only handling situations with one output array
     72 assert len(pipeline.arrays) == 1

RuntimeError: Could not import coordinate system 'EPSG:6670': PROJ: proj_create_from_database: Cannot find proj.db.

Here, we have chosen to assign the manipulated dataset to the same name as the original dataset. This is not violating the principle of immutability, because we explicitly chose to do so.

Another dataset manipulation operation that was already mentioned is the reprojection into a different spatial reference system:

[12]:

reprojected = afwizard.reproject_dataset(ds, "EPSG:4326")

ERROR 1: PROJ: proj_create_from_database: Cannot find proj.db

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[12], line 1
----> 1 reprojected = afwizard.reproject_dataset(ds, "EPSG:4326")

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/dataset.py:451, in reproject_dataset(dataset, out_srs, in_srs)
    437 """Standalone function to reproject a given dataset with the option of forcing an input reference system
    438
    439 :param out_srs:
   (...)
    447 :rtype: afwizard.DataSet
    448 """
    449 from afwizard.pdal import execute_pdal_pipeline, PDALInMemoryDataSet
--> 451 dataset = PDALInMemoryDataSet.convert(dataset)
    452 if in_srs is None:
    453     in_srs = dataset.spatial_reference

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/pdal.py:184, in PDALInMemoryDataSet.convert(cls, dataset)
    181     config["override_srs"] = spatial_reference
    182     config["nosrs"] = True
--> 184 pipeline = execute_pdal_pipeline(config=[config])
    186 if spatial_reference is None:
    187     spatial_reference = pipeline.metadata["metadata"]["readers.las"][
    188         "comp_spatialreference"
    189     ]

File ~/checkouts/readthedocs.org/user_builds/afwizard/conda/stable/lib/python3.11/site-packages/afwizard/pdal.py:69, in execute_pdal_pipeline(dataset, config)
     66 pipeline = pdal.Pipeline(config_str, arrays=arrays)
     68 # Execute the filter and suppress spurious file output
---> 69 _ = pipeline.execute()
     71 # We are currently only handling situations with one output array
     72 assert len(pipeline.arrays) == 1

RuntimeError: Could not import coordinate system 'EPSG:6670': PROJ: proj_create_from_database: Cannot find proj.db.

If your dataset’s metadata does not specify a spatial reference system, you need specify it additionally using the in_srs= parameter to afwizard.reproject_dataset.

Saving datasets

Once we have achieved a result that is worth storing, we can save the dataset to a LAS/LAZ file by calling its save method:

[13]:

saved = ds.save("without_classification.las", overwrite=False)

In the above, the first argument is the filename to save to (relative paths are interpreted w.r.t. the current working directory). Optionally, LAZ compression can be activated by setting compress=True. If an existing file would be overwritten, explicit permission needs to do that needs to be granted by setting overwrite=True. The return object saved is again an adaptivefiltering dataset object that represents the LAS/LAZ file on the disk.

Working with LIDAR datasets in afwizard